Teradata Tips: Skewness

Friday, September 11, 2009

Skewness

How will you determine which column will be appropriate for Index, that can avoid skewness?

I have given below the syntax. Before assigning a column as Index, make sure you check data in that column is not skewed.

SELECT
HASHAMP(HASHBUCKET(HASHROW(col whose amp distribution that needs to be checked; eg: PI)))
,CAST(COUNT(*) AS DECIMAL(18,0))
FROM
DBNAME.TBLNAME
GROUP BY 1
ORDER BY 1
;

PS: This alone cannot let you determine PI. As, you may sometimes have to decide based on how frequently it is used for Joining purposes etc.

3 comments:

bradySeptember 7, 2010 at 12:22 PM
Hi meena here iam not comment
but i want to ask you some questions on that
1 what is statistics in teradata.
2 what is the use of it
3 on what col's we hava to create stats
4 can we have same kind of stats in other RDBMS like oracle ,sqlserver and db2
thank for reply
ReplyDelete
Replies
Meena AthmanathanNovember 9, 2010 at 10:15 PM
Hello Brady,

#1 Statistics is metadata information, by which a Parsing Engine would understand how to access data most efficiently. The most updated your statistics are, the better performance of your query will be.
#2 Like I mentioned before, it is used internally by Parsing Engine. By collecting statistics, you tell the parsing Engine how the data is distributed across available AMPS.
#3 You need to decide based on frequency of access/Join
#4 To my knowledge, this is specific to the database architecture, so the usage would vary accordingly.
ReplyDelete
Replies
AnonymousJanuary 11, 2011 at 6:44 AM
HI Meena

Yhis is Venkat
May i know impartant functions which are using in real time.Can you please give those functions with the scenarios.No blog is providing these functions.So please provide the functions with the used situations.

Thank you
ReplyDelete
Replies

Add comment