[20170603]12c Top Frequency histogram.txt
---恢復(fù)內(nèi)容開始---
[20170603]12c Top Frequency histogram.txt
--//個(gè)人對(duì)直方圖了解很少,以前2種直方圖類型對(duì)于目前的許多應(yīng)用來講已經(jīng)足夠,或者講遇到的問題很少.
--//抽一點(diǎn)點(diǎn)時(shí)間,簡單探究12c Top Frequency histogram.
--//以前的頻率直方圖Frequency histogram,受限bucket(桶的大小),如果有255個(gè)不同值,oracle分析后不會(huì)建立頻率直方圖,而是建立高
--//度直方圖.這樣的情況會(huì)導(dǎo)致一些流行值的統(tǒng)計(jì)在顯示執(zhí)行計(jì)劃時(shí)差距很大.而12c引入了Top Frequency histogram,注意這里的top,
--//我的理解就是流行值(popular),也就是這樣建立的直方圖僅僅包括popular,其他non-popular不考慮,這樣在sql語句的查詢這些
--//popular時(shí),顯示的統(tǒng)計(jì)信息相對(duì)準(zhǔn)確,從而有利于oracle選擇正確的執(zhí)行計(jì)劃.
--//以下是我的學(xué)習(xí)筆記,也許會(huì)存在許多錯(cuò)誤,僅僅做一個(gè)記錄.我也看了許多別人的blog.^_^.而且我目前的環(huán)境只有12.0.0.1(版本太
--//低).
1.環(huán)境:
SCOTT@test01p> @ ver1
PORT_STRING??????????????????? VERSION??????? BANNER?????????????????????????????????????????????????????????????????????????????? CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0?????????? 12.1.0.1.0???? Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production????????????? 0
--//如果要建立Top Frequency histogram必須要滿足幾個(gè)條件:
--//鏈接 raajeshwaran.blogspot.co.id/2016/06/top-frequency-histogram-in-12c.html
The database creates a Top frequency histogram, when the following criteria are met.
NDV is greater than n, where n is the requested number of buckets (default 254)
The percentage of rows occupied by Top-frequent values is greater than or equal to the threshold p where p is (1-(1/n)*100).
The estimate_percent parameter in dbms_stats gathering procedure should be auto_sample_size (set to default)
--//翻譯過來NDV(也就是字段的不同值)大于N(指bucket的數(shù)量).
--//流行值(popular)在Top-frequent中合計(jì)數(shù)量/總計(jì)數(shù)量之比要大于(1-(1/n)*100).如果建立10個(gè)桶,這樣流行值的總計(jì)必須在90%以上
2.首先驗(yàn)證(1-(1/n)*100)比值是否正確:
SCOTT@test01p> create table t as select * from dba_objects;
Table created.
select column_name,num_distinct,density,histogram,SAMPLE_SIZE
? from user_tab_col_statistics
? where table_name ='T'
? and column_name ='OWNER';
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32???? .03125 NONE????????????????? 91695
--//12c ctas 建立統(tǒng)計(jì)信息,但是不會(huì)建立直方圖.density 1/32=.03125.
SCOTT@test01p> select count(*) from t;
? COUNT(*)
----------
???? 91695
--//隨手寫的sql語句:
with a as (select distinct owner,count(*) over(partition by owner) n1 ,count(*) over () n2 from t order by 2 desc ),
b as (select owner,n1,n2,sum(n1) over (order by n1 desc) n3? from a order by n1 desc)
select rownum,owner,n1,n2,n3,round(n3/n2,5) x1,round(1-1/rownum,5) x2 from b;
ROWNUM OWNER??????????????? N1???????? N2???????? N3???????? X1???????? X2
------ ----------------- ----- ---------- ---------- ---------- ----------
???? 1 SYS?????????????? 41942????? 91695????? 41942???? .45741????????? 0
???? 2 PUBLIC??????????? 37142????? 91695????? 79084???? .86247???????? .5
???? 3 APEX_040200??????? 3405????? 91695????? 82489????? .8996???? .66667
???? 4 ORDSYS???????????? 3157????? 91695????? 85646???? .93403??????? .75
???? 5 MDSYS????????????? 1819????? 91695????? 87465???? .95387???????? .8
???? 6 XDB???????????????? 985????? 91695????? 88450???? .96461???? .83333
???? 7 SYSTEM????????????? 641????? 91695????? 89091????? .9716???? .85714
???? 8 CTXSYS????????????? 405????? 91695????? 89496???? .97602?????? .875
???? 9 WMSYS?????????????? 387????? 91695????? 89883???? .98024???? .88889
??? 10 DVSYS?????????????? 352????? 91695????? 90235???? .98408???????? .9
??? 11 SH????????????????? 309????? 91695????? 90544???? .98745???? .90909
??? 12 ORDDATA???????????? 292????? 91695????? 90836???? .99063???? .91667
??? 13 LBACSYS???????????? 209????? 91695????? 91045???? .99291???? .92308
??? 14 OE????????????????? 142????? 91695????? 91187???? .99446???? .92857
??? 15 SCOTT??????????????? 96????? 91695????? 91283???? .99551???? .93333
??? 16 GSMADMIN_INTERNAL??? 77????? 91695????? 91360???? .99635????? .9375
??? 17 IX?????????????????? 58????? 91695????? 91418???? .99698???? .94118
??? 18 DBSNMP?????????????? 55????? 91695????? 91473???? .99758???? .94444
??? 19 PM?????????????????? 44????? 91695????? 91517???? .99806???? .94737
??? 20 HR?????????????????? 35????? 91695????? 91552???? .99844??????? .95
??? 21 OLAPSYS????????????? 25????? 91695????? 91577???? .99871???? .95238
??? 22 OJVMSYS????????????? 23????? 91695????? 91600???? .99896???? .95455
??? 23 DVF????????????????? 19????? 91695????? 91619???? .99917???? .95652
??? 24 FLOWS_FILES????????? 13????? 91695????? 91632???? .99931???? .95833
??? 25 AUDSYS?????????????? 12????? 91695????? 91644???? .99944??????? .96
??? 26 ORDPLUGINS?????????? 10????? 91695????? 91664???? .99966???? .96154
??? 27 OUTLN??????????????? 10????? 91695????? 91664???? .99966???? .96296
??? 28 BI??????????????????? 8????? 91695????? 91688???? .99992???? .96429
??? 29 ORACLE_OCM??????????? 8????? 91695????? 91688???? .99992???? .96552
??? 30 SI_INFORMTN_SCHEM???? 8????? 91695????? 91688???? .99992???? .96667
??? 31 APPQOSSYS???????????? 5????? 91695????? 91693???? .99998???? .96774
??? 32 TEST????????????????? 2????? 91695????? 91695????????? 1???? .96875
--//如果加入條件where round(n3/n2,5) >round(1-1/rownum,5),全部輸出.也就是這樣如果桶小于32,大于1.建立的都是Top Frequency.
3.繼續(xù)測試:
D:\temp>cat a1.sql
cat a1.sql
exec? dbms_stats.gather_table_stats(ownname=>user,tabname=>'T',method_opt=>'for columns owner size &1');
select column_name,num_distinct,density,histogram,SAMPLE_SIZE from user_tab_col_statistics where table_name ='T' and column_name ='OWNER';
SCOTT@test01p> @ a1.sql 2
PL/SQL procedure successfully completed.
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32???? .03125 HYBRID???????????????? 5500
SCOTT@test01p> @ a1.sql 3
PL/SQL procedure successfully completed.
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32 5.4529E-06 TOP-FREQUENCY???????? 91695
SCOTT@test01p> @ a1.sql 4
PL/SQL procedure successfully completed.
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32 5.4529E-06 TOP-FREQUENCY???????? 91695
SCOTT@test01p> @ a1.sql 31
PL/SQL procedure successfully completed.
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32 5.4529E-06 TOP-FREQUENCY???????? 91695
SCOTT@test01p> @ a1.sql 32
PL/SQL procedure successfully completed.
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32 5.4529E-06 FREQUENCY???????????? 91695
--//除了bucket=2,32建立的直方圖HYBRID,FREQUENCY外,建立的都是TOP-FREQUENCY.
--//以10個(gè)bucket為例.解方程式(90235-x)/(91695-x)=0.9 ,得到x=77095.也就是要減少77095.
--//delete t where owner='SYS' and rownum<=41000;
--//delete t where owner='PUBLIC' and rownum<=36095;
SCOTT@test01p> delete t where owner='SYS' and rownum<=41000;
41000 rows deleted.
SCOTT@test01p> delete t where owner='PUBLIC' and rownum<=36095;
36095 rows deleted.
SCOTT@test01p> commit ;
Commit complete.
with a as (select distinct owner,count(*) over(partition by owner) n1 ,count(*) over () n2 from t order by 2 desc ),
b as (select owner,n1,n2,sum(n1) over (order by n1 desc) n3? from a order by n1 desc)
select rownum,owner,n1,n2,n3,round(n3/n2,5) x1,round(1-1/rownum,5) x2 from b where rownum<=11;
ROWNUM OWNER???????? N1???????? N2???????? N3???????? X1???????? X2
------ ----------- ---- ---------- ---------- ---------- ----------
???? 1 APEX_040200 3405????? 14600?????? 3405???? .23322????????? 0
???? 2 ORDSYS????? 3157????? 14600?????? 6562???? .44945???????? .5
???? 3 MDSYS?????? 1819????? 14600?????? 8381???? .57404???? .66667
???? 4 PUBLIC????? 1047????? 14600?????? 9428???? .64575??????? .75
???? 5 XDB????????? 985????? 14600????? 10413???? .71322???????? .8
???? 6 SYS????????? 942????? 14600????? 11355???? .77774???? .83333
???? 7 SYSTEM?????? 641????? 14600????? 11996???? .82164???? .85714
???? 8 CTXSYS?????? 405????? 14600????? 12401???? .84938?????? .875
???? 9 WMSYS??????? 387????? 14600????? 12788???? .87589???? .88889
??? 10 DVSYS??????? 352????? 14600????? 13140???????? .9???????? .9
??? 11 SH?????????? 309????? 14600????? 13449???? .92116???? .90909
11 rows selected.
--//backet=10,前面10個(gè)值占90%.
SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32 .000034247 TOP-FREQUENCY???????? 14600
--//再減少1條記錄.
SCOTT@test01p> delete t where owner='SYS' and rownum<=1;
1 row deleted.
SCOTT@test01p> commit ;
Commit complete.
ROWNUM OWNER???????? N1???????? N2???????? N3???????? X1???????? X2
------ ----------- ---- ---------- ---------- ---------- ----------
???? 1 APEX_040200 3405????? 14599?????? 3405???? .23324????????? 0
???? 2 ORDSYS????? 3157????? 14599?????? 6562???? .44948???????? .5
???? 3 MDSYS?????? 1819????? 14599?????? 8381???? .57408???? .66667
???? 4 PUBLIC????? 1047????? 14599?????? 9428????? .6458??????? .75
???? 5 XDB????????? 985????? 14599????? 10413???? .71327???????? .8
???? 6 SYS????????? 941????? 14599????? 11354???? .77772???? .83333
???? 7 SYSTEM?????? 641????? 14599????? 11995???? .82163???? .85714
???? 8 CTXSYS?????? 405????? 14599????? 12400???? .84937?????? .875
???? 9 WMSYS??????? 387????? 14599????? 12787???? .87588???? .88889
??? 10 DVSYS??????? 352????? 14599????? 13139???? .89999???????? .9
??? 11 SH?????????? 309????? 14599????? 13448???? .92116???? .90909
11 rows selected.
--//現(xiàn)在前10占.89999.
SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32??? .018378 HYBRID??????????????? 14599
--//可以發(fā)現(xiàn)建立的直方圖不是TOP-FREQUENCY,而是HYBRID(混合型直方圖).
4.轉(zhuǎn)化成TOP-FREQUENCY.
SCOTT@test01p> insert into t? select * from dba_objects where owner='SYS' and rownum=1;
1 row created.
SCOTT@test01p> commit ;
Commit complete.
SCOTT@test01p> @ a1 10
PL/SQL procedure successfully completed.
COLUMN_NAME????????? NUM_DISTINCT??? DENSITY HISTOGRAM?????? SAMPLE_SIZE
-------------------- ------------ ---------- --------------- -----------
OWNER????????????????????????? 32 .000034247 TOP-FREQUENCY???????? 14600
--//DENSITY=1/SAMPLE_SIZE/2, 1/14600/2=.00003424657534246575正好符合.
5.現(xiàn)在看看執(zhí)行計(jì)劃:
SCOTT@test01p> select count(*) from t where owner='DVSYS';
? COUNT(*)
----------
?????? 352
SCOTT@test01p> @ dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID? 2at34f0zaqhzj, child number 0
-------------------------------------
select count(*) from t where owner='DVSYS'
Plan hash value: 2966233522
---------------------------------------------------------------------------------------------------------------------
| Id? | Operation????????? | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time?? | A-Rows |?? A-Time?? | Buffers |
---------------------------------------------------------------------------------------------------------------------
|?? 0 | SELECT STATEMENT?? |????? |????? 1 |??????? |?????? |?? 428 (100)|????????? |????? 1 |00:00:00.01 |??? 1544 |
|?? 1 |? SORT AGGREGATE??? |????? |????? 1 |????? 1 |???? 8 |??????????? |????????? |????? 1 |00:00:00.01 |??? 1544 |
|*? 2 |?? TABLE ACCESS FULL| T??? |????? 1 |??? 352 |? 2816 |?? 428?? (0)| 00:00:01 |??? 352 |00:00:00.01 |??? 1544 |
---------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
?? 1 - SEL$1
?? 2 - SEL$1 / T@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
?? 2 - filter("OWNER"='DVSYS')
--//可以發(fā)現(xiàn)e_rows=A-Rows.可以發(fā)現(xiàn)非常準(zhǔn)確.
--//查看直方圖信息.
SCOTT@test01p> select endpoint_number,endpoint_actual_value from user_tab_histograms where table_name ='T' and column_name ='OWNER' order by 1;
ENDPOINT_NUMBER ENDPOINT_ACTUAL_VALU
--------------- --------------------
?????????? 3405 APEX_040200
?????????? 3810 CTXSYS
?????????? 4162 DVSYS
?????????? 5981 MDSYS
?????????? 9138 ORDSYS
????????? 10185 PUBLIC
????????? 11127 SYS
????????? 11768 SYSTEM
????????? 12155 WMSYS
????????? 13140 XDB
10 rows selected.
--//4162-3810=352,可以發(fā)現(xiàn)正好符合.也就是popular值統(tǒng)計(jì)很正確.看看非popular值.
SCOTT@test01p> select count(*) from t where owner='DVSYS1';
? COUNT(*)
----------
???????? 0
Plan hash value: 2966233522?? ??? ?
---------------------------------------------------------------------------------------------------------------------
| Id? | Operation????????? | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time?? | A-Rows |?? A-Time?? | Buffers |
---------------------------------------------------------------------------------------------------------------------
|?? 0 | SELECT STATEMENT?? |????? |????? 1 |??????? |?????? |?? 428 (100)|????????? |????? 1 |00:00:00.01 |??? 1544 |
|?? 1 |? SORT AGGREGATE??? |????? |????? 1 |????? 1 |???? 8 |??????????? |????????? |????? 1 |00:00:00.01 |??? 1544 |
|*? 2 |?? TABLE ACCESS FULL| T??? |????? 1 |???? 66 |?? 528 |?? 428?? (0)| 00:00:01 |????? 0 |00:00:00.01 |??? 1544 |
---------------------------------------------------------------------------------------------------------------------
SCOTT@test01p> select count(*) from t where owner='SCOTT';
? COUNT(*)
----------
??????? 96
Plan hash value: 2966233522
---------------------------------------------------------------------------------------------------------------------
| Id? | Operation????????? | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time?? | A-Rows |?? A-Time?? | Buffers |
---------------------------------------------------------------------------------------------------------------------
|?? 0 | SELECT STATEMENT?? |????? |????? 1 |??????? |?????? |?? 428 (100)|????????? |????? 1 |00:00:00.01 |??? 1544 |
|?? 1 |? SORT AGGREGATE??? |????? |????? 1 |????? 1 |???? 8 |??????????? |????????? |????? 1 |00:00:00.01 |??? 1544 |
|*? 2 |?? TABLE ACCESS FULL| T??? |????? 1 |???? 66 |?? 528 |?? 428?? (0)| 00:00:01 |???? 96 |00:00:00.01 |??? 1544 |
---------------------------------------------------------------------------------------------------------------------
--//可以發(fā)現(xiàn)估計(jì)E-Rows如何計(jì)算的呢?
6.做10053跟蹤:
D:\tools\sqllaji>cat 10053x.sql
execute dbms_sqldiag.dump_trace(p_sql_id=>'&1',p_child_number=>&2,p_component=>'Compiler',p_file_id=>'&&1');
SCOTT@test01p> @ 10053x f31kz63ksu1tc 0
PL/SQL procedure successfully completed.
***************************************
SINGLE TABLE ACCESS PATH
? Single Table Cardinality Estimation for T[T]
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
? Column (#1):
??? NewDensity:0.004545, OldDensity:0.000034 BktCnt:13140.000000, PopBktCnt:13140.000000, PopValCnt:10, NDV:32
? Column (#1): OWNER(VARCHAR2)
??? AvgLen: 8 NDV: 32 Nulls: 0 Density: 0.000000
??? Histogram: Top-Freq? #Bkts: 13140? UncompBkts: 13140? EndPtVals: 10? ActualVal: yes
? Table: T? Alias: T
??? Card: Original: 14600.000000? Rounded: 66? Computed: 66.36? Non Adjusted: 66.36
? Access Path: TableScan
??? Cost:? 428.36? Resp: 428.36? Degree: 0
????? Cost_io: 428.00? Cost_cpu: 14122025
????? Resp_io: 428.00? Resp_cpu: 14122025
? Best:: AccessPath: TableScan
???????? Cost: 428.36? Degree: 1? Resp: 428.36? Card: 66.36? Bytes: 0
?? ?check parallelism for statement[<unnamed>]
kkfdtParallel: parallel is possible (no statement type restrictions)
?? ?kkfdPaForcePrm: dop:1 ()
?? ? use dictionary DOP(1) on table
kkfdPaPrm:- The table : 106380
kkfdPaPrm:DOP = 1 (computed from hint/dictionary/autodop)
kkfdiPaPrm: dop:1 serial(?)
***************************************
--//使用 NewDensity:0.004545.
?BktCnt:13140.000000, PopBktCnt:13140.000000 => 對(duì)應(yīng)就是前10個(gè)流行值的總和.
--//非流行值的數(shù)量: 14600-13140=1460
--//非流行值的桶數(shù)量: 32-10=22
--//非流行值的數(shù)量/非流行值的桶數(shù)量 1460/22=66.36363636363636363636,四舍五入66,正好符合執(zhí)行計(jì)劃的推斷.
--//NewDensity的計(jì)算 =1460/14600/22=.00454545454545454545,非常接近.
---恢復(fù)內(nèi)容結(jié)束---
轉(zhuǎn)載于:https://www.cnblogs.com/lfree/p/6947417.html
總結(jié)
以上是生活随笔為你收集整理的[20170603]12c Top Frequency histogram.txt的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: redis 基础操作学习
- 下一篇: HttpRequest中常见的四种Con