一、?環境信息: Oracle 11.1.0.7.0 - 64bit,單機,AIX 6.1
?
二、?EXPDP性能診斷 1、?expdp腳本如下: cat? exp_SCOTT_20180511.par?
userid="/?as?sysdba"
directory=temp_dump
dumpfile=exp_SCOTT_20180511_%u.dmp
logfile=exp_SCOTT_20180511.log
cluster=n
parallel=8
compression=all
schemas=SCOTT
exclude=statistics
2、導出整個schema的時間大約為13個小時,數據量133GB:
sed?-n?-e?'2p'?-e?'/method:/p'?-e?'$p'?exp_SCOTT_2018051.log
Export:?Release?11.1.0.7.0?-?64bit?Production?on?Friday,?11?May,?2018?15:16:44
Total?estimation?using?BLOCKS?method:?133.3?GB
Job?"SYS"."SYS_EXPORT_SCHEMA_01"?successfully?completed?at?04:13:05
3、?expdp 導出過程10046 Trace追蹤診斷: 因該庫需要使用expdp按schema導出導入方式進行整庫遷移,按照一個schema導出需要13個小時的速度是接受不了的,雖然該機器配置不是非常的好。接下來導出另外一個schema,并啟用10046 trace追蹤:
--查詢正在執行的expdp相關會話信息:
set?lines?150?pages?100?numwidth?7?
col?program?for?a38?
col?username?for?a10?
col?spid?for?a7?
select?to_char(sysdate,'YYYY-MM-DD?HH24:MI:SS')?"DATE",?s.program,?s.sid,???s.status,?s.username,?d.job_name,?p.spid,?s.serial#,?p.pid???from?v$session?s,?v$process?p,?dba_datapump_sessions?d??where?p.addr=s.paddr?and?s.saddr=d.saddr;DATE????????????????PROGRAM????????????????????????SID?STATUS???USERNAME???JOB_NAME??????????????SPID????SERIAL#?????PID
-------------------?--------------------------?-------?--------?----------?---------------------?-------?-------?-------
2018-05-14?14:46:10?)????????2814?ACTIVE???SYS????????SYS_EXPORT_SCHEMA_01??488648?????6557?????467
2018-05-14?14:46:10?)????????3243?ACTIVE???SYS????????SYS_EXPORT_SCHEMA_01??124224????19566?????466
2018-05-14?14:46:10?V1-V3)???2801?ACTIVE???SYS????????SYS_EXPORT_SCHEMA_01??340810????61833?????464--查詢對應會話是否有其它會話阻塞:
select?sid,serial#,event,blocking_session,seconds_in_wait,state,last_call_et?from?v$session?where?serial#?in?(6557,19566,61833);
SID?SERIAL#?EVENT???????????????????????????????????BLOCKING_SESSION?SECONDS_IN_WAIT?STATE???????????????LAST_CALL_ET
-------?-------?--------------------------------------?----------------?---------------?-------------------?------------???
2801???61833?wait?for?unread?message?on?broadcast?channel????????????????????????????1?WAITING????????????????????46636???
2814????6557?db?file?sequential?read?????????????????????????????????????????????????0?WAITED?SHORT?TIME??????????46636???
3243???19566?wait?for?unread?message?on?broadcast?channel????????????????????????????0?WAITING????????????????????46638--對相關進程進行10046?trace追蹤:
oradebug?setospid?124224
oradebug?unlimit
oradebug?event?10046?trace?name?context?forever,?level?12;
oradebug?tracefile_name
--oradebug?event?10046?trace?name?context?off;oradebug?setospid?488648
oradebug?unlimit
oradebug?event?10046?trace?name?context?forever,?level?12;
oradebug?tracefile_name
--oradebug?event?10046?trace?name?context?off;
--已上trace時間長一點,會抓取較大的信息
同時查看相關trace文件的的信息:
grep?"nam='?WAIT?#13:?nam='db?file?sequential?read'"?TEST_dw01_488648.trc|awk?'{print?$12}'|sort?-u
obj#=-1
obj#=0
obj#=14
...
obj#=579
obj#=596
obj#=9
發現這些在等待的這些對象基本都是SYS用戶下:
select?obj#,owner#,name,namespace?from?obj$?where?obj#?in?(-1,0,14,3,36,559,563,566,567,575,578,579,596,9);OBJ#?????OWNER#?NAME????????????????????????????NAMESPACE
----------?----------?------------------------------?----------?????????
3??????????0?I_OBJ#??????????????????????????????????4?????????
9??????????0?I_FILE#_BLOCK#??????????????????????????4????????
14?????????0?SEG$????????????????????????????????????1????????
36?????????0?I_OBJ1??????????????????????????????????4???????
559????????0?PARTOBJ$????????????????????????????????1???????
563????????0?TABPART$????????????????????????????????1???????
566????????0?I_TABPART_BOPART$???????????????????????4???????
567????????0?I_TABPART_OBJ$??????????????????????????4???????
575????????0?TABSUBPART$?????????????????????????????1???????
578????????0?I_TABSUBPART_POBJSUBPART$???????????????4???????
579????????0?I_TABSUBPART$_OBJ$??????????????????????4???????
596????????0?LOBFRAG$????????????????????????????????1
同時查看expdp導出日志,是在估算大小:
Total estimation using BLOCKS method:xxx.xx GB
?
初步診斷為,expdp在導出時,要估算導出大小,需要查詢sys用戶下的基表,很有可能是這些查詢的SQL出現了性能問題。
--抽查SEG$表的統計信息:
OWNER??????PARTNAME??????NROWS?????BLOCKS?AVGSPC?CCNT?ROWLEN????SSIZE?ANADATE
----------?----------?--------?----------?------?----?------?--------?-------------------
SYS???????????????????????4531????????132??????0????0?????66?????4531?2013-07-19?01:10:49
發現該表統計信息很久沒有更新,接連查詢其它表的統計信息,發現都是很久沒有更新統計信息。于是做了如下操作,先將相關表的統計信息更新到最新:
--收集整個sys用戶的統計信息:
exec?dbms_stats.gather_schema_stats(ownname?=>?'SYS',degree=>8);
--收集數據字典的統計信息:
exec?dbms_stats.gather_dictionary_stats(options?=>'gather?auto',degree=>16);
--收集fixed表統計信息:
exec?dbms_stats.gather_fixed_objects_stats;
然后中斷已經在導出的進程,等統計信息收集完成后,再執行導出,測試導出速度是否有提升。發現導出速度并沒有明顯提升,看來真正的問題并沒有找到,再次進行trace追蹤。
tkprof?TEST_dm00_124224.trc?tkprof_TEST_dm00_124224.out?waits=y?sort=exeelatkprof?TEST_dw01_488648.trc?tkprof_TEST_dw01_488648.out?waits=y?sort=exeela
然后發現如下SQL異常:
SQL?ID:?aujcr6t1u8u62
Plan?Hash:?3731411368
SELECT?LVL?FROM?SYS.KU$_REF_PAR_LEVEL_VIEW?WHERE?OBJ#?=?:B1
call?????count???????cpu????elapsed???????disk??????query????current????????rows
-------?------??--------?----------?----------?----------?----------??----------
Parse????????0??????0.00???????0.00??????????0??????????0??????????0???????????0
Execute?160363??????1.87??????32.76??????????0??????????0??????????0???????????0
Fetch???160362??13047.39???13496.19??????????0?1008676980??????????0???????????0
-------?------??--------?----------?----------?----------?----------??----------
total???320725??13049.26???13528.96??????????0?1008676980??????????0???????????0
Misses?in?library?cache?during?parse:?0
Elapsed?times?include?waiting?on?following?events:??Event?waited?on?????????????????????????????
Times???Max.?Wait??Total?Waited??
----------------------------------------???Waited??----------??------------??
db?file?sequential?read????????????????????169141????????0.60????????453.81??
db?file?scattered?read??????????????????????????3????????0.01??????????0.02OVERALL?TOTALS?FOR?ALL?RECURSIVE?STATEMENTS
call?????count???????cpu????elapsed???????disk??????query????current????????rows
-------?------??--------?----------?----------?----------?----------??----------
Parse???167759??????4.95???????5.86??????????0??????????0??????????0???????????0
Execute?648851????123.59?????151.44??????????0??????????0??????????0???????????0
Fetch???648850??13057.99???13507.29??????????0?1010718198??????????0??????488487
-------?------??--------?----------?----------?----------?----------??----------
total???1465460??13186.53???13664.61??????????0?1010718198??????????0??????488487
查詢該SQL歷史執行的情況:
set?lines?180?pages?9999
col?execs?for?999,999,999
col?avg_etime?for?999,999.999
col?avg_lio?for?999,999,999.9
col?begin_interval_time?for?a30
col?node?for?99999
break?on?plan_hash_value?on?startup_time?skip?1
select?ss.snap_id,ss.instance_number?node,begin_interval_time,sql_id,plan_hash_value,nvl(executions_delta,?0)?execs,(elapsed_time_delta?/decode(nvl(executions_delta,?0),?0,?1,?executions_delta))?/?1000000?avg_etime,(buffer_gets_delta?/decode(nvl(buffer_gets_delta,?0),?0,?1,?executions_delta))?avg_lio,(disk_reads_delta?/decode(nvl(buffer_gets_delta,?0),?0,?1,?executions_delta))?avg_piofrom?DBA_HIST_SQLSTAT?S,?DBA_HIST_SNAPSHOT?SSwhere?sql_id?=?to_char('aujcr6t1u8u62')and?ss.snap_id?=?S.snap_idand?ss.instance_number?=?S.instance_numberand?executions_delta?>?0order?by?1,?2,?3;--可以看到,該SQL單次執行時間為0.081秒,30分鐘內只能執行20萬次左右SNAP_ID???NODE?BEGIN_INTERVAL_TIME????SQL_ID????????PLAN_HASH_VALUE????????EXECS????AVG_ETIME????????AVG_LIO????AVG_PIO
--------?-----?---------------------?-------------?---------------?------------?------------?--------------?----------
54453??1?11-MAY-18?03.00.27.869?PM???aujcr6t1u8u62??????3731411368????????7,607?????????.081????????6,289.2?.816221901
54454??1?11-MAY-18?03.30.30.087?PM???aujcr6t1u8u62???????????????????????20,606?????????.081????????6,290.0??????????0
54455??1?11-MAY-18?04.00.32.259?PM???aujcr6t1u8u62???????????????????????20,736?????????.081????????6,290.0??????????0
因為是expdp導出執行的sql,對于這樣的sql有一個好處,就是可以對比相同版本的其它庫的歷史執行計劃。 ?
造成相同執行計劃在不同庫性能差異的這種情況的原因有很多,比如優化器的相關參數設置不同、機器配置不同,庫本身的性能較好或庫本身很空閑等。這里不對這些原因進行分析,先分析一下該SQL,看看該SQL的執行計劃是否合理:
--獲取該sql的綁定變量:
SQL>?set?linesize?200?pagesize?200
SQL>?col?name?for?a30
SQL>?col?datatype_string?for?a20
SQL>?col?value_string?for?a20
SQL>?select?sql_id,?name,?datatype_string,?last_captured,?value_string??from?v$sql_bind_capture2?where?sql_id?=?'aujcr6t1u8u62'?order?by?last_captured,?position;
SQL_ID????????NAME???????????????????????????DATATYPE_STRING??????LAST_CAPTURED???????VALUE_STRING
-------------?------------------------------?--------------------?-------------------?--------------------
aujcr6t1u8u62?:B1????????????????????????????NUMBER???????????????2018-05-15?16:49:30?801250
aujcr6t1u8u62?:B1????????????????????????????NUMBER???????????????2018-05-16?10:39:08?885287--獲取真實的執行計劃:
set?pages?9999?line?333
var?B1?number;
exec?:B1?:=?885287;
SELECT?LVL?FROM?SYS.KU$_REF_PAR_LEVEL_VIEW?WHERE?OBJ#?=?:B1;
select?*?from?table(dbms_xplan.display_cursor(null,0));
?
然后與該trace追蹤到的執行計劃進行對比:
?
?
發現該SQL使用的優化器模式是Rule Based Optimizer(簡稱RBO),是基于數據字典的優化,它根據數據字典查詢有無可用的索引,如果有則使用,否則不使用,不同的訪問方法有預定好的優先級,選擇優先級高的執行方法。但在這里不知道它為什么沒有選擇走CDEF$的I_CDEF3索引,而是走了全表掃描。這也可以解釋為什么收集完sys用戶統計信息之后,執行計劃不變的原因了。那問題來了,是使用CBO做索引掃描的執行效率高,還是現在RBO模式的效率高呢?經測試,使用Cost Based Optimizer(簡稱CBO)的效率高,也就是通過變量信息產生的真正執行計劃。
SQL>?SELECT?/*+?rule?*/?LVL?FROM?SYS.KU$_REF_PAR_LEVEL_VIEW?WHERE?OBJ#?=?:B1;
no?rows?selected
Elapsed:?00:00:00.10
SQL>?SELECT?LVL?FROM?SYS.KU$_REF_PAR_LEVEL_VIEW?WHERE?OBJ#?=?:B1;
no?rows?selected
Elapsed:?00:00:00.00
既然問題已經找到,接下來就好辦了,使用coe_xfr_sql_profile.sql直接綁定較優的執行計劃。
?
綁定后的執行計劃效果如下:
綁定之后的提升效果還是很明顯的,單次平均執行時間已經下降到0.002秒,平均邏輯讀也成倍的下降。這里為什么說很明顯呢?可能看起來0.081秒的執行效率已經很高了,但是這個sql的是需要很次執行的,在expdp導出schema的過程中,需要查詢整個schema的所有對象。
優化后的導出效果,1個多小時就完成了導出:
sed?-n?-e?'2p'?-e?'$p'?exp_SCOTT_20180515.log
Export:?Release?11.1.0.7.0?-?64bit?Production?on?Tuesday,?15?May,?2018?16:51:22
Job?"SYS"."SYS_EXPORT_SCHEMA_01"?successfully?completed?at?18:16:51
?
三、?總結 1、?對于expdp/impdp的性能影響因素很多,比如庫整體性能,機器配置情況,存儲I/O資源等。 2、?有關expdp/impdp性能診斷請參考: SRDC - 數據泵導入(IMPDP)性能問題的診斷收集 (Document 2365615.1) SRDC - 數據泵導出性能問題的診斷收集 (Document 2365568.1) 針對數據泵導出 (expdp) 和導入 (impdp)工具性能降低問題的檢查表 (Document 1549185.1)
?
?
?
轉載于:https://blog.51cto.com/wyzwl/2117170
總結
以上是生活随笔 為你收集整理的一次EXPDP数据泵性能问题诊断和调优 的全部內容,希望文章能夠幫你解決所遇到的問題。
如果覺得生活随笔 網站內容還不錯,歡迎將生活随笔 推薦給好友。