Resolving an SAP PI Cluster System Failure
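The full report has already been delivered to the customer; here is a summary:
When the SAP PI service was started on both node1 and node2 of the MSCS cluster, MSCS failed: every PI resource group kept failing over back and forth between node1 and node2, dragging the Oracle OFS resources and the MSCS resources along with it. PI has a large memory footprint (30 GB), and after about 8 of these automatic failovers the public NIC went down and up several times and then crashed.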
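MSCS failover and OFS resource failover worked fine on their own, so I checked the MSCS cluster configuration parameters: no problems.
Checked the operating system for patches known to interfere with MSCS: none found.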
Checked the network settings and the NIC properties (TOE offload, RSS, speed): no problems.
Checked the Scalable Networking (receive-side scaling) fix for Windows Server 2003 R2 SP2: the patch was already installed, no problems.
Checked the gigabit switch environment for the public and private networks: no problems.
What I finally found:
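The HP NC375i NICs on node1 and node2 both ran the latest driver, version 556, but the NIC firmware was version 534 on node1 and version 527 on node2. After checking, I confirmed that firmware 527 does not match driver 556. Problem found.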
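A check like this could have caught the difference before the two machines were clustered. The following is only a minimal sketch: the `inventory` dictionary, its attribute names, and the `find_mismatches` helper are hypothetical, and the values are typed in by hand from this post (driver 556, firmware versions as printed by the flash tool), not collected automatically from the servers.

```python
from collections import defaultdict


def find_mismatches(inventory):
    """Return {attribute: {value: [nodes]}} for attributes that differ across nodes."""
    by_attr = defaultdict(lambda: defaultdict(list))
    for node, attrs in inventory.items():
        for name, value in attrs.items():
            by_attr[name][value].append(node)
    # Keep only attributes that have more than one distinct value.
    return {name: dict(values) for name, values in by_attr.items() if len(values) > 1}


# Hypothetical hand-entered inventory; versions are the ones quoted in this post.
inventory = {
    "node1": {"nic_driver": "556", "nic_firmware": "4.0.534"},
    "node2": {"nic_driver": "556", "nic_firmware": "4.0.527"},
}

mismatches = find_mismatches(inventory)
for attr, values in mismatches.items():
    print(f"MISMATCH {attr}: {values}")
```

The sketch only reports that the nodes differ; deciding which firmware actually matches a given driver still requires the vendor's release notes.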
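The fix: since the driver was already the latest version, there was no need to reinstall it; only the firmware had to be flashed.
On machine 1 (node1):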
C:\SWSetup\SP50817>nxflash_x64.exe -i private --all
0/8 - Init
*** Currently in flash ***
Board Type       : HP NC375i Integrated Quad Port Multifunction Gigabit Server Adapter
Firmware Version : 4.0.534
MAC Address 0    : 68:B5:99:C4:B2:B8
MAC Address 1    : 68:B5:99:C4:B2:B9
MAC Address 2    : 68:B5:99:C4:B2:BA
MAC Address 3    : 68:B5:99:C4:B2:BB
Serial Number    :
NIC binary romimage found in C:\SWSetup\SP50817
Rom Image        : C:\SWSetup\SP50817\phantom_romimage
1/8 - Extracting Romimage
Firmware version From Board: 4.0.534
Firmware version From Romimage: 4.0.539
WARNING: This operation will take the NIC offline.
Do you wish to upgrade? (Y/N) y
Disabling devices
Disabling devices
Disabling devices
Disabling devices
Driver Loaded in Quiesce mode
2/8 - Restoring License
100% - DONE
100% - DONE
No vNIC property area in romimage
No VPD area in romimage
3/8 - Calculating MD5
100% - DONE
4/8 - Backing up current flash
100% - DONE
Backup file : "flashbackup__v4.0.534_Sat-Oct-13-22-06-50-2012" - completed successfully.
5/8 - Updating flash
WARNING: This is a very sensitive operation.
Do not interrupt until operation is complete.
setting up the flash_write
100% - DONE
6/8 - Verifying Flash MD5
Flashing completed successfully.
Reboot system for firmware to take effect
Enabling devices
Enabling devices
Enabling devices
Enabling devices
Driver Loaded in Normal mode
7/8 - Performing cleanup
8/8 - Finished
C:\SWSetup\SP50817>
On machine 2 (node2):
C:\SWSetup\SP50817>nxflash_x64.exe -i private --all
0/8 - Init
*** Currently in flash ***
Board Type       : HP NC375i Integrated Quad Port Multifunction Gigabit Server Adapter
Firmware Version : 4.0.527
MAC Address 0    : 68:B5:99:B3:3C:58
MAC Address 1    : 68:B5:99:B3:3C:59
MAC Address 2    : 68:B5:99:B3:3C:5A
MAC Address 3    : 68:B5:99:B3:3C:5B
Serial Number    :
NIC binary romimage found in C:\SWSetup\SP50817
Rom Image        : C:\SWSetup\SP50817\phantom_romimage
1/8 - Extracting Romimage
Firmware version From Board: 4.0.527
Firmware version From Romimage: 4.0.539
WARNING: This operation will take the NIC offline.
Do you wish to upgrade? (Y/N) y
Disabling devices
Disabling devices
Disabling devices
Disabling devices
Driver Loaded in Quiesce mode
2/8 - Restoring License
100% - DONE
100% - DONE
No vNIC property area in romimage
No VPD area in romimage
3/8 - Calculating MD5
100% - DONE
4/8 - Backing up current flash
100% - DONE
Backup file : "flashbackup__v4.0.527_Sat-Oct-13-21-06-28-2012" - completed successfully.
5/8 - Updating flash
WARNING: This is a very sensitive operation.
Do not interrupt until operation is complete.
setting up the flash_write
100% - DONE
6/8 - Verifying Flash MD5
Flashing completed successfully.
Reboot system for firmware to take effect
Enabling devices
Enabling devices
Enabling devices
Enabling devices
Driver Loaded in Normal mode
7/8 - Performing cleanup
8/8 - Finished
C:\SWSetup\SP50817>
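When several nodes have to be flashed, it helps to pull the version lines out of each saved transcript instead of reading them by eye. The sketch below assumes the tool's output was saved as text and looks like the two transcripts above; `parse_flash_log` and `version_tuple` are hypothetical helper names, and the patterns match only the lines seen in this post, not a documented output format.

```python
import re

# Patterns copied from the transcript lines above (assumed stable, not documented).
BOARD_RE = re.compile(r"Firmware version From Board:\s*(\S+)")
IMAGE_RE = re.compile(r"Firmware version From Romimage:\s*(\S+)")
SUCCESS_LINE = "Flashing completed successfully."


def parse_flash_log(text):
    """Extract the old and new firmware versions and the success marker from a transcript."""
    board = BOARD_RE.search(text)
    image = IMAGE_RE.search(text)
    return {
        "board": board.group(1) if board else None,
        "romimage": image.group(1) if image else None,
        "success": SUCCESS_LINE in text,
    }


def version_tuple(version):
    """Turn '4.0.527' into (4, 0, 527) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Lines taken from the node2 transcript above.
sample = """\
1/8 - Extracting Romimage
Firmware version From Board: 4.0.527
Firmware version From Romimage: 4.0.539
6/8 - Verifying Flash MD5
Flashing completed successfully.
Reboot system for firmware to take effect
"""

result = parse_flash_log(sample)
print(result)
print("upgrade:", version_tuple(result["board"]) < version_tuple(result["romimage"]))
```

Comparing the versions as integer tuples avoids the string-comparison trap where "4.0.1000" would sort before "4.0.527".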
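Problem solved!
Later I confirmed with purchasing that the two machines were bought half a year apart and did not come from the same batch. Machine 2 was the development machine; machine 1, the production machine, was purchased half a year later, and during implementation the development and production machines were then combined into the MSCS cluster for PI.
Evidently whoever implemented MSCS was careless and unreliable. A Windows enterprise environment demands more attention to detail and a higher level of technical discipline, because many errors cannot be solved by digging into the kernel: I cannot read a crash dump every time something goes wrong, or pull out windbg and start debugging. That is, of course, the last resort.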