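OCFS2 + ASM RAC Installation Guide
For the concepts and principles behind RAC, see my blog:
http://blog.csdn.net/tianlesoftware/archive/2010/02/27/5331067.aspx

This walkthrough builds the cluster on OCFS2 + ASM. A raw-device setup will be written up later. In production, raw + ASM is still the more common choice.
Test platform: Oracle 10gR2 RAC + RHEL 4.0 + VMware GSX 3.2.0

Installation steps:
1. Pre-installation preparation and OS installation/configuration
2. Install Oracle 10gR2 Clusterware
3. Install Oracle 10gR2 database
4. Configure the listener with netca
5. Create the ASM instance
6. Create the database with dbca
7. Check the RAC status
8. Uninstall RAC

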
I. Pre-installation preparation and OS installation/configuration
Host OS preparation

1. Download Oracle 10gR2 for x86 Linux from Oracle OTN. There are two zip files, one for Clusterware and one for the database; the companion CD is optional.
2. Have RHEL 5 (x86) ready.
3. Find out your Linux kernel version.
4. Download ocfs2, ocfs2-tools, ocfs2console, ASMLib and ASMLib support from Oracle OTN. These packages are built for a specific kernel, so make sure you download the ones that match yours.
http://oss.oracle.com/projects/ocfs2/source.html
http://oss.oracle.com/projects/ocfs2/files/
http://oss.oracle.com/projects/ocfs2-tools/files/
Note: ocfs2-tools and ocfs2console are downloaded from the same page.

http://www.oracle.com/technology/tech/linux/asmlib/index.html
This page has the download links; pick the right CPU type. The ASMLib and support packages are on the same page.

5. VMware GSX 3.2.0 for Linux

Differences between VMware Workstation, GSX Server and ESX:
http://blog.csdn.net/tianlesoftware/archive/2010/02/22/5316767.aspx


OS installation

1. In the VMware console, create a Red Hat 4 virtual machine named rac1 with 600 MB of memory and a 12 GB disk.
2. Once the virtual machine has been created, add a second NIC.
3. In GSX, create the shared disks with vdiskmanager.

OCFS2 is used for the OCR and voting disk, ASM for the Oracle data.

From a DOS prompt, change to the VMware installation directory and run the following commands.

Create a pre-allocated disk on the LSI controller for OCFS2 (OCR + CRS voting), 500 MB in the command below:
vmware-vdiskmanager.exe -c -s 500Mb -a lsilogic -t 2 E:/VM/RACShare/ocfs2_ocr_crs.vmdk

Create pre-allocated disks on the LSI controller for the Oracle data and the flash recovery area:
vmware-vdiskmanager.exe -c -s 4096Mb -a lsilogic -t 2 E:/VM/RACShare/asm_data.vmdk
vmware-vdiskmanager.exe -c -s 2048Mb -a lsilogic -t 2 E:/VM/RACShare/asm_recovery.vmdk

When this is done, the share directory contains the .vmdk files you just created.

4. Go to the rac1 directory, open rac1.vmx, and append the following blocks at the very end of the file (they must come last).

scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "virtual"

This block enables the scsi1 controller, sets its bus sharing to virtual, and sets the controller type to lsilogic.

Then add, in order:

scsi1:1.present = "TRUE"
scsi1:1.mode = "independent-persistent"
scsi1:1.filename = "E:/VM/RACShare/ocfs2_ocr_crs.vmdk"
scsi1:1.deviceType = "plainDisk"

scsi1:2.present = "TRUE"
scsi1:2.mode = "independent-persistent"
scsi1:2.filename = "E:/VM/RACShare/asm_data.vmdk"
scsi1:2.deviceType = "plainDisk"

scsi1:3.present = "TRUE"
scsi1:3.mode = "independent-persistent"
scsi1:3.filename = "E:/VM/RACShare/asm_recovery.vmdk"
scsi1:3.deviceType = "plainDisk"

Finally add this:
disk.locking = "false"
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.DataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"

This block defines how VMware handles the shared disks.
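
The three per-disk blocks above follow one pattern, so they can be generated rather than typed by hand. A minimal sketch, assuming a POSIX shell; the function name emit_shared_disk is made up for illustration and is not a VMware tool:

```shell
# Print the .vmx stanza for one shared disk on the scsi1 bus.
# $1 = unit number on scsi1, $2 = path to the .vmdk file
emit_shared_disk() {
    unit=$1
    vmdk=$2
    printf 'scsi1:%s.present = "TRUE"\n' "$unit"
    printf 'scsi1:%s.mode = "independent-persistent"\n' "$unit"
    printf 'scsi1:%s.filename = "%s"\n' "$unit" "$vmdk"
    printf 'scsi1:%s.deviceType = "plainDisk"\n' "$unit"
}
```

For example, emit_shared_disk 1 E:/VM/RACShare/ocfs2_ocr_crs.vmdk prints the first block; the output still has to be pasted into the .vmx file of each node by hand.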
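
After saving and exiting, reopen the VMware console. The configuration of both guest OSes should now list these disks.


5. Now install the guest OS. For convenience, install all packages to avoid trouble later.

Copy the rac1 node to rac2, open the copy in VMware with a new ID, and change its IP address and hostname; node 2 is then ready. Copying saves installing the OS twice.

6. Configure static IPs for both NICs, plus hostname, DNS, gateway and time server (NTP).
/etc/sysconfig/network-scripts/ifcfg-eth0
/etc/sysconfig/network-scripts/ifcfg-eth1

Set the hostname, IP and gateway. The default gateway must be set, otherwise vipca reports an error.

[root@rac1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.85.10.119
NETMASK=255.255.255.0
GATEWAY=10.85.10.253

Change the hostname:
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=rac1
Restart the network service for the changes to take effect:
/etc/rc.d/init.d/network restart
Check DNS:
cat /etc/resolv.conf

Time synchronization:
1. On rac1, as root, run
# chkconfig time on     # start automatically at boot

2. On rac2, add a cron job that synchronizes the time with rac1 every minute.
[root@rac2 ~]# crontab -l
*/1 * * * * rdate -s 10.85.10.119

RAC is sensitive to time differences between nodes. If the clocks are out of sync, the Clusterware installation reports errors, and so does vipca. For details on time synchronization see my blog:
Linux time synchronization configuration
http://blog.csdn.net/tianlesoftware/archive/2010/02/21/5315587.aspx

Changing the date and time on Linux
http://blog.csdn.net/tianlesoftware/archive/2009/11/13/4808096.aspx

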
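7. After the installation, log in to the OS and edit the name resolution file /etc/hosts

as follows.
Note: the machine's hostname must be the same as its public name.

127.0.0.1 localhost (this line must look exactly like this)

10.85.10.119 rac1
10.85.10.121 rac2

192.168.1.119 rac1-priv
192.168.1.121 rac2-priv

10.85.10.122 rac1-vip
10.85.10.123 rac2-vip

Both nodes must have identical entries.

After editing, confirm that every entry in hosts is correct (ping each name).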
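
Before pinging, it helps to confirm that no name was left out of the file. A minimal sketch, assuming a POSIX shell with sed and awk; missing_hosts is a hypothetical helper name, not a standard command:

```shell
# Report which of the required RAC host names are missing from a hosts file.
# $1 = hosts file (e.g. /etc/hosts); remaining arguments = names to look for.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name as a whole field after the IP address.
        if ! sed 's/#.*//' "$hosts_file" |
            awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }'
        then
            echo "$name"
        fi
    done
}
```

Running missing_hosts /etc/hosts rac1 rac2 rac1-priv rac2-priv rac1-vip rac2-vip on each node should print nothing when all six names are present.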


8. Set up user equivalence
Once user equivalence is in place, the oracle user can move between the two nodes without a password. RAC management depends on this; if equivalence is not configured correctly, the RAC installation will not succeed.

On both nodes, add the dba and oinstall groups and create the oracle user with primary group oinstall and supplementary groups dba and disk.

# groupadd oinstall
# groupadd dba
# useradd -g oinstall -G dba oracle
# passwd oracle

Set up equivalence

On rac1:
[root@rac1 opt]# su - oracle
[oracle@rac1 ~]$ mkdir ~/.ssh
[oracle@rac1 ~]$ chmod 700 ~/.ssh
[oracle@rac1 ~]$ ssh-keygen -t rsa
[oracle@rac1 ~]$ ssh-keygen -t dsa

On rac2:
[root@rac2 opt]# su - oracle
[oracle@rac2 ~]$ mkdir ~/.ssh
[oracle@rac2 ~]$ chmod 700 ~/.ssh
[oracle@rac2 ~]$ ssh-keygen -t rsa
[oracle@rac2 ~]$ ssh-keygen -t dsa

Switch back to rac1 and continue:
[oracle@rac1 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[oracle@rac1 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Tip: the following commands prompt for the oracle password on rac2. Enter it as prompted; if a command fails, run it again.
[oracle@rac1 ~]$ scp ~/.ssh/authorized_keys rac2:~/.ssh/authorized_keys
[oracle@rac1 ~]$ ssh rac2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[oracle@rac1 ~]$ ssh rac2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[oracle@rac1 ~]$ scp ~/.ssh/authorized_keys rac2:~/.ssh/authorized_keys

Make sure each node has the key information of both nodes.

Run the following on each machine to confirm that no password is requested any more:
[oracle@rac1 ~]$ ssh rac1 date
[oracle@rac1 ~]$ ssh rac2 date
[oracle@rac1 ~]$ ssh rac1-priv date
[oracle@rac1 ~]$ ssh rac2-priv date
Switch to rac2 and run:
[oracle@rac2 ~]$ ssh rac1 date
[oracle@rac2 ~]$ ssh rac2 date
[oracle@rac2 ~]$ ssh rac1-priv date
[oracle@rac2 ~]$ ssh rac2-priv date
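
The eight checks above can be wrapped in a loop. A minimal sketch, assuming OpenSSH (the BatchMode option makes ssh fail instead of prompting for a password); check_equivalence and the SSH_BIN override are hypothetical names introduced here so the loop can be dry-run:

```shell
# Try a password-less "date" on every listed host and print the hosts that fail.
# SSH_BIN is a hypothetical override used only for dry runs; it defaults to ssh.
check_equivalence() {
    for host in "$@"; do
        if ! ${SSH_BIN:-ssh} -o BatchMode=yes "$host" date >/dev/null 2>&1; then
            echo "$host"
        fi
    done
}
```

Run check_equivalence rac1 rac2 rac1-priv rac2-priv as the oracle user on each node; any name it prints still asks for a password or cannot be reached. The very first connection to a host may also need its host key accepted interactively.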


9. Create the directories on every node


[root@rac2 ~]# mkdir -p /u01/app/oracle
[root@rac2 ~]# chown -R oracle:oinstall /u01
[root@rac2 ~]# chmod -R 777 /u01
This directory is for the Oracle and Clusterware software.

[root@rac2 ~]# mkdir -p /u02/oradata/orcl
[root@rac2 ~]# chown -R oracle:oinstall /u02
[root@rac2 ~]# chmod -R 777 /u02
This directory is for OCFS2, which holds the OCR and CRS voting files.

10. Edit /etc/sysctl.conf and add these kernel parameters
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=262144
net.core.wmem_max=262144

kernel.shmall = 78643200
kernel.shmmax = 314572800
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000

# sysctl -p     # apply immediately

kernel.shmall is physical memory divided by the page size;
kernel.shmmax is half of physical memory;
fs.file-max is 512 times the number of processes (for 128 processes, 65536);
net.ipv4.ip_local_port_range, net.core.rmem_default and net.core.rmem_max are set differently from the official documentation; they follow the latest requirements in Metalink note 343431.1;
net.ipv4.tcp_rmem and net.ipv4.tcp_wmem normally do not need to be set, except for setups with heavy network traffic such as Data Guard or Streams;
set the remaining parameters as the official documentation requires.
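
The three rules of thumb above are plain integer arithmetic. A minimal sketch in POSIX shell; the function names are made up for illustration, and the 600 MB figure is the memory size given to the virtual machine earlier:

```shell
# kernel.shmall: $1 = physical memory in bytes, $2 = page size in bytes
calc_shmall() { echo $(( $1 / $2 )); }

# kernel.shmmax: $1 = physical memory in bytes
calc_shmmax() { echo $(( $1 / 2 )); }

# fs.file-max: $1 = number of processes
calc_file_max() { echo $(( 512 * $1 )); }

# 600 MB of RAM = 629145600 bytes
calc_shmmax 629145600      # prints 314572800
calc_file_max 128          # prints 65536
```

On a live system the page size can be read with getconf PAGE_SIZE and the memory size from /proc/meminfo.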
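
For details see my blog:

Linux kernel parameters and related Oracle parameter tuning
http://blog.csdn.net/tianlesoftware/archive/2009/10/15/4668741.aspx


11. Set user resource limits
Because all processes run as the oracle user, you need to define how much of the system's resources the oracle user may use.

vi /etc/security/limits.conf
-- for HugePages memory, add the following two lines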
oracle soft memlock 5242880
oracle hard memlock 5242880
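-- maximum number of processes
oracle soft nproc 2047
oracle hard nproc 16384
-- open file handles
oracle soft nofile 65536
oracle hard nofile 65536

Add the following line to /etc/pam.d/login:
session required /lib/security/pam_limits.so


12. Configure the hangcheck-timer module
hangcheck-timer is a kernel-level I/O fencing module shipped with Linux. It monitors the health of the Linux kernel and reboots the system automatically if the kernel hangs for too long. Because it runs in kernel space, it is not affected by system load. The module uses the CPU's Time Stamp Counter (TSC) register, whose value increases on every clock cycle, so it works from hardware time and is more precise.
The module takes two parameters: hangcheck_tick and hangcheck_margin.
hangcheck_tick defines how often the check runs; this setup uses 30 seconds. Because a busy kernel may delay the check, the module also lets you define an upper bound on that delay, hangcheck_margin, set to 180 seconds here.
The module checks the kernel at the interval set by hangcheck_tick. As long as the time between two checks is less than hangcheck_tick + hangcheck_margin, the kernel is considered healthy; otherwise it is considered hung and the module reboots the system.
CRS has its own MissCount parameter, which you can view with crsctl get css misscount.
When heartbeats between RAC nodes are lost, Clusterware must be sure the failed node really is dead before it reconfigures the cluster. Otherwise, if the node merely lost heartbeats because of a temporary load spike and the other nodes start reconfiguring while it has not rebooted, the database can be corrupted. MissCount must therefore be greater than hangcheck_tick + hangcheck_margin.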
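
The rule at the end of the paragraph above is a single comparison. A minimal sketch in POSIX shell; misscount_ok is a hypothetical helper name, and the misscount value 300 in the usage note is only an illustrative number, not a recommendation:

```shell
# Succeeds when misscount is greater than hangcheck_tick + hangcheck_margin.
# $1 = misscount, $2 = hangcheck_tick, $3 = hangcheck_margin (all in seconds)
misscount_ok() {
    [ "$1" -gt $(( $2 + $3 )) ]
}
```

With tick 30 and margin 180 the sum is 210, so misscount_ok 300 30 180 succeeds while misscount_ok 60 30 180 fails.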

12.1 Locate the module:
[root@rac1 ~]# find /lib/modules -name "hangcheck-timer.ko"
/lib/modules/2.6.9-78.EL/kernel/drivers/char/hangcheck-timer.ko
/lib/modules/2.6.9-78.ELsmp/kernel/drivers/char/hangcheck-timer.ko

12.2 Load the module automatically at boot by adding the following to /etc/rc.d/rc.local:
[root@rac1 ~]# modprobe hangcheck-timer
[root@rac1 ~]# vi /etc/rc.d/rc.local
modprobe hangcheck-timer

12.3 Set the hangcheck-timer parameters by adding the following to /etc/modprobe.conf:
[root@rac1 ~]# vi /etc/modprobe.conf
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180

12.4 Confirm that the module loaded:
[root@rac1 ~]# grep Hangcheck /var/log/messages | tail -2
Feb 23 22:08:44 rac1 kernel: Hangcheck: starting hangcheck timer 0.9.0 (tick is 30 seconds, margin is 180 seconds).


13. Install the ocfs2 and ocfs2console RPMs
# rpm -ivh *.rpm

Mounting a Windows share on Linux
1. Start the nfs service:   service nfs start
2. mount -o username=share,password=share //10.85.10.80/RAC /mnt

14. On every node, do the following

/etc/init.d/o2cb enable

Then edit /etc/init.d/o2cb and delete the configuration lines starting with # near the top of the file.
Start X, run ocfs2console, and add both nodes.
This generates /etc/ocfs2/cluster.conf.

If the changes cannot be applied, delete /etc/ocfs2/cluster.conf and run it again.
node:
        ip_port = 7777
        ip_address = 10.85.10.119
        number = 0
        name = rac1                  ---- note: name is the machine's hostname
        cluster = ocfs2
node:
        ip_port = 7777
        ip_address = 10.85.10.121
        number = 1
        name = rac2
        cluster = ocfs2
cluster:
        node_count = 2
        name = ocfs2

15. First partition the disks with fdisk: /dev/sdb, /dev/sdc, /dev/sdd, /dev/sde, /dev/sdf.
Partitioning from one node is enough because the disks are shared.
[root@rac1 init.d]# fdisk /dev/sdc
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-130, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-130, default 130):
Using default value 130

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Running fdisk -l again shows the new partitions sdb1, sdc1, sdd1, sde1, sdf1.

Format the partition:
On one node: mkfs.ocfs2 -b 4k -C 32k -L oradatafiles /dev/sdb1 (the first vmdk created earlier)

16. On every node
mount -t ocfs2 -o datavolume /dev/sdb1 /u02/oradata/orcl

If the mount fails on the second node, rebooting the system fixes it.

Before the mount, /etc/init.d/o2cb status shows Checking O2CB heartbeat: Not active.
Before formatting and mounting the file system, verify that O2CB is online on both nodes. The O2CB heartbeat is not active at that point because no file system is mounted yet; it becomes active after the mount.

Problem 1: If you see the following error, check whether the firewall is turned off; turn it off and try again.
o2net_connect_expired:1664 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
Commands to turn the firewall on or off:
1) Permanent; survives a reboot
Enable: chkconfig iptables on
Disable: chkconfig iptables off
2) Immediate; reverts after a reboot
Start: service iptables start
Stop: service iptables stop

Reference: Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and FireWire
http://blog.csdn.net/tianlesoftware/archive/2009/11/13/4805700.aspx

Configure OCFS2 to mount the shared disk automatically at startup
Edit /etc/fstab and add a line like this:
/dev/sdb1 /u02/oradata/orcl ocfs2 _netdev,datavolume 0 0
At this point OCFS2 for the OCR and CRS voting files is ready.

Solutions to common RAC OCFS2 file system problems
http://blog.csdn.net/tianlesoftware/archive/2009/11/13/4805727.aspx


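17. Edit /etc/sysconfig/o2cb
Set O2CB_HEARTBEAT_THRESHOLD to 301, which gives a fence time of 600 seconds:
[fence time (seconds)] = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2
(301 - 1) * 2 = 600 seconds

Why 600 seconds? If the time is too short, OCFS2 may fail to mount properly.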
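
The fence-time formula above can be evaluated in both directions. A minimal sketch in POSIX shell; fence_seconds and threshold_for are hypothetical helper names:

```shell
# Fence time in seconds for a given O2CB_HEARTBEAT_THRESHOLD ($1).
fence_seconds() { echo $(( ($1 - 1) * 2 )); }

# O2CB_HEARTBEAT_THRESHOLD needed for a given fence time in seconds ($1).
threshold_for() { echo $(( $1 / 2 + 1 )); }

fence_seconds 301     # prints 600
threshold_for 600     # prints 301
```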
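
For details see my blog:
Fixing OCFS2 that does not mount automatically and reports o2net_connect_expired
http://blog.csdn.net/tianlesoftware/archive/2009/11/14/4806813.aspx


18. On every node, install the three RPMs: ASMLib, tools and support

# rpm -ivh *.rpm --nodeps --force

Then run /etc/init.d/oracleasm configure
and answer oracle, dba, y, y.

19. Create the ASM disks
On one node:
As root, mark the disks for use by ASMLib with: /etc/init.d/oracleasm createdisk DISK_NAME device_name
(Tip: DISK_NAME should be in upper case. The current version has a bug: if lower-case letters are used, the ASM instance cannot recognize the disk.)

Remember that on Linux ASM works with partitions, not whole disks, so the vmdk disks must be partitioned under Linux before they can be used; run fdisk first, then create the ASM disks.
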
/etc/init.d/oracleasm createdisk VOL1 /dev/sdc1
/etc/init.d/oracleasm createdisk VOL2 /dev/sdd1


After creating them, run /etc/init.d/oracleasm listdisks on this node to check.

20. On the other node
/etc/init.d/oracleasm scandisks
/etc/init.d/oracleasm listdisks    (to check)

21. On every node
su - oracle
cd /home/oracle
Edit .bash_profile in the oracle user's home directory.

Make sure ORACLE_SID matches the database you create later.

# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=$ORACLE_BASE/product/crs
export ORACLE_SID=rac1
export PATH=.:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export PATH=${PATH}:$ORACLE_BASE/common/oracle/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS10=$ORACLE_HOME/nls/data
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp

On the second node set ORACLE_SID=rac2; everything else is the same.


II. Install Oracle 10gR2 Clusterware

1. Connect to the virtual machine with Xmanager and run the Clusterware installer. Xmanager supports graphical sessions, which saves a lot of effort.
Note: if you get the error libXp.so.6: cannot open shared object file, install the libXp package.

2. Confirm that the installation directory is /u01/app/oracle/product/crs

3. Add the node information
rac1 rac1-priv rac1-vip
rac2 rac2-priv rac2-vip

4. Set the type of eth0 to public

5. Specify the OCR and voting disk locations
In general, if the OCR and voting disk are placed on a storage array that already provides redundancy, choose External Redundancy, and Oracle does no software mirroring. If there is no storage array, or the storage is RAID 0, use Oracle's own software redundancy by choosing Normal Redundancy. This enables the Mirror Location field, which specifies where the mirror file goes; Oracle Clusterware keeps the mirror in sync while it runs.

The OCR can have at most one mirror:
/u02/oradata/orcl/OCRFile
/u02/oradata/orcl/OCRFile_mirror

The voting disk can have up to two mirrors:
/u02/oradata/orcl/VotingFile
/u02/oradata/orcl/VotingFile_mirror1
/u02/oradata/orcl/VotingFile_mirror2

7. The installation then starts. At the end you are asked to run orainstRoot.sh and root.sh as root on each node. Running root.sh on the second node calls vipca automatically, so edit the vipca script before running root.sh on the second node, otherwise it may fail.

What the four scripts run during a RAC installation do
http://blog.csdn.net/tianlesoftware/archive/2010/02/22/5317034.aspx


Note: vipca is also run as root, and only needs to be run on one node.

Go to $CRS_HOME/bin/ and edit the vipca and srvctl scripts with vi.

Problem 1: vipca fails; this is a Red Hat bug
Running vipca(silent) for configuring nodeapps
/home/oracle/crs/oracle/product/10/crs/jdk/jre//bin/java: error while loading
shared libraries: libpthread.so.0: cannot open shared object file:
No such file or directory

Solution:
Remember to re-edit these files on all nodes:
<CRS_HOME>/bin/vipca
<CRS_HOME>/bin/srvctl
<RDBMS_HOME>/bin/srvctl
<ASM_HOME>/bin/srvctl

after applying the 10.2.0.2 or 10.2.0.3 patchsets, as these patchsets still include those settings, which are unnecessary for OEL5, RHEL5 or SLES10. This issue was raised with development and is fixed in the 10.2.0.4 patchset.
Note that we are explicitly unsetting LD_ASSUME_KERNEL and not merely commenting out its setting to handle a case where the user has it set in their environment (login shell).

$ vi vipca
... ...
Linux) LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:$ORACLE_HOME/srvm/lib:$LD_LIBRARY_PATH
       export LD_LIBRARY_PATH
        echo $LD_LIBRARY_PATH
        echo $CLASSPATH
       #Remove this workaround when the bug 3937317 is fixed
       arch=`uname -m`
       if [ "$arch" = "i686" -o "$arch" = "ia64" ]
       then
        # LD_ASSUME_KERNEL=2.4.19
        # export LD_ASSUME_KERNEL
        echo
       fi
       #End workaround

Problem 2: If you hit this error:
# vipca
Error 0(Native: listNetInterfaces:[3])
[Error 0(Native: listNetInterfaces:[3])]
Solution:
Run oifcfg from the bin directory under CRS_HOME:
On rac1:
# ./oifcfg setif -global eth0/10.85.10.119:public
# ./oifcfg setif -global eth1/192.168.1.119:cluster_interconnect
# ./oifcfg getif
eth0 10.85.10.119 global public
eth1 192.168.1.119 global cluster_interconnect

On rac2:
/bin # ./oifcfg setif -global eth0/10.85.10.121:public
/bin # ./oifcfg setif -global eth1/192.168.1.121:cluster_interconnect
/bin # ./oifcfg getif
eth0 10.85.10.121 global public
eth1 192.168.1.121 global cluster_interconnect

Then run vipca manually to add the nodeapps resources.


Problem 3: An unexpected exception has been detected in native code outside the VM.
Unexpected Signal : 11 occurred at PC=0xB7503E29
Function=__libc_free+0x49
Library=/lib/tls/libc.so.6
This is caused by a hostname that was changed incorrectly.
On RHEL, if the hostname set in /etc/sysconfig/network cannot be resolved through /etc/hosts, this error is reported.

8. Clusterware is now installed.

Verify it:
$ /u01/app/oracle/product/crs/bin/olsnodes -n
rac1pub 1
rac2pub 2
$ ls -l /etc/init.d/init.*
-r-xr-xr-x 1 root root 1951 Oct 4 14:21 /etc/init.d/init.crs*
-r-xr-xr-x 1 root root 4714 Oct 4 14:21 /etc/init.d/init.crsd*
-r-xr-xr-x 1 root root 35394 Oct 4 14:21 /etc/init.d/init.cssd*
-r-xr-xr-x 1 root root 3190 Oct 4 14:21 /etc/init.d/init.evmd*

Check that CRS is installed and running. As root, run:
$CRS_HOME/bin/crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
This shows that CRS is installed and started successfully.
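
The three "appears healthy" lines can also be checked by a script. A minimal sketch, assuming a POSIX shell with awk and the output format shown above; crs_healthy is a hypothetical helper name:

```shell
# Reads "crsctl check crs" output on stdin.
# Succeeds only if CSS, CRS and EVM are all reported as healthy.
crs_healthy() {
    awk '
        /^(CSS|CRS|EVM) appears healthy/ { ok[$1] = 1 }
        END { exit !(ok["CSS"] && ok["CRS"] && ok["EVM"]) }
    '
}
```

A typical call would be $CRS_HOME/bin/crsctl check crs | crs_healthy && echo "CRS is up".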

# ./crs_stat -t -v

Note: if the Clusterware installation fails, run the installer again; it lets you remove the previous installation, after which you can install again.

III. Install Oracle 10gR2 database

1. Check the packages Oracle needs. Oracle 10g requires the following packages:
binutils-2.15.92.0.2-10.EL4
compat-db-4.1.25-9
control-center-2.8.0-12
gcc-3.4.3-9.EL4
gcc-c++-3.4.3-9.EL4
glibc-2.3.4-2
glibc-common-2.3.4-2
gnome-libs-1.4.1.2.90-44.1
libstdc++-3.4.3-9.EL4
libstdc++-devel-3.4.3-9.EL4
make-3.80-5
pdksh-5.2.14-30
sysstat-5.0.5-1
xscreensaver-4.18-5.rhel4.2
libaio-0.3.96

To see which versions of these packages are installed on your system, run the following command:
rpm -q binutils compat-db control-center gcc gcc-c++ glibc glibc-common \
gnome-libs libstdc++ libstdc++-devel make pdksh sysstat xscreensaver libaio openmotif21
2. In Xmanager, as the oracle user, run the database runInstaller
3. Set the Oracle installation directory to /u01/app/oracle/product/10.2.0/db_1
4. Select both nodes
5. Choose Install database Software only
6. You are asked to run root.sh with full root privileges; run it on each of the two nodes in turn
7. Installation complete

IV. Create the listener with netca

Note: follow this order when creating the database: configure the listener first, then the ASM instance, and create the database instance last. This reduces the chance of errors.

1. As the oracle user, run netca on one node
2. Select all nodes
3. Choose Listener configuration
4. Add a listener on port 1521
then finish the configuration

Once the listener is configured, the listeners on both nodes are registered with CRS as application resources, so CRS can monitor their state. Check the listener status with crs_stat -t -v.



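V. Create the ASM instance

1. Run dbca
2. Choose Configure Automatic Storage Management to create the ASM instance
3. Select all nodes
4. Enter the password and the parameter file location,
   e.g. /u02/oradata/orcl/dbs/spfile+ASM.ora
5. Change the ASM parameter asm_diskstring = ORCL:VOL* so that Oracle discovers the disks automatically
6. After the ASM instance has been created, use Create New to create the ASM disk groups. Use VOL1 for a DATA group and VOL2 for a FLASH_RECOVERY_AREA group.
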
Note: Redundancy is usually set to External, meaning ASM does no mirroring. Normal means two-way mirroring and needs at least two failure groups; High means triple mirroring and needs three failure groups.
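
7. When creation finishes, the groups show the state MOUNTED. An ASM disk group must be mounted before it can be used.

For more on ASM see the blog:
Oracle ASM explained
http://blog.csdn.net/tianlesoftware/archive/2010/02/21/5314541.aspx


VI. Create the database with dbca

1. Run dbca as the oracle user
2. Choose Custom Database
3. Enter the global database name, e.g. rac
4. Enter the passwords for the system accounts
5. Choose ASM for storage and select the DATA and FLASH_RECOVERY_AREA groups created above
6. Under Database Services, add a new service with any name, e.g. oltp,
and set TAF Policy to Basic. This service is used for RAC failover. If you do not configure it here, you can do so later with dbca under Services Management. For details see the blog:
   Oracle RAC Failover explained
   http://blog.csdn.net/tianlesoftware/archive/2010/03/03/5340788.aspx

7. Start creating the database
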
VII. Check the RAC status
1. Log in as the oracle user and run
[oracle@rac1 bin]$ ./srvctl status database -d rac
Instance rac1 is running on node rac1
Instance rac2 is running on node rac2

2. $ srvctl status service -d rac -s rac
Service orcltest is running on instance(s) orcl2, orcl1


3. [oracle@rac1 bin]$ ./srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is running on node: rac1
Listener is running on node: rac1
ONS daemon is running on node: rac1

4. [oracle@rac1 bin]$ ./srvctl status asm -n rac1
ASM instance +ASM1 is running on node rac1.

5. [root@rac2 bin]# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.rac.db     application    ONLINE    ONLINE    rac1
ora....orcl.cs application    ONLINE    ONLINE    rac1
ora....ac1.srv application    ONLINE    ONLINE    rac1
ora....ac2.srv application    ONLINE    ONLINE    rac2
ora....c1.inst application    ONLINE    ONLINE    rac1
ora....c2.inst application    ONLINE    ONLINE    rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2
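
With this many resources it is easy to miss one that is not ONLINE. A minimal sketch, assuming a POSIX shell with awk and the five-column crs_stat -t layout shown above; crs_not_online is a hypothetical helper name:

```shell
# Reads "crs_stat -t" output on stdin and prints the name of every resource
# whose Target or State column is not ONLINE. The two header lines are skipped.
crs_not_online() {
    awk 'NR > 2 && NF >= 4 && ($3 != "ONLINE" || $4 != "ONLINE") { print $1 }'
}
```

A typical call would be ./crs_stat -t | crs_not_online; no output means every listed resource has both Target and State ONLINE.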


6. Run sqlplus /nolog
SQL> connect /as sysdba
SQL> SELECT inst_id , instance_number inst_no , instance_name inst_name
, parallel , status , database_status db_status , active_state state
, host_name host FROM gv$instance ORDER BY inst_id;

INST_ID INST_NO INST_NAME PAR STATUS DB_STATUS STATE  HOST
------- ------- --------- --- ------ --------- ------ ----
      1       1 rac1      YES OPEN   ACTIVE    NORMAL rac1
      2       2 rac2      YES OPEN   ACTIVE    NORMAL rac2

7. Client failover test
7.1 Edit C:\windows\system32\drivers\etc\hosts and add the following:
10.85.10.119 rac1
10.85.10.121 rac2
10.85.10.122 rac1-vip
10.85.10.123 rac2-vip
7.2 Edit tnsnames.ora and add the following:
RAC =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac2-vip)(PORT = 1521))
    (LOAD_BALANCE = YES)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = RAC)
      (FAILOVER_MODE =
        (TYPE = session)
        (METHOD = basic)
        (RETRIES = 180)
        (DELAY = 5)
      )
    )
  )
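
A stray or missing parenthesis in a tnsnames.ora entry is easy to overlook in a nested block like this. A minimal sketch of a balance check, assuming a POSIX shell with awk; parens_balanced is a hypothetical helper name and it only counts parentheses, it does not validate the Oracle Net syntax:

```shell
# Succeeds when every "(" in the file ($1) has a matching ")"
# and no ")" appears before its "(".
parens_balanced() {
    awk '
        {
            n = length($0)
            for (i = 1; i <= n; i++) {
                c = substr($0, i, 1)
                if (c == "(") depth++
                else if (c == ")" && --depth < 0) bad = 1
            }
        }
        END { exit (bad || depth != 0) }
    ' "$1"
}
```

For example, parens_balanced tnsnames.ora || echo "unbalanced parentheses" can be run on the client before testing the connection.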
7.3 Connect to the database from the client with sqlplus
C:\Documents and Settings\Administrator>sqlplus system/admin@rac
SQL*Plus: Release 10.2.0.1.0 - Production on Sat Feb 27 02:06:40 2010
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SQL> select instance_name from V$instance;
INSTANCE_NAME
--------------------------------
rac2

7.4 Shut down the rac2 instance
[oracle@rac2 ~]$ export ORACLE_SID=rac2
[oracle@rac2 ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.1.0 - Production on Sat Feb 27 02:58:48 2010
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
rac2
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

7.5 Query again from the client; the session has failed over to rac1
SQL> select instance_name from V$instance;
INSTANCE_NAME
--------------------------------
rac1

VIII. Uninstalling RAC

Uninstallation has several parts: removing the database and removing Clusterware (the 10.2 name; in 10.1 it is called CRS, Cluster Ready Services).

There are many ways to remove an Oracle database, some rough and some gentle. Here dbca is used.

Removing the database deletes the instances on all nodes and then the single database itself.
The second step is to remove the listeners on all nodes, which can be done with netca.

Finally remove Clusterware, either with the Clusterware installer or with the scripts:
$ORA_CRS_HOME/install/rootdelete.sh -help
The local node and remote nodes use different commands; see the help output for details.
[root@rac1 install]# ./rootdelete.sh --help
Usage: rootdelete [-help] [local|remote] [nosharedvar|sharedvar] [sharedhome|nosharedhome] [-downgrade [-version <version>]]
-help: print this message
local: if this node is the node where OUI is to be run to deinstall, otherwise use 'remote'
sharedvar: OCR is on a shared path, otherwise use 'nosharedvar'
sharedhome: CRS home is on a shared path, otherwuse use 'nosharedhome'
-downgrade: Oracle clusterware and OCR will be reset for downgrade
-version <version>: OCR location file will reset for downgrade to specified version, default: 10.1

[root@rac1 install]# ./rootdelete.sh local sharedvar sharedhome -downgrade
[root@rac1 install]# ./rootdelete.sh remote sharedvar sharedhome -downgrade
Finally, on the local node run
$ORA_CRS_HOME/install/rootdeinstall.sh

Running the scripts is the safer approach. When they finish, delete the related directories to complete the Clusterware removal.

Appendix: solutions to problems encountered during RAC installation
Problem 1:
After RAC is installed, choosing ASM as the storage option when creating the database with DBCA sometimes fails with an error saying that ASM is a single-instance environment rather than RAC, so database creation cannot continue. The error message is:
The ASM instance configured on the local node is a single-instance ASM. To create a single-instance database using this ASM instance, restart DBCA and select the single-instance database option. To create a RAC database using this ASM instance, convert it to RAC ASM first.

This error usually occurs after Clusterware and the database have been reinstalled, and it recurs no matter how often DBCA is restarted. The fix is to delete the ASM entry from /etc/oratab, i.e. the line +ASM1:/u01/app/oracle/product/10.2.0/db_1:N, and then continue creating the database.

Problem 2:

Creating ASM fails with ORA-12547: TNS:lost contact
Solution:
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk ioracle

Problem 3:
Could not start cluster stack. This must be resolved before any OCFS2 filesystem can be mounted

 This problem can be caused by a version mismatch between the OCFS2 packages and the Red Hat kernel. SELinux being enabled is another possible cause.

  tail -n100 /var/log/messages:

 May 18 12:10:27 rac1 kernel: SELinux: initialized (dev configfs, type configfs), not configured for labeling
 May 18 12:10:27 rac1 kernel: audit(1211083827.759:7): avc:  denied  { mount } for  pid=12346 comm="mount" name="/" dev=configfs ino=44504 scontext=root:system_r:initrc_t tcontext=system_u:object_r:unlabeled_t tclass=filesystem
 May 18 12:10:30 rac1 dbus: Can't send to audit system: USER_AVC pid=2642 uid=81 loginuid=-1 message=avc:  denied  { send_msg } for  scontext=root:system_r:unconfined_t tcontext=user_u:system_r:initrc_t tclass=dbus
 May 18 12:11:05 rac1 last message repeated 7 times
 May 18 12:12:10 rac1 last message repeated 13 times

 [root@rac1 /]# vi /etc/selinux/config
 #SELINUX=enforcing
 SELINUX=disabled

 [root@rac1 /]# setenforce 0
 setenforce: SELinux is disabled

Problem 4:
 The cluster stack has been started. It needs to be running for any clustering functionality to happen.
 Please run "/etc/init.d/o2cb enable" to have it started upon bootup.

 o2cb_ctl: Unable to access cluster service while creating node Could not add node rac1

 [root@rac1 init.d]# ./o2cb enable
 Writing O2CB configuration: OK
 Starting O2CB cluster ocfs2: Failed
 Cluster ocfs2 created
 o2cb_ctl: Configuration error discovered while populating cluster ocfs2.  None of its nodes were considered local.
 A node is considered local when its node name in the configuration matches this machine's host name.
 Stopping O2CB cluster ocfs2: OK

 [root@rac1 ocfs2]# pwd
 /etc/ocfs2
 [root@rac1 ocfs2]# ls
 cluster.conf
 [root@rac1 ocfs2]# mv cluster.conf cluster.conf.bak


Problem 5: During the CRS installation, the final root.sh fails on the node where it is run second, with:
# ./root.sh
WARNING: directory '/u01/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/u01/app/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Failed to upgrade Oracle Cluster Registry configuration
Another variant of the message is: PRIF-10: failed to initialize the cluster registry
Solution: turn off the locking attribute on the shared disks.
For SSA or FAStT arrays, disable disk locking with: /usr/sbin/chdev -l hdiskn -a reserve_lock=no
For ESS, EMC, HDS or CLARiiON arrays, disable disk locking with: /usr/sbin/chdev -l hdiskn -a reserve_policy=no_reserve
In a virtual machine, add the parameter disk.locking = "false".
Reposted from: https://www.cnblogs.com/hibernate315/archive/2010/02/27/2399318.html