Locating faulty disks with storcli64 and smartctl
How to map a disk's physical bay to its OS device name, and back
from lin.wang
section one : introduction
storcli is the successor to megacli; on Dell servers it is shipped as perccli, with identical usage
smartctl reads the SMART information reported by the drive
lsscsi shows the system's SCSI information, sourced from /proc/scsi/scsi; it is not covered further in this document
these are all common tools for inspecting disk information, and they help when troubleshooting disk state and RAID-card problems
section two : install package
Install storcli or perccli, then symlink the binary into /usr/bin/ so the command is easy to run:
ln -s /opt/MegaRAID/storcli/storcli64 /usr/bin/
ln -s /opt/MegaRAID/perccli/perccli64 /usr/bin/
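As a sketch, the link step can be made tolerant of whichever of the two tools is actually installed. The /opt/MegaRAID paths are the vendor defaults used above; adjust them if your install differs.

```shell
# Link whichever controller CLI is present into /usr/bin.
# The /opt/MegaRAID paths are the vendor defaults; adjust if yours differ.
link_cli() {
  local src
  for src in /opt/MegaRAID/storcli/storcli64 /opt/MegaRAID/perccli/perccli64; do
    if [ -x "$src" ]; then
      ln -sf "$src" /usr/bin/ && echo "linked $src"
      return 0
    fi
  done
  echo "no storcli64/perccli64 found" >&2
  return 1
}
```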
section three : step
To map the OS device name /dev/sdf back to its physical drive bay, proceed as follows:
perccli64 /c0/eall/sall show lists the drives:
[image: output of perccli64 /c0/eall/sall show]
The output shows four JBOD drives. From experience, JBOD drives normally enumerate before the RAID virtual drives, i.e. the JBOD drives take /dev/sda through /dev/sdd and the RAID volumes start at /dev/sde.
dg stands for drive group, the order in which the RAID groups were configured; the output shows that 32:4 and 32:5 belong to one group.
perccli64 /c0/vall show gives the dg-to-vd mapping:
[image: output of perccli64 /c0/vall show]
- The dg/vd column maps each RAID volume to its ordering in the OS. If the server has only RAID volumes, vd 0 is /dev/sda in the OS, and so on. If the server also has JBOD drives, the RAID volumes enumerate after them; in this example vd 0 = /dev/sde, so /dev/sdf corresponds to vd 1, hence dg 1.
- Back in the /c0/eall/sall output, dg 1 has did 6 (did = device id, a concept needed later). Its slot no. (slt) is also 6, which is the 7th bay on the server (counting from 0). That pinpoints the physical bay holding /dev/sdf.
Conversely, starting from a lit fault LED on the server, the same mapping run backwards yields the OS device name.
note:
- If the server has no JBOD drives and everything is RAID, the /c0/vall mapping alone is enough.
- In practice you can verify the result with perccli64 /c0/e32/s6 start locate / stop locate, which blinks the drive LED on and off.
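The ordering rule above (JBOD disks take the first device letters, RAID VDs follow) can be sketched as a small helper. This assumes the usual enumeration order holds on your system, which is a convention, not a guarantee, so always confirm with start locate before pulling a disk.

```shell
# Map a VD number to its expected /dev/sdX name, assuming JBOD disks
# enumerate first and RAID virtual drives follow in VD order.
# This is a convention, not a guarantee -- verify with "start locate".
vd_to_dev() {
  local jbod_count=$1 vd=$2
  awk -v i=$((jbod_count + vd)) 'BEGIN { printf "/dev/sd%c\n", 97 + i }'
}

vd_to_dev 4 1   # 4 JBOD disks, VD 1 -> /dev/sdf (the example in this document)
```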
section four : storcli/perccli usage
View controller information
perccli64 show ctrlcount    show how many controllers (RAID cards) are present
perccli64 show    display RAID card information
[root@node-15 ~]# perccli64 show
status code = 0
status = success
description = none
number of controllers = 1
host name = node-15.domain.tld
operating system = linux3.10.0-327.20.1.es2.el7.x86_64
system overview :
===============
------------------------------------------------------------------------
ctl model ports pds dgs dnopt vds vnopt bbu spr ds ehs asos hlth
------------------------------------------------------------------------
0 perch730mini 8 16 11 0 11 0 opt on 3 n 0 opt
------------------------------------------------------------------------
ctl=controller index|dgs=drive groups|vds=virtual drives|fld=failed
pds=physical drives|dnopt=dg notoptimal|vnopt=vd notoptimal|opt=optimal
msng=missing|dgd=degraded|ndatn=need attention|unkwn=unknown
spr=scheduled patrol read|ds=dimmerswitch|ehs=emergency hot spare
y=yes|n=no|asos=advanced software options|bbu=battery backup unit
hlth=health|safe=safe-mode boot
Only one RAID card is present; ctrl 0 corresponds to /c0.
storcli64 /c0 show
[root@node-15 ~]# perccli64 /c0 show
generating detailed summary of the adapter, it may take a while to complete.
controller = 0
status = success
description = none
product name = perc h730 mini
serial number = 663021z
sas address = 51866da066153000
pci address = 00:03:00:00
system time = 01/10/2019 20:48:38
mfg. date = 06/17/16
controller time = 01/10/2019 12:44:21
fw package build = 25.4.0.0017
bios version = 6.29.00.0_4.16.07.00_0x06120100
fw version = 4.260.00-6259
driver name = megaraid_sas
driver version = 06.807.10.00-rh1
current personality = raid-mode
vendor id = 0x1000
device id = 0x5d
subvendor id = 0x1028
subdevice id = 0x1f49
host interface = pci-e
device interface = sas-12g
bus number = 3
device number = 0
function number = 0
drive groups = 11
topology :
========
---------------------------------------------------------------------------
dg arr row eid:slot did type state bt size pdc pi sed ds3 fspace tr
---------------------------------------------------------------------------
0 - - - - raid1 optl n 931.0 gb dflt n n dflt n n
0 0 - - - raid1 optl n 931.0 gb dflt n n dflt n n
0 0 0 32:4 4 drive onln n 931.0 gb dflt n n dflt - n
0 0 1 32:5 5 drive onln n 931.0 gb dflt n n dflt - n
1 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
1 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
1 0 0 32:6 6 drive onln n 931.0 gb dflt n n dflt - n
2 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
2 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
2 0 0 32:7 7 drive onln n 931.0 gb dflt n n dflt - n
3 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
3 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
3 0 0 32:8 8 drive onln n 931.0 gb dflt n n dflt - n
4 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
4 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
4 0 0 32:9 9 drive onln n 931.0 gb dflt n n dflt - n
5 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
5 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
5 0 0 32:10 10 drive onln n 931.0 gb dflt n n dflt - n
6 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
6 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
6 0 0 32:11 11 drive onln n 931.0 gb dflt n n dflt - n
7 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
7 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
7 0 0 32:12 12 drive onln n 931.0 gb dflt n n dflt - n
8 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
8 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
8 0 0 32:13 13 drive onln n 931.0 gb dflt n n dflt - n
9 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
9 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
9 0 0 32:14 14 drive onln n 931.0 gb dflt n n dflt - n
10 - - - - raid0 optl n 931.0 gb dflt n n dflt n n
10 0 - - - raid0 optl n 931.0 gb dflt n n dflt n n
10 0 0 32:15 15 drive onln n 931.0 gb dflt n n dflt - n
---------------------------------------------------------------------------
dg=disk group index|arr=array index|row=row index|eid=enclosure device id
did=device id|type=drive type|onln=online|rbld=rebuild|dgrd=degraded
pdgd=partially degraded|offln=offline|bt=background task active
pdc=pd cache|pi=protection info|sed=self encrypting drive|frgn=foreign
ds3=dimmer switch 3|dflt=default|msng=missing|fspace=free space present
tr=transport ready
virtual drives = 11
vd list :
=======
-------------------------------------------------------------
dg/vd type state access consist cache cac scc size name
-------------------------------------------------------------
0/0 raid1 optl rw yes rwbd - off 931.0 gb
1/1 raid0 optl rw yes rwbd - off 931.0 gb
2/2 raid0 optl rw yes rwbd - off 931.0 gb
3/3 raid0 optl rw yes rwbd - off 931.0 gb
4/4 raid0 optl rw yes rwbd - off 931.0 gb
5/5 raid0 optl rw yes rwbd - off 931.0 gb
6/6 raid0 optl rw yes rwbd - off 931.0 gb
7/7 raid0 optl rw yes rwbd - off 931.0 gb
8/8 raid0 optl rw yes rwbd - off 931.0 gb
9/9 raid0 optl rw yes rwbd - off 931.0 gb
10/10 raid0 optl rw yes rwbd - off 931.0 gb
-------------------------------------------------------------
cac=cachecade|rec=recovery|ofln=offline|pdgd=partially degraded|dgrd=degraded
optl=optimal|ro=read only|rw=read write|hd=hidden|trans=transportready|b=blocked|
consist=consistent|r=read ahead always|nr=no read ahead|wb=writeback|
fwb=force writeback|wt=writethrough|c=cached io|d=direct io|scc=scheduled
check consistency
physical drives = 16
pd list :
=======
----------------------------------------------------------------------------
eid:slt did state dg size intf med sed pi sesz model sp
----------------------------------------------------------------------------
32:0 0 jbod - 185.75 gb sata ssd n n 512b intel ssdsc2bx200g4r u
32:1 1 jbod - 185.75 gb sata ssd n n 512b intel ssdsc2bx200g4r u
32:2 2 jbod - 185.75 gb sata ssd n n 512b intel ssdsc2bx200g4r u
32:3 3 jbod - 185.75 gb sata ssd n n 512b intel ssdsc2bx200g4r u
32:4 4 onln 0 931.0 gb sata hdd n n 512b st91000640ns u
32:5 5 onln 0 931.0 gb sata hdd n n 512b st91000640ns u
32:6 6 onln 1 931.0 gb sata hdd n n 512b st91000640ns u
32:7 7 onln 2 931.0 gb sata hdd n n 512b st91000640ns u
32:8 8 onln 3 931.0 gb sata hdd n n 512b st91000640ns u
32:9 9 onln 4 931.0 gb sata hdd n n 512b st91000640ns u
32:10 10 onln 5 931.0 gb sata hdd n n 512b st91000640ns u
32:11 11 onln 6 931.0 gb sata hdd n n 512b st91000640ns u
32:12 12 onln 7 931.0 gb sata hdd n n 512b st91000640ns u
32:13 13 onln 8 931.0 gb sata hdd n n 512b st91000640ns u
32:14 14 onln 9 931.0 gb sata hdd n n 512b st91000640ns u
32:15 15 onln 10 931.0 gb sata hdd n n 512b st91000640ns u
----------------------------------------------------------------------------
eid-enclosure device id|slt-slot no.|did-device id|dg-drivegroup
dhs-dedicated hot spare|ugood-unconfigured good|ghs-global hotspare
ubad-unconfigured bad|onln-online|offln-offline|intf-interface
med-media type|sed-self encryptive drive|pi-protection info
sesz-sector size|sp-spun|u-up|d-down/powersave|t-transition|f-foreign
ugunsp-unsupported|ugshld-unconfigured shielded|hspshld-hotspare shielded
cfshld-configured shielded|cpybck-copyback|cbshld-copyback shielded
bbu_info :
========
----------------------------------------------
model state retentiontime temp mode mfgdate
----------------------------------------------
bbu optimal 0 hour(s) 38c - 0/00/00
----------------------------------------------
View each drive's device id, slot no. and drive group
[root@node-15 ~]# perccli64 /c0/eall/sall show
controller = 0
status = success
description = show drive information succeeded.
drive information :
=================
----------------------------------------------------------------------------
eid:slt did state dg size intf med sed pi sesz model sp
----------------------------------------------------------------------------
32:0 0 jbod - 185.75 gb sata ssd n n 512b intel ssdsc2bx200g4r u
32:1 1 jbod - 185.75 gb sata ssd n n 512b intel ssdsc2bx200g4r u
32:2 2 jbod - 185.75 gb sata ssd n n 512b intel ssdsc2bx200g4r u
32:3 3 jbod - 185.75 gb sata ssd n n 512b intel ssdsc2bx200g4r u
32:4 4 onln 0 931.0 gb sata hdd n n 512b st91000640ns u
32:5 5 onln 0 931.0 gb sata hdd n n 512b st91000640ns u
32:6 6 onln 1 931.0 gb sata hdd n n 512b st91000640ns u
32:7 7 onln 2 931.0 gb sata hdd n n 512b st91000640ns u
32:8 8 onln 3 931.0 gb sata hdd n n 512b st91000640ns u
32:9 9 onln 4 931.0 gb sata hdd n n 512b st91000640ns u
32:10 10 onln 5 931.0 gb sata hdd n n 512b st91000640ns u
32:11 11 onln 6 931.0 gb sata hdd n n 512b st91000640ns u
32:12 12 onln 7 931.0 gb sata hdd n n 512b st91000640ns u
32:13 13 onln 8 931.0 gb sata hdd n n 512b st91000640ns u
32:14 14 onln 9 931.0 gb sata hdd n n 512b st91000640ns u
32:15 15 onln 10 931.0 gb sata hdd n n 512b st91000640ns u
----------------------------------------------------------------------------
eid-enclosure device id|slt-slot no.|did-device id|dg-drivegroup
dhs-dedicated hot spare|ugood-unconfigured good|ghs-global hotspare
ubad-unconfigured bad|onln-online|offln-offline|intf-interface
med-media type|sed-self encryptive drive|pi-protection info
sesz-sector size|sp-spun|u-up|d-down/powersave|t-transition|f-foreign
ugunsp-unsupported|ugshld-unconfigured shielded|hspshld-hotspare shielded
cfshld-configured shielded|cpybck-copyback|cbshld-copyback shielded
note:
- From experience, JBOD drives enumerate before RAID drives.
View information for a specific drive
[root@node-15 ~]# perccli64 /c0/e32/s6 show all
controller = 0
status = success
description = show drive information succeeded.
drive /c0/e32/s6 :
================
-------------------------------------------------------------------
eid:slt did state dg size intf med sed pi sesz model sp
-------------------------------------------------------------------
32:6 6 onln 1 931.0 gb sata hdd n n 512b st91000640ns u
-------------------------------------------------------------------
eid-enclosure device id|slt-slot no.|did-device id|dg-drivegroup
dhs-dedicated hot spare|ugood-unconfigured good|ghs-global hotspare
ubad-unconfigured bad|onln-online|offln-offline|intf-interface
med-media type|sed-self encryptive drive|pi-protection info
sesz-sector size|sp-spun|u-up|d-down/powersave|t-transition|f-foreign
ugunsp-unsupported|ugshld-unconfigured shielded|hspshld-hotspare shielded
cfshld-configured shielded|cpybck-copyback|cbshld-copyback shielded
drive /c0/e32/s6 - detailed information :
=======================================
drive /c0/e32/s6 state :
======================
shield counter = 0
media error count = 46431 *** an obvious problem: 46,431 media errors ***
other error count = 0
drive temperature = 31c (87.80 f)
predictive failure count = 126 *** 126 predicted failures ***
s.m.a.r.t alert flagged by drive = yes
drive /c0/e32/s6 device attributes :
==================================
sn = 9xga228l
manufacturer id = ata
model number = st91000640ns
nand vendor = na
wwn = 5000c500918f2f8a
firmware revision = aa63
raw size = 931.512 gb [0x74706db0 sectors]
coerced size = 931.0 gb [0x74600000 sectors]
non coerced size = 931.012 gb [0x74606db0 sectors]
device speed = 6.0gb/s
link speed = 12.0gb/s
ncq setting = n/a
write cache = enabled
logical sector size = 512b
physical sector size = 512b
connector name = 00
drive /c0/e32/s6 policies/settings :
==================================
drive position = drivegroup:1, span:0, row:0
enclosure position = 0
connected port number = 0(path0)
sequence number = 2
commissioned spare = no
emergency spare = no
last predictive failure event sequence number = 95183 *** sequence number of the last predictive-failure event ***
successful diagnostics completion on = n/a
sed capable = no
sed enabled = no
secured = no
cryptographic erase capable = no
locked = no
needs ekm attention = no
pi eligible = no
certified = yes
wide port capable = no
port information :
================
-----------------------------------------
port status linkspeed sas address
-----------------------------------------
0 active 12.0gb/s 0x500056b33fefe586
-----------------------------------------
inquiry data =
5a 0c ff 3f 37 c8 10 00 00 00 00 00 3f 00 00 00
00 00 00 00 20 20 20 20 20 20 20 20 20 20 20 20
58 39 41 47 32 32 4c 38 00 00 00 00 04 00 20 20
20 20 41 41 33 36 54 53 31 39 30 30 36 30 30 34
53 4e 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 10 80
00 40 00 2f 00 40 00 02 00 02 07 00 ff 3f 10 00
3f 00 10 fc fb 00 10 00 ff ff ff 0f 00 00 07 00
note:
The per-drive detail shows media errors, which means the drive is faulty.
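Rather than inspecting one drive at a time, the same check can be run across every drive by filtering the detailed output. A sketch, assuming the field names shown in the capture above:

```shell
# Scan detailed per-drive output for nonzero media error counts.
# Assumes the field names shown above ("Drive /c0/eX/sY ... State",
# "Media Error Count = N"); matching is case-insensitive to be safe.
scan_media_errors() {
  awk '
    tolower($0) ~ /^drive \/c[0-9]/ { drive = $2 }   # remember current drive path
    tolower($0) ~ /media error count/ && $NF + 0 > 0 {
      printf "%s: %s media errors\n", drive, $NF
    }
  '
}
# Usage: perccli64 /c0/eall/sall show all | scan_media_errors
```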
Map physical drives to OS device names
[root@node-15 ~]# perccli64 /c0/vall show
controller = 0
status = success
description = none
virtual drives :
==============
-------------------------------------------------------------
dg/vd type state access consist cache cac scc size name
-------------------------------------------------------------
0/0 raid1 optl rw yes rwbd - off 931.0 gb
1/1 raid0 optl rw yes rwbd - off 931.0 gb
2/2 raid0 optl rw yes rwbd - off 931.0 gb
3/3 raid0 optl rw yes rwbd - off 931.0 gb
4/4 raid0 optl rw yes rwbd - off 931.0 gb
5/5 raid0 optl rw yes rwbd - off 931.0 gb
6/6 raid0 optl rw yes rwbd - off 931.0 gb
7/7 raid0 optl rw yes rwbd - off 931.0 gb
8/8 raid0 optl rw yes rwbd - off 931.0 gb
9/9 raid0 optl rw yes rwbd - off 931.0 gb
10/10 raid0 optl rw yes rwbd - off 931.0 gb
-------------------------------------------------------------
cac=cachecade|rec=recovery|ofln=offline|pdgd=partially degraded|dgrd=degraded
optl=optimal|ro=read only|rw=read write|hd=hidden|trans=transportready|b=blocked|
consist=consistent|r=read ahead always|nr=no read ahead|wb=writeback|
fwb=force writeback|wt=writethrough|c=cached io|d=direct io|scc=scheduled
check consistency
note:
vd: generally the device order in the OS. If there are only RAID volumes, vd 0 is /dev/sda, vd 1 is /dev/sdb, and so on; if there are also JBOD drives, those enumerate first, e.g. if the JBOD drives reach /dev/sdc, then vd 0 is /dev/sdd, and so on.
dg: the order in which the volume groups were configured on the RAID card.
RAID card log-collection commands
storcli64 /c0 show time    show the controller's time
storcli64 /c0 show alilog logfile=node-x.alilog    collect the alilog, which bundles all logs
storcli64 /c0 show all logfile=node-x.all.log    dump the RAID card information
storcli64 /c0 show badblocks    disk bad-block information
perccli64 /c0 show events filter=fatal    show only fatal-level events; useful for finding disk or RAID-card failures
perccli64 /c0 show cc    consistency-check status. RAID levels with redundancy (RAID 1 and above) need consistency checks across member disks; single-disk RAID 0 probably does not. Whether it affects performance is uncertain.
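For a support case, the collection commands above can be bundled into one helper. This is a sketch: the directory layout is an arbitrary choice, output is captured with shell redirects instead of the logfile= option, and show alilog is omitted since it is not present in every build.

```shell
# Bundle controller evidence into one directory for a support case.
# A sketch: directory layout is arbitrary; commands are from this document.
collect_raid_logs() {
  local out=${1:-raidlogs-$(hostname)-$(date +%Y%m%d)}
  mkdir -p "$out"
  perccli64 /c0 show time                > "$out/time.log"
  perccli64 /c0 show all                 > "$out/all.log"
  perccli64 /c0 show badblocks           > "$out/badblocks.log"
  perccli64 /c0 show events filter=fatal > "$out/fatal-events.log"
  perccli64 /c0 show cc                  > "$out/cc.log"
}
# Usage: collect_raid_logs node-15-logs
```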
section five : get disk error info with smartctl
common options and usage
--scan                  scan for devices
--scan-open             scan for devices and try to open each device
-x, --xall              show all information for device
-a, --all               show all smart information for device
-i, --info              show identity information for device
-d type, --device=type  specify device type to one of: ata, scsi, nvme[,nsid], sat[,auto][,n][+type], usbcypress[,x], usbjmicron[,p][,x][,n], usbprolific, usbsunplus, marvell, areca,n/e, 3ware,n, hpt,l/m/n, megaraid,n, aacraid,h,l,id, cciss,n, auto, test
-s value, --smart=value enable/disable smart on device (on/off)
-o value, --offlineauto=value  (ata) enable/disable automatic offline testing on device (on/off)
-S value, --saveauto=value     (ata) enable/disable attribute autosave on device (on/off)
-H, --health            show device smart health status
-c, --capabilities      (ata, nvme) show device smart capabilities
-A, --attributes        show device smart vendor-specific attributes and values
-l type, --log=type     show device log. type: error, selftest, selective, directory[,g|s],
                        xerror[,n][,error], xselftest[,n][,selftest],
                        background, sasphy[,reset], sataphy[,reset],
                        scttemp[sts,hist], scttempint,n[,p],
                        scterc[,n,m], devstat[,n], ssd,
                        gplog,n[,range], smartlog,n[,range],
                        nvmelog,n,size
-t test, --test=test    run test. test: offline, short, long, conveyance, force, vendor,n,
                        select,m-n, pending,n, afterselect,[on|off]
-X, --abort             abort any non-captive test on device
get info for /dev/sdf
List all devices:
[root@node-15 ~]# smartctl --scan
/dev/sda -d scsi # /dev/sda, scsi device
/dev/sdb -d scsi # /dev/sdb, scsi device
/dev/sdc -d scsi # /dev/sdc, scsi device
/dev/sdd -d scsi # /dev/sdd, scsi device
/dev/sde -d scsi # /dev/sde, scsi device
/dev/sdf -d scsi # /dev/sdf, scsi device
/dev/sdg -d scsi # /dev/sdg, scsi device
/dev/sdh -d scsi # /dev/sdh, scsi device
/dev/sdi -d scsi # /dev/sdi, scsi device
/dev/sdj -d scsi # /dev/sdj, scsi device
/dev/sdk -d scsi # /dev/sdk, scsi device
/dev/sdl -d scsi # /dev/sdl, scsi device
/dev/sdm -d scsi # /dev/sdm, scsi device
/dev/sdn -d scsi # /dev/sdn, scsi device
/dev/sdo -d scsi # /dev/sdo, scsi device
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], scsi device
/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], scsi device
/dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], scsi device
/dev/bus/0 -d megaraid,3 # /dev/bus/0 [megaraid_disk_03], scsi device
/dev/bus/0 -d megaraid,4 # /dev/bus/0 [megaraid_disk_04], scsi device
/dev/bus/0 -d megaraid,5 # /dev/bus/0 [megaraid_disk_05], scsi device
/dev/bus/0 -d megaraid,6 # /dev/bus/0 [megaraid_disk_06], scsi device
/dev/bus/0 -d megaraid,7 # /dev/bus/0 [megaraid_disk_07], scsi device
/dev/bus/0 -d megaraid,8 # /dev/bus/0 [megaraid_disk_08], scsi device
/dev/bus/0 -d megaraid,9 # /dev/bus/0 [megaraid_disk_09], scsi device
/dev/bus/0 -d megaraid,10 # /dev/bus/0 [megaraid_disk_10], scsi device
/dev/bus/0 -d megaraid,11 # /dev/bus/0 [megaraid_disk_11], scsi device
/dev/bus/0 -d megaraid,12 # /dev/bus/0 [megaraid_disk_12], scsi device
/dev/bus/0 -d megaraid,13 # /dev/bus/0 [megaraid_disk_13], scsi device
/dev/bus/0 -d megaraid,14 # /dev/bus/0 [megaraid_disk_14], scsi device
/dev/bus/0 -d megaraid,15 # /dev/bus/0 [megaraid_disk_15], scsi device
note:
In the earlier sections we located /dev/sdf at did (device id) 6 in perccli, i.e. /dev/bus/0 -d megaraid,6.
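Building on that mapping, a quick health sweep across all device ids only needs the verdict line from smartctl -H output. A sketch; the DID range 0-15 matches the 16 drives on this controller, adjust for yours:

```shell
# Pull the overall verdict line out of "smartctl -H" output, so a sweep
# over every MegaRAID device id can flag unhealthy drives in one line each.
smart_health() {
  awk -F': *' 'tolower($0) ~ /self-assessment test result/ { print $NF }'
}

# Usage sketch (DIDs 0-15 match the 16 drives on this controller):
#   for did in $(seq 0 15); do
#     echo "megaraid,$did: $(smartctl -H -d megaraid,$did /dev/bus/0 | smart_health)"
#   done
```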
View disk identity information
[root@node-15 ~]# smartctl -i -d megaraid,6 /dev/sdf
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-327.20.1.es2.el7.x86_64] (local build)
copyright (c) 2002-16, bruce allen, christian franke, www.smartmontools.org
=== start of information section ===
model family: seagate constellation.2 (sata)
device model: st91000640ns
serial number: 9xga228l
lu wwn device id: 5 000c50 0918f2f8a
add. product id: dell(tm)
firmware version: aa63
user capacity: 1,000,204,886,016 bytes [1.00 tb]
sector size: 512 bytes logical/physical
rotation rate: 7200 rpm
form factor: 2.5 inches
device is: in smartctl database [for details use: -p show]
ata version is: ata8-acs t13/1699-d revision 4
sata version is: sata 3.0, 6.0 gb/s (current: 6.0 gb/s)
local time is: fri jan 11 11:28:46 2019 cst
smart support is: available - device has smart capability.
smart support is: enabled
View the disk's SMART attributes
This is generally where to check the disk's overall health indicators.
Field descriptions for the output below:
id: attribute id, usually a decimal or hexadecimal number between 1 and 255.
attribute_name: the attribute name as defined by the drive manufacturer.
flag: attribute operation flag (can be ignored).
value: one of the most important columns; the normalized value of the attribute, between 1 and 253. 253 is best, 1 is worst. Depending on the attribute and the manufacturer, the initial value may be set to 100 or 200.
worst: the lowest value ever recorded.
thresh: the lowest worst value allowed before the drive reports a failed status; i.e. if worst drops below thresh, the disk reports failed.
type: the attribute type (pre-fail or old_age). A pre-fail attribute is a critical one that feeds into the overall SMART health assessment (passed/failed); if any pre-fail attribute fails, the disk is considered about to fail. An old_age attribute is non-critical (normal wear) and does not by itself mean the disk is failing.
updated: how often the attribute is updated. offline means it is updated when offline tests run on the disk.
when_failed: set to failing_now if value <= thresh; set to in_the_past if worst <= thresh; otherwise "-". With failing_now, back up important data as soon as possible, especially for pre-fail attributes. in_the_past means the attribute failed before but was fine during the last test. "-" means the attribute has never failed.
raw_value: the manufacturer-defined raw value from which value is derived.
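The when_failed rules above can be replayed over any such attribute table. A sketch, assuming the usual 10-column layout:

```shell
# Replay the VALUE/WORST/THRESH rules over a "smartctl -A" attribute table:
# report attributes at or below threshold now, or that dipped there before.
flag_attrs() {
  awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
    value = $4 + 0; worst = $5 + 0; thresh = $6 + 0
    if (thresh == 0) next                       # 0 means no threshold defined
    if (value <= thresh)      printf "%s %s: FAILING_NOW\n", $1, $2
    else if (worst <= thresh) printf "%s %s: IN_THE_PAST\n", $1, $2
  }'
}
# Usage: smartctl -A -d megaraid,6 /dev/sdf | flag_attrs
```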
[root@node-15 ~]# smartctl -A -d megaraid,6 /dev/sdf
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-327.20.1.es2.el7.x86_64] (local build)
copyright (c) 2002-16, bruce allen, christian franke, www.smartmontools.org
=== start of read smart data section ===
smart attributes data structure revision number: 10
vendor specific smart attributes with thresholds:
id# attribute_name flag value worst thresh type updated when_failed raw_value
1 raw_read_error_rate 0x010f 081 038 044 pre-fail always in_the_past 151546765
3 spin_up_time 0x0103 094 094 000 pre-fail always - 0
4 start_stop_count 0x0032 100 100 020 old_age always - 21
5 reallocated_sector_ct 0x0133 100 100 036 pre-fail always - 0
7 seek_error_rate 0x000f 085 060 030 pre-fail always - 338813105
9 power_on_hours 0x0032 079 079 000 old_age always - 18784
10 spin_retry_count 0x0013 100 100 097 pre-fail always - 0
12 power_cycle_count 0x0032 100 100 020 old_age always - 21
184 end-to-end_error 0x0032 100 100 099 old_age always - 0
187 reported_uncorrect 0x0032 001 001 000 old_age always - 1710
188 command_timeout 0x0032 100 100 000 old_age always - 0
189 high_fly_writes 0x003a 100 100 000 old_age always - 0
190 airflow_temperature_cel 0x0022 069 053 045 old_age always - 31 (min/max 24/40)
191 g-sense_error_rate 0x0032 100 100 000 old_age always - 0
192 power-off_retract_count 0x0032 100 100 000 old_age always - 19
193 load_cycle_count 0x0032 100 100 000 old_age always - 852
194 temperature_celsius 0x0022 031 047 000 old_age always - 31 (0 14 0 0 0)
195 hardware_ecc_recovered 0x001a 117 099 000 old_age always - 151546765
197 current_pending_sector 0x0012 084 084 000 old_age always - 688
198 offline_uncorrectable 0x0010 084 084 000 old_age offline - 688
199 udma_crc_error_count 0x003e 200 200 000 old_age always - 0
240 head_flying_hours 0x0000 100 253 000 old_age offline - 8093 (164 214 0)
241 total_lbas_written 0x0000 100 253 000 old_age offline - 1870535293
242 total_lbas_read 0x0000 100 253 000 old_age offline - 1530387871
View the disk's overall health status
note:
The result below is passed, meaning the disk is still usable, but one marginal attribute whose worst value tripped the threshold is listed.
[root@node-15 ~]# smartctl -H -d megaraid,6 /dev/sdf
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-327.20.1.es2.el7.x86_64] (local build)
copyright (c) 2002-16, bruce allen, christian franke, www.smartmontools.org
=== start of read smart data section ===
smart status not supported: ata return descriptor not supported by controller firmware
smart overall-health self-assessment test result: passed
warning: this result is based on an attribute check.
please note the following marginal attributes:
id# attribute_name flag value worst thresh type updated when_failed raw_value
1 raw_read_error_rate 0x010f 081 038 044 pre-fail always in_the_past 151546765
View the disk's error log
[root@node-15 ~]# smartctl -l error -d megaraid,6 /dev/sdf
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-3.10.0-327.20.1.es2.el7.x86_64] (local build)
copyright (c) 2002-16, bruce allen, christian franke, www.smartmontools.org
=== start of read smart data section ===
smart error log version: 1
ata error count: 46431 (device log contains only the most recent five errors)
cr = command register [hex]
fr = features register [hex]
sc = sector count register [hex]
sn = sector number register [hex]
cl = cylinder low register [hex]
ch = cylinder high register [hex]
dh = device/head register [hex]
dc = device command register [hex]
er = error register [hex]
st = status register [hex]
powered_up_time is measured from power on, and printed as
ddd+hh:mm:ss.sss where dd=days, hh=hours, mm=minutes,
ss=sec, and sss=millisec. it "wraps" after 49.710 days.
error 46431 occurred at disk power-on lifetime: 18640 hours (776 days + 16 hours)
when the command that caused the error occurred, the device was active or idle.
after command completion occurred, registers were:
er st sc sn cl ch dh
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f error: unc at lba = 0x0fffffff = 268435455
commands leading to the command that caused the error were:
cr fr sc sn cl ch dh dc powered_up_time command/feature_name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 00 00 ff ff ff 4f 00 46d+15:15:32.968 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:29.901 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:26.825 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:23.965 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:20.905 read verify sector(s) ext
error 46430 occurred at disk power-on lifetime: 18640 hours (776 days + 16 hours)
when the command that caused the error occurred, the device was active or idle.
after command completion occurred, registers were:
er st sc sn cl ch dh
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f error: unc at lba = 0x0fffffff = 268435455
commands leading to the command that caused the error were:
cr fr sc sn cl ch dh dc powered_up_time command/feature_name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 00 00 ff ff ff 4f 00 46d+15:15:29.901 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:26.825 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:23.965 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:20.905 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:18.093 read verify sector(s) ext
error 46429 occurred at disk power-on lifetime: 18640 hours (776 days + 16 hours)
when the command that caused the error occurred, the device was active or idle.
after command completion occurred, registers were:
er st sc sn cl ch dh
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f error: unc at lba = 0x0fffffff = 268435455
commands leading to the command that caused the error were:
cr fr sc sn cl ch dh dc powered_up_time command/feature_name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 00 00 ff ff ff 4f 00 46d+15:15:26.825 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:23.965 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:20.905 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:18.093 read verify sector(s) ext
b0 da 00 00 4f c2 00 00 46d+15:15:17.838 smart return status
error 46428 occurred at disk power-on lifetime: 18640 hours (776 days + 16 hours)
when the command that caused the error occurred, the device was active or idle.
after command completion occurred, registers were:
er st sc sn cl ch dh
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f error: unc at lba = 0x0fffffff = 268435455
commands leading to the command that caused the error were:
cr fr sc sn cl ch dh dc powered_up_time command/feature_name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 00 00 ff ff ff 4f 00 46d+15:15:23.965 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:20.905 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:18.093 read verify sector(s) ext
b0 da 00 00 4f c2 00 00 46d+15:15:17.838 smart return status
2f 00 01 e0 00 00 40 00 46d+15:15:17.703 read log ext
error 46427 occurred at disk power-on lifetime: 18640 hours (776 days + 16 hours)
when the command that caused the error occurred, the device was active or idle.
after command completion occurred, registers were:
er st sc sn cl ch dh
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f error: unc at lba = 0x0fffffff = 268435455
commands leading to the command that caused the error were:
cr fr sc sn cl ch dh dc powered_up_time command/feature_name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 00 00 ff ff ff 4f 00 46d+15:15:20.905 read verify sector(s) ext
42 00 00 ff ff ff 4f 00 46d+15:15:18.093 read verify sector(s) ext
b0 da 00 00 4f c2 00 00 46d+15:15:17.838 smart return status
2f 00 01 e0 00 00 40 00 46d+15:15:17.703 read log ext
42 00 00 ff ff ff 4f 00 46d+15:15:15.276 read verify sector(s) ext
Additional notes
If SMART is not enabled on a disk, enable it with -s on.
Generally, if smartctl -i returns little information even though SMART support is shown as available and enabled, you may need to run a self-test first (-t short/long) before data appears. These tests do not destroy data on the drive, but offline tests are generally unsuitable for production storage.
When collecting data, the -x and -a options give more complete disk information.
smartctl can also run as a daemon configured through /etc/smartmontools/smartd.conf; that has not been explored here and may be covered in a future update.
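For reference, a minimal smartd.conf entry to watch the same drive might look like the following. This is an assumption based on smartd's documented megaraid support, not something tested here; the mail address is a placeholder.

```
# /etc/smartmontools/smartd.conf -- hypothetical example entry:
# monitor MegaRAID device id 6 behind /dev/bus/0, run all checks (-a),
# and mail warnings (address is a placeholder).
/dev/bus/0 -d megaraid,6 -a -m root@localhost
```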