Exploring NTFS
WebCrazy (tsu00@263.net)

NTFS is the file system introduced with Windows NT, and it brings many new features. This article explores the low-level structure of NTFS and covers only how files are laid out on an NTFS volume. On NTFS, everything stored on the volume is recorded in a file named $MFT, the Master File Table. $MFT consists of an array of File Records. The size of a File Record is generally fixed, usually 1 KB, and the concept corresponds to the inode in Linux. File Records are stored contiguously within the $MFT file and are numbered from 0. $MFT is used only by the file system itself to organize and structure the volume; NTFS calls this kind of data metadata. The list below shows the NTFS metadata files in the released version of Windows 2000 (it is part of the output of the sample code presented later).

    File Record(inode) FileName
    ------------------ --------
          0             $MFT
          1             $MFTMirr
          2             $LogFile
          3             $Volume
          4             $AttrDef
          5             .
          6             $Bitmap
          7             $Boot
          8             $BadClus
          9             $Secure
         10             $UpCase
         11             $Extend

In Windows 2000 the dir command cannot list these metadata files the way it lists ordinary files, not even with the /ah switch. The file system driver (ntfs.sys) maintains a variable, NtfsProtectSystemFiles, that hides them. It is TRUE by default, so dir /ah returns none of these files. Knowing this, you can use i386kd to clear NtfsProtectSystemFiles and then list the metadata files:

    kd> x ntfs!NtfsProtect*
    fe213498  Ntfs!NtfsProtectSystemFiles
    fe21349c  Ntfs!NtfsProtectSystemAttributes
    kd> dd ntfs!NtfsProtectSystemFiles l 2
    fe213498  00000001 00000001
    kd> ed ntfs!NtfsProtectSystemFiles 0
    kd> dd ntfs!NtfsProtectSystemFiles l 2
    fe213498  00000000 00000001
    kd>

    D:\>ver

    Microsoft Windows 2000 [Version 5.00.2195]

    D:\>dir /ah $*
     Volume in drive D is W2KNTFS
     Volume Serial Number is E831-9D04

     Directory of D:\

    2000-04-27  19:31               36,000 $AttrDef
    2000-04-27  19:31                    0 $BadClus
    2000-04-27  19:31               67,336 $Bitmap
    2000-04-27  19:31                8,192 $Boot
    2000-04-27  19:31       <DIR>          $Extend
    2000-04-27  19:31           13,139,968 $LogFile
    2000-04-27  19:31           27,575,296 $MFT
    2000-04-27  19:31                4,096 $MFTMirr
    2000-04-27  19:31              131,072 $UpCase
    2000-04-27  19:31                    0 $Volume
                   9 File(s)     40,961,960 bytes
                   1 Dir(s)      51,863,552 bytes free

Note that ntfs.sys opens the metadata files in a special way, so once NtfsProtectSystemFiles has been cleared, calling ReadFile or anything else that generates IRP_MJ_READ and similar IRPs against them causes a page fault (see Gary Nebbett's "Windows NT/2000 Native API Reference").

Everything so far has been based on the $MFT file, that is, on the File Records (inodes) in $MFT. To make the rest of the discussion easier to follow, here is the structure of the File Record header:

    typedef struct {
        ULONG Type;
        USHORT UsaOffset;
        USHORT UsaCount;
        USN Usn;
    } NTFS_RECORD_HEADER, *PNTFS_RECORD_HEADER;

    typedef struct {
        NTFS_RECORD_HEADER Ntfs;
        USHORT SequenceNumber;
        USHORT LinkCount;
        USHORT AttributesOffset;
        USHORT Flags;               // 0x0001 = InUse, 0x0002 = Directory
        ULONG BytesInUse;
        ULONG BytesAllocated;
        ULONGLONG BaseFileRecord;
        USHORT NextAttributeNumber;
    } FILE_RECORD_HEADER, *PFILE_RECORD_HEADER;

Next I discuss how to locate $MFT. Anyone with a little operating-system background knows about the boot sector, which is physically the first sector of the volume. Below is the first sector as shown by dskprobe.exe, a small tool from the Windows 2000 Resource Kit (other programs such as WinHex work just as well):

    File: d:\Sector00.bin
    Size: 0x00000200 (512)

    Address  | 00 01 02 03-04 05 06 07 : 08 09 0A 0B-0C 0D 0E 0F | 0123456789ABCDEF
    ---------|-------------------------:-------------------------|-----------------
    00000000 | EB 52 90 4E-54 46 53 20 : 20 20 20 00-02 08 00 00 | .R.NTFS    .....
    00000010 | 00 00 00 00-00 F8 00 00 : 3F 00 F0 00-3F 00 00 00 | ........?...?...
    00000020 | 00 00 00 00-80 00 80 00 : 90 C0 41 00-00 00 00 00 | ..........A.....
    00000030 | 04 00 00 00-00 00 00 00 : 09 1C 04 00-00 00 00 00 | ................
    00000040 | F6 00 00 00-01 00 00 00 : 04 9D 31 E8-BB 31 E8 94 | ..........1..1..
                        .                         .
                        .                         .
                        .                         .
    000001F0 | 00 00 00 00-00 00 00 00 : 83 A0 B3 C9-00 00 55 AA | ..............U.
These 512 bytes have the following format (taken from Gary Nebbett's book; much of the code in this article comes from or is based on that book):

    #pragma pack(push, 1)

    typedef struct {
        UCHAR Jump[3];
        UCHAR Format[8];
        USHORT BytesPerSector;
        UCHAR SectorsPerCluster;
        USHORT BootSectors;
        UCHAR Mbz1;
        USHORT Mbz2;
        USHORT Reserved1;
        UCHAR MediaType;
        USHORT Mbz3;
        USHORT SectorsPerTrack;
        USHORT NumberOfHeads;
        ULONG PartitionOffset;
        ULONG Reserved2[2];
        ULONGLONG TotalSectors;
        ULONGLONG MftStartLcn;
        ULONGLONG Mft2StartLcn;
        ULONG ClustersPerFileRecord;
        ULONG ClustersPerIndexBlock;
        ULONGLONG VolumeSerialNumber;
        UCHAR Code[0x1AE];
        USHORT BootSignature;
    } BOOT_BLOCK, *PBOOT_BLOCK;

    #pragma pack(pop)

The meaning of each field is mostly clear from its name. The linux-ntfs project (http://sf.net/projects/linux-ntfs) also has detailed documentation, which I do not reproduce here for reasons of space. The first sector of the volume can be read with the following code:

    hVolume = CreateFile(drive, GENERIC_READ, FILE_SHARE_READ | FILE_SHARE_WRITE, 0,
                         OPEN_EXISTING, 0, 0);
    ReadFile(hVolume, &bootb, sizeof(bootb), &n, 0);

bootb is a BOOT_BLOCK structure. On my volume it contains the following values (compare them with Sector00.bin):

    Dump BootBlock at below:
        BytesPerSector:200
        SectorsPerCluster:8
        BootSectors:0
        SectorsPerTrack:3F
        NumberOfHeads:F0
        PartitionOffset:3F
        TotalSectors:41C090
        MftStartLcn:4
        Mft2StartLcn:41C09
        ClustersPerFileRecord:F6
        ClustersPerIndexBlock:1
        VolumeSerialNumber:E8319D04
        BootSignature:AA55
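The values in this dump can be turned into byte quantities with a little arithmetic. The following is a minimal, portable sketch (not the author's code) that decodes the first 0x50 bytes of the Sector00.bin dump shown earlier. The names ReadLe, BootInfo and DecodeBootSector are hypothetical helpers introduced for illustration. It also assumes the usual NTFS convention that a ClustersPerFileRecord byte of 0x80 or above is a negative number n, meaning the record size is 2^(-n) bytes; under that assumption F6 gives 1 KB, which agrees with the File Record size stated at the start of the article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// First 0x50 bytes of Sector00.bin, copied from the dskprobe dump.
static const uint8_t kBootSector[0x50] = {
    0xEB,0x52,0x90,0x4E,0x54,0x46,0x53,0x20, 0x20,0x20,0x20,0x00,0x02,0x08,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0xF8,0x00,0x00, 0x3F,0x00,0xF0,0x00,0x3F,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x80,0x00,0x80,0x00, 0x90,0xC0,0x41,0x00,0x00,0x00,0x00,0x00,
    0x04,0x00,0x00,0x00,0x00,0x00,0x00,0x00, 0x09,0x1C,0x04,0x00,0x00,0x00,0x00,0x00,
    0xF6,0x00,0x00,0x00,0x01,0x00,0x00,0x00, 0x04,0x9D,0x31,0xE8,0xBB,0x31,0xE8,0x94,
};

// Read an unsigned little-endian integer of `size` bytes starting at `offset`.
static uint64_t ReadLe(const uint8_t* p, size_t offset, size_t size) {
    uint64_t v = 0;
    for (size_t i = 0; i < size; ++i) v |= uint64_t(p[offset + i]) << (8 * i);
    return v;
}

// Hypothetical summary type; the field offsets follow the packed BOOT_BLOCK layout.
struct BootInfo {
    uint32_t bytesPerCluster;
    uint64_t mftByteOffset;       // byte offset of $MFT from the start of the volume
    uint32_t bytesPerFileRecord;
};

static BootInfo DecodeBootSector(const uint8_t* s) {
    const uint32_t bytesPerSector    = static_cast<uint32_t>(ReadLe(s, 0x0B, 2));
    const uint32_t sectorsPerCluster = static_cast<uint32_t>(ReadLe(s, 0x0D, 1));
    const uint64_t mftStartLcn       = ReadLe(s, 0x30, 8);
    assert(bytesPerSector != 0 && sectorsPerCluster != 0);

    BootInfo info;
    info.bytesPerCluster = bytesPerSector * sectorsPerCluster;
    info.mftByteOffset   = mftStartLcn * info.bytesPerCluster;

    // Assumption: the low byte of ClustersPerFileRecord is a signed value.
    int clusters = s[0x40];
    if (clusters >= 0x80) clusters -= 0x100;
    info.bytesPerFileRecord = clusters >= 0
        ? static_cast<uint32_t>(clusters) * info.bytesPerCluster
        : uint32_t(1) << (-clusters);
    return info;
}
```

Calling DecodeBootSector(kBootSector) on this volume yields 4096 bytes per cluster, a $MFT byte offset of 16384, and 1024 bytes per File Record. Reading the fields byte by byte avoids depending on struct packing or host endianness, which is why the sketch does not reuse BOOT_BLOCK directly.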
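The MftStartLcn value above is the cluster number at which $MFT starts on the volume. The cluster is the basic, smallest allocation unit in NTFS: space outside the MFT is allocated in whole clusters, so even a single byte of nonresident data occupies a full cluster. NTFS uses the LCN (Logical Cluster Number) to denote a physical position on the volume; LCNs simply run from 0 to the total number of clusters minus one. Within a particular file, NTFS uses VCNs (Virtual Cluster Numbers), which are mapped to LCNs, to organize the file's contents. The MftStartLcn value of 4 tells us that $MFT starts at LCN 4; combined with SectorsPerCluster and BytesPerSector this locates $MFT on disk, here at byte offset 4 * 8 * 512 = 16,384. Once $MFT has been located, walking all the File Records in it yields the list of every file on the volume (as noted earlier, File Records are simply numbered from 0). At this point we have the simplest picture of how files are organized, but how do we obtain information about a file, such as its name? In NTFS every file, ordinary user files and metadata files alike, organizes its data and attributes in the same way. I start with the output of nfi.exe (from the Windows NT/2000 OEM Support Tools):

    D:\>copy con file
    testforntfs^Z
            1 file(s) copied.

    D:\>nfi d:\file
    NTFS File Sector Information Utility.
    Copyright (C) Microsoft Corporation 1999. All rights reserved.

    \file
        $STANDARD_INFORMATION (resident)
        $FILE_NAME (resident)
        $DATA (resident)

    D:\>echo testforattr>file:ATTR

    D:\>nfi d:\file
    NTFS File Sector Information Utility.
    Copyright (C) Microsoft Corporation 1999. All rights reserved.

    \file
        $STANDARD_INFORMATION (resident)
        $FILE_NAME (resident)
        $DATA (resident)
        $DATA ATTR (resident)

The items in nfi's output ($STANDARD_INFORMATION, $FILE_NAME, $DATA and so on) are called attributes in NTFS. Attributes are either resident or nonresident. A file's data is itself stored in an attribute, which may seem at odds with the word "attribute", but it gives NTFS a more uniform way of organizing files. It is also what gives NTFS its multiple-streams feature (demonstrated above). The code that locates a given attribute within a specified File Record is as follows:

    template <class T1, class T2> inline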
    T1* Padd(T1* p, T2 n) { return (T1*)((char *)p + n); }

    PATTRIBUTE FindAttribute(PFILE_RECORD_HEADER file,
                             ATTRIBUTE_TYPE type, PWSTR name)
    {
        // Walk the attribute list; an AttributeType of -1 marks its end.
        for (PATTRIBUTE attr = PATTRIBUTE(Padd(file, file->AttributesOffset));
             attr->AttributeType != -1;
             attr = Padd(attr, attr->Length)) {
            if (attr->AttributeType == type) {
                if (name == 0 && attr->NameLength == 0) return attr;
                if (name != 0 && wcslen(name) == attr->NameLength
                    && _wcsicmp(name, PWSTR(Padd(attr, attr->NameOffset))) == 0) return attr;
            }
        }
        return 0;
    }

This FindAttribute function, as provided by Gary Nebbett, can misbehave when the attribute name (the third parameter) is not empty. The reason is that _wcsicmp expects standard null-terminated C strings, whereas the name stored in the attribute is counted by NameLength and is not null-terminated. I have corrected this in the code I provide.

Next I use SoftICE to step through this code as it retrieves the $FILE_NAME attribute of $MFT and, from it, the file name of $MFT. The same procedure works for the $FILE_NAME of any other file (such as the file created above) and for other attributes such as $DATA.

    :bpx FindAttribute
    Break due to BPX FindAttribute  (ET=6.89 seconds)
    :locals
        [EBP-4] +struct ATTRIBUTE * attr = 0x00344D68 <{...}>
        [EBP+8] +struct FILE_RECORD_HEADER * file = 0x00344D38 <{...}>
        [EBP+C]  enum ATTRIBUTE_TYPE type = AttributeFileName (30)
        [EBP+10] +unsigned short * name = 0x004041BC <"$MFT">
    :? file
    struct FILE_RECORD_HEADER * = 0x00344D38 <{...}>
     struct NTFS_RECORD_HEADER Ntfs = {...}
     unsigned short SequenceNumber = 0x1, "\0\x01"
     unsigned short LinkCount = 0x1, "\0\x01"
     unsigned short AttributesOffset = 0x30, "\00"
     unsigned short Flags = 0x1, "\0\x01"
     unsigned long BytesInUse = 0x2D8, "\0\0\x02\xD8"
     unsigned long BytesAllocated = 0x400, "\0\0\x04\0"
     unsigned quad BaseFileRecord = 0x0, "\0\0\0\0\0\0\0\0"
     unsigned short NextAttributeNumber = 0x6, "\0\x06"

The file argument I passed in is the File Record of $MFT itself. Its physical location on the volume follows from its LCN of 4, as explained above. You can also use dskprobe (on my machine, at sector LCN * SectorsPerCluster = 4 * 8 = 32) to obtain the same data as the SoftICE output below:

    :dd @file   // The annotations below can be checked against the FILE_RECORD_HEADER definition given earlier.
    0023:00344D38 454C4946  0003002A  6D4AC04D  00000000      FILE*...M.Jm....
    0023:00344D48 00010001  00010030  000002D8  00000400      ....0...........
                                ----
                                 |__AttributesOffset
    0023:00344D58 00000000  00000000  04340006  0000FA0D      ..........4.....
    0023:00344D68 00000010  00000060  00180000  00000000      ....`...........
                  --------  --------
                     |         |_ Length of this attribute (definition below).
                     |_ Attribute header, located via AttributesOffset (definition below). 00000010 marks this attribute as StandardInformation.
    0023:00344D78 00000048  00000018  2C1761D0  01BFB03C      H........a.,<...
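The same fields can be read off this dump programmatically. The sketch below is an illustration under stated assumptions rather than the author's code: it takes the first 0x40 bytes of the $MFT File Record exactly as the 32-bit values printed by dd, extracts the header fields at the offsets implied by FILE_RECORD_HEADER, and then reads the type and length of the first attribute, which sit in the first two 32-bit values of the attribute header. Word, Dword, RecordSummary and Summarize are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// First 0x40 bytes of the $MFT File Record, as the 32-bit values shown by "dd @file".
static const uint32_t kRecord[16] = {
    0x454C4946, 0x0003002A, 0x6D4AC04D, 0x00000000,
    0x00010001, 0x00010030, 0x000002D8, 0x00000400,
    0x00000000, 0x00000000, 0x04340006, 0x0000FA0D,
    0x00000010, 0x00000060, 0x00180000, 0x00000000,
};

// 16-bit field at byte offset `off` (even) within the little-endian dump.
static uint16_t Word(const uint32_t* d, size_t off) {
    return static_cast<uint16_t>(d[off / 4] >> (8 * (off % 4)));
}

// 32-bit field at byte offset `off` (a multiple of 4).
static uint32_t Dword(const uint32_t* d, size_t off) {
    return d[off / 4];
}

// Hypothetical summary of the header plus the first attribute.
struct RecordSummary {
    bool     isFileRecord;     // Type reads "FILE"
    uint16_t attributesOffset; // FILE_RECORD_HEADER.AttributesOffset, offset 0x14
    bool     inUse;            // Flags & 0x0001
    bool     isDirectory;      // Flags & 0x0002
    uint32_t bytesInUse;       // offset 0x18
    uint32_t bytesAllocated;   // offset 0x1C
    uint32_t firstAttrType;    // first 32-bit value of the first attribute
    uint32_t firstAttrLength;  // second 32-bit value of the first attribute
    uint32_t nextAttrOffset;   // where the following attribute starts
};

static RecordSummary Summarize(const uint32_t* d) {
    RecordSummary r;
    r.isFileRecord     = Dword(d, 0x00) == 0x454C4946;  // 'F','I','L','E' in memory order
    r.attributesOffset = Word(d, 0x14);
    const uint16_t flags = Word(d, 0x16);
    r.inUse            = (flags & 0x0001) != 0;
    r.isDirectory      = (flags & 0x0002) != 0;
    r.bytesInUse       = Dword(d, 0x18);
    r.bytesAllocated   = Dword(d, 0x1C);
    assert(r.attributesOffset % 4 == 0);  // attributes are dword-aligned in this dump
    r.firstAttrType    = Dword(d, r.attributesOffset);
    r.firstAttrLength  = Dword(d, r.attributesOffset + 4);
    r.nextAttrOffset   = r.attributesOffset + r.firstAttrLength;
    return r;
}
```

For the record above, Summarize(kRecord) reports an in-use, non-directory FILE record whose first attribute has type 0x10 and length 0x60, so the next attribute begins at offset 0x30 + 0x60 = 0x90 from the start of the record.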
The attribute header is defined as follows:

    typedef struct {
        ATTRIBUTE_TYPE AttributeType;
        ULONG Length;
        BOOLEAN Nonresident;
        UCHAR NameLength;
        USHORT NameOffset;
        USHORT Flags;               // 0x0001 = Compressed
        USHORT AttributeNumber;
    } ATTRIBUTE, *PATTRIBUTE;

    typedef struct {
        ATTRIBUTE Attribute;
        ULONG ValueLength;
        USHORT ValueOffset;
        USHORT Flags;               // 0x0001 = Indexed
    } RESIDENT_ATTRIBUTE, *PRESIDENT_ATTRIBUTE;

    typedef struct {
        ULONGLONG DirectoryFileReferenceNumber;
        ULONGLONG CreationTime;   // Saved when filename last changed
        ULONGLONG ChangeTime;     // ditto
        ULONGLONG LastWriteTime;  // ditto
        ULONGLONG LastAccessTime; // ditto
        ULONGLONG AllocatedSize;  // ditto
        ULONGLONG DataSize;       // ditto
        ULONG FileAttributes;     // ditto
        ULONG AlignmentOrReserved;
        UCHAR NameLength;
        UCHAR NameType;           // 0x01 = Long, 0x02 = Short
        WCHAR Name[1];
    } FILENAME_ATTRIBUTE, *PFILENAME_ATTRIBUTE;

ATTRIBUTE_TYPE is an enum. The value 0x10 is StandardInformation and 0x30 is FileName. Because the FileName attribute is always resident, I have given the definition of RESIDENT_ATTRIBUTE as well. Now we can go on to dump the next attribute:

    // dd @file+file->AttributesOffset+length(StandardInformationAttribute)
    :dd @file+30+60
    0023:00344DC8 00000030  00000068  00180000  00030000      0...h...........
                  --------            ------
                     |                   |___ NameLength and NameOffset here refer to the name of the attribute itself. Do not confuse them with the file name "$MFT".
                     |_ This is a FileName attribute.
    0023:00344DD8 0000004A  00010018  00000005  00050000      J...............
                  --------      ----  --------
                     |            |       |_ FILENAME_ATTRIBUTE starts here, as located by ValueOffset.
                     |            |_ ValueOffset
                     |_ ValueLength
    0023:00344DE8 2C1761D0  01BFB03C  2C1761D0  01BFB03C      .a.,<....a.,<...
    0023:00344DF8 2C1761D0  01BFB03C  2C1761D0  01BFB03C      .a.,<....a.,<...
    0023:00344E08 00004000  00000000  00004000  00000000      .@.......@......
    0023:00344E18 00000006  00000000  00240304  0046004D      ..........$.M.F.
                                            --  --------
                                            |      |___ Here is the file name of $MFT.
                                            |_ NameLength
    0023:00344E28 00000054  00000000  00000080  00000190      T...............
    0023:00344E38 00400001  00010000  00000000  00000000      ..@.............

This shows one concrete way of dumping an attribute. Finally I give the code that walks the File Records, but first a word about the $BITMAP attribute of $MFT. This attribute corresponds to the s_inode_bitmap array of Linux ext2 (in Linux 2.0), so its role is easy to understand: each bit indicates whether the corresponding File Record is in use. Here is the code of DumpAllFileRecord:

    BOOL bitset(PUCHAR bitmap, ULONG i)
    {
        return (bitmap[i >> 3] & (1 << (i & 7))) != 0;
    }

    VOID DumpAllFileRecord()
    {
        PATTRIBUTE attr = FindAttribute(MFT, AttributeBitmap, 0);
        PUCHAR bitmap = new UCHAR[AttributeLengthAllocated(attr)];
        ReadAttribute(attr, bitmap);
        ULONG n = AttributeLength(FindAttribute(MFT, AttributeData, 0)) / BytesPerFileRecord;
        PFILE_RECORD_HEADER file = PFILE_RECORD_HEADER(new UCHAR[BytesPerFileRecord]);
        for (ULONG i = 0; i < n; i++) {
            if (!bitset(bitmap, i)) continue;
            ReadFileRecord(i, file);
            if (file->Ntfs.Type == 'ELIF' && (file->Flags & 3 )) {
                attr = FindAttribute(file, AttributeFileName, 0);
                if (attr == 0) continue;
                PFILENAME_ATTRIBUTE name
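                    = PFILENAME_ATTRIBUTE(Padd(attr, PRESIDENT_ATTRIBUTE(attr)->ValueOffset));
                printf("%8lu %.*ws\n", i, int(name->NameLength), name->Name);
            }
        }
    }

Some of the Gary Nebbett definitions quoted in this article may differ slightly from the Windows 2000 release. Although Microsoft does not publish this information, the Internet offers good material for studying NTFS, such as the linux-ntfs project, whose documentation this article draws on heavily. Mark Russinovich's "Inside Win2K NTFS", "Inside NTFS" and "Exploring NTFS On-disk Structures" are also good NTFS references. This article does not yet cover how NTFS organizes directories (B+ trees) and related topics; if possible I will write about them separately. The complete code described here can be downloaded from http://webcrazy.yeah.net. Corrections are welcome by email (tsu00@263.net).

Finally, thanks to Anton Altaparmakov, thanks to my colleague who took time during a business trip to buy Gary Nebbett's book for me, and thanks to the original authors of all the material I have read.

References:

    1. Gary Nebbett, "Windows NT/2000 Native API Reference"
    2. Linux-NTFS Project, NTFS Documentation Version 0.4
    3. Mark Russinovich, related articles
    4. David Solomon, "Inside Windows NT, 2nd Edition"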