10 Figures, 22 Code Listings: A Long Read on the Virtual Memory Model and the Internals of malloc
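Success cannot be rushed. Stop fretting over immediate gains and losses, keep your attention on the work in front of you, and keep putting in the effort; that is how you move forward step by step and gradually reach your goal. Stay unhurried and composed, and the results will follow naturally.
Hello everyone, I'm 程序喵!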
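Exploring virtual memory through the /proc filesystem
We will use the /proc filesystem to find the virtual memory address of a string in a running process, then change the string by overwriting the contents at that address. Seeing this work should give you a more concrete feel for what virtual memory is. First, a definition.
Virtual memory
Virtual memory is a memory management technique implemented jointly by hardware and software. It maps the memory addresses a program uses (virtual addresses) to physical addresses in the machine's memory. This frees applications from managing the memory space themselves, and the isolation between processes improves security. A process's virtual addresses usually form a contiguous address space controlled by the operating system's memory management module, which uses paging to back virtual pages with physical memory when a page fault occurs. On a 64-bit machine the virtual address space is far larger than physical memory, so a process can address more memory than is physically installed.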
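To make the paging idea concrete, here is a minimal sketch in Python of how a virtual address splits into a page number and an offset within that page. The function name split_virtual_address is an illustrative assumption, and the sample address is only an example of the kind of heap address printed later in this article; the page size is read from the running system.

```python
import os


def split_virtual_address(addr, page_size):
    """Split a virtual address into (virtual page number, offset in page)."""
    return addr // page_size, addr % page_size


if __name__ == "__main__":
    # Page size of the running system (commonly 4096 bytes on x86-64 Linux).
    page_size = os.sysconf("SC_PAGE_SIZE")
    page, offset = split_virtual_address(0x21DC010, page_size)
    print("page size: {} bytes".format(page_size))
    print("page number: {:#x}, offset in page: {:#x}".format(page, offset))
```

The operating system translates the page number to a physical frame; the offset is carried over unchanged.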
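Before digging deeper, keep a few key points in mind:
- Each process has its own virtual memory
- The size of the virtual memory depends on the system architecture
- Operating systems manage virtual memory in different ways, but on most of them the layout looks like the following figure:
[Figure: virtual_memory.png]
The figure above is not a fully detailed memory map; the high addresses also hold kernel space and more, but that is outside the scope of this article. The figure shows that the command-line arguments and environment variables sit at the high addresses, followed by the stack, the heap and the executable. The stack grows downward and the heap grows upward. Heap space is obtained with malloc and is part of dynamically allocated memory.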
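First, we explore virtual memory with a simple C program.
My machine is 64-bit, so a process's virtual address space runs from 0x0 up to at most 0xffffffffffffffff. The address 0x88f010 of the duplicated string (a heap address) is far below 0xffffffffffffffff, so we can infer that it lives near the low end of the address space. The /proc filesystem lets us verify this.
Running ls /proc shows many entries; the ones that matter here are /proc/[pid]/mem and /proc/[pid]/maps.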
mem & maps
From man proc:

    /proc/[pid]/mem
        This file can be used to access the pages of a process's memory
        through open(2), read(2), and lseek(2).

    /proc/[pid]/maps
        A file containing the currently mapped memory regions and their
        access permissions. See mmap(2) for some further information
        about memory mappings.

        The format of the file is:

        address           perms offset  dev   inode       pathname
        00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/dbus-daemon
        00651000-00652000 r--p 00051000 08:02 173521      /usr/bin/dbus-daemon
        00652000-00655000 rw-p 00052000 08:02 173521      /usr/bin/dbus-daemon
        00e03000-00e24000 rw-p 00000000 00:00 0           [heap]
        00e24000-011f7000 rw-p 00000000 00:00 0           [heap]
        ...
        35b1800000-35b1820000 r-xp 00000000 08:02 135522  /usr/lib64/ld-2.15.so
        35b1a1f000-35b1a20000 r--p 0001f000 08:02 135522  /usr/lib64/ld-2.15.so
        35b1a20000-35b1a21000 rw-p 00020000 08:02 135522  /usr/lib64/ld-2.15.so
        35b1a21000-35b1a22000 rw-p 00000000 00:00 0
        35b1c00000-35b1dac000 r-xp 00000000 08:02 135870  /usr/lib64/libc-2.15.so
        35b1dac000-35b1fac000 ---p 001ac000 08:02 135870  /usr/lib64/libc-2.15.so
        35b1fac000-35b1fb0000 r--p 001ac000 08:02 135870  /usr/lib64/libc-2.15.so
        35b1fb0000-35b1fb2000 rw-p 001b0000 08:02 135870  /usr/lib64/libc-2.15.so
        ...
        f2c6ff8c000-7f2c7078c000 rw-p 00000000 00:00 0    [stack:986]
        ...
        7fffb2c0d000-7fffb2c2e000 rw-p 00000000 00:00 0   [stack]
        7fffb2d48000-7fffb2d49000 r-xp 00000000 00:00 0   [vdso]

        The address field is the address space in the process that the
        mapping occupies. The perms field is a set of permissions:

            r = read
            w = write
            x = execute
            s = shared
            p = private (copy on write)

        The offset field is the offset into the file/whatever; dev is the
        device (major:minor); inode is the inode on that device. 0
        indicates that no inode is associated with the memory region, as
        would be the case with BSS (uninitialized data).

        The pathname field will usually be the file that is backing the
        mapping. For ELF files, you can easily coordinate with the offset
        field by looking at the Offset field in the ELF program headers
        (readelf -l).

        There are additional helpful pseudo-paths:

        [stack]
            The initial process's (also known as the main thread's) stack.

        [stack:<tid>] (since Linux 3.4)
            A thread's stack (where the <tid> is a thread ID). It
            corresponds to the /proc/[pid]/task/[tid]/ path.

        [vdso]
            The virtual dynamically linked shared object.

        [heap]
            The process's heap.

        If the pathname field is blank, this is an anonymous mapping as
        obtained via the mmap(2) function. There is no easy way to
        coordinate this back to a process's source, short of running it
        through gdb(1), strace(1), or similar.

        Under Linux 2.0 there is no field giving pathname.

The mem file gives access to all of the process's memory pages and lets you modify them; maps lists the process's currently mapped regions together with their address ranges, permissions, offsets and so on. maps shows that the heap lies at low addresses and the stack at high addresses. It also shows that [heap] has rw permission, i.e. it is writable. So we can use the heap address range to locate the string from the previous example program and change it by writing to the corresponding address in the mem file. The program:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /**
     * main - uses strdup to create a new string, loops forever-ever
     *
     * Return: EXIT_FAILURE if malloc failed. Other never returns
     */
    int main(void)
    {
        char *s;
        unsigned long int i;

        s = strdup("test_memory");
        if (s == NULL)
        {
            fprintf(stderr, "Can't allocate mem with malloc\n");
            return (EXIT_FAILURE);
        }
        i = 0;
        while (s)
        {
            printf("[%lu] %s (%p)\n", i, s, (void *)s);
            sleep(1);
            i++;
        }
        return (EXIT_SUCCESS);
    }

Compile and run: gcc -Wall -Wextra -pedantic -Werror main.c -o loop; ./loop
Output:

    [0] test_memory (0x21dc010)
    [1] test_memory (0x21dc010)
    [2] test_memory (0x21dc010)
    [3] test_memory (0x21dc010)
    [4] test_memory (0x21dc010)
    [5] test_memory (0x21dc010)
    [6] test_memory (0x21dc010)
    ...

We can now write a script that uses the /proc filesystem to find where the string lives and overwrite it; the program's output will change accordingly.
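Before writing to a process's memory, it helps to see how little code is needed to read the maps format described in the man page excerpt. The following is a minimal sketch, not the article's script: the names Mapping, parse_maps_line and find_mapping are illustrative assumptions, the sample lines are taken from the man page excerpt, and the looked-up address 0xe03010 is a hypothetical value chosen to fall inside that sample's first [heap] range.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    start: int      # first address of the region
    end: int        # first address past the region
    perms: str      # e.g. "rw-p"
    pathname: str   # "" for anonymous mappings


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/[pid]/maps."""
    # Columns: address perms offset dev inode [pathname]
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    pathname = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_s, 16), int(end_s, 16), fields[1], pathname)


def find_mapping(maps_text: str, addr: int) -> Optional[Mapping]:
    """Return the mapping that contains addr, or None."""
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        mapping = parse_maps_line(line)
        if mapping.start <= addr < mapping.end:
            return mapping
    return None


SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/dbus-daemon
00e03000-00e24000 rw-p 00000000 00:00 0           [heap]
35b1a21000-35b1a22000 rw-p 00000000 00:00 0
7fffb2c0d000-7fffb2c2e000 rw-p 00000000 00:00 0   [stack]
"""

if __name__ == "__main__":
    hit = find_mapping(SAMPLE, 0xE03010)
    print(hit.pathname, hit.perms, hex(0xE03010 - hit.start))
```

On a live system the same functions can be fed the text of /proc/[pid]/maps for a process you are allowed to inspect.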
First, find the process ID.
2542 is the PID of the loop program. cat /proc/2542/maps gives:

    00400000-00401000 r-xp 00000000 08:01 811716                             /home/zjucad/wangzhiqiang/loop
    00600000-00601000 r--p 00000000 08:01 811716                             /home/zjucad/wangzhiqiang/loop
    00601000-00602000 rw-p 00001000 08:01 811716                             /home/zjucad/wangzhiqiang/loop
    021dc000-021fd000 rw-p 00000000 00:00 0                                  [heap]
    7f2adae2a000-7f2adafea000 r-xp 00000000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7f2adafea000-7f2adb1ea000 ---p 001c0000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7f2adb1ea000-7f2adb1ee000 r--p 001c0000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7f2adb1ee000-7f2adb1f0000 rw-p 001c4000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7f2adb1f0000-7f2adb1f4000 rw-p 00000000 00:00 0
    7f2adb1f4000-7f2adb21a000 r-xp 00000000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7f2adb3fa000-7f2adb3fd000 rw-p 00000000 00:00 0
    7f2adb419000-7f2adb41a000 r--p 00025000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7f2adb41a000-7f2adb41b000 rw-p 00026000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7f2adb41b000-7f2adb41c000 rw-p 00000000 00:00 0
    7ffd51bb3000-7ffd51bd4000 rw-p 00000000 00:00 0                          [stack]
    7ffd51bdd000-7ffd51be0000 r--p 00000000 00:00 0                          [vvar]
    7ffd51be0000-7ffd51be2000 r-xp 00000000 00:00 0                          [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

The heap spans 021dc000-021fd000 and is readable and writable, and 0x21dc000 < 0x21dc010 < 0x21fd000. This confirms that the string lives in the heap, at offset 0x10 from its start (why 0x10 is explained later). We can therefore seek to address 0x21dc010 in the mem file and overwrite the bytes there, and the string the program prints will change. The Python script below does this.
    #!/usr/bin/env python3
    '''
    Locates and replaces the first occurrence of a string in the heap
    of a process

    Usage: ./read_write_heap.py PID search_string replace_by_string
    Where:
    - PID is the pid of the target process
    - search_string is the ASCII string you are looking to overwrite
    - replace_by_string is the ASCII string you want to replace
      search_string with
    '''

    import sys


    def print_usage_and_exit():
        print('Usage: {} pid search write'.format(sys.argv[0]))
        sys.exit(1)


    # check usage
    if len(sys.argv) != 4:
        print_usage_and_exit()

    # get the pid and the two strings from args
    pid = int(sys.argv[1])
    if pid <= 0:
        print_usage_and_exit()
    search_string = str(sys.argv[2])
    if search_string == "":
        print_usage_and_exit()
    write_string = str(sys.argv[3])
    if write_string == "":
        print_usage_and_exit()

    # build the names of the maps and mem files of the process
    maps_filename = "/proc/{}/maps".format(pid)
    print("[*] maps: {}".format(maps_filename))
    mem_filename = "/proc/{}/mem".format(pid)
    print("[*] mem: {}".format(mem_filename))

    # try opening the maps file
    try:
        maps_file = open(maps_filename, 'r')
    except IOError as e:
        print("[ERROR] Can not open file {}:".format(maps_filename))
        print("        I/O error({}): {}".format(e.errno, e.strerror))
        sys.exit(1)

    for line in maps_file:
        sline = line.split(' ')
        # check if we found the heap
        if sline[-1][:-1] != "[heap]":
            continue
        print("[*] Found [heap]:")

        # parse line
        addr = sline[0]
        perm = sline[1]
        offset = sline[2]
        device = sline[3]
        inode = sline[4]
        pathname = sline[-1][:-1]
        print("\tpathname = {}".format(pathname))
        print("\taddresses = {}".format(addr))
        print("\tpermisions = {}".format(perm))
        print("\toffset = {}".format(offset))
        print("\tinode = {}".format(inode))

        # check if there is read and write permission
        if perm[0] != 'r' or perm[1] != 'w':
            print("[*] {} does not have read/write permission".format(pathname))
            maps_file.close()
            sys.exit(0)

        # get start and end of the heap in the virtual memory
        addr = addr.split("-")
        if len(addr) != 2:  # never trust anyone, not even your OS :)
            print("[*] Wrong addr format")
            maps_file.close()
            sys.exit(1)
        addr_start = int(addr[0], 16)
        addr_end = int(addr[1], 16)
        print("\tAddr start [{:x}] | end [{:x}]".format(addr_start, addr_end))

        # open and read mem
        try:
            mem_file = open(mem_filename, 'rb+')
        except IOError as e:
            print("[ERROR] Can not open file {}:".format(mem_filename))
            print("        I/O error({}): {}".format(e.errno, e.strerror))
            maps_file.close()
            sys.exit(1)

        # read heap
        mem_file.seek(addr_start)
        heap = mem_file.read(addr_end - addr_start)

        # find string
        try:
            i = heap.index(bytes(search_string, "ASCII"))
        except Exception:
            print("Can't find '{}'".format(search_string))
            maps_file.close()
            mem_file.close()
            sys.exit(0)
        print("[*] Found '{}' at {:x}".format(search_string, i))

        # write the new string
        print("[*] Writing '{}' at {:x}".format(write_string, addr_start + i))
        mem_file.seek(addr_start + i)
        mem_file.write(bytes(write_string, "ASCII"))

        # close files
        maps_file.close()
        mem_file.close()

        # there is only one heap in our example
        break

Run the Python script:
    zjucad@zjucad-ONDA-H110-MINI-V3-01:~/wangzhiqiang$ sudo ./loop.py 2542 test_memory test_hello
    [*] maps: /proc/2542/maps
    [*] mem: /proc/2542/mem
    [*] Found [heap]:
            pathname = [heap]
            addresses = 021dc000-021fd000
            permisions = rw-p
            offset = 00000000
            inode = 0
            Addr start [21dc000] | end [21fd000]
    [*] Found 'test_memory' at 10
    [*] Writing 'test_hello' at 21dc010

At the same moment, the string printed by loop changes:

    [633] test_memory (0x21dc010)
    [634] test_memory (0x21dc010)
    [635] test_memory (0x21dc010)
    [636] test_memory (0x21dc010)
    [637] test_memory (0x21dc010)
    [638] test_memory (0x21dc010)
    [639] test_memory (0x21dc010)
    [640] test_helloy (0x21dc010)
    [641] test_helloy (0x21dc010)
    [642] test_helloy (0x21dc010)
    [643] test_helloy (0x21dc010)
    [644] test_helloy (0x21dc010)
    [645] test_helloy (0x21dc010)

The experiment worked. The output reads test_helloy rather than test_hello because the script overwrote only the first 10 bytes of the 11-character string test_memory, so the final y and the terminating '\0' were left in place.
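The trailing y in the new output is easy to reproduce without touching another process. The sketch below simulates the heap bytes with a bytearray; the helper names overwrite and c_string_at are illustrative assumptions, and the buffer only models the 12 bytes that strdup produced for "test_memory".

```python
def overwrite(buffer: bytearray, offset: int, new: bytes) -> None:
    """Overwrite len(new) bytes in place, like seek() + write() on mem."""
    buffer[offset:offset + len(new)] = new


def c_string_at(buffer: bytearray, offset: int) -> bytes:
    """Read bytes up to (not including) the first NUL, as printf("%s") does."""
    end = buffer.index(0, offset)
    return bytes(buffer[offset:end])


if __name__ == "__main__":
    heap = bytearray(b"test_memory\x00")   # 11 characters + terminating NUL
    overwrite(heap, 0, b"test_hello")      # only 10 bytes are replaced
    print(c_string_at(heap, 0))            # the old 'y' is still there

    heap = bytearray(b"test_memory\x00")
    overwrite(heap, 0, b"test_hello\x00")  # writing the NUL as well ends the string
    print(c_string_at(heap, 0))
```

Writing the replacement together with its terminating NUL byte is one way to get a clean test_hello, provided the new string is not longer than the old one.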
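Drawing the virtual memory layout from experiments
Here is the memory layout diagram once more.
Almost everyone knows roughly how the virtual address space is laid out, but how can we verify it? The experiments below do exactly that.
Heap and stack
First we verify where the stack is. In C, local variables live on the stack and memory returned by malloc lives on the heap, so printing the address of a local variable and the pointer returned by malloc shows where the stack and the heap sit in the virtual address space.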
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    /**
     * main - print locations of various elements
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        int a;
        void *p;

        printf("Address of a: %p\n", (void *)&a);
        p = malloc(98);
        if (p == NULL)
        {
            fprintf(stderr, "Can't malloc\n");
            return (EXIT_FAILURE);
        }
        printf("Allocated space in the heap: %p\n", p);
        return (EXIT_SUCCESS);
    }

Compile and run: gcc -Wall -Wextra -pedantic -Werror main.c -o test; ./test
Output:

    Address of a: 0x7ffedde9c7fc
    Allocated space in the heap: 0x55ca5b360670

The result shows that the heap lies below the stack in the address space. As a figure:
The executable
The executable is also mapped into virtual memory. Printing the address of the main function and comparing it with the heap and stack addresses shows where the executable sits relative to them.

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    /**
     * main - print locations of various elements
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        int a;
        void *p;

        printf("Address of a: %p\n", (void *)&a);
        p = malloc(98);
        if (p == NULL)
        {
            fprintf(stderr, "Can't malloc\n");
            return (EXIT_FAILURE);
        }
        printf("Allocated space in the heap: %p\n", p);
        printf("Address of function main: %p\n", (void *)main);
        return (EXIT_SUCCESS);
    }

Compile and run: gcc main.c -o test; ./test
Output:

    Address of a: 0x7ffed846de2c
    Allocated space in the heap: 0x561b9ee8c670
    Address of function main: 0x561b9deb378a

Since main (0x561b9deb378a) < heap (0x561b9ee8c670) < stack (0x7ffed846de2c), the layout can be drawn as follows:
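[Figure: virtual_memory_stack_heap_executable.png]
Command-line arguments and environment variables
The program's entry point, main, can take parameters:
- First parameter (argc): the number of command-line arguments
- Second parameter (argv): a pointer to the array of command-line arguments
- Third parameter (env): a pointer to the array of environment variables
The following program shows where these elements sit in virtual memory: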
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    /**
     * main - print locations of various elements
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(int ac, char **av, char **env)
    {
        int a;
        void *p;
        int i;

        printf("Address of a: %p\n", (void *)&a);
        p = malloc(98);
        if (p == NULL)
        {
            fprintf(stderr, "Can't malloc\n");
            return (EXIT_FAILURE);
        }
        printf("Allocated space in the heap: %p\n", p);
        printf("Address of function main: %p\n", (void *)main);
        printf("First bytes of the main function:\n\t");
        for (i = 0; i < 15; i++)
        {
            printf("%02x ", ((unsigned char *)main)[i]);
        }
        printf("\n");
        printf("Address of the array of arguments: %p\n", (void *)av);
        printf("Addresses of the arguments:\n\t");
        for (i = 0; i < ac; i++)
        {
            printf("[%s]:%p ", av[i], (void *)av[i]);
        }
        printf("\n");
        printf("Address of the array of environment variables: %p\n", (void *)env);
        printf("Address of the first environment variable: %p\n", (void *)(env[0]));
        return (EXIT_SUCCESS);
    }

Compile and run: gcc main.c -o test; ./test nihao hello
Output:

    Address of a: 0x7ffcc154a748
    Allocated space in the heap: 0x559bd1bee670
    Address of function main: 0x559bd09807ca
    First bytes of the main function:
            55 48 89 e5 48 83 ec 40 89 7d dc 48 89 75 d0
    Address of the array of arguments: 0x7ffcc154a848
    Addresses of the arguments:
            [./test]:0x7ffcc154b94f [nihao]:0x7ffcc154b956 [hello]:0x7ffcc154b95c
    Address of the array of environment variables: 0x7ffcc154a868
    Address of the first environment variable: 0x7ffcc154b962

Putting the addresses in order:
main (0x559bd09807ca) < heap (0x559bd1bee670) < stack (0x7ffcc154a748) < argv (0x7ffcc154a848) < env (0x7ffcc154a868) < arguments (0x7ffcc154b94f -> 0x7ffcc154b95c + 6, where 6 is the length of "hello" plus 1 for the '\0') < first environment variable (0x7ffcc154b962)
All the command-line argument strings are adjacent to one another, and the environment variables follow immediately after them.
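The adjacency claim can be checked with a few lines of arithmetic on the addresses printed above: each string occupies its length plus one byte for the terminating '\0', so the next string should start exactly there. This is a small sketch in Python; the function names are illustrative assumptions, and the addresses are copied from the run above.

```python
def next_string_address(addr: int, text: str) -> int:
    """Address right after a NUL-terminated C string stored at addr."""
    return addr + len(text.encode("ascii")) + 1


def are_contiguous(strings, following_addr: int) -> bool:
    """True if each string ends exactly where the next one begins."""
    starts = [addr for _, addr in strings] + [following_addr]
    return all(
        next_string_address(addr, text) == starts[i + 1]
        for i, (text, addr) in enumerate(strings)
    )


# Values printed by ./test nihao hello in the run above.
ARGS = [
    ("./test", 0x7FFCC154B94F),
    ("nihao", 0x7FFCC154B956),
    ("hello", 0x7FFCC154B95C),
]
FIRST_ENV = 0x7FFCC154B962

if __name__ == "__main__":
    print(are_contiguous(ARGS, FIRST_ENV))
```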
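Are the argv and env arrays adjacent?
In the example above argv has 4 elements: the three command-line arguments plus a NULL pointer that marks the end of the array. Each pointer is 8 bytes, so the array occupies 8 * 4 = 32 bytes, and argv (0x7ffcc154a848) + 32 (0x20) = env (0x7ffcc154a868). The argv and env pointer arrays are therefore adjacent.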
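Do the command-line argument strings directly follow the env array?
First we need the size of the env array. The array is terminated by NULL, so we can walk it until we reach NULL to count its entries. The code: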
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    /**
     * main - print locations of various elements
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(int ac, char **av, char **env)
    {
        int a;
        void *p;
        int i;
        int size;

        printf("Address of a: %p\n", (void *)&a);
        p = malloc(98);
        if (p == NULL)
        {
            fprintf(stderr, "Can't malloc\n");
            return (EXIT_FAILURE);
        }
        printf("Allocated space in the heap: %p\n", p);
        printf("Address of function main: %p\n", (void *)main);
        printf("First bytes of the main function:\n\t");
        for (i = 0; i < 15; i++)
        {
            printf("%02x ", ((unsigned char *)main)[i]);
        }
        printf("\n");
        printf("Address of the array of arguments: %p\n", (void *)av);
        printf("Addresses of the arguments:\n\t");
        for (i = 0; i < ac; i++)
        {
            printf("[%s]:%p ", av[i], (void *)av[i]);
        }
        printf("\n");
        printf("Address of the array of environment variables: %p\n", (void *)env);
        printf("Address of the first environment variables:\n");
        for (i = 0; i < 3; i++)
        {
            printf("\t[%p]:\"%s\"\n", (void *)env[i], env[i]);
        }
        /* size of the env array */
        i = 0;
        while (env[i] != NULL)
        {
            i++;
        }
        i++; /* the NULL pointer */
        size = i * sizeof(char *);
        printf("Size of the array env: %d elements -> %d bytes (0x%x)\n", i, size, size);
        return (EXIT_SUCCESS);
    }

Compile and run: gcc main.c -o test; ./test nihao hello
Output:

    Address of a: 0x7ffd5ebadff4
    Allocated space in the heap: 0x562ba4e13670
    Address of function main: 0x562ba2f1881a
    First bytes of the main function:
            55 48 89 e5 48 83 ec 40 89 7d dc 48 89 75 d0
    Address of the array of arguments: 0x7ffd5ebae0f8
    Addresses of the arguments:
            [./test]:0x7ffd5ebae94f [nihao]:0x7ffd5ebae956 [hello]:0x7ffd5ebae95c
    Address of the array of environment variables: 0x7ffd5ebae118
    Address of the first environment variables:
            [0x7ffd5ebae962]:"LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:"
            [0x7ffd5ebaef4e]:"HOSTNAME=3e8650948c0c"
            [0x7ffd5ebaef64]:"OLDPWD=/"
    Size of the array env: 11 elements -> 88 bytes (0x58)

The end of the env array can be computed with bc:

    root@3e8650948c0c:/ubuntu# bc
    bc 1.07.1
    Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008, 2012-2017 Free Software Foundation, Inc.
    This is free software with ABSOLUTELY NO WARRANTY.
    For details type `warranty'.
    obase=16
    ibase=16
    58+7ffd5ebae118
    (standard_in) 3: syntax error
    58+7FFD5EBAE118
    7FFD5EBAE170
    quit

The env array ends at 0x7FFD5EBAE170, which is not equal to 0x7ffd5ebae94f, the address of the first argument string. So the command-line argument strings do not start immediately after the env array.
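The same pointer arithmetic is short enough to script instead of typing it into bc. The sketch below, in Python, recomputes the end of both pointer arrays for this run; POINTER_SIZE = 8 assumes a 64-bit process as in the article, array_end is an illustrative name, and the addresses are copied from the output above.

```python
POINTER_SIZE = 8  # bytes per pointer, assuming a 64-bit process


def array_end(base: int, entries_including_null: int) -> int:
    """First address after a NULL-terminated array of pointers."""
    return base + entries_including_null * POINTER_SIZE


# Addresses printed by the run above.
ARGV = 0x7FFD5EBAE0F8
ENV = 0x7FFD5EBAE118
FIRST_ARG_STRING = 0x7FFD5EBAE94F

if __name__ == "__main__":
    argv_end = array_end(ARGV, 3 + 1)   # 3 arguments + NULL
    env_end = array_end(ENV, 11)        # 11 entries, NULL included
    print("argv ends at {:#x}".format(argv_end))
    print("env ends at  {:#x}".format(env_end))
    print("gap before the first argument string: {} bytes".format(
        FIRST_ARG_STRING - env_end))
```

In this run the argv array again ends exactly where env begins, while 2015 bytes separate the end of env from the first argument string.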
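The diagram so far:
Does the stack really grow downward?
We can check by calling a function. If the stack really grows downward, the local variables of the calling function should sit at higher addresses than the local variables of the function it calls. The code: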
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    void f(void)
    {
        int a;
        int b;
        int c;

        a = 98;
        b = 1024;
        c = a * b;
        printf("[f] a = %d, b = %d, c = a * b = %d\n", a, b, c);
        printf("[f] Adresses of a: %p, b = %p, c = %p\n", (void *)&a, (void *)&b, (void *)&c);
    }

    int main(void)
    {
        int a;
        void *p;

        printf("Address of a: %p\n", (void *)&a);
        p = malloc(98);
        if (p == NULL)
        {
            fprintf(stderr, "Can't malloc\n");
            return (EXIT_FAILURE);
        }
        printf("Allocated space in the heap: %p\n", p);
        printf("Address of function main: %p\n", (void *)main);
        f();
        return (EXIT_SUCCESS);
    }

Compile and run: gcc main.c -o test; ./test
Output:

    Address of a: 0x7ffefc75083c
    Allocated space in the heap: 0x564d46318670
    Address of function main: 0x564d45b9880e
    [f] a = 98, b = 1024, c = a * b = 100352
    [f] Adresses of a: 0x7ffefc7507ec, b = 0x7ffefc7507f0, c = 0x7ffefc7507f4

The result: f{a} 0x7ffefc7507ec < main{a} 0x7ffefc75083c, so the callee's locals are below the caller's.
Which gives the following diagram:
The same layout can also be checked with a small program that reads the maps file in the /proc filesystem.
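As one possible version of such a program, here is a minimal sketch in Python that lists the named regions of a process sorted by start address. The function name named_regions is an illustrative assumption; when run on Linux it reads /proc/self/maps (the layout of the Python interpreter itself, not of the C programs above), and otherwise it falls back to a few lines copied from the loop process's maps shown earlier.

```python
import os


def named_regions(maps_text):
    """Return [(name, (start, end)), ...] sorted by start address.

    Mappings that share a pathname are merged into one range; anonymous
    mappings (no pathname column) are skipped.
    """
    regions = {}
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        name = fields[5].strip()
        if name in regions:
            old_start, old_end = regions[name]
            start, end = min(start, old_start), max(end, old_end)
        regions[name] = (start, end)
    return sorted(regions.items(), key=lambda item: item[1][0])


SAMPLE = """\
00400000-00401000 r-xp 00000000 08:01 811716    /home/zjucad/wangzhiqiang/loop
00601000-00602000 rw-p 00001000 08:01 811716    /home/zjucad/wangzhiqiang/loop
021dc000-021fd000 rw-p 00000000 00:00 0    [heap]
7f2adae2a000-7f2adafea000 r-xp 00000000 08:01 8661324    /lib/x86_64-linux-gnu/libc-2.23.so
7f2adb1f0000-7f2adb1f4000 rw-p 00000000 00:00 0
7ffd51bb3000-7ffd51bd4000 rw-p 00000000 00:00 0    [stack]
"""

if __name__ == "__main__":
    path = "/proc/self/maps"
    if os.path.exists(path):
        with open(path) as maps_file:
            text = maps_file.read()
    else:
        text = SAMPLE
    for name, (start, end) in named_regions(text):
        print("{:016x}-{:016x} {}".format(start, end, name))
```

Read from bottom to top, the output mirrors the diagrams in this section: executable first, then the heap, shared libraries, and the stack near the top.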
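Heap memory (malloc)
malloc
malloc is the usual function for dynamic memory allocation, and the memory it hands out comes from the heap. Note that malloc is a glibc function, not a system call.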
man malloc:
No malloc call, no [heap]
Consider a program that does not call malloc:
    #include <stdlib.h>
    #include <stdio.h>

    /**
     * main - do nothing
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        getchar();
        return (EXIT_SUCCESS);
    }

Compile and run: gcc test.c -o 2; ./2
Step 1: ps aux | grep \ \./2$
Output:

    zjucad    3023  0.0  0.0   4352   788 pts/3    S+   13:58   0:00 ./2

Step 2: cat /proc/3023/maps
Output:

    00400000-00401000 r-xp 00000000 08:01 811723                             /home/zjucad/wangzhiqiang/2
    00600000-00601000 r--p 00000000 08:01 811723                             /home/zjucad/wangzhiqiang/2
    00601000-00602000 rw-p 00001000 08:01 811723                             /home/zjucad/wangzhiqiang/2
    007a4000-007c5000 rw-p 00000000 00:00 0                                  [heap]
    7f954ca02000-7f954cbc2000 r-xp 00000000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7f954cbc2000-7f954cdc2000 ---p 001c0000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7f954cdc2000-7f954cdc6000 r--p 001c0000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7f954cdc6000-7f954cdc8000 rw-p 001c4000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7f954cdc8000-7f954cdcc000 rw-p 00000000 00:00 0
    7f954cdcc000-7f954cdf2000 r-xp 00000000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7f954cfd2000-7f954cfd5000 rw-p 00000000 00:00 0
    7f954cff1000-7f954cff2000 r--p 00025000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7f954cff2000-7f954cff3000 rw-p 00026000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7f954cff3000-7f954cff4000 rw-p 00000000 00:00 0
    7ffed68a1000-7ffed68c2000 rw-p 00000000 00:00 0                          [stack]
    7ffed690e000-7ffed6911000 r--p 00000000 00:00 0                          [vvar]
    7ffed6911000-7ffed6913000 r-xp 00000000 00:00 0                          [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

The point of this experiment is that a process in which nothing calls malloc has no [heap] region in maps. Note, however, that the maps output captured above does list a [heap] region (007a4000-007c5000). The most likely explanation is that with this glibc version getchar() allocates the stdin buffer through malloc internally, so a heap is created even though the program itself never calls malloc.
Now run a program that does call malloc:
    #include <stdio.h>
    #include <stdlib.h>

    /**
     * main - prints the malloc returned address
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        void *p;

        p = malloc(1);
        printf("%p\n", p);
        getchar();
        return (EXIT_SUCCESS);
    }

Compile and run: gcc test.c -o 3; ./3
Output: 0xcc7010
Verification steps and output:

    zjucad@zjucad-ONDA-H110-MINI-V3-01:~/wangzhiqiang$ ps aux | grep \ \./3$
    zjucad    3113  0.0  0.0   4352   644 pts/3    S+   14:06   0:00 ./3
    zjucad@zjucad-ONDA-H110-MINI-V3-01:~/wangzhiqiang$ cat /proc/3113/maps
    00400000-00401000 r-xp 00000000 08:01 811726                             /home/zjucad/wangzhiqiang/3
    00600000-00601000 r--p 00000000 08:01 811726                             /home/zjucad/wangzhiqiang/3
    00601000-00602000 rw-p 00001000 08:01 811726                             /home/zjucad/wangzhiqiang/3
    00cc7000-00ce8000 rw-p 00000000 00:00 0                                  [heap]
    7fc7e9128000-7fc7e92e8000 r-xp 00000000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7fc7e92e8000-7fc7e94e8000 ---p 001c0000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7fc7e94e8000-7fc7e94ec000 r--p 001c0000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7fc7e94ec000-7fc7e94ee000 rw-p 001c4000 08:01 8661324                    /lib/x86_64-linux-gnu/libc-2.23.so
    7fc7e94ee000-7fc7e94f2000 rw-p 00000000 00:00 0
    7fc7e94f2000-7fc7e9518000 r-xp 00000000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7fc7e96f8000-7fc7e96fb000 rw-p 00000000 00:00 0
    7fc7e9717000-7fc7e9718000 r--p 00025000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7fc7e9718000-7fc7e9719000 rw-p 00026000 08:01 8661310                    /lib/x86_64-linux-gnu/ld-2.23.so
    7fc7e9719000-7fc7e971a000 rw-p 00000000 00:00 0
    7ffc91c18000-7ffc91c39000 rw-p 00000000 00:00 0                          [stack]
    7ffc91d5f000-7ffc91d62000 r--p 00000000 00:00 0                          [vvar]
    7ffc91d62000-7ffc91d64000 r-xp 00000000 00:00 0                          [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

The program calls malloc, so maps contains a [heap] region, and the address returned by malloc (0xcc7010) lies inside it. But the returned address is not the very start of the heap (0xcc7000); it is 0x10 bytes further in. Why? Read on.
strace, brk, sbrk
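malloc is not a system call; it is an ordinary function, so it has to invoke some system call to manipulate heap memory. The strace tool traces the system calls and signals of a process. To identify which system calls come from malloc, we put a write system call before and after the malloc call as markers.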
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /**
     * main - let's find out which syscall malloc is using
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        void *p;

        write(1, "BEFORE MALLOC\n", 14);
        p = malloc(1);
        write(1, "AFTER MALLOC\n", 13);
        printf("%p\n", p);
        getchar();
        return (EXIT_SUCCESS);
    }

Compile: gcc test.c -o 4

    zjucad@zjucad-ONDA-H110-MINI-V3-01:~/wangzhiqiang$ strace ./4
    execve("./4", ["./4"], [/* 34 vars */]) = 0
    brk(NULL)                               = 0x781000
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=111450, ...}) = 0
    mmap(NULL, 111450, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f37720fa000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\t\2\0\0\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0755, st_size=1868984, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f37720f9000
    mmap(NULL, 3971488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3771b27000
    mprotect(0x7f3771ce7000, 2097152, PROT_NONE) = 0
    mmap(0x7f3771ee7000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c0000) = 0x7f3771ee7000
    mmap(0x7f3771eed000, 14752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3771eed000
    close(3)                                = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f37720f8000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f37720f7000
    arch_prctl(ARCH_SET_FS, 0x7f37720f8700) = 0
    mprotect(0x7f3771ee7000, 16384, PROT_READ) = 0
    mprotect(0x600000, 4096, PROT_READ)     = 0
    mprotect(0x7f3772116000, 4096, PROT_READ) = 0
    munmap(0x7f37720fa000, 111450)          = 0
    write(1, "BEFORE MALLOC\n", 14BEFORE MALLOC
    )         = 14
    brk(NULL)                               = 0x781000
    brk(0x7a2000)                           = 0x7a2000
    write(1, "AFTER MALLOC\n", 13AFTER MALLOC
    )          = 13
    fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
    write(1, "0x781010\n", 90x781010
    )               = 9
    fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0

The last few lines show that the only system calls between the two markers are brk calls: malloc relies mainly on the brk system call to manipulate heap memory.
    man brk
    ...
    int brk(void *addr);
    void *sbrk(intptr_t increment);
    ...
    DESCRIPTION
        brk() and sbrk() change the location of the program break, which
        defines the end of the process's data segment (i.e., the program
        break is the first location after the end of the uninitialized
        data segment). Increasing the program break has the effect of
        allocating memory to the process; decreasing the break
        deallocates memory.

        brk() sets the end of the data segment to the value specified by
        addr, when that value is reasonable, the system has enough memory,
        and the process does not exceed its maximum data size (see
        setrlimit(2)).

        sbrk() increments the program's data space by increment bytes.
        Calling sbrk() with an increment of 0 can be used to find the
        current location of the program break.

The program break is the address of the first location after the end of the program's data segment in virtual memory. By calling brk or sbrk to raise the program break, malloc creates new space from which it can allocate memory dynamically. In the trace, the first call, brk(NULL), returns the current program break (0x781000). The second call, brk(0x7a2000), moves the break and returns its new value, which is higher than the first. That is how brk allocates memory: by raising the program break. The heap now spans 0x781000-0x7a2000, which can be confirmed with cat /proc/[pid]/maps; the output is omitted here.
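It is worth working out how much memory that single brk call added, given that the program asked malloc for one byte. The sketch below does the arithmetic in Python on the two values from the strace output; heap_growth is an illustrative name and PAGE_SIZE = 4096 is an assumption about the machine in the trace.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages on the traced machine


def heap_growth(old_break: int, new_break: int) -> int:
    """Bytes added to the heap when the program break is raised."""
    return new_break - old_break


if __name__ == "__main__":
    grown = heap_growth(0x781000, 0x7A2000)
    print("heap grew by {:#x} bytes = {} KiB = {} pages".format(
        grown, grown // 1024, grown // PAGE_SIZE))
```

For a 1-byte request the break moved by 0x21000 bytes (132 KiB), which suggests that malloc asks the kernel for memory in large steps and then hands it out piecemeal.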
Calling malloc several times
What happens if we call malloc several times? The code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /**
     * main - many calls to malloc
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        void *p;

        write(1, "BEFORE MALLOC #0\n", 17);
        p = malloc(1024);
        write(1, "AFTER MALLOC #0\n", 16);
        printf("%p\n", p);

        write(1, "BEFORE MALLOC #1\n", 17);
        p = malloc(1024);
        write(1, "AFTER MALLOC #1\n", 16);
        printf("%p\n", p);

        write(1, "BEFORE MALLOC #2\n", 17);
        p = malloc(1024);
        write(1, "AFTER MALLOC #2\n", 16);
        printf("%p\n", p);

        write(1, "BEFORE MALLOC #3\n", 17);
        p = malloc(1024);
        write(1, "AFTER MALLOC #3\n", 16);
        printf("%p\n", p);

        getchar();
        return (EXIT_SUCCESS);
    }

Compile and run: gcc test.c -o 5; strace ./5
The relevant part of the output:

    write(1, "BEFORE MALLOC #0\n", 17BEFORE MALLOC #0
    )      = 17
    brk(NULL)                               = 0x561605c7a000
    brk(0x561605c9b000)                     = 0x561605c9b000
    write(1, "AFTER MALLOC #0\n", 16AFTER MALLOC #0
    )       = 16
    fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
    write(1, "0x561605c7a260\n", 150x561605c7a260
    )        = 15
    write(1, "BEFORE MALLOC #1\n", 17BEFORE MALLOC #1
    )      = 17
    write(1, "AFTER MALLOC #1\n", 16AFTER MALLOC #1
    )       = 16
    write(1, "0x561605c7aa80\n", 150x561605c7aa80
    )        = 15
    write(1, "BEFORE MALLOC #2\n", 17BEFORE MALLOC #2
    )      = 17
    write(1, "AFTER MALLOC #2\n", 16AFTER MALLOC #2
    )       = 16
    write(1, "0x561605c7ae90\n", 150x561605c7ae90
    )        = 15
    write(1, "BEFORE MALLOC #3\n", 17BEFORE MALLOC #3
    )      = 17
    write(1, "AFTER MALLOC #3\n", 16AFTER MALLOC #3
    )       = 16
    write(1, "0x561605c7b2a0\n", 150x561605c7b2a0
    )        = 15
    fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0

Not every call to malloc triggers a brk system call. The first call uses brk to move the program break and obtain one large block of memory. Later calls are served from that block first, and brk is called again only when the remaining space is too small to satisfy a request.
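The behaviour seen in the trace can be imitated with a deliberately simplified model. The Python sketch below is a toy bump allocator, not glibc's algorithm: ToyHeap and all of its members are hypothetical names, the 16-byte header and 16-byte alignment are simplifying assumptions, and GROW_STEP is simply the 0x21000 difference between the two brk values in the trace above. It only illustrates why most malloc calls need no system call.

```python
class ToyHeap:
    """Toy model: raise a simulated program break in large steps,
    then carve allocations out of the space already obtained."""

    HEADER = 16          # assumed bookkeeping bytes in front of each block
    ALIGN = 16           # assumed alignment of block sizes
    GROW_STEP = 0x21000  # step observed in the strace output above

    def __init__(self, start):
        self.top = start      # next unused address
        self.brk = start      # simulated program break
        self.brk_calls = 0    # how often the "system call" was needed

    def _grow(self, needed):
        steps = -(-needed // self.GROW_STEP)  # ceiling division
        self.brk += steps * self.GROW_STEP
        self.brk_calls += 1

    def malloc(self, size):
        # Round (size + header) up to a multiple of ALIGN.
        block = (size + self.HEADER + self.ALIGN - 1) // self.ALIGN * self.ALIGN
        if self.top + block > self.brk:
            self._grow(self.top + block - self.brk)
        user_pointer = self.top + self.HEADER
        self.top += block
        return user_pointer


if __name__ == "__main__":
    heap = ToyHeap(0x781000)
    for n in range(4):
        print("malloc #{} -> {:#x}".format(n, heap.malloc(1024)))
    print("simulated brk calls:", heap.brk_calls)
```

In this model four 1024-byte requests cost a single simulated brk call, and a second one happens only once the first 0x21000 bytes are used up.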
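0x10: what are the missing 16 bytes?
The analysis above showed that the address returned by the first malloc call is not the first address of the heap region but lies 0x10 bytes past it. What are these 16 bytes? A program can print the contents of the 16 bytes that precede each pointer returned by malloc.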
Compile and run: gcc test.c -o test; ./test
Output:

    0x5589436ce260
    bytes at 0x5589436ce250: 00 00 00 00 00 00 00 00 11 04 00 00 00 00 00 00
    0x5589436cea80
    bytes at 0x5589436cea70: 00 00 00 00 00 00 00 00 11 08 00 00 00 00 00 00
    0x5589436cf290
    bytes at 0x5589436cf280: 00 00 00 00 00 00 00 00 11 0c 00 00 00 00 00 00
    0x5589436cfea0
    bytes at 0x5589436cfe90: 00 00 00 00 00 00 00 00 11 10 00 00 00 00 00 00
    0x5589436d0eb0
    bytes at 0x5589436d0ea0: 00 00 00 00 00 00 00 00 11 14 00 00 00 00 00 00
    0x5589436d22c0
    bytes at 0x5589436d22b0: 00 00 00 00 00 00 00 00 11 18 00 00 00 00 00 00
    0x5589436d3ad0
    bytes at 0x5589436d3ac0: 00 00 00 00 00 00 00 00 11 1c 00 00 00 00 00 00
    0x5589436d56e0
    bytes at 0x5589436d56d0: 00 00 00 00 00 00 00 00 11 20 00 00 00 00 00 00
    0x5589436d76f0
    bytes at 0x5589436d76e0: 00 00 00 00 00 00 00 00 11 24 00 00 00 00 00 00
    0x5589436d9b00
    bytes at 0x5589436d9af0: 00 00 00 00 00 00 00 00 11 28 00 00 00 00 00 00

A pattern emerges: these 16 bytes act as a header for the block returned by malloc and carry bookkeeping information. So far we can see that the header records the size of the allocated block. The first malloc requested 0x400 (1024) bytes, and the field 11 04 00 00 00 00 00 00 is larger than 0x400. Read as a little-endian number, these 8 bytes (the second half of the header) are 0x0000000000000411 = 0x400 (1024) + 0x10 (the 16-byte header) + 1 (its meaning is explained later). For every call, these 8 bytes hold the number of bytes requested + 16 + 1.
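The little-endian reading and the "requested + 16 + 1" pattern can be checked mechanically against all ten lines of the output. This is a small sketch in Python; decode_size_field is an illustrative name, and the request sizes 1024 * (i + 1) are an assumption that matches the values printed above (the first request is stated to be 1024 bytes).

```python
def decode_size_field(hex_bytes: str) -> int:
    """Interpret 8 space-separated hex bytes as a little-endian integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


# The second 8 bytes of each 16-byte header printed above.
OBSERVED = [
    "11 04 00 00 00 00 00 00",
    "11 08 00 00 00 00 00 00",
    "11 0c 00 00 00 00 00 00",
    "11 10 00 00 00 00 00 00",
    "11 14 00 00 00 00 00 00",
    "11 18 00 00 00 00 00 00",
    "11 1c 00 00 00 00 00 00",
    "11 20 00 00 00 00 00 00",
    "11 24 00 00 00 00 00 00",
    "11 28 00 00 00 00 00 00",
]

if __name__ == "__main__":
    for i, field in enumerate(OBSERVED):
        requested = 1024 * (i + 1)      # assumed request size for call i
        value = decode_size_field(field)
        print("call {}: field = {:#x}, requested + 16 + 1 = {:#x}".format(
            i, value, requested + 16 + 1))
```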
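We can guess that malloc treats these 16 bytes as some internal data structure that holds bookkeeping information, most importantly the number of bytes allocated. Even without knowing the exact structure, we can read these 16 bytes in code to check whether the pattern above holds. Note that the code does not call free.
The result shows that the 16 bytes preceding the address returned by malloc tell us the size of the allocated block, as in the figure:
Note that this is the result when free is never called. malloc uses only 8 of the 16 bytes for the allocated size, so what are the other 8 bytes for? The comment in the malloc source code explains: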
    /*
      malloc_chunk details:

        (The following includes lightly edited explanations by Colin Plumb.)

        Chunks of memory are maintained using a `boundary tag' method as
        described in e.g., Knuth or Standish.  (See the paper by Paul
        Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
        survey of such techniques.)  Sizes of free chunks are stored both
        in the front of each chunk and at the end.  This makes
        consolidating fragmented chunks into bigger chunks very fast.  The
        size fields also hold bits representing whether chunks are free or
        in use.

        An allocated chunk looks like this:


        chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |             Size of previous chunk, if unallocated (P clear)  |
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |             Size of chunk, in bytes                     |A|M|P|
          mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |             User data starts here...                          .
                .                                                               .
                .             (malloc_usable_size() bytes)                      .
                .                                                               |
    nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |             (size of chunk, but used for application data)    |
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                |             Size of next chunk, in bytes                |A|0|1|
                +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Where "chunk" is the front of the chunk for the purpose of most of
        the malloc code, but "mem" is the pointer that is returned to the
        user.  "Nextchunk" is the beginning of the next contiguous chunk.

So the 16 bytes hold two fields. The first 8 bytes hold the size of the previous chunk when that chunk is free (unallocated). The last 8 bytes hold the size of the current chunk, with the flag bits A, M and P from the diagram in the lowest bits. A program that calls free makes this visible:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /**
     * pmem - print mem
     * @p: memory address to start printing from
     * @bytes: number of bytes to print
     *
     * Return: nothing
     */
    void pmem(void *p, unsigned int bytes)
    {
        unsigned char *ptr;
        unsigned int i;

        ptr = (unsigned char *)p;
        for (i = 0; i < bytes; i++)
        {
            if (i != 0)
            {
                printf(" ");
            }
            printf("%02x", *(ptr + i));
        }
        printf("\n");
    }

    /**
     * main - confirm the source code
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        void *p;
        int i;
        size_t size_of_the_chunk;
        size_t size_of_the_previous_chunk;
        void *chunks[10];

        for (i = 0; i < 10; i++)
        {
            p = malloc(1024 * (i + 1));
            /* the chunk header starts 0x10 bytes before the user pointer */
            chunks[i] = (void *)((char *)p - 0x10);
            printf("%p\n", p);
        }
        /* free chunks 3 and 7 through their original user pointers */
        free((char *)(chunks[3]) + 0x10);
        free((char *)(chunks[7]) + 0x10);
        for (i = 0; i < 10; i++)
        {
            p = chunks[i];
            printf("chunks[%d]: ", i);
            pmem(p, 0x10);
            /* naive decoding: always subtract the extra 1 seen earlier */
            size_of_the_chunk = *((size_t *)((char *)p + 8)) - 1;
            size_of_the_previous_chunk = *((size_t *)((char *)p));
            printf("chunks[%d]: %p, size = %zu, prev = %zu\n",
                   i, p, size_of_the_chunk, size_of_the_previous_chunk);
        }
        return (EXIT_SUCCESS);
    }

Compile and run:

    root@3e8650948c0c:/ubuntu# gcc test.c -o test
    root@3e8650948c0c:/ubuntu# ./test
    0x55fbebf20260
    0x55fbebf20a80
    0x55fbebf21290
    0x55fbebf21ea0
    0x55fbebf22eb0
    0x55fbebf242c0
    0x55fbebf25ad0
    0x55fbebf276e0
    0x55fbebf296f0
    0x55fbebf2bb00
    chunks[0]: 00 00 00 00 00 00 00 00 11 04 00 00 00 00 00 00
    chunks[0]: 0x55fbebf20250, size = 1040, prev = 0
    chunks[1]: 00 00 00 00 00 00 00 00 11 08 00 00 00 00 00 00
    chunks[1]: 0x55fbebf20a70, size = 2064, prev = 0
    chunks[2]: 00 00 00 00 00 00 00 00 11 0c 00 00 00 00 00 00
    chunks[2]: 0x55fbebf21280, size = 3088, prev = 0
    chunks[3]: 00 00 00 00 00 00 00 00 11 10 00 00 00 00 00 00
    chunks[3]: 0x55fbebf21e90, size = 4112, prev = 0
    chunks[4]: 10 10 00 00 00 00 00 00 10 14 00 00 00 00 00 00
    chunks[4]: 0x55fbebf22ea0, size = 5135, prev = 4112
    chunks[5]: 00 00 00 00 00 00 00 00 11 18 00 00 00 00 00 00
    chunks[5]: 0x55fbebf242b0, size = 6160, prev = 0
    chunks[6]: 00 00 00 00 00 00 00 00 11 1c 00 00 00 00 00 00
    chunks[6]: 0x55fbebf25ac0, size = 7184, prev = 0
    chunks[7]: 00 00 00 00 00 00 00 00 11 20 00 00 00 00 00 00
    chunks[7]: 0x55fbebf276d0, size = 8208, prev = 0
    chunks[8]: 10 20 00 00 00 00 00 00 10 24 00 00 00 00 00 00
    chunks[8]: 0x55fbebf296e0, size = 9231, prev = 8208
    chunks[9]: 00 00 00 00 00 00 00 00 11 28 00 00 00 00 00 00
    chunks[9]: 0x55fbebf2baf0, size = 10256, prev = 0

The program freed chunks 3 and 7, so the first 8 bytes of chunks 4 and 8 are no longer all zero: unlike the others, they now hold the size of the preceding free chunk. The size fields of chunks 4 and 8 also lack the extra 1 that the other chunks carry, which is why the unconditional "- 1" above prints 5135 and 9231 for them. The conclusion is that malloc uses the lowest bit of the size field as a flag for whether the previous chunk is allocated, and the bit is set when it is. The earlier program can therefore be revised to test that bit explicitly; the revised listing follows.
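Separately from the full listing, the bit test can be looked at in isolation. The following is a minimal sketch, not glibc's own code: the helper names chunk_size, prev_in_use and describe and the FLAG_* macros are made up for illustration. The only thing taken from the malloc comment is that the three lowest bits of the size field are the A, M and P flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Flag bits in the low bits of the size field. The names follow the
 * A|M|P letters of the diagram and are this sketch's own labels. */
#define FLAG_P ((size_t)0x1) /* previous chunk is in use */
#define FLAG_M ((size_t)0x2)
#define FLAG_A ((size_t)0x4)
#define FLAG_MASK (FLAG_P | FLAG_M | FLAG_A)

/* Chunk size with the three flag bits cleared. */
size_t chunk_size(size_t size_field)
{
    return size_field & ~FLAG_MASK;
}

/* 1 if the previous chunk is allocated (P set), 0 otherwise. */
int prev_in_use(size_t size_field)
{
    return (size_field & FLAG_P) != 0;
}

/* Print a decoded size field in the style of the listings. */
void describe(size_t size_field)
{
    printf("size = %zu, prev %s\n", chunk_size(size_field),
           prev_in_use(size_field) ? "allocated" : "unallocated");
}
```

With values taken from the output above, a size field of 0x411 decodes to size 1040 with the previous chunk allocated, and 0x1410 decodes to size 5136 with the previous chunk unallocated. Unlike the listings, which only subtract the P bit, this sketch masks all three flag bits.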
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /**
     * pmem - print mem
     * @p: memory address to start printing from
     * @bytes: number of bytes to print
     *
     * Return: nothing
     */
    void pmem(void *p, unsigned int bytes)
    {
        unsigned char *ptr;
        unsigned int i;

        ptr = (unsigned char *)p;
        for (i = 0; i < bytes; i++)
        {
            if (i != 0)
            {
                printf(" ");
            }
            printf("%02x", *(ptr + i));
        }
        printf("\n");
    }

    /**
     * main - updating with correct checks
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        void *p;
        int i;
        size_t size_of_the_chunk;
        size_t size_of_the_previous_chunk;
        void *chunks[10];
        char prev_used;

        for (i = 0; i < 10; i++)
        {
            p = malloc(1024 * (i + 1));
            chunks[i] = (void *)((char *)p - 0x10);
        }
        free((char *)(chunks[3]) + 0x10);
        free((char *)(chunks[7]) + 0x10);
        for (i = 0; i < 10; i++)
        {
            p = chunks[i];
            printf("chunks[%d]: ", i);
            pmem(p, 0x10);
            size_of_the_chunk = *((size_t *)((char *)p + 8));
            /* lowest bit set: the previous chunk is allocated */
            prev_used = size_of_the_chunk & 1;
            size_of_the_chunk -= prev_used;
            size_of_the_previous_chunk = *((size_t *)((char *)p));
            printf("chunks[%d]: %p, size = %zu, prev (%s) = %zu\n",
                   i, p, size_of_the_chunk,
                   (prev_used ? "allocated" : "unallocated"),
                   size_of_the_previous_chunk);
        }
        return (EXIT_SUCCESS);
    }

Compile and run:

    root@3e8650948c0c:/ubuntu# gcc test.c -o test
    root@3e8650948c0c:/ubuntu# ./test
    chunks[0]: 00 00 00 00 00 00 00 00 11 04 00 00 00 00 00 00
    chunks[0]: 0x56254f888250, size = 1040, prev (allocated) = 0
    chunks[1]: 00 00 00 00 00 00 00 00 11 08 00 00 00 00 00 00
    chunks[1]: 0x56254f888660, size = 2064, prev (allocated) = 0
    chunks[2]: 00 00 00 00 00 00 00 00 11 0c 00 00 00 00 00 00
    chunks[2]: 0x56254f888e70, size = 3088, prev (allocated) = 0
    chunks[3]: 00 00 00 00 00 00 00 00 11 04 00 00 00 00 00 00
    chunks[3]: 0x56254f889a80, size = 1040, prev (allocated) = 0
    chunks[4]: 00 0c 00 00 00 00 00 00 10 14 00 00 00 00 00 00
    chunks[4]: 0x56254f88aa90, size = 5136, prev (unallocated) = 3072
    chunks[5]: 00 00 00 00 00 00 00 00 11 18 00 00 00 00 00 00
    chunks[5]: 0x56254f88bea0, size = 6160, prev (allocated) = 0
    chunks[6]: 00 00 00 00 00 00 00 00 11 1c 00 00 00 00 00 00
    chunks[6]: 0x56254f88d6b0, size = 7184, prev (allocated) = 0
    chunks[7]: 00 00 00 00 00 00 00 00 11 20 00 00 00 00 00 00
    chunks[7]: 0x56254f88f2c0, size = 8208, prev (allocated) = 0
    chunks[8]: 10 20 00 00 00 00 00 00 10 24 00 00 00 00 00 00
    chunks[8]: 0x56254f8912d0, size = 9232, prev (unallocated) = 8208
    chunks[9]: 00 00 00 00 00 00 00 00 11 28 00 00 00 00 00 00
    chunks[9]: 0x56254f8936e0, size = 10256, prev (allocated) = 0

In this run chunks[3] reports 1040 and chunks[4] reports prev = 3072 instead of 4112. A likely explanation is that this version calls printf for the first time only after the two frees, so the stdio output buffer (1024 bytes plus the 16-byte header) was carved out of the freed chunk 3, leaving a 3072-byte free remainder.

Does the heap grow upward?
Verify with code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /**
     * main - moving the program break
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        int i;

        write(1, "START\n", 6);
        malloc(1);
        getchar();
        write(1, "LOOP\n", 5);
        /* allocate 0x25000 bytes in 1024-byte pieces, never freed */
        for (i = 0; i < 0x25000 / 1024; i++)
        {
            malloc(1024);
        }
        write(1, "END\n", 4);
        getchar();
        return (EXIT_SUCCESS);
    }

Compile and run under strace; an excerpt of the output:

    root@3e8650948c0c:/ubuntu# gcc test.c -o test
    root@3e8650948c0c:/ubuntu# strace ./test
    execve("./test", ["./test"], 0x7ffe0d7cbd80 /* 10 vars */) = 0
    brk(NULL)                               = 0x555a2428f000
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=13722, ...}) = 0
    mmap(NULL, 13722, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f6423455000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\34\2\0\0\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0755, st_size=2030544, ...}) = 0
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6423453000
    mmap(NULL, 4131552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f6422e41000
    mprotect(0x7f6423028000, 2097152, PROT_NONE) = 0
    mmap(0x7f6423228000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f6423228000
    mmap(0x7f642322e000, 15072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f642322e000
    close(3)                                = 0
    arch_prctl(ARCH_SET_FS, 0x7f64234544c0) = 0
    mprotect(0x7f6423228000, 16384, PROT_READ) = 0
    mprotect(0x555a22f5f000, 4096, PROT_READ) = 0
    mprotect(0x7f6423459000, 4096, PROT_READ) = 0
    munmap(0x7f6423455000, 13722)           = 0
    write(1, "START\n", 6START
    )                  = 6
    brk(NULL)                               = 0x555a2428f000
    brk(0x555a242b0000)                     = 0x555a242b0000
    fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
    read(0, "\n", 1024)                     = 1
    write(1, "LOOP\n", 5LOOP
    )                   = 5
    brk(0x555a242d1000)                     = 0x555a242d1000
    brk(0x555a242f2000)                     = 0x555a242f2000
    brk(0x555a24313000)                     = 0x555a24313000
    brk(0x555a24334000)                     = 0x555a24334000
    brk(0x555a24355000)                     = 0x555a24355000
    brk(0x555a24376000)                     = 0x555a24376000
    brk(0x555a24397000)                     = 0x555a24397000
    brk(0x555a243b8000)                     = 0x555a243b8000
    brk(0x555a243d9000)                     = 0x555a243d9000
    brk(0x555a243fa000)                     = 0x555a243fa000

Every brk call moves the program break to a higher address than the one before, so the heap does grow upward.
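The same observation can be made from inside a program, without strace, by reading the program break with sbrk(0) before and after a series of allocations. The following is a minimal sketch under assumptions: the function name break_growth and the MAX_BLOCKS limit are invented for illustration, and it assumes a libc whose malloc serves small requests from the brk heap, as the glibc versions used in this article do.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() under strict -std flags */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_BLOCKS 1024 /* illustrative upper bound for this sketch */

/*
 * Allocate n blocks of `size` bytes and return how far the program
 * break moved, in bytes (positive means toward higher addresses).
 * The blocks are freed again after the second measurement.
 */
long break_growth(size_t n, size_t size)
{
    void *blocks[MAX_BLOCKS];
    size_t i;
    char *before, *after;

    assert(n <= MAX_BLOCKS);
    before = (char *)sbrk(0); /* current program break */
    for (i = 0; i < n; i++)
        blocks[i] = malloc(size);
    after = (char *)sbrk(0);  /* break after the allocations */
    for (i = 0; i < n; i++)
        free(blocks[i]);
    return (long)((intptr_t)after - (intptr_t)before);
}
```

Calling break_growth(0x25000 / 1024, 1024) mirrors the loop in the listing above. The result should never be negative; it is 0 when the existing heap already had enough room and positive when malloc had to extend the heap with brk.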
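Address space layout randomization
We have run quite a few processes by now. Looking at the maps file of each one shows that the start of the heap is never directly adjacent to the end of the executable, and that the gap between them is different on every run.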
    [3718]: 01195000 - 00602000 = b93000
    [3834]: 024d6000 - 00602000 = 1ed4000
    [4014]: 00e70000 - 00602000 = 86e000
    [4172]: 01314000 - 00602000 = d12000
    [7972]: 00901000 - 00602000 = 2ff000

The gap is evidently random. The relevant kernel source is in fs/binfmt_elf.c:
    if ((current->flags & PF_RANDOMIZE) && (randomize_va_space > 1)) {
            current->mm->brk = current->mm->start_brk =
                    arch_randomize_brk(current->mm);
    #ifdef compat_brk_randomized
            current->brk_randomized = 1;
    #endif
    }
    // current->mm->brk is the address of the current process's program break

The arch_randomize_brk function is in arch/x86/kernel/process.c:
    unsigned long arch_randomize_brk(struct mm_struct *mm)
    {
            unsigned long range_end = mm->brk + 0x02000000;
            /* GNU "?:" extension: use the random value unless it is 0 */
            return randomize_range(mm->brk, range_end, 0) ? : mm->brk;
    }

The randomize_range function is in drivers/char/random.c:
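    /*
     * randomize_range() returns a start address such that
     *
     *    [...... <range> .....]
     *  start                  end
     *
     * a <range> with size "len" starting at the return value is inside in the
     * area defined by [start, end], but is otherwise randomized.
     */
    unsigned long
    randomize_range(unsigned long start, unsigned long end, unsigned long len)
    {
            unsigned long range = end - len - start;

            if (end <= start + len)
                    return 0;
            return PAGE_ALIGN(get_random_int() % range + start);
    }

So the gap discussed above is simply a random, page-aligned offset between 0 and 0x02000000. This technique is called ASLR (Address Space Layout Randomisation). It is a security technique that randomizes where regions such as the heap and the stack sit in virtual memory, which makes attacks that depend on known addresses harder. Putting these observations together gives the memory layout: the executable at low addresses, a random gap, then the heap growing upward, with the stack near the top of the address space growing downward.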
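The arithmetic of these two kernel functions can be tried out in user space. The following is a sketch under stated assumptions, not kernel code: the names sketch_randomize_range and sketch_randomize_brk and the SKETCH_* macros are invented here, the page size is assumed to be 4096 bytes, and rand() merely stands in for the kernel's get_random_int().

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size */
#define SKETCH_PAGE_ALIGN(a) \
    (((a) + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1))

/*
 * User-space imitation of randomize_range(): pick a page-aligned
 * address so that a range of `len` bytes starting there lies within
 * [start, end]. Returns 0 when the interval is too small.
 */
unsigned long sketch_randomize_range(unsigned long start, unsigned long end,
                                     unsigned long len)
{
    unsigned long range;

    if (end <= start + len)
        return 0;
    range = end - len - start;
    /* rand() replaces get_random_int() in this sketch */
    return SKETCH_PAGE_ALIGN((unsigned long)rand() % range + start);
}

/*
 * Imitation of arch_randomize_brk(): the heap start lands somewhere
 * in [brk, brk + 0x02000000], falling back to brk itself.
 */
unsigned long sketch_randomize_brk(unsigned long brk)
{
    unsigned long range_end = brk + 0x02000000;
    unsigned long r = sketch_randomize_range(brk, range_end, 0);

    return r ? r : brk;
}
```

For a page-aligned input such as 0x00602000, every result is page-aligned and at most 0x02000000 above the input. That matches the five gaps listed earlier, all of which are below 0x02000000.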
What happens with malloc(0)?
What happens when malloc(0) is called? The code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /**
     * pmem - print mem
     * @p: memory address to start printing from
     * @bytes: number of bytes to print
     *
     * Return: nothing
     */
    void pmem(void *p, unsigned int bytes)
    {
        unsigned char *ptr;
        unsigned int i;

        ptr = (unsigned char *)p;
        for (i = 0; i < bytes; i++)
        {
            if (i != 0)
            {
                printf(" ");
            }
            printf("%02x", *(ptr + i));
        }
        printf("\n");
    }

    /**
     * main - inspect the chunk returned by malloc(0)
     *
     * Return: EXIT_FAILURE if something failed. Otherwise EXIT_SUCCESS
     */
    int main(void)
    {
        void *p;
        size_t size_of_the_chunk;
        char prev_used;

        p = malloc(0);
        if (p == NULL)
        {
            /* a successful malloc(0) is allowed to return NULL */
            printf("malloc(0) returned NULL\n");
            return (EXIT_SUCCESS);
        }
        printf("%p\n", p);
        pmem((char *)p - 0x10, 0x10);
        size_of_the_chunk = *((size_t *)((char *)p - 8));
        prev_used = size_of_the_chunk & 1;
        size_of_the_chunk -= prev_used;
        printf("chunk size = %zu bytes\n", size_of_the_chunk);
        return (EXIT_SUCCESS);
    }

Compile and run:

    root@3e8650948c0c:/ubuntu# gcc test.c -o test
    root@3e8650948c0c:/ubuntu# ./test
    0x564ece64b260
    00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00
    chunk size = 32 bytes

So malloc(0) actually consumed a 32-byte chunk, which includes the 16-byte header described earlier. This is not guaranteed, however: other implementations may produce a different result, and malloc(0) may also return NULL.
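Because the result of malloc(0) is implementation-specific, code that probes it should handle the NULL case and avoid depending on the header layout. The following minimal sketch uses malloc_usable_size(), the function named in the chunk diagram, which is a glibc extension declared in malloc.h. The helper name usable_bytes is made up for illustration.

```c
#include <assert.h>
#include <malloc.h> /* malloc_usable_size(), a glibc extension */
#include <stdlib.h>

/*
 * Return the number of usable bytes malloc actually provides for a
 * request of `request` bytes, or 0 if malloc returned NULL.
 */
size_t usable_bytes(size_t request)
{
    void *p = malloc(request);
    size_t usable;

    if (p == NULL) /* allowed for request == 0, or on failure */
        return 0;
    usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

Calling usable_bytes(0) shows whether a given libc hands out real memory for a zero-byte request. A nonzero result corresponds to a small chunk like the 32-byte one above, and 0 means malloc returned NULL. The exact number depends on the implementation.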
From man malloc:

    NULL may also be returned by a successful call to malloc() with a size of zero

Environment
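The example code was run mainly in two environments:

    ubuntu 16.04
    gcc (Ubuntu 7.4.0-1ubuntu1~16.04~ppa1) 7.4.0

    ubuntu 18.04 docker
    gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

This article was translated and distilled from the following series, with my own understanding added, and I ran all of the code myself. If you have time, you can also read the original English articles directly: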
Hack The Virtual Memory: C strings & /proc
Hack the Virtual Memory: drawing the VM diagram
Hack the Virtual Memory: malloc, the heap & the program break
