asm 32 /64
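I wrote the program in NASM, and it ran on 32-bit Windows and Linux hosts. Later the requirements grew and it also had to run on 64-bit Windows and Linux. Windows has the WOW64 (Windows 32-bit on Windows 64-bit) mechanism, so a 32-bit program runs on a 64-bit machine without any porting. Linux has no "LOL" mechanism (that is Linux on Linux, not laugh out loud, haha ~), but you can install the ia32-libs package (IA-32 stands for Intel Architecture, 32-bit) to get the same LOL effect. Still, building ELF64 and WIN64 object files is something I am interested in, so I am going to port the program!
First, get to know the CPU and its registers. Essentially all of the 32-bit registers have been widened: eax became rax, ebx became rbx, and so on. They are now 64 bits wide, which is naturally nicer to use: one instruction handles 8 bytes, so a single step can do what used to take several. There are also eight new registers, r8 through r15. With that many registers you need far fewer memory temporaries, which helps efficiency again. The registers you can keep your own values in across calls (callee-saved) are r12-r15. Previously only esi, edi and ebx usually played that role; now there are r12-r15 plus rbx, five in all (rbp is callee-saved too, but it normally serves as the frame pointer). Why not rsi and rdi? Good question. On Linux, the 64-bit ABI uses these two registers to pass arguments, so they are generally no longer preserved. They are still important, though: instructions such as lodsb and stosb still take their source and destination addresses from rsi and rdi. I think this was a poor decision: why not pass arguments in the newly added registers instead of my beloved rsi and rdi... I can't design a CPU, but surely I'm allowed to complain! Complaints aside, to keep porting easy it is best to avoid lodsb-style instructions and access memory directly with base-plus-index addressing.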
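Next come function calls. The Unix 64-bit ABI (System V AMD64) passes the first six integer or pointer arguments in rdi, rsi, rdx, rcx, r8 and r9. With fewer than six, use as many registers as you need, in that order. With more than six, the first six go into the registers in that order and the rest are pushed onto the stack from last to first. When the callee is variadic (printf, for example), set al to the number of vector registers used for arguments, which means rax=0 when there are no floating-point arguments. Then issue the call instruction; rsp must be 16-byte aligned at that point. If there were more than six arguments, fix up the stack after the function returns: move the stack pointer back by 8 bytes for every argument you pushed, so the stack stays balanced. Note that the Windows ABI rules are different again!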
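Also, 64-bit mode cannot push a 32-bit register, so, sorry, your push eax no longer works; use push rax and pop rax instead. Adjusting the stack pointer (esp/rsp) directly, however, is an approach that assembles for both 32-bit and 64-bit targets without problems. And when several values have to be pushed in a row (for a function call, say), subtracting from esp/rsp once and then storing the arguments with base-plus-offset addressing is often more efficient than pushing them one at a time! GCC makes API calls this way, so hand-written assembly does not automatically beat gcc: if you are not careful, a C program compiled by GCC will run faster than your assembly. I generally use C for real projects, but NASM lets me understand things more deeply, no argument about that!!
For functions you implement yourself, you can still use the old c-call (stack-based) style, as follows: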
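    Function:
    %define param1  rbp+16      ; first stack argument, above saved rbp and return address
    %define param2  rbp+24
    %define param3  rbp+32
        enter 16,0              ; same as push rbp / mov rbp,rsp / sub rsp,16
    %define local1  rbp-8       ; first local variable
    %define local2  rbp-16
        ;.....
        leave                   ; restore rsp and rbp
        ret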
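Finally, the problem that troubled me during porting: C function return values. In 64-bit code, a function declared to return int only defines eax, the low 32 bits of rax; the upper 32 bits are not guaranteed to be a sign extension. (A full 64-bit value, such as a pointer or a long long, does come back in rax.) Most functions cause no trouble, but the problem shows up when a function returns -1: eax is -1, yet rax is not -1, because its upper 32 bits are all 0 and its lower 32 bits are all 1. So either compare eax, or sign-extend first with movsxd rax, eax.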
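I am short on time right now; I will write another article to discuss this in detail.
Before finishing, here is an excerpt from a reference document on interfacing C with assembly.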
==========================================
Interfacing HLL code with asm
C calling convention – standard stack frame
Arguments passed to a C function are pushed onto the stack, right to left, before the function is called. The first thing the called function does is push the (E)BP register, then copy (E)SP into it. This creates a data structure called the standard C stack frame.
| | 32-bit code | 16-bit code, TINY, SMALL, or COMPACT memory models | 16-bit code, MEDIUM, LARGE, or HUGE memory models |
| --- | --- | --- | --- |
| Create standard stack frame, allocate 16 bytes for local variables, save registers | push ebp / mov ebp,esp / sub esp,16 / push edi / push esi / … | push bp / mov bp,sp / sub sp,16 / push di / push si / … | push bp / mov bp,sp / sub sp,16 / push di / push si / … |
| Restore registers, destroy stack frame, and return | … / pop esi / pop edi / mov esp,ebp / pop ebp / ret | … / pop si / pop di / mov sp,bp / pop bp / ret | … / pop si / pop di / mov sp,bp / pop bp / retf |
| Size of ‘slots’ in stack frame, i.e. stack width | 32 bits | 16 bits | 16 bits |
| Location of stack frame ‘slots’ | [ebp + 8] [ebp + 12] [ebp + 16]… | [bp + 4] [bp + 6] [bp + 8]… | [bp + 6] [bp + 8] [bp + 10]… |
If an argument passed to a function is wider than the stack, it will occupy more than one ‘slot’ in the stack frame. A 64-bit value passed to a function (long long or double) will occupy 2 stack slots in 32-bit code or 4 stack slots in 16-bit code.
Function arguments are accessed with positive offsets from the BP or EBP registers. Local variables are accessed with negative offsets. The previous value of BP or EBP is stored at [bp + 0] or [ebp + 0]. The return address (IP or EIP) is stored at [bp + 2] or [ebp + 4].
C calling convention – return values
A C function usually stores its return value in one or more registers.
| | 32-bit code | 16-bit code, all memory models |
| --- | --- | --- |
| 8-bit return value | AL | AL |
| 16-bit return value | AX | AX |
| 32-bit return value | EAX | DX:AX |
| 64-bit return value | EDX:EAX | space for the return value is allocated on the stack of the calling function, and a ‘hidden’ pointer to this space is passed to the called function |
| 128-bit return value | hidden pointer | hidden pointer |
C calling convention – saving registers
GCC expects functions to preserve the callee-save registers:
EBX, EDI, ESI, EBP, DS, ES, SS
You need not save these registers:
EAX, ECX, EDX, FS, GS, EFLAGS, floating point registers
In some OSes, FS or GS may be used as a pointer to thread local storage (TLS), and must be saved if you modify it.
C calling convention – leading underscores
Some C compilers (those for DOS and Windows, and those with COFF output) prepend an underscore to the names of C functions and global variables. If a C global variable, e.g. conv_mem_size, is accessed by asm code, it should be declared with a leading underscore in the asm code:
EXTERN _conv_mem_size      ; NASM syntax
mov [_conv_mem_size],ax
Linux ELF does NOT use underscores. Watcom C uses trailing underscores for function names, and leading underscores for global variables.
If your GCC supports it, leading underscores can be turned off with the compiler option -fno-leading-underscore
Pascal calling conventions
Function arguments are pushed onto the stack from left to right before the function is called. C-style variable-length argument lists are not possible in Pascal. (Look in file STDARG.H and think about it.)
In C, the calling function must ‘clean up the stack’ (remove function arguments from the stack after the called function returns). In Pascal, the called function must do this, before returning.
Pascal identifiers are case-insensitive. MyKewlProc() will be stored in the object code file as MYKEWLPROC
Other calling conventions
The __stdcall calling convention, used by Windows, is a hybrid of the C and Pascal calling conventions. Like C, function arguments are pushed right-to-left. Like Pascal, the called function must clean up the stack. Exception: the caller must clean up the stack for functions that accept a variable number of arguments, e.g. printf(const char *format, …);
Watcom C uses a register-based calling convention. See sections 7.4, 7.5, 10.4, and 10.5 in cuserguide.pdf in the Watcom documentation. Individual functions can be declared to use the normal, stack-based calling convention.
GCC can be made to use a register calling convention by compiling with gcc -mregparm=NNN …
See the GCC documentation for details.
Reposted from: https://www.cnblogs.com/SZLLQ2000/p/4841771.html