#45 |
"Bob Silverman"
Nov 2003
North of Boston
2^4·3^2·5^3 Posts |
#46 | |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
1101011001110_2 Posts |
Quote:
AT&T syntax is awful IMO. MASM syntax is less bad (but still awful). Intel syntax is better (i.e. usable). |
#47 | |
Jun 2003
1010101100010_2 Posts |
Quote:
According to https://learn.microsoft.com/en-us/cp...?view=msvc-170, this is available in x64. Might the compiler be generating a code sequence to emulate the behavior? BTW, if you have to do multiple divisions with the same divisor, and this is performance critical, stay away from the DIV instruction: it is _slow_.
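One common way to avoid repeated DIVs by the same divisor is to precompute a scaled reciprocal once and replace each division with a single 64x64 high multiply. The following is a minimal sketch, assuming 32-bit operands and a divisor of at least 2; the names `mulhi64`, `div_magic_u32` and `div_by_magic_u32` are hypothetical, `__umulh` is the MSVC x64 intrinsic, and the other branch relies on the GCC/Clang `unsigned __int128` extension.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>   /* __umulh */
#endif

/* High 64 bits of the 128-bit product a * b. */
static uint64_t mulhi64(uint64_t a, uint64_t b)
{
#if defined(_MSC_VER) && defined(_M_X64)
    return __umulh(a, b);
#else
    /* Assumption: the compiler provides unsigned __int128 (GCC, Clang). */
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
#endif
}

/* Computed once per divisor: floor((2^64 - 1) / d) + 1. Requires d >= 2. */
static uint64_t div_magic_u32(uint32_t d)
{
    assert(d >= 2);
    return UINT64_MAX / d + 1;
}

/* n / d with one multiply; magic must come from div_magic_u32(d). */
static uint32_t div_by_magic_u32(uint32_t n, uint64_t magic)
{
    return (uint32_t)mulhi64(magic, n);
}
```

The magic constant is computed with one ordinary division up front, so the trick only pays off when the same divisor is reused many times. Wider operands need a wider reciprocal and are not covered by this sketch.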
#48 | |
"Bob Silverman"
Nov 2003
North of Boston
2^4·3^2·5^3 Posts |
Quote:
We are in agreement. |
#49 |
Tribal Bullet
Oct 2004
110111111011_2 Posts |
Note that the gcc assembler toolchain has supported Intel format for many years (example).
RDS: In Gladman's assembly, PROC and ENDPROC are not instructions but hints to the assembler that guide what section of an object file the generated assembly is linked into. You can probably get MSVC to generate assembly language and look at the directives like these that it uses.
Unfortunately this doesn't get you out of knowing the parameter passing and stack handling conventions in x86 and x64. In particular, Gladman's code doesn't need to do any pushes and pops because the x64 calling conventions use registers for the first few input parameters and treat rax/rdx as volatile across calls. There's no stack handling because the function doesn't need a frame pointer, which is good because the stack setup conventions for x64 are very painful, and you have to use them to make debuggers work on x64.
Last fiddled with by jasonp on 2023-09-08 at 13:22
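To make the register-passing point concrete, here is a small sketch of a leaf routine whose four arguments all fit in registers under both common x64 conventions, so an assembly version would need no pushes, pops or frame setup. The routine `add_n` is hypothetical and not taken from Gladman's code; the register assignments in the comment are those of the Microsoft x64 and System V AMD64 conventions, where pointers and integers share the same register sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical leaf routine: dst[i] = a[i] + b[i] with carry propagation,
 * returning the final carry (0 or 1).
 *
 * Microsoft x64:   dst in RCX, a in RDX, b in R8,  n in R9;  result in RAX.
 * System V AMD64:  dst in RDI, a in RSI, b in RDX, n in RCX; result in RAX.
 */
uint64_t add_n(uint64_t *dst, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t s = a[i] + carry;
        uint64_t c1 = s < carry;   /* overflow from adding the incoming carry */
        s += b[i];
        uint64_t c2 = s < b[i];    /* overflow from adding b[i] */
        dst[i] = s;
        carry = c1 | c2;           /* at most one of the two can be set */
    }
    return carry;
}
```

The differing register order between the two conventions is one reason the same .asm source cannot simply be reassembled on the other platform, even when the assembler syntax matches.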
#50 | |
"Bob Silverman"
Nov 2003
North of Boston
2^4×3^2×5^3 Posts |
Quote:
increment the stack pointer by the appropriate amount. Then do the reverse on exit. Is there anything else that needs to be done? When using MASM it takes care of managing stack space for you. I found 'Microsoft Learn' (https://learn.microsoft.com/en-us/cp...?view=msvc-170); it has been helpful.
#51 | |
"Bob Silverman"
Nov 2003
North of Boston
2^4·3^2·5^3 Posts |
Quote:
does use the entire address space; that 4 byte pointers are inadequate and that it uses 8 bytes, including void*. Up to 4 integer inputs are passed in registers by the compiler as you indicated. Are pointers passed the same way? I also guess that if passing ints, rather than __int64s, one could pack two of them together and then unpack inside the .asm, allowing 8 params to be passed and avoiding stack handling.
I have a copy of Dunne's book on Windows 64 bit ASM, but it doesn't say a lot about conventions used by different compilers.
The thing I hate most about this is that .asm's are so <expletive deleted> non-portable. If I write a routine for Windows and want to port it to Linux, it must be completely rewritten. gcc syntax is totally different from Microsoft's ML64.
There is much weirdness. The x64 kernel library is named kernel32.lib, for example.
Dunne's book assumes the use of VS 2017. I assume that calling conventions are the same for VS 2019 and VS 2022. I have not installed VS 2022 yet. I'm still using VS 2019. I don't want to introduce another potential source of trouble.
I have decided to take the plunge and do a full conversion of my NFS code to x64. However, to make things easier, much of my .asm code can be replaced by C with the additional use of _umul128 and _udiv128. I have looked for an intrinsic that does multiply and add, but can't find one. There are others that would be useful as well if they exist, e.g. add with carry, sign extend an __int64 to __int128 using the rax:rdx register pair, 64 bit shifts using rax:rdx (one can do sign extend if 64 bit arithmetic shifts are available), etc. I will keep on looking.
Hey! I'm retired. It will keep me busy for a while. I might also redo my BL code using AVX.... another learning curve to climb....
#52 | |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
2×47×73 Posts |
Quote:
There are compiler settings to set which one to use. Using 32-bit pointers requires the OS to cooperate when allocating memory, so that all addresses stay below 4G. Windows and Linux support 32-bit pointers in 64-bit mode. Other OSes vary in their support.
#53 | |
Jan 2005
Caught in a sieve
18C_16 Posts |
Quote:
The only processor I've found with an integer fused multiply-add (FMA) is an Nvidia GPU. SSE and AVX have FMA for floats and doubles, if that should happen to float your boat. |