2016-09-30, 02:27   #1
ewmayer
Sep 2002
República de California

10011001000100₂ Posts
Turn off GCC sse-using optimizations?

I am trying to debug some inline-asm used in a loop body, and am running into a problem where a simple bit of print-some-variable-values debug code is clobbering key xmm-register data used in the inline-asm data-processing macro I am trying to debug. Examination of an assembler dump of the source file shows what's going on. In the following simple print of 3 dereferenced pointer values:

	printf("\tax = %20.15f, %20.15f, %20.15f\n",*ax0,*ax1,*ax2);
Here is what GCC does to deref these:
	movsd	(%rax), %xmm0	#* ax0,
	movsd	(%rsi), %xmm1	#* ax1,
	movsd	(%rbx), %xmm2	#* ax2,
Note that the code in question gains performance by not writing these data back to memory at the end of each loop pass and reading them back at the beginning of the next (i.e. it writes to memory only after the loop finishes, and thus relies on GCC's implementation of the loop logic using no SSE2). If there is no simple compiler-options tweak that solves the clobber problem, adding that per-pass write-back and reload is what I will have to do.

Things I tried:
o I can't simply fiddle the xmm-regs in the inline-asm to avoid using xmm0-2, since the code block in question uses all 16 xmm-regs.
o Using -O0 does not solve the problem - apparently this is a non-turn-off-able kind of optimization.
o Using -mno-sse2 does not work, since GCC also applies it to all of my SSE2 inline-asm, yielding a raft of "error: unknown register name ‘xmm0’ in ‘asm’."

Any help is appreciated!

Last fiddled with by ewmayer on 2016-09-30 at 02:33
2016-09-30, 03:39   #2
bsquared
Feb 2007

2·13·127 Posts

Wrap the printf in a my_printf function that resides in another file, then apply -mno-sse2 to that other file? If you have several such statements, you can use vprintf to keep variable-length argument lists in your my_printf function.
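
A minimal sketch of such a wrapper, assuming it sits in its own source file (the file split and build flags are not shown, and my_printf is just the name used above). It only forwards its arguments to vprintf. One caveat: under the x86-64 System V calling convention the caller still places floating-point arguments in xmm registers before the call, so the wrapper alone may not prevent the clobber.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper intended to live in a separate source file.
 * It forwards a printf-style argument list to vprintf unchanged and
 * returns whatever vprintf returns (the number of characters written). */
int my_printf(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);        /* start walking the variable arguments */
    n = vprintf(fmt, ap);     /* let vprintf do the formatting */
    va_end(ap);
    return n;
}
```

Call sites then use my_printf exactly as they would use printf.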
2016-09-30, 04:35   #3
ldesnogu
Jan 2008

528₁₀ Posts

The x86-64 function calling convention uses xmm registers to pass floating-point arguments to functions, so I'm afraid there's little you can do :(

The only simple way around the problem that I can think of is to call a function with no parameters, save all the registers in that function, and then make whatever call you want. I'm not sure you can do that at the GCC level; you may have to write the wrapper function in assembly.
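
A rough sketch of what that might look like at the GCC level, assuming x86-64 and GCC-style extended asm. The names dbg_ax0..dbg_ax2, xmm_save and dbg_print_ax are made up for illustration. It also assumes the compiler emits no SSE instructions before the first asm statement; that is likely for a function this simple, but nothing guarantees it, which is why a pure-assembly wrapper may be the safer route.

```c
#include <stdio.h>

/* Hypothetical globals: the debug routine takes no parameters, so the
 * pointers to print are handed over through memory instead. */
double *dbg_ax0, *dbg_ax1, *dbg_ax2;

/* 16 registers x 16 bytes each. Static, so this sketch is not reentrant. */
static unsigned char xmm_save[16 * 16] __attribute__((aligned(16)));

__attribute__((noinline))
int dbg_print_ax(void)
{
    int n;

    /* Save xmm0-xmm15 before anything that may touch them. The "memory"
     * clobber keeps the loads of *dbg_ax0 etc. from moving above this. */
    __asm__ volatile(
        "movups %%xmm0,    0(%0)\n\t"  "movups %%xmm1,   16(%0)\n\t"
        "movups %%xmm2,   32(%0)\n\t"  "movups %%xmm3,   48(%0)\n\t"
        "movups %%xmm4,   64(%0)\n\t"  "movups %%xmm5,   80(%0)\n\t"
        "movups %%xmm6,   96(%0)\n\t"  "movups %%xmm7,  112(%0)\n\t"
        "movups %%xmm8,  128(%0)\n\t"  "movups %%xmm9,  144(%0)\n\t"
        "movups %%xmm10, 160(%0)\n\t"  "movups %%xmm11, 176(%0)\n\t"
        "movups %%xmm12, 192(%0)\n\t"  "movups %%xmm13, 208(%0)\n\t"
        "movups %%xmm14, 224(%0)\n\t"  "movups %%xmm15, 240(%0)\n\t"
        : : "r"(xmm_save) : "memory");

    /* The compiler is free to use xmm0-xmm2 for the arguments here. */
    n = printf("\tax = %20.15f, %20.15f, %20.15f\n",
               *dbg_ax0, *dbg_ax1, *dbg_ax2);

    /* Put every register back the way the caller's inline asm left it. */
    __asm__ volatile(
        "movups   0(%0), %%xmm0\n\t"   "movups  16(%0), %%xmm1\n\t"
        "movups  32(%0), %%xmm2\n\t"   "movups  48(%0), %%xmm3\n\t"
        "movups  64(%0), %%xmm4\n\t"   "movups  80(%0), %%xmm5\n\t"
        "movups  96(%0), %%xmm6\n\t"   "movups 112(%0), %%xmm7\n\t"
        "movups 128(%0), %%xmm8\n\t"   "movups 144(%0), %%xmm9\n\t"
        "movups 160(%0), %%xmm10\n\t"  "movups 176(%0), %%xmm11\n\t"
        "movups 192(%0), %%xmm12\n\t"  "movups 208(%0), %%xmm13\n\t"
        "movups 224(%0), %%xmm14\n\t"  "movups 240(%0), %%xmm15\n\t"
        : : "r"(xmm_save)
        : "memory", "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5",
          "xmm6", "xmm7", "xmm8", "xmm9", "xmm10", "xmm11", "xmm12",
          "xmm13", "xmm14", "xmm15");

    return n;
}
```

The caller would set dbg_ax0 = ax0; and so on once before the loop, then call dbg_print_ax(); between passes. Because the call takes no arguments, the compiler has no reason to load anything into xmm registers at the call site.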

Last fiddled with by ldesnogu on 2016-09-30 at 04:38
2016-09-30, 07:15   #4
ewmayer
Sep 2002
República de California

2²×31×79 Posts

Thanks, guys - I ended up just adding extra load/store code to my inline-asm, as that is the most straightforward of any of the workarounds proposed so far.
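
For illustration only, here is a toy stand-in for that kind of change, assuming x86-64 and GCC-style extended asm. The real macro is not shown in this thread, and accumulate_pass is a made-up name. Each pass loads its running value from memory, updates it in xmm0, and stores it back, so nothing needs to survive in an xmm register between passes and a printf between them is harmless.

```c
#include <stdio.h>

/* Toy example: one loop pass that keeps no state in xmm registers.
 * The running value is reloaded from *acc at the start of the pass
 * and written back at the end. */
static void accumulate_pass(double *acc, const double *x)
{
    __asm__ volatile(
        "movsd (%0), %%xmm0\n\t"    /* extra load: fetch the running value */
        "addsd (%1), %%xmm0\n\t"    /* the actual per-pass work */
        "movsd %%xmm0, (%0)\n\t"    /* extra store: write it back */
        :
        : "r"(acc), "r"(x)
        : "xmm0", "memory");
}
```

The price is one load and one store per live value per pass, which is the overhead the original loop was written to avoid.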