mersenneforum.org Useless SSE instructions

2009-02-25, 16:51   #23
R.D. Silverman

"Bob Silverman"
Nov 2003
North of Boston

2^2·1,877 Posts

Quote:
 Originally Posted by ewmayer Note I said "runtime", not "performance". My reaction was something along the lines of "You know, if I needed a way to get my CPU to run cooler, I'd just switch my system power options to max-battery-life mode or fill my assembly code with no-ops."

There is a famous quote from Seymour Cray: What do we need software for?
It just slows the machine down...

2009-02-26, 21:33   #24
geoff

Mar 2003
New Zealand

13×89 Posts

My whinge: SSE has AND and AND-NOT, but no NOT. So I synthesize NOT from AND, AND-NOT, and say PCMPEQD, which uses an extra scratch register. Why not have AND and NOT and let the programmer synthesize AND-NOT, no scratch register required? I suppose there must be a reason.

2009-02-26, 22:09   #25
__HRB__

Dec 2008
Boycotting the Soapbox

2^4·3^2·5 Posts

Quote:
 Originally Posted by geoff My whinge: SSE has AND and AND-NOT, but no NOT. So I synthesize NOT from AND, AND-NOT, and say PCMPEQD, which uses an extra scratch register. Why not have AND and NOT and let the programmer synthesize AND-NOT, no scratch register required? I suppose there must be a reason.
Synthesizing AND-NOT requires two instructions, so PANDN can be twice as fast. If you need NOT and are experiencing register pressure, you can do it in one instruction with PXOR, using a memory operand that points at a location filled with FF bytes.
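
For reference, a minimal sketch of the XOR trick in C with SSE2 intrinsics. The helper names `not_si128` and `not_lane0` are made up for this example, and whether the all-ones constant ends up in a register (e.g. via PCMPEQD) or as a memory operand is up to the compiler, so treat it as an illustration rather than the definitive way to do it:

```c
#include <emmintrin.h>  /* SSE2 integer intrinsics */
#include <stdint.h>

/* Hypothetical helper: bitwise NOT of a 128-bit vector.
   x XOR all-ones flips every bit and compiles to a single PXOR. */
static inline __m128i not_si128(__m128i x)
{
    /* All-ones constant; the compiler chooses how to materialize it
       (register via PCMPEQD, or a load from memory). */
    const __m128i ones = _mm_set1_epi32(-1);
    return _mm_xor_si128(x, ones);
}

/* Hypothetical scalar wrapper: apply the vector NOT and return lane 0. */
static inline uint32_t not_lane0(uint32_t v)
{
    __m128i x = _mm_cvtsi32_si128((int)v);
    return (uint32_t)_mm_cvtsi128_si32(not_si128(x));
}
```

geoff's PANDN variant would be `_mm_andnot_si128(x, ones)`, which computes `~x & ones` and gives the same result.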

2009-04-22, 04:47   #26
__HRB__

Dec 2008
Boycotting the Soapbox

720_10 Posts
rcpps, but no rcppd! WTF? (nt)

no text

2009-04-22, 05:38   #27
retina
Undefined

"The unspeakable one"
Jun 2006
My evil lair

1101000010111_2 Posts

Quote:
 Originally Posted by __HRB__ rcpps, but no rcppd! WTF? (nt)
Since the result is only accurate to about 12 bits, there seems little sense in expanding it to a 53-bit mantissa. Do four conversions in one cycle and go from there to whatever final precision is needed.

2009-04-22, 16:47   #28
__HRB__

Dec 2008
Boycotting the Soapbox

2^4·3^2·5 Posts
divps, divpd

I should have paid attention to the thread title. What I meant was that divps & divpd are superfluous, since rcpps/rcppd & Newton-Raphson are faster and can be pipelined.

Quote:
 Originally Posted by retina Since the result is only 12bit there seems little sense in expanding it to a 53bit mantissa. Do four conversions in one cycle and go from there to whatever final precision is needed.
The issue is that the missing rcppd forces one to use two extra instructions - convert doubles to floats and floats to doubles - blocking the execution ports for 2 cycles and adding 6-8 cycles in latency.
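
A rough sketch of the workaround being described, in C with SSE2 intrinsics: convert to floats, take the rcpps estimate, convert back, then refine with Newton-Raphson. The names `recip_pd` and `recip_lane0` and the choice of three refinement steps are assumptions made for this example. It only works for inputs that fit in single-precision range, and the result is not guaranteed to be correctly rounded the way divpd is:

```c
#include <emmintrin.h>  /* SSE2: cvtpd2ps, cvtps2pd, mulpd, subpd */
#include <xmmintrin.h>  /* SSE: rcpps */

/* Hypothetical helper: approximate 1/d for two packed doubles
   without using divpd. */
static inline __m128d recip_pd(__m128d d)
{
    /* cvtpd2ps -> rcpps -> cvtps2pd gives an estimate good to
       roughly 12 bits. The two conversions are the extra
       instructions a native rcppd would make unnecessary. */
    __m128d x = _mm_cvtps_pd(_mm_rcp_ps(_mm_cvtpd_ps(d)));
    const __m128d two = _mm_set1_pd(2.0);
    int i;

    /* Newton-Raphson step x' = x * (2 - d*x); each step roughly
       doubles the number of correct bits (12 -> 24 -> 48 -> ~53). */
    for (i = 0; i < 3; ++i)
        x = _mm_mul_pd(x, _mm_sub_pd(two, _mm_mul_pd(d, x)));
    return x;
}

/* Hypothetical scalar wrapper: reciprocal of one value via lane 0. */
static inline double recip_lane0(double v)
{
    return _mm_cvtsd_f64(recip_pd(_mm_set1_pd(v)));
}
```

Each refinement step is two multiplies and a subtract with no division, which is why it pipelines well across independent vectors.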

2012-03-28, 17:58   #29
bsquared

"Ben"
Feb 2007

7223_8 Posts
pcmpgtw

Ok, so pcmpgtw isn't exactly useless, but I'm really quite upset right now over the fact that there is no unsigned equivalent.

2012-03-28, 19:45   #30
axn

Jun 2003

2^2×3^2×151 Posts

Quote:
 Originally Posted by bsquared Ok, so pcmpgtw isn't exactly useless, but I'm really quite upset right now over the fact that there is no unsigned equivalent.
PSUBUSW should get you almost all the way.

2012-03-28, 20:34   #31
bsquared

"Ben"
Feb 2007

7·13·41 Posts

Quote:
 Originally Posted by axn PSUBUSW should get you to almost all the way.
Yeah, cool!

This will do the job:
Code:

"pxor %%xmm0, %%xmm0 \n\t"     /* xmm0 := 0 */
"psubusw %%xmm1, %%xmm2 \n\t"  /* xmm2 := max(b - a, 0), unsigned saturating; xmm1 = a, xmm2 = b */
"pcmpeqw %%xmm0, %%xmm2 \n\t"  /* xmm2 := a >= b ? 0xFFFF : 0 (per 16-bit lane) */

The extra dependency costs a cycle of latency, a "0" register must be set up (which can be reused for additional tests), and it's actually a ">=" test, but it's still a decent workaround.
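
The same idea can be sketched with SSE2 intrinsics. The function names here are invented for the example. The second function shows one common way (not from the posts above) to get a strict ">" instead: flip the sign bit of both operands so that the signed pcmpgtw gives the unsigned ordering:

```c
#include <emmintrin.h>  /* SSE2: psubusw, pcmpeqw, pcmpgtw, pxor */
#include <stdint.h>

/* Hypothetical helper: per 16-bit lane, 0xFFFF where a >= b
   (unsigned), else 0. */
static inline __m128i cmpge_epu16(__m128i a, __m128i b)
{
    /* psubusw computes max(b - a, 0), which is zero exactly
       when b <= a; pcmpeqw against zero turns that into a mask. */
    return _mm_cmpeq_epi16(_mm_subs_epu16(b, a), _mm_setzero_si128());
}

/* Hypothetical helper: per 16-bit lane, 0xFFFF where a > b
   (unsigned), else 0. */
static inline __m128i cmpgt_epu16(__m128i a, __m128i b)
{
    /* XOR with 0x8000 maps unsigned order onto signed order,
       so the signed compare yields the unsigned answer. */
    const __m128i bias = _mm_set1_epi16((short)0x8000);
    return _mm_cmpgt_epi16(_mm_xor_si128(a, bias), _mm_xor_si128(b, bias));
}

/* Hypothetical scalar wrappers that return lane 0 of each mask. */
static inline uint16_t ge_lane0(uint16_t a, uint16_t b)
{
    __m128i m = cmpge_epu16(_mm_set1_epi16((short)a), _mm_set1_epi16((short)b));
    return (uint16_t)_mm_extract_epi16(m, 0);
}

static inline uint16_t gt_lane0(uint16_t a, uint16_t b)
{
    __m128i m = cmpgt_epu16(_mm_set1_epi16((short)a), _mm_set1_epi16((short)b));
    return (uint16_t)_mm_extract_epi16(m, 0);
}
```

The bias variant needs a constant register and two extra XORs, versus a zero register and one extra instruction for the ">=" form, so which one is cheaper depends on what the surrounding code already has lying around.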

In the spirit of this thread, though, it still sucks that this is necessary...

Last fiddled with by bsquared on 2012-03-28 at 20:35

2012-03-28, 23:05   #32
Batalov

"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

23503_8 Posts

"The only thing in the house that didn't suck was the vacuum cleaner."

2012-03-29, 06:50   #33
davieddy

"Lucan"
Dec 2006
England

2×3×13×83 Posts
THX

Quote:
 Originally Posted by Batalov "The only thing in the house that didn't suck was the vacuum cleaner."
I don't laugh often enough these days.

Sounds like Raymond Chandler or similar.

David
