#34 |
∂²ω=0
Sep 2002
República de California
2DEB₁₆ Posts |
[reviving this too-long-dormant thread]
OK, but why does the ISA give us not one but *two* separate instructions - PXOR and XORPD - to do exactly the same thing (a whole-xmm-register bitwise XOR), but neither a logical (1s-comp) nor arithmetic (2s-comp) NOT of any kind? |
#35 | |
"Ben"
Feb 2007
111010010011₂ Posts |
![]() Quote:
|
|
#36 | |
May 2008
Worcester, United Kingdom
7²×11 Posts |
![]() Quote:
Repeated cycles of messy design decisions bolted onto earlier messes have now produced a need for backwards compatibility so strong that every attempt to produce something better has failed in mass-market terms. It seems we will never escape this abomination :-( |
|
#37 |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
1A17₁₆ Posts |
If and when tablets and smartphones finally push out the desktop and laptop markets, we will all be using ARM. Let's just hope MS's latest x86 "Surface" fails and sanity prevails with ARM taking over.
|
#38 |
∂²ω=0
Sep 2002
República de California
5·2,351 Posts |
#39 |
∂²ω=0
Sep 2002
República de California
5·2,351 Posts |
Closely related to Useless SSE Instructions is the category "SSE Instructions Which Would Be Useful But Which Are Inexplicably Absent". One such which has caused me annoyance this day is the lack of any SSE instruction to perform floating-double <--> 64-bit integer conversions. I have an application whose outputs are packed-double (xmm-register) representations of 50-bit nonnegative ints, which I need to convert to integer form for further manipulation as 64-bit ints.
I am thinking of emulating the missing conversion by taking advantage of the 50-bit normalization, like so:
0. Start with the outputs in packed-double form;
1. Add (packed double) 2^50 to each to yield identical exponent fields, effectively "aligning the hidden bits", which allows us to use a constant set of mask and shift parameters in the ensuing step;
2. Now treating the operands as packed 64-bit ints, do some simple integer-mask magic to mask off the IEEE-double exponent bits and right-justify the mantissas. This would also wipe away the extra power 2^50 we added in step [1].
Does anyone know if the operand/operation-type-mixing which occurs in step [2] will impose a significant cycle penalty? Any other ideas - not necessarily SSE-based - for efficiently doing the above type conversions are also welcome. |
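For illustration only, here is a scalar C sketch of steps [0]-[2], one double at a time rather than packed in an xmm register. The function name `dbl_to_u64_via_2p50` is hypothetical, and the sketch assumes the input is an exact integer in [0, 2^50). After adding 2^50 the exponent field is fixed, and the 52-bit fraction field holds the input scaled by 4, so the mask removes the exponent and a right shift by 2 right-justifies the value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a double holding an integer in [0, 2^50)
   to uint64_t using only an add, a mask and a shift. */
static uint64_t dbl_to_u64_via_2p50(double x)
{
    /* Step 1: add 2^50 so every result lies in [2^50, 2^51) and
       therefore has the same exponent field. */
    double biased = x + 1125899906842624.0;   /* 2^50 */

    /* Step 2: reinterpret the bits as a 64-bit integer. */
    uint64_t bits;
    memcpy(&bits, &biased, sizeof bits);

    /* Mask off sign and exponent, keeping the 52-bit fraction field.
       The hidden bit (the 2^50 we added) is implicit, so it vanishes. */
    bits &= 0x000FFFFFFFFFFFFFULL;

    /* The fraction field is x * 4, so shift right by 2 to right-justify. */
    return bits >> 2;
}
```

A call such as `dbl_to_u64_via_2p50(12345.0)` should give 12345. A packed version would apply the same add, mask and shift to both lanes of the register.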
#40 |
P90 years forever!
Aug 2002
Yeehaw, FL
1111111011010₂ Posts |
IIRC, most Intel CPUs have a one-clock penalty for moving an operand from the FPU to the integer units.
|
#41 |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
6,679 Posts |
If you add 2^52 instead then you only need the mask and can eliminate the shift.
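A rough sketch of the 2^52 variant on a pair of packed doubles, using SSE2 intrinsics where the compiler provides them and a plain scalar fallback otherwise. The function name `pd_to_u64x2` and the `HAVE_SSE2` macro are made up for this example, and the inputs are assumed to be exact integers in [0, 2^52). With a bias of 2^52 the fraction field is the integer itself, so a single AND is enough.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2 1   /* hypothetical feature macro for this sketch */
#endif

/* Hypothetical helper: convert two doubles holding integers in [0, 2^52)
   to two uint64_t values with one add and one mask, no shift. */
static void pd_to_u64x2(const double in[2], uint64_t out[2])
{
#ifdef HAVE_SSE2
    const __m128d bias = _mm_set1_pd(4503599627370496.0);        /* 2^52 */
    const __m128i mask = _mm_set1_epi64x(0x000FFFFFFFFFFFFFLL);  /* low 52 bits */

    /* Packed-double add: both lanes now share the same exponent field. */
    __m128d v = _mm_add_pd(_mm_loadu_pd(in), bias);

    /* Reinterpret as packed 64-bit ints and mask off sign and exponent. */
    __m128i bits = _mm_and_si128(_mm_castpd_si128(v), mask);

    _mm_storeu_si128((__m128i *)out, bits);
#else
    /* Portable fallback: same arithmetic, one lane at a time. */
    for (int i = 0; i < 2; i++) {
        double biased = in[i] + 4503599627370496.0;   /* 2^52 */
        uint64_t bits;
        memcpy(&bits, &biased, sizeof bits);
        out[i] = bits & 0x000FFFFFFFFFFFFFULL;
    }
#endif
}
```

Whether the switch from `_mm_add_pd` to `_mm_and_si128` costs a bypass delay depends on the CPU, so this would still need to be timed.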
|
#42 | ||
∂²ω=0
Sep 2002
República de California
5×2,351 Posts |
![]() Quote:
Quote:
Time to code it and time it... |
||