#151 | |
Aug 2002
8,563 Posts |
![]() Quote:
Also, can the pauser program maybe kick back on if say 30 minutes of idle time goes by? I'd hate for someone to forget to kill it... (I check everything every 3-4 hours so I suppose I might catch it eventually!) I'm curious, how much of an effect does having a program like mprime running at nice 19 have on running timings? Do you think us having a revision B Opteron will be a problem? I'm half tempted to call AMD and complain, since our primary development focus deals with SSE2... Thoughts? |
#152 | ||||
P90 years forever!
Aug 2002
Yeehaw, FL
23·1,021 Posts |
![]() Quote:
Quote:
Quote:
Quote:
#153 |
P90 years forever!
Aug 2002
Yeehaw, FL
23·1,021 Posts |
BTW, would anyone like to volunteer to get some mlucas and glucas timings?
#154 | |
Aug 2002
8,563 Posts |
![]() Quote:
#155 |
Aug 2002
8,563 Posts |
Looks like P-1 time!
Code:
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ Command
 3112 gw    39  19  359m 359m  736 R 99.9 72.5 691:50.32 ./mprime
#156 | |
Apr 2003
Berlin, Germany
36110 Posts |
![]() Quote:
Here's what I found out:

All Opterons have the following values:
family 15
model 5

Revision B or earlier shows stepping 0 or 1.
Revision C shows stepping 8.
(Athlon 64 1.8GHz shows family 15, model 4, stepping 8.)

CPU-Z additionally shows a revision string, which is "SH7-B3" for some rev B CPUs. I also read "B0" somewhere. So it could be that the "B" in this string really means revision B, and that there are different B steppings too.

Some really new benchmark results (http://pcweb.mycom.co.jp/benchmarkla.../25/index.html) show that the difference is not small but in the double-digit percent range. I won't show the SSE2 differences in Sandra again (since it's a synthetic benchmark and can suffer disproportionately from a few wrongly optimized, or unoptimized, instructions). Instead this one (especially MPEG-2 and DIVX) is more interesting, and it is a real-world benchmark (taken from http://pcweb.mycom.co.jp/benchmarkla.../25/page5.html):
http://pcweb.mycom.co.jp/benchmarkla...images/g15.png

Remember: the clock difference between the 240 (1.4GHz) and the 144 (1.8GHz) is only 28.6%! The 240 could be a CPU that was sold months ago, while the 144 is really new.

Regards,
DDB
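For anyone who wants to check the 28.6% figure, here is a minimal sketch of the arithmetic. The function name is just an illustration; only the two clock speeds come from the post.

```python
def clock_advantage_percent(slow_ghz: float, fast_ghz: float) -> float:
    """Return how much higher the fast clock is, as a percentage of the slow clock."""
    return (fast_ghz - slow_ghz) / slow_ghz * 100.0


# Opteron 240 at 1.4 GHz versus Opteron 144 at 1.8 GHz (values from the post).
pct = clock_advantage_percent(1.4, 1.8)
print(f"{pct:.1f}%")  # 28.6%
```

Any benchmark gap noticeably larger than this cannot be explained by clock speed alone, which is the point of the comparison.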
#157 |
Aug 2002
8,563 Posts |
Code:
mv@opteron:/proc> cat cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 140
stepping        : 1
cpu MHz         : 1396.059
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips        : 2785.28
TLB size        : 1088 4K pages
clflush size    : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp
#158 |
Aug 2002
8,563 Posts |
$925.13 in donations, $835.32 spent, and $89.81 left...
I just got around to ordering the CD-ROM... It was $16.99 plus $5 for delivery... http://www.newegg.com/app/viewProduct.asp?description=27-101-204&depa=1 |
#159 |
Aug 2003
52 Posts |
Back in town today, this is my cpuinfo:

Opteron64:~ # cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Engineering Sample
stepping        : 8
cpu MHz         : 1800.028
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips        : 3591.37
TLB size        : 1088 4K pages
clflush size    : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp

so I guess it is a 'C'.

I'll start running some tests if I can find the right software.
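The stepping check discussed in this thread can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the family/model/stepping rule is the rule of thumb posted earlier in the thread, not an official AMD mapping.

```python
def parse_cpuinfo(text: str) -> dict:
    """Parse the 'key : value' lines of the first processor block into a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue   # skip leading blank lines
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def opteron_revision(info: dict) -> str:
    """Guess the revision using only the rule of thumb quoted in this thread."""
    if info.get("cpu family") != "15" or info.get("model") != "5":
        return "not an Opteron by this rule"
    stepping = int(info.get("stepping", "-1"))
    if stepping in (0, 1):
        return "B or earlier"
    if stepping == 8:
        return "C"
    return "unknown"


# Shortened sample in /proc/cpuinfo format, using values from the post above.
sample = """processor : 0
cpu family : 15
model : 5
stepping : 8
"""
print(opteron_revision(parse_cpuinfo(sample)))  # C
```

On a real Linux box the text would come from reading /proc/cpuinfo instead of the sample string.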
#160 | ||
Apr 2003
Berlin, Germany
192 Posts |
![]() Quote:
Quote:
ScienceMark 2.0 (www.sciencemark.org) has some matrix multiply benchmarks (SGEMM, DGEMM) optimized for different architectures. And most interesting: Prime95/mprime on a revision C! :) Plus the x87-only results (CpuSupportsSSE2=0). |
#161 | |
Aug 2003
52 Posts |
![]() Quote:
I just ran mprime (what is p95tst ??). These are the results. Please let me know if this should be something different.

Main Menu

1. Test/Primenet
2. Test/User Information
3. Test/Vacation or Holiday
4. Test/Status
5. Test/Continue
6. Test/Exit
7. Advanced/Test
8. Advanced/Time
9. Advanced/P-1
10. Advanced/ECM
11. Advanced/Priority
12. Advanced/Manual Communication
13. Advanced/Unreserve Exponent
14. Advanced/Quit Gimps
15. Options/CPU
16. Options/Preferences
17. Options/Torture Test
18. Options/Benchmark
19. Help/About
20. Help/About PrimeNet Server
Your choice: 8

Exponent to time (10000000): 20000000
Number of Iterations (10):
Accept the answers above? (Y):
p: 20000000. Time: 76.813 ms.
p: 20000000. Time: 77.903 ms.
p: 20000000. Time: 78.012 ms.
p: 20000000. Time: 76.832 ms.
p: 20000000. Time: 76.813 ms.
p: 20000000. Time: 76.771 ms.
p: 20000000. Time: 76.808 ms.
p: 20000000. Time: 76.771 ms.
p: 20000000. Time: 76.889 ms.
p: 20000000. Time: 77.663 ms.
Iterations: 10. Total time: 0.771 sec.
Estimated time to complete this exponent: 17 days, 20 hours, 29 minutes.

Hit enter to continue:

Bok
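The estimate at the end of that output can be reproduced from the ten iteration times. The sketch below assumes the estimate is simply the mean iteration time multiplied by the exponent (roughly one iteration per bit of the exponent); that is a guess at what mprime does, not something confirmed in this thread, and the function name is hypothetical.

```python
# Iteration times in milliseconds, copied from the output above.
times_ms = [76.813, 77.903, 78.012, 76.832, 76.813,
            76.771, 76.808, 76.771, 76.889, 77.663]
exponent = 20_000_000


def estimate_duration(exponent, times_ms):
    """Return (days, hours, minutes), assuming about `exponent` iterations
    at the mean measured time per iteration."""
    mean_s = sum(times_ms) / len(times_ms) / 1000.0
    total_s = int(exponent * mean_s)
    days, rem = divmod(total_s, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes = rem // 60
    return days, hours, minutes


print(estimate_duration(exponent, times_ms))  # (17, 20, 29)
```

The result matches the "17 days, 20 hours, 29 minutes" printed above, so the simple model is at least consistent with this one run.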
#162 |
Apr 2003
Berlin, Germany
192 Posts |
@bok:
You can find p95tst here: http://www.mersenne.org/gimps/p95tst.zip. |
#163 |
Aug 2003
1916 Posts |
yuk,
I'm only running linux I'm afraid, that's a win executable. I've got a drive with XP (32bit) installed on it for the opteron, but as I'm at work (ssh'd ) I can't swap... any linux tests I can do ?? Bok |
#164 |
Apr 2003
Berlin, Germany
16916 Posts |
There are some small and quickly available benchmarks (like bytemark: http://www.tux.org/~mayer/linux/bmark.html), but they are compiler dependent.
Later you could run some tests on WinXP. It's not that urgent to know the results :) I'm still at work, will go home soon (GMT +1). Then I'll have a look at some SSE2 stuff on Opteron. |
#165 | |
Aug 2002
11011112 Posts |
![]() Quote:
These are the results for the latest snapshot of Glucas v2.9.1
[code:1]
milsec/iter (user time)
FFT(k)  round check on / off
---     ---------
 512     52/ 50
 576     60/ 59
 640     68/ 64
 768     79/ 76
 896     99/ 97
1024    108/105
1152    125/122
1280    140/135
1536    167/163
1792    206/205
2048    230/223
[/code:1]
The binary is made using SSE2 and the system compiler GCC 3.3.

You can try it with a Revision C chip by downloading my latest home snapshot
ftp://ftp.oxixares.com/glucas/glucas-2.9.1.tar.gz

Then the usual configure and make. To test, do a selftest
./Glucas -s p
and you will see the timings in the 'selftest.out' file.

Guillermo
#166 |
Aug 2002
3×37 Posts |
Just for comparison, here are the results of my Athlon XP (Barton) box,
[code:1]
gbv@gauss:~/glucas/glucas-2.9.1> less selftest.out
You have new mail in /var/spool/mail/gbv
gbv@gauss:~/glucas/glucas-2.9.1> cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 10
model name      : AMD Athlon(tm) XP 2500+
stepping        : 0
cpu MHz         : 1830.138
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
bogomips        : 3643.80
[/code:1]
And the timings:
[code:1]
milsec/iter (user time)
FFT(k)  round check on / off
---     ---------
 512     53/ 50
 576     63/ 60
 640     74/ 70
 768     88/ 84
 896    107/100
1024    115/116
1152    138/130
1280    149/142
1536    181/172
1792    217/213
2048    238/228
[/code:1]
Guillermo
#167 |
Aug 2003
52 Posts |
Ok, ran that Glucas test (your link needs /pub/ btw).
I guess you cleaned up the results so I'll try and do the same; I think I'm interpreting it correctly.

milsec/iter (user time)
FFT(k)  round check on / off
---     ---------
 512     43/40
 576     52/48
 640     56/51
 768     66/62
 896     80/76
1024     88/84
1152    107/100
1280    115/107
1536    137/130
1792    166/158
2048    183/175

I'll try making with -m64 -m128bit-long-double as well.

Bok
#168 |
P90 years forever!
Aug 2002
Yeehaw, FL
1FE816 Posts |
Bok, try this:
Run "mprime -m"
Choose 18
exit mprime
vi local.ini
add line "CpuSupportsSSE2=0"
repeat choice 18
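If editing local.ini by hand in vi is awkward over ssh, the same change can be scripted. This is a rough sketch that assumes local.ini is a flat list of key=value lines; the helper name is hypothetical, and only the file name and the CpuSupportsSSE2=0 setting come from the steps above.

```python
from pathlib import Path


def set_ini_option(path, key, value):
    """Set key=value in a flat key=value file, replacing any existing entry."""
    p = Path(path)
    lines = p.read_text().splitlines() if p.exists() else []
    new_line = f"{key}={value}"
    replaced = False
    for i, line in enumerate(lines):
        # Compare only the part before the first '=' so values are ignored.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    p.write_text("\n".join(lines) + "\n")
```

Calling set_ini_option("local.ini", "CpuSupportsSSE2", 0) from the mprime directory, with mprime not running, would correspond to the vi step; running it twice leaves a single entry.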
#169 |
P90 years forever!
Aug 2002
Yeehaw, FL
816810 Posts |
Bok, also please try this in a fresh directory:
Get ftp://mersenne.org/gimps/mprtst.tar.gz
Run mprime -m
Choose 5
exit
delete worktodo.ini
Run mprime -m
Choose 8, exponent = 20000000, iterations = 10
exit
Post the results.txt file - full of weird timer numbers
#170 | |
Aug 2003
318 Posts |
![]() Quote:
AMD Engineering Sample
CPU speed: 1799.79 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 22.12, RdtscTiming=1
Best time for 256K FFT length: 16.347 ms.
Best time for 320K FFT length: 21.512 ms.
Best time for 384K FFT length: 25.947 ms.
Best time for 448K FFT length: 31.366 ms.
Best time for 512K FFT length: 34.787 ms.
Best time for 640K FFT length: 46.058 ms.
Best time for 768K FFT length: 56.882 ms.
Best time for 896K FFT length: 69.160 ms.
Best time for 1024K FFT length: 76.882 ms.
Best time for 1280K FFT length: 98.280 ms.
Best time for 1536K FFT length: 118.428 ms.
Best time for 1792K FFT length: 145.785 ms.

AMD Engineering Sample
CPU speed: 1800.20 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 22.12, RdtscTiming=1
Best time for 256K FFT length: 17.196 ms.
Best time for 320K FFT length: 22.334 ms.
Best time for 384K FFT length: 29.183 ms.
Best time for 448K FFT length: 32.451 ms.
Best time for 512K FFT length: 35.374 ms.
Best time for 640K FFT length: 46.979 ms.
Best time for 768K FFT length: 56.599 ms.
Best time for 896K FFT length: 68.014 ms.
Best time for 1024K FFT length: 75.773 ms.
Best time for 1280K FFT length: 100.296 ms.
Best time for 1536K FFT length: 120.103 ms.
Best time for 1792K FFT length: 145.532 ms.
#171 | |
Aug 2003
52 Posts |
![]() Quote:
Test 0: 0.000 sec. (121 clocks), avg: 0.000 sec. (127 clocks)
Test 1: 0.000 sec. (2131 clocks), avg: 0.000 sec. (2140 clocks)
Test 2: 0.000 sec. (654142 clocks), avg: 0.000 sec. (660652 clocks)
Test 3: 0.001 sec. (2623567 clocks), avg: 0.001 sec. (2685738 clocks)
Test 4: 0.002 sec. (2840212 clocks), avg: 0.002 sec. (2854997 clocks)
Test 1000: 0.000 sec. (528147 clocks), avg: 0.000 sec. (537882 clocks)
Test 1001: 0.001 sec. (1547728 clocks), avg: 0.001 sec. (1581690 clocks)
Test 1002: 0.001 sec. (2323676 clocks), avg: 0.001 sec. (2347559 clocks)
Test 1003: 0.000 sec. (525146 clocks), avg: 0.000 sec. (525155 clocks)
Test 1004: 0.001 sec. (2460613 clocks), avg: 0.001 sec. (2470975 clocks)
Test 1005: 0.002 sec. (4224755 clocks), avg: 0.002 sec. (4397632 clocks)
Test 1006: 0.000 sec. (528147 clocks), avg: 0.000 sec. (529015 clocks)
Test 1007: 0.001 sec. (2485652 clocks), avg: 0.001 sec. (2512956 clocks)
Test 1008: 0.002 sec. (3813135 clocks), avg: 0.002 sec. (4017296 clocks)
Test 1009: 0.001 sec. (1040145 clocks), avg: 0.001 sec. (1042419 clocks)
Test 1010: 0.003 sec. (4907879 clocks), avg: 0.003 sec. (4914776 clocks)
Test 1011: 0.003 sec. (4732079 clocks), avg: 0.003 sec. (4976143 clocks)
Test 1012: 0.000 sec. (504955 clocks), avg: 0.000 sec. (505812 clocks)
Test 1013: 0.000 sec. (831666 clocks), avg: 0.000 sec. (834674 clocks)
Test 1014: 0.001 sec. (1120767 clocks), avg: 0.001 sec. (1142366 clocks)
Test 1015: 0.001 sec. (1581311 clocks), avg: 0.001 sec. (1590955 clocks)
Test 1016: 0.001 sec. (1290771 clocks), avg: 0.001 sec. (1312691 clocks)
Test 1017: 0.001 sec. (1901221 clocks), avg: 0.001 sec. (1910335 clocks)
Test 1018: 0.001 sec. (1152645 clocks), avg: 0.001 sec. (1154434 clocks)
Test 1019: 0.001 sec. (2025428 clocks), avg: 0.001 sec. (2029746 clocks)
[Fri Sep 5 16:23:34 2003]
timer 0: 72921256
timer 1: 57044016
timer 2: 579204
timer 3: 72340708
timer 4: 10620074
timer 5: 6002809
timer 6: 13630747
timer 9: 12004685
timer 10: 13410368
timer 13: 7685964
timer 14: 7410400
timer 16: 9099429
timer 17: 5984708
timer 18: 8094904
timer 20: 12457308
timer 21: 13360638
timer 24: 7918447
timer 26: 18079
timer 27: 36327

Bok
#172 |
Sep 2003
3×863 Posts |
Bok, perhaps you ought to try mprime version 23.5 instead of 22.12?
#173 |
Aug 2003
52 Posts |
ok, ran it again with version 23.5

AMD Engineering Sample
CPU speed: 1799.80 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 25.615 ms.
Best time for 448K FFT length: 30.690 ms.
Best time for 512K FFT length: 34.733 ms.
Best time for 640K FFT length: 43.128 ms.
Best time for 768K FFT length: 52.786 ms.
Best time for 896K FFT length: 63.544 ms.
Best time for 1024K FFT length: 71.629 ms.
Best time for 1280K FFT length: 95.534 ms.
Best time for 1536K FFT length: 116.598 ms.
Best time for 1792K FFT length: 140.507 ms.
Best time for 2048K FFT length: 158.316 ms.
[Fri Sep 5 20:38:36 2003]
Compare your results to other computers at http://www.mersenne.org/bench.htm
That web page also contains instructions on how your results can be included.

AMD Engineering Sample
CPU speed: 1799.76 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 28.408 ms.
Best time for 448K FFT length: 32.031 ms.
Best time for 512K FFT length: 34.807 ms.
Best time for 640K FFT length: 47.125 ms.
Best time for 768K FFT length: 55.922 ms.
Best time for 896K FFT length: 67.866 ms.
Best time for 1024K FFT length: 74.954 ms.
Best time for 1280K FFT length: 96.482 ms.
Best time for 1536K FFT length: 117.145 ms.
Best time for 1792K FFT length: 139.925 ms.
Best time for 2048K FFT length: 156.999 ms.

Should it have been slower ???

Bok
#174 | |
Sep 2003
3×863 Posts |
![]() Quote:
So your results show that version 23.5 is actually a bit faster than 22.12 for your Opteron. |
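To put a number on "a bit faster", here is a small sketch that compares the two SSE2 runs Bok posted, for the FFT lengths both versions report. The timings are copied from the posts above; the variable and function names are only for illustration.

```python
# Best times in ms with SSE2 enabled, keyed by FFT length in K (from Bok's posts).
v22_12 = {384: 25.947, 448: 31.366, 512: 34.787, 640: 46.058, 768: 56.882,
          896: 69.160, 1024: 76.882, 1280: 98.280, 1536: 118.428, 1792: 145.785}
v23_5 = {384: 25.615, 448: 30.690, 512: 34.733, 640: 43.128, 768: 52.786,
         896: 63.544, 1024: 71.629, 1280: 95.534, 1536: 116.598, 1792: 140.507}


def percent_faster(old_ms, new_ms):
    """Return the reduction in iteration time as a percentage of the old time."""
    return (old_ms - new_ms) / old_ms * 100.0


gains = {k: percent_faster(v22_12[k], v23_5[k]) for k in v22_12 if k in v23_5}
for fft_k in sorted(gains):
    print(f"{fft_k:5d}K  {gains[fft_k]:5.2f}% faster")
```

Every shared FFT length comes out positive for 23.5 in this data, though a single benchmark run on each version is thin evidence.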
#175 | |
Aug 2002
3×37 Posts |
It seems there is no significant advantage in using Revision C for Glucas. Actually, the timings are nearly the same. The version Bok tested is a bit slower because I suppressed an optimization that was causing problems on other targets.
Another interesting thing is that Glucas and mprime are closer than ever. And I also have work to do with Glucas ...
Quote:
Guillermo |
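One way to read "nearly the same" is per clock rather than in milliseconds. The sketch below converts the round-check-on timings into clocks per iteration. It assumes the first Glucas table in this thread was measured on the 1396.059 MHz Opteron 140 whose cpuinfo was posted earlier; that post does not say which machine was used, so treat this as a guess. Bok's 1800.028 MHz figure comes from his cpuinfo.

```python
# FFT lengths in K and ms/iteration with round check on, from the two tables above.
fft_k = [512, 576, 640, 768, 896, 1024, 1152, 1280, 1536, 1792, 2048]
rev_b_ms = [52, 60, 68, 79, 99, 108, 125, 140, 167, 206, 230]  # assumed Opteron 140
rev_c_ms = [43, 52, 56, 66, 80, 88, 107, 115, 137, 166, 183]   # Bok's revision C

REV_B_MHZ = 1396.059  # assumption: machine from the earlier cpuinfo post
REV_C_MHZ = 1800.028  # from Bok's cpuinfo


def clocks_per_iteration(ms, mhz):
    """Convert a time per iteration in ms into CPU clocks at the given MHz."""
    return ms * 1e-3 * mhz * 1e6


ratios = [clocks_per_iteration(c, REV_C_MHZ) / clocks_per_iteration(b, REV_B_MHZ)
          for b, c in zip(rev_b_ms, rev_c_ms)]
for k, r in zip(fft_k, ratios):
    print(f"{k:5d}K  rev C uses {r:.2f}x the clocks of rev B")
```

Under that assumption the revision C run needs slightly more clocks per iteration at every FFT length, which fits the remark that the tested snapshot is a bit slower.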