mersenneforum.org Linux memory help

2021-09-24, 03:53   #1
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

7638₁₀ Posts

Linux memory help

I'm running prime95 on a very basic Linux install. Ubuntu 14 or 16, no GUI, nothing much running but prime95. I'm trying to understand the random "Killed" problem during stage 2. Machine has 8GB memory, 5.5GB for mprime. Top shows this:
Code:
Tasks: 118 total,   1 running, 117 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.2 sy, 99.7 ni,  0.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7860544 total,   169252 free,  7455956 used,   235336 buff/cache
KiB Swap:        0 total,        0 free,        0 used.    93612 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
23137 george    30  10 6803204 6.004g   3828 S 398.3 80.1   2662:36 mprime307
Question 1: It seems I have no swapfile (makes sense as there is no hard disk, only an NFS mount). So why does VIRT (virtual memory?) show 6.8GB and RES (resident memory?) show 6GB.

Question 2: The above works OK. But if I let mprime use 6GB, VIRT goes up to almost 7GB and RES goes up to about 6.1GB. This configuration usually works. With OutputIterations set pretty low, I get output at a fairly constant 129 sec. Sometimes though mprime gets slow (200+ sec.). To me this looks like thrashing, but how can that be without a swapfile? Perhaps a system process running and the memory marked cache is converted to free memory which then gets reloaded after the system process completes?

Question 3: Mprime (and prime95) does not let the user input too large a value for memory to use. We do not want naive users (like me??) inputting a value that is apt to cause thrashing. One can always edit local.txt to put in larger values. The current formula allows up to 90% of system memory (7.2GB in my case -- clearly too much). What would be a better formula? Perhaps reserving 10% of system memory or 2.5GB whichever is larger??? Suggestions welcome.

P.S. The above was in 30.7 which has a bug where it sometimes allocates more memory than necessary. The working set obeys the memory limit but peak memory usage is higher than it needs to be. I'm working on a fix and can update these "top" numbers when fixed.

Last fiddled with by Prime95 on 2021-09-24 at 03:56

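A minimal sketch of the two rules from Question 3, for comparison. The function names are made up for illustration, sizes are in MiB, and the machine is taken as a nominal 8GB (8192 MiB) as in the post; this is not how mprime actually computes its limit.

```python
# Hypothetical helpers comparing the two memory-limit rules discussed in Question 3.
# All sizes are in MiB; names and rounding are illustrative assumptions.

def current_limit_mib(total_mib):
    """Current rule as described in the post: allow up to 90% of system memory."""
    return total_mib * 9 // 10


def proposed_limit_mib(total_mib, floor_mib=2560):
    """Proposed rule: reserve 10% of system memory or 2.5GB, whichever is larger."""
    reserve = max(total_mib // 10, floor_mib)
    return max(total_mib - reserve, 0)


if __name__ == "__main__":
    for total in (8192, 16384, 65536):
        print(total, current_limit_mib(total), proposed_limit_mib(total))
```

On a nominal 8GB machine the current rule allows about 7.2GB while the proposed rule allows 5.5GB, which happens to be the setting reported as working above. The 2.5GB floor stops mattering once the machine has more than 25GB, where 10% becomes the larger reserve.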
2021-09-24, 06:46   #2
Nick

Dec 2012
The Netherlands

17·103 Posts

The information on this page is several years old but might possibly help.

2021-09-24, 08:31   #3
preda

"Mihai Preda"
Apr 2015

10101011101₂ Posts

Quote:
 Originally Posted by Prime95 Question 1: It seems I have no swapfile (makes sense as there is no hard disk, only an NFS mount). So why does VIRT (virtual memory?) show 6.8GB and RES (resident memory?) show 6GB.
In my understanding, VIRT in "top" for a process indicates "virtual" memory that is not mapped to physical memory. That would happen for example after a malloc() but before writing anything to the malloc'ed range. And RES is memory mapped to physical pages. Thus, VIRT is limited by the virtual address range (huge), not by the physical memory or swap.

OTOH it's a good idea to have a swap set up, in order for it to collect the "dead" regions of allocated memory (freeing the physical memory from junk). Creating the swap does not require a swap *partition*, can be done in a file using "mkswap" and "swapon".
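The VIRT/RES difference can be observed directly. Below is a rough sketch, assuming a Linux machine with /proc mounted: it maps an anonymous region, then touches every page, and reads VmSize (what top shows as VIRT) and VmRSS (what top shows as RES) at each step. Note that VIRT counts all mapped address space, including the part that is already resident. The helper names and the 256 MiB default are arbitrary choices for illustration.

```python
import mmap


def parse_vm_kib(status_text):
    """Extract VmSize (top's VIRT) and VmRSS (top's RES), in KiB, from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])
    return fields


def read_self():
    """Read the current process's VmSize and VmRSS (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_vm_kib(f.read())


def demo(size=256 * 1024 * 1024):
    """Return (before, after_mapping, after_touching) memory readings."""
    before = read_self()
    # Anonymous private mapping: address space is assigned, no physical pages yet.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    mapped = read_self()
    # Writing one byte per page forces the kernel to back each page with RAM.
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1
    touched = read_self()
    region.close()
    return before, mapped, touched
```

Calling `demo()` on Linux should show VmSize jumping by roughly the mapped size right after the mapping is created, while VmRSS only catches up after the pages have been written.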

2021-09-27, 02:27   #4
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

16726₈ Posts

Update. I've worked on reducing the peak memory usage during stage 2 init in 30.7. Top now reports this with stage 2 allowed to use 6GB. Roughly the same numbers as the previous 30.7 version with 5.5GB memory allowance.
Code:
top - 22:20:00 up 321 days,  5:13,  0 users,  load average: 3.99, 3.55, 3.29
Tasks: 119 total,   1 running, 118 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.1 sy, 99.8 ni,  0.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  7860544 total,   138896 free,  7452364 used,   269284 buff/cache
KiB Swap:        0 total,        0 free,        0 used.    80212 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
25939 george    30  10 6787548 6.004g   5936 S 399.7 80.1  32:02.13 mprime307
25958 george    20   0   43340   3644   3072 R   0.3  0.0   0:00.03 top
    1 root      20   0  121836   5952   1980 S   0.0  0.1   1:47.94 systemd
I'll see if this suffers from the random "Killed" problem overnight. Next I'll try creating a small swap file.

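To put a number on how little headroom this leaves, here is a small sketch that parses the two memory lines of the top output in this post. The parser and its key names are illustrative assumptions, not part of top or mprime.

```python
import re

# Memory lines copied from the "top" output in this post.
TOP_MEMORY_LINES = """\
KiB Mem :  7860544 total,   138896 free,  7452364 used,   269284 buff/cache
KiB Swap:        0 total,        0 free,        0 used.    80212 avail Mem
"""

FIELD = re.compile(r"(\d+)\s+(total|free|used|buff/cache|avail Mem)")


def parse_top_memory(top_text):
    """Return the KiB figures from top's 'KiB Mem' and 'KiB Swap' lines as a dict."""
    stats = {}
    for line in top_text.splitlines():
        line = line.strip()
        if line.startswith("KiB Mem"):
            prefix = "mem"
        elif line.startswith("KiB Swap"):
            prefix = "swap"
        else:
            continue
        for value, name in FIELD.findall(line):
            # "avail Mem" is printed on the swap line but describes RAM.
            key = "avail" if name == "avail Mem" else f"{prefix}_{name}"
            stats[key] = int(value)
    return stats


stats = parse_top_memory(TOP_MEMORY_LINES)
accounted = stats["mem_free"] + stats["mem_used"] + stats["mem_buff/cache"]
print(f"free + used + buff/cache = {accounted} KiB of {stats['mem_total']} KiB")
print(f"available: {stats['avail'] / 1024:.1f} MiB, swap: {stats['swap_total']} KiB")
```

With these numbers, free, used and buff/cache add up exactly to the total, and only about 78 MiB is reported as available with no swap behind it. For comparison, a later post in this thread puts the extra memory needed to write a stage 2 save file at roughly 100MB.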
2021-09-27, 03:39   #5
retina
Undefined

"The unspeakable one"
Jun 2006
My evil lair

6287₁₀ Posts

Quote:
 Originally Posted by preda In my understanding, VIRT in "top" for a process indicates "virtual" memory that is not mapped to physical memory. That would happen for example after a malloc() but before writing anything to the malloc'ed range. And RES is memory mapped to physical pages. Thus, VIRT is limited by the virtual address range (huge), not by the physical memory or swap.
Yeah, that is basically correct. I've seen processes have TiB of VIRT without any issue. It can be safely ignored. But RES, that's where the action happens. You can freely allocate PiB of memory (so I have read, not tried it myself) and all is well until you try to actually use too much of it.

ETA: Linux allocations are equivalent to Windows with the MEM_RESERVE flag set, and the MEM_COMMIT flag unset. With the difference being Linux auto-commits, whereas Windows will GPF upon access.
Quote:
 Originally Posted by preda OTOH it's a good idea to have a swap set up, in order for it to collect the "dead" regions of allocated memory (freeing the physical memory from junk). Creating the swap does not require a swap *partition*, can be done in a file using "mkswap" and "swapon".
IMO swap is of no value if you have nowhere to swap to.

And IMO also swap is of no value even when there is somewhere to swap to, just buy more RAM. YMMV :P

Last fiddled with by retina on 2021-09-27 at 03:46 Reason: Add Windows equivalence

2021-09-27, 06:04   #6
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2·3·19·67 Posts

More investigation. These Linux machines have 512 huge pages (which I think are 2MB each). Thus 1GB may not be accessible for P-1 stage 2. When large page support was wedged into gwnum, the sin/cos tables and one gwnum were allocated with large pages. This was adequate in a world where LL testing was the norm.

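The huge page arithmetic can be checked against /proc/meminfo rather than guessed. A minimal sketch follows; the sample text is made up to match the figures in this thread (512 pages, 2048 kB each, and the MemTotal from the top output), and the function name is hypothetical.

```python
# Illustrative /proc/meminfo excerpt; the values are assumptions matching this thread.
SAMPLE_MEMINFO = """\
MemTotal:        7860544 kB
HugePages_Total:     512
HugePages_Free:      512
Hugepagesize:       2048 kB
"""


def parse_meminfo(meminfo_text):
    """Map each /proc/meminfo field name to its leading integer value."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields


def hugepage_reserved_kib(meminfo_text):
    """KiB set aside for the huge page pool: page count times page size."""
    fields = parse_meminfo(meminfo_text)
    return fields.get("HugePages_Total", 0) * fields.get("Hugepagesize", 0)


reserved = hugepage_reserved_kib(SAMPLE_MEMINFO)
total = parse_meminfo(SAMPLE_MEMINFO)["MemTotal"]
print(f"huge page pool: {reserved} KiB, left for normal pages: {total - reserved} KiB")
```

If the pages really are 2MB each, the pool is exactly 1GiB, leaving about 6.5GiB of the 7860544 KiB total for ordinary allocations. On a real machine, pass the contents of /proc/meminfo instead of the sample text.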
2021-09-28, 18:13   #7
retina
Undefined

"The unspeakable one"
Jun 2006
My evil lair

6287₁₀ Posts

Quote:
 Originally Posted by retina You can freely allocate PiB of memory (so I have read, not tried it myself) ...
That last part in brackets is incorrect. I got curious so I tried it.

I allocated 2PiB of RAM on the smallest machine I could find with 4GiB installed.
Code:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
16894 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16906 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16914 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16922 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16930 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16938 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16946 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.11 mem_reserve_tes
16954 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16962 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16970 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16978 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16990 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
16998 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
17006 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.11 mem_reserve_tes
17014 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
17017 retina    20   0  0.125p   2052      4 S   0.0  0.1   0:00.10 mem_reserve_tes
Tasks can allocate a maximum of 127TiB each.
Code:
~ for x in {1..16} ; do ./mem_reserve_test & sleep 1 ; done
[1] 16894
7ffec0000000
[2] 16906
7ffec0000000
[3] 16914
7ffec0000000
[4] 16922
7ffec0000000
[5] 16930
7ffec0000000
[6] 16938
7ffec0000000
[7] 16946
7ffec0000000
[8] 16954
7ffec0000000
[9] 16962
7ffec0000000
[10] 16970
7ffec0000000
[11] 16978
7ffec0000000
[12] 16990
7ffec0000000
[13] 16998
7ffec0000000
[14] 17006
7ffec0000000
[15] 17014
7ffec0000000
[16] 17017
7ffec0000000
So that is the full 47 bits of address space maxed out, 16 times over.
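The arithmetic behind those figures, as a quick sketch. It assumes the printed value 7ffec0000000 is the number of bytes each task managed to reserve; the function name is made up for illustration.

```python
PRINTED_HEX = "7ffec0000000"  # value printed by each mem_reserve_test task above
TASKS = 16


def summarize(printed_hex=PRINTED_HEX, tasks=TASKS):
    """Convert the printed byte count into TiB per task and PiB across all tasks."""
    reserved = int(printed_hex, 16)
    return {
        "bytes_per_task": reserved,
        "tib_per_task": reserved / 2**40,
        "short_of_47_bits_gib": (2**47 - reserved) / 2**30,
        "total_pib": tasks * reserved / 2**50,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Under that assumption each task reserved 2^47 bytes minus 5GiB, so a little under 128TiB, which top rounds to 0.125p. Sixteen of them come to just short of 2PiB.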

And the test code in case anyone cares to try it.
Code:
ALLOC_START		= 1 shl 32
ALLOC_STEPS		= 1 shl 30

format elf64 executable 0 at 1 shl 16
entry main

HANDLE_STD_OUTPUT	= 1

; see /usr/include/asm/unistd_64.h
SYS64_WRITE		= 1
SYS64_MMAP		= 9
SYS64_NANOSLEEP		= 35
SYS64_EXIT		= 60

MMAP_PROT_WRITE		= 0x2
MMAP_MAP_PRIVATE	= 0x2
MMAP_MAP_FIXED		= 0x10
MMAP_MAP_ANONYMOUS	= 0x20
MMAP_MAP_FIXED_NOREPLACE= 0x100000

main:
mov	r15,ALLOC_START
mov	r14,ALLOC_STEPS
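.alloc_loop:
mov	rdi,r15			;addr (next fixed address to reserve)
xor	r9,r9			;offset
or	r8,-1			;fd
mov	r10,MMAP_MAP_PRIVATE or MMAP_MAP_ANONYMOUS or MMAP_MAP_FIXED or MMAP_MAP_FIXED_NOREPLACE
mov	edx,MMAP_PROT_WRITE	;prot
mov	rsi,r14			;size
mov	eax,SYS64_MMAP
syscall
cmp	rax,r15			;mmap returns the requested address on success
jnz	.done
add	r15,r14			;advance to the next block
jmp	.alloc_loop
.done:
mov	rax,-ALLOC_START
add	rax,r15			;rax = total bytes reserved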
call	print_hex
mov	al,10
call	print_char
mov	eax,60
call	sleep
mov	rax,SYS64_EXIT
xor	edi,edi
syscall

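print_hex:
mov	rdi,rsp
sub	rsp,32			;scratch buffer for the digits
dec	rdi
mov	byte[rdi],0		;string terminator
mov	ecx,16
.next_digit:
xor	edx,edx
div	rcx
xchg	rdx,rax			;rax = digit, rdx = remaining value
lea	ebx,[eax+'a'-10]	;letter form for digits 10..15
add	al,'0'			;decimal form for digits 0..9
cmp	al,'9'
cmova	eax,ebx
dec	rdi
mov	[rdi],al
test	rdx,rdx
mov	rax,rdx
jnz	.next_digit
mov	rax,rdi
call	print_string
add	rsp,32			;release the scratch buffer
ret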

print_string:
mov	rdi,rax
.next_char:
mov	al,[rdi]
test	al,al
jz	.done
push	rdi
call	print_char
pop	rdi
inc	rdi
jmp	.next_char
.done:
ret

print_char:
push	rax
mov	eax,SYS64_WRITE
mov	edi,HANDLE_STD_OUTPUT
mov	rsi,rsp
mov	edx,1
syscall
pop	rax
ret

sleep:
push	0 rax
mov	eax,SYS64_NANOSLEEP
mov	rdi,rsp
xor	esi,esi
syscall
pop	rax rax
ret
Old school assembly with no macros or other modern rubbish like calling standards or anything. You'll need fasm to compile it. It has a whopping 350 byte executable size.

2021-09-28, 19:44   #8
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2×3×19×67 Posts

Quote:
 Originally Posted by retina Old school assembly with no macros .
Manly programming

More on my problem. Latest change was killed again. Found that writing a save file during stage 2 requires 2 more temporaries (~100MB). I've written a "fix" that immediately frees that memory back to the OS. So far, top is reporting a pretty steady 6.01GB and haven't been killed in the last 12 hours.

I do have an idea to reduce the cost of a save file create to one gwnum or perhaps even less (~12MB). Not going to happen for 30.7.


All times are UTC.
