#1 |
Apr 2018
USA
13 Posts |
I have good history with the system throwing the errors. After every output I get this: Hardware errors have occurred during the test! 1 Jacobi error.
This started after I ran fstrim on the file system that mprime and its files reside on, while mprime was running. Because of wear-leveling algorithms, SSDs have no native way to tell which parts of the file system are no longer in use by the operating system, and vice versa. Fstrim is a program that marks the reusable areas of a file system so the SSD firmware knows it can reuse them. I suspect there is a flaw somewhere in the fstrim>kernel>filesystem>mprime>filesystem chain, such that fstrim wrongly marks parts of mprime's files as not in use. Since the problem seems unique to mprime, it is possible that it uses some old kernel calls that fail under certain more recently developed circumstances, or with less sophisticated file formats. I am really not fit to troubleshoot this possibility, but I will say it is probably better to close mprime before running fstrim.
#2 |
Nov 2019
1018 Posts |
It is possible that the SSD has buggy firmware.
The fstrim command should only tell the SSD to TRIM unallocated space, unless there is a kernel bug. A workaround may be to disable automatic TRIM (I think it would be in the mount options) and only run it monthly or similar. If the error persists without running TRIM, you may actually have an unrelated hardware error (I would guess RAM).
Last fiddled with by phillipsjk on 2019-11-16 at 00:45 Reason: Grammar, spelling
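To check whether automatic (online) TRIM is enabled through the mount options, one rough approach is to look for the `discard` option in the mount table. The following is only a sketch: it assumes text in the `/proc/mounts` layout (device, mount point, file system type, comma-separated options), and the function name `mounts_with_discard` is made up for illustration.

```python
def mounts_with_discard(mounts_text):
    """Return (mount_point, fs_type) pairs whose options include 'discard'.

    Assumes /proc/mounts-style lines:
    device mount_point fs_type options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        opts = options.split(",")
        # Accept plain 'discard' as well as 'discard=...' variants.
        if any(o == "discard" or o.startswith("discard=") for o in opts):
            found.append((mount_point, fs_type))
    return found
```

On Linux, the function could be fed the contents of `/proc/mounts`. An empty result would suggest TRIM only happens when fstrim is run explicitly or by a scheduled job.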
#3 |
Apr 2018
USA
13 Posts |
The problem occurred again. No fstrim was run between times.
Code:
[Worker #1 Feb 21 18:38] Iteration: 37610000 / 101988773 [36.87%], ms/iter: 52.999, ETA: 39d 11:47
[Worker #1 Feb 21 18:38] Hardware errors have occurred during the test!
[Worker #1 Feb 21 18:38] 1 Gerbicz/double-check error.
[Worker #1 Feb 21 18:38] Confidence in final result is excellent.
[Worker #1 Feb 21 18:40] Gerbicz error check passed at iteration 37611256.
[Worker #3 Feb 21 18:40] M103931309 stage 1 is 32.05% complete. Time: 467.809 sec.
[Worker #4 Feb 21 18:41] Iteration: 9890000 / 103946203 [9.51%], ms/iter: 45.156, ETA: 49d 03:46
[Worker #2 Feb 21 18:45] Iteration: 35440000 / 101992529 [34.74%], ms/iter: 44.817, ETA: 34d 12:31

If I remember in a few months--when I trim the file system next--I'll completely exit mprime, and see if that makes a difference. I predict it will!
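To line up error reports against the times fstrim was run, the error-count lines could be pulled out of saved mprime output. This is a minimal sketch that assumes the line layout seen in the excerpt above (`[Worker #N Mon DD HH:MM] <count> <kind> error.`). The name `hardware_errors` is hypothetical, and other mprime versions may word these lines differently.

```python
import re

# Assumed layout, taken from the log excerpt in this thread, e.g.
# "[Worker #1 Feb 21 18:38] 1 Gerbicz/double-check error."
ERROR_LINE = re.compile(
    r"\[Worker #(?P<worker>\d+) (?P<stamp>[A-Z][a-z]{2} +\d+ \d\d:\d\d)\] "
    r"(?P<count>\d+) (?P<kind>.+?) errors?\."
)


def hardware_errors(log_text):
    """Return (worker, timestamp, count, kind) for each error-count line."""
    results = []
    for line in log_text.splitlines():
        m = ERROR_LINE.search(line)
        if m:
            results.append(
                (int(m["worker"]), m["stamp"], int(m["count"]), m["kind"])
            )
    return results
```

Comparing the extracted timestamps with the shell history or journal entries for fstrim would show whether the errors really cluster around trim runs.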
#4 |
Undefined
"The unspeakable one"
Jun 2006
My evil lair
6,449 Posts |
The data in RAM is being corrupted, thus you get the error reported.
So if you are sure it is related to fstrim, then there are a number of possible causes: a buggy driver (already mentioned above), a bad PSU dropping voltage when the drive is drawing more current during the trim, overheating of the system during trim, etc. But also be open to the idea that trim is just a coincidence. It could be a flaky RAM stick, cosmic ray upsets, alpha decay in the RAM packaging, overzealous clocking of some part, etc.
Last fiddled with by retina on 2020-02-22 at 04:58
#5 |
Apr 2018
USA
13 Posts |
Well, I doubt it's the PSU, because it's a laptop, and the mprime program itself requires more power than executing the trim command. The drive passes every test of its functionality. The problem only occurs with the combination of mprime and fstrim. And now the problem has mysteriously disappeared without even the most insignificant hardware change.
I doubt the RAM was being overwritten, because that has nothing to do with the issue, and if it were the cause, it would occur in other scenarios. Alpha particles were a problem for system memory in the 1970s, so they are probably not currently relevant. I surmise that the program, to avoid making huge files outright, uses sparse files, and that fstrim doesn't handle sparse files well if they are open for r/w. Mprime, when stopped temporarily, presumably keeps its files open. When the mprime program is quit using the menu item, it writes its data and closes the files. Then fstrim has no trouble determining the correct boundaries. Or, since I'm guessing, I might be completely incorrect! I want to thank the contributors to this discussion thread for sparking my mind to think.
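The sparse-file guess can at least be checked. A file is commonly treated as sparse when fewer bytes are allocated on disk than its logical size. Below is a rough sketch of that comparison. The helper names are invented for illustration, and it assumes a Unix-like system where `os.stat` reports `st_blocks` in 512-byte units. File system compression can also make a file look sparse by this measure, so treat the result as a hint only.

```python
import os


def looks_sparse(size_bytes, allocated_blocks, block_unit=512):
    """True when less space is allocated on disk than the logical file size."""
    return allocated_blocks * block_unit < size_bytes


def file_looks_sparse(path):
    """Apply the check to a real file (hypothetical helper, Unix-like only)."""
    st = os.stat(path)
    # st_blocks is reported in 512-byte units on Linux; absent on Windows.
    return looks_sparse(st.st_size, st.st_blocks)
```

Running `file_looks_sparse` on the save files in mprime's working directory would show whether any of them are sparse at all. If none are, that part of the theory can be dropped.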