IPFire went down last night, can't find cause

Jon,

Here’s the S.M.A.R.T info:

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.15.59-ipfire] (IPFire 2.27)
Copyright (C) 2002-22  Bruce Allen  Christian Franke  www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     KingDian M280 120GB
Serial Number:    A4720783073000208341
LU WWN Device Id: 0 000000 000000000
Firmware Version: SBFM61.2
User Capacity:    120 034 123 776 bytes [120 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      mSATA
TRIM Command:     Available
Device is:        Not in smartctl database 7.3/5319
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.2  6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Oct 18 13:51:58 2022 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       34376
 12 Power_Cycle_Count       0x0012   100   100   000    Old_age   Always       -       374
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       0
170 Unknown_Attribute       0x0003   095   095   000    Pre-fail  Always       -       52
173 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       -       5636205
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       -       321
194 Temperature_Celsius     0x0023   067   067   000    Pre-fail  Always       -       33 (Min/Max 33/33)
218 Unknown_Attribute       0x000b   100   100   050    Pre-fail  Always       -       0
231 Unknown_SSD_Attribute   0x0013   100   100   000    Pre-fail  Always       -       97
241 Total_LBAs_Written      0x0012   100   100   000    Old_age   Always       -       10979

I swapped out the whole system, not just a drive, because I didn’t know if it was specifically a drive issue.

I have the old system on my desk here and I ran some tests on it. Both the unit in production and the unit on my desk have identical hardware: Model 101S-6 Supermicro SYS-E200-9B, 4-core Intel Pentium N3700 1.6 GHz, 8 GB RAM, 120 GB M.2 SATA SSD, 4x Intel Ethernet Controller I210 Gigabit Network Connection (rev 03).

On the production machine:

[root@ipfire log]# hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   3092 MB in  1.99 seconds = 1552.27 MB/sec
 Timing buffered disk reads: 1174 MB in  3.00 seconds = 391.22 MB/sec
[root@ipfire log]#

On the machine on my desk with no users connected:

[root@ipfire log]# hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   2236 MB in  2.00 seconds = 1117.89 MB/sec
 Timing buffered disk reads: 230 MB in  3.01 seconds = 76.5 MB/sec
[root@ipfire log]#

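That is quite a performance difference: buffered disk reads on the old unit are roughly five times slower (76.5 MB/sec vs. 391.22 MB/sec), even though it has no users connected.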
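
To put a number on the gap, here is a minimal Python sketch that pulls the buffered-read figure out of each hdparm output pasted above and divides one by the other. The function name `buffered_read_rate` is only illustrative, and the regular expression assumes the output format shown in this post.

```python
import re


def buffered_read_rate(hdparm_output: str) -> float:
    """Return the buffered disk read rate in MB/sec from `hdparm -tT` output."""
    # Assumes the line format shown in this post:
    #   Timing buffered disk reads: 1174 MB in  3.00 seconds = 391.22 MB/sec
    match = re.search(
        r"Timing buffered disk reads:.*=\s*([\d.]+)\s*MB/sec", hdparm_output
    )
    if match is None:
        raise ValueError("no buffered disk read line found")
    return float(match.group(1))


# Output copied from the production machine.
production = """/dev/sda:
 Timing cached reads:   3092 MB in  1.99 seconds = 1552.27 MB/sec
 Timing buffered disk reads: 1174 MB in  3.00 seconds = 391.22 MB/sec"""

# Output copied from the old machine on the desk.
desk = """/dev/sda:
 Timing cached reads:   2236 MB in  2.00 seconds = 1117.89 MB/sec
 Timing buffered disk reads: 230 MB in  3.01 seconds = 76.5 MB/sec"""

ratio = buffered_read_rate(production) / buffered_read_rate(desk)
print(f"Production buffered reads are {ratio:.1f}x faster than the desk unit")
```

A single hdparm run can vary quite a bit, so it is probably worth repeating the test a few times on each machine before reading too much into the ratio.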


EDIT: added block code formatting - moderator