The server that hosted this website and a bunch of my data-hoarding and data-intensive operations has decided to fuck off, with roughly this failure:
[ 134.611236] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 32992
[ 134.611245] {1}[Hardware Error]: event severity: fatal
[ 134.611261] {1}[Hardware Error]: Error 0, type: fatal
[ 134.611266] {1}[Hardware Error]: section_type: PCIe error
[ 134.611269] {1}[Hardware Error]: port_type: 4, root port
[ 134.611272] {1}[Hardware Error]: version: 1.0
[ 134.611275] {1}[Hardware Error]: command: 0x0547, status: 0x4010
[ 134.611279] {1}[Hardware Error]: device_id: 0000:00:01.0
[ 134.611284] {1}[Hardware Error]: slot: 0
[ 134.611286] {1}[Hardware Error]: secondary_bus: 0x01
[ 134.611289] {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x3c02
[ 134.611291] {1}[Hardware Error]: class_code: 060400
[ 134.611294] {1}[Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0003
[ 134.611297] {1}[Hardware Error]: aer_cor_status: 0x00000001, aer_cor_mask: 0x000031c1
[ 134.611300] {1}[Hardware Error]: aer_uncor_status: 0x00000020, aer_uncor_mask: 0x00318000
[ 134.611304] {1}[Hardware Error]: aer_uncor_severity: 0x00067030
[ 134.611306] {1}[Hardware Error]: TLP Header: 00000000 00000000 00000000 00000000
[ 134.611312] GHES: Fatal hardware error but panic disabled
[ 134.611315] Kernel panic - not syncing: GHES: Fatal hardware error
Backups exist, but not for all the crap I ran there, so I'm on a data diet until we attempt a replacement two weeks from now.