Determine shutdown cause

Hi all, I’ve got a cheap Celeron box running OPNsense and it’s been pretty good so far, but twice now I’ve found that the device turned off at some point while I was at work, and I have been unable to figure out what’s causing it.

The only change was that I enabled Monit to see if I could figure out what was causing crowdsec to stop sometimes but never ended up configuring anything. I’ve only been running it for a couple months though, so it’s possible that that is not related.

I know that on a Mac (based on FreeBSD, right?) you can determine whether the shutdown reason was a hard shutdown, a regular shutdown, or the power cable being unplugged. Is it possible to do that with OPNsense? Ideally I’d like to narrow it down to software or hardware.

doctorzeromd,

UPDATE: It crashed again today, and I was able to pull some logs and check the temperature at the time of the crash (91 degrees, which dropped to 71 degrees right before crashing?): https://lemmy.world/pictrs/image/3c651ac1-8312-403f-8f76-23895916ba04.png

From system log


<13>1 2024-03-13T18:30:44-04:00 OPNsense.my.home opnsense 44846 - [meta sequenceId="1192"] /usr/local/etc/rc.newwanipv6: No IP change detected (current: IPV6ADDRESSREDACTED, interface: wan)
<13>1 2024-03-13T18:30:53-04:00 OPNsense.my.home opnsense 60522 - [meta sequenceId="1193"] /usr/local/etc/rc.newwanipv6: No IP change detected (current: IPV6ADDRESSREDACTED, interface: wan)
<45>1 2024-03-13T22:12:44-04:00 OPNsense.my.home syslog-ng 10182 - [meta sequenceId="1"] syslog-ng starting up; version='4.6.0'
<13>1 2024-03-13T22:12:45-04:00 OPNsense.my.home kernel - - [meta sequenceId="2"] ---<<BOOT>>---
<13>1 2024-03-13T22:12:45-04:00 OPNsense.my.home kernel - - [meta sequenceId="138"] WARNING: / was not properly dismounted

From dmesg


arp: 192.168.1.61 moved from someMAC to anotherMAC on igc1
arp: 192.168.1.61 moved from anotherMAC to someMAC on igc1
WARNING: / was not properly dismounted
WARNING: /: mount pending error: blocks 40 files 4

I mean, I’m not saying that errors on the drive are the CAUSE of the problem, more likely a symptom, but it does look like it just straight up crashed, right?
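For anyone wanting to run the same check over a longer log, here is a rough Python sketch. It flags long silent gaps between consecutive log lines and notes whether the "was not properly dismounted" warning appears. The function names and the 5-minute threshold are my own assumptions, and the sample lines are the ones quoted above. A long gap on its own could just be a quiet period, so treat the result as a hint rather than proof.

```python
import re
from datetime import datetime, timedelta

# Warning text taken from the log excerpt quoted in this thread.
DIRTY_MARKER = "was not properly dismounted"

# Lines look like: <PRI>VERSION TIMESTAMP HOST ...
LINE_RE = re.compile(r"^<\d+>\d+ (\S+) ")


def parse_timestamp(line):
    """Return the timestamp of a log line, or None if the line does not match."""
    match = LINE_RE.match(line)
    if not match:
        return None
    try:
        return datetime.fromisoformat(match.group(1))
    except ValueError:
        return None


def analyze(lines, min_gap=timedelta(minutes=5)):
    """Return (gaps, dirty): silent gaps of at least min_gap between
    consecutive timestamped lines, and whether the unclean-dismount
    warning was seen anywhere in the input."""
    gaps = []
    dirty = False
    previous = None
    for line in lines:
        if DIRTY_MARKER in line:
            dirty = True
        stamp = parse_timestamp(line)
        if stamp is None:
            continue  # skip lines without a recognizable timestamp
        if previous is not None and stamp - previous >= min_gap:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps, dirty


SAMPLE = [
    '<13>1 2024-03-13T18:30:53-04:00 OPNsense.my.home opnsense 60522 - [meta sequenceId="1193"] /usr/local/etc/rc.newwanipv6: No IP change detected (current: IPV6ADDRESSREDACTED, interface: wan)',
    '<45>1 2024-03-13T22:12:44-04:00 OPNsense.my.home syslog-ng 10182 - [meta sequenceId="1"] syslog-ng starting up; version=\'4.6.0\'',
    '<13>1 2024-03-13T22:12:45-04:00 OPNsense.my.home kernel - - [meta sequenceId="2"] ---<<BOOT>>---',
    '<13>1 2024-03-13T22:12:45-04:00 OPNsense.my.home kernel - - [meta sequenceId="138"] WARNING: / was not properly dismounted',
]

gaps, dirty = analyze(SAMPLE)
for start, end in gaps:
    print(f"silent from {start} to {end} ({end - start})")
print("unclean dismount warning seen:", dirty)
```

To use it on a real log, pass the lines of the exported system log to `analyze()` instead of `SAMPLE`.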

doctorzeromd,

Final Update: it’s the hardware. I think it was overheating in general, but the SSD also seems to have been dying and the RAM wasn’t particularly reliable, possibly due to the heat.

Good lesson not to buy the cheapest thing from AliExpress! My new box is working great.

342345,

No clue. :)

You haven’t mentioned the logs. Any hints there? System / Log Files / General. At the very least you can see there what a regular reboot/shutdown should look like.

Is there a second device on the same outlet that writes logs or shows its uptime? That would rule out power outages.

BIOS settings: is there a setting to power on the PC when power is reconnected (in case it was an outage)?

doctorzeromd,

It’s plugged into a power strip that other devices are plugged into. I did turn on “power on on ac restore”, so if it is power related it should come back and I’ll see the downtime in uptimekuma.

The system logs go straight from No IP Change detected to the next boot, so a crash or failure seems likely. If something told the computer to shut down, I should see that in the logs, right?

It’s a passively cooled computer, is there any way that I can determine whether a high temp forced the computer down?

342345,

> The system logs go straight from No IP Change detected to the next boot, so a crash or failure seems likely.

I think so. If it had been shut down in an orderly way, there should be log entries for the shutdown.

> It’s a passively cooled computer, is there any way that I can determine whether a high temp forced the computer down?

Some BIOSes have logging. I remember an ASRock board with a BMC that remembered "CPU too hot" events. It depends on the board, but normally I would say: I don’t think so.

If it is a hardware issue: boot a live Linux from a USB stick. Memtest86, a long SMART test, fsck, a CPU burn-in test, or a network load test could show failures. But this is just wild guessing on my side at this point. Sorry.

CPU temps in OPNsense: System / Settings / Misc / Thermal. Not helpful but maybe interesting.
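If nothing on the board records thermal events, one workaround is to write the temperature to disk every few seconds, so the last reading before a crash survives. This is only a sketch under assumptions: it assumes the FreeBSD `dev.cpu.0.temperature` sysctl is available (it needs a CPU temperature driver such as coretemp loaded), and the log path and interval are made-up defaults.

```python
import os
import re
import subprocess
import time
from datetime import datetime

# sysctl reports temperatures as text such as "71.0C".
TEMP_RE = re.compile(r"^(-?\d+(?:\.\d+)?)C$")


def parse_temperature(text):
    """Convert sysctl output such as '71.0C' to a float, or None if unparseable."""
    match = TEMP_RE.match(text.strip())
    return float(match.group(1)) if match else None


def read_cpu_temperature(oid="dev.cpu.0.temperature"):
    """Ask sysctl for one reading; return None if the OID or sysctl is unavailable."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", oid],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_temperature(result.stdout)


def log_forever(path="/var/log/cpu_temp.log", interval=30):
    """Append one timestamped reading every `interval` seconds.

    The path is a hypothetical example. Each write is flushed and fsynced
    so that the most recent reading is on disk if the box dies abruptly.
    """
    while True:
        reading = read_cpu_temperature()
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(path, "a") as handle:
            handle.write(f"{stamp} {reading}\n")
            handle.flush()
            os.fsync(handle.fileno())
        time.sleep(interval)
```

Calling `log_forever()` starts the loop. After the next crash, the last lines of the file show how hot the CPU was shortly before the box went down.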
