justpassingby

@justpassingby@sh.itjust.works

Just passing by

This profile is from a federated server and may be incomplete. Browse more on the original instance.

justpassingby,

Thanks for the quick reply 👍

The next step I would try would be to boot another install, like a live USB or Raspbian, on the same USB port: if it boots properly, that completely rules out a hardware problem.

Good advice. I moved the USB drive from another (working) Pi and attached it to the same USB port. It boots correctly, so it is neither the USB port nor the power.

If it is a software problem, it seems to happen very early in the boot process, so my bet would be a corrupted initramfs/initrd (or whatever the equivalent is on a Pi). No idea how you could debug and fix that on Ubuntu, though (especially on a Pi, where /boot is… different).

I believe it is something like that. Or it is not mounting the drive correctly and not finding it, or it is something else entirely. I just wish there was a better (or any) error printed on the console. I tried to attach a keyboard to get to a shell, with no success. I could simply reformat the drive and do a clean install, but that is the last resort. I would like to understand what happened so I can learn from it and avoid it in the future (or learn a path to fix it).

justpassingby,

I could format it and try, but the moment I do, I lose the ability to debug the issue and learn from the problem. So I may solve it temporarily, but it may happen again. I wonder if someone knows what I can check/test/run to identify what broke.
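To make the question concrete, this is roughly what I mean by "check/test/run"; a rough sketch, assuming the broken drive shows up as /dev/sda when attached to another Linux machine and that the first partition is the Ubuntu system-boot (FAT) one:

```
# On another Linux machine, identify the attached drive (assumed /dev/sda here)
lsblk -f

# Report-only filesystem checks, nothing gets written
sudo fsck -n /dev/sda1
sudo fsck -n /dev/sda2

# Mount the system-boot partition read-only and inspect it
sudo mkdir -p /mnt/pi-boot
sudo mount -o ro /dev/sda1 /mnt/pi-boot
ls -la /mnt/pi-boot    # kernel, initrd, config.txt, cmdline.txt, ...
```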

justpassingby,

Hi, just writing it here in case you were curious about the cause of the problem.

I finally had some time today to work on it. What I tried was to copy the content of the system-boot partition of another Pi, one that I configured at the same time/day, and compare it file by file with the content of the partition on the broken one. There were some diffs in some binary files, as expected, but what surprised me was that one file was ONLY present in the working one… initrd.img XD Now, don't ask me why the hell the file was not there. Maybe it got corrupted somehow, since as far as I am aware nothing touches this partition except at boot time. Luckily, there was a .bak file present in the partition, which I renamed and… it worked.
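The comparison itself was nothing fancy; a rough sketch, assuming a copy of the working Pi's /boot/firmware (where Ubuntu for Pi mounts the system-boot partition, as far as I know) sits in working-boot/, the broken partition is mounted at /mnt/pi-boot, and the backup file was named initrd.img.bak (hypothetical name):

```
# Copy the boot partition of a working Pi for reference
mkdir working-boot
sudo cp -a /boot/firmware/. working-boot/

# Compare against the broken drive's boot partition
diff -rq working-boot/ /mnt/pi-boot/
# The output pointed at the missing file, e.g. "Only in working-boot/: initrd.img"

# The actual fix: the broken partition still had a backup copy
sudo mount -o remount,rw /mnt/pi-boot                          # make it writable just for this
sudo cp /mnt/pi-boot/initrd.img.bak /mnt/pi-boot/initrd.img    # copy instead of rename, so the .bak stays around
```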

Lesson learned: I now have a copy of the boot partition of every Pi I manage (it is only 167MB before tar-gzipping, so it costs nothing), and I keep it backed up safely on another system. Should another file get corrupted in the future (maybe this time without a .bak), I have an older working copy of it and can restore the service without needing to reformat everything.
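The backup itself is a one-liner; a minimal sketch, again assuming the system-boot partition is mounted at /boot/firmware and that there is a backup host (called backuphost here, purely as a placeholder) reachable over SSH:

```
# Archive the boot partition (run on each Pi)
sudo tar -czf "boot-$(hostname)-$(date +%F).tar.gz" -C /boot/firmware .

# Ship it to another machine for safekeeping
scp "boot-$(hostname)-$(date +%F).tar.gz" backuphost:~/pi-boot-backups/
```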

Thank you so much for your help and advice!

justpassingby,

Two Pi4 Model B 8GB and one Pi4 Model B 4GB. For the OS I am using the latest 64-bit Ubuntu. A word of warning: I am having some issues with Longhorn breaking the filesystem of my pihole pods, making them "read only". I work on this stuff as a job and I still can't find a good explanation of what went wrong. To be honest, the Longhorn GitHub issues are not very helpful, nor are there good logs about it. I am starting to think there are too many microservices working behind the scenes and too many gRPCs. tl;dr: it is hell to debug - I still haven't found a good alternative, however.

justpassingby,

Eh, I will have to find my notes on the issue with the pihole; I can see if I can dig them out this weekend and send them to you (I wonder if you can send PMs in Lemmy ^^).

To stay on the point of this discussion: just this afternoon, and I am not joking, I got hit by this: longhorn.io/…/troubleshooting-volume-with-multipa… The pod (in this case wireguard) was crashing because it could not mount the drive, and the error was something like "already mounted or mount point busy". I had to dig and dig, but I found out the problem was the one above and I fixed it. I will now add that setting to my Ansible and configure all three Pis (see the sketch at the end of this comment).

However, this should not happen in a mature-ish system like Longhorn, which may cater to a userbase that may not know enough to dig into /dev. I think there should be a better way to alert users to such an issue. Just to be clear, the Longhorn UI and logs were nice and dandy, all good on the western front, yet everything was broken. The Longhorn reconciler could have a check along the lines of: if something should be mounted but is not, and the error is "already mounted" while it is not actually already mounted, check for known bugs. But I think the issue is what I said above: it is too fragmented and works with a myriad of other microservices, so Longhorn is like "I gave the order, now whatever".

I will share what is in my longhorn-system namespace; there is no secret in here but I want to give an idea (ps: I do nothing fancy with Longhorn at home - obviously some are DaemonSets, so you see 3 pods because I have 3 nodes):

```
k get pods -n longhorn-system | cut -d' ' -f1
NAME
engine-image-ei-f9e7c473-5pdjx
engine-image-ei-f9e7c473-xq4hn
instance-manager-e-fa08a5ebf4663f1e9fb894f865362d65
engine-image-ei-f9e7c473-gdp6n
instance-manager-e-567b6ba176274fe20a001eec63ce3564
instance-manager-r-567b6ba176274fe20a001eec63ce3564
instance-manager-r-b1d285dd9205d1ba992836073c48db8a
instance-manager-e-b1d285dd9205d1ba992836073c48db8a
daily-keep-for-a-week-28144800-pppw8
longhorn-manager-xqwld
longhorn-ui-f574474c8-n847h
longhorn-manager-cgqvm
longhorn-driver-deployer-6c7bd5bd9b-8skh4
longhorn-manager-tjzvz
instance-manager-d3c9343a8637e4ef197ad6da68b3ed2d
instance-manager-cf746b18d51f6426b74d6c6652f01afc
engine-image-ei-d911131c-wwfwz
engine-image-ei-d911131c-qcn26
instance-manager-e7d92f3ca0455cde2158bebdbb33ea16
engine-image-ei-d911131c-mgb2k
csi-attacher-785fd6545b-bn9lp
csi-attacher-785fd6545b-4nfxz
csi-provisioner-8658f9bd9c-2bq7v
csi-provisioner-8658f9bd9c-q6ctq
csi-attacher-785fd6545b-rx7r9
csi-resizer-68c4c75bf5-tmw2f
csi-resizer-68c4c75bf5-n9dxm
csi-snapshotter-7c466dd68f-7r2x6
csi-snapshotter-7c466dd68f-cd8pm
longhorn-csi-plugin-vgqh5
longhorn-csi-plugin-mnskk
csi-provisioner-8658f9bd9c-kcb8f
csi-resizer-68c4c75bf5-gccfg
csi-snapshotter-7c466dd68f-wsltq
longhorn-csi-plugin-9q9kj
```

The dependency on the csi-* ecosystem sort of lets the errors get lost in translation.
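Back to the multipath fix I mentioned above: this is roughly the change I will be pushing with Ansible; a sketch based on my reading of that KB page, assuming the fix is to stop multipathd from grabbing the devices Longhorn creates by blacklisting them in /etc/multipath.conf:

```
# Append a blacklist for plain sdX devices so multipathd leaves Longhorn volumes alone
# (check first that there is no existing blacklist section to merge with)
cat <<'EOF' | sudo tee -a /etc/multipath.conf
blacklist {
    devnode "^sd[a-z0-9]+"
}
EOF

# Pick up the new config
sudo systemctl restart multipathd.service
```

In Ansible terms this is just a blockinfile task plus a handler that restarts multipathd on all three Pis.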

justpassingby,

Please do not let my bad experience stop you! Longhorn is a nice tool and, as you can read online and in other posts, it works very well. I may have been unlucky, had a bad configuration, or had my Pis under too much pressure. Who knows! My advice is to try something new: k3s, Longhorn, etc. That is what I use the Pis for. I would not use Longhorn at work :D

I'm really not sure what I even want the distributed FS for. I guess I wanted to have redundancy on the pod long term storage, but I have other ways to achieve that.

I am not using replicas :) I use Longhorn for the clean/integrated backup mechanism instead of using something external. Maybe one day, when I have the same-ish disk speed on all 3 Pis, I will enable replicas, but for now I am good like this.

For backups of important stuff, maybe use something else, or ALSO something else. I was personally thinking of adding another backup tool on top of Longhorn, like github.com/backube/volsync or Velero, to have a secondary copy in case something happens. Also, Longhorn keeps getting better. This is hot off the press: github.com/longhorn/longhorn/releases/tag/v1.5.0
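To give an idea of what I mean by a secondary copy, with Velero it would look roughly like this; a sketch, assuming Velero is already installed in the cluster with a backup location configured, and with placeholder namespace names (pihole, wireguard):

```
# One-off backup of the namespaces I care about
velero backup create pi-apps-backup --include-namespaces pihole,wireguard

# Or a daily schedule at 03:00, kept for 7 days
velero schedule create pi-apps-daily \
  --schedule "0 3 * * *" \
  --include-namespaces pihole,wireguard \
  --ttl 168h
```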

My advice? Try it out! Even if it does not work out, it will still be a source of learning and fun (but I am strange, I like to debug stuff).

justpassingby,

Sorry to hear that. If you had NO connection with kubectl, I would have advised you to check the ports; but if it sometimes replies and most of the time does not, it must be something else. Good luck with the debugging, and if you have any specific problem you could also try to create a post in any of the self-hosted communities here on Lemmy. From my experience, people are friendlier and more technical than what we used to have on reddit.
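For the record, by "check the ports" I mean something as basic as this; a sketch, assuming a k3s-style setup where the API server listens on 6443 and <node-ip> stands in for the control-plane node:

```
# Is the API server port reachable at all?
nc -zv <node-ip> 6443

# Does kubectl actually get a response, and from where?
kubectl cluster-info
kubectl get nodes -v=6    # verbose output shows the URLs and HTTP status codes
```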

justpassingby, (edited)

I think the issue with feddit.de is that, "I think", it is blocking federation with this server. I may be totally wrong, since I have only been digging into how this works for about an hour, but I can see this server listed as "blocked" here: https://lemmymap.feddit.de/ I wish there was a list instead of the map, because it is unreadable, but you can search for sh.itjust.works there. Remember to select "blocked" first in the bottom left.

EDIT: Incorrect information. Read comments below.

justpassingby,

Yes (I think). The two approaches above are the ones that come to mind. Again, I am super new here, so I really hope I am not misleading you ^^; And what @manifex just commented (reaching out to the admin).

justpassingby,

Thanks for the link! It is much easier to see the federated servers there.
