How I failed then succeeded my FreeBSD 13.0-RELEASE upgrade
The different steps
To make it short, I have a first zpool that holds nothing but a minimal FreeBSD installation. Once connected to it, I attach the two encrypted partitions of the two disks as an encrypted mirror with geli. Then I define the next root mount point with the kenv command and use reboot -r to do a partial reboot into the newly defined root.
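These three steps can be sketched as a small sh function. This is only an illustration, not the script I actually run: the function name enter_encrypted_root and the DRY_RUN switch are made up for the example, while the provider names and the root dataset are the ones used later in this post. DRY_RUN defaults to 1, so by default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: attach the GELI providers, point the kernel at the new
# root, then re-root. DRY_RUN defaults to 1 so nothing runs by accident.
: "${DRY_RUN:=1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# enter_encrypted_root ROOT_DATASET PROVIDER...
# (hypothetical helper name, not a FreeBSD tool)
enter_encrypted_root() {
    root=$1
    shift
    run geli attach "$@" || return 1          # asks for the passphrase
    run kenv vfs.root.mountfrom="zfs:$root" || return 1
    run reboot -r
}
```

Calling enter_encrypted_root tank/root ada0p4 ada1p4 in dry-run mode prints the three commands, each prefixed with "+ ", so they can be read once more before running them for real with DRY_RUN=0.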
I have a backup for this machine, but I always prefer to make a fresh backup before every important operation. My script is very simple, but it takes hours.
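My real backup script is not shown here. As a much smaller illustration of the same habit, here is a sketch that archives a single directory, such as /etc, into a dated tarball before touching anything. The function name backup_dir and the archive naming scheme are assumptions made for this example.

```shell
#!/bin/sh
# backup_dir SRC DEST_DIR: write a dated tar.gz of SRC into DEST_DIR
# and print the path of the archive. Hypothetical helper, sketch only.
backup_dir() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    name=$(basename "$src")
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest/$name-$stamp.tar.gz"
    # -C keeps the paths in the archive relative to the parent of SRC
    tar -C "$(dirname "$src")" -czf "$archive" "$name" || return 1
    echo "$archive"
}
```

Something like backup_dir /etc /some/backup/dir (the destination is a placeholder) would then run before every important operation.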
The first reboot
Once the backups have finished, I reboot the machine. By default it boots into the small (50 GB), unencrypted zpool named zboot.
The first upgrade
Connected via ssh to this host, I just use the freebsd-update fetch and freebsd-update install commands, read the messages, reboot, update the installed packages, and everything is well.
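One way to confirm that "everything is well" after the reboot is to compare the installed kernel and userland versions with the target release. The sketch below assumes the version strings come from freebsd-version -ku, one per line. The helper name all_at_release is made up for this example.

```shell
#!/bin/sh
# all_at_release TARGET: read version strings on stdin, one per line,
# and succeed only if there is at least one line and every line starts
# with TARGET. Hypothetical helper, sketch only.
all_at_release() {
    target=$1
    seen=0
    while IFS= read -r line; do
        seen=1
        case $line in
            "$target"*) ;;
            *) echo "not at $target: $line" >&2; return 1 ;;
        esac
    done
    [ "$seen" -eq 1 ]
}
```

On the upgraded host this would be used as freebsd-version -ku | all_at_release 13.0-RELEASE. Patch levels such as 13.0-RELEASE-p4 are accepted because only the prefix is compared.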
First step ✓
The second upgrade and how it failed
TL;DR: I can see how it failed, I can’t see why it failed, but… it did.
My goal was to attach the encrypted partitions, define the next root to use and, once connected to the entire machine, proceed with freebsd-update as I had already done. Sounds logical to me.
geli attach ada0p4 ada1p4
Enter passphrase
Did I do a zfs update at this point? I can’t remember, but I most certainly did.
kenv vfs.root.mountfrom="zfs:tank/root"
reboot -r
ssh <host>
sshd, permission denied
Damn something went wrong.
I started to make mistakes at this point. I was tired and wanted to finish this upgrade before going to bed.
THIS IS A VERY BAD IDEA
The third (and all other) reboots
I connected to the Hetzner console and issued a hard reboot to get back to the unencrypted zboot system.
Next, instead of investigating why reboot -r failed and why sshd did not start, I decided to do a “raw” upgrade.
The “raw” upgrade
- download the FreeBSD 13.0-RELEASE archives (base.txz, kernel.txz, lib32.txz, src.txz and ports.txz).
- attach the encrypted partitions and mount the zpool at an alternate root (zpool import -o altroot=/tank/root tank)
- make a little loop to install the new OS on the encrypted partition:
foreach foo ( base.txz kernel.txz lib32.txz src.txz ports.txz doc.txz )
    xz -d -c -v $foo | tar -C /root -xf -
end
- define the next root (with kenv, as before)
- then reboot (with reboot -r, as before)
- No luck, ssh was still unavailable.
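For comparison, the extraction step can be sketched in plain sh instead of csh. This is not what I ran at the time. The function name extract_sets is made up, and the destination is deliberately a parameter: it has to be the directory where the new root is really mounted, which is exactly the kind of detail that is easy to get wrong. tar on FreeBSD recognises the xz compression on its own when reading an archive file, so the xz pipe is not needed.

```shell
#!/bin/sh
# extract_sets DEST ARCHIVE...: unpack each distribution set into DEST.
# Stops at the first problem instead of continuing with a half-installed
# system. Hypothetical helper, sketch only.
extract_sets() {
    dest=$1
    shift
    [ -d "$dest" ] || { echo "not a directory: $dest" >&2; return 1; }
    for archive in "$@"; do
        [ -f "$archive" ] || { echo "missing set: $archive" >&2; return 1; }
        echo "extracting $archive into $dest"
        # tar detects the compression itself when reading the file
        tar -C "$dest" -xpf "$archive" || return 1
    done
}
```

A call would look like extract_sets "$DESTDIR" base.txz kernel.txz lib32.txz, where DESTDIR is a placeholder for the mount point of the new root.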
Many stupid things.
- remove all system directories (/lib, /lib32, /usr, /usr/libexec, /usr/libdata/, /etc, /var) from the encrypted partitions, after clearing the flags on some libraries (chflags noschg <file>)
- reinstall the OS (see above)
As expected, no luck again.
I need help. Even if I unmount the data datasets before I install or remove things, I’m going to make more and more mistakes, and one day the only solution will be to reinstall the whole machine. And I don’t want that.
Calling a friend
So I created a Protonmail address (my mail server is one of the jails of the problematic machine), and asked my friend Ollivier for help.
He asks me for access to the machine. I create a user, put his ssh key on the machine, and give him all the credentials: to mount the encrypted partitions, to access the Hetzner console, etc. Yes, I have absolute trust in him, for many, many reasons.
He asks me if I have access to the machine console. Sure, and why did I not think of it before? I must be a sucker!
I ask Hetzner support to plug a console into my machine, and I reboot it.
As expected, nothing went wrong in the first part. As usual, I attached the encrypted partitions and rebooted into the encrypted root with reboot -r.
Damn, the console said that libraries for sshd were missing, and the /var/tmp folder was missing too.
Yes, I am a sucker, a real one, and stupid. During all my “experimentation” I had made zfs datasets for all the important folders, the ones I had cleaned (/lib, /usr, …), and they were missing at startup.
I fix that and recreate /var/tmp. Reboot once again.
At this reboot sshd found its mandatory libraries and /var/tmp, and it worked. I can connect to the entire machine again. \o/
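Both problems, the missing directories and the missing /var/tmp, are things a small pre-flight check could have caught before the reboot. The sketch below is hypothetical: the helper names missing_dirs and ensure_var_tmp are made up, and the list of directories to check is something you pass yourself, for example the ones I had cleaned. Mode 1777 (world-writable with the sticky bit) is the usual mode of /var/tmp.

```shell
#!/bin/sh
# missing_dirs ROOT DIR...: print every DIR that does not exist under
# ROOT. The exit status is non-zero if anything is missing.
missing_dirs() {
    root=$1
    shift
    rc=0
    for d in "$@"; do
        if [ ! -d "$root$d" ]; then
            echo "$d"
            rc=1
        fi
    done
    return "$rc"
}

# ensure_var_tmp ROOT: recreate ROOT/var/tmp, world-writable with the
# sticky bit set.
ensure_var_tmp() {
    mkdir -p "$1/var/tmp" && chmod 1777 "$1/var/tmp"
}
```

With the new root mounted at some alternate location, missing_dirs "$ALTROOT" /lib /usr /usr/libexec /etc /var/tmp would list whatever is not there yet. ALTROOT is a placeholder for that mount point.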
Some other fix
The encrypted zpool was not mounted in the right place. I reboot once again, fix the altroot problem for the zpool, issue a last reboot -r and… YES.
It works and the jails start as expected.
What I learned
- do not insist if you are tired, you will make mistakes;
- the same behaviour gives you the same result, and insisting while tired will only irritate you;
- never forget that you have friends to help you! You’re not alone;
- stop what you are doing and do something totally different (gardening, running, cooking…). It will clear your mind.
It took me two and a half days [1] to make this upgrade. That’s too long. Before doing anything, think about it and its consequences, and double and triple check what you are doing. And never forget to make a backup. The backup I made of /etc before doing anything else saved my ass too.
Big thanks to my friend Ollivier, who put me on the right track to find the solution. He was my rubber duck for debugging :-)
Big thanks too to MacLemon for proofreading and fixing this post.
[1] To be honest, I was off for one of those days for health reasons.