Thread: Ubuntu on ZFS grub/boot failure

  1. #1
    Join Date
    Feb 2024
    Beans
    13

    Ubuntu on ZFS grub/boot failure

    Hi all. I installed Ubuntu 22.04 on ZFS in May of last year, primarily to begin learning Linux so I can migrate from Windows, especially for servers.
    It mostly functioned well until last week.
    2 issues I want to solve:
    1- Fixing the boot failure
    2- Migrating to a new drive

    Regarding #2, I originally installed Ubuntu on an old Samsung 840 SSD but upgraded the drive in my main PC so wanted to migrate Ubuntu to the now unused NVMe, both to give me more to play with and to learn how to migrate Ubuntu on ZFS to a new drive. I read plenty and attempted multiple methods but all failed. I thought I saw a possible solution but that was more than 6 months ago and I became busy with more important issues so never returned to deal with the issue and now forget. Some things I tried:

    - Using gparted to copy non-zfs partitions, then "zpool attach" to mirror bpool & rpool to the new drive followed by "zpool split" to break the mirror with pools intact on both drives, followed by removing the old drive & renaming the pool to the original name (split requires giving a different name). This is a summary of the steps, not a complete list, but it failed and I forget exactly why.
    - Fresh installing Ubuntu on ZFS to the NVMe, then booting the LiveUSB to delete the rpool on the NVMe and use zfs send/receive to send the original rpool. IIRC, this failed because grub was looking for rpool/ROOT/ubuntu_YYYY but this method transferred the original /ubuntu_XXXX, and I failed to find a method of fixing that.
    - It is possible the ubuntu_X/Y issue was from the 1st method, I forget now.
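
    For reference, mirroring and splitting a pool is done with zpool subcommands. A minimal sketch that only prints the commands instead of running them; the device paths and the temporary pool name ending in _new are placeholders, not values from this system. It covers only the pool itself; the EFI partition and GRUB would still need to be set up on the new drive separately.

```shell
#!/bin/sh
# Print (do not run) the zpool commands for a mirror-then-split migration.
# All three arguments are placeholders you must supply for your own system.
plan_mirror_split() {
    pool=$1      # existing pool, e.g. rpool
    old_dev=$2   # partition currently backing the pool
    new_dev=$3   # matching partition on the new drive
    echo "zpool attach $pool $old_dev $new_dev"
    echo "zpool status $pool    # wait here until resilvering has finished"
    echo "zpool split $pool ${pool}_new $new_dev"
    echo "zpool export $pool    # then remove the old drive"
    echo "zpool import ${pool}_new $pool"
}

# Hypothetical device names, for illustration only.
plan_mirror_split rpool /dev/disk/by-id/OLD-part4 /dev/disk/by-id/NEW-part4
```

    The last printed command relies on zpool import accepting a new name as a second argument, which is how the split-off pool gets the original name back.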

    As said above, I set that issue aside, but this new boot failure seems a good time to try to solve that.

    Regarding #1, the system functioned mostly well since May until last week. Important details:
    - I installed zfs-auto-snapshot. This caused bpool to fill with old snapshot data until it lacked sufficient free space for a kernel update.
    - I finally had some time to learn to deal with that issue so solved that last week by deleting the old snapshots, then updating the kernel (apt update, apt upgrade, apt autoremove). This updated from 6.2* to 6.5*
    - The kernel did its boot/grub update to install the new vmlinuz & initrd where grub could use them, but it found dozens of references to the old 6.2 kernel, I think in the snapshots, not certain. Perhaps this caused part/all of the issue?
    - The present symptom is Ubuntu fails to boot so I get grub saying it cannot find hd1,part3.
    - Doing ls in grub on any hd# or hd#,part# returns "Compression algorithm inherit not supported".
    - Booting to UbuntuLiveUSB, I used "zpool import -R /zfs -a" to verify all the zfs pools & datasets seem to be intact, with the possible exception of bpool.
    - The bpool issue is that it shows as mounted at /zfs/boot but ls shows /zfs/boot empty, and when I tried to follow the instructions from https://askubuntu.com/questions/8262...4-installation section 4.3, it returned "unknown filesystem type 'zfs_member'".
    - The UbuntuLiveUSB definitely has ZFS capability because I can use zpool & zfs commands and it imported & mounted the zpools & datasets.
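
    The live-USB checks in the list above can be sketched as follows. The script only prints the commands; /zfs is the alternate root from the post, and the choice of properties to inspect is an assumption about what is worth looking at. One possible explanation for an empty /zfs/boot is that the child dataset under bpool/BOOT is not mounted, and the "zfs_member" error is what plain mount typically reports when pointed at a pool member partition instead of using zfs mount.

```shell
#!/bin/sh
# Print (do not run) read-only checks for pools imported under an alternate root.
plan_pool_checks() {
    altroot=$1   # e.g. /zfs, as used in the post
    echo "zpool import -R $altroot -a"
    echo "zpool status"
    echo "zfs list -r -o name,mountpoint,canmount,mounted bpool"
    echo "zfs get -r compression bpool"
    echo "zfs list -t snapshot -r bpool"
}

plan_pool_checks /zfs
```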

    I want to learn how to properly solve both these issues, and the present boot failure seems a good time to also learn to migrate drives. I am relatively new to Linux, so I apologize if I failed to provide information you think would help. Please instruct me how to get any extra info you think could help.

  2. #2
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: Ubuntu on ZFS grub/boot failure

    Let's fix #1 first.

    The problem booting appeared "after" doing a snapshot (or snapshots) of bpool... (BUG) I have an open bug report with 4 work-arounds for that.
    RE: https://bugs.launchpad.net/ubuntu/+s...d/+bug/2051999

    Recommend newest/updated work-around: https://bugs.launchpad.net/ubuntu/+s...99/comments/33

    For future reference, keep this link handy. These are tutorials I wrote for chrooting into installed ZFS systems, covering different install methods and options, for when things go south and you have to rescue an installed ZFS system that won't boot:
    https://github.com/Mafoelffen1/OpenZFS-Ubuntu-Admin

    Tell me how that goes...
    Last edited by MAFoElffen; February 20th, 2024 at 03:10 AM.

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637
    Sticky: Graphics Resolution | UbuntuForums 'system-info' Script | Posting Guidelines | Code Tags

  3. #3
    Join Date
    Feb 2024
    Beans
    13

    Re: Ubuntu on ZFS grub/boot failure

    Thanks for those links. Yet to read all of the OpenZFS-Ubuntu-Admin stuff but running a non-encrypted root presently so skimmed that.
    Read the launchpad thread you linked, mostly followed as much of your updated work-around as I could. Before relaying the error, your comment 23 & 33 have differences which could be important.

    After your UUID= line, you list 2 lines beginning "zfs import" for the rpool & bpool datasets. I think those should be "zfs mount", as import is a pool function while mount is for the datasets in those steps, correct?
    Comment 23 includes instructions to copy the original sources.list file as a backup, then replace the contents with (I think) only the new sources to force updates from those instead of default repositories, then pulling the updates, then restoring the original sources.list.
    Comment 33 (the updated workaround) instructs to add the new repository then perform the upgrade then restore the sources.list, but omits copying the original as a backup then clearing the active sources.list to force updating from the source necessary.
    After I did the chroot step, I copied the sources.list before adding the new repository, then found it did little (basically found no updates to grub) until I cleared the active sources.list and again added the new repository.
    If you re-add those steps, be careful if you copy/paste from comment 23 to 33 because 23 I think has an error on the line to create a new Noble sources.list, being /apt was omitted from the directory path, specifying /etc/sources.list, but seemed to not account for that elsewhere so that could cause problems for people following exactly who fail to notice the issue.
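
    The backup, swap, and restore sequence described above could be sketched as below. swap_sources and restore_sources are hypothetical helper names, and APT_DIR exists only so the functions can be tried against a scratch directory before touching /etc/apt.

```shell
#!/bin/sh
# Back up sources.list, replace it with a temporary one, and restore it later.
# APT_DIR defaults to /etc/apt; point it at a scratch directory to try this safely.
APT_DIR=${APT_DIR:-/etc/apt}

swap_sources() {
    # $1: path to the temporary sources file (e.g. a sources.list.noble you created)
    cp -p "$APT_DIR/sources.list" "$APT_DIR/sources.list.bak" &&
    cp "$1" "$APT_DIR/sources.list"
}

restore_sources() {
    # Put the original file back and remove the backup in one step.
    mv "$APT_DIR/sources.list.bak" "$APT_DIR/sources.list"
}
```

    Inside the chroot this would be used as: swap_sources /etc/apt/sources.list.noble, then apt update and the grub install step from the workaround, then restore_sources followed by another apt update.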

    It failed for me trying to pull the grub update from the new repo.
    Using your exact command I get "grub-efi-amd64-signed is already the newest version (1.187.6+2.06-2ubuntu14.4)", followed (skipping some lines) by "The following packages have unmet dependencies: grub-efi-amd64-signed : Depends: grub-efi-amd64-bin (= 2.06-2ubuntu14.4) but 2.12-1ubuntu1-ppa1~22.04 is to be installed" then "E: Unable to correct problems, you have held broken packages."

    I then tried to alter the command to "apt install grub-efi-amd64-bin" but it also failed, with a similar but different error:
    "shim-signed : Depends: grub-efi-amd64-signed (>= 1.187.2~) but it is not going to be installed or grub-efi-arm64-signed (>= 1.187.2~) but it is not installable"
    "E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages."

    The arm64 part is irrelevant here.
    Any idea how to solve this issue?
    I am also curious why it succeeded for you but is failing for me. What else is different? Different version of Ubuntu? Did I install an older Ubuntu than you did? Package updates between then & now? Any info you think could help?

    Also, while this boot failure did technically begin after snapshots were created on bpool, I installed zfs-auto-snapshots back in... June I think, so it ran for 8-ish months with no boot problems. I rebooted it a few times, but thankfully nowhere near as often as Windows needs... So this issue all began after I cleared the old snapshots to free the space for the kernel upgrade.
    Further, that launchpad thread indicates auto-snapshots did not cause this problem, only manual snapshots did, and I created no manual snapshots since installing zfs-auto-snapshots. Not saying it is not all related, but I think it important to note that difference.

  4. #4
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: Ubuntu on ZFS grub/boot failure

    Dang sorry. My quick tests on that yesterday morning (while trying to help you on my way to work) were not complete enough.

    Try the workaround from Comment #33 again. I edited #33 and updated it based on what happened for you. #33 now uses that same PPA, but uses the Noble sources.list to pull in the needed dependencies.

    I know that #23 works and had tested it a few times on fresh installs of 22.04. The one in #33 I quickly tested on Mantic & Noble, which apparently do not have the same dependencies as Jammy. Thank you for helping me test that. I made the sources.list change to pull in the needed dependencies from the Noble repos.

    That should work for you now. Thank you for testing this on Jammy, and sorry for the previous attempt. I can't test this myself this morning as I have an early start and then have to pick up my wife from the airport. Hoping that goes well for you, and as expected.

    Please tell me how that goes.
    Last edited by MAFoElffen; February 21st, 2024 at 02:52 PM.

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637

  5. #5
    Join Date
    Feb 2024
    Beans
    13

    Re: Ubuntu on ZFS grub/boot failure

    Result:
    Was able to progress to the grub-install but it failed with "Compression algorithm inherit not supported".
    I checked both the pool & dataset of bpool, encryption is disabled but lz4 compression is enabled. Could this be an issue?
    If it is, why now but not before?
    Suspecting it must be related to the new bug but why did the solution fail for me if my issue is being caused by the bug?
    I wonder if there is another fact at play here layering a 2nd issue on the bug...

    That said, I think a flaw remains in the updated instructions.
    The line I mentioned before where you instruct to create the new Noble sources.list again lacks /apt from the path, as does part of the line where you say to copy the new Noble version to the default sources.list.
    The lines presently read:
    "sudo nano /etc/sources.list.noble" (is sudo needed here when we are already in sudo & chroot?) which I think should be
    "nano /etc/apt/sources.list.noble"
    and
    "cp /etc/apt/sources.list.noble /etc/sources.list" which I think should be
    "cp /etc/apt/sources.list.noble /etc/apt/sources.list"
    As said before I am new to Linux, so I am not certain of those.
    Another possible issue is the revised instructions in comment 23 with the line to add the backport repository.
    In comment 23 the line reads
    "add-apt-repository ppa:ubuntu-uefi-team/build"
    but in 33 it reads
    "add-apt-repository ppa:ubuntu-uefi-team/backports-build"

    Thinking on it, I am wondering if changing that could fix the error I got. I shall try that & update here later.

    Edit: Tested adding the backports-build repository then running the update again but now it is not updating grub because it is already up-to-date.
    I removed the references to the non-backport build repository in /etc/apt/sources.list.d/ but am wondering if there was a better/proper method. (never removed a repository, shall search a proper method)
    No change though, not pulling updates to grub now so I am wondering if we can/should try removing grub to force it to reinstall fresh from the backports-build repository?
    If so, what exact command should I use to do that?
    Last edited by dragonomnus; February 21st, 2024 at 05:42 PM.

  6. #6
    Join Date
    Mar 2010
    Location
    USA
    Beans
    Hidden!
    Distro
    Ubuntu Development Release

    Re: Ubuntu on ZFS grub/boot failure

    That would be what to do, or to use Software & Updates to uncheck it, which in effect just adds a "#" character to comment out the lines.

    So it is still getting that error. Look at the option for creating a new bpool in this: https://bugs.launchpad.net/ubuntu/+s...99/comments/26

    The next work-around to try would be to move forward with the plan for #2... How much room do you have in bpool and rpool? Enough for a full recursive snapshot of bpool/BOOT, rpool/ROOT & rpool/USERDATA?

    "Concurrent coexistence of Windows, Linux and UNIX..." || Ubuntu user # 33563, Linux user # 533637

  7. #7
    Join Date
    Feb 2024
    Beans
    13

    Re: Ubuntu on ZFS grub/boot failure

    We can work on #2 now if needed, and there is plenty of space for full snapshots of bpool & rpool.
    Ironically that is part of the problem, because I was not intending to remove all snapshots from bpool before performing the kernel upgrade but the command explanation was unclear in 1 part so I saw 2 possibilities for it, being where to place the % in the command to remove all older or all newer snapshots. Placing it after the snapshot name removed all newer which was not what I wanted but a minor problem, however I goofed when I redid the command to remove all older snapshots (which were the snapshots consuming all the space I needed to free). I intended to leave the newest snapshot but accidentally used it as the starting point so removed all remaining snapshots. If I did not goof that, I would have a bpool snapshot to restore & probably fix this issue. If I thought about it and took a new snapshot before installing the kernel upgrade, I would be able to restore. Learned my lessons on those.
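
    On the % placement: in zfs destroy, dataset@a%b is an inclusive range. Leaving the left side empty (@%b) means from the oldest snapshot up to and including b, and leaving the right side empty (@a%) means a and everything newer. A minimal sketch of a helper that builds a range sparing the newest snapshots; range_keep_newest is a hypothetical name, and the dataset in the usage comment is a placeholder.

```shell
#!/bin/sh
# Read snapshot names (oldest first, one per line) on stdin and print a
# "dataset@%snap" range that covers everything except the newest $1 snapshots.
# Prints nothing if there is nothing to delete.
range_keep_newest() {
    keep=${1:-1}
    awk -v keep="$keep" '
        { name[NR] = $0 }
        END {
            last = NR - keep              # index of the newest snapshot to delete
            if (last < 1) exit
            split(name[last], part, "@")  # part[1] = dataset, part[2] = snapshot
            print part[1] "@%" part[2]
        }'
}

# Intended use (placeholder dataset name; -n -v makes zfs destroy a dry run):
#   range=$(zfs list -H -d 1 -t snapshot -o name -s creation bpool/BOOT/ubuntu_XXXX | range_keep_newest 1)
#   [ -n "$range" ] && zfs destroy -n -v "$range"
```

    Running the destroy with -n -v first prints what would be removed without removing it, which would have caught the off-by-one described above.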

    However, I think we can/should try 1 more thing before calling this a fail, which is my last comment in my prior message. Can we force-remove the grub package somehow to allow it to update from the backports-build repository? The present issue is from the differences which were in comment 23 instructions vs comment 33, being comment 23 said to use "ppa:ubuntu-uefi-team/build" instead of "ppa:ubuntu-uefi-team/backports-build", which properly pulled an update, but now it is finding no update available from backports-build because it already updated from non-backports build. I tried to remove grub but a standard "apt remove grub-efi-amd64 grub-efi-amd64-signed" failed to remove grub, so how can we force it to then allow a new pull from the backports-build repository?

  8. #8
    Join Date
    Feb 2024
    Beans
    13

    Re: Ubuntu on ZFS grub/boot failure

    I think above applies, but I realized I failed to provide some possibly helpful information. I rolled bpool & rpool back to the last snapshot on each which corrected some of the errors and allowed me to grub-install & update-initramfs, but no change in failure to boot. I think no change is because it failed to update grub after the rollback despite repeating the steps to alter sources.list & add the backports-build repository again.

    I was able to roll both datasets back because I have snapshots of rpool back to when I installed zfs-auto-snapshots and bpool after the kernel upgrade, which leads to another possibly important piece of info I forgot to mention. Ubuntu could not upgrade the kernel because bpool lacked free space because the snapshots were consuming it (which I need to solve also), which is why I deleted the old snapshots which allowed the kernel to upgrade. Ubuntu wanted me to reboot after the upgrade, which I did and it rebooted normally. It was a couple days later when I began noticing issues (for example I tried to access a network share but could not) and found no obvious cause for the issues so rebooted again, which is when it failed to boot. This also means I have some snapshots on bpool *after* the kernel upgrade, if those could help.

    That said, I think there is an error in the updated instructions, but not certain.
    I think the line "add-apt-repository ppa:ubuntu-uefi-team/backports-build" is correct, but if it is then I think the line "nano /etc/apt/sources.list.d/ubuntu-uefi-team-ubuntu-build-jammy.list" is incorrect. Notice the 2nd command lacks the backport reference? I think it should be:
    nano /etc/apt/sources.list.d/ubuntu-uefi-team-ubuntu-backports-build-jammy.list

    And same thing for what to edit in that file, lacking the backport reference, so I think it should be:
    deb [trusted=yes] https://ppa.launchpadcontent.net/ubuntu-uefi-team/backports-build/ubuntu/ noble main

    Also, I think there is an issue in the later command to update initramfs. I am not certain but
    "update-intramfs" should be
    "update-initramfs", correct?

    I was able to pull an update to grub during 1 of the prior attempts, but I think it was the attempt following the prior instructions which said "build" instead of "backports-build" so the update failed to solve the issue. So now the question is: if I rolled rpool & bpool back to before all these changes (and verified the sources.list etc files were removed/restored), then why is it not pulling a new update to grub from backports-build if I follow the updated instructions? There must be some detail which caused a change in behavior, but the only thing I can say it is doing now is possibly a failure when I try "apt update".
    It returns Hit or Ign (Ignored?) on each line and is Ign on all the lines from the backports-build repository (except an error on the line for i386 but that should not be relevant), so perhaps that is why it is not pulling an update from them now?
    How can we solve that directly, or to force-update or force-remove to allow an update to grub?
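
    One way to see why apt pulls nothing is to compare the installed and candidate versions: if they match, apt considers the package up to date regardless of which repository was added. A minimal sketch, assuming the usual layout of apt-cache policy output; policy_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Read `apt-cache policy <package>` output on stdin and print the installed
# and candidate versions on one line, so a mismatch is easy to spot.
policy_summary() {
    awk '
        $1 == "Installed:" { installed = $2 }
        $1 == "Candidate:" { candidate = $2 }
        END { print "installed=" installed " candidate=" candidate }'
}

# Intended use:
#   apt-cache policy grub-efi-amd64-signed | policy_summary
```

    If a different version appears in the version table of the full output, apt install accepts the package=version form to request that exact version, though whether its dependencies resolve is a separate question.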

    Edit: Intended to say but forgot by the time I was done typing all that...
    No worries on the solution failing or your limited ability to test, you have a life.
    I appreciate your help with this issue and am happy if it helps the Linux community solve an issue.
    But I also began thinking after my last post.... I have snapshots of rpool from before the update so I could roll it back then try updating grub & stuff.
    However, I think that would not be the best idea because I think it is better to find a working solution then push it to the updates channel before this hits too many users.
    On that note....
    What if we do both? I was thinking perhaps I should image this system so we have it in a non-booting state to play with more if we need, but I can restore bpool & rpool on the drive to the last snapshot before the reboot where it failed so would we need images of the non-booting state or could we fix it then roll the snapshots to break it again?
    If that would succeed (if not I can image it, if you help me with that) & help solve this issue further, then perhaps we should image the drive then begin working on #2 of migrating to a new drive.
    If we begin working on #2, my preference would be to learn how to migrate an installation (including all users/configs/software/etc). I think using ZFS to mirror & split bpool & rpool should be the best way, but I think that was 1 of the methods I tried before which failed. I remember 1 of the methods I tried failed because of the UUID part of the dataset name, but I failed to solve that issue and cannot remember if that was when I mirrored & split or what.
    Last edited by dragonomnus; February 22nd, 2024 at 09:14 PM.

  9. #9
    Join Date
    Aug 2016
    Location
    Wandering
    Beans
    Hidden!
    Distro
    Xubuntu Development Release

    Re: Ubuntu on ZFS grub/boot failure

    I'm on Noble currently, but on Jammy I had to hand remove grub:
    Code:
    apt list grub* --installed
    Listing... Done
    grub-common/noble,now 2.12-1ubuntu2 amd64 [installed,automatic]
    grub-efi-amd64-bin/noble,noble,now 2.12-1ubuntu1 amd64 [installed]
    grub-efi-amd64-signed/noble,now 1.201+2.12-1ubuntu1 amd64 [installed]
    grub-gfxpayload-lists/noble,now 0.7build1 amd64 [installed]
    grub-pc-bin/noble,now 2.12-1ubuntu2 amd64 [installed]
    grub-pc/noble,noble,now 2.12-1ubuntu2 amd64 [installed]
    grub2-common/noble,now 2.12-1ubuntu2 amd64 [installed]
    then update again..."sudo apt update" then install/reinstall all the above. After that it has been pretty solid.

    What information do you need from me?
    Code:
    apt policy grub-efi-amd64-signed
    grub-efi-amd64-signed:
      Installed: 1.201+2.12-1ubuntu1
      Candidate: 1.201+2.12-1ubuntu1
      Version table:
     *** 1.201+2.12-1ubuntu1 500
            500 http://us.archive.ubuntu.com/ubuntu noble/main amd64 Packages
            100 /var/lib/dpkg/status
         1.201+2.12-1ubuntu1 500
            500 https://ppa.launchpadcontent.net/ubuntu-uefi-team/build/ubuntu noble/main amd64 Packages
    Repo naming:
    Code:
    inxi -r | grep ubuntu-uefi-team
      Active apt repos in: /etc/apt/sources.list.d/ubuntu-uefi-team-ubuntu-build-noble.sources
        1: deb https://ppa.launchpadcontent.net/ubuntu-uefi-team/build/ubuntu/ noble main
    It's pretty easy to forget about the room snapshots will gobble up (best to check at least weekly)
    Code:
    sudo zfs list -o space | sort -k4 --human-numeric-sort
    [sudo] password for me: 
    bpool                                             1.50G   256M        0B     96K             0B       256M
    bpool/BOOT                                        1.50G   255M        0B     96K             0B       255M
    bpool/BOOT/ubuntu_2wtpxc                          1.50G   255M        0B    255M             0B         0B
    NAME                                              AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
    rpool                                              417G  29.0G        0B     96K             0B      29.0G
    rpool/ROOT                                         417G  16.2G        0B     96K             0B      16.2G
    rpool/ROOT/ubuntu_2wtpxc/srv                       417G    96K        0B     96K             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/usr                       417G   240K        0B     96K             0B       144K
    rpool/ROOT/ubuntu_2wtpxc/usr/local                 417G   144K        0B    144K             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var                       417G  4.86G        0B     96K             0B      4.86G
    rpool/ROOT/ubuntu_2wtpxc/var/games                 417G    96K        0B     96K             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/lib                   417G  4.81G        0B   4.65G             0B       158M
    rpool/ROOT/ubuntu_2wtpxc/var/lib/AccountsService   417G   100K        0B    100K             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/lib/apt               417G  94.4M        0B   94.4M             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/lib/dpkg              417G  63.3M        0B   63.3M             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/lib/NetworkManager    417G   160K        0B    160K             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/log                   417G  54.2M        0B   54.2M             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/mail                  417G    96K        0B     96K             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/snap                  417G   472K        0B    472K             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/spool                 417G   112K        0B    112K             0B         0B
    rpool/ROOT/ubuntu_2wtpxc/var/www                   417G    96K        0B     96K             0B         0B
    rpool/USERDATA                                     417G  12.7G        0B     96K             0B      12.7G
    rpool/USERDATA/me_0wdehs                           417G  12.7G        0B   12.7G             0B         0B
    rpool/USERDATA/root_0wdehs                         417G  2.02M        0B   2.02M             0B         0B
    rpool/ROOT/ubuntu_2wtpxc                           417G  16.2G     4.99G   6.39G             0B      4.86G
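
    In the output above the NAME header line ends up in the middle because it is sorted along with the data. A minimal sketch that keeps the header first, assuming GNU sort for the -h option; sort_keep_header is a hypothetical helper name, and column 4 corresponds to USEDSNAP in this layout.

```shell
#!/bin/sh
# Sort `zfs list -o space` style output by one column while keeping the
# header line first. $1 is the column number (default 4).
sort_keep_header() {
    col=${1:-4}
    # Pass the first line through untouched; stop quietly on empty input.
    IFS= read -r header || return 0
    printf '%s\n' "$header"
    # Sort the remaining lines by the chosen column, human-numeric order.
    sort -b -h -k"$col,$col"
}

# Intended use:
#   sudo zfs list -o space | sort_keep_header 4
```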
    Last edited by 1fallen; February 22nd, 2024 at 09:41 PM.
    With realization of one's own potential and self-confidence in one's ability, one can build a better world.
    Dalai Lama>>
    Code Tags | System-info | Forum Guide lines | Arch Linux, Debian Unstable, FreeBSD

  10. #10
    Join Date
    Feb 2024
    Beans
    13

    Re: Ubuntu on ZFS grub/boot failure

    Quote: Originally posted by 1fallen
    I'm on Noble currently, but on Jammy I had to hand remove grub
    How?
    I did the "apt list" you did and saw my grub-efi-amd64* packages were pulled from noble but grub-pc/grub-common/grub2-common were all from jammy so I pulled those updates (and their dependencies) from noble then did another grub-install & update-initramfs.
    They both succeeded with no error but no change on reboot, being it again failed.

    I am wondering more if my issue is from the grub bug or some other cause, but I already gave all information I know and know how to get.
    I would again attempt to migrate to the new SSD as MAFoElffen suggested last, but the methods I read and tried all failed and I would guess every method I know or can think of except hand-picking which directories & files to copy would result in transferring the boot problem also. And I know not which directories to hand-pick to effect a full migration of users/configs/software while not transferring the boot problem.

    I could do a fresh install on the new drive, but I need to learn a functional method of system migration (and backup/restoration of Linux/software on ZFS). I suspect the chroot & grub-install I learned thanks to MAFoElffen would help so I can probably try again, but every method I know would copy all of rpool (& bpool) so would almost certainly copy the problem with it.
