Tuesday, August 30, 2011

Broken system after zpool upgrade

After updating to Solaris 10 Update 9 (144500-19), I went ahead and upgraded my ZFS pools (zpool upgrade -a) and ZFS file systems (zfs upgrade -a). The upgrade went well until I rebooted and was faced with the following:

Rebooting with command: boot /pci@0,600000/pci@0/pci@0/scsi@0/disk@0,0 -F
Boot device: /pci@0,600000/pci@0/pci@0/scsi@0/disk@0,0 File and args: -F
ERROR: Last Trap: Fast Data Access MMU Miss
%TL:1 %TT:68 %TPC:f000ad64 %TnPC:f000ad68 %TSTATE:6a001600
%PSTATE:16 ( IE:1 PRIV:1 PEF:1 )
DSFSR:4280804b ( FV:1 OW:1 PR:1 E:1 TM:1 ASI:80 NC:1 BERR:1 )
DSFAR:fda60000 DSFPAR:4018007f8000 D-TAG:1fb6c2000


For some reason the start of the disk has been corrupted - I don't know how this wasn't picked up during QA at Oracle, but it seems to have been a problem since 144500-06 ( http://wesunsolve.net/bugid/id/7022082 ). I also saw a reference to a similar bug in the OpenSolaris bug tracker.

There isn't really a "fix", as far as I can see, but there are two work-arounds:

1. Reinstall the boot block after you've done the zpool upgrade, but before you reboot:

$ installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0

2. Patch the miniroot on your JumpStart server, run a 'boot net -s' on your broken system, and reinstall the boot block from within the miniroot environment.
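
As a rough sketch, work-around 1 could be wrapped in a small script so the boot block rewrite is never forgotten after an upgrade. The variable and function names here (ROOT_DISK, PLATFORM, DRY_RUN, run, bootblk_path, upgrade_and_fix_boot) are made up for illustration, and the default disk is just the one from the example above; check which slice actually backs your root pool (for example with zpool status) before running it for real. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: upgrade pools and file systems, then rewrite the ZFS boot
# block before the next reboot. All names below are illustrative.

# Assumption: the slice backing the root pool. Adjust for your system.
ROOT_DISK=${ROOT_DISK:-/dev/rdsk/c0t1d0s0}
# Platform name as reported by `uname -i` on the Solaris host.
PLATFORM=${PLATFORM:-$(uname -i 2>/dev/null || echo unknown)}
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Print the path of the ZFS boot block for the given platform name.
bootblk_path() {
    echo "/usr/platform/$1/lib/fs/zfs/bootblk"
}

# $1: platform name, $2: raw device of the root pool slice.
# Stops at the first failing step so a failed upgrade is not masked.
upgrade_and_fix_boot() {
    run zpool upgrade -a &&
    run zfs upgrade -a &&
    run installboot -F zfs "$(bootblk_path "$1")" "$2"
}

upgrade_and_fix_boot "$PLATFORM" "$ROOT_DISK"
```

To actually apply the changes, run it as root with DRY_RUN=0. If the root pool is mirrored, the installboot step would presumably need repeating for each disk in the mirror.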

