LVM metadata corruption

I had my biggest PC-related scare a couple of days ago. After two disks in my RAID5 failed within a very short time and only pure luck saved my data, I moved to RAID6 and felt safer. That was until two days ago, when I ran:
# pvs -v
pvs Scanning for physical volume names
pvs Incorrect metadata area header checksum
pvs Incorrect metadata area header checksum
pvs WARNING: Volume Group vg0 is not consistent
pvs Incorrect metadata area header checksum
pvs Incorrect metadata area header checksum
pvs PV VG Fmt Attr PSize PFree DevSize PV UUID
pvs /dev/md2 vg0 lvm2 a- 1.80T 922.19G 1.80T Y9naEo-OKG6-0ZyX-qmZX-u3JP-uCPg-cE1hVX


Oops. Not looking good.

# vgs -v
vgs Finding all volume groups
vgs Incorrect metadata area header checksum
vgs Finding volume group "vg0"
vgs Incorrect metadata area header checksum
vgs Incorrect metadata area header checksum
vgs VG Attr Ext #PV #LV #SN VSize VFree VG UUID
vgs vg0 wz--n- 4.00M 1 15 0 1.80T 922.19G 8uc2fo-0OwD-lPRT-1gdh-87lw-pjQW-Y1n2vd


Hmmmm, lots of errors but at least the volume group is there... now let's see the logical volumes:

# lvs -v
lvs Finding all logical volumes
lvs Incorrect metadata area header checksum
lvs Incorrect metadata area header checksum
lvs Incorrect metadata area header checksum
lvs Volume group "vg0" inconsistent
lvs Incorrect metadata area header checksum
lvs Incorrect metadata area header checksum
lvs WARNING: Inconsistent metadata found for VG vg0 - updating to use version 154
lvs Incorrect metadata area header checksum
lvs Automatic metadata correction failed
lvs Internal error: Volume Group vg0 was not unlocked
lvs Device '/dev/md2' has been left open.
lvs Device '/dev/md2' has been left open.
lvs Device '/dev/md2' has been left open.
lvs Device '/dev/md2' has been left open.


I was expecting 15 logical volumes: the /home directory, some data, a mail volume, etc. Thing is, they still worked.


Even though the metadata on /dev/md2 was corrupted, my kernel still had everything mounted, since LVM had already created the device mapper devices and the kernel knew at what offsets all the volumes were. Or, in simpler terms, the metadata on the disk was corrupted, but the metadata in the kernel was still alive. Therefore, the first thing I did was to run
dmsetup table
and save its output. If worse came to worst, I could still recreate all the device mapper devices using this output. Then I was able to recover the metadata from the LVM backup files, which LVM by default dumps to /etc/lvm/backup. I had set that up as a symlink to /boot/lvm-backup, because it makes no sense to keep the LVM backups on an LVM volume (my / is also on LVM).
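
As a rough sketch of what "recreating the device mapper devices from this output" could look like: each line of dmsetup table has the form "name: start length target args...", and dmsetup create accepts a table on standard input. The helper below only prints the dmsetup create commands and never runs them. The function name dm_table_to_commands and the file /boot/dmsetup-table.txt are made up for illustration, and the sketch assumes device names contain no spaces or single quotes.

```shell
#!/bin/sh
# Hypothetical helper: read saved `dmsetup table` output on stdin and
# print one `dmsetup create` command per device. Dry run only: nothing
# is executed, so the result can be reviewed before use.
dm_table_to_commands() {
    awk -v q="'" '
        NF < 4 { next }                   # skip blank lines and "No devices found"
        {
            name = $1
            sub(/:$/, "", name)           # "vg0-home:" becomes "vg0-home"
            line = $0
            sub(/^[^ ]+ +/, "", line)     # keep "start length target args..."
            if (!(name in segs)) order[++n] = name
            # A device with several segments has several lines; collect
            # them all, each as one single-quoted argument.
            segs[name] = segs[name] " " q line q
        }
        END {
            for (i = 1; i <= n; i++)
                print "printf " q "%s\\n" q segs[order[i]] " | dmsetup create " order[i]
        }
    '
}

# Possible usage (the path is an assumption, adjust to taste):
#   dmsetup table > /boot/dmsetup-table.txt
#   dm_table_to_commands < /boot/dmsetup-table.txt
```

Given an input line such as "vg0-home: 0 209715200 linear 9:2 384" (illustrative numbers, not taken from my system), it prints a command that pipes "0 209715200 linear 9:2 384" into dmsetup create vg0-home. Keeping the saved file outside LVM, like the metadata backups, is the whole point.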
