lvmthin woes
Stefan M. Radman
smr at kmi.com
Wed Jun 30 14:20:07 CEST 2021
Hi
Today I tried to migrate a disk from one thin pool to another on PVE 6.4-6, but the online storage migration failed at 54%.
When I attempted the migration a second time, the reason for the first failure became apparent from the error message:
WARNING: Remaining free space in metadata of thin pool hdd/data is too low (100.00% >= 96.00%). Resize is recommended.
I had run out of metadata on the destination thin pool during the storage migration:
Jun 30 13:10:41 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 80.97% full.
Jun 30 13:11:31 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 85.80% full.
Jun 30 13:12:21 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 90.48% full.
Jun 30 13:13:11 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 95.46% full.
Jun 30 13:13:58 hrd-srv kernel: device-mapper: thin: No free metadata blocks
Jun 30 13:13:58 hrd-srv kernel: device-mapper: thin: 253:3: switching pool to read-only mode
Jun 30 13:13:59 hrd-srv pvedaemon[20516]: VM 103 qmp command failed - VM 103 qmp command 'block-job-cancel' failed - Block job 'drive-scsi2' not found
Jun 30 13:13:59 hrd-srv kernel: device-mapper: thin: 253:3: unable to service pool target messages in READ_ONLY or FAIL mode
Jun 30 13:13:59 hrd-srv pvedaemon[20516]: lvremove 'hdd/vm-103-disk-0' error: Failed to update pool hdd/data.
Jun 30 13:13:59 hrd-srv pvedaemon[20516]: storage migration failed: block job (mirror) error: drive-scsi2: 'mirror' has been cancelled
Jun 30 13:13:59 hrd-srv pvedaemon[13249]: <root at pam> end task UPID:hrd-srv:00005024:15A37C83:60DC42CC:qmmove:103:root at pam: storage migration failed: block job (mirror) error: drive-scsi2: 'mirror' has been cancelled
Jun 30 13:14:01 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 100.00% full.
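In hindsight, a quick look at the Meta% of the destination pool before starting would have flagged this; something along these lines (lvs shows Meta% for thin pools by default, the extra column just adds the size of the metadata LV):

root@hrd-srv:~# lvs -o+lv_metadata_size hdd/data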
Resizing the metadata pool (lvresize --poolmetadatasize ..) worked without a problem (thanks to the hint in the wiki).
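For reference, growing the metadata LV is a one-liner along these lines (the increment shown is illustrative):

root@hrd-srv:~# lvresize --poolmetadatasize +1G hdd/data

The kernel log recorded the metadata device being grown in steps: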
[3630930.387272] device-mapper: thin: No free metadata blocks
[3630930.393240] device-mapper: thin: 253:3: switching pool to read-only mode
[3630931.632043] device-mapper: thin: 253:3: unable to service pool target messages in READ_ONLY or FAIL mode
[3633079.042694] device-mapper: thin: 253:3: growing the metadata device from 25600 to 51200 blocks
[3633079.042699] device-mapper: thin: 253:3: switching pool to write mode
[3633462.550930] device-mapper: thin: 253:3: growing the metadata device from 51200 to 76800 blocks
[3633489.659827] device-mapper: thin: 253:3: growing the metadata device from 76800 to 90112 blocks
[3633497.712154] device-mapper: thin: 253:3: growing the metadata device from 90112 to 103424 blocks
[3633514.576483] device-mapper: thin: 253:3: growing the metadata device from 103424 to 116736 blocks
[3633531.417242] device-mapper: thin: 253:3: growing the metadata device from 116736 to 130048 blocks
[3633552.498115] device-mapper: thin: 253:3: growing the metadata device from 130048 to 143360 blocks
[3633563.294272] device-mapper: thin: 253:3: growing the metadata device from 143360 to 156672 blocks
[3633573.562695] device-mapper: thin: 253:3: growing the metadata device from 156672 to 169984 blocks
[3633595.843620] device-mapper: thin: 253:3: growing the metadata device from 169984 to 262144 blocks
The pool is writable again and not overcommitted.
root@hrd-srv:~# lvs -a hdd
  LV              VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data            hdd twi-aotz--   1.00t              16.86  10.36
  [data_tdata]    hdd Twi-ao----   1.00t
  [data_tmeta]    hdd ewi-ao----   1.00g
  [lvol0_pmspare] hdd ewi------- 100.00m
  vm-100-disk-0   hdd Vwi-aotz-- 500.00g data         23.02
  vm-102-disk-1   hdd Vwi-a-tz--  14.90g data         22.66
  vz              hdd -wi-ao----   1.00t
root@pve:~# vgs hdd
  VG  #PV #LV #SN Attr   VSize  VFree
  hdd   4   4   0 wz--n- <3.64t <1.64t
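To reduce the chance of this happening again I am considering letting dmeventd autoextend the pool. If I read the lvm.conf documentation correctly, the relevant settings would be roughly the following (untested on my side, and I have not verified that metadata is covered in every case):

activation {
    thin_pool_autoextend_threshold = 80
    thin_pool_autoextend_percent = 20
}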
Despite the pool being writable again, the third storage migration attempt failed with the following error:
Thin pool hdd-data-tpool (253:3) transaction_id is 17, while expected 18.
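From what I have found so far, a transaction_id mismatch seems to call for an offline repair of the pool metadata, something along the lines of the following with the pool and all of its thin volumes deactivated (I have not dared to try it on a pool that is in use):

root@hrd-srv:~# lvconvert --repair hdd/data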
Any advice on how to go about this situation?
Thank you
Stefan
FIRST ATTEMPT
create full clone of drive scsi2 (ssd:vm-103-disk-2)
Logical volume "vm-103-disk-0" created.
drive mirror is starting for drive-scsi2
drive-scsi2: transferred 33.0 MiB of 100.0 GiB (0.03%) in 0s
...
drive-scsi2: transferred 54.1 GiB of 100.0 GiB (54.11%) in 4m 41s
drive-scsi2: Cancelling block job
drive-scsi2: Done.
device-mapper: message ioctl on (253:3) failed: Operation not supported
Failed to process message "delete 3".
Failed to suspend hdd/data with queued messages.
lvremove 'hdd/vm-103-disk-0' error: Failed to update pool hdd/data.
TASK ERROR: storage migration failed: block job (mirror) error: drive-scsi2: 'mirror' has been cancelled
SECOND ATTEMPT
create full clone of drive scsi2 (ssd:vm-103-disk-2)
WARNING: Remaining free space in metadata of thin pool hdd/data is too low (100.00% >= 96.00%). Resize is recommended.
TASK ERROR: storage migration failed: lvcreate 'hdd/vm-103-disk-0' error: Cannot create new thin volume, free space in thin pool hdd/data reached threshold.
THIRD ATTEMPT
create full clone of drive scsi2 (ssd:vm-103-disk-2)
Thin pool hdd-data-tpool (253:3) transaction_id is 17, while expected 18.
TASK ERROR: storage migration failed: lvcreate 'hdd/vm-103-disk-0' error: Failed to suspend hdd/data with queued messages.