lvmthin woes

Stefan M. Radman smr at kmi.com
Wed Jun 30 14:20:07 CEST 2021


Hi

Today I tried to migrate a disk from one thin pool to another on PVE 6.4-6, but the online storage migration failed at 54%.

When I attempted the migration a second time, the reason for the first failure became apparent from the error message:

WARNING: Remaining free space in metadata of thin pool hdd/data is too low (100.00% >= 96.00%). Resize is recommended.

I had run out of metadata on the destination thin pool during storage migration.

Jun 30 13:10:41 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 80.97% full.
Jun 30 13:11:31 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 85.80% full.
Jun 30 13:12:21 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 90.48% full.
Jun 30 13:13:11 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 95.46% full.
Jun 30 13:13:58 hrd-srv kernel: device-mapper: thin: No free metadata blocks
Jun 30 13:13:58 hrd-srv kernel: device-mapper: thin: 253:3: switching pool to read-only mode
Jun 30 13:13:59 hrd-srv pvedaemon[20516]: VM 103 qmp command failed - VM 103 qmp command 'block-job-cancel' failed - Block job 'drive-scsi2' not found
Jun 30 13:13:59 hrd-srv kernel: device-mapper: thin: 253:3: unable to service pool target messages in READ_ONLY or FAIL mode
Jun 30 13:13:59 hrd-srv pvedaemon[20516]: lvremove 'hdd/vm-103-disk-0' error:   Failed to update pool hdd/data.
Jun 30 13:13:59 hrd-srv pvedaemon[20516]: storage migration failed: block job (mirror) error: drive-scsi2: 'mirror' has been cancelled
Jun 30 13:13:59 hrd-srv pvedaemon[13249]: <root at pam> end task UPID:hrd-srv:00005024:15A37C83:60DC42CC:qmmove:103:root at pam: storage migration failed: block job (mirror) error: drive-scsi2: 'mirror' has been cancelled
Jun 30 13:14:01 hrd-srv lvm[857]: WARNING: Thin pool hdd-data-tpool metadata is now 100.00% full.
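
In hindsight this was easy to see coming; data and metadata usage of a thin pool can be checked with something along these lines (standard lvs fields, see lvs(8)):

  # show data and metadata usage of the thin pool and its volumes in VG hdd
  lvs -a -o lv_name,lv_size,data_percent,metadata_percent hdd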

Resizing the thin pool's metadata LV (lvresize --poolmetadatasize ..) worked without a problem (thanks to the hint in the wiki).
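
For anyone hitting the same wall, the resize boils down to something along these lines (the size here is simply what my metadata LV ended up at, see the lvs output further down; adjust to your pool):

  # grow the metadata LV of thin pool hdd/data to 1 GiB
  lvresize --poolmetadatasize 1G hdd/data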

[3630930.387272] device-mapper: thin: No free metadata blocks
[3630930.393240] device-mapper: thin: 253:3: switching pool to read-only mode
[3630931.632043] device-mapper: thin: 253:3: unable to service pool target messages in READ_ONLY or FAIL mode
[3633079.042694] device-mapper: thin: 253:3: growing the metadata device from 25600 to 51200 blocks
[3633079.042699] device-mapper: thin: 253:3: switching pool to write mode
[3633462.550930] device-mapper: thin: 253:3: growing the metadata device from 51200 to 76800 blocks
[3633489.659827] device-mapper: thin: 253:3: growing the metadata device from 76800 to 90112 blocks
[3633497.712154] device-mapper: thin: 253:3: growing the metadata device from 90112 to 103424 blocks
[3633514.576483] device-mapper: thin: 253:3: growing the metadata device from 103424 to 116736 blocks
[3633531.417242] device-mapper: thin: 253:3: growing the metadata device from 116736 to 130048 blocks
[3633552.498115] device-mapper: thin: 253:3: growing the metadata device from 130048 to 143360 blocks
[3633563.294272] device-mapper: thin: 253:3: growing the metadata device from 143360 to 156672 blocks
[3633573.562695] device-mapper: thin: 253:3: growing the metadata device from 156672 to 169984 blocks
[3633595.843620] device-mapper: thin: 253:3: growing the metadata device from 169984 to 262144 blocks

The pool is writable again and not overcommitted.

root at hrd-srv:~# lvs -a hdd
  LV              VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  data            hdd twi-aotz--   1.00t             16.86  10.36
  [data_tdata]    hdd Twi-ao----   1.00t
  [data_tmeta]    hdd ewi-ao----   1.00g
  [lvol0_pmspare] hdd ewi------- 100.00m
  vm-100-disk-0   hdd Vwi-aotz-- 500.00g data        23.02
  vm-102-disk-1   hdd Vwi-a-tz--  14.90g data        22.66
  vz              hdd -wi-ao----   1.00t
root at pve:~# vgs hdd
  VG  #PV #LV #SN Attr   VSize  VFree
  hdd   4   4   0 wz--n- <3.64t <1.64t

Nevertheless, the third storage migration attempt failed with the following error:

Thin pool hdd-data-tpool (253:3) transaction_id is 17, while expected 18.
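
The only candidate I have found so far is lvconvert --repair from lvmthin(7), but I am hesitant to run it blindly on a pool that still has running guests on it:

  # repair the thin pool metadata via thin_repair (NOT run yet - looking for confirmation first)
  lvconvert --repair hdd/data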

Any advice on how to recover from this situation?

Thank you

Stefan


FIRST ATTEMPT

create full clone of drive scsi2 (ssd:vm-103-disk-2)
Logical volume "vm-103-disk-0" created.
drive mirror is starting for drive-scsi2
drive-scsi2: transferred 33.0 MiB of 100.0 GiB (0.03%) in 0s
...
drive-scsi2: transferred 54.1 GiB of 100.0 GiB (54.11%) in 4m 41s
drive-scsi2: Cancelling block job
drive-scsi2: Done.
device-mapper: message ioctl on (253:3) failed: Operation not supported
Failed to process message "delete 3".
Failed to suspend hdd/data with queued messages.
lvremove 'hdd/vm-103-disk-0' error: Failed to update pool hdd/data.
TASK ERROR: storage migration failed: block job (mirror) error: drive-scsi2: 'mirror' has been cancelled

SECOND ATTEMPT

create full clone of drive scsi2 (ssd:vm-103-disk-2)
WARNING: Remaining free space in metadata of thin pool hdd/data is too low (100.00% >= 96.00%). Resize is recommended.
TASK ERROR: storage migration failed: lvcreate 'hdd/vm-103-disk-0' error: Cannot create new thin volume, free space in thin pool hdd/data reached threshold.

THIRD ATTEMPT

create full clone of drive scsi2 (ssd:vm-103-disk-2)
Thin pool hdd-data-tpool (253:3) transaction_id is 17, while expected 18.
TASK ERROR: storage migration failed: lvcreate 'hdd/vm-103-disk-0' error: Failed to suspend hdd/data with queued messages.






