[PVE-User] V4.1: "Move Disk" function leads to file system corruption

Emmanuel Kasper e.kasper at proxmox.com
Thu Mar 10 10:20:33 CET 2016



On 03/09/2016 07:14 PM, Stefan Plattner wrote:
> Hello everyone!
> 
> I used the "Move Disk" function in the "Hardware" tab of a
> stopped/offline Windows-VM. After the process was finished, I re-started
> the VM and the guest greeted me with the following, in this case
> SQLServer-related, (fatal) error message:
> 
> "Could not open error log file Operating system error = 1392"
> 
> Running a chkdsk inside the VM revealed several file system errors but
> even after "chkdsk /f", SQLServer still fails to start with "sql server
> error 9004"...
> 
> Switching back to the original disk (raw format) resolves all these
> filesystem problems and SQLServer starts fine. No disk problems reported
> with chkdsk (in the guest).
> 
> Examining what is happening when executing a "Move disk", ps revealed
> the following command line:
> 
> "/usr/bin/qemu-img convert -t writeback -p -n -f raw -O raw
> /var/lib/vz/images/501/vm-501-disk-3.raw
> /mnt/pve/prox02-ssd-lv/images/501/vm-501-disk-2.raw"
> 
> So what seems to happen, is not a simple fs move/copy, but a conversion
> from raw to raw...
> The resulting file is also different from the original:
> "cmp /mnt/pve/prox02-ssd-lv/images/501/vm-501-disk-2.raw vm-501-disk-3.raw
> vm-501-disk-2.raw vm-501-disk-3.raw differ: byte 1228841, line 737"
> 
> I don't understand why a "convert" is happening at all (same image
> format), but at the moment the result is a binary-different image
> which is causing fatal filesystem errors in the (Windows) guest.


Hi Stefan

SQL Server is known to be picky about the filesystem.

Is the problem repeatable if you run the qemu-img command as
displayed above?

If yes, can you try it again using the switch -t writethrough instead
of -t writeback?
This should be slower but safer.

It might be that by using writeback, some blocks are still in the Linux
kernel page cache, and the filesystem inside the VM is not consistent if
you start the VM before the kernel has flushed the cache.
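
A rough sketch of such a test, assuming the paths from your ps output and
that the target image already exists (the -n switch skips target creation).
The function names are only illustrative and not part of any Proxmox tool.
It runs the copy with write-through caching, flushes the page cache with
sync, and then repeats your cmp check:

```shell
#!/bin/sh
# Sketch only: paths are taken from the command line quoted in the report.
SRC=/var/lib/vz/images/501/vm-501-disk-3.raw
DST=/mnt/pve/prox02-ssd-lv/images/501/vm-501-disk-2.raw

# Re-run the conversion with write-through caching instead of writeback.
# Assumes the target already exists, because -n skips target creation.
convert_writethrough() {
    qemu-img convert -t writethrough -p -n -f raw -O raw "$1" "$2"
}

# Flush dirty pages to disk, then compare both images byte for byte.
# Returns 0 if the files are identical, non-zero otherwise.
images_identical() {
    sync
    cmp -s "$1" "$2"
}

# Usage (nothing is run automatically when this file is sourced):
#   convert_writethrough "$SRC" "$DST" \
#       && images_identical "$SRC" "$DST" \
#       && echo "images match"
```

If the images still differ after the sync, the page cache is probably not
the cause and the conversion itself would need a closer look.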

Emmanuel







