[PVE-User] ZFS Checksum Error in VM but not on Host

Harald Leithner leithner at itronic.at
Mon Jul 24 17:21:04 CEST 2017


Hi,

On 24.07.2017 at 13:32, Yannis Milios wrote:
> Hello,
> 
>>> RAIDZ1 (2 Disks) -> qemu -> ZFS (1 Disk)
> 
> 
> Is there any particular reason for having this kind of setup? I mean, in
> general, using ZFS inside a VM is not recommended.

Two reasons: first, checksums ^^; second, snapshots.
I also prefer ZFS over any other filesystem.

What is the reason ZFS is not recommended inside a VM?

> 
> 
>>>         NAME        STATE     READ WRITE CKSUM
>>>         backup      ONLINE       0     0    13
>>>           sdc       ONLINE       0     0    26
> 
>>> errors: Permanent errors have been detected in the following files:
> 
>>> /battlefield/backup/kunde/server/data/d7/d79c0feb29ef024ce0164253ee08e6daa986bd1d599f4640167de2c3d7828524
> 
> Apparently you had checksum errors which led to corruption of that file.
> Since this pool is not redundant, you will have to delete the file, restore
> it from a backup and then scrub the volume.
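
(For the archive, a rough sketch of those steps. It only prints the commands so they can be reviewed before anything is run. The function name print_recovery_plan and the example path are placeholders of mine, and the restore step is left as a comment because it depends on the backup tooling in use. As far as I know, the file can stay listed under "errors:" until a later scrub has completed.)

```shell
#!/bin/sh
# Print (do not run) the recovery commands for one damaged file on a
# non-redundant pool.
#   $1 = pool name
#   $2 = path of the damaged file
# The restore step is only a comment: the backup tooling is not known here.
print_recovery_plan() {
    pool=$1
    file=$2
    printf 'rm -- "%s"\n' "$file"
    printf '# restore "%s" from a known-good copy here\n' "$file"
    printf 'zpool clear %s\n' "$pool"      # reset the error counters
    printf 'zpool scrub %s\n' "$pool"      # re-verify every block in the pool
    printf 'zpool status -v %s\n' "$pool"  # check the result afterwards
}

# Example call; "backup" is the pool name from the status output,
# the path is a placeholder.
print_recovery_plan backup /path/to/damaged/file
```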

I understand the error and the solution, but not really why it happened.
In the meantime I got an answer from Wolfgang Link, who thinks it could
be a bit flip in memory...
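
To compare the error counters in the guest and on the host, something like the following could pull the non-zero CKSUM entries out of `zpool status`. This is only a sketch: the function name nonzero_cksum is made up, and it assumes the usual NAME/STATE/READ/WRITE/CKSUM table layout seen in the status outputs quoted further down.

```shell
#!/bin/sh
# Print every pool/vdev row of `zpool status` output whose CKSUM counter
# is not zero. Reads the status text on stdin, so it also works on saved
# output.
nonzero_cksum() {
    awk '
        # The config table starts after the NAME ... CKSUM header row.
        $1 == "NAME" && $5 == "CKSUM" { in_table = 1; next }
        # A blank line or the "errors:" line ends the table.
        in_table && (NF == 0 || $1 == "errors:") { in_table = 0 }
        # Inside the table, field 5 is the checksum error count.
        in_table && NF >= 5 && $5 != "0" { print $1, $5 }
    '
}

# Usage on a live system:
#   zpool status backup | nonzero_cksum    # inside the VM
#   zpool status slow   | nonzero_cksum    # on the host
```

Fed the guest output from the original mail, this prints "backup 13" and "sdc 26"; fed the host output, it prints nothing.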

Do you know of any other filesystem that supports checksumming and might
be better suited for this job?


> 
> Yannis

Harald


> 
> On Mon, Jul 24, 2017 at 11:11 AM, Harald Leithner <leithner at itronic.at>
> wrote:
> 
>> Hi,
>>
>> I'm not sure if this is Proxmox/QEMU related, but I'll try here.
>>
>> We have a VM on a ZFS Pool with Proxmox kernel for ZFS, so the result is
>>
>> RAIDZ1 (2 Disks) -> qemu -> ZFS (1 Disk)
>>
>> We got 2 Mails from ZFS inside the VM:
>>
>> ---
>>
>> ZFS has detected a checksum error:
>>
>>     eid: 37
>>   class: checksum
>>    host: backup
>>    time: 2017-07-21 15:07:59+0200
>>   vtype: disk
>>   vpath: /dev/sdc1
>>   vguid: 0x003AC1491C2AC7D2
>>   cksum: 1
>>    read: 0
>>   write: 0
>>    pool: backup
>>
>> ---
>>
>> ZFS has detected a data error:
>>
>>     eid: 36
>>   class: data
>>    host: backup
>>    time: 2017-07-21 15:07:59+0200
>>    pool: backup
>>
>> ---
>>
>> The status of the zfs pool inside the VM:
>>
>> ---
>>
>>   zpool status -v
>>    pool: backup
>>   state: ONLINE
>> status: One or more devices has experienced an error resulting in data
>>          corruption.  Applications may be affected.
>> action: Restore the file in question if possible.  Otherwise restore the
>>          entire pool from backup.
>>     see: http://zfsonlinux.org/msg/ZFS-8000-8A
>>    scan: scrub repaired 0 in 0h42m with 0 errors on Sun Jul  9 01:06:26 2017
>> config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          backup      ONLINE       0     0    13
>>            sdc       ONLINE       0     0    26
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>
>> /battlefield/backup/kunde/server/data/d7/d79c0feb29ef024ce0164253ee08e6daa986bd1d599f4640167de2c3d7828524
>>
>> ---
>>
>> But on the host there is no error:
>>
>>   zpool status -v
>>
>>    pool: slow
>>   state: ONLINE
>>    scan: scrub in progress since Fri Jul 21 15:45:13 2017
>>      54.0G scanned out of 486G at 60.8M/s, 2h1m to go
>>      0 repaired, 11.11% done
>> config:
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          slow        ONLINE       0     0     0
>>            mirror-0  ONLINE       0     0     0
>>              sda2    ONLINE       0     0     0
>>              sdb2    ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>>
>> (The scrub has since finished with no errors.)
>>
>> ---
>>
>> HOST:
>>
>>   pveversion --verbose
>> proxmox-ve: 5.0-16 (running kernel: 4.10.15-1-pve)
>> pve-manager: 5.0-23 (running version: 5.0-23/af4267bf)
>> pve-kernel-4.10.15-1-pve: 4.10.15-15
>> pve-kernel-4.4.35-1-pve: 4.4.35-77
>> pve-kernel-4.10.8-1-pve: 4.10.8-7
>> pve-kernel-4.4.59-1-pve: 4.4.59-87
>> pve-kernel-4.10.11-1-pve: 4.10.11-9
>> pve-kernel-4.10.17-1-pve: 4.10.17-16
>> libpve-http-server-perl: 2.0-5
>> lvm2: 2.02.168-pve2
>> corosync: 2.4.2-pve3
>> libqb0: 1.0.1-1
>> pve-cluster: 5.0-12
>> qemu-server: 5.0-14
>> pve-firmware: 2.0-2
>> libpve-common-perl: 5.0-16
>> libpve-guest-common-perl: 2.0-11
>> libpve-access-control: 5.0-5
>> libpve-storage-perl: 5.0-12
>> pve-libspice-server1: 0.12.8-3
>> vncterm: 1.5-2
>> pve-docs: 5.0-9
>> pve-qemu-kvm: 2.9.0-2
>> pve-container: 2.0-14
>> pve-firewall: 3.0-2
>> pve-ha-manager: 2.0-2
>> ksm-control-daemon: 1.2-2
>> glusterfs-client: 3.8.8-1
>> lxc-pve: 2.0.8-3
>> lxcfs: 2.0.7-pve2
>> criu: 2.11.1-1~bpo90
>> novnc-pve: 0.6-4
>> smartmontools: 6.5+svn4324-1
>> zfsutils-linux: 0.6.5.9-pve16~bpo90
>>
>> ---
>>
>> VM:
>>
>> Linux backup 4.10.15-1-pve #1 SMP PVE 4.10.15-12 (Mon, 12 Jun 2017
>> 11:18:07 +0200) x86_64 GNU/Linux
>>
>> zfsutils-linux: 0.6.5.9-pve16~bpo90
>>
>>
>> Any hints would be much appreciated!
>>
>> bye
>> Harald
>>
>>
>> --
>> Harald Leithner
>>
>> ITronic
>> Wiedner Hauptstraße 120/5.1, 1050 Wien, Austria
>> Tel: +43-1-545 0 604
>> Mobil: +43-699-123 78 4 78
>> Mail: leithner at itronic.at | itronic.at
>> _______________________________________________
>> pve-user mailing list
>> pve-user at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>

-- 
Harald Leithner

ITronic
Wiedner Hauptstraße 120/5.1, 1050 Wien, Austria
Tel: +43-1-545 0 604
Mobil: +43-699-123 78 4 78
Mail: leithner at itronic.at | itronic.at


