[PVE-User] replication problems after upgrade to 5.0-15

nmachkova at verdnatura.es
Wed Nov 29 12:44:44 CET 2017


I have been using ZFS for CTs on a 2-node PROXMOX cluster (no HA) since
August 2017.
I installed proxmox-ve_5.0-af4267bf-4.iso on 2 old servers,
and proxmox is really great 8-)))
But I upgraded both nodes and now something is wrong: I am unable to do
CT replications or migrations.

example of replication of CT 106 (goat) from node mox11 => mox

========== errors from webGUI

2017-11-29 10:42:01 106-0: start replication job
2017-11-29 10:42:01 106-0: guest => CT 106, running => 1
2017-11-29 10:42:01 106-0: volumes => zfs:subvol-106-disk-1
2017-11-29 10:42:02 106-0: freeze guest filesystem
2017-11-29 10:42:03 106-0: create snapshot '__replicate_106-0_1511948521__' on zfs:subvol-106-disk-1
2017-11-29 10:42:03 106-0: thaw guest filesystem
2017-11-29 10:42:03 106-0: full sync 'zfs:subvol-106-disk-1' (__replicate_106-0_1511948521__)
2017-11-29 10:42:04 106-0: internal error: Invalid argument
2017-11-29 10:42:04 106-0: command 'zfs send -Rpv -- ctpool/subvol-106-disk-1@__replicate_106-0_1511948521__' failed: got signal 6
2017-11-29 10:42:04 106-0: cannot receive: failed to read from stream
2017-11-29 10:42:04 106-0: cannot open 'ctpool/subvol-106-disk-1': dataset does not exist
2017-11-29 10:42:04 106-0: command 'zfs recv -F -- ctpool/subvol-106-disk-1' failed: exit code 1
2017-11-29 10:42:04 106-0: delete previous replication snapshot '__replicate_106-0_1511948521__' on zfs:subvol-106-disk-1
2017-11-29 10:42:04 106-0: end replication job with error: command 'set -o pipefail && pvesm export zfs:subvol-106-disk-1 zfs - -with-snapshots 1 -snapshot __replicate_106-0_1511948521__ | /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=mox' root@172.16.251.8 -- pvesm import zfs:subvol-106-disk-1 zfs - -with-snapshots 1' failed: exit code 1
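
the interesting part for me is that 'zfs send' itself dies with signal 6 (SIGABRT)
and "internal error: Invalid argument" before anything Proxmox-specific happens,
and the receive on mox then fails because the stream breaks and the dataset does
not exist there. maybe I should try the send alone by hand on mox11, outside the
GUI, with a throwaway snapshot name of my own, just to see if the abort is
reproducible? something like:

# zfs snapshot ctpool/subvol-106-disk-1@manualtest
# zfs send -Rpv -- ctpool/subvol-106-disk-1@manualtest > /dev/null
# zfs destroy ctpool/subvol-106-disk-1@manualtest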

=========== version (same at both nodes)

proxmox-ve: 5.0-15 (running kernel: 4.10.15-1-pve)
pve-manager: 5.1-36 (running version: 5.1-36/131401db)
pve-kernel-4.10.15-1-pve: 4.10.15-15
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-12
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.0-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.3-pve1~bpo9

========== zfs list (both nodes)

=== node mox
# zfs list -t all -r ctpool
NAME                                        USED  AVAIL  REFER  MOUNTPOINT
ctpool                                     3.41G  25.4G   112K  /ctpool
ctpool/subvol-110-disk-1                    503M   521M   503M  /ctpool/subvol-110-disk-1
ctpool/subvol-251-disk-1                    478M  3.54G   468M  /ctpool/subvol-251-disk-1
ctpool/subvol-251-disk-1@dyn01sharednet    10.1M      -   467M  -
ctpool/subvol-252-disk-1                    478M  3.54G   468M  /ctpool/subvol-252-disk-1
ctpool/subvol-252-disk-1@dyn01sharednet    10.2M      -   468M  -
ctpool/subvol-301-disk-1                    690M  3.49G   518M  /ctpool/subvol-301-disk-1
ctpool/subvol-301-disk-1@silla70y100        151M      -   538M  -
ctpool/subvol-301-disk-1@silla70y100fixed  18.3M      -   526M  -
ctpool/subvol-302-disk-1                    531M  3.54G   470M  /ctpool/subvol-302-disk-1
ctpool/subvol-302-disk-1@silla70y100       40.3M      -   472M  -
ctpool/subvol-302-disk-1@silla70y100fixed  17.6M      -   478M  -
ctpool/subvol-501-disk-1                    402M   110M   402M  /ctpool/subvol-501-disk-1
ctpool/subvol-502-disk-1                    388M   124M   388M  /ctpool/subvol-502-disk-1

=== node mox11
# zfs list -t all -r ctpool
NAME                                     USED  AVAIL  REFER  MOUNTPOINT
ctpool                                  3.97G  24.8G    96K  /ctpool
ctpool/subvol-102-disk-1                 589M  7.42G   589M  /ctpool/subvol-102-disk-1
ctpool/subvol-103-disk-1                 870M  7.16G   855M  /ctpool/subvol-103-disk-1
ctpool/subvol-103-disk-1@campana        9.42M      -   756M  -
ctpool/subvol-103-disk-1@beforenodeupg  4.87M      -   855M  -
ctpool/subvol-106-disk-1                 427M  3.59G   424M  /ctpool/subvol-106-disk-1
ctpool/subvol-106-disk-1@goat_apache2   3.08M      -   423M  -
ctpool/subvol-111-disk-1                1.65G  6.58G  1.42G  /ctpool/subvol-111-disk-1
ctpool/subvol-111-disk-1@postgresql     53.4M      -   513M  -
ctpool/subvol-111-disk-1@phppgadmin     2.00M      -   524M  -
ctpool/subvol-111-disk-1@pgwebssl       1.93M      -   524M  -
ctpool/subvol-111-disk-1@redmine01      28.8M      -   714M  -
ctpool/subvol-111-disk-1@redmine02      31.5M      -   882M  -
ctpool/subvol-111-disk-1@redmine03      43.4M      -  1.42G  -
ctpool/subvol-111-disk-1@redmine04      19.0M      -  1.42G  -
ctpool/subvol-111-disk-1@redmine05      7.14M      -  1.42G  -
ctpool/subvol-220-disk-1                 470M   554M   470M  /ctpool/subvol-220-disk-1

============== node mox11
# zpool history ctpool
2017-11-28.14:26:08 zfs destroy ctpool/subvol-106-disk-1@__replicate_106-0_1511875560__
2017-11-28.14:32:05 zpool import -c /etc/zfs/zpool.cache -aN
2017-11-28.15:10:04 zfs snapshot ctpool/subvol-103-disk-1@__replicate_103-0_1511878201__
2017-11-28.15:10:09 zfs destroy ctpool/subvol-103-disk-1@__replicate_103-0_1511878201__
2017-11-28.15:15:04 zfs snapshot ctpool/subvol-103-disk-1@__replicate_103-0_1511878501__
2017-11-28.15:15:09 zfs destroy ctpool/subvol-103-disk-1@__replicate_103-0_1511878501__
2017-11-28.15:43:59 zfs snapshot ctpool/subvol-106-disk-1@__migration__
2017-11-28.15:44:04 zfs destroy ctpool/subvol-106-disk-1@__migration__
2017-11-29.10:42:04 zfs snapshot ctpool/subvol-106-disk-1@__replicate_106-0_1511948521__
2017-11-29.10:42:09 zfs destroy ctpool/subvol-106-disk-1@__replicate_106-0_1511948521__
2017-11-29.10:47:04 zfs snapshot ctpool/subvol-106-disk-1@__replicate_106-0_1511948821__
2017-11-29.10:47:09 zfs destroy ctpool/subvol-106-disk-1@__replicate_106-0_1511948821__
2017-11-29.10:57:04 zfs snapshot ctpool/subvol-106-disk-1@__replicate_106-0_1511949421__
2017-11-29.10:57:09 zfs destroy ctpool/subvol-106-disk-1@__replicate_106-0_1511949421__

============== node mox
# zpool history ctpool
2017-11-28.05:10:40 zfs get -o value -Hp available,used ctpool
2017-11-28.05:33:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.06:03:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.06:09:59 zpool list -o name -H ctpool
2017-11-28.06:21:30 zfs get -o value -Hp available,used ctpool
2017-11-28.06:33:10 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.07:03:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.07:04:52 zpool list -o name -H ctpool
2017-11-28.07:04:57 zfs get -o value -Hp available,used ctpool
2017-11-28.07:33:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.08:03:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.08:04:21 zfs get -o value -Hp available,used ctpool
2017-11-28.08:07:41 zpool list -o name -H ctpool
2017-11-28.08:33:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.08:56:12 zpool list -o name -H ctpool
2017-11-28.09:03:10 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.09:04:21 zfs get -o value -Hp available,used ctpool
2017-11-28.09:16:59 zfs get -o value -Hp available,used ctpool
2017-11-28.09:33:08 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.10:03:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.10:33:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.11:03:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.11:13:29 zpool list -o name -H ctpool
2017-11-28.11:33:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.12:03:09 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.12:09:49 zpool list -o name -H ctpool
2017-11-28.12:22:47 zpool list -o name -H ctpool
2017-11-28.12:33:08 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.12:38:41 zpool list -o name -H ctpool
2017-11-28.12:42:11 zpool list -o name -H ctpool
2017-11-28.13:07:37 zfs get -o value -Hp available,used ctpool
2017-11-28.13:10:59 zpool import -c /etc/zfs/zpool.cache -aN
2017-11-28.13:33:10 zfs rollback -r -- ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.13:43:02 zfs destroy ctpool/subvol-106-disk-1@__replicate_106-0_1511251201__
2017-11-28.13:43:09 zfs destroy -r ctpool/subvol-106-disk-1
2017-11-28.14:12:19 zfs destroy ctpool/subvol-103-disk-1
2017-11-29.04:50:24 zfs get -o value -Hp available,used ctpool
2017-11-29.10:39:14 zfs set compression=lz4 ctpool


# zpool status
  pool: ctpool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
	still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
	the pool may no longer be accessible by software that does not support
	the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 0h6m with 0 errors on Sun Nov 26 00:30:42 2017
config:

	NAME        STATE     READ WRITE CKSUM
	ctpool      ONLINE       0     0     0
	  zfs       ONLINE       0     0     0

errors: No known data errors

root@mox:~# zfs get all ctpool
NAME    PROPERTY              VALUE                  SOURCE
ctpool  type                  filesystem             -
ctpool  creation              Mon Jul 31 16:48 2017  -
ctpool  used                  3.41G                  -
ctpool  available             25.4G                  -
ctpool  referenced            112K                   -
ctpool  compressratio         1.87x                  -
ctpool  mounted               yes                    -
ctpool  quota                 none                   default
ctpool  reservation           none                   default
ctpool  recordsize            128K                   default
ctpool  mountpoint            /ctpool                default
ctpool  sharenfs              off                    default
ctpool  checksum              on                     default
ctpool  compression           lz4                    local
ctpool  atime                 on                     default
ctpool  devices               on                     default
ctpool  exec                  on                     default
ctpool  setuid                on                     default
ctpool  readonly              off                    default
ctpool  zoned                 off                    default
ctpool  snapdir               hidden                 default
ctpool  aclinherit            restricted             default
ctpool  createtxg             1                      -
ctpool  canmount              on                     default
ctpool  xattr                 on                     default
ctpool  copies                1                      default
ctpool  version               5                      -
ctpool  utf8only              off                    -
ctpool  normalization         none                   -
ctpool  casesensitivity       sensitive              -
ctpool  vscan                 off                    default
ctpool  nbmand                off                    default
ctpool  sharesmb              off                    default
ctpool  refquota              none                   default
ctpool  refreservation        none                   default
ctpool  guid                  2622709745618035732    -
ctpool  primarycache          all                    default
ctpool  secondarycache        all                    default
ctpool  usedbysnapshots       0B                     -
ctpool  usedbydataset         112K                   -
ctpool  usedbychildren        3.41G                  -
ctpool  usedbyrefreservation  0B                     -
ctpool  logbias               latency                default
ctpool  dedup                 off                    default
ctpool  mlslabel              none                   default
ctpool  sync                  standard               default
ctpool  dnodesize             legacy                 default
ctpool  refcompressratio      1.00x                  -
ctpool  written               112K                   -
ctpool  logicalused           5.79G                  -
ctpool  logicalreferenced     45.5K                  -
ctpool  volmode               default                default
ctpool  filesystem_limit      none                   default
ctpool  snapshot_limit        none                   default
ctpool  filesystem_count      none                   default
ctpool  snapshot_count        none                   default
ctpool  snapdev               hidden                 default
ctpool  acltype               off                    default
ctpool  context               none                   default
ctpool  fscontext             none                   default
ctpool  defcontext            none                   default
ctpool  rootcontext           none                   default
ctpool  relatime              off                    default
ctpool  redundant_metadata    all                    default
ctpool  overlay               off                    default

I disabled and removed ALL the replication tasks and tried to find some
answers in the output of
# systemctl status zed
# pvesr status
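
maybe the journal of the replication scheduler has more detail? I guess
something like this (assuming the replication jobs run from the
pvesr.timer / pvesr.service units on this version):

# systemctl list-timers | grep -i pvesr
# journalctl -u pvesr.service --since "2017-11-29 10:40"

(list-timers to confirm the job is scheduled at all, journalctl to see the
full error around the failing run)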

BUT this is interesting: when I tried to upgrade the pool as recommended
by zpool status, it failed

# zpool upgrade -a
This system supports ZFS pool feature flags.

cannot set property for 'ctpool': invalid argument for this pool operation
=============== end
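
could the problem be that the ZFS tools and the loaded kernel module do not
match? the version list above shows the running kernel is still
4.10.15-1-pve while zfsutils-linux is 0.7.3-pve1~bpo9, so maybe I should
compare what the loaded module reports with the userland packages, roughly
like this (the /sys/module path is just my guess at where ZFS on Linux
exposes the module version):

# uname -r
# cat /sys/module/zfs/version
# dpkg -l 'zfs*' | grep ^ii

if they disagree, I suppose that could also explain why 'zpool upgrade -a'
refuses with an invalid argument.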

hm, the replication error is occurring just after some "sync" operation, so
these are my packages related to sync

# dpkg -l | grep sync
ii  corosync                   2.4.2-pve3       amd64  cluster engine daemon and utilities
ii  libasyncns0:amd64          0.8-6            amd64  Asynchronous name service query library
ii  libcorosync-common4:amd64  2.4.2-pve3       amd64  cluster engine common library
ii  libevent-2.0-5:amd64       2.0.21-stable-3  amd64  Asynchronous event notification library
ii  libfile-sync-perl          0.11-2+b3        amd64  Perl interface to sync() and fsync()
ii  libpve-http-server-perl    2.0-6            all    Proxmox Asynchrounous HTTP Server Implementation
ii  rsync                      3.1.2-1          amd64  fast, versatile, remote (and local) file-copying tool

do I have to install pve-zsync ???
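it does not show up in my 'dpkg -l | grep sync' output above, so I assume it
is not even installed; I could confirm with something like

# dpkg -l pve-zsync
# apt-cache policy pve-zsync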
MANY THANKS for your time&energy amigos
Nada


