[PVE-User] Replica stuck after a network outage...

Marco Gaiarin gaio at lilliput.linux.it
Sun Nov 13 18:24:28 CET 2022


Situation: two servers connected by a direct 10 Gbit/s link. After
creating a new VM on one side and filling it with data, I enabled
replication.
The network link then went down; I'm still investigating why, but it works again now.

On the sending side I get:

2022-11-13 18:16:01 123-0: start replication job
2022-11-13 18:16:01 123-0: guest => VM 123, running => 6101
2022-11-13 18:16:01 123-0: volumes => local-zfs:vm-123-disk-0,rpool-data:vm-123-disk-0
2022-11-13 18:16:02 123-0: create snapshot '__replicate_123-0_1668359761__' on local-zfs:vm-123-disk-0
2022-11-13 18:16:02 123-0: create snapshot '__replicate_123-0_1668359761__' on rpool-data:vm-123-disk-0
2022-11-13 18:16:02 123-0: using insecure transmission, rate limit: 50 MByte/s
2022-11-13 18:16:02 123-0: full sync 'local-zfs:vm-123-disk-0' (__replicate_123-0_1668359761__)
2022-11-13 18:16:02 123-0: using a bandwidth limit of 50000000 bps for transferring 'local-zfs:vm-123-disk-0'
2022-11-13 18:16:04 123-0: full send of rpool/data/vm-123-disk-0@__replicate_123-0_1668359761__ estimated size is 17.9G
2022-11-13 18:16:04 123-0: total estimated size is 17.9G
2022-11-13 18:16:04 123-0: 1164 B 1.1 KB 0.44 s 2616 B/s 2.55 KB/s
2022-11-13 18:16:04 123-0: write: Broken pipe
2022-11-13 18:16:04 123-0: warning: cannot send 'rpool/data/vm-123-disk-0@__replicate_123-0_1668359761__': signal received
2022-11-13 18:16:04 123-0: cannot send 'rpool/data/vm-123-disk-0': I/O error
2022-11-13 18:16:04 123-0: command 'zfs send -Rpv -- rpool/data/vm-123-disk-0@__replicate_123-0_1668359761__' failed: exit code 1
2022-11-13 18:16:04 123-0: [svpve1] volume 'rpool/data/vm-123-disk-0' already exists
2022-11-13 18:16:04 123-0: delete previous replication snapshot '__replicate_123-0_1668359761__' on local-zfs:vm-123-disk-0
2022-11-13 18:16:04 123-0: delete previous replication snapshot '__replicate_123-0_1668359761__' on rpool-data:vm-123-disk-0
2022-11-13 18:16:04 123-0: end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-123-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_123-0_1668359761__ | /usr/bin/cstream -t 50000000' failed: exit code 2


and on the receiving side I see two stuck processes:

 root@svpve1:~# ps aux | grep [z]fs
 root     23000  0.0  0.1 301468 81720 ?        Ss   12:10   0:01 /usr/bin/perl /usr/sbin/pvesm import rpool-data:vm-123-disk-0 zfs tcp://10.5.251.0/24 -with-snapshots 1 -allow-rename 0
 root     23003  0.0  0.0   8956  3404 ?        S    12:10   0:11 zfs recv -F -- rpool-data/vm-123-disk-0

The start time of these processes matches the time of the crash.
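
The START column of `ps aux` shows only a short form of the start time (12:10 here). As a minimal sketch for comparing against the outage more precisely, the full start timestamp can be printed per PID. This assumes the procps version of `ps` (which provides the `lstart` field); the helper name `proc_start_info` is only an illustration, not part of any Proxmox tool.

```shell
#!/bin/sh
# Print PID, full start timestamp, elapsed time, state and command line
# for the given PIDs (a comma-separated list, as accepted by "ps -p").
# Assumption: procps "ps", which supports the "lstart" output field.
proc_start_info() {
    ps -o pid=,lstart=,etime=,stat=,args= -p "$1"
}

# Example with the two PIDs from the listing above:
# proc_start_info 23000,23003
```

This only reports when the processes started and what state they are in; it does not by itself say whether killing them is safe.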


Can I safely kill them? Thanks.

-- 
  It would appear that the two thieves crucified beside the Lord were
  socialists: indeed, they were thieves and occupied two seats out of three.
								(Anonymous)




