[PVE-User] replication failures
    Randy Bush 
    randy at psg.com
       
    Mon May 26 05:40:39 CEST 2025
    
    
  
three-node cluster, debian 12 / pve 8.4.1, zfs raidz2 on ssd, maybe 20 vms.
all vms replicate every 15 minutes (*/15) to the next node to the right.
on only one of a couple of similar clusters, and on only one particular
node of that cluster, we're getting replication failures of the nature of
    2025-05-26T00:16:17.643854+00:00 vm21 pvescheduler[2641364]: command 'zfs destroy images/vm-107-disk-0@__replicate_107-0_1748217943__' failed: got timeout
    2025-05-26T00:16:37.218095+00:00 vm21 pvescheduler[2641364]: 107-0: got unexpected replication job error - command 'zfs snapshot images/vm-107-disk-0@__replicate_107-0_1748218563__' failed: got timeout
five to 15 times a day.  zfs load?  flaky disk (smartmon reports
nothing)?  weak ether?  moon in klutz?
how do folk diagnose?
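
one possible first step, as a rough sketch only: tally the timeouts by hour and by zfs subcommand, to see whether they cluster around something periodic (scrub, backup window) or are spread evenly. the regex assumes log lines shaped exactly like the two above; the function name tally_timeouts is made up for illustration.

```python
import re
from collections import Counter

# Assumption: lines look like the two pvescheduler entries quoted above
# (ISO timestamp, host, pvescheduler[pid], "command 'zfs <op> <target>' failed: got timeout").
LINE_RE = re.compile(
    r"^(?P<hour>\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}\S*\s+"
    r"(?P<host>\S+)\s+pvescheduler\[\d+\]:.*"
    r"command 'zfs (?P<op>\w+) (?P<target>[^']+)' failed: got timeout"
)

def tally_timeouts(lines):
    """Count zfs timeout lines per hour and per zfs subcommand.

    Returns (by_hour, by_op), both collections.Counter objects.
    Lines that do not match the assumed shape are ignored.
    """
    by_hour = Counter()
    by_op = Counter()
    for line in lines:
        m = LINE_RE.search(line.strip())
        if m:
            by_hour[m.group("hour")] += 1  # key like "2025-05-26T00"
            by_op[m.group("op")] += 1      # key like "destroy" or "snapshot"
    return by_hour, by_op
```

feed it any iterable of log lines, e.g. an open file handle on whichever syslog file these entries came from, and compare the busy hours against whatever else runs on that node.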
randy
    
    