[PVE-User] replication failures

DERUMIER, Alexandre alexandre.derumier at groupe-cyllene.com
Mon May 26 17:11:00 CEST 2025


How long does it take if you run the destroy command manually?
(zfs destroy images/vm-107-disk-0@__replicate_107-0_1748217943__)


(Maybe the timeout in the code is too short?)
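
A rough sketch of how the timing could be measured, so the result can be compared with whatever timeout the scheduler applies. `time_cmd` is a made-up helper name, not part of Proxmox or ZFS, and it assumes GNU `date` (for `%N`). The snapshot name `timing-test` in the usage comment is also hypothetical; it is there for anyone who would rather time a throwaway snapshot than touch the replication snapshots.

```shell
# time_cmd (hypothetical helper): run a command, print the elapsed
# wall-clock time in milliseconds to stderr, and return the command's
# own exit status.
time_cmd() {
    local start end status
    start=$(date +%s%N)              # nanoseconds since the epoch (GNU date)
    status=0
    "$@" || status=$?                # run the command, remember its exit code
    end=$(date +%s%N)
    printf 'elapsed: %d ms (exit %d): %s\n' \
        "$(( (end - start) / 1000000 ))" "$status" "$*" >&2
    return "$status"
}

# Usage on the affected node (needs root and the pool, so not run here):
#   time_cmd zfs snapshot images/vm-107-disk-0@timing-test
#   time_cmd zfs destroy  images/vm-107-disk-0@timing-test
```

Running this a few times, ideally around the moments when the failures show up, should indicate whether the commands are merely slow or actually hang.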


-------- Original Message --------
From: Randy Bush <randy at psg.com>
Reply-To: Proxmox VE user list <pve-user at lists.proxmox.com>
To: ProxMox Users <pve-user at lists.proxmox.com>
Subject: [PVE-User] replication failures
Date: 26/05/2025 05:40:39

three node debian-12 8.4.1 zfs raidz2 ssd cluster, maybe 20vms, all vms
replicate /15 to the next node to the right.  

on only one of a couple of similar clusters, and on only one
particular node, we're getting replication failures of the nature of

    2025-05-26T00:16:17.643854+00:00 vm21 pvescheduler[2641364]: command 'zfs destroy images/vm-107-disk-0@__replicate_107-0_1748217943__' failed: got timeout
    2025-05-26T00:16:37.218095+00:00 vm21 pvescheduler[2641364]: 107-0: got unexpected replication job error - command 'zfs snapshot images/vm-107-disk-0@__replicate_107-0_1748218563__' failed: got timeout

five to 15 times a day.  zfs load?  flaky disk (smartmon reports
nothing)?  weak ether?  moon in klutz?
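
One possible first step, as a sketch only: tally the timeout lines per hour and see whether they cluster around backups, scrubs, or other scheduled load. `timeouts_by_hour` is a made-up helper name. It assumes each log entry is on a single line starting with an ISO timestamp, as in the sample above, and the `journalctl` invocation in the usage comment is an assumption about where pvescheduler logs end up on the node.

```shell
# timeouts_by_hour (hypothetical helper): read log lines on stdin and
# print "count YYYY-MM-DDTHH" for each hour that has zfs timeout lines.
timeouts_by_hour() {
    grep -E "command 'zfs [^']*' failed: got timeout" |
        cut -c1-13 |                 # keep "YYYY-MM-DDTHH" from the timestamp
        sort |
        uniq -c
}

# Usage (assumed log source; adjust to the node's logging setup):
#   journalctl -t pvescheduler -o short-iso-precise | timeouts_by_hour
```

If the counts pile up in the same hours every day, that points toward load; if they are spread evenly, a hardware or pool-level cause seems more likely.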

how do folk diagnose?

randy
