[PVE-User] DRBD out of sync

Fábio Rabelo fabio at fabiorabelo.wiki.br
Mon Dec 12 12:34:00 CET 2011


Hi to all ...

I am not sure whether my question belongs here, but if it does not,
someone will probably tell me where the right place to get help is.

I am getting this in the log:

Dec 12 09:03:38 host2 kernel: block drbd0: conn( StandAlone -> Unconnected )
Dec 12 09:03:38 host2 kernel: block drbd0: Starting receiver thread
(from drbd0_worker [3410])
Dec 12 09:03:38 host2 kernel: block drbd0: receiver (re)started
Dec 12 09:03:38 host2 kernel: block drbd0: conn( Unconnected -> WFConnection )
Dec 12 09:03:38 host2 kernel: block drbd1: conn( StandAlone -> Unconnected )
Dec 12 09:03:38 host2 kernel: block drbd1: Starting receiver thread
(from drbd1_worker [3423])
Dec 12 09:03:38 host2 kernel: block drbd1: receiver (re)started
Dec 12 09:03:38 host2 kernel: block drbd1: conn( Unconnected -> WFConnection )
Dec 12 09:04:46 host2 kernel: block drbd1: Handshake successful:
Agreed network protocol version 91
Dec 12 09:04:46 host2 kernel: block drbd0: Handshake successful:
Agreed network protocol version 91
Dec 12 09:04:46 host2 kernel: block drbd0: Peer authenticated using 20
bytes of 'sha1' HMAC
Dec 12 09:04:46 host2 kernel: block drbd1: Peer authenticated using 20
bytes of 'sha1' HMAC
Dec 12 09:04:46 host2 kernel: block drbd0: conn( WFConnection ->
WFReportParams )
Dec 12 09:04:46 host2 kernel: block drbd1: conn( WFConnection ->
WFReportParams )
Dec 12 09:04:46 host2 kernel: block drbd0: Starting asender thread
(from drbd0_receiver [22818])
Dec 12 09:04:46 host2 kernel: block drbd1: Starting asender thread
(from drbd1_receiver [22823])
Dec 12 09:04:46 host2 kernel: block drbd1: data-integrity-alg: <not-used>
Dec 12 09:04:46 host2 kernel: block drbd0: data-integrity-alg: <not-used>
Dec 12 09:04:46 host2 kernel: block drbd1: drbd_sync_handshake:
Dec 12 09:04:46 host2 kernel: block drbd0: drbd_sync_handshake:
Dec 12 09:04:46 host2 kernel: block drbd1: self
407980FB1CA92BE9:8F2945448A9D8C93:92878B68661C628D:3F1D44101917C6D7
bits:0 flags:0
Dec 12 09:04:46 host2 kernel: block drbd1: peer
EB1CB85D85D75A19:8F2945448A9D8C93:92878B68661C628D:3F1D44101917C6D7
bits:13613642 flags:0
Dec 12 09:04:46 host2 kernel: block drbd1: uuid_compare()=100 by rule 90
Dec 12 09:04:46 host2 kernel: block drbd1: Split-Brain detected,
dropping connection!
Dec 12 09:04:46 host2 kernel: block drbd0: self
42DDF89DDC6DEB5B:29B9739AEF60D14D:9903D190A5CAB801:4E0E807BBA9F1DA9
bits:835532 flags:0
Dec 12 09:04:46 host2 kernel: block drbd0: peer
288F1E6264B05C47:29B9739AEF60D14D:9903D190A5CAB800:4E0E807BBA9F1DA9
bits:1558653 flags:0
Dec 12 09:04:46 host2 kernel: block drbd0: uuid_compare()=100 by rule 90
Dec 12 09:04:46 host2 kernel: block drbd0: Split-Brain detected,
dropping connection!
Dec 12 09:04:46 host2 kernel: block drbd0: helper command:
/sbin/drbdadm split-brain minor-0
Dec 12 09:04:46 host2 kernel: block drbd0: helper command:
/sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
Dec 12 09:04:46 host2 kernel: block drbd0: conn( WFReportParams ->
Disconnecting )
Dec 12 09:04:46 host2 kernel: block drbd0: error receiving ReportState, l: 4!
Dec 12 09:04:46 host2 kernel: block drbd0: asender terminated
Dec 12 09:04:46 host2 kernel: block drbd0: Terminating drbd0_asender
Dec 12 09:04:47 host2 kernel: block drbd0: Connection closed
Dec 12 09:04:47 host2 kernel: block drbd0: conn( Disconnecting -> StandAlone )
Dec 12 09:04:47 host2 kernel: block drbd0: receiver terminated
Dec 12 09:04:47 host2 kernel: block drbd0: Terminating drbd0_receiver
Dec 12 09:04:47 host2 kernel: block drbd1: meta connection shut down by peer.
Dec 12 09:04:47 host2 kernel: block drbd1: conn( WFReportParams ->
NetworkFailure )
Dec 12 09:04:47 host2 kernel: block drbd1: asender terminated
Dec 12 09:04:47 host2 kernel: block drbd1: Terminating drbd1_asender
Dec 12 09:04:47 host2 kernel: block drbd1: helper command:
/sbin/drbdadm split-brain minor-1
Dec 12 09:04:47 host2 kernel: block drbd1: helper command:
/sbin/drbdadm split-brain minor-1 exit code 0 (0x0)
Dec 12 09:04:47 host2 kernel: block drbd1: conn( NetworkFailure ->
Disconnecting )
Dec 12 09:04:47 host2 kernel: block drbd1: error receiving ReportState, l: 4!
Dec 12 09:04:47 host2 kernel: block drbd1: Connection closed
Dec 12 09:04:47 host2 kernel: block drbd1: conn( Disconnecting -> StandAlone )
Dec 12 09:04:47 host2 kernel: block drbd1: receiver terminated
Dec 12 09:04:47 host2 kernel: block drbd1: Terminating drbd1_receiver
Dec 12 09:12:20 host2 kernel: block drbd0: conn( StandAlone -> Unconnected )
Dec 12 09:12:20 host2 kernel: block drbd0: Starting receiver thread
(from drbd0_worker [3410])
Dec 12 09:12:20 host2 kernel: block drbd0: receiver (re)started
Dec 12 09:12:20 host2 kernel: block drbd0: conn( Unconnected -> WFConnection )
Dec 12 09:12:20 host2 kernel: block drbd1: conn( StandAlone -> Unconnected )
Dec 12 09:12:20 host2 kernel: block drbd1: Starting receiver thread
(from drbd1_worker [3423])
Dec 12 09:12:20 host2 kernel: block drbd1: receiver (re)started
Dec 12 09:12:20 host2 kernel: block drbd1: conn( Unconnected -> WFConnection )

Right, split-brain. All the information I found tells me to keep one
node and discard the other.
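
To see at a glance which devices are affected, the log above could be
summarized with a short script. This is only a minimal sketch: the
function names (join_wrapped, summarize) are made up for illustration,
and it assumes the syslog prefix looks exactly like the lines pasted
above. It re-joins the lines that the mail client wrapped, then records
per device whether "Split-Brain detected" appears and what the "bits:"
counters say for self and peer. As far as I understand, those counters
are out-of-sync bitmap bits, so a non-zero value on both sides would
mean both nodes have changes the other does not.

```python
import re

# Matches the syslog prefix used in the log above, e.g.
# "Dec 12 09:04:46 host2 kernel: block drbd1: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"block (?P<dev>drbd\d+): (?P<msg>.*)$"
)
# Matches the joined "self <uuids> bits:N flags:N" / "peer ..." messages.
BITS_RE = re.compile(r"^(?P<side>self|peer) .*bits:(?P<bits>\d+)")


def join_wrapped(lines):
    """Re-join mail-wrapped lines: a line without a syslog prefix is
    treated as the continuation of the previous log line."""
    joined = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if LINE_RE.match(line) or not joined:
            joined.append(line)
        else:
            joined[-1] += " " + line
    return joined


def summarize(lines):
    """Return {device: {"split_brain": bool, "self_bits": int, "peer_bits": int}}.
    The *_bits keys are only present if the log contained them."""
    summary = {}
    for line in join_wrapped(lines):
        m = LINE_RE.match(line)
        if not m:
            continue
        info = summary.setdefault(m["dev"], {"split_brain": False})
        msg = m["msg"]
        if "Split-Brain detected" in msg:
            info["split_brain"] = True
        b = BITS_RE.match(msg)
        if b:
            info[b["side"] + "_bits"] = int(b["bits"])
    return summary
```

Calling summarize() on the lines of the log (for example
open("kern.log").read().splitlines(), where the file name is just a
placeholder) gives one entry per drbd device.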

But I cannot do that, because I have some VMs running on one node and
other VMs running on the other node. Whichever one I discard, I will
lose all changes made since DRBD lost the connection!

How can I make things right?

Any help will be very welcome ...

Thanks in advance ... and please forgive my bad English.


Fábio Rabelo