[pve-devel] Blacklisting HP hardware watchdog timer module ?
Alexandre DERUMIER
aderumier at odiso.com
Thu Dec 3 18:24:40 CET 2015
I just found a strange bug with ipmi_watchdog, related to Dell OpenManage.
At boot the timeout is correctly set to 10s:
root at kvmtest1 ~ # ipmitool mc watchdog get
Watchdog Timer Use: SMS/OS (0x44)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: Hard Reset (0x01)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x10
Initial Countdown: 10 sec
Present Countdown: 9 sec
But after some minutes (5-10 min),
I'm seeing it at 480s, and the timer action has changed to "No action":
# ipmitool mc watchdog get
Watchdog Timer Use: SMS/OS (0xc4)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: No action (0x00)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x10
Initial Countdown: 480 sec
Present Countdown: 479 sec
In Dell OpenManage, I see a reset configuration option set to 480s.
(I think it's the OpenManage service which overwrites the value.)
I'll add a note in the wiki about this too.
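
To catch this kind of silent overwrite, the output of `ipmitool mc watchdog get` could be checked periodically. The following is only a minimal sketch, not part of pve-ha-manager: the function names (`parse_watchdog`, `check_watchdog`) are hypothetical, the expected values (10s, Hard Reset) are simply the ones seen at boot above, and the two sample strings are copied from the outputs shown above.

```python
import re

# Sample outputs copied from the two ipmitool runs shown above.
BOOT_OUTPUT = """\
Watchdog Timer Use: SMS/OS (0x44)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: Hard Reset (0x01)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x10
Initial Countdown: 10 sec
Present Countdown: 9 sec
"""

LATER_OUTPUT = """\
Watchdog Timer Use: SMS/OS (0xc4)
Watchdog Timer Is: Started/Running
Watchdog Timer Actions: No action (0x00)
Pre-timeout interval: 0 seconds
Timer Expiration Flags: 0x10
Initial Countdown: 480 sec
Present Countdown: 479 sec
"""


def parse_watchdog(text):
    """Split 'Field: value' lines into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_watchdog(text, expected_timeout=10):
    """Return a list of problems; an empty list means the settings look as expected.

    expected_timeout=10 is an assumption taken from the value seen at boot.
    """
    fields = parse_watchdog(text)
    problems = []

    match = re.match(r"(\d+)\s*sec", fields.get("Initial Countdown", ""))
    timeout = int(match.group(1)) if match else None
    if timeout != expected_timeout:
        problems.append(
            f"initial countdown is {timeout}s, expected {expected_timeout}s"
        )

    action = fields.get("Watchdog Timer Actions", "")
    if not action.startswith("Hard Reset"):
        problems.append(f"timer action is '{action}', expected Hard Reset")

    return problems


if __name__ == "__main__":
    for name, sample in (("boot", BOOT_OUTPUT), ("later", LATER_OUTPUT)):
        print(name, check_watchdog(sample))
```

On a real node the text would come from running `ipmitool mc watchdog get` (for example through `subprocess.run`) instead of the sample strings.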
----- Original message -----
From: "aderumier" <aderumier at odiso.com>
To: "dietmar" <dietmar at proxmox.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Thursday 3 December 2015 17:48:14
Subject: Re: [pve-devel] Blacklisting HP hardware watchdog timer module ?
>>The timeout must be 60 seconds!! Never change that.
>>
>>We set the timeout to 60s when we start watchdog-mux.
Ah ok. I thought we needed to define it manually.
What is the difference between these 2 timeouts?
+ int watchdog_timeout = 10;
+ int client_watchdog_timeout = 60;
ipmitool gives me 10s, so it seems to work fine :)
# ipmitool mc watchdog get
Initial Countdown: 10 sec
> Another question: I did some tests 2 weeks ago with a customer,
> and I think I had a problem if the node reboots too fast
> (pve-ha-manager sees the node down, but it comes up again before the VM was
> migrated).
> Is it a known bug?
>>What bug exactly?
I don't remember exactly, but lrm or crm was stuck, because the node (and VMs) had rebooted too fast.
Sorry, I don't have access to the customer's logs.
----- Original message -----
From: "dietmar" <dietmar at proxmox.com>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Thursday 3 December 2015 17:28:55
Subject: Re: [pve-devel] Blacklisting HP hardware watchdog timer module ?
> BTW, what is the best timeout for the watchdog?
> I think that pve ha manager waits around 1 min before migrating VMs?
> If yes, should the watchdog timeout be lower?
The timeout must be 60 seconds!! Never change that.
We set the timeout to 60s when we start watchdog-mux.
> Another question: I did some tests 2 weeks ago with a customer,
> and I think I had a problem if the node reboots too fast
> (pve-ha-manager sees the node down, but it comes up again before the VM was
> migrated).
> Is it a known bug?
What bug exactly?
_______________________________________________
pve-devel mailing list
pve-devel at pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel