[pve-devel] [PATCH docs] ha-manager: error fixes and small additions

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Apr 29 12:04:47 CEST 2016



On 04/29/2016 11:52 AM, Dietmar Maurer wrote:
>> Services in error state are not freezable, 
> but we can consider such a service as already frozen...

Yes, true, so the effect is the same.

>
>> thus the LRM won't
>> restart/release its lock/close the watchdog => problematic
>>
>> Services which are in the error state cannot be touched, as it's unclear
>> what really happened to them (or at least we cannot recover them),
>> and they need manual intervention.
>>
>> We could, however, not count them towards the active services in the
>> LRM, so the LRM could restart with those and they would stay in the
>> error state, i.e. untouched.
>> This seems like a reasonable idea to me, thoughts?
> yes, sounds reasonable to me. A new CRM will not touch services in 
> error state, so there is no need to freeze them. Please
> can you verify this claim?

Yes, that's the case, I verified that already. While frozen services are
actively unfrozen when a node's LRM comes back online again, services in
the error state are not.
The error state has only one way out, and that's disabling the service
(naturally, removing it from HA also works, but that works for every
state, always).
I'll prepare a patch.
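
To make the idea concrete, here is a rough sketch of the proposed
behaviour. It is only an illustration in Python, not the actual
ha-manager code (which is Perl). The function names, the dict layout and
the "started" state that frozen services return to are assumptions made
for this example.

```python
# Hypothetical sketch of the proposal, not the real LRM implementation.
# `services` maps a service ID to its current state name (assumed layout).

def count_active_services(services):
    """Count services that still keep the LRM busy.

    Frozen services are already parked, and services in the error state
    need manual intervention, so neither is counted as active.
    """
    return sum(
        1 for state in services.values()
        if state not in ("freeze", "error")
    )


def lrm_may_restart(services):
    """The LRM may release its lock and close the watchdog once idle."""
    return count_active_services(services) == 0


def on_lrm_back_online(services):
    """Unfreeze frozen services; services in error stay untouched.

    Returning to "started" is a simplification for this sketch.
    """
    return {
        sid: ("started" if state == "freeze" else state)
        for sid, state in services.items()
    }
```

With this, a node holding only frozen and errored services would be
allowed to restart its LRM, and after the restart the errored services
would still be in the error state until someone disables them.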


