[pve-devel] [RFC qemu] savevm-async: improve check for blockers
Thomas Lamprecht
t.lamprecht at proxmox.com
Fri May 17 17:00:48 CEST 2024
subject might be improved by being less general/ambiguous, something like:
savevm-async: improve coverage by also checking for migration blockers
or
savevm-async: block snapshot also if migration would fail
or
savevm-async: reuse migration blocker check for snapshots
Any of these would have given me better initial context for reading this
commit (message).
On 17/05/2024 at 13:39, Fiona Ebner wrote:
> Same rationale as with upstream QEMU commit 5aaac46793 ("migration:
> savevm: consult migration blockers"), migration and (async) snapshot
> are essentially the same operation and thus snapshot also needs to
> check for migration blockers. For example, this catches passed-through
> PCI devices, where the driver does not support migration.
>
> However, the commit notes:
>
>> There is really no difference between live migration and savevm, except
>> that savevm does not require bdrv_invalidate_cache to be implemented
>> by all disks. However, it is unlikely that savevm is used with anything
>> except qcow2 disks, so the penalty is small and worth the improvement
>> in catching bad usage of savevm.
>
> and for Proxmox VE, suspend-to-disk with VMDK does use savevm-async
> and would be broken by simply using migration_is_blocked(). To keep
> this working, introduce a new helper that filters blockers with the
> prefix used by the VMDK migration blocker.
>
> The function qemu_savevm_state_blocked() is called as part of
> migration_is_blocked_allow_prefix() so no check is lost with this
> patch.
>
> Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
> ---
>
> An alternative would be to mark the VMDK blocker as a
> "live-migration-only" blocker and extending migration_is_blocked() or
> using separate helpers to check for migration and snapshot blockers
> differently. But that requires touching more machinery and probably
> needs more adaptation going forward than the approach here.
>
> migration/migration.c | 22 ++++++++++++++++++++++
> migration/migration.h | 1 +
> migration/savevm-async.c | 7 ++++++-
> 3 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index b8d7e471a4..6235309a00 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1897,6 +1897,28 @@ void qmp_migrate_pause(Error **errp)
> "during postcopy-active or postcopy-recover state");
> }
>
> +/*
> + * HACK to allow hibernation in Proxmox VE even when VMDK image is present.
> + */
> +bool migration_is_blocked_allow_prefix(Error **errp, const char *prefix)
> +{
> +    GSList *blockers = migration_blockers[migrate_mode()];
> +
> +    if (qemu_savevm_state_blocked(errp)) {
> +        return true;
> +    }
> +
> +    while (blockers) {
> +        if (!g_str_has_prefix(error_get_pretty(blockers->data), prefix)) {
> +            error_propagate(errp, error_copy(blockers->data));
> +            return true;
> +        }
> +        blockers = g_slist_next(blockers);
> +    }
> +
> +    return false;
> +}
> +
> bool migration_is_blocked(Error **errp)
> {
> GSList *blockers = migration_blockers[migrate_mode()];
> diff --git a/migration/migration.h b/migration/migration.h
> index 8045e39c26..575805a8e2 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -484,6 +484,7 @@ int migration_call_notifiers(MigrationState *s, MigrationEventType type,
> Error **errp);
>
> int migrate_init(MigrationState *s, Error **errp);
> +bool migration_is_blocked_allow_prefix(Error **errp, const char *prefix);
> bool migration_is_blocked(Error **errp);
> /* True if outgoing migration has entered postcopy phase */
> bool migration_in_postcopy(void);
> diff --git a/migration/savevm-async.c b/migration/savevm-async.c
> index bf36fc06d2..33085446e1 100644
> --- a/migration/savevm-async.c
> +++ b/migration/savevm-async.c
> @@ -363,7 +363,12 @@ void qmp_savevm_start(const char *statefile, Error **errp)
> return;
> }
>
> -    if (qemu_savevm_state_blocked(errp)) {
> +    /*
> +     * The limitation for VMDK images only applies to live-migration, not
> +     * snapshots, see commit 5aaac46793 ("migration: savevm: consult migration
> +     * blockers").
> +     */
> +    if (migration_is_blocked_allow_prefix(errp, "The vmdk format used by node")) {
Meh, I'm not a big fan of matching strings here, especially as the
message is not stable ABI. I did not check, but I would be surprised if
it were. Maybe we could factor that string out into a common constant,
used both here and where it's added as a blocker, so that we'd notice if
it changes.
And if we only use this here, then why add a generic "ignore one specific
blocker" helper? It might be better to at least contain that detail in a
"qemu_savevm_async_state_blocked" helper that takes only `errp` as
parameter, as hacks should IMO always be quite specific to avoid their
spread (I know you would check in detail before doing so, but not
everybody does).
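
The two suggestions above (one shared constant for the message prefix, and
a narrow helper without a prefix parameter) could look roughly like the
following. This is only a standalone sketch to illustrate the shape, not
QEMU code: the `Blocker` list stands in for the GSList of Error objects,
and the names VMDK_MIGRATION_BLOCKER_PREFIX and
model_savevm_async_state_blocked() are hypothetical. In the real tree the
helper would read migration_blockers itself and take only `errp`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical shared constant. The idea is that the code adding the VMDK
 * blocker and the code filtering it both use this one definition, so a
 * change to the message cannot silently break the filter.
 */
#define VMDK_MIGRATION_BLOCKER_PREFIX "The vmdk format used by node"

/* Simplified stand-in for QEMU's list of migration blocker errors. */
typedef struct Blocker {
    const char *msg;
    const struct Blocker *next;
} Blocker;

/*
 * Model of a snapshot-specific check: the VMDK exception is hard-coded
 * here instead of being passed in by the caller, so the hack cannot be
 * reused for arbitrary other prefixes.
 *
 * Returns true if any blocker other than the VMDK one is present and, if
 * reason is non-NULL, stores that blocker's message in *reason.
 */
static bool model_savevm_async_state_blocked(const Blocker *blockers,
                                             const char **reason)
{
    const size_t prefix_len = strlen(VMDK_MIGRATION_BLOCKER_PREFIX);

    for (const Blocker *b = blockers; b != NULL; b = b->next) {
        if (strncmp(b->msg, VMDK_MIGRATION_BLOCKER_PREFIX, prefix_len) != 0) {
            if (reason != NULL) {
                *reason = b->msg;
            }
            return true;
        }
    }

    return false;
}
```

With that shape, the call site in qmp_savevm_start() would no longer need
to know the message text at all.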
> return;
> }
>