[pve-devel] [PATCH qemu-server] migrate: skip tpmstate for NBD migration

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Nov 16 12:50:48 CET 2021

On 16.11.21 12:39, Fabian Grünbichler wrote:
> On November 16, 2021 12:12 pm, Thomas Lamprecht wrote:
>> On 16.11.21 11:52, Fabian Grünbichler wrote:
>>> the tpmstate volume is not available in the VM directly, but we do
>>> migrate the state volume via a storage migration anyway if necessary.
>> some context would be great to have in the commit message, iow. mentioning
>> that QEMU is already migrating this as part of its memory/state migration.
> I tried to get some understanding of how this works, and I don't think 
> that the stuff that Qemu copies as part of the TPM emulator state covers 
> everything that is in the state volume.
> what happens is the following:
> - our migration code finds a tpmstate volume, it gets migrated via 
>   storage_migrate if on local storage (and replicated if that is 
>   enabled)
> - the VM is started on the remote node with the initial swtpm setup part 
>   skipped, since we already have a volume with state
> - the RAM migration happens (and rest of state, including 'tpm emulator 
>   state')
> so there is a window between storage_migrate/replication happening, and 
> the migration being finished where changes to the TPM state volume from 
> within the guest could potentially get lost (unless the state covered by 
> the migrate stream covers ALL the state inside the state volume, which I 
> don't think, but maybe I am mistaken on that front).

I've something in mind about talking to Stefan regarding this, it should
be fine, but need to rethink and ensure that I remember this correctly..

> but this is irrespective of this patch, which just fixes the wrong 
> attempt of setting up an NBD server for the replicated tpm state volume. 
> even attaching the volume (like we do for backups) and setting up that 
> NBD server would not help, since changes to the state volume are not 
> tracked in the source VM on the block level, as Qemu doesn't access the 
> state volume directly, only swtpm does.

yeah, something like above paragraph would help to avoid confusion like
mine :)

>> Also, how is "migrate -> stop -> start" affected, is the TPM synced out to
>> the (previously replicated?) disk on the target side during stop?
> I am not sure I understand this question. nothing changes about the flow 
> of migration with this patch, except that where the migration would fall 
> apart previously if replication was enabled, it now works. the handling 
> of the state volume is unchanged / identical to a VM that is not 
> replicated. in either case we only sync the state volume once, before 
> starting the VM on the target node, doing block mirror, and the 
> ram/state migration. swtpm probably syncs it whenever state-changing 
> operations are issued from within the VM - but that is not something 
> that we can control when shutting down the VM. AFAIU, the 'raw' state of 
> the TPM is not even available to Qemu directly, that's the whole point 
> of the swtpm component after all?

Yes and no, it's the case but not the whole point, rather QEMU just did not
wanted to directly incorporate TPM stuff if it can live externally (i.e.,
more about code-base and architecture than having them separated to ensure
QEMU is unaware of TPM states) and to allow HW TPM stuff to be used at
the same time, but don't quote me on that, just recollections of discussions
with Stefan and the swtpm project.

More information about the pve-devel mailing list