VM live migration implementation
Live migration for virtual machines in LXD is achieved by streaming instance state from a source QEMU to a target QEMU. VM live migration is supported for all storage pool types.
API extension: migration_vm_live
Conceptual process
The live migration workflow varies depending on the type of storage pool used. The two key scenarios are non-shared storage and shared storage within a cluster (e.g., Ceph). If the target does not support live state transfer, a stateful stop is performed prior to migration.
Migration API
Sending a POST request to `/1.0/instances/{name}` renames an instance, moves it between pools, or migrates it to another server. In the push case, the returned operation metadata for a migration is a background operation with progress data. In the pull case, it is a WebSocket operation with a number of secrets to be passed to the target server.
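As a concrete illustration, the following minimal sketch issues such a request with Go's standard library. The Unix-socket path and the instance name `v1` are assumptions; the `migration` and `live` request fields follow the LXD REST API, and error handling and operation polling are elided.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net"
	"net/http"
)

func main() {
	// Talk to the local LXD daemon over its Unix socket (path may differ,
	// e.g. under the snap it lives in /var/snap/lxd/common/lxd/).
	client := &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				return (&net.Dialer{}).DialContext(ctx, "unix", "/var/lib/lxd/unix.socket")
			},
		},
	}

	// POST to /1.0/instances/{name} with migration=true starts a migration;
	// live=true requests live state transfer.
	body, _ := json.Marshal(map[string]any{"migration": true, "live": true})
	resp, err := client.Post("http://lxd/1.0/instances/v1", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// In the pull case the operation metadata carries the WebSocket secrets
	// that the target must present when it connects.
	var op map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&op); err != nil {
		panic(err)
	}
	fmt.Println(op)
}
```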
Live migration call stack
Below is a general overview of the key functions of the live migration call stack:
lxd/lxd/instance_post.go
This function handles POST requests to the `/1.0/instances/{name}` endpoint.
lxd/lxd/migrate_instance.go
This function performs the migration operation on the source VM for the given state and operation. It sets up the necessary WebSocket connections for control, state, and filesystem, and then initiates the migration process.
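To make the handshake concrete, here is a hedged sketch of generating one secret per WebSocket and exposing them as operation metadata; the connection names mirror the three listed above and are illustrative rather than LXD's exact identifiers.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newSecret returns a random token used to authenticate one WebSocket.
func newSecret() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	// One secret per migration connection; the target presents each secret
	// when it dials the corresponding WebSocket on the source.
	meta := map[string]string{}
	for _, conn := range []string{"control", "state", "filesystem"} {
		secret, err := newSecret()
		if err != nil {
			panic(err)
		}
		meta[conn] = secret
	}
	fmt.Println(meta) // would be returned as the operation's metadata
}
```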
lxd/lxd/instance/drivers/driver_qemu.go
This function controls the sending of a migration, checking for stateful support, waiting for connections, performing checks, and sending a migration offer. When performing an intra-cluster same-name migration, steps are taken to prevent corruption of volatile device configuration keys during the start and stop of the instance on both source and target.
This function performs the live migration send process:
1. Connect to the QEMU monitor: The function begins by establishing a connection to the QEMU monitor using `qmp.Connect`.
2. Define disk names: The function defines names for the root disk (`lxd_root`), the NBD target disk (`lxd_root_nbd`), and the snapshot disk (`lxd_root_snapshot`). These are used later to manage the root disk and its snapshot during migration.
3. Check for shared storage: If the migration involves shared storage, the migration process can bypass synchronizing the root disk. The function checks for this condition by verifying that `clusterMoveSourceName` is non-empty and that the pool is remote.
4. Non-shared storage snapshot setup: If shared storage is not used, the function sets up a temporary snapshot of the root disk (see the overlay sketch after this list):
   - Migration capabilities such as `auto-converge`, `pause-before-switchover`, and `zero-blocks` are set to optimize the migration process.
   - The function creates a QCOW2 snapshot file of the root disk, which stores changes made to the disk during migration.
   - The snapshot file is opened for reading and writing, and the file descriptor is passed to QEMU.
   - The snapshot is added as a block device to QEMU, ensuring that it is not visible to the guest OS.
   - A snapshot of the root disk is taken using `monitor.BlockDevSnapshot`. This ensures that changes to the root disk are isolated during migration.
   - A revert function handles cleanup in case of failure. It ensures that the guest is resumed and that any changes made during snapshot creation are merged back into the root disk if the migration fails.
5. Shared storage setup: If shared storage is used, only the `auto-converge` migration capability is set, and no snapshot creation is necessary.
6. Perform storage transfer: The storage pool is migrated while the VM is still running. The `volSourceArgs.AllowInconsistent` flag is set to `true` to allow migration while the disk is in use. The migration checks are done by calling `pool.MigrateInstance`.
7. Notify shared disk pools: For each disk in the VM, the migration process checks whether the disk belongs to a shared pool. If so, the disk is prepared for migration by calling `MigrateVolume` on the source disk.
8. Set up NBD listener and connection: If shared storage is not used, the function sets up a Unix socket listener for NBD connections. This listener handles the actual data transfer of the root disk from the source VM to the migration target (see the relay sketch after this list).
9. Begin block device mirroring: After setting up the NBD connection, the function starts transferring the migration snapshot to the target disk using `monitor.BlockDevMirror`.
10. Send stateful migration checkpoint: The function creates a pipe to transfer the state of the VM during the migration process. It writes the VM's state to `stateConn` via the pipe, using `d.saveStateHandle` to handle the state transfer (see the pipe sketch after this list). Note that the source VM's guest OS is paused while the state is transferred; this ensures that the VM's state is consistent when the migration completes.
11. Finalize snapshot transfer: If non-shared storage is used, the function waits for the state transfer to reach the `pre-switchover` stage, ensuring that the guest remains paused during this process. The function then cancels the block job associated with the root snapshot to finalize the transfer and ensure that no changes are lost.
12. Completion: Once all transfers are complete, the function finalizes the migration by resuming the target VM and cleaning up source VM resources. The source VM is stopped, and its storage is discarded.
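The snapshot step in the non-shared storage path (item 4) is easiest to picture as a QCOW2 overlay whose backing file is the live root disk: guest writes land in the overlay while the backing image stays stable. LXD performs this through QMP against a file descriptor it passes to QEMU; the sketch below shells out to `qemu-img` purely to demonstrate the overlay concept, and both paths are placeholders.

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Create a QCOW2 overlay backed by the live root disk. Writes now go to
	// the overlay, so the backing image remains stable for the transfer.
	cmd := exec.Command("qemu-img", "create",
		"-f", "qcow2",
		"-b", "/var/lib/lxd/root.img", "-F", "raw",
		"/tmp/lxd_root_snapshot.qcow2")
	if out, err := cmd.CombinedOutput(); err != nil {
		panic(fmt.Sprintf("qemu-img: %v: %s", err, out))
	}
	fmt.Println("overlay created; guest writes are now isolated from the backing image")
}
```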
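For the NBD data path (item 8), the listener is essentially a byte relay between QEMU's NBD client and the connection that reaches the target. A minimal sketch, assuming the migration channel is available as an `io.ReadWriteCloser` (the real driver wires this to the filesystem connection):

```go
package migration

import (
	"io"
	"net"
)

// proxyNBD accepts one NBD client connection on a Unix socket and relays
// bytes in both directions until either side closes.
func proxyNBD(socketPath string, fsConn io.ReadWriteCloser) error {
	l, err := net.Listen("unix", socketPath)
	if err != nil {
		return err
	}
	defer l.Close()

	conn, err := l.Accept() // QEMU's NBD client connects here
	if err != nil {
		return err
	}
	defer conn.Close()

	done := make(chan struct{})
	go func() {
		io.Copy(fsConn, conn) // disk bytes out to the target
		close(done)
	}()
	io.Copy(conn, fsConn) // replies back to QEMU
	<-done
	return nil
}
```

In the real driver the relay lives only as long as the block mirror job; in this sketch either side closing tears it down.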
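Finally, the checkpoint step (item 10) amounts to streaming whatever QEMU writes into a pipe out over the state connection while the guest is paused. A sketch under the same caveats, where `saveState` stands in for the driver's `d.saveStateHandle` and a buffer stands in for the state WebSocket:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// sendState streams everything written to the pipe's write end out to
// stateConn, mirroring the checkpoint transfer described above.
func sendState(stateConn io.Writer, saveState func(*os.File) error) error {
	pipeR, pipeW, err := os.Pipe()
	if err != nil {
		return err
	}
	defer pipeR.Close()

	errCh := make(chan error, 1)
	go func() {
		_, err := io.Copy(stateConn, pipeR)
		errCh <- err
	}()

	// Hand the write end to QEMU; the guest stays paused while this runs.
	if err := saveState(pipeW); err != nil {
		pipeW.Close()
		return err
	}
	pipeW.Close() // EOF lets the copier finish
	return <-errCh
}

func main() {
	var buf bytes.Buffer // stand-in for the state WebSocket
	err := sendState(&buf, func(w *os.File) error {
		_, err := w.WriteString("fake VM state")
		return err
	})
	fmt.Println(err, buf.String())
}
```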