How to manage VMs in SEAPATH
This page describes the vm_manager tool: https://github.com/seapath/vm_manager
Deploying a virtual machine on a SEAPATH cluster requires handling many components: Ceph, QEMU, Libvirt, Corosync. vm_manager was created as a wrapper around these components.
vm_manager is not useful on a standalone SEAPATH hypervisor. If you wish to deploy a VM in that use case, refer to the deploy_vms_standalone.yaml playbook.
The cluster_vm Ansible module is a wrapper above the vm_manager CLI. More detailed documentation can be generated from the sources.
It can be called from a playbook to perform actions on VMs. For instance, an example playbook that creates a VM from a predefined disk image and Libvirt XML configuration would be:
- name: Create and start guest0
  cluster_vm:
    name: guest0
    command: create
    system_image: my_disk.qcow2
    xml: "{{ lookup('file', 'my_vm_config.xml', errors='strict') }}"
Playbooks can be executed on any hypervisor. Other playbook examples are stored in the example/playbooks/vm directory.
This section describes the VM architecture and the cluster_vm commands from a high-level point of view. Please read the cluster_vm module documentation for further information.
Like other Ansible modules, the cluster_vm documentation can also be displayed by executing the ansible-doc cluster_vm command from the Ansible root repository.
You will also find information on how to troubleshoot problems related to VM management in the Troubleshooting section at the end of this page.
VM status
In the SEAPATH cluster the VMs can have several statuses:
Undefined: The VM does not exist yet.
Disabled: The VM exists and its data disk has been created, but it is not enabled to be used on the cluster.
Starting: The VM is enabled and performing a start operation.
Started: The VM is enabled and started. Note: This doesn’t mean that the VM is ready and has finished booting, which can take some time.
Stopping: The VM is enabled and performing a power-off action.
Stopped: The VM is enabled and stopped.
Failed: The VM is enabled, but it has failed to start.
VM Manager commands
All sub-commands have a required -n, --name option to specify which resource should be used.
add_colocation: Adds a colocation constraint between resources
clone: Creates a copy of the VM
create: Generates a new resource from a VM
create_snapshot: Creates a snapshot of a resource
disable: Stops and removes the resource on the cluster
enable: Adds and starts the resource on the cluster
get_metadata: Gets a metadata value of a resource
list: Lists all resources
list_metadata: Lists all metadata keys of a resource
list_snapshots: Lists all created snapshots
purge: Deletes all snapshots of a resource
remove: Removes the resource
remove_snapshot: Removes a snapshot of a resource
rollback: Rolls back to a snapshot for a resource
set_metadata: Sets a metadata value of a resource
start: Starts a resource
status: Gets the status of a resource
stop: Stops a resource
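For example, a short command-line session on one of the hypervisors could look like the following sketch (guest0 is a placeholder resource name):

vm_manager list                # list all resources of the cluster
vm_manager status -n guest0    # get the status of the resource guest0
vm_manager stop -n guest0      # stop the resource
vm_manager start -n guest0     # start it again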
VM architecture
The diagram below describes how a VM is stored in the SEAPATH cluster. All non-volatile VM data is stored using Ceph, which maintains the data store and replicates the data between all the hypervisors.
A VM is stored in a Ceph RBD group named after the VM.
Each VM contains:
Metadata
Image data disk
Image data snapshots
Metadata provides information associated with a VM. It consists of a list of (key, value) pairs that are set when the VM is created. You can define as many metadata fields as you want, but some keys are reserved:
| Key | Meaning |
|---|---|
| vm_name | VM name |
| _base_xml | Initial Libvirt XML VM configuration |
| xml | Libvirt XML file used for the VM configuration. It is autogenerated by modifying the _base_xml file. |
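As an illustrative sketch, reading one of these keys with the cluster_vm module could look like the task below (guest0 is a placeholder; metadata_name is the parameter referenced in the troubleshooting tables at the end of this page):

- name: Read the vm_name metadata of guest0
  cluster_vm:
    name: guest0
    command: get_metadata
    metadata_name: vm_name
  register: guest0_metadata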
VM deployment
The VM data disk is set when creating a new VM or cloning an existing one, as described in the diagrams below.
Create a VM
Create a VM from scratch by importing a disk image with the create command:
- name: Create and start guest0
  cluster_vm:
    name: guest0
    command: create
    system_image: my_disk.qcow2
    xml: "{{ lookup('file', 'my_vm_config.xml', errors='strict') }}"
Clone a VM
Copy an existing VM with the clone command:
- name: Clone guest0 into guest1
  cluster_vm:
    name: guest1
    src_name: guest0
    command: clone
VM network configuration
The network configuration inside the VMs is done with the playbook file cluster_setup_network.yaml. You need to use an inventory that describes the VMs instead of the cluster, as in the vms_inventory_example.yaml example file.
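The invocation could then look like the following sketch (paths are illustrative and depend on where the playbook and inventory are stored in your setup):

ansible-playbook -i vms_inventory_example.yaml cluster_setup_network.yaml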
VM snapshots
Disk snapshots can be used to save the disk image data at a given moment so that it can be restored later.
Snapshot creation
Snapshots can be created while the VM is stopped or running, but if you take a snapshot while the VM is running, only the data already written to the disk will be saved.
Volatile data, such as the contents of the RAM or data not yet written to the disk, will not be stored in the snapshot.
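As a sketch, creating a snapshot with the cluster_vm module could look like the task below (guest0 and before_update are placeholders; snapshot_name is the parameter referenced in the troubleshooting tables at the end of this page):

- name: Create a snapshot of guest0
  cluster_vm:
    name: guest0
    command: create_snapshot
    snapshot_name: before_update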
Snapshot rollback
You can restore the VM to a given previous state by performing a rollback operation based on a snapshot. The data saved during the snapshot operation will be restored and will replace the current disk image data; all current disk image data will be lost. The rollback operation does not remove the snapshot, so it is possible to reuse the same snapshot for a later rollback.
The rollback operation must be applied to a disabled machine, so if the VM is enabled, it will automatically be disabled before the rollback and re-enabled once the operation is finished.
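The corresponding rollback task could look like this sketch, reusing the placeholder names from the snapshot example above:

- name: Roll guest0 back to the before_update snapshot
  cluster_vm:
    name: guest0
    command: rollback_snapshot
    snapshot_name: before_update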
Other snapshot operations
With the cluster_vm module it is also possible to:
List all snapshots
Remove a particular snapshot
Remove multiple snapshots by purging:
All of them
The n oldest ones
Those older than a specific date
An example playbook task that removes the snapshots created before a given date would be:
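(The following is only a sketch: purge_date is an illustrative parameter name; the exact parameter names are documented in ansible-doc cluster_vm.)

- name: Remove all snapshots of guest0 created before January 24th 2021
  cluster_vm:
    name: guest0
    command: purge_image
    # purge_date is a hypothetical parameter name, shown for illustration only
    purge_date: "2021-01-24"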
The purge operation can be performed regularly to avoid running out of space. This can easily be done with a tool like Ansible Tower or AWX.
Update a VM
Updating the VM data inside the VM
Updating the data inside the VM cannot be performed by the cluster_vm module, but you can use its snapshot system to cancel the update in case of error, as described in the diagram below. To achieve this, you can base your playbook on the update skeleton example.
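A minimal sketch of this pattern, assuming the update itself is performed by tasks of your own (the update script path is purely a placeholder):

- name: Snapshot guest0 before the update
  cluster_vm:
    name: guest0
    command: create_snapshot
    snapshot_name: pre_update

- block:
    - name: Run the update inside the VM (placeholder task)
      ansible.builtin.command: /usr/local/bin/run_my_update.sh  # hypothetical update script
  rescue:
    - name: Roll back guest0 after a failed update
      cluster_vm:
        name: guest0
        command: rollback_snapshot
        snapshot_name: pre_update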
Updating VM configuration or metadata
The VM configuration and metadata are immutable. To change them, you must create a new VM from the existing one with the clone command.
The update configuration example file can help you create a playbook to achieve this operation, according to the following diagram.
Troubleshooting
This section describes the unstable scenarios that can occur while executing Ansible commands on the cluster, and the operations to perform to recover a stable situation.
Ansible command is interrupted
The execution of a cluster_vm command can be interrupted for different reasons: a crash on the hypervisor, a network failure, a manual stop of the operation… For commands that modify the system, the interruption might leave an undesirable state where a fix action is required:
| Command | How to fix |
|---|---|
| create | Re-call the command with the force parameter set to true. |
| clone | Re-call the command with the force parameter set to true. |
| remove | Re-call the command. |
| start | Re-call the command. |
| stop | Re-call the command. |
| create_snapshot | Re-call the command. |
| rollback_snapshot | Re-call the command. |
| remove_snapshot | Re-call the command. |
| enable | Re-call the command. |
| disable | Re-call the command. |
| purge_image | Re-call the command. Note: purging snapshots by number or date is not transactional; if interrupted, only part of them might have been removed. |
VM cannot be enabled
Enabling a VM on the Pacemaker cluster might fail if its XML configuration is invalid. Pacemaker will detect it and the VM will remain in a Stopped or Failed state, triggering a Timeout error. The commands that can enable a VM are:
| Command | How to fix |
|---|---|
| create | Remove the VM (*), fix the configuration and try creating it again. |
| clone | Remove the VM (*), fix the configuration and try creating it again. |
| rollback_snapshot | Remove the VM (*), fix the configuration and try creating it again. |
| enable | Remove the VM (*), fix the configuration and try creating it again. |
(*) Note: Calling the create or clone commands with the force parameter set to true will automatically remove the VM before its creation.
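For instance, re-creating a VM whose initial creation was interrupted could look like the following sketch (the file names are the same placeholders as in the creation example above):

- name: Re-create and start guest0, removing any partial state first
  cluster_vm:
    name: guest0
    command: create
    system_image: my_disk.qcow2
    xml: "{{ lookup('file', 'my_vm_config.xml', errors='strict') }}"
    force: true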
“VM is not on the cluster” error
If the VM is not enabled on the Pacemaker cluster, three commands will fail with the “VM is not on the cluster” error.
| Command | Error message | How to fix |
|---|---|---|
| start | VM is not on the cluster. | The VM has to be created and enabled on the cluster. |
| stop | VM is not on the cluster. | The VM has to be created and enabled on the cluster. |
| disable | VM is not on the cluster. | The VM has to be created and enabled on the cluster. |
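A VM in this situation can be put back on the cluster with an enable task such as this sketch (guest0 is a placeholder):

- name: Enable guest0 on the cluster
  cluster_vm:
    name: guest0
    command: enable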
Unnecessary action / accessing nonexistent VM, snapshot or metadata
Creating a VM or snapshot that already exists, or trying to access a nonexistent VM, snapshot or metadata, will fail with the following errors:
| Command | Error message | How to fix |
|---|---|---|
| create | VM already exists. | Choose a nonexistent VM name. |
| clone | VM already exists. | Choose a nonexistent VM name. |
| clone | Error opening image. | Choose an existing VM name. |
| remove | VM does not exist. | Choose an existing VM name. |
| list_snapshots | Error opening image. | Choose an existing VM name. |
| create_snapshot | Error opening image. | Choose an existing VM name. |
| create_snapshot | Snapshot already exists. | Choose a nonexistent snapshot_name. |
| rollback_snapshot | Error opening image. | Choose an existing VM name. |
| rollback_snapshot | Snapshot does not exist on VM. | Choose an existing snapshot_name. |
| remove_snapshot | Error opening image. | Choose an existing VM name. |
| remove_snapshot | Error checking if snapshot is protected. | Choose an existing snapshot_name. |
| purge_image | Error opening image. | Choose an existing VM name. |
| get_metadata | Error opening image. | Choose an existing VM name. |
| get_metadata | No metadata for image. | Choose an existing metadata_name. |
Invalid parameter name
Names for VMs, snapshots and metadata keys must only contain letters and numbers, without spaces. In addition, metadata has reserved keys that cannot be used. If these rules are not followed, the create, clone and create_snapshot commands will fail with the error “Parameters must not contain spaces or special chars”.
| Command | Error message | How to fix |
|---|---|---|
| create | Parameters must not contain spaces or special chars. | Verify VM name and metadata keys. |
| clone | Parameters must not contain spaces or special chars. | Verify VM name and metadata keys (src_name and name cannot be the same). |
| create_snapshot | Parameters must not contain spaces or special chars. | Verify snapshot_name. |