[pve-devel] [PATCH ha-manager 13/15] test: ha tester: add test cases for loose colocation rules

Mon Apr 28 16:44:30 CEST 2025

On 25.03.25 at 16:12, Daniel Kral wrote:
> Add test cases for loose positive and negative colocation rules, i.e.
> where services should be kept on the same node together or kept on separate
> nodes. These are copies of their strict counterpart tests, but verify
> the behavior if the colocation rule cannot be met, i.e. not adhering to
> the colocation rule. The test scenarios are:
> 
> - 2 neg. colocated services in a 3 node cluster; 1 node failing
> - 2 neg. colocated services in a 3 node cluster; 1 node failing, but the
>   recovery node cannot start the service
> - 2 pos. colocated services in a 3 node cluster; 1 node failing
> - 3 pos. colocated services in a 3 node cluster; 1 node failing, but the
>   recovery node cannot start one of the services
> 
> Signed-off-by: Daniel Kral <d.kral at proxmox.com>

With the errors in the descriptions fixed:

Reviewed-by: Fiona Ebner <f.ebner at proxmox.com>

> diff --git a/src/test/test-colocation-loose-separate4/README b/src/test/test-colocation-loose-separate4/README

Not sure the test should carry the same number as the strict test just
because it was adapted from it.

> new file mode 100644
> index 0000000..5b68cde
> --- /dev/null
> +++ b/src/test/test-colocation-loose-separate4/README
> @@ -0,0 +1,17 @@
> +Test whether a loose negative colocation rule among two services makes one of
> +the services migrate to a different recovery node than the other service in
> +case of a failover of service's previously assigned node. As the service fails
> +to start on the recovery node (e.g. insufficient resources), the failing
> +service is kept on the recovery node.

The description here is wrong. It will be started on a different node
after the start failure.

> +
> +The test scenario is:
> +- vm:101 and fa:120001 should be kept separate
> +- vm:101 and fa:120001 are on node2 and node3 respectively
> +- fa:120001 will fail to start on node1
> +- node1 has a higher service count than node2 to test the colocation rule is
> +  applied even though the scheduler would prefer the less utilized node
> +
> +Therefore, the expected outcome is:
> +- As node3 fails, fa:120001 is migrated to node1
> +- fa:120001 will be relocated to another node, since it couldn't start on its
> +  initial recovery node

---snip 8<---

> diff --git a/src/test/test-colocation-loose-together1/README b/src/test/test-colocation-loose-together1/README
> new file mode 100644
> index 0000000..2f5aeec
> --- /dev/null
> +++ b/src/test/test-colocation-loose-together1/README
> @@ -0,0 +1,11 @@
> +Test whether a loose positive colocation rule makes two services migrate to
> +the same recovery node in case of a failover of their previously assigned node.
> +
> +The test scenario is:
> +- vm:101 and vm:102 should be kept together
> +- vm:101 and vm:102 are both currently running on node3
> +- node1 and node2 have the same service count to test that the rule is applied
> +  even though it would be usually balanced between both remaining nodes
> +
> +Therefore, the expected outcome is:
> +- As node3 fails, both services are migrated to node2

The services actually end up on node1, not node2.