[pve-devel] [RFC] towards automated integration testing

Fri Oct 13 15:33:26 CEST 2023

Hello,

I am currently doing the groundwork that should eventually enable us
to write automated integration tests for our products.

Part of that endeavor will be to write a custom test runner, which will
   - set up a specified test environment
   - execute test cases in that environment
   - create some sort of test report

What will follow is a description of how that test runner would roughly
work. The main point is to get some feedback on some of the ideas/
approaches before I start with the actual implementation.

Let me know what you think!

## Introduction

The goal is to establish a framework that allows us to write
automated integration tests for our products.
These tests are intended to run in the following situations:
- When new packages are uploaded to the staging repos (by triggering
   a test run from repoman, or similar)
- Later, these tests could also be run when patch series are posted to
   our mailing lists. This requires a mechanism to automatically
   discover, fetch and build patches, which will be a separate,
   follow-up project.
- Additionally, it should be easy to run these integration tests locally
   on a developer's workstation in order to write new test cases, as well
   as to troubleshoot and debug existing ones. The local test
   environment should match the one used for automated testing as
   closely as possible.

As a main mode of operation, the Systems under Test (SUTs)
will be virtualized on top of a Proxmox VE node.

This has the following benefits:
- it is easy to create various test setups (fixtures), including but not
   limited to single Proxmox VE nodes, clusters, Backup servers and
   auxiliary services (e.g. an LDAP server for testing LDAP
   authentication)
- these test setups can easily be brought to a well-defined state:
   cloning from a template/restoring a backup/rolling back to snapshot
- it makes it easy to run the integration tests on a developer's
   workstation in an identical configuration

For the sake of completeness, some of the drawbacks of not running the
tests on bare metal:

- Might be unable to detect regressions that only occur on real hardware

In theory, the test runner would also be able to drive tests on real
hardware, but of course with some limitations (harder to have a
predictable, reproducible environment, etc.)

## Terminology
- Template: A backup/VM template that can be instantiated by the test
   runner
- Test Case: Some script/executable executed by the test runner; success
   is determined via its exit code.
- Fixture: Description of a test setup (e.g. which templates are needed,
   additional setup steps to run, etc.)

## Approach
Test writers write template, fixture, and test case definitions in
declarative configuration files (most likely TOML). The test case
references a test executable/script, which performs the actual test.

The test script is executed by the test runner; the test outcome is
determined by the exit code of the script. Test scripts could be written
in any language, e.g. they could be Perl scripts that use the official
`libpve-apiclient-perl` to test-drive the SUTs.
If we notice any emerging patterns, we could write additional helper
libs that reduce the amount of boilerplate in test scripts.
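
To illustrate the exit-code contract described above, here is a minimal
sketch (in Python, purely for illustration) of how the runner might
invoke a test script. The names `run_test_case`, `CaseResult` and the
`TEST_CASE_NAME` environment variable are assumptions made for this
example, not part of any existing tool.

```python
import os
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseResult:
    name: str
    passed: bool
    exit_code: Optional[int]  # None if the script hit the timeout
    stdout: str
    stderr: str


def _as_text(data):
    """TimeoutExpired may carry bytes or None, so normalize to str."""
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode(errors="replace")
    return data


def run_test_case(name, argv, env_extra, timeout):
    """Run one test executable; exit code 0 means success."""
    env = dict(os.environ)
    env.update(env_extra)  # hypothetical way of passing data to the script
    try:
        proc = subprocess.run(
            argv, env=env, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        # A script that does not finish in time counts as failed.
        return CaseResult(
            name, False, None, _as_text(exc.stdout), _as_text(exc.stderr)
        )
    return CaseResult(
        name, proc.returncode == 0, proc.returncode, proc.stdout, proc.stderr
    )


# Usage: a trivial stand-in "test script" that exits successfully.
result = run_test_case(
    "demo",
    [sys.executable, "-c", "raise SystemExit(0)"],
    {"TEST_CASE_NAME": "demo"},
    timeout=10,
)
print(result.name, "passed" if result.passed else "failed")
```

Because the only contract is "exit code plus captured output", the test
script itself can be written in any language.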

In essence, the test runner would do the following:
- Group testcases by fixture
- For every fixture:
     - Instantiate needed templates from their backup snapshot
     - Start VMs
     - Run any specified `setup-hooks` (update system, deploy packages,
     etc.)
     - Take a snapshot, including RAM
     - For every testcase using that fixture:
         - Run testcase (execute test executable, check exit code)
         - Rollback to snapshot (iff `rollback = true` for that template)
     - destroy test instances (or at least those which are not needed by
       other fixtures)
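
The steps above can be sketched roughly as follows. This is only an
illustration of the control flow: `DryRunBackend` is a hypothetical
stand-in that records the operations instead of talking to a Proxmox VE
node, and `run_all` as well as the shape of the `CONFIG` dictionary are
assumptions made for this example.

```python
from collections import defaultdict


class DryRunBackend:
    """Hypothetical backend: records operations instead of performing them."""

    def __init__(self):
        self.log = []

    def restore(self, template):
        self.log.append(("restore", template))

    def start(self, template):
        self.log.append(("start", template))

    def run_hook(self, template, hook):
        self.log.append(("hook", template, hook))

    def snapshot(self, template):
        self.log.append(("snapshot", template))

    def rollback(self, template):
        self.log.append(("rollback", template))

    def destroy(self, template):
        self.log.append(("destroy", template))


def run_all(config, backend, run_test):
    """Group test cases by fixture and drive the per-fixture lifecycle."""
    by_fixture = defaultdict(list)
    for name, case in config["testcase"].items():
        by_fixture[case["fixture"]].append(name)

    results = {}
    for fixture, cases in by_fixture.items():
        templates = config["fixture"][fixture]["templates"]
        for t in templates:
            backend.restore(t)
            backend.start(t)
        for t in templates:
            for hook in config["template"][t].get("setup-hooks", []):
                backend.run_hook(t, hook)
        for t in templates:
            backend.snapshot(t)
        for name in cases:
            results[name] = run_test(name, config["testcase"][name])
            for t in templates:
                # Only roll back templates that ask for it.
                if config["template"][t].get("rollback", False):
                    backend.rollback(t)
        for t in templates:
            backend.destroy(t)
    return results


CONFIG = {
    "template": {
        "pve-default": {"setup-hooks": ["update.sh"], "rollback": True},
        "ldap-server": {"rollback": False},
    },
    "fixture": {
        "pve-with-ldap-server": {"templates": ["pve-default", "ldap-server"]},
    },
    "testcase": {
        "test-ldap-realms": {"fixture": "pve-with-ldap-server"},
        "test-ldap-something-else": {"fixture": "pve-with-ldap-server"},
    },
}

backend = DryRunBackend()
# The test callback is a placeholder that always reports success.
results = run_all(CONFIG, backend, lambda name, case: True)
for entry in backend.log:
    print(entry)
```

For simplicity, the sketch always destroys the instances at the end of a
fixture; reusing instances across fixtures is left out.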

In the beginning, the test scripts would primarily drive the Systems
under Test (SUTs) via their API. However, the system would also offer
the flexibility for us to venture into the realm of automated GUI
testing at some point (e.g. using Selenium), without having to
change the overall test architecture.

## Mock Test Runner Config

Besides the actual test scripts, test writers would write test
configuration. Based on the current requirements and the approach that
I have chosen, an example config *could* look like the following.
These would likely be split into multiple files/folders
(e.g. to group test case definition and the test script logically).

```toml
[template.pve-default]
# Backup image to restore from, in this case this would be a previously
# set up PVE installation
restore = '...'
# To check if node is booted successfully, also made available to hook
# scripts, in case they need to SSH in to setup things.
ip = "10.0.0.1"
# Define credentials in a separate file - most templates could use a
# default password/SSH key/API token etc.
credentials = "default"
# Update to latest packages, install test .debs
# credentials are passed via env var
# Maybe this could also be ansible playbooks, if the need arises.
setup-hooks = [
     "update.sh",
]
# Take snapshot after setup-hook, roll back after each test case
rollback = true

[template.ldap-server]
# Backup image to restore from
restore = '...'
credentials = "default"
ip = "10.0.0.3"
# No need to roll back in between test cases, there won't be any changes
rollback = false

# Example fixture. They can be used by multiple testcases.
[fixture.pve-with-ldap-server]
# Maybe one could specify additional setup-hooks here as well, in case
# one wants a 'per-fixture' setup? So that we can reduce the number of
# base images?
templates = [
     'pve-default',
     'ldap-server',
]

# testcases.toml (might be split into multiple files/folders?)
[testcase.test-ldap-realms]
fixture = 'pve-with-ldap-server'

# - return code is checked to determine test case success
# - stderr/stdout is captured for the final test report
# - some data is passed via env var:
#   - name of the test case
#   - template configuration (IPs, credentials, etc.)
#   - ...
test-exec = './test-ldap-realms.pl'
# Consider test as failed if test script does not finish fast enough
test-timeout = 60
# Additional params for the test script, allowing for parameterized
# tests.
# Could also turn this into an array and loop over the values, in
# order to create multiple test cases from the same definition.
test-params = { foo = "bar" }

# Second test case, using the same fixture
[testcase.test-ldap-something-else]
fixture = 'pve-with-ldap-server'
test-exec = './test-ldap-something-else.pl'

```
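
As an illustration of how such a config could be consumed, the following
sketch parses a shortened version of the example and checks that all
references between test cases, fixtures and templates resolve. It is
written in Python using the standard `tomllib` module (Python 3.11+,
with the third-party `tomli` package as a fallback); the `load_config`
function and its validation rules are assumptions made for this example.

```python
try:
    import tomllib  # standard library since Python 3.11
except ImportError:
    import tomli as tomllib  # third-party backport with the same API

EXAMPLE = """
[template.pve-default]
restore = '...'
ip = "10.0.0.1"
credentials = "default"
setup-hooks = [
     "update.sh",
]
rollback = true

[template.ldap-server]
restore = '...'
credentials = "default"
ip = "10.0.0.3"
rollback = false

[fixture.pve-with-ldap-server]
templates = [
     'pve-default',
     'ldap-server',
]

[testcase.test-ldap-realms]
fixture = 'pve-with-ldap-server'
test-exec = './test-ldap-realms.pl'
test-timeout = 60
test-params = { foo = "bar" }
"""


def load_config(text):
    """Parse the TOML text and verify that all cross-references resolve."""
    cfg = tomllib.loads(text)
    templates = cfg.get("template", {})
    fixtures = cfg.get("fixture", {})
    problems = []
    for fname, fixture in fixtures.items():
        for tname in fixture.get("templates", []):
            if tname not in templates:
                problems.append(f"fixture {fname!r}: unknown template {tname!r}")
    for cname, case in cfg.get("testcase", {}).items():
        if case.get("fixture") not in fixtures:
            problems.append(f"testcase {cname!r}: unknown fixture")
        if "test-exec" not in case:
            problems.append(f"testcase {cname!r}: missing test-exec")
    if problems:
        raise ValueError("; ".join(problems))
    return cfg


cfg = load_config(EXAMPLE)
for name, case in cfg["testcase"].items():
    print(name, "->", case["fixture"], cfg["fixture"][case["fixture"]]["templates"])
```

Validating the references up front means a typo in a fixture or template
name is reported before any VM is restored.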

-- 
- Lukas