[pve-devel] [RFC] towards automated integration testing

Tue Oct 17 14:33:20 CEST 2023

On 10/17/23 08:35, Thomas Lamprecht wrote:
>>> Is the order of test-cases guaranteed by toml parsing, or how are intra-
>>> fixture dependencies ensured?
>>>
>>
>> Good point. With rollbacks in between test cases it probably does not
>> matter much, but on 'real hardware' with no rollback this could
>> definitely be a concern.
>> A super simple thing that could just work fine is ordering test
>> execution by testcase-names, sorted alphabetically. Ideally you'd write
>> test cases that do not depend on each other any way, and *if* you ever
>> find yourself in the situation where you *need* some ordering, you
>> could just encode the order in the test-case name by adding an integer
>> prefix - similar how you would name config files in /etc/sysctl.d/*,
>> for instance.
> 
> 
> While it can be OK to leave that for later, encoding such things
> in names is IMO brittle and hard to manage if more than a handful
> of tests, and we hopefully got lots more ;-)
> 
> 
> From top of my head I'd rather do some attribute based dependency
> annotation, so that one can depend on single tests, or whole fixture
> on others single tests or whole fixture.
> 

The more thought I spend on it, the more I believe that inter-testcase
deps should be avoided as much as possible. In unit testing, (hidden)
dependencies between tests are in my experience the no. 1 cause of
flaky tests, and I see no reason why this would not also apply to
end-to-end integration testing.

I'd suggest only allowing test cases to depend on fixtures. The fixtures
themselves could have setup/teardown hooks that allow setting up and
cleaning up a test scenario. If needed, we could also have something
like 'fixture inheritance', where a fixture can 'extend' another,
supplying additional setup/teardown.
Example: the 'outermost' or 'parent' fixture might define that we
want a 'basic PVE installation' with the latest .debs deployed,
while another fixture that inherits from that one might set up a
storage of a certain type, useful for all tests that require that
specific type of storage.
On the other hand, instead of inheritance, a 'role/trait'-based system
might also work (composition >>> inheritance, after all) - and
maybe that also aligns better with the 'properties' mentioned in
your other mail (I mean this here:  "ostype=win*", "memory>=10G").
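
To make the fixture idea a bit more concrete, here is a rough sketch in
Python of how 'fixture inheritance' with setup/teardown hooks could
behave. It is purely illustrative: the names Fixture, chain and
run_with_fixture are made up for this mail and do not refer to any
existing runner code, and the hooks only append to a log instead of
touching a real PVE host.

```python
# Hypothetical sketch: all names here are invented for illustration.
from dataclasses import dataclass
from typing import Callable, List, Optional

Hook = Callable[[List[str]], None]


@dataclass
class Fixture:
    name: str
    setup: Hook
    teardown: Hook
    parent: Optional["Fixture"] = None  # the fixture this one 'extends'

    def chain(self) -> List["Fixture"]:
        """Return the fixtures from the outermost parent down to self."""
        result = []
        node: Optional[Fixture] = self
        while node is not None:
            result.append(node)
            node = node.parent
        return list(reversed(result))


def run_with_fixture(fixture: Fixture, test: Hook, log: List[str]) -> None:
    """Set up parents first, run the test, tear down in reverse order."""
    done: List[Fixture] = []
    try:
        for f in fixture.chain():
            f.setup(log)
            done.append(f)
        test(log)
    finally:
        # Only tear down what was actually set up, innermost first.
        for f in reversed(done):
            f.teardown(log)


def make_fixture(name: str, parent: Optional[Fixture] = None) -> Fixture:
    """Build a fixture whose hooks just record what they would do."""
    return Fixture(
        name=name,
        setup=lambda log: log.append(f"setup {name}"),
        teardown=lambda log: log.append(f"teardown {name}"),
        parent=parent,
    )


if __name__ == "__main__":
    base = make_fixture("basic-pve-install")
    storage = make_fixture("storage-of-some-type", parent=base)
    log: List[str] = []
    run_with_fixture(storage, lambda l: l.append("run test"), log)
    print("\n".join(log))
```

A test case would then only name the fixture it needs; the parent
chain decides what gets set up before it and cleaned up after it. A
role/trait-based variant would replace the single parent with a list of
traits, but the setup/teardown ordering idea stays the same.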

This is essentially the same pattern used by numerous other testing
frameworks (xUnit, pytest, etc.); I think it makes sense to
build upon this battle-tested approach.

Regarding execution order, I'd now even suggest the polar opposite of my 
prior idea. Instead of enforcing some execution order, we could also 
actively shuffle execution order from run to run, at least for tests 
using the same fixture.
The seed used for the RNG should be put into the test
report and could also be provided via a flag to the test runner, in case
we need to repeat a specific test sequence.
In that way, the runner would actively help us to hunt down
hidden inter-TC deps, making our test suite hopefully less brittle and
more robust in the long term.
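
As a minimal sketch of the shuffling idea (again in Python, and again
with made-up names: shuffled_order and the (fixture, test) tuples are
assumptions for this example, not an existing interface), the runner
could shuffle tests within each fixture group using a seeded RNG and
hand the seed back so that it can go into the report:

```python
# Hypothetical sketch: names and data layout are invented for illustration.
import random
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

TestId = Tuple[str, str]  # (fixture name, test case name)


def shuffled_order(
    tests: List[TestId], seed: Optional[int] = None
) -> Tuple[int, List[TestId]]:
    """Shuffle test cases within each fixture; return (seed, order)."""
    if seed is None:
        # No seed given (e.g. no runner flag): pick a fresh one.
        seed = random.SystemRandom().randrange(2**32)
    rng = random.Random(seed)

    groups: Dict[str, List[str]] = defaultdict(list)
    for fixture, name in tests:
        groups[fixture].append(name)

    order: List[TestId] = []
    for fixture in sorted(groups):
        # Sort first so the same seed gives the same sequence,
        # independent of the order in which tests were discovered.
        names = sorted(groups[fixture])
        rng.shuffle(names)
        order.extend((fixture, n) for n in names)
    return seed, order


if __name__ == "__main__":
    tests = [
        ("storage", "create-volume"),
        ("storage", "delete-volume"),
        ("storage", "resize-volume"),
        ("base", "login"),
        ("base", "list-nodes"),
    ]
    seed, order = shuffled_order(tests)
    print(f"seed: {seed}")  # this value would go into the test report
    for fixture, name in order:
        print(f"{fixture}: {name}")
```

Passing the reported seed back in reproduces the exact sequence, which
is what we would need to re-run an ordering that exposed a hidden
dependency.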

Anyway, lots of details to figure out. Thanks again for your input.

-- 
- Lukas