Since T378478, tests are allocated to split_groups in alphabetical order by filename rather than round-robin. This produces groups that are less balanced by run time: the tests for some extensions take longer than for others, and the slower extensions then slow down the split_group they are included in.
T384925 creates a service that makes combined timing information available for download. The service provides data for both unit (databaseless) and integration (database) tests, since the same PHPUnit test classes will have different test run durations depending on which PHPUnit test group is executed.
1113147 demonstrates a possible approach and shows how the duration information can be applied to the split group generation. In local testing for T378797, the current split group generation implementation (TestSuiteBuilder::buildSuites) was not able to generate balanced groups. The existing algorithm assumes that timing information will not always be available, and in the absence of timing data it splits the groups so that they have the same number of test classes. Currently this fallback class-counting behaviour dominates the splitting process, so the duration information never becomes relevant.
Investigate (possibly based on the existing patches) how to generate duration-balanced split groups using the available timing information, and implement balanced generation in mediawiki/core.
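As a starting point for the investigation, here is a minimal sketch of one well-known way to balance groups by duration: greedy "longest first" assignment, where each test class goes to the group with the smallest running total. It is written in Python for illustration only and is not the algorithm used by TestSuiteBuilder::buildSuites or by 1113147. The function name `build_balanced_groups`, the shape of the `durations` mapping, and the median fallback for classes without timing data are all assumptions.

```python
import heapq
from statistics import median


def build_balanced_groups(test_classes, durations, group_count):
    """Assign test classes to group_count groups, balancing total duration.

    durations maps a test class name to its run time in seconds and may be
    incomplete or empty (hypothetical data shape).
    """
    known = [durations[name] for name in test_classes if name in durations]
    # Assumed fallback: a class without timing data gets the median known
    # duration. With no timing data at all, every class weighs 1.0, which
    # reduces the algorithm to balancing by class count.
    default = median(known) if known else 1.0
    weighted = [(durations.get(name, default), name) for name in test_classes]
    # Longest first; ties are broken by name so the output is deterministic.
    weighted.sort(key=lambda item: (-item[0], item[1]))

    groups = [[] for _ in range(group_count)]
    # Min-heap of (total duration so far, group index).
    heap = [(0.0, index) for index in range(group_count)]
    heapq.heapify(heap)
    for weight, name in weighted:
        total, index = heapq.heappop(heap)  # currently lightest group
        groups[index].append(name)
        heapq.heappush(heap, (total + weight, index))
    return groups
```

Because duration is the only weight and class counting appears only as the default weight, the fallback cannot override real timing data. This is one possible way to avoid the problem described above, not a confirmed fix.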
Acceptance Criteria
- Make changes to mediawiki/core to download the combined timing data and apply it to the split_group generation.
- Ensure that the split_groups are more balanced (i.e. the difference between the maximum and minimum runtime of the split groups is lower) in the presence of timing data than in its absence.
- Ensure that there is no regression in group duration balancing after whatever changes are made, even if timing data is no longer available.
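
The balance measure in the criteria above (the difference between the maximum and minimum group runtime) could be checked with a small helper like the one below. This is an illustrative Python sketch: the name `duration_spread`, the data shapes, and the choice to count classes without timing data as zero seconds are assumptions, and the durations in the usage example are made up.

```python
def duration_spread(groups, durations):
    """Return max group runtime minus min group runtime, in seconds.

    groups is a list of lists of test class names; durations maps a class
    name to its run time. Classes without timing data count as 0.0 here,
    which is an assumption made for this sketch.
    """
    if not groups:
        return 0.0
    totals = [
        sum(durations.get(name, 0.0) for name in group) for group in groups
    ]
    return max(totals) - min(totals)


# Made-up durations, for illustration only.
example_durations = {"ATest": 10, "BTest": 9, "CTest": 2, "DTest": 1}
alphabetical = [["ATest", "BTest"], ["CTest", "DTest"]]
rebalanced = [["ATest", "DTest"], ["BTest", "CTest"]]

spread_alphabetical = duration_spread(alphabetical, example_durations)
spread_rebalanced = duration_spread(rebalanced, example_durations)
```

A lower value means more balanced groups, so comparing the spread with and without timing data gives a direct check for the second and third criteria.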