
Benchmarks #112

Open · wants to merge 11 commits into main
Conversation

ethomson (Member)
Introduce a simple benchmark system for clar. This is an extension of #74, adding some additional capabilities:

  • Tests can now have multiple runs, which will be timed individually (and in aggregate).
  • When running in benchmark mode, clar will run at least 10 iterations of the test function, and will attempt to run 3 seconds' worth of iterations by default. The number of iterations can be controlled by the test itself.
  • We use a high-resolution monotonic timer on all platforms, if available.
  • We display helpful output in benchmark mode, and produce a JSON file with all timing information; both outputs are inspired by (but not identical to) hyperfine's output.

Allow the resulting application name to be configured by the user, instead of hardcoding `clar_suite.h` and `clar.suite`. This configuration also customizes the struct names (`clar_func`, etc.).

Also allow the test names to be configured by the user, instead of hardcoding `test_` as a prefix. This allows users to generate test functions with uniquely prefixed names, for example when generating benchmark code instead of unit tests.
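
As an illustration (the `benchmark_` prefix below is hypothetical, not a value the PR prescribes), a generated function declaration might then look like:

```
/* Default prefix: */
void test_spline__reticulation(void);

/* With a hypothetical "benchmark_" prefix configured in the generator: */
void benchmark_spline__reticulation(void);
```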
Tests can now have optional metadata, provided as comments in the test
definition. For example:

```
void test_spline__reticulation(void)
/*[clar]: description="ensure that splines are reticulated" */
{
	...
}
```

This description is preserved and produced as part of the summary XML.

Move the elapsed time calculation to `counter.h`, and use high-resolution monotonic performance counters on all platforms.
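
As a rough sketch of the technique (the function name and exact structure here are assumptions, not the PR's actual `counter.h`), a high-resolution monotonic timer is typically built like this:

```
/* Illustrative only: a monotonic elapsed-time helper. */
#if defined(_WIN32)
# include <windows.h>

static double counter_now(void)
{
	LARGE_INTEGER freq, now;
	QueryPerformanceFrequency(&freq);
	QueryPerformanceCounter(&now);
	return (double)now.QuadPart / (double)freq.QuadPart;
}
#else
# include <time.h>

static double counter_now(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
#endif

/* Elapsed time for a run is then counter_now() - start. */
```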
Refactor the `ontest` callback (which is implicitly test _finished_) into separate test-started and test-finished callbacks. This allows printers to show the test name (at start) and its conclusion in two steps, so that users can see the current test during long-running test executions.

In addition, rename `onsuite` to `suite_start` for consistency.
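
For illustration, the split roughly corresponds to an interface of this shape (the struct and member names are assumptions, not the PR's actual declarations):

```
/* Hypothetical printer interface after the refactor: */
struct example_report {
	void (*suite_start)(const char *suite_name);
	void (*test_start)(const char *suite_name, const char *test_name);
	void (*test_finish)(const char *suite_name, const char *test_name,
	                    int failed, double elapsed_seconds);
};
```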
Allow tests to specify that they should have multiple runs. These runs all occur within a single initialization and cleanup phase, which is useful for repeatedly testing the same thing as quickly as possible. The time for each run is recorded, which may be useful for benchmarking that test.
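
Conceptually, timing each run looks something like the sketch below (names are hypothetical, not clar's internals):

```
#include <stddef.h>

/* e.g. the monotonic timer sketched earlier */
extern double counter_now(void);

/* Illustrative only: run a test body `runs` times, recording each run's time. */
static void time_runs(void (*body)(void), size_t runs, double *times)
{
	size_t i;

	for (i = 0; i < runs; i++) {
		double start = counter_now();
		body();
		times[i] = counter_now() - start;
	}
}
```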
An application can provide _benchmarks_ instead of _tests_. Benchmarks can run multiple times; the time of each run is recorded, along with some simple aggregate data (mean, min, max, etc.). This information will be displayed and will optionally be emitted in the summary output.

Test hosts can indicate that they're benchmarks (not tests) by setting
the mode before parsing the arguments. This will switch the output and
summary format types.
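
As a sketch of that kind of aggregation (not the PR's actual code):

```
#include <float.h>
#include <stddef.h>

struct run_stats { double mean, min, max; };

/* Illustrative only: summarize recorded run times. */
static struct run_stats summarize(const double *times, size_t count)
{
	struct run_stats s = { 0.0, DBL_MAX, 0.0 };
	size_t i;

	for (i = 0; i < count; i++) {
		s.mean += times[i];
		if (times[i] < s.min) s.min = times[i];
		if (times[i] > s.max) s.max = times[i];
	}

	if (count > 0)
		s.mean /= (double)count;
	else
		s.min = 0.0;

	return s;
}
```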
In benchmark mode, when the number of runs is not explicitly specified in the test itself, run a reasonable number of iterations. We do this by measuring one run of the test, then using that measurement to determine how many iterations fit within 3 seconds. (There is a minimum number of iterations to ensure that we get some data, and a maximum number to deal with poor precision for very fast test runs.)

The 3-second target and 10-iteration minimum were chosen by consulting the hyperfine defaults.
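
A sketch of that calculation (the 3-second target and 10-run minimum come from the description above; the maximum and all names are assumptions):

```
#include <stddef.h>

#define MIN_RUNS       10        /* minimum, per the description above */
#define MAX_RUNS       1000000   /* hypothetical cap for very fast tests */
#define TARGET_SECONDS 3.0       /* target total time, per the description */

/* Illustrative only: pick an iteration count from one measured run. */
static size_t runs_for(double one_run_seconds)
{
	size_t runs;

	if (one_run_seconds <= 0.0)
		return MAX_RUNS;

	runs = (size_t)(TARGET_SECONDS / one_run_seconds);

	if (runs < MIN_RUNS)
		runs = MIN_RUNS;
	if (runs > MAX_RUNS)
		runs = MAX_RUNS;

	return runs;
}
```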
For multi-run tests (benchmarks), we introduce a `reset` function. By default, the initialization is called at the start of each run and the cleanup at its finish. A benchmark may instead wish to set up multi-run state once at the beginning of the invocation (in initialization) and keep a steady state through all runs. Users can now add a `reset` function, invoked between runs, so that initialization occurs only at the beginning of all runs.
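
For illustration, assuming the `reset` hook follows the existing `__initialize`/`__cleanup` naming convention (an assumption; the exact declaration may differ):

```
#include <stdlib.h>
#include <string.h>

static int *items;

void benchmark_list__initialize(void)
{
	/* called once, before all runs: set up shared state */
	items = calloc(1024, sizeof(int));
}

void benchmark_list__reset(void)
{
	/* hypothetical reset hook: restore a steady state between runs */
	memset(items, 0, 1024 * sizeof(int));
}

void benchmark_list__cleanup(void)
{
	/* called once, after all runs */
	free(items);
}
```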
Well-written clar tests (those that clean up after themselves) are
capable of running in benchmark mode. Provide it as an option.