Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework ci and add e2e tests scaffolding #109

Open
wants to merge 1 commit into
base: trunk
Choose a base branch
from

Conversation

reimic
Copy link
Collaborator

@reimic reimic commented Jun 16, 2024

Summary

Rework CI with 2 dispatch workflows: automatic as a quick and thick quality gate for all pushes and pull requests and manual for a nice and easy way of geeting details on failing tests.

The first step toward achieving robust end-to-end (#108) coverage.

Major Changes

  • add a fast-failing broad automatic-dispatch run for pushes and merge requests
  • add a modular manual-dispatch run
  • add unit and e2e test suites to phpunit.xml.dist
  • add the defaultTestSuite ("all") attribute to phpunit.xml.dist
  • move the testdox flag from scripts to phpunit.xml.dist
  • add the E2ETestCase config for all e2e tests
  • add shell scripts to the README

The manual-dispatch allows for control over the branch it will run on, php version, os and test suite. Every default is set to all and the parameter FAIL_FAST is always set to false. This dispatches the run with a configuration that is broad and offers maximum info on the condition of the tested branch.

The automatic-disptach is more rigid and always tests all php versions, os and with all test suites. It also has the FAIL_FAST parameter set to true. This also introduces --stop-on-defect --fail-on-warning --fail-on-risky PHPUnit flags for enchanced quality control. Yet, this is a job-level feature. This means that the whole test matrix will always run (3 os * 7 php versions => 21 jobs). The benefit of the FAIL_FAST parameter is to run as few tests as possible to ascertain if a os-php version combination is stable.

I'd hope this will enable a workflow going more or less like that:

  1. You work on a feature on your local fork.
  2. After work is done you run tests locally with vendor/bin/phpunit, but unless you are really into PHP versions this will run on the single version you have configured as your default project interpreter C:\php\php-8.3.exe. Obviously, the local test run also only tests the code performance on your OS. For examples sake, let's say you are developting on a Mac and that all tests have passed.
  3. You push your code and the automatic-disptach kicks in. GitHub quickly informs you that some of the jobs failed. Since all 21 jobs have run you have an instant and quite broad idea of what is going on. Let's assume your Mac-coding skills are flawless and all but the Windows jobs have passed.
  4. You go the the manual-dispatch and pick your branch, windows-latest as the os to run on and decide to check all php versions. With an initial idea on the scope of your changes and data from the automatic runs you might have also conclued that only e2e tests actually failed. To save a few precious seconds you decide to only run the e2e suite.
  5. You enjoy precise infos and helpfull stacktraces for failing Windows tests.

Example Usage

  1. In the Actions tab, pick Manual Tests Dispatch.
    image
  2. Click the Run workflow button:
    image
  3. Customize your dispatch run with inputs:
    image
  • pick which branch to test:
    image
    image
  • pick which PHP version to use (all is default and will use all versions)
    image
    image
  • pick which OS to run on (all is default and will run on all os)
    image
    image
  • pick which PHPUnit test suite to run (all is default and will run both unit and e2e tests)
    image
    image
  1. Inspect final dispatch configuration:
    image
  2. Disptach tests.
    image
    image
  3. Enjoy the autogenerated run name on GitHub:
    image
  4. Inspect results:
    image
    (in the example only unit tests were selected, so e2e tests were skipped)

 add a fast-failing broad automatic-dispatch run for pushes and merge requests providing a quality gate
 add a modular manual-dispatch run providing details on issues
 add shell scripts to the README
 add defaultTestSuite ("all") attribute the phpunit.xml.dist
 move testdox flag from scripts to phpunit.xml.dist
@reimic reimic added the tests label Jun 16, 2024
@reimic reimic requested a review from adamziel June 16, 2024 00:03
@adamziel
Copy link
Collaborator

How slow is the manual workflow? Couldn't it just run automatically? Typically everyone forgets about these manual workflows and they get stale and start failing soon.

@reimic
Copy link
Collaborator Author

reimic commented Jun 20, 2024

Typically everyone forgets about these manual workflows and they get stale and start failing soon.

Couldn't it just run automatically?

I see why you would inquire about that. To clarify: it's not possible for them to get stale or desync in any way - they run the same test suites that are run automatically. This is not a different set of tests - it is a different way of running tests, with modularity in mind.

When considering their usefulness - think "dev tool" not quality gate. When pushing or PR-ing all test suites do run automatically.

How slow is the manual workflow?

On my machine e2e tests already take a while (with just 3/4 implemented). Mind that this is for a single php - os run. The containers on GitHub are a bit faster, since the download times are much better. (Probably due to mirrors and data transfer speed.) Also they do try to run in parallel, yet they start to queue up quite quickly. To speak beyond numbers - I have noticed the issue almost instantly. And this worried me to the extent that I've tackled it on two fronts - see "observations" here: #108.
If not addressed it will hinder our goal to offer superb reliability. Since that requires tests for various complex blueprints with larger downloads.

To reiterate:

  • the manual workflow is a dev tool not a quality measure,
  • it can't get stale,
  • it lets you run all 21 combinations on your fork, at any time you wish, in a form fine-tuned for the issue you are facing and can't get info on locally,
  • it does not strain shared resources,
  • this is a PR that introduces the concept not a fully fleshed out version, I'll work on it and make it more powerful.

@adamziel
Copy link
Collaborator

Can you make it fail fast and also produce the full error output?

@adamziel
Copy link
Collaborator

Eg mark the step as failed to make the job red, and then, on failure, start a full test run that won't stop after the first failure

@reimic
Copy link
Collaborator Author

reimic commented Jun 20, 2024

Eg mark the step as failed to make the job red, and then, on failure, start a full test run that won't stop after the first failure

Sure. GitHub steps produce a conclusion and an outcome that can be used in logical expressions within a job. (Side note: a workflow has jobs that may include steps) There is a faint difference between one and another and they might be helpful here. Workflows share this feature and can trigger workflows depending on their status:

on:
  workflow_run:
    workflows: [Build]
    types: [completed]

jobs:
  on-success:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    steps:
      - run: echo 'The triggering workflow passed'
  on-failure:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    steps:
      - run: echo 'The triggering workflow failed'

Help me see your point. What you are saying is that the initial automatic fast-failing workflow would trigger another more robust workflow. Which is fine and clearly doable, but the new jobs will compete over the same resource pool.

I could explore 2 strategies here:

  • The automatic workflow fails ultra fast (at first sign of problems to any step in any job) and then queues the robust workflow. This makes initial feedback rapid and also automatically provides robust info, but with a slight delay. I wonder if I could deprioritize it, to make all automatic workflows skip to the front of the queue.
  • The automatic workflow fails fast (jobs fails fast) and the queued robust workflow only runs for the php-os versions that reported issues.

Of course all this can notify interested parties and all that jazz. But that is beyond the point here.

I am on the fence here. With the current state of this PR, if an automatic workflow fails. You can decide if you care about further development and if you do - for the cheap price of 2 clicks - run the manual workflow on your own fork to see how your code performs for the php-os combination of your choosing.

I see how this could be improved upon, but my recommendation is as follows: Marge this as it is. I'll add several e2e tests next and once I am done and execution times are more apparent we'll circle back to this issue and evaluate how useful it has proven to be and how it could be enhanced.

@adamziel
Copy link
Collaborator

I mean the tests we have took 37s to finish, this fast/slow mode seems a but premature. Let's just ship the full version and scale back as needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants