Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Api merge without fixes #223

Draft
wants to merge 64 commits into
base: main
Choose a base branch
from

Conversation

cgsheeh
Copy link
Member

@cgsheeh cgsheeh commented Feb 6, 2025

No description provided.

cgsheeh and others added 30 commits December 14, 2023 10:35
…it` diff (Bug 1709608) (#364)

Extend `get_uplift_conduit_state` to also query `differential.querydiffs`
and return all known diffs associated with the selected uplift
revisions. When creating the uplift revisions, add a function to
parse through the diffs and return the latest diff that does not
have a `creationMethod` of `commit`. Then find the associated diff
response from `differential.diff.search` and pass both along to the
uplift revision creation function. When parsing binary from the
`getrawdiff` API response, re-use the metadata from the `querydiffs`
response so the pre-uploaded binary file is re-used in the uplift revision.
Update `ruff` and `black` to the latest versions, and apply
some fixes that `ruff` finds to the codebase.
While debugging bug 1871425 we orignally had a suspicion that
the issue was resulting from Lando erroneously applying later
patches in the stack when it should only have applied the
requested patches. I noticed we don't have a test for this
case, so I added one myself, and it passed as expected. We
might as well keep this simple test around just to provide
extra test coverage of this use case.
…es (Bug 1843041) (#369)

Currently a Lando based uplift request populates the required fields
in Phabricator by pulling down the raw diff from `differential.getrawdiff`,
passing the diff through `rs-parsepatch` and populating a dict in
the format Phabricator expects. This has worked decently well to date,
however this method doesn't include the full file context, leading
to diff display issues on Phabricator.

The `differential.querydiffs` endpoint returns data about a diff in
the expected Phabricator format. We have already started using this
data to populate the binary files metadata for uplift requests.
Taking a closer look at the response we can see that the full file
context is included in this format, and the diff itself is the
exact same diff we expect to be uploaded as the uplift request on
Phabricator.

Remove the code which parses the `differential.getrawdiff` response
into the expected Phabricator format and instead simply re-use the
data from the `differential.querydiffs` response for the uplift
revision. This will provide a more accurate view of the diff that
will land on the uplift train. The content of the file itself may
be different than what exists on the target train, however this
problem already exists in the existing Lando uplift implementation.
Add the `--verbose` flag to the `mach lint` call so more information
is displayed in log output when linters fail.
…ug 1856590) (#373)

Some patches sent with Git will include whitespace at the end of the
patch, meaning the Git version info is the last 3 lines instead of
2. Add a function which scans the lines in reverse and looks for
the first line beginning with `--` instead of using a naive slice
that removes the last two lines of output.

Update the Git parsing test to include extra lines of whitespace
to account for this change in the main tests for `GitPatchParser`.
Also add a simple test for the new parsing function.
…(#375)

Also rename the `r` variable to `reviewers` while we are here.
…451) (#376)

In bug 1864463 we added a snippet so pushlog operational failures in
hgmo resulted in a retry instead of a permanent failure of the landing
job. The snippet we check for was the very generic `abort: push failed
on remote`, which is actually an error message seen for many other
failures, including problems with the patch that will always cause
the push to be rejected.

Remove the generic snippet and replace with a more specific error about
pushlog operational failures. Also add a snippet about closed connections.
`transplants.py` has a function for comparing the equality of two
confirmation tokens that is a simple wrapper around `==` to allow
for easy replication of unequal tokens in test conditions. However,
the pytext fixture which is hooked into this shim is unused. Remove
the shim and replace the function with a simple `==` comparison.
…25) (#381)

Due to the WPT linter breaking in a way that is difficult to debug,
we have had autoformatting disabled in Lando for a short while. There
are still some blockers with bug 1807712 that mean we can't directly
move to `mach format`. In the meantime, to re-enable autoformatting,
add an explicit allow list for formatters which will run in Lando
and update the call to `mach lint` to use them.
…landing (Bug 1880861) (#382)

Add a blocking check that enforces revisions with the `needs-data-classification`
tag cannot land. The check is similar to the testing policy project tag checks
except it simply looks for the `needs-data-classification` tag and
blocks landing if it is present on the revision. The tag requires the
PHID of the project tag so it is generated with a factory function in a
similar manner to the release manager approval blocking check.
In the list comprehension where we set `job.revision_order` in
`create_landing_job_with_no_linked_revisions`, we iterate over
`revisions` and set the iteration variable as `r`, but the
value that ends up in the list uses the leftover `revision`
variable from the loop iteration above. Rename `r` to `revision`
so the correct variable is used.
Add the existing `large-repo` Phabricator repo to Lando so patches can
be landed there for testing.
…472) (#283)

This commit implements the Treestatus API as a feature of Lando.
The Treestatus OpenAPI spec is added to Lando's API spec and most
endpoints are re-implemented here. The following Treestatus endpoints
are removed in the Lando implementation:

- `/v0/*` endpoints.
- `DELETE /trees2`

The following features are also removed in the Lando implementation:

- StatusPage updating.
- Pulse message publishing.

Treestatus' database, caching and logging is changed to integrate with
Lando's Flask platform.

A new `group` argument is added to the Auth0 decorators to specify the
user must be a member of the specified groups to be permitted to
access certain API endpoints, and two new `TREESTATUS_ADMIN` and
`TREESTATUS_USER` group name constants are added. A `user_id()`
function is added to `A0User` to retrieve the current user ID string
for usage in the `who` field of status updates.

Integration tests for the endpoints are added to show typical usage
patterns and ensure the endpoints return output in a format consistent
with the existing Treestatus API.

A few database adjustments are made in this migration. First, the `tree`
field of `Tree` is now a unique index instead of a primary key, as Lando
models all have an `id` field as the primary key. This also involves
updating some `.get()` calls in SQLALchemy to instead by `filter_by`.
Second, the custom `UTCDatetime` object used by the existing Treestatus
implementation is replaced with a normal `DateTime` SQLAlchemy object.

Since Treestatus is now a part of Lando after this PR, the existing
Treestatus client is removed, and replaced with an `is_open` function
that directly queries the DB to retrieve the status of the tree. The
TreestatusDouble test mocker is removed and replaced with a simple
fixture to create a tree and set it to a specific status in the
tests. The `TreestatusSubsystem` is also removed as it is no longer
necessary.

After the above changes, many more code cleanups and refactorings
were completed to make the code easier to reason about and more
maintainable.
This commit begins the process of unifying and refactoring the
landing checks system, which blocks stacks and revisions from
landing and issues warnings about various issues in the landing
that are not blockers. The state of this revision gives us a
base to work on for extending the lints system to add Mercurial
commit hooks, clean up some unnecessary data fields, and improve
the user experience around presentation of landing lints.

The existing checks are split up into three categories. First are
stack-level blocker checks, which inspect the entire state of the
stack and block landings entirely. The second are revision-level
blocker checks which inspect the state of individual revisions and
block landing of each revision and it's successor nodes. Finally
are the revision-level warnings, which do not block landings but
will present information about sub-optimal landing conditions that
must be acknowledged before landing.

Each piece of state used by individual checks is added to a new
TransplantAssessmentState class. An instance of this class is
passed to each lint to keep the API consistent and discoverable.
Some fields on this class have been moved directly from existing
checks, and future refactors should reduce the number of fields on
this class.

The transplant assessment API and the calculate_landable_subgraphs
API are combined into a single API as part of this commit. Since
the previous setup had two different use cases, some state is not
required to run landing checks on non-transplant assessments. All
pieces of state required to check for a valid landing request are
encapsulated in LandingAssessmentState, which is an optional field
on TransplantAssessmentState. Checks which should only run on
landing assessments can look for the existance of a LandingAssessmentState
on the TransplantAssessmentState.

The calculate_landable_subgraphs API and the custom graph walking
code is replaced by the new APIs. When building the TransplantAssessmentState
we create a deep copy of the RevisionStack object and name it the
landing_stack, where blocked revisions are removed from this stack.
Then all landable paths in the landing_stack from the root revisions
to the leaf revisions are considered to be landable subgraphs. A new
leaf_revisions method is added which returns all nodes that have
no successors in the graph. Each node in the original stack object has a
blocked attribute which is initialized to an empty list. When a
blocking check fails on a revision the reason is appended to the
list in the full revision stack.
…e_state` (Bug 1895933) (#400)

Make `landing_state` a separate argument in `create_state`, so callers
who want to create a custom landing state must do so with
`create_landing_state` directly instead of indirectly via
`create_state`.
This commit adds a blocking check to reject the introduction of
symlinks into the repo. It is a port of the `prevent_symlinks.py`
Mercurial server-side hook. First we re-introduce the `rs-parsepatch`
library as a dependency. We add a `get_parsed_diffs` function to
query Conduit's `differential.getrawdiff` API endpoint to retrieve
the raw diff content for each Phabricator diff, pass the contents
through `rs_parsepatch.get_diffs` and add the response to the
`StackAssessmentState` object. This will allow us to share the
parsed diff contents between this check and future diff-inspecting
checks that will be introduced shortly. We then add a revision-wise
check which inspects the `new` file modes for each file in the diff and
asserts the mode does not match the known symlink decimal notation.
This command was a one-off to import data into Lando from
RelEng Treestatus. It should be removed to avoid being run
again.
…1896696) (#406)

Our current snippet parsing for `hg push` output looks for `is CLOSED`
to determine if a tree is closed. When the Treestatus API returns an
error, it will instead print that is it "treating as if CLOSED". Since
we don't check for that specific string we can sometimes treat closures
as permanent failures. Add a snippet to account for this tree-closed
state.
…) (#405)

There is a test repo `uplift-target` used in some tests that is
added to the `local-dev` repo config. In reality this repo should
be present in the mocked out testing-only `test` repo config. Move
this test repo to the correct location and update tests with fixtures
to make the repo available.
cgsheeh and others added 30 commits May 23, 2024 10:30
…97007) (#411)

Similar to the logic for blocking checks, only add revision warnings
to the assessment if the PHID in question is in the relevant assessment
set. I noticed that warnings for successor revisions were appearing when
trying to land only part of a stack, which this PR resolves.
… (Bug 1895931) (#401)

Add a blocking check that asserts the `try_task_config.json` file is not
introduced erroneously to a non-try repo.
…9584) (#417)

Add a `DiffAssessor` class as an internal API for running
diff-based checks. Move existing diff-inspecting checks
into the API and update landing checks to use them. Add
a `run_diff_checks` method which runs each check and
returns a list of string errors for display on failure.
Hook the `run_diff_checks` API into the try push revision
building function, so the currently ported checks are
run on try pushes. Add integration tests to show that
pushes are accepted/rejected as appropriate based on
these diff checks.

Since some checks will require inspection of the landing
repo to determine if the check is valid, we add a
`get_repo_for_revision` method that returns the repo for
a revision based on the passed `revision` Phabricator API
object. The logic is derived from the uplift blocking
check where it is updated, and is re-used in the
`try_task_config.json` blocker test.
…ug 1902545) (#420)

Port the commit message validation hook from hgmo into a Lando
`DiffAssessor` check. Most of the code and accompanying regexes
are ported directly from version-control-tools, with a few minor
readability cleanups and removals of some unused features around
vendored directories and deprecated workflow checks. Tests from
v-c-t are ported as well and some additional checks are added as
required.

Not ported here is the `IGNORE BAD COMMIT MESSAGES` override, which
will be added at a later date.
…ando checks (Bug 1903456) (#425)

Add a `check_wpt_sync` check to the `DiffAssessor` class which
is a port of the `prevent_wptsync_changes.py` hook in v-c-t.
The hook uses a regex to assert that the WPTSync bot user cannot
make changes to files outside of a specific directory in the main
tree. The bot is allowed to push any changes to `try`, but is
otherwise only allowed to make changes to files matching the regex.

Implement the `prevent_nspr_nss_changes.py` hook from v-c-t as a
check on the `DiffAssessor` API within Lando. The hook checks for
file changes under given directories in the tree and prevents updates
without a specific keyword in the commit message.
…3462) (#426)

Add a check to the `DiffAssessor` API that prevents introduction of
the `.gitmodules` file, which tracks Git submodules in the repository.
Due to a regression in newer Python build tools, our
currently installed version of `celery` fails to install
in the Docker image build as a result of improperly formatted
version specifiers. Install the latest version of `celery`
to resolve the issue, which also requires an update to
the `kombu` dependency. Updating `kombu` then causes a
regression in `redis`, so update `redis` to the latest
`3.x` version.

In newer versions of `celery` the pytest fixtures are disabled
by default, so add a `pytest_plugins` variable to the top-level
`conftest.py` to enable it.

Celery commands now fail when `sys.argv[0]` is passed as part
of `argv`, so remove them instead. The worker also requires the
`worker` command as the first argument, so update the call to
add it. We also add the broker URL as the `results_backend` in
the worker config.
Since `hg id <repo_url>` hits the network to connect to hgmo,
it can fail due to internal server errors and other connectivity
issues. In this case, we want the landing job to be retried in
the same fashion as when failing to pull/push. Map the exception
raised by `run_hg` to an `HgException` so the failure will be
considered a retry in the landing worker.
…t is found (Bug 1921023) (#433)

Previously we had just removed the last two lines of output,
however if the patch has extra newlines this could fail and
cause the patch to be deformed. Use the helper function we
added to avoid issues with parsing the commit message.
… 1893976) (#434)

When parsing the Git patch sent through `git format-patch`, the
email parser returns escaped unicode. Explicitly set the charset
of the email to `utf-8` so the unescaped unicode string is returned.

Add a test which uses a real-life example of this issue, which fails
without specifying the charset but passes afterwards.
…(Bug 1915671) (#432)

Adds an API for push-wide checks and implements the bug references check
using it. The API defines a `PushCheck` interface which initializes the
check with a repo, defines a `next_diff()` function which accepts each diff 
parsed as a `PatchHelper`, and a `result()` function which returns the error
message for the check, or `None` if the check passed.

The `BugReferencesCheck` is the first check to be implemented using this
new API. The check is only relevant on `try` repositories. Each diff is
passed via `next_diff`, and if `SKIP_BMO_CHECK` is found in the commit
message of any diff, a flag is set which indicates the check will be
skipped, otherwise the bug numbers found in the commit message are
stored. On `result()` the `skip_check` flag will cause an unconditional
pass, otherwise BMO is pinged to check the bug numbers in the commit
message are all referencing public bugs.

Since the `author` field was always assumed to be an email address, it
was renamed to the more accurate `email` name.
…91) (#435)

`eslint` can make more than just formatting fixes, so we should disable
it.
…20) (#437)

`headers` includes headers passed to the callers, which we add to the
`common_headers` dict to include things like the UA string and
API key. When we create the actual request, we are instead passing
the `headers` object and ignoring our common headers. Pass
`common_headers` to `requests.request` instead.
… 1926431) (#439)

Change `parse_bugs` to use `BUG_CONSERVATIVE_RE` to avoid incorrect parsing
of bug numbers during the `BugReferencesCheck`. Other code paths use the
`parse_bugs` function as well, but their tests pass after this change, so
we fully remove `BUG_RE` since it is no longer used.

Also remove the accidental double-definition of `BUG_RE` and
`BUG_CONSERVATIVE_RE` from `commit_message.py`.
…927013) (#440)

The `PhabricatorDouble` mock API calls take the `limit` parameter,
but do not actually limit the number of entries in the response data.
Add a check if `limit` is set and slice the `items `list at that index.
… (#438)

Add a warning that checks when a revision has diffs from multiple
authors associated with it. First we add support to specify a diffs
author in the test suite. Then we update `request_extended_revision_data`
to pull diffs based on the `revisionPHID` instead of the diff `id`,
which returns all diffs for a given revision. We update
`gather_involved_phids` to take the list of diffs for a revision and
gather all diff authors as involved in the revision. We then add
the warning which gathers all the PHIDs that have authored diffs
on a revision, and emit a warning when there are multiple authors,
including the Phabricator username of each author.
- ci-admin was moved into ci-configuration.
- ci-configuration was moved to Github.
- taskgraph was moved to Github.
- fluent-migration was moved to Github.
- build-tools is deprecated.
…15) (#441)

`parse_git_author_information` is using `email.utils.parseaddr` to
parse the authorship information from Git and Mercurial patches.
In Mercurial, the standard Git format is not required, so users can
use values like `ffxbld` as their author header. When the parser
encounters fields such as this, the improperly formatted name
is returned as the email.

Update `parse_git_author_information` to return the raw header
as the username when the email is parsed as empty.
…CollectionCheck` (Bug 1932507) (#442)

This commit refactors the `DiffAssessor` API to more closely
match the `PatchCollectionCheck` API. A `PatchCheck` abstract
class is introduced which has a similar interface as the
`PatchCollectionCheck`, where each check can store state
as each part of the patch is iterated over using `next_diff`,
and `result()` is called to determine the final result of
the check. Each check in the `DiffAssessor` class is moved
to a class which subclasses `PatchCheck`.

To make the patch checks more closely resemble the patch
collection checks, each check to be run is passed as an
argument to `run_patch_collection_checks`, which allows
us to run the relevant checks based on the code path
or known relevant repository instead of unconditionally
running all checks. As a result, all logic around inspecting
the repo and determining if a check is relevant is removed
from the checks, simplifying the APIs.
… (#443)

Currently when an uplift without a bug number hits the
Bugzilla update code path, an exception is thrown which
is turned into a failure email indicating Bugzilla could
not be updated. This commit changes the behaviour to
avoid trying to update Bugzilla if no bug numbers are
detected.
…rs` calls (Bug 1936373) (#446)

Catch any exceptions thrown while setting reviewers from `mots.yaml`, so
they do not take down the landing worker and cause lag in the queue.
…936373) (#447)

I used `logging.info` instead of `logger.info` in my haste, so this
logging message is not being sent to the proper location.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants