Updating odh kserve kustomize manifests #390

Draft
wants to merge 118 commits into
base: master-bad

@KillianGolds KillianGolds commented Jul 23, 2024

What this PR does / why we need it:

Updating deprecated kustomize labels.

Here is a gist comparing the output of the kustomize build command with the original deprecated syntax and with the updated syntax: https://gist.github.com/KillianGolds/ef98951981f04ce5f95dbca514a328ea

With the old syntax, storageInitializer resolves as follows:

storageInitializer: |-
    {
        "image" : "quay.io/opendatahub/kserve-storage-initializer:latest",
        "memoryRequest": "100Mi",
        "memoryLimit": "1Gi",
        "cpuRequest": "100m",
        "cpuLimit": "1",
        "enableDirectPvcVolumeMount": true
    }

The new syntax overwrites the whole JSON value and leaves only the image:


storageInitializer: quay.io/opendatahub/kserve-storage-initializer:latest

The same is happening for router, logger, agent and batcher.

I believe this is due to Kustomize v5's lack of JSON handling.
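The difference between the two outputs above can be sketched in a few lines of Python. This is only an illustration of the behaviour, not how kustomize is implemented: `replace_whole_value` and `replace_image_only` are hypothetical helper names, and `example.test/storage-initializer:v1` is a made-up image reference. The first helper mirrors what the new syntax appears to do (the entire string value is swapped for the image); the second shows the result the old syntax produced (only the "image" key inside the JSON changes).

```python
import json

# The storageInitializer value as it resolves with the old syntax (a JSON string).
OLD_VALUE = """{
    "image" : "quay.io/opendatahub/kserve-storage-initializer:latest",
    "memoryRequest": "100Mi",
    "memoryLimit": "1Gi",
    "cpuRequest": "100m",
    "cpuLimit": "1",
    "enableDirectPvcVolumeMount": true
}"""


def replace_whole_value(current: str, new_image: str) -> str:
    """Hypothetical model of the new behaviour: the whole field is overwritten."""
    return new_image


def replace_image_only(current: str, new_image: str) -> str:
    """Hypothetical model of the desired behaviour: only the "image" key changes."""
    settings = json.loads(current)
    settings["image"] = new_image
    return json.dumps(settings, indent=4)


if __name__ == "__main__":
    image = "example.test/storage-initializer:v1"  # made-up image for illustration
    print(replace_whole_value(OLD_VALUE, image))
    print(replace_image_only(OLD_VALUE, image))
```

With the second helper, the resource requests, limits and enableDirectPvcVolumeMount survive the update, which is what the router, logger, agent and batcher entries would also need.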

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • Test A

  • Test B

  • Logs

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Release note:


Re-running failed tests

  • /rerun-all - rerun all failed workflows.
  • /rerun-workflow <workflow name> - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.

taneem-ibrahim and others added 30 commits July 10, 2023 18:15
…endatahub-io#18)

**What this does / why we need it**:

    This PR adds custom code to make KServe run on OpenShift without the need for anyuid SCC.

**More context**:

OpenShift uses istio-cni which causes an issue with init-containers when calling external services
like S3 or similar. Setting the uid for the storage-initializer to the same uid as the
uid of the istio-proxy resolves the issue. In OpenShift the istio-proxy always gets assigned
the first uid from the namespaces uid range + 1 (The range is defined in an annotation on the namespace).
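As a rough sketch of the uid rule described above (not the code added by this commit), the calculation could look like the following. It assumes the namespace annotation is `openshift.io/sa.scc.uid-range` and that its value has the form `<start>/<size>`; neither detail is stated in this commit message, and `istio_proxy_uid` is a hypothetical helper name.

```python
# Assumed annotation name; the commit message only says "an annotation on the namespace".
UID_RANGE_ANNOTATION = "openshift.io/sa.scc.uid-range"


def istio_proxy_uid(namespace_annotations: dict) -> int:
    """Return the first uid of the namespace's range plus one."""
    uid_range = namespace_annotations[UID_RANGE_ANNOTATION]
    # Assumed value format: "<start>/<size>", e.g. "1000660000/10000".
    start, _, _size = uid_range.partition("/")
    return int(start) + 1


if __name__ == "__main__":
    annotations = {UID_RANGE_ANNOTATION: "1000660000/10000"}
    print(istio_proxy_uid(annotations))
```

The storage-initializer would then be given the same uid as the value computed here.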

**Release note**:

```release-note
The `storage-initializer` container will now run with the same `uid` as the `istio-proxy` which resolves an issue when istio-cni is used.
```

---
Squashed commit titles:
* add storage-initializer uid handling for OpenShift with istio-cni
* update storage_initializer_injector tests
* Also use annotation on pod to override uid
* Remove manager's rbac-proxy and add ODH required network policies
* Workaround for Kustomize bug about creationTimestamp
  See: kubernetes-sigs/kustomize#5031

  Without this workaround, some Kustomize versions are generating
  `creationTimestamp: "null"` (null as a string).

Signed-off-by: Edgar Hernández <[email protected]>
* Workflows are changed to push to Quay.io.
* The go.yml workflow is changed to omit updating the coverage badge (we don't have one, for now).
* The README.md file is updated to point to the right ODH URLs.

Signed-off-by: Edgar Hernández <[email protected]>
…ge-push

Adapt GH-workflows to correctly push to ODH container repositories
Kustomize patches for OpenShift were cherry-picked from the release-v0.10 branch. The cherry-pick succeeded, but the resulting manifests did not work because of the differences between the branches. This fixes the manifests and brings them back to a working state.

Signed-off-by: Edgar Hernández <[email protected]>
These are the needed changes to have openshift-ci running the E2E tests successfully.

There are several groups of E2E tests that can be deduced from the .github/workflows/e2e-test.yaml file: fast, slow, explainer, transformer-mms, qpext, grpc, helm, raw and kourier. For ODH, the `fast`, `slow` and `grpc` groups are the ones that cover the features that are going to be supported in the initial adoption of ODH.

This commit contains the needed adaptations to the E2E tests of the `fast` and `slow` groups to successfully run them in an openshift cluster. It also adds a few scripts on test/scripts/openshift-ci to run these E2Es in the openshift-ci operator.

Some of these changes should be seen as provisional and should be rolled back:
* test/e2e/common/utils.py: because of the networking/DNS expectations that are currently not covered in ODH's installation.
* test/e2e/predictor/*:
  * In general all changes under this path should be seen as provisional. However, since ODH won't support all ServingRuntimes, it is possible that some of the tests will stay out.
  * There are some GRPC-related tests marked as skipped. Since this work is not enabling the `grpc` group, a subsequent commit/PR for enabling GRPC E2Es should remove/revert those skip marks.
  * Also, there are some tests skipped with the `Not testable in ODH at the moment` reason. The root cause of the failure should be investigated to re-enable these tests.
* python/kserve/kserve/models/v1beta1_inference_service.py: This injects an annotation that is required given the specifics of OSSM/Maistra and OpenShift-Serverless that are used in ODH. This annotation is currently the user's responsibility, and this was the cleanest way to add it in the E2Es. Because it is platform-specific, it has been discussed that this annotation (and some others) should be injected by a controller to relieve the user of this responsibility. If this happens, this change should be reverted.

Also, ideally, changes to the following files should be contributed back to upstream. Those changes are not required in upstream and should have no effect, but in openshift-ci become required because a different builder image is being used:
* Dockerfile
* agent.Dockerfile

Signed-off-by: Edgar Hernández <[email protected]>
Openshift-ci onboarding

These are the needed changes to have openshift-ci running the E2E tests successfully.

There are several groups of E2E tests that can be deduced from the .github/workflows/e2e-test.yaml file: fast, slow, explainer, transformer-mms, qpext, grpc, helm, raw and kourier. For ODH, the `fast`, `slow` and `grpc` groups are the ones that cover the features that are going to be supported in the initial adoption of ODH.

This commit contains the needed adaptations to the E2E tests of the `fast` and `slow` groups to successfully run them in an openshift cluster. It also adds a few scripts on test/scripts/openshift-ci to run these E2Es in the openshift-ci operator.

Some of these changes should be seen as provisional and should be rolled back:
* test/e2e/common/utils.py: because of the networking/DNS expectations that are currently not covered in ODH's installation. These changes should be rolled back once the following ticket is fixed: opendatahub-io/odh-model-controller#59
* test/e2e/predictor/*:
  * In general all changes under this path should be seen as provisional. However, since ODH won't support all ServingRuntimes, it is possible that some of the tests will stay out.
  * There are some GRPC-related tests marked as skipped. Since this work is not enabling the `grpc` group, a subsequent commit/PR for enabling GRPC E2Es should remove/revert those skip marks.
  * Also, there are some tests skipped with the `Not testable in ODH at the moment` reason. The root cause of the failure should be investigated to re-enable these tests.
* python/kserve/kserve/models/v1beta1_inference_service.py: This injects an annotation that is required given the specifics of OSSM/Maistra and OpenShift-Serverless that are used in ODH. This annotation is currently the user's responsibility, and this was the cleanest way to add it in the E2Es. Because it is platform-specific, it has been discussed that this annotation (and some others) should be injected by a controller to relieve the user of this responsibility. If this happens, this change should be reverted.

Also, ideally, changes to the following files should be contributed back to upstream. Those changes are not required in upstream and should have no effect, but in openshift-ci become required because a different builder image is being used:
* Dockerfile
* agent.Dockerfile
Augments the `default` profile with some changes expected by an ODH installation:
* Removes the `Namespace` CR, because the ODH operator does not expect such resource. The Namespace is expected to be created in advance to later create a KfDef on it, where resources are going to be installed.
* Adds cluster roles, to extend the cluster's default user-facing roles with KServe privileges.

Signed-off-by: Edgar Hernández <[email protected]>
[Sync] kserve/kserve-master to master branch
Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>
automate addition of new issues into ODH board
Code sync up to upstream commit for v0.11.1
Open Data Hub operator v2 is going to be consuming Kustomize manifests
from component repos, and `odh-manifests` repo is going to be archived.

This is moving/copying artifacts from `odh-manifests` into an already
existent odh overlay. With these changes, the overlay can be directly
consumed by ODH-operator v2.

Signed-off-by: Edgar Hernández <[email protected]>
…master

[master] Preparation for odh-operator v2
Partial revert of
opendatahub-io/odh-manifests#916, because
opendatahub-io/odh-model-controller#84 has been
completed.

Signed-off-by: Edgar Hernández <[email protected]>
Jooho and others added 24 commits May 30, 2024 15:37
…-6506-odh

replace upstream Dockerfiles with ubi dockerfiles.
Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
Add workflow to Trigger build on PR add LGTM and Create Tag and Release with Changelog and push image to quay
chore: Fixes CWE-362 - anyio Race Condition.
Affected versions of this package are vulnerable to a race condition in
_eventloop.get_asynclib() that causes crashes when multiple event loops
of the same backend are running in separate threads and simultaneously
attempting to use AnyIO for the first time.

Signed-off-by: Spolti <[email protected]>
set protocol https for .Status.Address.URL
openshift-ci bot commented Jul 23, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci bot commented Jul 23, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: KillianGolds
Once this PR has been reviewed and has the lgtm label, please assign heyselbi for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

- source:
    kind: ConfigMap
    name: kserve-parameters
    fieldPath: data.kserve-storage-initializer

Maybe this should be data.kserve-storage-initializer.image or data.kserve-storage-initializer["image"] (I'm not sure whether kustomize treats nesting in JSON the same way as in YAML).
If it works, the same solution should apply to the rest as well.

Author

Tried this; unfortunately, .image results in an error:

Error: error looking up replacement source: wrong node kind: expected MappingNode but got ScalarNode: node contents:
quay.io/opendatahub/kserve-storage-initializer:latest

and ["image"] results in:

Error: fieldPath data.kserve-storage-initializer["image"] is missing for replacement source ConfigMap.[noVer].[noGrp]/kserve-parameters.[noNs]
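Both errors are consistent with the source value being a plain string rather than a nested mapping. The Python sketch below is a simplified model of a dot-separated path lookup, not kustomize's actual implementation; `lookup` is a hypothetical function, and the ConfigMap contents are reconstructed from the error output above.

```python
def lookup(node, field_path: str):
    """Walk a dot-separated path through nested dicts (simplified model)."""
    current = node
    for key in field_path.split("."):
        if not isinstance(current, dict):
            # Comparable to "expected MappingNode but got ScalarNode".
            raise TypeError(f"expected mapping but got scalar: {current!r}")
        if key not in current:
            # Comparable to "fieldPath ... is missing for replacement source".
            raise KeyError(f"fieldPath {field_path} is missing")
        current = current[key]
    return current


# Source ConfigMap as implied by the error output: the value is a scalar string.
config_map = {
    "kind": "ConfigMap",
    "metadata": {"name": "kserve-parameters"},
    "data": {
        "kserve-storage-initializer": "quay.io/opendatahub/kserve-storage-initializer:latest",
    },
}

if __name__ == "__main__":
    print(lookup(config_map, "data.kserve-storage-initializer"))
    for path in (
        "data.kserve-storage-initializer.image",
        'data.kserve-storage-initializer["image"]',
    ):
        try:
            lookup(config_map, path)
        except (TypeError, KeyError) as err:
            print(f"{path}: {type(err).__name__}: {err}")
```

In this model, `.image` fails because there is nothing to descend into below the image string, and the `["image"]` form fails because it is treated as part of a key name that does not exist.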

spolti referenced this pull request in spolti/kserve Sep 16, 2024
…updates/kserve-agent-211

Update kserve-agent-211 to f138ff1
Projects
Status: New/Backlog