Updating odh kserve kustomize manifests #390
base: master-bad
Conversation
…endatahub-io#18)
**What this does / why we need it**: This PR adds custom code to make KServe run on OpenShift without the need for the anyuid SCC.
**More context**: OpenShift uses istio-cni, which causes an issue with init-containers when they call external services such as S3. Setting the uid of the storage-initializer to the same uid as the istio-proxy resolves the issue. On OpenShift, the istio-proxy is always assigned the first uid of the namespace's uid range + 1 (the range is defined in an annotation on the namespace).
**Release note**:
```release-note
The `storage-initializer` container will now run with the same `uid` as the `istio-proxy`, which resolves an issue when istio-cni is used.
```
---
Squashed commit titles:
* add storage-initializer uid handling for OpenShift with istio-cni
* update storage_initializer_injector tests
* Also use annotation on pod to override uid
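The uid rule described above can be sketched in a few lines of Python. This is an illustrative assumption, not the PR's actual code (the real change lives in the storage-initializer injector); the annotation name is the standard OpenShift one.

```python
def storage_initializer_uid(uid_range: str) -> int:
    """Return the uid the istio-proxy gets on OpenShift.

    OpenShift records the namespace's uid range in the
    `openshift.io/sa.scc.uid-range` annotation, e.g. "1000680000/10000"
    (first uid / range size). The istio-proxy is assigned the first
    uid + 1, so the storage-initializer must run with the same value.
    This helper is hypothetical, for illustration only.
    """
    first_uid = int(uid_range.split("/")[0])
    return first_uid + 1


print(storage_initializer_uid("1000680000/10000"))  # -> 1000680001
```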
* Remove manager's rbac-proxy and add ODH required network policies
* Workaround for a Kustomize bug about creationTimestamp. See: kubernetes-sigs/kustomize#5031. Without this workaround, some Kustomize versions generate `creationTimestamp: "null"` (null as a string).
Signed-off-by: Edgar Hernández <[email protected]>
* Workflows are changed to push to Quay.io.
* The go.yml workflow is changed to omit updating the coverage badge (we don't have one, for now).
* The README.md file is updated with the right ODH URLs.
Signed-off-by: Edgar Hernández <[email protected]>
…ge-push Adapt GH-workflows to correctly push to ODH container repositories
Kustomize patches for OpenShift were cherry-picked from the release-v0.10 branch. The cherry-pick succeeded, but the resulting manifests were not working because of the differences. This fixes the manifests and brings them back to a working state. Signed-off-by: Edgar Hernández <[email protected]>
Update OWNERS files
These are the changes needed to have openshift-ci run the E2E tests successfully. Several groups of E2E tests can be deduced from the .github/workflows/e2e-test.yaml file: fast, slow, explainer, transformer-mms, qpext, grpc, helm, raw, and kourier. For ODH, the `fast`, `slow`, and `grpc` groups cover the features that are going to be supported in the initial adoption of ODH. This commit contains the adaptations needed to run the E2E tests of the `fast` and `slow` groups successfully in an OpenShift cluster. It also adds a few scripts under test/scripts/openshift-ci to run these E2Es in the openshift-ci operator.
Some of these changes should be seen as provisional and should be rolled back:
* test/e2e/common/utils.py: because of networking/DNS expectations that are currently not covered in ODH's installation.
* test/e2e/predictor/*:
  * In general, all changes under this path should be seen as provisional. However, since ODH won't support all ServingRuntimes, it is possible that some of the tests will stay out.
  * Some GRPC-related tests are marked as skipped. Since this work is not enabling the `grpc` group, a subsequent commit/PR enabling GRPC E2Es should remove/revert those skip marks.
  * Also, some tests are skipped with the reason `Not testable in ODH at the moment`. The root cause of each failure should be investigated to re-enable these tests.
* python/kserve/kserve/models/v1beta1_inference_service.py: This injects an annotation that is required given the specifics of OSSM/Maistra and OpenShift-Serverless as used in ODH. This annotation is currently the user's responsibility, and this was the cleanest way to add it in the E2Es. Being platform-specific, it has been discussed that this (and some other) annotations should be injected by a controller to relieve the user of this responsibility. If that happens, this change should be reverted.
Also, ideally, changes to the following files should be contributed back to upstream. Those changes are not required upstream and should have no effect there, but in openshift-ci they become required because a different builder image is used:
* Dockerfile
* agent.Dockerfile
Signed-off-by: Edgar Hernández <[email protected]>
Openshift-ci onboarding (squashed commit of the changes described above). One addition: the provisional changes to test/e2e/common/utils.py should be rolled back once the following ticket is fixed: opendatahub-io/odh-model-controller#59
Augments the `default` profile with some changes expected by an ODH installation:
* Removes the `Namespace` CR, because the ODH operator does not expect such a resource. The namespace is expected to be created in advance; a KfDef is then created in it, where resources will be installed.
* Adds cluster roles, to extend the cluster's default user-facing roles with KServe privileges.
Signed-off-by: Edgar Hernández <[email protected]>
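A minimal sketch of what such an aggregated cluster role could look like. The resource name, API groups, and verbs below are illustrative assumptions, not the PR's exact manifests; only the aggregation-label mechanism is the standard Kubernetes one.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kserve-view          # hypothetical name
  labels:
    # Aggregates these rules into the cluster's default user-facing "view" role
    rbac.authorization.k8s.io/aggregate-to-view: "true"
rules:
  - apiGroups: ["serving.kserve.io"]
    resources: ["inferenceservices"]
    verbs: ["get", "list", "watch"]
```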
Signed-off-by: Vaibhav Jain <[email protected]>
[Sync] kserve/kserve-master to master branch
Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>
…-sync Upstream master sync
Signed-off-by: heyselbi <[email protected]>
automate addition of new issues into the ODH board
Code sync up to upstream commit for v0.11.1
Signed-off-by: Spolti <[email protected]>
Signed-off-by: Spolti <[email protected]>
Open Data Hub operator v2 is going to consume Kustomize manifests from component repos, and the `odh-manifests` repo is going to be archived. This moves/copies artifacts from `odh-manifests` into an already existing odh overlay. With these changes, the overlay can be directly consumed by ODH-operator v2. Signed-off-by: Edgar Hernández <[email protected]>
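For context, a consuming kustomization could then point straight at the overlay, roughly like this. The repository path and ref are illustrative assumptions, not the operator's actual configuration.

```yaml
# Hypothetical kustomization.yaml in the consuming operator
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - github.com/opendatahub-io/kserve/config/overlays/odh?ref=master
```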
add spolti to the OWNERS file
[RHODS-12555] - CVE-2023-44487
Signed-off-by: Spolti <[email protected]>
[RHODS-12555] - CVE-2023-44487 - qpext
…master [master] Preparation for odh-operator v2
Partial revert of opendatahub-io/odh-manifests#916, because opendatahub-io/odh-model-controller#84 has been completed. Signed-off-by: Edgar Hernández <[email protected]>
…-6506-odh Replace upstream Dockerfiles with UBI Dockerfiles.
…se with Changelog and push image to quay
Signed-off-by: Edgar Hernández <[email protected]>
Docs for authorization feature
Signed-off-by: Vedant Mahabaleshwarkar <[email protected]>
…ch-1 enable DirectPvcVolumeMount
Signed-off-by: jooho lee <[email protected]>
[pull] master from kserve:master
[pull] master from kserve:master
[pull] master from kserve:master
Add workflow to trigger a build when a PR gets LGTM, create a tag and release with a changelog, and push the image to Quay
…onfig update local gateway information
chore: Fixes CWE-362 - anyio race condition. Affected versions of this package are vulnerable to a race condition in `_eventloop.get_asynclib()` that causes crashes when multiple event loops of the same backend are running in separate threads and simultaneously attempt to use AnyIO for the first time. Signed-off-by: Spolti <[email protected]>
CWE-362 - anyio Race Condition
Signed-off-by: jooho lee <[email protected]>
set protocol https for .Status.Address.URL
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: KillianGolds
- source:
    kind: ConfigMap
    name: kserve-parameters
    fieldPath: data.kserve-storage-initializer
Maybe this should be `data.kserve-storage-initializer.image` or `data.kserve-storage-initializer["image"]` (not sure whether kustomize treats nesting in JSON the same as in YAML). If it works, it should be the same solution for the rest as well.
Tried this; unfortunately, `.image` results in an error:
Error: error looking up replacement source: wrong node kind: expected MappingNode but got ScalarNode: node contents:
quay.io/opendatahub/kserve-storage-initializer:latest
and ["image"] results in:
Error: fieldPath data.kserve-storage-initializer["image"] is missing for replacement source ConfigMap.[noVer].[noGrp]/kserve-parameters.[noNs]
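The two errors above point at the same root cause: the ConfigMap value is a plain scalar string, so the replacement source fieldPath has to stop at the key itself; a nested path like `.image` would only work if the value were a YAML mapping. A sketch of the distinction (the replacement target below is a hypothetical example, not the PR's manifest):

```yaml
# Works: the source fieldPath ends at a key whose value is a scalar.
replacements:
  - source:
      kind: ConfigMap
      name: kserve-parameters
      fieldPath: data.kserve-storage-initializer
    targets:
      - select:                      # hypothetical target
          kind: Deployment
          name: kserve-controller-manager
        fieldPaths:
          - spec.template.spec.containers.[name=manager].image

# Fails: data.kserve-storage-initializer.image assumes the value is a
# mapping, but it is the scalar string
# "quay.io/opendatahub/kserve-storage-initializer:latest".
```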
…updates/kserve-agent-211 Update kserve-agent-211 to f138ff1
What this PR does / why we need it:
Updating deprecated kustomize labels.
Here is a gist comparing the outputs of the `kustomize build` command for the original deprecated syntax and the updated syntax: https://gist.github.com/KillianGolds/ef98951981f04ce5f95dbca514a328ea
With the old syntax, storageInitializer resolves as expected.
The new syntax overwrites the JSON and places just the image.
The same is happening for router, logger, agent, and batcher.
I believe this is due to Kustomize v5's lack of JSON handling.
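For reference, the deprecated and current label syntax differ roughly like this. This is a generic sketch with an illustrative label pair, not the PR's actual diff:

```yaml
# Deprecated: commonLabels also rewrites selectors implicitly
commonLabels:
  app.kubernetes.io/part-of: kserve

# Current syntax: selector propagation is opt-in and explicit
labels:
  - pairs:
      app.kubernetes.io/part-of: kserve
    includeSelectors: true
```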
Re-running failed tests
/rerun-all - rerun all failed workflows.
/rerun-workflow <workflow name> - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.