Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(py): Add support for storing models in S3 #765

Merged
merged 22 commits into from
Feb 27, 2025

Conversation

syntaxsdev
Copy link
Contributor

@syntaxsdev syntaxsdev commented Feb 5, 2025

Within the Python client, users will be able to directly store models to an S3 compatible object storage.
AWS S3 connections will automatically be consumed.

Description

The bulk of the changes were done in clients/python/src/_client.py
With E2E tests that were added to test functionality by creating a Minio instance

This implementation is within the ModelRegistry class

Example usage:

mr = ModelRegistry(...)
mr.save_to_s3(
  path="models",
  bucket_name="default",
  endpoint_url="xxx",
  access_key_id="xxx",
  secret_access_key="xxx"
)

You can path both files and paths to the path parameter.
If a path is nested (ex. data/models) then an s3_prefix variable will be required to allow for proper folder naming.
Environment variables are picked up, but variables supplied via direct params are preferred over environment variables that are mounted.

How Has This Been Tested?

A local and remote version of Minio S3

  1. Changes to the Makefile target were made to add a local Minio instance for the same Kind cluster than the Makefile target deploy-latest-mr uses.
  2. An .env file is generated so that vars can be used for the added tests
  3. Three new tests were added that single for storing a singular model to S3 and for storing a path of models, recursively

Tests are passing when completely rebuilding and testing the new e2e workflow
image

To test, run make test-e2e as per usual.

Merge criteria:

  • All the commits have been signed-off (To pass the DCO check)
  • The commits have meaningful messages; the author will squash them after approval or in case of manual merges will ask to merge with squash.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work.
  • Code changes follow the kubeflow contribution guidelines.

If you have UI changes

  • The developer has added tests or explained why testing cannot be added.
  • Included any necessary screenshots or gifs if it was a UI change.
  • Verify that UI/UX changes conform the UX guidelines for Kubeflow.

Copy link
Member

@tarilabs tarilabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @syntaxsdev for this!

some initial comments below.
which type of test can we consider to make sure the functionality is covered?

I'm thinking we could have some dedicated e2e test by extending the current opt-in pytest mechanism and deploy minio in that "scenario" of e2e testing. Do you have some additional ideas?

@google-oss-prow google-oss-prow bot added size/XL and removed size/L labels Feb 17, 2025
Copy link
Member

@tarilabs tarilabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to make sure the end results works as expected for KServe (not necessarily as a test in the PR itself, making the check manually can also do)

otherwise
/lgtm
already thanks @syntaxsdev

@tarilabs
Copy link
Member

( I also believe this PR need rebasing to account for more recent openapi and generated changes )

@google-oss-prow google-oss-prow bot removed the lgtm label Feb 17, 2025
@syntaxsdev
Copy link
Contributor Author

Confirmed functionality works similarly by going through other kubeflow saving model demos

image

@syntaxsdev syntaxsdev requested a review from tarilabs February 18, 2025 21:50
Signed-off-by: syntaxsdev <[email protected]>
Comment on lines +1 to +3
#!/usr/bin/env bash

set -e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tarilabs @syntaxsdev one question I have is that I was assuming we are using pytest libraries for any cluster management too, is that a wrong assumption?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we are not "orchestrating cluster setup from pytest" if that was the question, as the request was to stick to makefile and scripting; the downside is netting on a basic common ground, but I see the advantage as reuse from Go or else in the future if needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct for reuse and manageability, this was the topic in shift left. @lugi0 might have some comments on this. I am not asking to change this PR but maybe we can capture a follow-up if that is the overall direction we need to be writing the tests with, if not we can ignore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm favourable of revising the setup, as you know I prsonally prefer to reduce Makefile and scripts to the minimum as possible, but we need indeed to account for multiple factors.

@rareddy
Copy link
Contributor

rareddy commented Feb 25, 2025

@lucferbux can you look into the issues with the UI tests here?

@lucferbux
Copy link
Contributor

Agh, this might happen due to latest config PR cc @christianvogt I'm gonna take a look

@christianvogt
Copy link
Contributor

A fix for the tests has been posted here #832

Copy link
Contributor

@dhirajsb dhirajsb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API looks good to me, we should verify Makefile changes on all platforms.

@tarilabs tarilabs force-pushed the feat/store-on-s3-py branch 2 times, most recently from e2313b8 to c27af31 Compare February 26, 2025 12:30
Signed-off-by: Matteo Mortari <[email protected]>
@tarilabs tarilabs force-pushed the feat/store-on-s3-py branch from c27af31 to 6d0aeab Compare February 26, 2025 12:41
Signed-off-by: Matteo Mortari <[email protected]>
@tarilabs tarilabs force-pushed the feat/store-on-s3-py branch from f1c1d45 to 9883236 Compare February 26, 2025 19:18
Copy link
Member

@tarilabs tarilabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @syntaxsdev , and all for this PR, this is a good foundation step to make life easier to the DS + MLOps Eng to orchestrate the Store-then-Register flow!

/lgtm
/approve

@google-oss-prow google-oss-prow bot added the lgtm label Feb 27, 2025
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tarilabs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 90ba800 into kubeflow:main Feb 27, 2025
17 checks passed
@tarilabs tarilabs mentioned this pull request Mar 3, 2025
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants