-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: remove dependency on pickle #83
Conversation
return None | ||
with self.pca_model_path.open("rb") as file: | ||
return pickle.load(file) | ||
unknown_types = sio.get_untrusted_types(file=self.pca_model_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen when not self.pca_model_path.exists()
? Probably we'd want to gracefully log and skip - same behaviour as before the change.
self.bin_counts_uni_path = self.path / "bin_counts_uni.parquet" | ||
self.bin_counts_biv_path = self.path / "bin_counts_biv.parquet" | ||
self.pca_model_path = self.path / "pca_model.pickle" | ||
self.pca_model_path = self.path / "pca_model.skops" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://skops.readthedocs.io/en/stable/persistence.html
for others reference.
So as I understand does it not use pickle? or does it use pickle and choose specific things as safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same understanding. My understanding is that it was designed specifically for sklearn and with sklearn objects should be safe. Only ONNX is higher than this library on sklearn recommendation list for models persistance, https://scikit-learn.org/dev/model_persistence.html.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.bin_counts_uni_path = self.path / "bin_counts_uni.parquet" | ||
self.bin_counts_biv_path = self.path / "bin_counts_biv.parquet" | ||
self.pca_model_path = self.path / "pca_model.pickle" | ||
self.pca_model_path = self.path / "pca_model.skops" | ||
self.trn_pca_path = self.path / "trn_pca.npy" | ||
self.hol_pca_path = self.path / "hol_pca.npy" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked into these ones, np.load seems to have allow_pickle=False by default so that's good.
No description provided.