-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Bug in sample
Parameter in feature_select()
: Prevent Unintended Row Removal and Dataset Modification
#495
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #495 +/- ##
==========================================
+ Coverage 94.39% 94.43% +0.03%
==========================================
Files 57 57
Lines 3159 3181 +22
==========================================
+ Hits 2982 3004 +22
Misses 177 177
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
sample
Parameter: Prevent Unintended Row Removal and Dataset Modificationsample
Parameter in feature_select()
: Prevent Unintended Row Removal and Dataset Modification
This looks functionally good to me @axiomcura - kudos and thanks! @d33bs could you give a quick, but close look to see if there are any technical things for Erik to consider prior to merging? One thing that I could potentially see is to create a new test (rather than append to an existing one), which would be more explicitly named to test the sample parameter (e.g., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job @axiomcura! I added some comments with this review and would be interested in getting your response. Please don't hesitate to let me know if you have any questions.
Generally I also wondered:
- Have we covered the change in functionality with new tests? I noticed there was a test for
feature_select
but perhaps not the other functions which were modified. Consider adding tests for these as well to ensure the sampling edge cases are covered (to avoid surprises later).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @axiomcura for making these changes! LGTM
Description
Thank you for your contribution to pycytominer!
Please succinctly summarize your proposed change.
What motivated you to make this change?
This pull request closes #494, which describes when using the
sample
parameter unexpectedly removes rows and modifies the original dataset. The changes aim to resolve this issue, ensuring that thesample
parameter behaves as intended without altering the original data and removing data.I have also created new tests to verify that the number of rows remains unchanged when applying the
samples
parameter, whether through individual operations and a combination of multiple operations.Please also link to any relevant issues that your code is associated with.
What is the nature of your change?
Checklist
Please ensure that all boxes are checked before indicating that a pull request is ready for review.