-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Spherize – use SVD, simplify calculations #320
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #320 +/- ##
==========================================
- Coverage 95.57% 95.01% -0.56%
==========================================
Files 56 56
Lines 3138 3187 +49
==========================================
+ Hits 2999 3028 +29
- Misses 139 159 +20
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad to see this fix, especially since the tests pass!
I made several inline comments, which we should discuss prior to merging. My general comments include:
- Please add tests for entry into the assertion that centering must be performed
- We really should also add testing conditions for the other assertions
- The other assertion statement errors need more user-friendly (read, less potential for frustration) text. Can we prompt a user on what to fix if they run into one?
- I'm wondering if an easier fix is to optionally return the numpy array rather than the DataFrame in
Spherize()
itself.
I made more specific comments for each of these four general comments in line below.
Thanks again!
Oops! @shntnu - I just saw the "removed the request" and WIP tag. Feel free to consider my comments as preliminary or to ignore them |
This PR fixes #179 |
@gwaybio sorry, it took a while to return to this. Thanks for the excellent code review and suggestions!
Done
First, I changed all assertions to exceptions because that's the preferred way of doing things (outside of tests). I then added tests to enter these exceptions, wherever it made sense to do so. In some cases, it isn't possible to do so (easily) because it would involve modifying the current behavior of functions in numpy. These checks serve to future-proof against changes in numpy behavior (specifically, in Note that, in general, it is possible to test for these exceptions by mocking upstream dependencies but that's brinking on over-testing IMO. Let me know what you think, specifically whether using
See above
This was such a great suggestion! Done. This is now ready for your review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this code Shantanu!
I'm going to propose the following to get this merged in ASAP
- @MattsonCam - can you focus on the math side of things? The current tests pass so we can be confident about accuracy, but it would be great for you to read for understanding and provide suggestions on where the documentation can be upgraded.
- @kenibrewer - can you focus on the integration aspects? The added bug report function in
util.py
is neat, but you have your finger on the pulse of the pycytominer roadmap. Can you provide your thoughts here?
I think this can be a relatively quick PR merge, which I will do once we have these two approvals. Sorry about the delays Shantanu, we're right in the middle of me phasing out of PR review in favor of much more knowledgeable/qualified maintainers :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to move the new create_bug_report_message
function into a separate branch and pull request to be reviewed separately? There are a few aspects about the new pattern that merit discussion and I would prefer to keep this PR focused specifically on the spherize/SVD changes.
EDIT: To clarify. I agree with Greg that the feature has value, but it would be good to have a plan (and some documentation) about when pycytominer wants to "own" the error message as a bug and when we want to consider it data/user error and instead provide instructions on how to fix the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very detailed pr. I left a few comments to consider. Great job @shntnu !
Makes sense, but instead, I'll just copy the code snippet here because I bet you'd want to do this more systematically later def create_bug_report_message(error_detail, context):
"""
Constructs an error message with a bug report prompt.
Parameters
----------
error_detail: str
Description of the error.
context: str
A description of the context for the error.
Returns
-------
str
Error message with a link to file a bug report.
"""
bug_report_url = "https://github.com/cytomining/pycytominer/issues"
return (
f"{error_detail} This is likely a bug in `{context}`. "
"Please help us improve our software by filing a bug report at "
f"{bug_report_url}."
)
def test_create_bug_report_message():
error_detail = "Division by zero error occurred."
context = "calculate_statistics function"
actual_message = create_bug_report_message(error_detail, context)
# check that the strings `error_detail` and `context` are in the message
assert error_detail in actual_message
assert context in actual_message
# check that the message contains a link to file a bug report
assert "https://github.com/cytomining/pycytominer/issues" in actual_message |
Thank you for nudging me to clarify the math further @MattsonCam! I understood my code better after explaining it :D I've resolved all your comments but feel free to reopen if necessary. |
It would be great to get this fix merged in. I just chatted with @MattsonCam and he will give a 👍 when he gets a chance @kenibrewer - is this ready from your perspective? |
Absolutely! Thank you for addressing them @shntnu! I've finished addressing your response comments, but didn't feel it was imperative to reopen any of them.
I give the 👍 |
Hooray! Over to one of you to merge once the tests pass (I made a documentation edit) |
this looks good to me @shntnu - I am going to merge this in now. Thanks for this contribution and for your patience as we reviewed! 🎉 |
@kenibrewer - thanks for your initial review. Given @MattsonCam 's approval (and your initial review focused on the bug report document now described in #365 ) - I will merge this in now to avoid additional delays. Thanks! |
2 weeks without activity and see #320 (comment)
Sorry I missed the pings here. I had modified my github notifications settings incorrectly and didn't get the usual notification emails I rely on. Everything looks good to me. |
No worries! |
Hooray! Thank you all. |
Description
Make several updates to
Spherize
epsilon
to regularize, not clip (the former is more standard; unclear why we were doing the latter)PCA-cor
andZCA-cor
necessary center the data, so raise an error if centering isFalse
What is the nature of your change?
Checklist
Please ensure that all boxes are checked before indicating that a pull request is ready for review.