Zernike enhancements #1327

Open · wants to merge 5 commits into main
Conversation

jmeyers314 (Member):

Two tweaks to the Zernike library:

• Adds a DoubleZernike.xycoef() method to get the pupil coefficient array at a given field position of a double Zernike directly, without going through a galsim.Zernike object (see the sketch after this list).
• Enables large jmax for annular Zernikes. Addresses #1326.
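
A minimal usage sketch for the first tweak. The xycoef signature shown here (field position in, pupil x/y coefficient array out) is assumed from the description above, and the coefficient array is purely illustrative:

import numpy as np
import galsim

# Illustrative double-Zernike coefficient array: dzcoef[k, j] couples field
# Noll index k with pupil Noll index j (index 0 unused, as for Zernike.coef).
dzcoef = np.zeros((2, 5))
dzcoef[1, 4] = 0.1                          # constant-over-field defocus (Z4) term
dz = galsim.zernike.DoubleZernike(dzcoef)   # uv/xy radii left at assumed defaults

u, v = 0.3, -0.2                            # arbitrary field position

# New route from this PR (assumed signature): the x/y polynomial coefficients
# of the pupil Zernike at this field point, with no intermediate object.
xy = dz.xycoef(u, v)

# Existing route: as I understand the current API, calling the DoubleZernike
# at a field point yields a galsim.Zernike object.
zk = dz(u, v)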

I found a closed form for the subset of annular Zernikes whose radial and azimuthal orders are equal, so I used that as a test of annular Zernikes Z66, Z231, Z861, Z1891, and Z3321. GalSim's annular Zernike library reproduces these to tolerances of a part in 10^12, 10^12, 10^9, 10^6, and 10^3, respectively. I also tried Z5151, but that value of j is apparently well beyond the accuracy of the library and yields errors at the ~factor-of-10 level.

These were all for R_inner/R_outer = 0.5; I'd guess the accuracies improve as this ratio goes to 0 (annular Zernikes reduce to circular Zernikes), but I didn't explicitly check.
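
For context on those particular indices: under the Noll ordering, the last index within radial order n is

j = (n+1)*(n+2)/2,

and that term always has |m| = n. For n = 10, 20, 40, 60, 80, 100 this gives j = 66, 231, 861, 1891, 3321, 5151, matching the Z_j values quoted above (and np.sum(np.arange(n+2)) in the test below computes exactly this sum).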

jmeyers314 requested a review from rmjarvis · January 23, 2025 18:38
print()
for n, tol in test_vals:
    j = np.sum(np.arange(n+2))
    n, m = galsim.zernike.noll_to_zern(j)
Member:

Somewhat confusing reuse of variable n. Probably should rename the other one. (k maybe?)

for n, tol in test_vals:
    j = np.sum(np.arange(n+2))
    n, m = galsim.zernike.noll_to_zern(j)
    print(f"Z{j} => (n, m) = ({n}, {n})")
Member:

Suggested change:
-    print(f"Z{j} => (n, m) = ({n}, {n})")
+    print(f"Z{j} => (n, m) = ({n}, {m})")

Probably followed by assert n == abs(m) since AFAIU, that is the intent of these choices for j.
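
A sketch combining the two suggestions above (the k rename, the corrected print, and the assert); the test_vals entries follow the tolerances quoted in the PR description and may not match the actual test:

import numpy as np
import galsim

test_vals = [(10, 1.e-12), (20, 1.e-12), (40, 1.e-9), (60, 1.e-6), (80, 1.e-3)]

for k, tol in test_vals:
    # The last Noll index of radial order k is (k+1)(k+2)/2, which has |m| == k.
    j = np.sum(np.arange(k+2))
    n, m = galsim.zernike.noll_to_zern(j)
    assert n == abs(m)
    print(f"Z{j} => (n, m) = ({n}, {m})")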


rng = galsim.BaseDeviate(1234).as_numpy_generator()
x = rng.uniform(-1.0, 1.0, size=100)
y = rng.uniform(-1.0, 1.0, size=100)
Member:

I don't think we expect this to converge particularly well when r>1, right? So we should probably limit this to have x**2 + y**2 < 1.
When I tried it, I was able to use significantly lower tolerances:

(10, 1.e-12)
(20, 1.e-12)
(40, 1.e-10)
(60, 1.e-8)
(80, 1.e-6)
(100, 2.e-3)

So it is still becoming less accurate as j gets very large, but not as catastrophically bad as this test implies.
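
One way to impose that restriction while keeping the deterministic seed (a sketch, not necessarily how the test was ultimately written):

import galsim

rng = galsim.BaseDeviate(1234).as_numpy_generator()
x = rng.uniform(-1.0, 1.0, size=300)
y = rng.uniform(-1.0, 1.0, size=300)
keep = x**2 + y**2 < 1.0      # drop points outside the unit circle
x, y = x[keep], y[keep]       # ~235 points remain on average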

Member:

The ones that are least accurate are always the ones with r close to 1, like 0.95 or so. This is probably a clue about where the recursion is going unstable. I might play around with the recursion formula to see if I can harden it up at all.

Member Author:

Yeah, limiting to 0.5 < r < 1.0 might even be appropriate. Though, OTOH, these are just 2d polynomials. They're orthogonal over an annulus but defined everywhere.

The particular polynomials here are proportional to r^n, which might explain why large r is trickier. Other values of j will presumably produce polynomials with larger amplitudes near r=0.

In case it's helpful while you're looking at recursions, here's the closed-form table I got out of Mathematica up to Z28:
[Screenshot: closed-form table from Mathematica, up to Z28]
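
(For reference, since the screenshot itself isn't reproduced here: the closed form usually quoted in the literature for the n = |m| annular radial polynomial, in Mahajan's normalization, which may differ from GalSim's Noll normalization by an overall constant, is

R_n^n(r; eps) = r^n / sqrt(1 + eps^2 + eps^4 + ... + eps^(2n))
              = r^n * sqrt((1 - eps^2) / (1 - eps^(2n+2))),    eps = R_inner/R_outer,

which is indeed proportional to r^n as noted above.)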

Member:

I think the problem is the conversion of the coef array from rrsq format to xy format. The xy version has lots of very large coefficients with alternating sign. The native rho/rhosq version is much more stable numerically.

If I change evalCartesian to:

        rho = (x + 1j*y) / self.R_outer
        rhosq = np.abs(rho)**2
        ar = _noll_coef_array(len(self.coef)-1, self.R_inner/self.R_outer).dot(self.coef[1:])
        return horner2d(rhosq, rho, ar).real

then all the tests pass at 1.e-12 tolerance, except Z5151, which needed 1.e-11. (It's also massively faster than the current implementation -- I guess the rrsq_to_xy function must be a slow step?)

I don't know if this means we should actually make the above change though. There might be other downsides (e.g. not getting to use the C++ layer horner2d implementation, at least without a little more work). But I suppose we could at least provide another method that would evaluate this way for users who want more accuracy at very high Zernike order.
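
If that expert option were added, it might look roughly like the following standalone helper, which just repackages the snippet above (the function name and import locations are assumptions):

import numpy as np
from galsim.zernike import _noll_coef_array   # private helper used in the snippet above
from galsim.utilities import horner2d         # assumed home of the python horner2d

def eval_cartesian_rho(zk, x, y):
    """Evaluate a galsim.zernike.Zernike `zk` at points (x, y) via the
    rho/|rho|^2 coefficient array rather than the x/y power-series array,
    trading some speed for better stability at very high Noll index."""
    rho = (np.asarray(x) + 1j*np.asarray(y)) / zk.R_outer
    rhosq = np.abs(rho)**2
    ar = _noll_coef_array(len(zk.coef)-1, zk.R_inner/zk.R_outer).dot(zk.coef[1:])
    return horner2d(rhosq, rho, ar).real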

Member Author:

Interesting. I sped up rrsq_to_xy significantly by rewriting it as a cached recursive function. Now the tradeoff seems to be roughly:

• The xy array is about 2x the effort to compute compared to the rrsq array, which makes sense since the xy array is derived from the rrsq one. However, most of this effort is cached, so I don't think there's a significant difference after the initial Zernike constructions.
• The xy approach (so C++ horner) is about 10x faster when there are ~100 evaluation points. However, on my machine (M3 Pro) the complex-python approach actually runs faster when the number of evaluation points is ~10_000 or more (!). Maybe this is because the rrsq array is ~2x smaller than the xy array?

Given the improved accuracy and the above, I tentatively vote to make the rrsq approach the default. I want to benchmark some donut fitting first, though.
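
A rough way to probe that crossover (a sketch only; the numbers are machine-dependent, and evalCartesian here is whatever evaluation path the installed branch uses):

import timeit
import numpy as np
import galsim

rng = np.random.default_rng(0)
zk = galsim.zernike.Zernike(rng.normal(size=67), R_outer=1.0, R_inner=0.5)  # up to Z66
for npts in (100, 10_000, 100_000):
    x = rng.uniform(-1.0, 1.0, size=npts)
    y = rng.uniform(-1.0, 1.0, size=npts)
    t = timeit.timeit(lambda: zk.evalCartesian(x, y), number=10) / 10
    print(f"npts={npts:>7d}: {t*1e3:8.3f} ms per call")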

Member Author:

Hmmm... Maybe the horner step depends on jmax too. I just ran some donut image forward models (jmax ~65 and npoints ~10000) and they were ~4x faster using the xy approach. So maybe we need to leave the rrsq path as an expert option instead.

Member:

Either way is fine with me, as long as there is a public API way to get the calculation that is accurate at high order.

I think most use cases won't care about the accuracy as much as the speed, but it seems important to have a way to get the accurate one.
