Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CuVS-Java: Update readme files #672

Closed

Conversation

narangvivek10
Copy link

PR to update:

  • Readme files
  • Change logging in examples

gforsyth and others added 11 commits February 5, 2025 13:51
Enables telemetry during CUVS CI runs.  This is done by parsing GitHub Actions run log metadata and should have no impact on build or test times.

xref rapidsai/build-infra#139

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - Mike Sarahan (https://github.com/msarahan)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#652
This change reworks the api to allow it to be used with Java 21. The implementation is moved to an internal package, compiled with JDK 22, and packaged as an mrjar. The benefit of this structure is that the api can be used in environments that compile to a minimum of Java 21, but run on more recent JDKs like 22 and 23 - which is exactly what Elasticsearch and Lucene do.  In fact, a minimum compilation target of Java 21 is common, since 21, at the time of writing, is the most recent LTS Java release.

The most significant change is that the non-trivial api types are now, for the most part, interfaces. Instance can be created by one of the factory methods, which lookup an spi to find the implementation. If on a release greater than Java 21, then a functioning implementation is returned. Otherwise, a no-op implementation is returned. This is a reasonably standard way for a Java api to behave, and allows the developer to handle the case where the platform does not have a functioning implementation. 

This change also refactors the native downcall method handles so that they are static final constants - which optimise better by the JVM. It's also the generally accepted pattern, where the handles are tied to the lifetime of class which effectively mediates access - by virtue of reachability.

Another thing that I added is the ability to programmatically set the temporary directory used for intermediate operations - this is important to how both Lucene and Elasticsearch work - since they commonly only have permission to write to certain parts of the disk.

Additionally,
1. the error codes from native calls are plumbed in and checked. As well as `cuvsGetLastErrorText`.
2. a state is added to any classes that hold a reference to native resources that could be released.
3. a local arena is used for memory allocation only needed per downcall invocation, e.g. the return value.
4. I moved the tests to be integration tests, since they need to run on the jar (rather than the exploded classes). They can be run by any of; `mvn verify`, or `mvn integration-test`, or `mvn -Dit.test="*Hnsw*" verify`
5. I refactored the entry-points to the api to be static methods and added an `spi` layer. You can see the minimal impact on the tests.
6. Move the native library out of the top-level directory in the jar and into an os/arch position in the META-INF. 
7. add service provider support for custom implementations.

Authors:
  - Chris Hegarty (https://github.com/ChrisHegarty)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#628
Authors:
  - Divye Gala (https://github.com/divyegala)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#616
…dsai#667)

The ann-bench tool has been observed to deadlock on thread.join() due to unnecessary mutex lock.
The problem is that the destructor doesn't release the thread mutex and thus doesn't allow the thread to escape the condition_variable.wait() function.
The fix is to just remove the lock in the destructor (which doesn't modify the state of the task anyway).

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai#667
This PR adds a Go API. It's far from completion and still work in progress. Feel free to suggest improvements!

Authors:
  - Ajit Mistry (https://github.com/ajit283)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Ben Frederickson (https://github.com/benfred)

Approvers:
  - Ben Frederickson (https://github.com/benfred)
  - Gil Forsyth (https://github.com/gforsyth)

URL: rapidsai#212
This PR adds docs for cuvs nn_descent

Authors:
  - Severin Dicks (https://github.com/Intron7)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#668
Copy link

copy-pr-bot bot commented Feb 7, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cjnolet cjnolet added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Feb 7, 2025
@cjnolet
Copy link
Member

cjnolet commented Feb 7, 2025

/ok to test

java/README.md Outdated Show resolved Hide resolved
Co-authored-by: Corey J. Nolet <[email protected]>
java/README.md Outdated Show resolved Hide resolved
Co-authored-by: Corey J. Nolet <[email protected]>

`./build.sh` will generate the `libcuvs_java.so` file in the `internal/` directory, and then build the final jar file for the cuVS Java API in the `cuvs-java/` directory.
- gcc 11.4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the result of cuVS being dependent upong gcc 11.4? Is gcc 11.4 really a requirement or can users still use GCC 9.x (which is the requirement for buildign cuVS). Let's link to the cuVS build from source page here if these dependencies are only for building cuVS itself. Having these types of constraints duplicated across the codebas becomes hard to manage and maintain.


`./build.sh` will generate the `libcuvs_java.so` file in the `internal/` directory, and then build the final jar file for the cuVS Java API in the `cuvs-java/` directory.
- gcc 11.4
- [nvcc 12.4 or above](https://developer.nvidia.com/cuda-downloads)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cuVS supports 11.x and box. Is there a reason the java API only supports 12.4? If not, let's just point to the cuVS build from source docs here and not duplicate these things.

`./build.sh` will generate the `libcuvs_java.so` file in the `internal/` directory, and then build the final jar file for the cuVS Java API in the `cuvs-java/` directory.
- gcc 11.4
- [nvcc 12.4 or above](https://developer.nvidia.com/cuda-downloads)
- [cmake 3.28 or above](https://cmake.org/download/)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here- I'm going to assume the only differences are the maven version and JDK. Why ubuntu 22.04? it doesn't look good to only support a very specific version of linux. I propose removing it, since users can get the toolchain they need from building the proper conda environment.


## Building

libcuvs libraries are needed for this API. If libcuvs libraries are already not built please do `./build.sh libcuvs java` in the top level directory to build this API.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really not a fan of having this embedded in the java directory. I'd much prefer that this information is provided directly in the build cuVS from source docs and this README just links to that. It just simply makes things hard to find, maintain (and update) when this info is scattered around the codebase.

Let's keep it here for the moment while this API is in experimental state, but plan to move into the docs in 25.04 release cycle.

Suggested change
libcuvs libraries are needed for this API. If libcuvs libraries are already not built please do `./build.sh libcuvs java` in the top level directory to build this API.
The libcuvs C and C++ libraries are needed for this API. If libcuvs libraries have not been built and installed, use `./build.sh libcuvs java` in the top level directory to build this API.

Alternatively, if libcuvs libraries are already built and you just want to build this API, please
do `./build.sh java` in the top level directory or just do `./build.sh` in this directory.

:warning: If you notice the tests failing please replace `mvn verify` with `mvn clean package` in the `build.sh` script found in this directory and try again. This should build the API (and skip the tests).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a notice to me, not a warning.

java/README.md Outdated Show resolved Hide resolved
Co-authored-by: Corey J. Nolet <[email protected]>

## Javadocs

To generate javadocs, in this directory (after building the API) please do:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we remove the experimental designator, we need to get these building in CI and deployed.


mvn clean compile assembly:single
## Prerequisites
- All the prerequisites in the CuVS Java API readme
Copy link
Member

@cjnolet cjnolet Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be where you'd point to the cuVS build and install guide, which lists the dependencies for libcuvs (and eventually for building the java api).

```
java --enable-native-access=ALL-UNNAMED -cp target/cuvs-java-examples-25.02.0.jar:$HOME/.m2/repository/com/nvidia/cuvs/cuvs-java/25.02.0/cuvs-java-25.02.0.jar com.nvidia.cuvs.examples.CagraExample
```
Should output the following (with different timestamps):
Copy link
Member

@cjnolet cjnolet Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unecessary- I would just provide the command-line to run the example. Let the unit tests verify the results. I don't mind keeping it this way for 25.02, but let's fix this before we remove the experimental designator.

@narangvivek10 narangvivek10 changed the base branch from branch-25.02 to branch-25.04 February 7, 2025 19:40
@narangvivek10 narangvivek10 requested review from a team as code owners February 7, 2025 19:40
@narangvivek10 narangvivek10 changed the base branch from branch-25.04 to branch-25.02 February 7, 2025 19:42
@cjnolet cjnolet changed the base branch from branch-25.02 to branch-25.04 February 7, 2025 19:45
@narangvivek10
Copy link
Author

Creating a fresh PR instead. I will include all the suggestions in the new one. Closing this now. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
Development

Successfully merging this pull request may close these issues.

10 participants