diff --git a/.github/ISSUE_TEMPLATE/release_checklist.md b/.github/ISSUE_TEMPLATE/release_checklist.md index c33ad2b..c93803d 100644 --- a/.github/ISSUE_TEMPLATE/release_checklist.md +++ b/.github/ISSUE_TEMPLATE/release_checklist.md @@ -13,7 +13,8 @@ assignees: "" - [ ] Locally rendered documentation contains all appropriate pages, including API references (check no modules are missing), tutorials, and other human written text is up-to-date with any changes in the code. -- [ ] Installation instructions in the README, documentation and on the website are updated and tested +- [ ] Installation instructions in the README, documentation and on the website + are updated and tested - [ ] Successfully run any tutorial examples or do functional testing in some other way. - [ ] Grammar and writing quality have been checked (no typos). diff --git a/docs/api.md b/docs/api.md index e9172d0..684b6a5 100644 --- a/docs/api.md +++ b/docs/api.md @@ -1,5 +1,6 @@ # API reference -This section contains the automatic API reference for `Cif` and `CifEnsemble` modules in the `cifkit` package. +This section contains the automatic API reference for `Cif` and `CifEnsemble` +modules in the `cifkit` package. ::: cifkit diff --git a/docs/index.md b/docs/index.md index c9372de..84559da 100644 --- a/docs/index.md +++ b/docs/index.md @@ -3,10 +3,11 @@ ## Statement of need `cifkit` uses .cif files by offering higher-level functions and variables that -enable users to perform complex tasks efficiently with a few lines of code. `cifkit` distinguishes itself from existing libraries by offering higher-level -functions and variables that allow solid-state synthesists to obtain intuitive and -measurable properties impactful properties. It facilitates the visualization of -coordination geometry from each site using four coordination determination +enable users to perform complex tasks efficiently with a few lines of code. 
+`cifkit` distinguishes itself from existing libraries by offering higher-level +functions and variables that allow solid-state synthesists to obtain intuitive +and measurable properties impactful properties. It facilitates the visualization +of coordination geometry from each site using four coordination determination methods and extracts physics-based features like volume and packing efficiency—crucial for structural analysis in machine learning tasks. Moreover, `cifkit` extracts atomic mixing information at the bond pair level, tasks that @@ -46,7 +47,9 @@ size, tags, coordination numbers, elements, and atomic mixing. ## Processing speed expectation -Processing approximately 10,000 .cif files on a standard laptop (iMac with M1 chip) may take about 30 to 60 minutes. At this rate, we can process nearly all .cif files within 1–2 days. +Processing approximately 10,000 .cif files on a standard laptop (iMac with M1 +chip) may take about 30 to 60 minutes. At this rate, we can process nearly all +.cif files within 1–2 days. 
## Installation diff --git a/package-lock.json b/package-lock.json new file mode 100644 index 0000000..fc9c2c9 --- /dev/null +++ b/package-lock.json @@ -0,0 +1,26 @@ +{ + "name": "cifkit", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "dependencies": { + "prettier": "^3.3.3" + } + }, + "node_modules/prettier": { + "version": "3.3.3", + "resolved": "https://registry.npmjs.org/prettier/-/prettier-3.3.3.tgz", + "integrity": "sha512-i2tDNA0O5IrMO757lfrdQZCc2jPNDVntV0m/+4whiDfWaTKfMNgR7Qz0NAeGz/nRqF4m5/6CLzbP4/liHt12Ew==", + "bin": { + "prettier": "bin/prettier.cjs" + }, + "engines": { + "node": ">=14" + }, + "funding": { + "url": "https://github.com/prettier/prettier?sponsor=1" + } + } + } +} diff --git a/package.json b/package.json new file mode 100644 index 0000000..1dafaf1 --- /dev/null +++ b/package.json @@ -0,0 +1,5 @@ +{ + "dependencies": { + "prettier": "^3.3.3" + } +} diff --git a/paper-review-1.md b/paper-review-1.md deleted file mode 100644 index fe2e41e..0000000 --- a/paper-review-1.md +++ /dev/null @@ -1,82 +0,0 @@ -## Response to @espottesmith - -**Paper comments:** - -> The summary is generally appropriate. From context, a non-expert might realize what *.cif files are (since crystal structures are mentioned in the second sentence), but this could be more explicitly stated in the first sentence. - -Thank you for your suggestion. The summary now incorporates what *.cif files are: - -```markdown -`cifkit` provides higher-level functions and properties for coordination -geometry and atomic site analysis from .cif files, which are standard file -formats for storing crystallographic data such as atomic fractional coordinates, -symmetry operations, and unit cell dimensions. -``` - -> I come from a computational background, so I'm not the target audience, but I am unconvinced that the basic functionality (i.e., parsing CIF files and extracting information like structures and formulas) is easier in cifkit than in pymatgen/ASE. 
Similar numbers of lines of code would be required, at least comparing pymatgen and the example given in the cifkit paper (see https://pymatgen.org/usage.html#reading-and-writing-structuresmolecules). Considering higher-level features of cifkit, including filtering groups of CIFs and visualization, I think that cifkit distinguishes itself from other codes. In this area, the authors' argument is more convincing. Perhaps a slight reframing of the Statement of Need is appropriate? - -This is a valid point. I have modified the "Statement of Need" and emphasized the utility of higher-level functions by modifying the section as follow: - -```markdown -`cifkit` distinguishes itself from existing libraries by offering higher-level -functions and variables that allow solid-state synthesists to obtain intuitive and -measurable properties impactful properties. It facilitates the visualization of -coordination geometry from each site using four coordination determination -methods and extracts physics-based features like volume and packing -efficiency—crucial for structural analysis in machine learning tasks. Moreover, -`cifkit` extracts atomic mixing information at the bond pair level, tasks that -would otherwise require extensive manual effort using GUI-based tools like -VESTA, Diamond, and CrystalMaker. These functions can be further developed -on-demand, as demonstrated by `cifkit`'s ability to extract coordination -geometry information based on four coordination number determination methods for -a newly discovered phase [@tyvanchuk_crystal_2024]. -``` - -> Otherwise, the paper looks good! - -Thank you! - -**Code comments:** - -> The folder "[occupacny"](https://github.com/bobleesj/cifkit/tree/main/src/cifkit/occupacny) should probably be "occupancy" - -Modified. Thank you. - -> I initially tried installing with Python 3.9 (as far as I can tell, the "pyproject.toml" and the docs don't specify a version). 
This caused problems, as cifkit is using type annotation syntax that is not supported in Python 3.9. I now see on PYPI that cifkit only supports Python 3.10, 3.11, and 3.12. Please make this more clear for users. - -In `pyproject.toml`, I have added the following `requires-python = '>=3.10, <3.13` to support 3.10, 3.11, 3.12. Per your suggestion, I have included the supported Python badges in the documentation under the Getting Started section. - -> In terms of code style, more extensive docstrings (for instance, explaining what the inputs and outputs should be) would be appreciated. It seems there are already some "TODO"s to this effect (see e.g., models/cif.py), so perhaps the authors are planning on making these changes soon. Function bodies are typically well commented (note that I have not read every file) - -I have added numpy-style docstrings for the main classes `Cif` and `CifEnsemble` here: https://github.com/bobleesj/cifkit/pull/49. I have also identified some code methods that could be refactored, as noted here: https://github.com/bobleesj/cifkit/issues/53. My next step is to further refine these docstrings by consolidating them into reusable methods. - -**Documentation comments:** - -> Examples in the docs aren't the most consistent. For instance, some example lines of code use "ensemble" as the variable, and some use "ensemble_test". It may also be better if the commented outputs reflected the example folder ("Example.ErCoIn_big_folder_path") so users can more easily verify correctness. Finally, incorporating the visualization tools in the docs (currently they're demonstrated only in the README) would be appreciated. - -Thank you for your suggestion. I have now maintained the instance name `ensemble` consistently. The actual output using the Jupyter notebook has been used for all examples. 
- -> The statement of need could be made more clear in the docs - -I have added the statement of need in the beginning of the docs: https://bobleesj.github.io/cifkit/. - -> Community guidelines basically amount to "submit a PR or talk to the lead author", but for a project at this scale, perhaps that's appropriate. - -Thank you. I have created a PR template https://github.com/bobleesj/cifkit/blob/main/.github/pull_request_template.md and instructions on how to contribute here https://github.com/bobleesj/cifkit/blob/main/CONTRIBUTING.md. - -**Overall comments:** - -> In terms of "substantial scholarly effort", this is a reasonably small, simple, and young project. That said, it is clearly already being used for scholarly research, and it is somewhat unique in terms of the features that it offers. As such, I think it should be considered substantial. - -Thank you. For one of the projects mentioned, SAF (Structure/Composition Analyzer), which uses approximately 100 features generated using cifkit, we have a pre-print version available: https://chemrxiv.org/engage/chemrxiv/article-details/670aa269cec5d6c142f3b11a. I will continue to maintain the software for on-going research efforts. - -> The paper does not contain original data or results -Yes, .cif files must be provided for us to conduct analysis. - -> The authors do not make significant performance claims - -I have added information to the documentation stating that processing approximately 10,000 .cif files on a standard laptop (iMac with M1 chip) takes about 30 to 60 minutes. At this rate, we can process nearly all .cif files within 1–2 days. However, I have decided not to include performance metrics in the manuscript. Our code may evolve further—for instance, by using matrix multiplications to compute distances, as in ASE—and performance hasn't been a major bottleneck for our research. - -> With some relatively small tweaks, I think this will make a good addition to JOSS. - -Thank you for your review! 
diff --git a/paper-review-2.md b/paper-review-2.md deleted file mode 100644 index 60b233a..0000000 --- a/paper-review-2.md +++ /dev/null @@ -1,157 +0,0 @@ -## Response to @lancekavalsky - -**Code comments:** - -> For io related methods, e.g. the ones that generate histograms, more clarity is needed regarding where they write things by default. Running the histogram tutorial from the documentation, it wrote the histograms to a folder deep in my conda environment site-packages, which would likely not be intuitive to many users (particularly as this package is presented as being catered to users with less coding experience) and may cause issues on shared resources. - -Thank you for the details. Yes, the histogram images were saved to the Anaconda environment, as that was where the .cif files were provided by default in the installed package. In the docs, I have included information about options to set the output path for users for clarity: - -```python -# Optional: Specify the output directory where the .png file will be saved. -ensemble.generate_site_mixing_type_histogram(output_dir="path/to/directory") - -# Optional: Call plt.show() to display the histogram on screen. -ensemble.generate_site_mixing_type_histogram(display=False) -``` - -For API doc users, I have included docstrings: - -```python -def generate_supercell_size_histogram( - self, display=False, output_dir=None -): - """Generate a histogram of the 'supercell_count' property from CIF files. - - This method creates a histogram based on the 'supercell_count' statistics of - the CIF files. If 'output_dir' is specified, the histogram image (.png) will be - saved to that directory. If 'output_dir' is not specified, the image will be saved - to the directory specified by 'self.dir_path'. - - Parameters - ---------- - display : bool, optional - If True, the plot is displayed using plt.show(). Default is False. - output_dir : str, optional - The directory path where the histogram should be saved. 
If None, - the histogram is saved in the directory defined by 'self.dir_path'. - """ -``` - -> I will echo the previous comment that more docstrings would be invaluable in helping code clarity. In particular, I would urge adopting a standardized approach to this, such as Numpy, to be more in line with community standards - -Per your suggestion, I have added NumPy-style docstrings to the core `Cif` and `CifEnsemble` classes. I will continue updating the documentation after modularizing some functions, such as combining the histogram generation functions into a single one to reduce verbosity. - -> I have opened a technical issue I encountered here - -Thank you. I have addressed the issue via this PR: https://github.com/bobleesj/cifkit/pull/47. To ensure compatibility, I created a test function to ensure that raw .cif files sourced from ICSD, COD can be parsed and the supercell can be generated: - -```python -@pytest.mark.parametrize( - "cif_folder_path, expected_file_count, expected_supercell_stats", - [ - ("tests/data/cif/sources/ICSD", 4, {216: 2, 307: 1, 320: 1}), - ("tests/data/cif/sources/COD", 2, {519: 1, 1383: 1}), - ("tests/data/cif/sources/MP", 2, {108: 1, 594: 1}), - ("tests/data/cif/sources/PCD", 1, {364: 1}), - ("tests/data/cif/sources/MS", 1, {2988: 1}), - ("tests/data/cif/sources/CCDC", 1, {3844: 1}), - ], -) -``` - -**Documentation comments:** - -> As the core classes for this package are Cif and CifEnsemble, more explicit explanations as to the inputs and parameters would be helpful -- especially for CifEnsemble. For example, unless I'm missing it, a comprehensive list of everything that the preprocess parameter triggers is not mentioned in the documentation. - -Thank you. I have included docstrings for the `Cif` and `CifEnsemble` classes, providing explicit parameters and preprocessing triggers as shown below. While the documentation can be further refined, I believe it serves its purpose for now. 
- -```python -class CifEnsemble: - def __init__( - self, - cif_dir_path: str, - add_nested_files=False, - preprocess=True, - logging_enabled=False, - ) -> None: - """Initialize a CifEnsemble object, containing a collection of Cif - objects. - - Parameters - ---------- - cif_dir_path : str - Path to the folder path containing .cif file(s). - add_nested_files : bool, optional - Option to include .cif files contained in sub-directories within cif_dir_path - , by default False - preprocess : bool, optional - Option to edit .cif files before initializing each into a Cif object, - by default True. Preprocess modifies atomic site labels in - atom_site_label. Some site labels may contain a comma or a symbol like M - due to atomic mixing. It reformats each atom_site_label so it can be - parsed into an element type matching atom_site_type_symbol. For PCD - databases, addresses in publ_author_address often have an incorrect - format requiring manual modifications. It also relocates any ill-formatted - files, such as those with duplicate labels in atom_site_label, missing - fractional coordinates, or files requiring supercell generation. - - logging_enabled : bool, optional - Option to log while pre-processing Cif objects, by default False - - Attributes - ---------- - dir_path: str - Path to the folder containing .cif files - file_paths: list[str] - List of file paths to .cif files - cifs: list[Cif] - List of Cif objects - file_count: int - Number of .cif files in the folder - logging_enabled: bool - Option to log while pre-processing Cif objects - """ -``` - -> In the documentation there are a couple instances of general clean-up required. One example is the first box on Getting Started uses a CIF method ensemble.cif_folder_path which gives an error when run. 
Another example is under the CIF specific documentation which refers to a README.md for complete documentation, but it is unclear where this file is located (since that info doesn't appear to be the main README in the repo?). - -I have revised the documentation to enhance clarity and personally tested each example to ensure accuracy. Additionally, I have included comments indicating the location of default Example files provided in the package for first-time users: - -```python -# In `cifkit` we provide .cif files that can be accessed through `from cifkit import Example` as shown below. For advancuser, these example .cif files are located under `src/cifkit/data` in the package. - -from cifkit import Example -from cifkit import Cif - -# Initialize with the example file provided -cif = Cif(Example.Er10Co9In20_file_path) -... -``` - -> There are options in the Cif class to use either the by_d_min_method or by_best_methods. Please refer to the README.md for complete documentation. - -I have included detailed documentations in the API. https://github.com/bobleesj/cifkit/blob/dbaf32400b70f323ba5965526193704b2613ea7b/src/cifkit/models/cif.py#L619 and https://github.com/bobleesj/cifkit/blob/dbaf32400b70f323ba5965526193704b2613ea7b/src/cifkit/models/cif.py#L682. - -> This is relatively minor, but on the documentation website it wasn't immediately clear to me what would be contained under the Notebooks tab at the top. Given this outlines several of the core functionalities of cifkit, I would consider renaming it to something more descriptive. - -Thank you for your feedback. I have replaced the name "Notebooks" but into 4 sections: `Getting started` `Cif` `CifEnsemble` and `API References`. - -> The contributing guidelines at this stage are somewhat vague. For example, is a minimum code coverage required for added features? 
- -I have included a PR request template (https://github.com/bobleesj/cifkit/blob/main/.github/pull_request_template.md) as well as the `CONTRIBUTING.md` for how to fork, clone, commit, etc. - -**Paper comments:** - -> In the summary it mentions that this package is designed to process datasets on "the order of tens of thousands". It is not clear to me where this is coming from and what exactly causes the bottleneck for going beyond this estimate. Details regarding what determined this limitation would be helpful to judge high-throughput performance - -Please see my comment above in response to @espottesmith's point about "The authors do not make significant performance claims." - -Additionally, as @ml-evs pointed out, our package's strength lies not in high-throughput processing (ASE, pymagen do a much better job) but in its specific features for "coordination geometry" and "atomic site analysis" with features that are demanded in experimental research. Consequently, I have modified the manuscript title to "cifkit: A Python package for coordination geometry and atomic site analysis" and removed "high-throughput." - -> The examples in the paper are presented with limited explanation as to what they are showing. While the comments help in the second example, and some of the methods are self-explanatory by their naming, more comments would help cement the clarity here. - -Thank you for this feedback. I have added higher-level functions that are considered novel in our package and included more explanatory comments. - -> Overall, this work would make a great addition to JOSS pending the minor revisions described above. - -Thank you for your review and recommendation! 
diff --git a/paper-review-3.md b/paper-review-3.md deleted file mode 100644 index bdda5f7..0000000 --- a/paper-review-3.md +++ /dev/null @@ -1,40 +0,0 @@ -## Response to @ml-evs - -> While this package clearly has some useful tools (particularly the nice polyhedron visualisation), I fail to see how this tool is easier to use than pymatgen/ASE as a selling point. If it is targeted at people with limited programming/Python experience, then much more care must be taken in guiding them through the installation process (with virtual environments etc). - -This is a valid comment considering most users end up using the packages that are built on top of `cifkit` rather than building. I've also provided `conda install cifkit_env -n cifkit` option in the package to help manage dependencies easily. This installation and the Getting Started sections were tested by undergraduate and PhD students of Dr. Anton Oliynyk. - -> Whilst "high throughput" is mentioned throughout the docs and paper, I see little in the way of e.g., assistance with parallelisation or batch processing (e.g., if I have 1m structures can I stream data without going via a file all the time?), nor any advanced error handling that could allow this package to be used in a highly automated way -- perhaps the authors could clarify what they mean with regards to high throughput in this case? - -Repeating my comment above regarding the comment "the order of tens of thousands" by @lancekavalsky, - -Based on reviewer feedback and internal discussions, we concluded that our package's strength lies not in high-throughput processing (ASE and pymatgen excel at this), but in its specific features for "coordination geometry" and "atomic site analysis." These features are crucial for experimental research and are often tediously acquired from GUI-based software. Consequently, I've modified the manuscript title to "cifkit: A Python package for coordination geometry and atomic site analysis" and removed "high-throughput." 
- -To clarify, our current implementation doesn't include batch processing. We process each .cif file individually. With approximately 1,000 atoms in the supercell, processing takes about 0.3 seconds for one file and 3,000 atoms—less than 3 seconds total. I've added to the documentation that processing roughly 10,000 .cif files on a standard laptop (iMac with M1 chip) takes about 30 to 60 minutes. At this rate, we can process nearly all .cif files within 1–2 days. However, we've decided not to include performance metrics in the manuscript. Our code may evolve further—perhaps using matrix multiplications to compute distances, as in ASE—and performance hasn't been a major bottleneck for our research. - - -Regarding error handling, we've processed tens of thousands of binary and ternary .cif files, primarily from PCD and thousands from ICSD. We've aimed to identify the most common errors to ensure we can use as many .cif files from the database as possible. Of course, there are some cases where .cif files lack space group operations or fractional coordinates. In such instances, we move those files into separate folders before initializing them into objects. We've also tested with COD, CCSD, and other datasets to encounter more diverse errors, which users can report later for me to address. - -> I do not think it is appropriate to mention test coverage explicitly in the paper; although the package does appear to be well-tested with good CI, pinned dependencies and lots of test cases, this metric can be a red herring. - -Thank you for your suggestion. All dependency pinnings have been removed, and CI now supports Python 3.10, 3.11, and 3.12. Additionally, the mention of Codecov has been removed. - -> There are a few glaring omissions in the references that could help provide useful background to readers: - -Based on your feedback, cifkit now supports CIF files directly downloaded from ICSD, COD, PCD, CCDC, and those created with Materials Studio. 
I have retained the `hall_crystallographic_1991` citation as it introduces the original CIF format. On demand, I will modify our code to support the new version if databases begin to export .cif in v2. - -> The CIF framework itself is not appropriately cited (see https://www.iucr.org/resources/cif/cif2) -- the paper/documentation should also discuss which versions of the CIF standard are supported (it is a growing standard) with changes between v1 and v2. Gemmi, the library used for the underlying CIF parsing, published in JOSS already https://joss.theoj.org/papers/10.21105/joss.04200 Other potential but not mandatory references: - -This is greatly appreciated. The Gemmi has been cited, along with PyVista, another JOSS paper. Additionally, citations for NumPy, SciPy, and Matplotlib have been included. - -> chemenv, used under the hood by pymatgen for coordination analysis https://journals.iucr.org/b/issues/2020/04/00/lo5066 (and subsequently much of the pre-existing literature for determining coordination environments). - -Thank you. Added in the manuscript. - -> Other dedicated CIF parsing projects supported by IUCr: PyCIFRw and COD's CIF parser (with pycodcif Python package) https://journals.iucr.org/j/issues/2016/01/00/po5052/index.html There are many others listed at https://www.iucr.org/resources/cif/software - -Since Gemmi provided sufficient parsing capabilities for our tasks with minor fixes done by cifkit such as such as adding “#” to the first line to ICSD .cif files, we have not incorporated additional CIF parsers. - -> I feel this submission is borderline on the "Substantial scholarly effort" front, given the guidelines at https://joss.readthedocs.io/en/latest/submitting.html#substantial-scholarly-effort I am not convinced by the text that the library was crucial in the three examples listed by the authors in the applications section; perhaps this could be expanded? 
Perhaps even the functionality from the other related packages by the author listed in the paper (SAF and CBA) could be exposed in the cifcheck namespace as (effectively) one package? - -Yes, cikfit and its applications are actively used by approximately 4-5 research groups. The SAF and CBA modules each have their own descriptions in respective journal publications, making it challenging to consolidate them under a broad `cifkit' umbrella. We would prefer to serve as an 'engine' that provides advanced functionality. Therefore, we do not want to combine packages into one. However, features will continue to be developed on demand, primarily for experimental groups led by Prof. Oliynyk, both within his group and through his active collaborations. diff --git a/paper.md b/paper.md index 23efe94..3333f3f 100644 --- a/paper.md +++ b/paper.md @@ -1,6 +1,6 @@ - --- -title: "cifkit: A Python package for coordination geometry and atomic site analysis" +title: + "cifkit: A Python package for coordination geometry and atomic site analysis" tags: - Python - CIF @@ -63,11 +63,11 @@ coordination environment identification through ChemEnv generating and running atomistic simulations. `cifkit` distinguishes itself from existing libraries by offering higher-level -functions and variables that allow solid-state synthesists to obtain intuitive and -measurable properties impactful properties. It facilitates the visualization of -coordination geometry from each site using four coordination determination -methods and extracts physics-based features like volume and packing -efficiency—crucial for structural analysis in machine learning tasks. Moreover, +functions and variables that allow solid-state synthesists to obtain intuitive +and measurable properties impactful properties. 
It facilitates the visualization +of coordination geometry from each site using four coordination determination +methods and extracts physics-based features like volume and packing efficiency, +which are crucial for structural analysis in machine learning tasks. Moreover, `cifkit` extracts atomic mixing information at the bond pair level, tasks that would otherwise require extensive manual effort using GUI-based tools like VESTA, Diamond, and CrystalMaker. These functions can be further developed @@ -98,7 +98,7 @@ polyhedra descriptors from atomic sites. The full installation process can be executed via a Jupyter notebook, accessible through the Google Colab URL provided in the official documentation. -![polyhedron-image-distribution-of-files](docs/assets/img/ErCoIn-histogram-combined.png) +![Atomic site coordination geometry (left) and distribution based on coordination number (right)](docs/assets/img/ErCoIn-histogram-combined.png) ```python from cifkit import Cif, Example @@ -143,7 +143,7 @@ from cifkit import CifEnsemble, Example >>> ensemble.filter_by_formulas(["LaRu2Ge2"]) # Return file paths by site mixing types ->>> ensemble.filter_by_site_mixing_types(["full_occupancy", "deficiency_without_atomic_mixing"]) +>>> ensemble.filter_by_site_mixing_types(["deficiency_without_atomic_mixing"]) # Determine shortest pair distance per .cif file >>> ensemble.filter_by_CN_min_dist_method_containing([14]) @@ -152,9 +152,9 @@ from cifkit import CifEnsemble, Example # Applications `cifkit` has been used for research conducted at academic and national -laboratories for crystal structural analysis and machine learning studies, and -expanding. CIF Bond Analyzer (CBA) utilizes `cifkit` to extract coordination -geometry information for newly a discovered phase [@tyvanchuk_crystal_2024]. The +laboratories for crystal structural analysis and machine learning studies. 
CIF +Bond Analyzer (CBA) utilizes `cifkit` to extract coordination geometry +information for a newly discovered phase [@tyvanchuk_crystal_2024]. The Structure Analysis/Featurizer (SAF) employs `cifkit` to construct and extract physics-based geometric features for binary and ternary compounds [@jaffal_composition_2024]. Furthermore, geometric features generated with diff --git a/src/cifkit/data/ErCoIn/polyhedrons/ErCoIn5_In1.png b/src/cifkit/data/ErCoIn/polyhedrons/ErCoIn5_In1.png index 17c6e4b..ef9411e 100644 Binary files a/src/cifkit/data/ErCoIn/polyhedrons/ErCoIn5_In1.png and b/src/cifkit/data/ErCoIn/polyhedrons/ErCoIn5_In1.png differ