disdrodb
review
#156
Comments
Hello and welcome to pyOpenSci!!!
Editor in Chief checks
Hi there! Thank you for submitting your package for pyOpenSci. Please check our Python packaging guide for more information on the elements
Editor comments
A few small nits as I am going through the repository. These nits do not block your review from continuing, but I thought I would call them out.
An overall impression I've received from going through your packages is how much effort you all have put into documenting the tricky steps that people might not be super familiar with. This is not trivial work, so I just wanted to call out how impressed I was! The one piece that I would like to open up for discussion is
I do agree that |
Hi @isabelizimm, Thanks a lot for going through the software and thoroughly reviewing the documentation! Documentation Glitches I've addressed the glitches you encountered in the Reasons Behind the Creation of the disdrodb-data Repository The decision to separate the The By cloning the Given the potential high frequency of updates to the It's worth noting that the We believe this structure best supports the project's current needs and future growth, balancing ease of use with the complexity of data management and software development. Thank you once again for your constructive feedback and for opening up this discussion. I look forward to any additional questions and suggestions you might have :)
Hey there-- I've been thinking about this a bunch, and also have solicited some advice from the awesome pyOpenSci community on their experiences with storing data outside a package! Feel free to chime in on that thread as well if you would like (I couldn't find you on Slack to @)! From your experience, it seems like the two repo solution has come from:
I do think that this setup feels a bit awkward, and the complexity might turn people away from the package. BUT I think these problems are super solvable and there's some great tools out there to help! A few options that might make sense:
Option 1: adding functions in
Hey Isabel, I tried joining the Regarding your message, I noticed there might be some confusion around the Maybe we could rename the The actual station data are stored in cloud repositories such as Zenodo (see here for some examples). We chose GitHub for the metadata repository for several reasons:
When starting the project, we thought about using Opting for the two-repo solution was a deliberate choice to facilitate frequent updates to the metadata without necessitating constant releases of the disdrodb Python package each time a metadata file is edited or a new station is added to the metadata archive. This approach might look unusual, but as far as I know, it’s also the first attempt in the landscape of meteorological (and non?) data management to create a decentralized online data archive infrastructure backed by a GitHub-powered metadata repository 😊 Regarding the options you proposed, they seem to stem from a misunderstanding of the Option 2: making disdrodb-metadata into its own package, and then disdrodb depending on that The suggestion to package If I understand the suggestion correctly, you assumed The We want this (cloned) directory to be kept in sync with the main repository so that any repo update is immediately available to the user, without requiring new package releases. The packaging solution conflicts with the potentially high frequency of repo updates. Option 1: adding functions in disdrodb that grab the data from the disdrodb-metadata The Users interested in downloading disdrometer data simply:
I recognize that using (and installing) To eliminate the need for
Implementing this second option would demand significant effort for functionality that git already offers efficiently and reliably. I hope you don't feel overwhelmed by all that I've just written down 😅 For me, this discussion is also a great way to understand how we can further improve the documentation of our software 😃
Oh! It looks like I have misunderstood the |
Okay, I am back! Thank you so much for your detailed response, that was super helpful to re-orient my brain to understand this set up. My opinion is still that the workflow for new users is most ergonomic if they can BUT since the intent of Let's keep this conversation going with an editor with domain expertise on this specific type of data to help determine what would be most useful to the community, and have reviewers (who will be closer to the codebase) to give an extra eye on usability and "getting started" feelings. So, tldr, no need to make any changes to the setup yet. I am actively looking for an editor, and will let you know when we have someone lined up for you! |
Hey there! Here's a long awaited update! |
Thanks! Looking forward to your feedback ;)
Hi @ghiggi, nice to meet you! I'll be looking into finding some reviewers this week to get the ball rolling here. This is my first time editing for PyOS, so please bear with me! |
Hi @Zeitsperre. I hope you are doing well. Is there any news regarding the review of the software? |
Hi @ghiggi, Apologies for the delay! I have a few reviewers in mind that will be contacted this week. In terms of the review itself, it can sometimes take more than a month to get all review comments back. Depending on the amount of changes requested, the remaining changes could take some time and effort. I have seen some review turnarounds of a few months at most. At the very least, for the purposes of your presentation you should be able to use either the Zenodo DOI or perhaps there's a way to park a PyOS DOI? I'll look into that after contacting the reviewers.
Hey there @Zeitsperre et al. I'm just checking in on this review. Is there anything that I can do to help or support it moving forward? To answer the question above: setting this project up with Zenodo is highly recommended.
Hi @lwasser @Zeitsperre ! |
hey @ghiggi we are working on a reboot right now to find more reviewers. I'm so glad that you have Zenodo setup already. I'll keep you posted here and if you have questions please feel free to reach out!! |
@ghiggi I wanted to follow up. If you still wish to move forward with the review (I'm so sorry this has taken so long), we will find a new editor. We have a new EiC, @coatless, working on packages. Can you please confirm that you still wish to move forward with a review? If so, I'll help here to onboard someone to move things forward.
Hi @lwasser, Thank you for following up. A lot of time has passed since we initially requested the software review. |
sounds good @ghiggi I understand! I'll go ahead and close this review! Best of luck in the other review processes!! |
Submitting Author: Gionata Ghiggi (@ghiggi)
All current maintainers: (@ghiggi)
Package Name: disdrodb
One-Line Description of Package: disdrodb - A software for the decentralized archiving and standardization of global disdrometer data
Repository Link: https://github.com/ltelab/disdrodb
Version submitted: v.0.0.21
EIC: @isabelizimm
Editor: @Zeitsperre
Reviewer 1: TBD
Reviewer 2: TBD
Archive: TBD
JOSS DOI: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD
Code of Conduct & Commitment to Maintain Package
Description
The raindrop size distribution (DSD) describes the concentration and size distributions of raindrops in a volume of air. It is a crucial piece of information to model the propagation of microwave signals through the atmosphere (key for telecommunication and weather radar remote sensing calibration), to improve microphysical schemes in numerical weather prediction models, and to understand land surface processes (rainfall interception, soil erosion).
Recognizing the importance of understanding DSD's spatial and temporal variability, scientists worldwide have initiated efforts to "count the drops" by deploying disdrometers—specialized instruments designed to record DSD. Numerous measurement campaigns have been conducted by meteorological services, national agencies (e.g., NASA, ARM, NCAR), and university research groups. Despite these efforts, a significant portion of the collected data remains difficult to access. These data are often stored in diverse formats with inadequate documentation and metadata, posing challenges in sharing, analyzing, comparing, and reusing the data.
In response to these challenges, the disdrodb Python package offers:
A Decentralized Data Archive Infrastructure: The disdrodb package establishes a decentralized data archive, fostering the exchange and retrieval of raw disdrometer data within the scientific community. This infrastructure addresses the issues of data accessibility and documentation, and promotes collaborative research.
Standardization of Raw Data: The disdrodb package provides tools to convert heterogeneous raw data into a uniform netCDF4 format, known as the DISDRODB L0 product. This standardization makes data from different sources compatible and easier to analyze, compare, and share, which improves their overall utility and reusability.
Scope
Domain Specific & Community Partnerships
How and why the package falls under the categories you indicated above
Data Retrieval
The disdrodb package facilitates the retrieval of raw measurements acquired by disdrometer stations that are included in the DISDRODB Decentralized Data Archive. This remote archive comprises public cloud repositories such as Zenodo. The disdrodb package tracks the available stations through the DISDRODB Metadata Archive, which is hosted on GitHub.
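As a rough illustration of how a metadata archive can point to data hosted elsewhere, the sketch below looks up a station's data URL in a small in-memory list of metadata records. The record fields (`data_source`, `campaign`, `station`, `data_url`), the example values, and the function names are assumptions made for this sketch; they are not the disdrodb API or the actual DISDRODB metadata schema.

```python
# Hypothetical, in-memory stand-in for a metadata archive.
# Field names and values are illustrative assumptions, not the DISDRODB schema.
METADATA = [
    {
        "data_source": "EXAMPLE_SOURCE",
        "campaign": "CAMPAIGN_A",
        "station": "10",
        "data_url": "https://example.org/campaign_a_station_10.zip",
    },
    {
        "data_source": "EXAMPLE_SOURCE",
        "campaign": "CAMPAIGN_A",
        "station": "20",
        "data_url": None,  # station is documented, but no data uploaded yet
    },
]


def stations_with_data(metadata):
    """Return the records that have a non-empty data URL."""
    return [record for record in metadata if record.get("data_url")]


def find_data_url(metadata, data_source, campaign, station):
    """Return the data URL of one station, or None if it is unknown or has no data."""
    for record in metadata:
        key = (record["data_source"], record["campaign"], record["station"])
        if key == (data_source, campaign, station):
            return record.get("data_url")
    return None


available = stations_with_data(METADATA)
url = find_data_url(METADATA, "EXAMPLE_SOURCE", "CAMPAIGN_A", "10")
```

The point of the sketch is the separation of concerns: the metadata records are small and change often, while the data they point to can live in any public repository.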
Data Munging
After downloading the desired data, users can use disdrodb to convert the heterogeneous raw data into a uniform netCDF4 format (DISDRODB L0) with a single command. This conversion facilitates subsequent scientific analysis and product generation. For each disdrometer station, the disdrodb Python package has a specialized reader that accurately parses the raw sensor data.
Data Deposition
The disdrodb package offers a workflow for users who wish to contribute their disdrometer measurements to the DISDRODB community. This workflow ensures the long-term documentation of the data and simplifies the data upload process to the DISDRODB Decentralized Data Archive. Users must perform three main tasks:
Create a reader that reads the raw data into a dataframe, adhering to the DISDRODB guidelines.
Provide the metadata of the disdrometer station, which will be included in the DISDRODB Metadata Archive.
Upload the station's raw data to a remote repository and insert the station data URL into the DISDRODB Metadata Repository. The disdrodb package can automate this final step if the chosen remote repository is Zenodo.
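To give a feel for the first task, here is a minimal sketch of a reader for a made-up raw format. The semicolon-separated layout, the column names, and the `read_raw` function are all hypothetical; an actual DISDRODB reader would follow the DISDRODB guidelines and return a dataframe, whereas this sketch returns plain dictionaries to stay dependency-free.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw format: "timestamp;rain_rate;drop_count", no header line.
# Column names and separator are assumptions for illustration only.
COLUMNS = ["time", "rain_rate_mm_h", "drop_count"]


def read_raw(text):
    """Parse raw semicolon-separated lines into a list of typed records."""
    records = []
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if len(row) != len(COLUMNS):
            continue  # skip malformed or truncated lines
        time_str, rain_rate, drop_count = row
        records.append(
            {
                "time": datetime.strptime(time_str, "%Y-%m-%d %H:%M:%S"),
                "rain_rate_mm_h": float(rain_rate),
                "drop_count": int(drop_count),
            }
        )
    return records


sample = "2020-01-01 00:00:00;1.5;120\nbroken line\n2020-01-01 00:01:00;0.0;0\n"
records = read_raw(sample)
```

The sketch skips lines with the wrong number of fields rather than failing, since raw logger files often contain truncated rows; whether to skip or flag such rows is a design choice each reader has to make.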
Who is the target audience and what are scientific applications of this package?
The primary audience for this package includes researchers and students in the fields of remote sensing and atmospheric science, specifically those focused on precipitation. The package is designed to support applications in remote sensing and atmospheric science.
Are there other Python packages that accomplish the same thing? If so, how does yours differ?
To our knowledge, there are no other packages that offer an integrated infrastructure for retrieving, sharing, archiving, reading, and standardizing disdrometer data.
However, the pyDSD package exists for studying the DSD. It provides methods for high-level scientific analysis of disdrometer raw data, such as computing DSD parameters and simulating weather radar reflectivities.
The DISDRODB Working Group plans to leverage and adapt pyDSD codes in the future to generate uniform, high-level scientific products for all stations within the DISDRODB Global Archive.
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication Options
JOSS Checks
paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Confirm each of the following by checking the box.
Please fill out our survey
submission and improve our peer review process. We will also ask our reviewers
and editors to fill this out.
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
The editor template can be found here.
The review template can be found here.