how to treat MITgcm tiled output as a collection "a la zarr"? #143

Open
gaelforget opened this issue Jan 7, 2025 · 2 comments

Comments

@gaelforget
Owner

It seems that several packages offer such an option:

https://github.com/JuliaGeo/NCDatasets.jl
or https://juliageo.org/CommonDataModel.jl/stable/
or https://github.com/JuliaIO/Kerchunk.jl
or https://github.com/JuliaIO/Zarr.jl
or https://github.com/rafaqz/Rasters.jl

Note: MITgcm also outputs global binary files, which I imagine could be retiled on the fly via MeshArrays.jl, but I would start with the tiled output of MITgcm (one file per processor).

@asinghvi17

Assuming the output is tiled in lat/long coordinates, then you can create a Kerchunk file manually without needing Python.

Do you have an example directory structure?

@gaelforget
Owner Author

gaelforget commented Jan 15, 2025

> Assuming the output is tiled in lat/long coordinates, then you can create a Kerchunk file manually without needing Python.

This is for a global grid where grid lines are not aligned with meridians or parallels. Hence longitude and latitude are two-dimensional arrays rather than one-dimensional coordinate axes.

Does this preclude using Kerchunk?

> Do you have an example directory structure?

Here is a typical directory structure.

julia> using Climatology
julia> readdir(ScratchSpaces.ECCO)
15-element Vector{String}:
 "ADVx_TH"
 "ADVy_TH"
 "DFxE_TH"
 "DFyE_TH"
 "ETAN"
 "MXLDEPTH"
 "SALT"
 "SIarea"
 "THETA"
 "UVELMASS"
 "VVELMASS"
 "WVELMASS"
 "interp_coeffs_halfdeg.jld2"
 "oceQnet"
 "sIceLoad"

and inside each variable subfolder:

julia> readdir(joinpath(ScratchSpaces.ECCO,"ETAN"))
13-element Vector{String}:
 "ETAN.0001.nc"
 "ETAN.0002.nc"
 "ETAN.0003.nc"
 "ETAN.0004.nc"
 "ETAN.0005.nc"
 "ETAN.0006.nc"
 "ETAN.0007.nc"
 "ETAN.0008.nc"
 "ETAN.0009.nc"
 "ETAN.0010.nc"
 "ETAN.0011.nc"
 "ETAN.0012.nc"
 "ETAN.0013.nc"

which you can download as follows:

using Climatology
get_ecco_variable_if_needed("ETAN")

Notes:

  • In this example, we are looking at a monthly climatology (12 time records); in other cases, we have time series (e.g. 1992-2011, with 12*20 = 240 monthly records).
  • In this example, each file contains one 90x90 tile plus a time dimension (and a depth dimension for some variables).
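
The layout described in these notes maps naturally onto a chunked array in the spirit of Zarr: for ETAN, one logical array of size 90 × 90 × 13 × 12, where each of the 13 tile files holds one 90 × 90 × 12 chunk. Below is a minimal sketch in plain Julia, assuming the file naming shown earlier (`ETAN.0001.nc` to `ETAN.0013.nc`). `TiledVariable`, `tile_file`, and `locate` are hypothetical names made up for illustration; they are not part of Climatology.jl, MeshArrays.jl, or Zarr.jl.

```julia
# Hypothetical description of one tiled variable; all names are illustrative.
struct TiledVariable
    dir::String                 # folder holding the variable subfolders
    name::String                # variable name, e.g. "ETAN"
    tilesize::Tuple{Int,Int}    # horizontal size of each tile, e.g. (90, 90)
    ntiles::Int                 # number of tile files, e.g. 13
    ntimes::Int                 # number of time records per file, e.g. 12
end

# Logical size of the whole collection, ordered as (i, j, tile, time).
Base.size(v::TiledVariable) = (v.tilesize..., v.ntiles, v.ntimes)

# Path of the file storing tile `t`, assuming the "NAME.00tt.nc" pattern.
function tile_file(v::TiledVariable, t::Integer)
    1 <= t <= v.ntiles ||
        throw(ArgumentError("tile index $t is outside 1:$(v.ntiles)"))
    return joinpath(v.dir, v.name, string(v.name, ".", lpad(t, 4, '0'), ".nc"))
end

# Map a global index (i, j, tile, time) to the file that holds it and the
# local index (i, j, time) inside that file.
function locate(v::TiledVariable, i, j, t, n)
    inside = all(map((k, s) -> 1 <= k <= s, (i, j, t, n), size(v)))
    inside || throw(ArgumentError("index $((i, j, t, n)) is outside $(size(v))"))
    return (file = tile_file(v, t), index = (i, j, n))
end

v = TiledVariable("ECCO", "ETAN", (90, 90), 13, 12)
size(v)                  # (90, 90, 13, 12)
locate(v, 45, 45, 13, 1) # file ends in "ETAN.0013.nc", local index (45, 45, 1)
```

Treating the tile number as its own dimension sidesteps the question of how tiles connect to each other on the global grid, which this sketch does not address.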

Here is an example of the file content:

Dimensions
   i2 = 90
   i3 = 90
   i1 = 12

Variables
  area   (90 × 90)
    Datatype:    Float64 (Float64)
    Dimensions:  i3 × i2
    Attributes:
     long_name            = grid cell area
     units                = m^2
     standard_name        = cell_area

  i1   (12)
    Datatype:    Float64 (Float64)
    Dimensions:  i1
    Attributes:
     long_name            = array index 1
     units                = 1

  i2   (90)
    Datatype:    Float64 (Float64)
    Dimensions:  i2
    Attributes:
     long_name            = array index 2
     units                = 1

  i3   (90)
    Datatype:    Float64 (Float64)
    Dimensions:  i3
    Attributes:
     long_name            = array index 3
     units                = 1

  land   (90 × 90)
    Datatype:    Float64 (Float64)
    Dimensions:  i3 × i2
    Attributes:
     long_name            = land mask
     units                = 1
     standard_name        = land_binary_mask

  lat   (90 × 90)
    Datatype:    Float64 (Float64)
    Dimensions:  i3 × i2
    Attributes:
     units                = degrees_north
     standard_name        = latitude

  lon   (90 × 90)
    Datatype:    Float64 (Float64)
    Dimensions:  i3 × i2
    Attributes:
     units                = degrees_east
     standard_name        = longitude

  tim   (12)
    Datatype:    Dates.DateTime (Float64)
    Dimensions:  i1
    Attributes:
     long_name            = time
     units                = days since 0000-1-1 0:0:0
     standard_name        = time

  timstep   (12)
    Datatype:    Float64 (Float64)
    Dimensions:  i1
    Attributes:
     long_name            = final time step number
     units                = 1

  ETAN   (90 × 90 × 12)
    Datatype:    Float32 (Float32)
    Dimensions:  i3 × i2 × i1
    Attributes:
     long_name            = Free Surface Height Anomaly (Ocean-Ice Interface) (climatology)
     units                = m
     coordinates          = lon lat tim
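
Given the file content above, a single tile can be read with NCDatasets.jl, one of the packages listed at the top of this issue. The following is only a sketch under that assumption, not necessarily how Climatology.jl reads these files. `read_tile` is a hypothetical helper, and the variable names (`lon`, `lat`, `ETAN`) are taken from the listing above.

```julia
using NCDatasets

# Hypothetical helper: read the 2D coordinates and one variable from a tile file.
function read_tile(path::AbstractString, name::AbstractString = "ETAN")
    NCDataset(path) do ds
        (lon = ds["lon"][:, :],              # 2D longitudes (90 × 90 in the listing)
         lat = ds["lat"][:, :],              # 2D latitudes
         data = Array(ds[name]),             # 90 × 90 × 12 for ETAN; 4D if depth is present
         units = ds[name].attrib["units"])   # e.g. "m" for ETAN
    end
end

# Usage, assuming `files` is a vector of paths such as ".../ETAN/ETAN.0001.nc":
# tiles = [read_tile(f) for f in files]
```

Because `lon` and `lat` are two-dimensional, they are returned as arrays alongside the data rather than used as dimensions.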
