NetCDF Reference

About NetCDF Reference

The NetCDF Reference is a comprehensive quick-lookup guide for the Network Common Data Form (NetCDF), the standard self-describing file format used in meteorology, oceanography, climate science, and geospatial data analysis. It covers NetCDF-3 Classic (2GB file size limit), NetCDF-3 64-bit offset, and NetCDF-4 (HDF5-based, with support for groups, compression, and user-defined types), along with the core data model of dimensions, variables, and attributes that forms the backbone of every .nc file.

The reference is organized into four categories:

  • Basic Concepts — NetCDF versions, dimensions (time/lat/lon/level), variable declarations, global and variable attributes
  • CF Conventions — Climate and Forecast metadata standards, including standard_name, UDUNITS units, time coordinate calendars, and cell_methods for aggregation
  • ncdump/Tools — ncdump header inspection, ncgen CDL-to-binary conversion, CDO climate data operators for time averaging and spatial subsetting, NCO operators for variable extraction and attribute editing, OPeNDAP remote access, and THREDDS/ERDDAP data servers
  • xarray — open_dataset, DataArray selection with sel/isel, groupby/resample/rolling operations, built-in plotting with cartopy projections, open_mfdataset for multi-file datasets, Dask integration for out-of-core computation, and low-level netCDF4 library usage

Whether you are inspecting a climate model output with ncdump -h, computing monthly means with CDO, extracting a spatial subset with NCO, or building a Python analysis pipeline with xarray and Dask for terabyte-scale ERA5 reanalysis data, this reference provides the exact command syntax, function signatures, and configuration patterns you need at your fingertips.

Key Features

  • NetCDF version comparison covering Classic (2GB limit), 64-bit offset, and NetCDF-4 with HDF5 features like chunking, compression, and parallel I/O
  • Complete CF Conventions guide including standard_name variables (air_temperature, precipitation_flux), time coordinate calendar types, and cell_methods aggregation syntax
  • ncdump command reference for header inspection (-h), coordinate display (-c), human-readable time conversion (-t), and full CDL dump
  • CDO (Climate Data Operators) quick reference for timmean, yearmean, mergetime, and spatial subsetting with sellonlatbox
  • NCO operators guide covering ncks variable extraction, dimension slicing, ncrename, ncatted attribute editing, and ncea ensemble averaging
  • Python xarray API reference for open_dataset, sel/isel selection, groupby/resample/rolling statistics, and to_netcdf with compression encoding
  • Dask integration patterns for lazy evaluation of large NetCDF datasets with configurable chunking strategies
  • Remote data access via OPeNDAP URLs and THREDDS/ERDDAP server protocols for selective download without full file transfer

Frequently Asked Questions

What is NetCDF and when should I use it?

NetCDF (Network Common Data Form) is a self-describing, platform-independent binary file format designed for storing multidimensional array data. It is the standard format for climate model outputs, weather reanalysis datasets (ERA5, MERRA-2), satellite observations, and oceanographic data. Use it when you need to store gridded scientific data with metadata (units, coordinate systems, time references) embedded in the file itself.

What is the difference between NetCDF-3 and NetCDF-4?

NetCDF-3 Classic has a 2GB file size limit and supports only fixed-size dimensions (except one unlimited dimension). NetCDF-3 64-bit offset removes the size limit. NetCDF-4 is built on HDF5 and adds groups (hierarchical structure), internal compression (zlib), user-defined data types, multiple unlimited dimensions, chunked storage for better I/O performance, and parallel I/O support.

What are CF Conventions and why do they matter?

CF (Climate and Forecast) Conventions are a metadata standard that defines how variables, coordinates, and attributes should be named and described in NetCDF files. They specify standard_name attributes (e.g., air_temperature), UDUNITS-compatible units, time coordinate formats with calendar types (standard, noleap, 360_day), and cell_methods for describing data aggregation. CF compliance ensures interoperability between tools like CDO, NCO, xarray, and visualization software.
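In xarray, CF metadata lives in the attrs dictionary of each variable and coordinate. A minimal sketch of tagging a temperature variable with CF attributes (the data values, coordinate ranges, and variable name tas here are made up for illustration):

```python
import numpy as np
import xarray as xr

# Attach CF-style metadata to a variable and its coordinates.
temp = xr.DataArray(
    np.full((2, 3), 288.15),  # kelvin, placeholder values
    dims=("lat", "lon"),
    coords={"lat": [10.0, 20.0], "lon": [100.0, 110.0, 120.0]},
    name="tas",
    attrs={
        "standard_name": "air_temperature",  # CF standard name
        "units": "K",                        # UDUNITS-compatible
        "cell_methods": "time: mean",        # describes how data was aggregated
    },
)
temp.lat.attrs.update(standard_name="latitude", units="degrees_north")
temp.lon.attrs.update(standard_name="longitude", units="degrees_east")
```

Tools such as CDO and cf-checkers read these attributes, so setting them at creation time is what keeps a file CF-compliant.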

How do I inspect a NetCDF file quickly?

Use ncdump -h file.nc to view dimensions, variables, and attributes without dumping data. Add -c to include coordinate variable values, or -t to convert time values to human-readable date strings. For a quick overview in Python, use xr.open_dataset("file.nc") and print the dataset object to see a structured summary of all dimensions, coordinates, variables, and attributes.
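The Python route can be sketched as follows; an in-memory dataset stands in for a real file opened with xr.open_dataset, and printing it yields the same kind of header summary that ncdump -h gives:

```python
import numpy as np
import xarray as xr

# Stand-in for ds = xr.open_dataset("file.nc"); the repr shows
# dimensions, coordinates, variables, and attributes without
# printing the data values themselves.
ds = xr.Dataset(
    {"temp": (("time", "lat"), np.zeros((4, 2)))},
    coords={"time": np.arange(4), "lat": [0.0, 1.0]},
    attrs={"title": "example"},
)
summary = repr(ds)  # the same text print(ds) would show
```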

How do I compute time averages from NetCDF data?

With CDO: cdo timmean in.nc out.nc for overall time mean, cdo yearmean for annual averages, cdo monmean for monthly. With xarray in Python: ds.mean(dim="time") for time average, ds.groupby("time.month").mean() for monthly climatology, ds.resample(time="Y").mean() for annual resampling. Both approaches handle CF time coordinates automatically.
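The xarray side of this can be sketched with a synthetic daily series (each value is its month number, so the climatology is easy to verify by eye):

```python
import numpy as np
import pandas as pd
import xarray as xr

# One year of synthetic daily data; value = calendar month of the timestamp.
time = pd.date_range("2000-01-01", "2000-12-31", freq="D")
da = xr.DataArray(time.month.astype(float), dims="time", coords={"time": time})

overall_mean = da.mean(dim="time")             # single time average
climatology = da.groupby("time.month").mean()  # 12 values, one per month
annual = da.resample(time="YS").mean()         # one value per year
```

Because every March value is 3, the March entry of the climatology comes out as exactly 3.0, which makes the groupby behavior easy to sanity-check.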

How do I extract a spatial or variable subset?

With CDO: cdo sellonlatbox,lon1,lon2,lat1,lat2 in.nc out.nc for spatial subsetting. With NCO: ncks -v temp file.nc out.nc to extract a single variable, ncks -d time,0,11 for time dimension slicing. With xarray: ds.sel(lat=slice(30,40), lon=slice(120,130)) for coordinate-based selection, supporting nearest-neighbor lookup with method="nearest".
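A sketch of the xarray selections on a synthetic 10-degree grid (grid spacing and values are made up for illustration):

```python
import numpy as np
import xarray as xr

# Synthetic regular grid standing in for a real gridded file.
lat = np.arange(0.0, 90.0, 10.0)
lon = np.arange(0.0, 180.0, 10.0)
ds = xr.Dataset(
    {"temp": (("lat", "lon"), np.zeros((lat.size, lon.size)))},
    coords={"lat": lat, "lon": lon},
)

box = ds.sel(lat=slice(30, 40), lon=slice(120, 130))     # label-based box, endpoints inclusive
nearest = ds.sel(lat=33.0, lon=121.0, method="nearest")  # snaps to the closest grid point
```

Note that label-based slices are inclusive of both endpoints, unlike Python's positional slicing, and that slice direction must match the coordinate's sort order.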

How do I handle very large NetCDF files that do not fit in memory?

Use xarray with Dask by specifying chunks when opening: ds = xr.open_dataset("large.nc", chunks={"time": 100, "lat": 50, "lon": 50}). This enables lazy evaluation where computations are deferred until .compute() is called, processing data in manageable chunks without loading the entire file into memory. For multiple files, xr.open_mfdataset("data_*.nc", parallel=True) combines files with Dask parallelism.
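The lazy-evaluation pattern can be sketched as below, assuming dask is installed; a small in-memory array stands in for the large file you would open with xr.open_dataset("large.nc", chunks={...}):

```python
import numpy as np
import xarray as xr

# Chunking an existing dataset mirrors what chunks= does at open time:
# the data becomes a dask array split into blocks of 25 time steps.
ds = xr.Dataset(
    {"temp": (("time", "lat"), np.ones((100, 10)))},
).chunk({"time": 25})

lazy_mean = ds["temp"].mean(dim="time")  # builds a task graph; no work happens yet
result = lazy_mean.compute()             # triggers chunk-by-chunk computation
```

Until .compute() (or .load(), .plot(), .to_netcdf(), etc.) is called, only the task graph exists, which is what lets the same code run on files far larger than memory.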

Can I access NetCDF data remotely without downloading the file?

Yes. OPeNDAP allows you to open a remote NetCDF file by URL: ds = xr.open_dataset("https://thredds.server.org/path/file.nc"). Only the requested subset is transferred over the network. THREDDS Data Server provides OPeNDAP, HTTP, and WMS access with subsetting and aggregation services. ERDDAP offers a RESTful API that can output data in CSV, JSON, or NetCDF format from a single URL query.