How to identify time, lon, and lat coordinates in xarray?
What is the best way to determine which coordinates of an xarray DataArray object contain the time, longitude, and latitude?
A DataArray might look like this:
<xarray.Dataset>
Dimensions:    (ensemble: 9, lat: 224, lon: 464, time: 12054)
Coordinates:
  * lat        (lat) float64 25.06 25.19 25.31 25.44 ... 52.56 52.69 52.81 52.94
  * lon        (lon) float64 -124.9 -124.8 -124.7 ... -67.31 -67.19 -67.06
  * time       (time) datetime64[ns] 1980-01-01 1980-01-02 ... 2012-12-31
Dimensions without coordinates: ensemble
Data variables:
    elevation  (lat, lon) float64 dask.array<shape=(224, 464), chunksize=(224, 464)>
    temp       (ensemble, time, lat, lon) float64 dask.array<shape=(9, 12054, 224, 464), chunksize=(1, 287, 224, 464)>
One approach could be to loop through the variables listed in a variable's coords, like temp.coords, looking for standard_name attributes such as latitude. But many datasets don't seem to include standard_name attributes for all variables.
I guess another approach would be to search over the units attributes and try to identify whether they have appropriate values.
Is there a better way?
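For reference, the attribute search described in the question could be sketched roughly as follows. This is only a minimal sketch: the helper name guess_axis and the lookup tables are assumptions made for illustration, loosely following the CF conventions, and they are not part of xarray.

```python
from typing import Mapping, Optional

# Assumed lookup tables, loosely based on the CF conventions; extend as needed.
STANDARD_NAMES = {"longitude": "X", "latitude": "Y", "time": "T"}
LON_UNITS = {"degrees_east", "degree_east", "degree_e", "degrees_e", "degreee", "degreese"}
LAT_UNITS = {"degrees_north", "degree_north", "degree_n", "degrees_n", "degreen", "degreesn"}


def guess_axis(attrs: Mapping[str, object]) -> Optional[str]:
    """Return 'X', 'Y', 'T', or None based on a coordinate's attributes."""
    # 1. An explicit axis attribute is the most direct hint.
    axis = str(attrs.get("axis", "")).upper()
    if axis in {"X", "Y", "T"}:
        return axis
    # 2. Fall back to standard_name.
    standard_name = str(attrs.get("standard_name", "")).lower()
    if standard_name in STANDARD_NAMES:
        return STANDARD_NAMES[standard_name]
    # 3. Fall back to units (compared in lowercase).
    units = str(attrs.get("units", "")).lower()
    if units in LON_UNITS:
        return "X"
    if units in LAT_UNITS:
        return "Y"
    # CF time units look like "days since 1980-01-01".
    if " since " in units:
        return "T"
    return None
```

With xarray this could be applied as {name: guess_axis(coord.attrs) for name, coord in temp.coords.items()}. Note that when xarray decodes times on load, the time units are moved out of attrs, so for a datetime64 coordinate it may be more reliable to check the dtype instead.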
The MetPy package includes some helpers for systematic coordinate identification like this. You can see the basics of how this works in the xarray with MetPy tutorial. For example, if you want the time coordinate of a DataArray called temp (assuming it came from a dataset that has been parsed by MetPy), you would simply call:
    temp.metpy.time
This is done internally by parsing the coordinate metadata according to the CF conventions.
Here's a short example:
    import xarray as xr
    import metpy.calc as mpcalc

    ds = xr.tutorial.load_dataset('air_temperature')
    ds = ds.metpy.parse_cf()
    x, y, t = ds['air'].metpy.coordinates('x', 'y', 'time')
    print([coord.name for coord in (x, y, t)])
['lon', 'lat', 'time']
Xarray distinguishes between data variables and coordinates, but the distinction is mostly semantic and you don't have to follow it. I see it like this: a data variable is the data of interest, and a coordinate is a label to describe the data of interest. For example, latitude, longitude and time are coordinates, while temperature is a data variable.
You can probably do something similar to the code below (written for netCDF4) with xarray's filter_by_attrs:
    def x_axis(nc):
        # nc is an open netCDF4.Dataset
        xnames = ['longitude', 'grid_longitude', 'projection_x_coordinate']
        # Lowercase, because attribute values are lowercased before comparison
        xunits = [
            'degrees_east',
            'degree_east',
            'degree_e',
            'degrees_e',
            'degreee',
            'degreese',
        ]
        xvars = list(set(
            nc.get_variables_by_attributes(
                axis=lambda x: x and str(x).lower() == 'x'
            ) +
            nc.get_variables_by_attributes(
                standard_name=lambda x: x and str(x).lower() in xnames
            ) +
            nc.get_variables_by_attributes(
                units=lambda x: x and str(x).lower() in xunits
            )
        ))
        return xvars
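An xarray version of the same idea might look roughly like the sketch below, using Dataset.filter_by_attrs. The synthetic dataset and the helper name x_axis_names are made up for illustration. Keyword arguments passed to filter_by_attrs in a single call must all match, so the sketch runs one query per attribute and takes the union.

```python
import numpy as np
import xarray as xr

# Small synthetic dataset (made up for illustration) with CF-style attributes.
ds = xr.Dataset(
    data_vars={"temp": (("time", "lat", "lon"), np.zeros((2, 3, 4)))},
    coords={
        "lon": ("lon", [-124.9, -124.8, -124.7, -124.6], {"units": "degrees_east"}),
        "lat": ("lat", [25.06, 25.19, 25.31], {"standard_name": "latitude"}),
        "time": ("time", [0, 1], {"axis": "T"}),
    },
)

X_NAMES = {"longitude", "grid_longitude", "projection_x_coordinate"}
X_UNITS = {"degrees_east", "degree_east", "degree_e", "degrees_e", "degreee", "degreese"}


def x_axis_names(dataset):
    """Names of variables whose axis, standard_name, or units suggest an x axis."""
    # Each callable receives the attribute value, or None if the attribute is missing.
    queries = [
        {"axis": lambda v: v is not None and str(v).lower() == "x"},
        {"standard_name": lambda v: v is not None and str(v).lower() in X_NAMES},
        {"units": lambda v: v is not None and str(v).lower() in X_UNITS},
    ]
    names = set()
    for query in queries:
        # filter_by_attrs returns a new Dataset holding only the matching variables.
        names.update(dataset.filter_by_attrs(**query).variables)
    return sorted(names)


x_names = x_axis_names(ds)
print(x_names)
```

The same pattern should carry over to latitude and time by swapping in the corresponding name and unit tables.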
If you are looking for just the special coords that act as indexes, then you can iterate over ds.indexes and do some string parsing on their names. Something like:
    import xarray as xr

    ds = xr.tutorial.load_dataset('air_temperature')
    # Drop one standard_name to simulate incomplete metadata
    ds.lat.attrs.pop('standard_name')
    for k in ds.indexes.keys():
        v = ds[k]
        sn = v.attrs.get('standard_name')
        if not sn:
            # Guess the standard_name from the index name
            if 'lon' in k:
                v.attrs.update(standard_name='longitude')
                continue
            if 'lat' in k:
                v.attrs.update(standard_name='latitude')
                continue
            if 'time' in k or k in ['day', 't', 'month', 'year']:
                v.attrs.update(standard_name='time')
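One caveat with plain substring tests such as 'lat' in k is that they also match unrelated names like platform. A slightly stricter variant could match whole tokens instead. The sketch below is only an illustration: the helper name guess_standard_name and the alias table are assumptions, not an established convention.

```python
import re
from typing import Optional

# Assumed alias table; adjust to the naming habits of your own datasets.
ALIASES = {
    "longitude": {"lon", "longitude"},
    "latitude": {"lat", "latitude"},
    "time": {"time", "t", "day", "month", "year"},
}


def guess_standard_name(name: str) -> Optional[str]:
    """Map a coordinate name to a standard_name by whole-token match, else None."""
    # Split on anything that is not a lowercase letter or digit (including "_"),
    # so "lat_0" yields the token "lat" but "platform" stays a single token.
    tokens = set(re.split(r"[^a-z0-9]+", name.lower()))
    for standard_name, aliases in ALIASES.items():
        if tokens & aliases:
            return standard_name
    return None
```

Inside the loop over ds.indexes, the result could then be written back with v.attrs.update(standard_name=...) whenever it is not None.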
- I just loop and look for lon. Is there a convention outlined somewhere? day/etc. are also tough. Some common climate data library with climate_toolz.standardize_dims would be great.
- When I first tried installing metpy into my conda environment, it wanted to upgrade and downgrade a bunch of stuff. For this coordinate identification it was sufficient to use conda install -c conda-forge metpy pint pooch --no-deps.
- I like this idea. Could standard_name also be provided by an Intake Catalog?