1

I have looked a lot at the xarray documentation of the interp function and I cannot really make sense of it. I see it is a reprojection but it doesn't really fit a real case example. Is their someone that could make sense of it for example by reprojecting this dataset on a webmercator datum?

Something like the example:

import xarray as xr
from pyproj import Transformer

ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
fig, axes = plt.subplots(ncols=2, figsize=(10, 4))
lon, lat = np.meshgrid(ds.lon, ds.lat)
shp = lon.shape
# reproject the grid
gcs_to_3857 = Transformer.from_crs(4326, 3857, always_xy=True)
x, y = gcs_to_3857.transform(lon.ravel(), lat.ravel())
# future index for a regular raster
X= np.linspace(x.min(), x.max(), shp[1])
Y= np.linspace(y.min(), y.max(), shp[0])   
data["x"] = xr.DataArray(np.reshape(x, shp), dims=("lat", "lon"))
data["y"] = xr.DataArray(np.reshape(y, shp), dims=("lat", "lon"))

And here, I am stuck

Should be something like ds.interp(x=X,y=Y) but the array is indexed on lat lon

It is all a bit confusing to me...

1
  • 2
    would using rioxarray's reproject work for you? Commented Feb 7, 2022 at 8:35

2 Answers 2

6

You could also use reproject from rioxarray as suggested. Here is the code:

import xarray as xr
import numpy as np
from rasterio.enums import Resampling
import matplotlib.pyplot as plt

ds = xr.tutorial.open_dataset('air_temperature').isel(time=0)
ds = ds.rio.write_crs('EPSG:4326')
dst = ds.rio.reproject('EPSG:3857', shape=(250, 250), resampling=Resampling.bilinear, nodata=np.nan)

fig, ax = plt.subplots(1, 2, figsize=(14, 5))
ds.air.plot(ax=ax[0])
dst.air.plot(ax=ax[1])

enter image description here

Sign up to request clarification or add additional context in comments.

3 Comments

This is an excellent use of rasterio and simple at that. I have read their doc and I could not figure out how to do this basic reprojection. Another snippet they should add to their documentation! Thanks a lot !
I have tested your code and it doesn't work for me AttributeError: 'Dataset' object has no attribute 'rio'
OK, I needed to update to a more recent xarray version
3

I think the idea is something like this:

In [1]: import xarray as xr
   ...: import numpy as np
   ...: from pyproj import Transformer
   ...:
   ...: ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)

design a target grid in transformed space:

In [2]: # find the new bounds in mercator space
   ...: gcs_to_3857 = Transformer.from_crs(4326, 3857, always_xy=True)
   ...: x, y = gcs_to_3857.transform(
   ...:     [ds.lon.min(), ds.lon.max()],
   ...:     [ds.lat.min(), ds.lat.max()],
   ...: )

In [3]: # design a target grid for the re-projected data - can be any dims you want
   ...: X = np.linspace(x[0], x[1], 500)
   ...: Y = np.linspace(y[0], y[1], 600)
   ...: XX, YY = np.meshgrid(X, Y)

Transform this grid back into lat/lon

In [4]: # build a reverse transformer from Mercator back to lat/lons
   ...: merc_to_latlng = Transformer.from_crs(3857, 4326, always_xy=True)
   ...: new_lons, new_lats = merc_to_latlng.transform(XX.ravel(), YY.ravel())
   ...: new_lons = new_lons.reshape(XX.shape)
   ...: new_lats = new_lats.reshape(YY.shape)

Create new DataArrays to index the lat/lon values corresponding to the grid points on the target grid (indexed by x and y in Mercator space):

In [5]: # create arrays indexed by (x, y); also convert lons back to (0, 360)
   ...: new_lons_da = xr.DataArray((new_lons % 360), dims=["y", "x"], coords=[Y, X])
   ...: new_lats_da = xr.DataArray(new_lats, dims=["y", "x"], coords=[Y, X])

Use xarray's advanced indexing to interpolate the data to the new points while re-indexing onto the new grid

In [6]: ds_mercator = ds.interp(lon=new_lons_da, lat=new_lats_da, method="linear")

Now the data is indexed by x and y, with points equally spaced in Mercator space:

In [7]: ds_mercator
Out[7]:
<xarray.Dataset>
Dimensions:  (y: 600, x: 500)
Coordinates:
    time     datetime64[ns] 2013-01-01
    lon      (y, x) float64 200.0 200.3 200.5 200.8 ... 329.2 329.5 329.7 330.0
    lat      (y, x) float64 15.0 15.0 15.0 15.0 15.0 ... 75.0 75.0 75.0 75.0
  * y        (y) float64 1.689e+06 1.708e+06 1.727e+06 ... 1.291e+07 1.293e+07
  * x        (x) float64 -1.781e+07 -1.778e+07 ... -3.369e+06 -3.34e+06
Data variables:
    air      (y, x) float64 nan nan nan nan nan ... 237.3 237.6 238.0 238.3 nan
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

The new projection can be seen in the axes (and distortion) of the transformed (right) as compared to the original (left) datasets:

In [8]: fig, axes = plt.subplots(1, 2, figsize=(14, 5))
   ...: ds.air.plot(ax=axes[0])
   ...: ds_mercator.air.plot(ax=axes[1])
Out[8]: <matplotlib.collections.QuadMesh at 0x2b3b94be0

Air temperature in PlateCarree (left) and reprojected to Mercator (right)

1 Comment

This is brilliant, you should get that into the docs of xarray, this is so useful and a real-case example!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.