Salem

Salem is a cat. Salem is also a small library to do geoscientific data processing and plotting. It extends xarray to add geolocalised subsetting, masking, and plotting operations to xarray’s DataArray and DataSet structures.

Warning

Salem is at an early development stage and is more a proof of feasibility than a library for “everything you’ll ever have to do with geoscientific data”. There are plenty of more mature tools out there (see What others tools should I know about?). Salem basically reinvents the wheel, but in the way I want the wheel to be. Some might still find it useful: as an addition to xarray, or if you use the WRF model for example.

Documentation

Examples

Subsetting and selecting data

Let’s open a WRF model output file:

In [1]: import salem

In [2]: from salem.utils import get_demo_file

In [3]: ds = salem.open_xr_dataset(get_demo_file('wrfout_d01.nc'))

Let’s take a time slice of the variable T2 for a start:

In [4]: t2 = ds.T2.isel(Time=2)

In [5]: t2.salem.quick_map()
Out[5]: <salem.graphics.Map at 0x7f8c8c7c25d0>
_images/plot_wrf_t2.png

Although we are on a Lambert Conformal projection, it’s possible to subset the file using longitudes and latitudes:

In [6]: t2_sub = t2.salem.subset(corners=((77., 20.), (97., 35.)), crs=salem.wgs84)

In [7]: t2_sub.salem.quick_map()
Out[7]: <salem.graphics.Map at 0x7f8c8d675fd0>
_images/plot_wrf_t2_corner_sub.png

It’s also possible to use geometries or shapefiles to subset your data:

In [8]: shdf = salem.read_shapefile(get_demo_file('world_borders.shp'))

In [9]: shdf = shdf.loc[shdf['CNTRY_NAME'].isin(['Nepal', 'Bhutan'])]  # GeoPandas' GeoDataFrame

In [10]: t2_sub = t2_sub.salem.subset(shape=shdf, margin=2)  # add 2 grid points

In [11]: t2_sub.salem.quick_map()
Out[11]: <salem.graphics.Map at 0x7f8c8d64fa90>
_images/plot_wrf_t2_country_sub.png

Based on the same principle, one can mask out the useless grid points:

In [12]: t2_roi = t2_sub.salem.roi(shape=shdf)

In [13]: t2_roi.salem.quick_map()
Out[13]: <salem.graphics.Map at 0x7f8c8cdfefd0>
_images/plot_wrf_t2_roi.png

Plotting

Maps can be pimped with topographical shading, points of interest, and more:

In [14]: smap = t2_roi.salem.get_map(data=t2_roi-273.15, cmap='RdYlBu_r', vmin=-14, vmax=18)

In [15]: _ = smap.set_topography(get_demo_file('himalaya.tif'))

In [16]: smap.set_shapefile(shape=shdf, color='grey', linewidth=3)

In [17]: smap.set_points(91.1, 29.6)

In [18]: smap.set_text(91.2, 29.7, 'Lhasa', fontsize=17)

In [19]: smap.visualize()
_images/plot_wrf_t2_topo.png

Maps are persistent, which is useful when you have many plots to do. Plotting further data on them is possible, as long as the geolocalisation information is shipped with the data (in that case, the DataArray’s attributes are lost in the conversion from Kelvins to degrees Celsius so we have to set it explicitly):

In [20]: smap.set_data(ds.T2.isel(Time=1)-273.15, crs=ds.salem.grid)

In [21]: smap.visualize(title='2m temp - large domain', cbar_title='C')
_images/plot_wrf_t2_transform.png

Reprojecting data

Salem can also transform data from one grid to another:

In [22]: dse = salem.open_xr_dataset(get_demo_file('era_interim_tibet.nc'))

In [23]: t2_era_reproj = ds.salem.transform(dse.t2m)

In [24]: assert t2_era_reproj.salem.grid == ds.salem.grid

In [25]: t2_era_reproj.isel(time=0).salem.quick_map()
Out[25]: <salem.graphics.Map at 0x7f8c8d5c28d0>
_images/plot_era_repr_nn.png
In [26]: t2_era_reproj = ds.salem.transform(dse.t2m, interp='spline')

In [27]: t2_era_reproj.isel(time=0).salem.quick_map()
Out[27]: <salem.graphics.Map at 0x7f8c8d5c2350>
_images/plot_era_repr_spline.png

Frequently Asked Questions

Is your library mature for production code?

No. The API is not always as clever as I wish it would, and it will probably change in the future. It is quite well tested though, at least for the cases I encounter in my daily work.

What others tools should I know about?

If you want to plot on maps, cartopy is probably one of the best tools you could pick. For reprojection workflows on large or numerous gridded files you probably want to use rasterio.

The python atmopsheric sciences community is a bit spread between iris and xarray for N-Dimensional data handling (I picked the later for it’s strong interaction with pandas). Several great libraries are available to meteorologists and climatologists, for example MetPy, windspharm, xgcm, and all the ones I forgot to mention.

But then, why developing Salem?

As an atmospheric scientist, I hate to have to take care about projections and maps. Salem was created to hide all these concerns. By the time I started, it seemed a good idea to provide map transformation tools without depending on GDAL (thanks to conda-forge GDAL is now much easier to install). It is still possible to do reprojection work in Salem using scipy and pyproj alone. It is al done in python and on memory, so don’t expect miracles on that side.

Furthermore, I use the atmospheric model WRF quite often in my work. Its output files are absolutely not compliant with the CF conventions. To my knowledge, there is no tool to plot and manipulate WRF data with Python, and Salem will be further developed with this model in mind.

What’s this ”.salem_cache” directory in my home folder?

At the first import, Salem will create a hidden directory called .salem_cache in your home folder. It will be used to download Salem’s demo files and standard shapefiles. This directory is also used by joblib to store the result of slow operations such as reading and transforming shapefiles, or downloading google maps from the internet. The cache should not become too large, but if it does: simply delete it.

What’s New

v0.2.0 (08 November 2016)

Salem is now released under a 3-clause BSD license.

Enhancements
  • New wrf_zlevel() and wrf_plevel() functions for vertical interpolation
  • Salem can now plot on cartopy’s maps thanks to a new proj_to_cartopy() function.
  • Doc improvements
  • New diagnostic variable: ‘WS’

v0.1.1 (27 October 2016)

Enhancements
  • Some doc improvements
  • New ds keyword to the accessors subset() and roi() methods
Bug fixes
  • Natural Earth file lr (low-res) now shipped with sample-data, mr and hr can still be downloaded if needed
  • Remove use to deprecated rasterio.drivers()
  • GeoNetCDF files without time variable should now open without error

v0.1.0 (22 October 2016)

Big refactoring (PR15), partly backwards incompatible (mostly renaming). Improved xarray accessors, WRF tools, merged Cleo into the codebase, added a first draft of documentation.

v0.0.9 (22 October 2016)

Initial release.

Installation

Salem is a pure python package, but it has several non-trivial dependencies.

Required dependencies

  • Python 2.7 or 3+ (Py3 recommended)
  • numpy (of course)
  • scipy: for its interpolation tools, among other things
  • pyproj: for map projections transformations
  • netCDF4: to read most geoscientific files
  • joblib: for it’s Memory class
  • six: for Py2 compatibility

Optional dependencies

Because not using them is a bad idea
  • pandas: working with labeled data
  • xarray (0.8 or later): pandas in N-dimensions
For vector and raster operations

(rasterio and geopandas both require GDAL)

For plotting

Instructions

The very best (unique?) way to install Salem without too much hassle is to install its dependencies with conda and conda-forge:

conda config --add channels conda-forge
conda install <package-name>

Currently, Salem can only be installed via pip:

pip install salem

If you want to install the latest master:

pip install git+https://github.com/fmaussion/salem.git

Warning

At the first import, Salem will create a hidden directory called .salem_cache in your home folder. It will be used to download Salem’s demo files and standard shapefiles. This directory is also used by joblib to store the result of slow operations such as reading and transforming shapefiles, or downloading google maps from the internet. The cache should not become too large, but if it does: simply delete it.

xarray accessors

One of the main purposes of Salem is to add georeferencing tools to xarray ‘s data structures. These tools can be accessed via a special .salem attribute, available for both xarray.DataArray and xarray.Dataset objects after a simple import salem in your code.

Initializing the accessor

Automated projection parsing

Salem will try to understand in which projection the Dataset or DataArray is defined. For example, with a platte carree (or lon/lat) projection, Salem will know what to do based on the coordinates’ names:

In [1]: import numpy as np

In [2]: import xarray as xr

In [3]: import salem

In [4]: da = xr.DataArray(np.arange(20).reshape(4, 5), dims=['lat', 'lon'],
   ...:                   coords={'lat':np.linspace(0, 30, 4), 'lon':np.linspace(-20, 20, 5)})
   ...: 

In [5]: da.salem
Out[5]: <salem.sio.DataArrayAccessor at 0x7f8c8474df50>

In [6]: da.salem.quick_map();
_images/plot_xarray_simple.png

While the above should work with many (most?) climate datasets (such as atmospheric reanalyses or GCM output), certain NetCDF files will have a less standard map projection requiring special parsing. There are conventions to formalise these things in the NetCDF model, but Salem doesn’t understand them yet. Currently, Salem can parse: - platte carree (or lon/lat) projections - WRF projections (see WRF tools) - virually any projection explicitly provided by the user - for geotiff files only: any projection that rasterio can understand

From geotiffs

Salem uses rasterio to open and parse geotiff files:

In [7]: fpath = salem.get_demo_file('himalaya.tif')

In [8]: ds = salem.open_xr_dataset(fpath)

In [9]: hmap = ds.salem.get_map(cmap='topo')

In [10]: hmap.set_data(ds['data'])

In [11]: hmap.visualize();
_images/plot_xarray_geotiff.png
Custom projections

Alternatively, Salem will understand any projection supported by pyproj. The proj info has to be provided as attribute:

In [12]: dutm = xr.DataArray(np.arange(20).reshape(4, 5), dims=['y', 'x'],
   ....:                     coords={'y': np.arange(3, 7)*2e5,
   ....:                             'x': np.arange(1, 6)*2e5})
   ....: 

In [13]: psrs = 'epsg:32630'  # http://spatialreference.org/ref/epsg/wgs-84-utm-zone-30n/

In [14]: dutm.attrs['pyproj_srs'] = psrs

In [15]: dutm.salem.quick_map(interp='linear');
_images/plot_xarray_utm.png

Using the accessor

The accessor’s methods are available trough the .salem attribute.

Keeping track of attributes

Some datasets carry their georeferencing information in global attributes (WRF model output files for example). This makes it possible for Salem to determine the data’s map projection. From the variables alone, however, this is not possible. This is the reason why it is recommended to use the open_xr_dataset() and open_wrf_dataset() function, which add an attribute to the variables automatically:

In [16]: dsw = salem.open_xr_dataset(salem.get_demo_file('wrfout_d01.nc'))

In [17]: dsw.T2.pyproj_srs
Out[17]: '+units=m +proj=lcc +lat_1=29.0400009155 +lat_2=29.0400009155 +lat_0=29.0399971008 +lon_0=89.8000030518 +x_0=0 +y_0=0 +a=6370000 +b=6370000'

Unfortunately, the DataArray attributes are lost when doing operations on them. It is the task of the user to keep track of this attribute:

In [18]: dsw.T2.mean(dim='Time', keep_attrs=True).salem  # triggers an error without keep_attrs
Out[18]: <salem.sio.DataArrayAccessor at 0x7f8c8464e6d0>
Reprojecting data

You can reproject a Dataset onto another one with the transform() function:

In [19]: dse = salem.open_xr_dataset(salem.get_demo_file('era_interim_tibet.nc'))

In [20]: dsr = ds.salem.transform(dse)

In [21]: dsr
Out[21]: 
<xarray.Dataset>
Dimensions:  (time: 4, x: 1875, y: 877)
Coordinates:
  * y        (y) float64 31.54 31.53 31.52 31.52 31.51 31.5 31.49 31.48 ...
  * x        (x) float64 78.32 78.33 78.33 78.34 78.35 78.36 78.37 78.38 ...
  * time     (time) datetime64[ns] 2012-06-01 2012-06-01T06:00:00 ...
Data variables:
    t2m      (time, y, x) float64 278.6 278.6 278.6 278.6 278.6 278.6 278.6 ...
Attributes:
    pyproj_srs: +units=m +init=epsg:4326 

In [22]: dsr.t2m.mean(dim='time').salem.quick_map();
_images/plot_xarray_transfo.png

Currently, salem implements, the neirest neighbor (default), linear, and spline interpolation methods:

In [23]: dsr = ds.salem.transform(dse, interp='spline')

In [24]: dsr.t2m.mean(dim='time').salem.quick_map();
_images/plot_xarray_transfo_spline.png
Subsetting data
In [25]: shdf = salem.read_shapefile(salem.get_demo_file('world_borders.shp'))

In [26]: shdf = shdf.loc[shdf['CNTRY_NAME'] == 'Nepal']

In [27]: dsr = dsr.salem.subset(shape=shdf, margin=10)

In [28]: dsr.t2m.mean(dim='time').salem.quick_map();
_images/plot_xarray_subset_out.png
Regions of interest
In [29]: dsr = dsr.salem.roi(shape=shdf)

In [30]: dsr.t2m.mean(dim='time').salem.quick_map();
_images/plot_xarray_roi_out.png

Graphics

Two options are offered to you when plotting geolocalised data on maps: you can use cartopy , or you can use salem’s Map object.

Plotting with cartopy

Plotting on maps using xarray and cartopy is extremely convenient.

With salem you can keep your usual plotting workflow, even with more exotic map projections:

In [1]: import matplotlib.pyplot as plt

In [2]: import cartopy

In [3]: from salem import open_wrf_dataset, get_demo_file

In [4]: ds = open_wrf_dataset(get_demo_file('wrfout_d01.nc'))

In [5]: ax = plt.axes(projection=cartopy.crs.Orthographic(70, 30))

In [6]: ax.set_global();

In [7]: ds.T2C.isel(time=1).plot.contourf(ax=ax, transform=ds.salem.cartopy());

In [8]: ax.coastlines();
_images/cartopy_orthographic.png

You can also use the salem accessor to initialise the plot’s map projection:

In [9]: proj = ds.salem.cartopy()

In [10]: ax = plt.axes(projection=proj)

In [11]: ax.coastlines();

In [12]: ax.add_feature(cartopy.feature.BORDERS, linestyle=':');

In [13]: ax.set_extent(ds.salem.grid.extent, crs=proj);
_images/cartopy_base.png

Plotting with salem

Salem comes with a homegrown plotting tool. It is less flexible than cartopy, but it was created to overcome some of cartopy’s limitations (e.g. the impossibility to add tick labels to lambert conformal maps), and to make nice looking regional maps:

In [14]: ds.T2C.isel(time=1).salem.quick_map()
Out[14]: <salem.graphics.Map at 0x7f8c8693cc50>
_images/salem_quickmap.png

Salem maps are different from cartopy’s in that they don’t change the matplotlib axes’ projection. The map background is always going to be a call to imshow(), with an image size decided at instanciation:

In [15]: from salem import mercator_grid, Map, open_xr_dataset

In [16]: grid = mercator_grid(center_ll=(10.76, 46.79), extent=(9e5, 4e5))

In [17]: grid.nx, grid.ny  # size of the input grid
Out[17]: (1350, 600)

In [18]: smap = Map(grid, nx=500)

In [19]: smap.grid.nx, smap.grid.ny  # size of the "image", and thus of the axes
Out[19]: (500, 222)

In [20]: smap.visualize(addcbar=False)
_images/map_central_europe.png

The map has it’s own grid, wich is used internally in order to transform the data that has to be plotted on it:

In [21]: ds = open_xr_dataset(get_demo_file('histalp_avg_1961-1990.nc'))

In [22]: smap.set_data(ds.prcp)  # histalp is a lon/lat dataset

In [23]: smap.visualize()
_images/map_histalp.png

Refer to Recipes for more examples on how to use salem’s maps.

WRF tools

Let’s open a WRF model output file with xarray first:

In [1]: import xarray as xr

In [2]: from salem.utils import get_demo_file

In [3]: ds = xr.open_dataset(get_demo_file('wrfout_d01.nc'))

WRF files are a bit messy:

In [4]: ds
Out[4]: 
<xarray.Dataset>
Dimensions:           (Time: 3, bottom_top: 27, bottom_top_stag: 28, south_north: 150, south_north_stag: 151, west_east: 150, west_east_stag: 151)
Coordinates:
    XLONG             (Time, south_north, west_east) float32 70.7231 70.9749 ...
    XLONG_U           (Time, south_north, west_east_stag) float32 70.5974 ...
    XLAT_U            (Time, south_north, west_east_stag) float32 7.76865 ...
    XLAT_V            (Time, south_north_stag, west_east) float32 7.66444 ...
    XLONG_V           (Time, south_north_stag, west_east) float32 70.7436 ...
    XLAT              (Time, south_north, west_east) float32 7.78907 7.82944 ...
  * Time              (Time) int64 0 1 2
  * south_north       (south_north) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...
  * west_east         (west_east) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...
  * bottom_top        (bottom_top) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...
  * west_east_stag    (west_east_stag) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 ...
  * south_north_stag  (south_north_stag) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 ...
  * bottom_top_stag   (bottom_top_stag) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 ...
Data variables:
    Times             (Time) |S19 '2008-10-26_12:00:00' ...
    T2                (Time, south_north, west_east) float32 301.977 302.008 ...
    RAINC             (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...
    RAINNC            (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...
    U                 (Time, bottom_top, south_north, west_east_stag) float32 1.59375 ...
    V                 (Time, bottom_top, south_north_stag, west_east) float32 -2.07031 ...
    PH                (Time, bottom_top_stag, south_north, west_east) float32 0.0 ...
    PHB               (Time, bottom_top_stag, south_north, west_east) float32 0.0 ...
Attributes:
    note: Global attrs removed.

WRF files aren’t exactly CF compliant: you’ll need a special parser for the timestamp, the coordinate names are a bit exotic and do not correspond to the dimension names, they contain so-called staggered variables (and their correponding coordinates), etc.

Salem defines a special parser for these files:

In [5]: import salem

In [6]: ds = salem.open_wrf_dataset(get_demo_file('wrfout_d01.nc'))

This parser greatly simplifies the file structure:

In [7]: ds
Out[7]: 
<xarray.Dataset>
Dimensions:       (bottom_top: 27, south_north: 150, time: 3, west_east: 150)
Coordinates:
    lon           (south_north, west_east) float32 70.7231 70.9749 71.2267 ...
    lat           (south_north, west_east) float32 7.78907 7.82944 7.86932 ...
  * time          (time) datetime64[ns] 2008-10-26T12:00:00 ...
  * south_north   (south_north) float64 -2.235e+06 -2.205e+06 -2.175e+06 ...
  * west_east     (west_east) float64 -2.235e+06 -2.205e+06 -2.175e+06 ...
  * bottom_top    (bottom_top) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ...
Data variables:
    T2            (time, south_north, west_east) float32 301.977 302.008 ...
    RAINC         (time, south_north, west_east) float32 0.0 0.0 0.0 0.0 0.0 ...
    RAINNC        (time, south_north, west_east) float32 0.0 0.0 0.0 0.0 0.0 ...
    U             (time, bottom_top, south_north, west_east) float32 1.73438 ...
    V             (time, bottom_top, south_north, west_east) float32 -2.125 ...
    PH            (time, bottom_top, south_north, west_east) float32 18.7031 ...
    PHB           (time, bottom_top, south_north, west_east) float32 277.504 ...
    T2C           (time, south_north, west_east) float32 28.8266 28.8578 ...
    PRCP          (time, south_north, west_east) float32 nan nan nan nan nan ...
    WS            (time, bottom_top, south_north, west_east) float32 2.74293 ...
    GEOPOTENTIAL  (time, bottom_top, south_north, west_east) float32 296.207 ...
    Z             (time, bottom_top, south_north, west_east) float32 30.1944 ...
    PRCP_NC       (time, south_north, west_east) float32 nan nan nan nan nan ...
    PRCP_C        (time, south_north, west_east) float32 nan nan nan nan nan ...
Attributes:
    note: Global attrs removed.
    pyproj_srs: +units=m +proj=lcc +lat_1=29.0400009155 +lat_2=29.0400009155 +lat_0=29.0399971008 +lon_0=89.8000030518 +x_0=0 +y_0=0 +a=6370000 +b=6370000

Note that some dimensions / coordinates have been renamed, new variables have been defined, and the staggered dimensions have disappeared.

Diagnostic variables

Salem adds a layer between xarray and the underlying NetCDF file. This layer computes new variables “on the fly” or, in the case of staggered variables, “unstaggers” them:

In [8]: ds.U
Out[8]: 
<xarray.DataArray 'U' (time: 3, bottom_top: 27, south_north: 150, west_east: 150)>
[1822500 values with dtype=float32]
Coordinates:
    lon          (south_north, west_east) float32 70.7231 70.9749 71.2267 ...
    lat          (south_north, west_east) float32 7.78907 7.82944 7.86932 ...
  * time         (time) datetime64[ns] 2008-10-26T12:00:00 ...
  * south_north  (south_north) float64 -2.235e+06 -2.205e+06 -2.175e+06 ...
  * west_east    (west_east) float64 -2.235e+06 -2.205e+06 -2.175e+06 ...
  * bottom_top   (bottom_top) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ...
Attributes:
    FieldType: 104
    MemoryOrder: XYZ
    description: x-wind component
    units: m s-1
    stagger: X
    pyproj_srs: +units=m +proj=lcc +lat_1=29.0400009155 +lat_2=29.0400009155 +lat_0=29.0399971008 +lon_0=89.8000030518 +x_0=0 +y_0=0 +a=6370000 +b=6370000

This computation is done only on demand (just like a normal NetCDF variable), this new layer is therefore relatively cheap.

In addition to unstaggering, Salem adds a number of “diagnostic” variables to the dataset. Some are convenient (like T2C, temperature in Celsius instead of Kelvins), but others are more complex (e.g. SLP for sea-level pressure, or PRCP which computes step-wize total precipitation out of the accumulated fields). For a list of diagnostic variables (and TODOs!), refer to GH18.

In [9]: ds.PRCP.isel(time=-1).salem.quick_map(cmap='Purples', vmax=5)
Out[9]: <salem.graphics.Map at 0x7f8c84863910>
_images/plot_wrf_diag.png

Vertical interpolation

The WRF vertical coordinates are eta-levels, which is not a very practical coordinate for analysis or plotting. With the functions wrf_zlevel() and wrf_plevel() it is possible to interpolate the 3d data at either altitude or pressure levels:

In [10]: ws_h = ds.isel(time=1).salem.wrf_zlevel('WS', levels=10000.)

In [11]: ws_h.salem.quick_map();
_images/plot_wrf_zinterp.png

Note that currently, the interpolation is quite slow, see GH25. It’s probably wise to do it on single time slices or aggregated data, rather than huge data cubes.

Geogrid simulator

Salem provides a small tool which comes handy when defining new WRF domains. It parses the WPS configuration file (namelist.wps) and generates the grids and maps corresponding to each domain.

geogrid_simulator() will search for the &geogrid section of the file and parse it:

In [12]: fpath = get_demo_file('namelist_mercator.wps')

In [13]: with open(fpath, 'r') as f:  # this is just to show the file
   ....:     print(f.read())
   ....: 
&geogrid
 parent_id         =   1, 1, 2, 3,
 parent_grid_ratio =   1, 3, 5, 5,
 i_parent_start    =   1, 28, 54, 40,
 j_parent_start    =   1, 24, 97, 50,
 e_we              =  211, 178, 101, 136,
 e_sn              =  131, 220, 141, 136,
 dx = 30000,
 dy = 30000,
 map_proj = 'mercator',
 ref_lat   =  -8.0,
 ref_lon   =  -62.0,
 truelat1  =  -8.0,
/



In [14]: from salem import geogrid_simulator

In [15]: g, maps = geogrid_simulator(fpath)

In [16]: maps[0].set_rgb(natural_earth='lr')  # add a background image

In [17]: maps[0].visualize(title='Domains 1 to 4')
_images/plot_geo_simu.png

Georeferencing

A point on earth can be defined unambiguously in two ways:

DATUM (lon, lat, datum)
longitudes and latitudes are angular coordinates of a point on an ellipsoid (datum)
PROJ (x, y, projection)
x (eastings) and y (northings) are cartesian coordinates of a point in a map projection (the unit of x, y is usually meter)

Salem adds a third coordinate reference system (crs) to this list:

GRID (i, j, Grid)
on a structured grid, the (x, y) coordinates are distant of a constant (dx, dy) step. The (x, y) coordinates are therefore equivalent to a new reference frame (i, j) proportional to the projection’s (x, y) frame.

Transformations between datums and projections is handled by several tools in the python ecosystem, for example GDAL or the more lightweight pyproj, which is the tool Salem is using internally [1].

The concept of Grid added by Salem is useful when transforming data between two structured datasets, or from an unstructured dataset to a structured one.

[1]Most datasets nowadays are defined in the WGS 84 datum, therefore the concepts of datum and projection are often interchangeable: (lon, lat) coordinates are equivalent to cartesian (x, y) coordinates in the plate carree projection.

Grids

A Grid is defined by a projection, a reference point in this projection, a grid spacing and a number of grid points:

In [1]: import numpy as np

In [2]: import salem

In [3]: from salem import wgs84

In [4]: grid = salem.Grid(nxny=(3, 2), dxdy=(1, 1), ll_corner=(0.5, 0.5), proj=wgs84)

In [5]: x, y = grid.xy_coordinates

In [6]: x
Out[6]: 
array([[ 0.5,  1.5,  2.5],
       [ 0.5,  1.5,  2.5]])

In [7]: y
Out[7]: 
array([[ 0.5,  0.5,  0.5],
       [ 1.5,  1.5,  1.5]])

The default in Salem is to define the grids according to the pixels center point:

In [8]: smap = salem.Map(grid)

In [9]: smap.set_data(np.arange(6).reshape((2, 3)))

In [10]: lon, lat = grid.ll_coordinates

In [11]: smap.set_points(lon.flatten(), lat.flatten())

In [12]: smap.visualize(addcbar=False)
_images/plot_example_grid.png

But with the pixel_ref keyword you can use another convention. For Salem, the two conventions are identical:

In [13]: grid_c = salem.Grid(nxny=(3, 2), dxdy=(1, 1), ll_corner=(0, 0),
   ....:                     proj=wgs84, pixel_ref='corner')
   ....: 

In [14]: assert grid_c == grid

While it’s good to know how grids work, most of the time grids should be inferred directly from the data files:

In [15]: ds = salem.open_xr_dataset(salem.get_demo_file('himalaya.tif'))

In [16]: grid = ds.salem.grid

In [17]: grid.proj.srs
Out[17]: '+units=m +init=epsg:4326 '

In [18]: grid.extent
Out[18]: 
[78.312513473116013,
 93.937514288103202,
 24.237496572347762,
 31.545830286884001]

Grids come with several convenience functions, for example for transforming a point onto the grid coordinates:

In [19]: grid.transform(85, 27, crs=salem.wgs84)
Out[19]: (801.99834136842378, 544.9996059728029)

Or for reprojecting structured data (the xarray accessors transform() method calls map_gridded_data() internally):

In [20]: dse = salem.open_xr_dataset(salem.get_demo_file('era_interim_tibet.nc'))

In [21]: t2_era_reproj = ds.salem.transform(dse.t2m.isel(time=0))

In [22]: t2_era_reproj.salem.quick_map()
Out[22]: <salem.graphics.Map at 0x7f8c8c8ccb10>
_images/plot_reproj_grid.png

Recipes

Plotting

How to make nice looking plots.

Topographic shading

Add topographic shading to a plot

_images/sphx_glr_plot_topo_shading_001.png
from salem import mercator_grid, Map, get_demo_file
import matplotlib.pyplot as plt

# prepare the figure
f, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(8, 7))

# map extent
grid = mercator_grid(center_ll=(10.76, 46.79), extent=(18000, 14000))
sm = Map(grid, countries=False)
sm.set_lonlat_contours(interval=0)

# add topography
fpath = get_demo_file('hef_srtm.tif')
sm.set_topography(fpath)
sm.visualize(ax=ax1, addcbar=False, title='relief_factor=0.7 (default)')

# stronger shading
sm.set_topography(fpath, relief_factor=1.4)
sm.visualize(ax=ax2, addcbar=False, title='relief_factor=1.4')

# add color shading
z = sm.set_topography(fpath)
sm.set_data(z)
sm.set_cmap('topo')
sm.visualize(ax=ax3, title='Color with shading', addcbar=False)

# color without topo shading
sm.set_topography()
sm.visualize(ax=ax4, title='Color without shading', addcbar=False)

# make it nice
plt.tight_layout()
plt.show()

Total running time of the script: ( 0 minutes 1.394 seconds)

Generated by Sphinx-Gallery

Plot overlays

Add contours and wind arrows to a salem plot

_images/sphx_glr_plot_wind_overlay_001.png
import salem
import numpy as np
import matplotlib.pyplot as plt

# get the data at the latest time step
ds = salem.open_wrf_dataset(salem.get_demo_file('wrfout_d01.nc')).isel(time=-1)

# get the wind data at 10000 m a.s.l.
u = ds.salem.wrf_zlevel('U', 10000.)
v = ds.salem.wrf_zlevel('V', 10000.)
ws = ds.salem.wrf_zlevel('WS', 10000.)

# get the axes ready
f, ax = plt.subplots()

# plot the salem map background, make countries in grey
smap = ds.salem.get_map(countries=False)
smap.set_shapefile(countries=True, color='grey')
smap.plot(ax=ax)

# transform the coordinates to the map reference system and contour the data
xx, yy = smap.grid.transform(ws.west_east.values, ws.south_north.values,
                             crs=ws.salem.grid.proj)
cs = ax.contour(xx, yy, ws, cmap='viridis', levels=np.arange(0, 81, 10),
                linewidths=2)

# Quiver only every 7th grid point
u = u[4::7, 4::7]
v = v[4::7, 4::7]

# transform their coordinates to the map reference system and plot the arrows
xx, yy = smap.grid.transform(u.west_east.values, u.south_north.values,
                             crs=u.salem.grid.proj)
xx, yy = np.meshgrid(xx, yy)
qu = ax.quiver(xx, yy, u.values, v.values)
qk = plt.quiverkey(qu, 0.7, 0.95, 50, '50 m s$^{-1}$',
                   labelpos='E', coordinates='figure')

# done!
plt.show()

Total running time of the script: ( 0 minutes 12.158 seconds)

Generated by Sphinx-Gallery

Plot on a google map background

Google static maps API

In this script, we use motionless to download an image from the google static map API and plot it on a Map. We then add information to the map such as a glacier outline (from the RGI), and ground penetrating radar measurements (GPR, from the GlaThiDa database).

The GPR measurements were realized in 1997, the glacier outlines are from 2003, and the map background is from 2016. This illustrates the retreat of the Kesselwandferner glacier.

_images/sphx_glr_plot_googlestatic_001.png
import numpy as np
import pandas as pd
import salem
from salem import get_demo_file, DataLevels, GoogleVisibleMap, Map
import matplotlib.pyplot as plt

# prepare the figure
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# read the shapefile and use its extent to define a ideally sized map
shp = salem.read_shapefile(get_demo_file('rgi_kesselwand.shp'))
# I you need to do a lot of maps you might want
# to use an API key and set it here with key='YOUR_API_KEY'
g = GoogleVisibleMap(x=[shp.min_x, shp.max_x], y=[shp.min_y, shp.max_y],
                     maptype='satellite')  # try out also: 'terrain'

# the google static image is a standard rgb image
ggl_img = g.get_vardata()
ax1.imshow(ggl_img)
ax1.set_title('Google static map')

# make a map of the same size as the image (no country borders)
sm = Map(g.grid, factor=1, countries=False)
sm.set_shapefile(shp)  # add the glacier outlines
sm.set_rgb(ggl_img)  # add the background rgb image
sm.visualize(ax=ax2)  # plot it
ax2.set_title('GPR measurements')

# read the point GPR data and add them to the plot
df = pd.read_csv(get_demo_file('gtd_ttt_kesselwand.csv'))
dl = DataLevels(df.THICKNESS, levels=np.arange(10, 201, 10), extend='both')
x, y = sm.grid.transform(df.POINT_LON.values, df.POINT_LAT.values)
ax2.scatter(x, y, color=dl.to_rgb(), s=50, edgecolors='k', linewidths=1)
dl.append_colorbar(ax2, label='Ice thickness (m)')

# make it nice
plt.tight_layout()
plt.show()

Total running time of the script: ( 0 minutes 1.437 seconds)

Generated by Sphinx-Gallery

Generated by Sphinx-Gallery

API reference

Here we will add the documentation for selected modules.

Grid

Grid A structured grid on a map projection.
Grid methods
Grid.extent_in_crs Get the extent of the grid in a desired crs.
Grid.ij_to_crs Converts local i, j to cartesian coordinates in a specified crs
Grid.map_gridded_data Reprojects any structured data onto the local grid.
Grid.region_of_interest Computes a region of interest (ROI).
Grid.regrid Make a copy of the grid with an updated spatial resolution.
Grid.transform Converts any coordinates into the local grid.

Georeferencing utils

check_crs Checks if the crs represents a valid grid, projection or ESPG string.
proj_is_same Checks is two pyproj projections are equal.
proj_to_cartopy Converts a pyproj.Proj to a cartopy.crs.Projection
transform_proj Wrapper around the pyproj.transform function.
transform_geometry Reprojects a shapely geometry.
transform_geopandas Reprojects a geopandas dataframe.
mercator_grid Local transverse mercator map centered on a specified point.

Graphics

DataLevels Assigns the right color to your data.
Map Plotting georeferenced data.
get_cmap Get a colormap from mpl, and also those defined by Salem.
Map & DataLevels methods
DataLevels.append_colorbar Appends a colorbar to existing axes
DataLevels.colorbarbase Returns a ColorbarBase to add to the cax axis.
DataLevels.set_cmap Set a colormap.
DataLevels.set_data Any kind of data array (also masked).
DataLevels.set_plot_params Shortcut to all parameters related to the plot.
DataLevels.set_extend Colorbar extensions: ‘neither’ | ‘both’ | ‘min’ | ‘max’
DataLevels.set_levels Levels you define.
DataLevels.set_nlevels Automatic N levels.
DataLevels.set_vmax Maximum level value.
DataLevels.set_vmin Mininum level value.
DataLevels.visualize Quick plot, useful for debugging.
DataLevels.plot Add a kind of plot of the data to an axis.
Map methods
Map.set_data Adds data to the plot.
Map.set_contour Adds data to contour on the map.
Map.set_contourf Adds data to contourfill on the map.
Map.set_geometry Adds any Shapely geometry to the map.
Map.set_lonlat_contours Add longitude and latitude contours to the map.
Map.set_points Shortcut for set_geometry() accepting coordinates as input.
Map.set_rgb Manually force to a rgb img
Map.set_shapefile Add a shapefile to the plot.
Map.set_text Add a text to the map.
Map.set_topography Add topographical shading to the map.

Input/output

get_demo_file Returns the path to the desired demo file.
read_shapefile Reads a shapefile using geopandas.
read_shapefile_to_grid Same as read_shapefile but directly transformed to a grid.

xarray

open_xr_dataset Thin wrapper around xarray’s open_dataset.
open_wrf_dataset Wrapper around xarray’s open_dataset to make WRF files a bit better.
xarray accessors

Salem adds accessors to xarray objects. They can be accessed via the .salem attribute and add the following methods:

DataArray
DataArrayAccessor.subset Get a subset of the dataset.
DataArrayAccessor.roi Make a region of interest (ROI) for the dataset.
DataArrayAccessor.transform Reprojects an other Dataset or DataArray onto this grid.
DataArrayAccessor.cartopy Get a cartopy.crs.Projection for this dataset.
DataArrayAccessor.get_map Make a salem.Map out of the dataset.
DataArrayAccessor.quick_map Make a plot of the DataArray.
DataArrayAccessor.interpz Interpolates the array along the vertical dimension
Dataset
DatasetAccessor.subset Get a subset of the dataset.
DatasetAccessor.roi Make a region of interest (ROI) for the dataset.
DatasetAccessor.transform Reprojects an other Dataset or DataArray onto this grid.
DatasetAccessor.transform_and_add Reprojects an other Dataset and adds it’s content to the current one.
DatasetAccessor.cartopy Get a cartopy.crs.Projection for this dataset.
DatasetAccessor.get_map Make a salem.Map out of the dataset.
DatasetAccessor.quick_map Make a plot of a variable of the DataSet.
DatasetAccessor.wrf_zlevel Interpolates to a specified height above sea level.
DatasetAccessor.wrf_plevel Interpolates to a specified pressure level (hPa).

Old-style datasets

Old-style Datasets (prior to xarray), kept for backwards compatibility reasons and because they are quite lightweight. They might be replaced by xarray’s datasets one day.

GeoDataset Interface for georeferenced datasets.
GeoTiff Geolocalised tiff images (needs rasterio).
GeoNetcdf NetCDF files with geolocalisation info.
WRF WRF proof-of-concept template.
GoogleCenterMap Google static map centered on a point.
GoogleVisibleMap Google static map automatically sized and zoomed to a selected region.

Other

geogrid_simulator Emulates geogrid.exe, which is useful when defining new WRF domains.

Get in touch

  • To ask questions or discuss Salem, send me an e-mail.
  • Report bugs, share your ideas or view the source code on GitHub.

License

Salem is available under the open source 3-clause BSD license.

About

Status:

Experimental - in development

License:

3-clause BSD

Authors:

Fabien Maussion

Funding:http://acinn.uibk.ac.at/sites/all/themes/imgi/images/acinn_logo.png