API

Parameters

class roocs_utils.parameter.area_parameter.AreaParameter(input)

Bases: roocs_utils.parameter.base_parameter._BaseParameter

Class for area parameter used in subsetting operation.

Area can be input as:
A string of comma separated values: “0.,49.,10.,65”
A sequence of strings: (“0”, “-10”, “120”, “40”)
A sequence of numbers: [0, 49.5, 10, 65]

An area must have 4 values.

Validates the area input and parses the values into numbers.

asdict()

Returns a dictionary of the area values

parse_method = '_parse_sequence'

property tuple

Returns a tuple of the area values
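
The three input forms accepted by AreaParameter can be illustrated with a small standalone sketch. The helper `parse_area_sketch` is hypothetical and is not part of roocs_utils; it only mirrors the documented behaviour (exactly four values, parsed into numbers) in plain Python.

```python
def parse_area_sketch(area):
    """Normalise an area input to a tuple of four floats (illustrative only)."""
    # A comma-separated string is split into its parts first.
    if isinstance(area, str):
        parts = area.split(",")
    else:
        parts = list(area)

    if len(parts) != 4:
        raise ValueError(f"An area must have 4 values, got {len(parts)}")

    # float() accepts numbers as well as numeric strings such as "0." or "-10".
    return tuple(float(p) for p in parts)


print(parse_area_sketch("0.,49.,10.,65"))
print(parse_area_sketch(("0", "-10", "120", "40")))
print(parse_area_sketch([0, 49.5, 10, 65]))
```

All three calls yield a four-element tuple of floats, which is the kind of value the tuple property is described as returning.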

class roocs_utils.parameter.collection_parameter.CollectionParameter(input)

Bases: roocs_utils.parameter.base_parameter._BaseParameter

Class for collection parameter used in operations.

A collection can be input as:
A string of comma separated values: “cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga,cmip5.output1.MPI-M.MPI-ESM-LR.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”
A sequence of strings: e.g. (“cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”, “cmip5.output1.MPI-M.MPI-ESM-LR.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”)
A sequence of roocs_utils.utils.file_utils.FileMapper objects

Validates the input and parses the items.

parse_method = '_parse_sequence'

property tuple

Returns a tuple of the collection items

class roocs_utils.parameter.level_parameter.LevelParameter(input)

Bases: roocs_utils.parameter.base_parameter._BaseParameter

Class for level parameter used in subsetting operation.

Level can be input as:
A string of slash separated values: “1000/2000”
A sequence of strings: e.g. (“1000.50”, “2000.60”)
A sequence of numbers: e.g. (1000.50, 2000.60)

A level input must be 2 values.

If using a string input, a leading or trailing slash indicates that you want to use the lowest or highest level in the dataset, e.g. “/2000” will subset from the lowest level in the dataset to 2000.

Validates the level input and parses the values into numbers.

asdict()

Returns a dictionary of the level values

parse_method = '_parse_range'

property tuple

Returns a tuple of the level values
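
The range forms accepted by LevelParameter, including an open end marked by a slash with nothing on one side, can be sketched as follows. `parse_level_sketch` is a hypothetical helper, not the library's parser, and representing an open end as None is an assumption made for this illustration.

```python
def parse_level_sketch(level):
    """Return (first, last) as floats; None marks an open end (illustrative only)."""
    # A slash-separated string is split into its two sides.
    if isinstance(level, str):
        parts = level.split("/")
    else:
        parts = list(level)

    if len(parts) != 2:
        raise ValueError(f"A level input must be 2 values, got {len(parts)}")

    def to_number(value):
        # An empty side of the slash means "use the dataset's own limit".
        if isinstance(value, str) and not value.strip():
            return None
        return float(value)

    return tuple(to_number(p) for p in parts)


print(parse_level_sketch("1000/2000"))
print(parse_level_sketch("/2000"))
print(parse_level_sketch(("1000.50", "2000.60")))
```

The same two-value pattern applies to the time range described for TimeParameter, with ISO 8601 strings in place of numbers.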

class roocs_utils.parameter.time_parameter.TimeParameter(input)

Bases: roocs_utils.parameter.base_parameter._BaseParameter

Class for time parameter used in subsetting operation.

Time can be input as:
A string of slash separated values: “2085-01-01T12:00:00Z/2120-12-30T12:00:00Z”
A sequence of strings: e.g. (“2085-01-01T12:00:00Z”, “2120-12-30T12:00:00Z”)

A time input must be 2 values.

If using a string input, a leading or trailing slash indicates that you want to use the earliest or latest time in the dataset, e.g. “2085-01-01T12:00:00Z/” will subset from 2085-01-01 to the final time in the dataset.

Validates the times input and parses the values into isoformat.

asdict()

Returns a dictionary of the time values

parse_method = '_parse_range'

property tuple

Returns a tuple of the time values

class roocs_utils.parameter.dimension_parameter.DimensionParameter(input)

Bases: roocs_utils.parameter.base_parameter._BaseParameter

Class for dimensions parameter used in averaging operation.

Dimensions can be input as:
A string of comma separated values: “time,latitude,longitude”
A sequence of strings: (“time”, “longitude”)

Dimensions can be None or any number of options from time, latitude, longitude and level provided these exist in the dataset being operated on.

Validates the dims input and parses the values into a sequence of strings.

asdict()

Returns a dictionary of the dimensions

parse_method = '_parse_sequence'

property tuple

Returns a tuple of the dimensions
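
The validation described for DimensionParameter can be sketched in plain Python. `parse_dims_sketch` and `ALLOWED_DIMS` are hypothetical names used only for illustration; the sketch checks names against the four documented options and does not check that they exist in a dataset.

```python
ALLOWED_DIMS = ("time", "latitude", "longitude", "level")


def parse_dims_sketch(dims):
    """Return a tuple of dimension names, or None if no dimensions were given."""
    if dims is None:
        return None

    # A comma-separated string is split; any other input is treated as a sequence.
    if isinstance(dims, str):
        parts = [d.strip() for d in dims.split(",")]
    else:
        parts = [str(d) for d in dims]

    unknown = [d for d in parts if d not in ALLOWED_DIMS]
    if unknown:
        raise ValueError(f"Unknown dimensions: {unknown}")

    return tuple(parts)


print(parse_dims_sketch("time,latitude,longitude"))
print(parse_dims_sketch(("time", "longitude")))
```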

roocs_utils.parameter.parameterise.parameterise(collection=None, area=None, level=None, time=None)

Parameterises inputs to instances of parameter classes which allows them to be used throughout roocs. For supported formats for each input please see their individual classes.

Parameters
  • collection – Collection input in any supported format.

  • area – Area input in any supported format.

  • level – Level input in any supported format.

  • time – Time input in any supported format.

Returns

Parameters as instances of their respective classes.

Project Utils

class roocs_utils.project_utils.DatasetMapper(dset, project=None, force=False)

Bases: object

Class to map to data path, dataset ID and files from any dataset input.

dset must be a string and can be input as:
A dataset ID: e.g. “cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”
A file path: e.g. “/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/atmos/Amon/r1i1p1/latest/tas/tas_Amon_HadGEM2-ES_rcp85_r1i1p1_200512-203011.nc”
A path to a group of files: e.g. “/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/atmos/Amon/r1i1p1/latest/tas/*.nc”
A directory: e.g. “/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/atmos/Amon/r1i1p1/latest/tas”
An instance of the FileMapper class (that represents a set of files within a single directory)

When force=True, if the project cannot be identified, any attempt to use the base_dir of a project to resolve the data path will be skipped. Any of data_path, ds_id and files that can be set will be set.

property base_dir

The base directory of the input dataset.

property data_path

Dataset input converted to a data path.

property ds_id

Dataset input converted to a ds id.

property files

The files found from the input dataset.

property project

The project of the dataset input.

property raw

Raw dataset input.

roocs_utils.project_utils.datapath_to_dsid(datapath)

Switches from dataset path to ds id.

Parameters

datapath – dataset path.

Returns

dataset id of input dataset path.

roocs_utils.project_utils.derive_ds_id(dset)

Derives the dataset id of the provided dset.

Parameters

dset – dset input of type described by DatasetMapper.

Returns

ds id of input dataset.

roocs_utils.project_utils.derive_dset(dset)

Derives the dataset path of the provided dset.

Parameters

dset – dset input of type described by DatasetMapper.

Returns

dataset path of input dataset.

roocs_utils.project_utils.dset_to_filepaths(dset, force=False)

Gets filepaths deduced from input dset.

Parameters
  • dset – dset input of type described by DatasetMapper.

  • force – When True and if the project of the input dset cannot be identified, DatasetMapper will attempt to find the files anyway. Default is False.

Returns

File paths deduced from input dataset.

roocs_utils.project_utils.dsid_to_datapath(dsid)

Switches from ds id to dataset path.

Parameters

dsid – dataset id.

Returns

dataset path of input dataset id.

roocs_utils.project_utils.get_data_node_dirs_dict()

Get a dictionary of the data node roots used for retrieving original files.

roocs_utils.project_utils.get_facet(facet_name, facets, project)

Get facet from project config

roocs_utils.project_utils.get_project_base_dir(project)

Get the base directory of a project from the config.

roocs_utils.project_utils.get_project_from_data_node_root(url)

Identify the project from the data node root by identifying the data node root in the input url.

roocs_utils.project_utils.get_project_from_ds(ds)

Gets the project from an xarray Dataset/DataArray.

Parameters

ds – xarray Dataset/DataArray.

Returns

The project derived from the input dataset.

roocs_utils.project_utils.get_project_name(dset)

Gets the project from an input dset.

Parameters

dset – dset input of type described by DatasetMapper.

Returns

The project derived from the input dataset.

roocs_utils.project_utils.get_projects()

Gets all the projects available in the config.

roocs_utils.project_utils.map_facet(facet, project)

Return mapped facet value from config or facet name if not found.

roocs_utils.project_utils.switch_dset(dset)

Switches between dataset path and ds id.

Parameters

dset – either dataset path or dataset ID.

Returns

either dataset path or dataset ID - switched from the input.
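
The relationship between a ds id and a dataset path, as handled by dsid_to_datapath, datapath_to_dsid and switch_dset, can be sketched as below. This is a standalone illustration, not the library's code: the `PROJECT_BASE_DIRS` mapping and the `*_sketch` function names are hypothetical, and the base directory is an assumed configuration value chosen to match the example paths given for DatasetMapper.

```python
import posixpath

# Hypothetical project configuration: project name -> base directory (assumed).
PROJECT_BASE_DIRS = {"cmip5": "/badc/cmip5/data/cmip5"}


def dsid_to_datapath_sketch(dsid):
    """Turn 'project.facet1.facet2...' into '<base_dir>/facet1/facet2/...'."""
    project, *facets = dsid.split(".")
    base_dir = PROJECT_BASE_DIRS[project.lower()]
    return posixpath.join(base_dir, *facets)


def datapath_to_dsid_sketch(datapath):
    """Turn '<base_dir>/facet1/facet2/...' back into 'project.facet1.facet2...'."""
    for project, base_dir in PROJECT_BASE_DIRS.items():
        if datapath.startswith(base_dir + "/"):
            relative = datapath[len(base_dir) + 1:].strip("/")
            return ".".join([project] + relative.split("/"))
    raise ValueError(f"Cannot identify a project for: {datapath}")


def switch_dset_sketch(dset):
    """Convert a path to a ds id, or a ds id to a path."""
    # Assumption: anything starting with '/' is a path, otherwise a ds id.
    if dset.startswith("/"):
        return datapath_to_dsid_sketch(dset)
    return dsid_to_datapath_sketch(dset)


ds_id = "cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga"
path = switch_dset_sketch(ds_id)
print(path)
print(switch_dset_sketch(path))
```

Under these assumptions the conversion is a round trip: switching twice returns the original input.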

roocs_utils.project_utils.url_to_file_path(url)

Convert input url of an original file to a file path

Xarray Utils

roocs_utils.xarray_utils.xarray_utils.convert_coord_to_axis(coord)

Converts coordinate type to its single character axis identifier (tzyx).

Parameters

coord – (str) The coordinate to convert.

Returns

(str) The single character axis identifier of the coordinate (tzyx).
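
The conversion from coordinate type to axis letter amounts to a simple lookup, sketched below. The names `AXIS_BY_COORD_TYPE` and `convert_coord_to_axis_sketch` are hypothetical, and returning None for an unrecognised type is an assumption of this sketch.

```python
# Coordinate type -> single character axis identifier (tzyx).
AXIS_BY_COORD_TYPE = {
    "time": "t",
    "level": "z",
    "latitude": "y",
    "longitude": "x",
}


def convert_coord_to_axis_sketch(coord):
    """Return the axis letter for a coordinate type, or None if it is unknown."""
    return AXIS_BY_COORD_TYPE.get(coord)


print(convert_coord_to_axis_sketch("latitude"))
```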

roocs_utils.xarray_utils.xarray_utils.get_coord_by_attr(ds, attr, value)

Returns a coordinate based on a known attribute of a coordinate.

Parameters
  • ds – Xarray Dataset or DataArray

  • attr – (str) Name of attribute to look for.

  • value – Expected value of attribute you are looking for.

Returns

Coordinate of xarray dataset if found.

roocs_utils.xarray_utils.xarray_utils.get_coord_by_type(ds, coord_type, ignore_aux_coords=True)

Returns the xarray Dataset or DataArray coordinate of the specified type.

Parameters
  • ds – Xarray Dataset or DataArray

  • coord_type – (str) Coordinate type to find.

  • ignore_aux_coords – (bool) If True then coordinates that are not dimensions are ignored. Default is True.

Returns

Xarray Dataset coordinate (ds.coords[coord_id])

roocs_utils.xarray_utils.xarray_utils.get_coord_type(coord)

Gets the coordinate type.

Parameters

coord – coordinate of xarray dataset e.g. coord = ds.coords[coord_id]

Returns

The type of coordinate as a string. Either longitude, latitude, time, level or None
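
One way a coordinate type could be worked out is from CF-style metadata. The sketch below is an assumption about a plausible approach, not a description of how the library decides: it takes a plain dictionary standing in for the coordinate's attributes and looks at the `standard_name` and `axis` entries. The function and constant names are hypothetical.

```python
# CF standard names that directly identify a coordinate type.
TYPE_BY_STANDARD_NAME = {
    "latitude": "latitude",
    "longitude": "longitude",
    "time": "time",
}

# CF axis attribute values and the coordinate type each one implies.
TYPE_BY_AXIS = {"X": "longitude", "Y": "latitude", "Z": "level", "T": "time"}


def get_coord_type_sketch(attrs):
    """Return 'longitude', 'latitude', 'time', 'level' or None from an attrs dict."""
    standard_name = attrs.get("standard_name")
    if standard_name in TYPE_BY_STANDARD_NAME:
        return TYPE_BY_STANDARD_NAME[standard_name]

    # Fall back to the axis attribute, compared case-insensitively.
    return TYPE_BY_AXIS.get(str(attrs.get("axis", "")).upper())


print(get_coord_type_sketch({"standard_name": "latitude", "units": "degrees_north"}))
print(get_coord_type_sketch({"axis": "Z", "units": "Pa"}))
```

Predicates in the style of is_latitude or is_time then reduce to comparing this result with the expected type.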

roocs_utils.xarray_utils.xarray_utils.get_main_variable(ds, exclude_common_coords=True)

Finds the main variable of an xarray Dataset

Parameters
  • ds – xarray Dataset

  • exclude_common_coords – (bool) If True then common coordinates are excluded from the search for the main variable. Common coordinates are time, level, latitude, longitude and bounds. Default is True.

Returns

(str) The main variable of the dataset e.g. ‘tas’
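
A search for the main variable can be sketched without xarray by describing a dataset as a mapping from variable name to its dimension names. The heuristic used here, picking the variable with the most dimensions after excluding coordinates and bounds, is an assumption made for illustration; `get_main_variable_sketch` and `COMMON_COORD_HINTS` are hypothetical names.

```python
# Substrings that mark a variable as a common coordinate or a bounds variable.
COMMON_COORD_HINTS = ("time", "lat", "lon", "level", "bnd", "bound")


def get_main_variable_sketch(variables, exclude_common_coords=True):
    """variables maps each variable name to a tuple of its dimension names."""
    dimension_names = {dim for dims in variables.values() for dim in dims}

    candidates = {}
    for name, dims in variables.items():
        # Variables that are themselves dimensions are never the main variable.
        if name in dimension_names:
            continue
        if exclude_common_coords and any(h in name for h in COMMON_COORD_HINTS):
            continue
        candidates[name] = len(dims)

    if not candidates:
        raise ValueError("No main variable found")

    # Assumed heuristic: the main variable spans the most dimensions.
    return max(candidates, key=candidates.get)


variables = {
    "time": ("time",),
    "lat": ("lat",),
    "lon": ("lon",),
    "time_bnds": ("time", "bnds"),
    "tas": ("time", "lat", "lon"),
}
print(get_main_variable_sketch(variables))
```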

roocs_utils.xarray_utils.xarray_utils.is_latitude(coord)

Determines if a coordinate is latitude.

Parameters

coord – coordinate of xarray dataset e.g. coord = ds.coords[coord_id]

Returns

(bool) True if the coordinate is latitude.

roocs_utils.xarray_utils.xarray_utils.is_level(coord)

Determines if a coordinate is level.

Parameters

coord – coordinate of xarray dataset e.g. coord = ds.coords[coord_id]

Returns

(bool) True if the coordinate is level.

roocs_utils.xarray_utils.xarray_utils.is_longitude(coord)

Determines if a coordinate is longitude.

Parameters

coord – coordinate of xarray dataset e.g. coord = ds.coords[coord_id]

Returns

(bool) True if the coordinate is longitude.

roocs_utils.xarray_utils.xarray_utils.is_time(coord)

Determines if a coordinate is time.

Parameters

coord – coordinate of xarray dataset e.g. coord = ds.coords[coord_id]

Returns

(bool) True if the coordinate is time.

roocs_utils.xarray_utils.xarray_utils.open_xr_dataset(dset)

Opens an xarray dataset from a dataset input.

Parameters

dset – (str or Path) ds_id, directory path or file path ending in *.nc.

Any list will be interpreted as a list of files.

Other utilities

roocs_utils.utils.common.parse_size(size)

Parse size string into number of bytes.

Parameters

size – (str) size to parse in any unit

Returns

(int) number of bytes
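
Parsing a size string into bytes can be sketched as below. The set of accepted units and the use of binary multiples (1 KB = 1024 bytes) are assumptions of this sketch and may not match what parse_size accepts; `parse_size_sketch` and `BYTES_PER_UNIT` are hypothetical names.

```python
import re

# Assumed unit table: binary multiples, with "KiB"-style spellings as aliases.
BYTES_PER_UNIT = {
    "B": 1,
    "KB": 1024, "KIB": 1024,
    "MB": 1024 ** 2, "MIB": 1024 ** 2,
    "GB": 1024 ** 3, "GIB": 1024 ** 3,
    "TB": 1024 ** 4, "TIB": 1024 ** 4,
}


def parse_size_sketch(size):
    """Parse a string such as '50MiB' into a whole number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", size)
    if match is None:
        raise ValueError(f"Cannot parse size: {size!r}")

    number, unit = match.groups()
    try:
        factor = BYTES_PER_UNIT[unit.upper()]
    except KeyError:
        raise ValueError(f"Unknown size unit: {unit!r}") from None

    return int(float(number) * factor)


print(parse_size_sketch("50MiB"))
print(parse_size_sketch("1.5KB"))
```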

class roocs_utils.utils.time_utils.AnyCalendarDateTime(year, month, day, hour, minute, second)

Bases: object

A class to represent a datetime that could be of any calendar.

Has the ability to add and subtract a day from the input based on MAX_DAY, MIN_DAY, MAX_MONTH and MIN_MONTH

DAY_RANGE = range(1, 32)
HOUR_RANGE = range(0, 24)
MINUTE_RANGE = range(0, 60)
MONTH_RANGE = range(1, 13)
SECOND_RANGE = range(0, 60)

add_day()

Add a day to the input datetime.

sub_day(n=1)

Subtract a day from the input datetime.

validate_input(input, name, range)

property value

roocs_utils.utils.time_utils.str_to_AnyCalendarDateTime(dt)

Takes a string representing a date/time and returns an AnyCalendarDateTime object. The string should start with the year and may continue through to the second; any components from the month onwards can be omitted.

Parameters

dt – (str) String representing a date/time.

Returns

AnyCalendarDateTime object
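
Calendar-agnostic day arithmetic and string parsing of this kind can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the class and function names are hypothetical, omitted components default to the start of their range, and a day simply rolls over after 31 whatever the month, as the documented DAY_RANGE suggests.

```python
import re


class AnyCalendarDateTimeSketch:
    DAY_RANGE = range(1, 32)
    MONTH_RANGE = range(1, 13)

    def __init__(self, year, month=1, day=1, hour=0, minute=0, second=0):
        self.year, self.month, self.day = year, month, day
        self.hour, self.minute, self.second = hour, minute, second

    def add_day(self):
        # Roll the day over into the next month, and the month into the next year.
        self.day += 1
        if self.day not in self.DAY_RANGE:
            self.day = self.DAY_RANGE[0]
            self.month += 1
        if self.month not in self.MONTH_RANGE:
            self.month = self.MONTH_RANGE[0]
            self.year += 1

    def sub_day(self):
        # Roll the day back into the previous month, and the month into the previous year.
        self.day -= 1
        if self.day not in self.DAY_RANGE:
            self.day = self.DAY_RANGE[-1]
            self.month -= 1
        if self.month not in self.MONTH_RANGE:
            self.month = self.MONTH_RANGE[-1]
            self.year -= 1

    @property
    def value(self):
        return (
            f"{self.year:04d}-{self.month:02d}-{self.day:02d}"
            f"T{self.hour:02d}:{self.minute:02d}:{self.second:02d}"
        )


def str_to_any_calendar_sketch(dt):
    """Read year, then optionally month ... second, from runs of digits in dt."""
    parts = [int(p) for p in re.findall(r"\d+", dt)]
    if not 1 <= len(parts) <= 6:
        raise ValueError(f"Cannot parse date/time: {dt!r}")
    return AnyCalendarDateTimeSketch(*parts)


d = str_to_any_calendar_sketch("2085-12-31T12:00:00")
d.add_day()
print(d.value)
```

Because no real calendar is consulted, the sketch can produce dates such as day 31 of a 30-day month; that is the price of working with any calendar.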

roocs_utils.utils.time_utils.to_isoformat(tm)

Returns an ISO 8601 string from a time object (of different types).

Parameters

tm – Time object

Returns

(str) ISO 8601 time string
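
Converting different kinds of time object to an ISO 8601 string can be sketched with the standard library alone. Which input types to_isoformat actually accepts is not stated here, so the three handled below (datetime, date and string) are assumptions, and `to_isoformat_sketch` is a hypothetical name. The sketch also drops a trailing "Z" rather than preserving the UTC marker.

```python
from datetime import date, datetime


def to_isoformat_sketch(tm):
    """Return 'YYYY-MM-DDTHH:MM:SS' for a datetime, date or ISO-like string."""
    if isinstance(tm, str):
        # Drop a trailing 'Z' so that older Python versions can parse the string.
        tm = datetime.fromisoformat(tm.rstrip("Z"))
    elif isinstance(tm, date) and not isinstance(tm, datetime):
        # A plain date is taken to mean midnight on that day.
        tm = datetime(tm.year, tm.month, tm.day)
    elif not isinstance(tm, datetime):
        raise TypeError(f"Unsupported time object: {type(tm).__name__}")

    return tm.isoformat(timespec="seconds")


print(to_isoformat_sketch(datetime(2085, 1, 1, 12)))
print(to_isoformat_sketch(date(2120, 12, 30)))
print(to_isoformat_sketch("2085-01-01T12:00:00Z"))
```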

class roocs_utils.utils.file_utils.FileMapper(file_list, dirpath=None)

Bases: object

Class to represent a set of files that exist in the same directory as one object.

Parameters
  • file_list – The list of files to represent. If dirpath is not provided, these should be full file paths.

  • dirpath – The directory path where the files exist. Default is None.

If dirpath is not provided it will be deduced from the file paths provided in file_list.

file_list

list of file names of the files represented.

file_paths

list of full file paths of the files represented.

dirpath

The directory path where the files exist. Either deduced or provided.
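
The way the three attributes relate to each other, and how dirpath can be deduced when it is not given, is sketched below. `FileMapperSketch` is a hypothetical stand-in rather than the library class, the example paths are made up, and raising an error when the files span several directories is an assumption.

```python
import posixpath


class FileMapperSketch:
    """Represent files that share one directory (illustrative only)."""

    def __init__(self, file_list, dirpath=None):
        if dirpath is None:
            # Deduce the directory from the full paths; they must all agree.
            dirs = {posixpath.dirname(f) for f in file_list}
            if len(dirs) != 1:
                raise ValueError("Files must all be in the same directory")
            dirpath = dirs.pop()

        self.dirpath = dirpath
        # Bare file names, without their directory.
        self.file_list = [posixpath.basename(f) for f in file_list]
        # Full paths rebuilt from the directory and each file name.
        self.file_paths = [posixpath.join(dirpath, name) for name in self.file_list]


fm = FileMapperSketch(["/data/tas/tas_2005.nc", "/data/tas/tas_2006.nc"])
print(fm.dirpath)
print(fm.file_list)
print(fm.file_paths)
```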

roocs_utils.utils.file_utils.is_file_list(coll)

Checks whether a collection is a list of files.

Parameters

coll – (list) Collection to check.

Returns

True if collection is a list of files, else returns False.