API
Parameters
- class roocs_utils.parameter.area_parameter.AreaParameter(input)[source]
Bases:
roocs_utils.parameter.base_parameter._BaseParameter
Class for area parameter used in subsetting operation.
Area can be input as:
A string of comma separated values: “0.,49.,10.,65”
A sequence of strings: (“0”, “-10”, “120”, “40”)
A sequence of numbers: [0, 49.5, 10, 65]
An area must have 4 values.
Validates the area input and parses the values into numbers.
- allowed_input_types = [<class 'collections.abc.Sequence'>, <class 'str'>, <class 'roocs_utils.parameter.param_utils.Series'>, <class 'NoneType'>]
- asdict()[source]
Returns a dictionary of the area values
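The accepted area formats can be illustrated with a minimal sketch. `parse_area` is a hypothetical stand-in written for this example, not the roocs_utils implementation; it only mirrors the documented rules that an area has 4 values and that the values are parsed into numbers.

```python
from collections.abc import Sequence


def parse_area(value):
    """Hypothetical helper: normalise an area input to a tuple of 4 floats."""
    if value is None:
        return None
    if isinstance(value, str):
        # "0.,49.,10.,65" -> ["0.", "49.", "10.", "65"]
        parts = [p.strip() for p in value.split(",")]
    elif isinstance(value, Sequence):
        parts = list(value)
    else:
        raise TypeError(f"Unsupported area input: {type(value).__name__}")
    if len(parts) != 4:
        raise ValueError("An area must have 4 values.")
    # float() accepts both numeric strings and numbers.
    return tuple(float(p) for p in parts)


print(parse_area("0.,49.,10.,65"))
print(parse_area(("0", "-10", "120", "40")))
print(parse_area([0, 49.5, 10, 65]))
```

All three documented input styles end up in the same numeric form, which is presumably what makes a single asdict() representation possible.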
- class roocs_utils.parameter.collection_parameter.CollectionParameter(input)[source]
Bases:
roocs_utils.parameter.base_parameter._BaseParameter
Class for collection parameter used in operations.
A collection can be input as:
A string of comma separated values: “cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga,cmip5.output1.MPI-M.MPI-ESM-LR.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”
A sequence of strings: e.g. (“cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”, “cmip5.output1.MPI-M.MPI-ESM-LR.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”)
A sequence of roocs_utils.utils.file_utils.FileMapper objects
Validates the input and parses the items.
- allowed_input_types = [<class 'collections.abc.Sequence'>, <class 'str'>, <class 'roocs_utils.parameter.param_utils.Series'>, <class 'roocs_utils.utils.file_utils.FileMapper'>]
- class roocs_utils.parameter.level_parameter.LevelParameter(input)[source]
Bases:
roocs_utils.parameter.base_parameter._BaseIntervalOrSeriesParameter
Class for level parameter used in subsetting operation.
Level can be input as:
A string of slash separated values: “1000/2000”
A sequence of strings: e.g. (“1000.50”, “2000.60”)
A sequence of numbers: e.g. (1000.50, 2000.60)
A level input must be 2 values.
If using a string input, a leading or trailing slash indicates that the lowest or highest level of the dataset should be used, e.g. “/2000” will subset from the lowest level in the dataset to 2000.
Validates the level input and parses the values into numbers.
- asdict()[source]
Returns a dictionary of the level values
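The interval form of the level input, including an open end, might be handled along these lines. `parse_level_interval` is a hypothetical helper for illustration only; representing a missing bound as None is an assumption, not something the library is confirmed to do.

```python
def parse_level_interval(value):
    """Hypothetical helper: return (start, end) as floats, None for an open end."""
    if isinstance(value, str):
        # "1000/2000" -> ["1000", "2000"]; "/2000" -> ["", "2000"]
        parts = value.split("/")
    else:
        parts = list(value)
    if len(parts) != 2:
        raise ValueError("A level input must be 2 values.")

    def to_number(item):
        # An empty string (from "/2000" or "1000/") means no bound was given.
        if item is None or (isinstance(item, str) and item.strip() == ""):
            return None
        return float(item)

    return tuple(to_number(p) for p in parts)


print(parse_level_interval("1000/2000"))
print(parse_level_interval("/2000"))
print(parse_level_interval((1000.50, 2000.60)))
```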
- class roocs_utils.parameter.time_parameter.TimeParameter(input)[source]
Bases:
roocs_utils.parameter.base_parameter._BaseIntervalOrSeriesParameter
Class for time parameter used in subsetting operation.
Time can be input as:
A string of slash separated values: “2085-01-01T12:00:00Z/2120-12-30T12:00:00Z”
A sequence of strings: e.g. (“2085-01-01T12:00:00Z”, “2120-12-30T12:00:00Z”)
A time input must be 2 values.
If using a string input, a leading or trailing slash indicates that the earliest or latest time of the dataset should be used, e.g. “2085-01-01T12:00:00Z/” will subset from 01/01/2085 to the final time in the dataset.
Validates the times input and parses the values into isoformat.
- asdict()[source]
Returns a dictionary of the time values
- get_bounds()[source]
Returns a tuple of the (start, end) times, calculated from the value of the parameter. Either will default to None.
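A rough sketch of how a (start, end) pair with None for an open end could be derived from a time string follows. `time_bounds` is a hypothetical function, and it assumes the standard calendar and the exact “YYYY-MM-DDTHH:MM:SSZ” layout shown above; the library itself may accept more formats and non-standard calendars.

```python
from datetime import datetime


def time_bounds(value):
    """Hypothetical helper: return (start, end) ISO strings, None if open."""
    parts = value.split("/") if isinstance(value, str) else list(value)
    if len(parts) != 2:
        raise ValueError("A time input must be 2 values.")

    def to_iso(item):
        # An empty or missing bound stays open.
        if not item:
            return None
        # Assumption: standard calendar, fixed "YYYY-MM-DDTHH:MM:SSZ" layout.
        return datetime.strptime(item, "%Y-%m-%dT%H:%M:%SZ").isoformat()

    return tuple(to_iso(p) for p in parts)


print(time_bounds("2085-01-01T12:00:00Z/2120-12-30T12:00:00Z"))
print(time_bounds("2085-01-01T12:00:00Z/"))
```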
- class roocs_utils.parameter.time_components_parameter.TimeComponentsParameter(input)[source]
Bases:
roocs_utils.parameter.base_parameter._BaseParameter
Class for time components parameter used in subsetting operation.
- The Time Components are any, or none of:
year: [list of years]
month: [list of months]
day: [list of days]
hour: [list of hours]
minute: [list of minutes]
second: [list of seconds]
- month is special: you can use either strings or values:
“feb”, “mar” == 2, 3 == “02,03”
Validates the times input and parses them into a dictionary.
- allowed_input_types = [<class 'dict'>, <class 'str'>, <class 'roocs_utils.parameter.param_utils.TimeComponents'>, <class 'NoneType'>]
- asdict()[source]
- get_bounds()[source]
Returns a tuple of the (start, end) times, calculated from the value of the parameter. Either will default to None.
- class roocs_utils.parameter.param_utils.Interval(*data)[source]
Bases:
object
A simple class for handling an interval of any type. It holds a start and end but does not try to resolve the range, it is just a container to be used by other tools. The contents can be of any type, such as datetimes, strings etc.
- class roocs_utils.parameter.param_utils.Series(*data)[source]
Bases:
object
A simple class for handling a series selection, created by any sequence as input. It has a value that holds the sequence as a list.
- class roocs_utils.parameter.param_utils.TimeComponents(year=None, month=None, day=None, hour=None, minute=None, second=None)[source]
Bases:
object
A simple class for parsing and representing a set of time components. The components are stored in a dictionary of {time_comp: values}, such as:
{“year”: [2000, 2001], “month”: [1, 2, 3]}
- Note that you can provide months as strings or numbers, e.g.:
“feb”, “Feb”, “February”, 2
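The month equivalences above can be sketched as a small normalisation step. `month_to_int` and `MONTHS` are hypothetical names used only for this example; matching on the first three letters is an assumption about how name variants could be recognised.

```python
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def month_to_int(value):
    """Hypothetical helper: map "feb", "Feb", "February", "02" or 2 to 2."""
    if isinstance(value, int):
        month = value
    else:
        text = str(value).strip().lower()
        if text.isdigit():
            month = int(text)
        else:
            # The first three letters are enough to identify a month name;
            # list.index raises ValueError for anything unrecognised.
            month = MONTHS.index(text[:3]) + 1
    if month not in range(1, 13):
        raise ValueError(f"Invalid month: {value!r}")
    return month


components = {"year": [2000, 2001], "month": [month_to_int(m) for m in ("feb", "Mar", 4)]}
print(components)
```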
- roocs_utils.parameter.param_utils.area
alias of
roocs_utils.parameter.param_utils.Series
- roocs_utils.parameter.param_utils.collection
alias of
roocs_utils.parameter.param_utils.Series
- roocs_utils.parameter.param_utils.dimensions
alias of
roocs_utils.parameter.param_utils.Series
- roocs_utils.parameter.param_utils.interval
alias of
roocs_utils.parameter.param_utils.Interval
- roocs_utils.parameter.param_utils.level_interval
alias of
roocs_utils.parameter.param_utils.Interval
- roocs_utils.parameter.param_utils.level_series
alias of
roocs_utils.parameter.param_utils.Series
- roocs_utils.parameter.param_utils.parse_datetime(dt, defaults=None)[source]
Parses string to datetime and returns isoformat string for it. If defaults is set, use that in case dt is None.
- roocs_utils.parameter.param_utils.parse_range(x, caller)[source]
- roocs_utils.parameter.param_utils.parse_sequence(x, caller)[source]
- roocs_utils.parameter.param_utils.series
alias of
roocs_utils.parameter.param_utils.Series
- roocs_utils.parameter.param_utils.string_to_dict(s, splitters=('|', ':', ','))[source]
Convert a string to a dictionary of dictionaries, based on splitting rules: splitters.
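One plausible reading of the three default splitters is sketched below. The roles assumed here (the first separates items, the second separates a key from its values, the third separates the values) and the dictionary-of-lists result are assumptions; the docstring gives no example, and the shape returned by the library function may differ. `split_to_dict` is a hypothetical name.

```python
def split_to_dict(s, splitters=("|", ":", ",")):
    """Hypothetical sketch: "a:1,2|b:3" -> {"a": ["1", "2"], "b": ["3"]}."""
    item_sep, key_sep, value_sep = splitters
    result = {}
    for item in s.strip().split(item_sep):
        # Split only on the first key separator so values may contain it.
        key, values = item.split(key_sep, 1)
        result[key.strip()] = [v.strip() for v in values.split(value_sep)]
    return result


print(split_to_dict("year:2000,2001|month:1,2,3"))
```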
- roocs_utils.parameter.param_utils.time_components
alias of
roocs_utils.parameter.param_utils.TimeComponents
- roocs_utils.parameter.param_utils.time_interval
alias of
roocs_utils.parameter.param_utils.Interval
- roocs_utils.parameter.param_utils.time_series
alias of
roocs_utils.parameter.param_utils.Series
- roocs_utils.parameter.param_utils.to_float(i, allow_none=True)[source]
- roocs_utils.parameter.parameterise.parameterise(collection=None, area=None, level=None, time=None, time_components=None)[source]
Parameterises inputs to instances of parameter classes which allows them to be used throughout roocs. For supported formats for each input please see their individual classes.
- Parameters
collection – Collection input in any supported format.
area – Area input in any supported format.
level – Level input in any supported format.
time – Time input in any supported format.
time_components – Time Components input in any supported format.
- Returns
Parameters as instances of their respective classes.
Project Utils
- class roocs_utils.project_utils.DatasetMapper(dset, project=None, force=False)[source]
Bases:
object
Class to map to data path, dataset ID and files from any dataset input.
dset must be a string and can be input as:
A dataset ID: e.g. “cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”
A file path: e.g. “/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/atmos/Amon/r1i1p1/latest/tas/tas_Amon_HadGEM2-ES_rcp85_r1i1p1_200512-203011.nc”
A path to a group of files: e.g. “/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/atmos/Amon/r1i1p1/latest/tas/*.nc”
A directory: e.g. “/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/atmos/Amon/r1i1p1/latest/tas”
An instance of the FileMapper class (that represents a set of files within a single directory)
When force=True, if the project can not be identified, any attempt to use the base_dir of a project to resolve the data path will be ignored. Any of data_path, ds_id and files that can be set, will be set.
- SUPPORTED_EXTENSIONS = ('.nc', '.gz')
- property base_dir
The base directory of the input dataset.
- property data_path
Dataset input converted to a data path.
- property ds_id
Dataset input converted to a ds id.
- property files
The files found from the input dataset.
- property project
The project of the dataset input.
- property raw
Raw dataset input.
- roocs_utils.project_utils.datapath_to_dsid(datapath)[source]
Switches from dataset path to ds id.
- Parameters
datapath – dataset path.
- Returns
dataset id of input dataset path.
- roocs_utils.project_utils.derive_ds_id(dset)[source]
Derives the dataset id of the provided dset.
- Parameters
dset – dset input of type described by DatasetMapper.
- Returns
ds id of input dataset.
- roocs_utils.project_utils.derive_dset(dset)[source]
Derives the dataset path of the provided dset.
- Parameters
dset – dset input of type described by DatasetMapper.
- Returns
dataset path of input dataset.
- roocs_utils.project_utils.dset_to_filepaths(dset, force=False)[source]
Gets filepaths deduced from input dset.
- Parameters
dset – dset input of type described by DatasetMapper.
force – When True and if the project of the input dset cannot be identified, DatasetMapper will attempt to find the files anyway. Default is False.
- Returns
File paths deduced from input dataset.
- roocs_utils.project_utils.dsid_to_datapath(dsid)[source]
Switches from ds id to dataset path.
- Parameters
dsid – dataset id.
- Returns
dataset path of input dataset id.
- roocs_utils.project_utils.get_data_node_dirs_dict()[source]
Get a dictionary of the data node roots used for retrieving original files.
- roocs_utils.project_utils.get_facet(facet_name, facets, project)[source]
Get facet from project config
- roocs_utils.project_utils.get_project_base_dir(project)[source]
Get the base directory of a project from the config.
- roocs_utils.project_utils.get_project_from_data_node_root(url)[source]
Identify the project from the data node root by identifying the data node root in the input url.
- roocs_utils.project_utils.get_project_from_ds(ds)[source]
Gets the project from an xarray Dataset/DataArray.
- Parameters
ds – xarray Dataset/DataArray.
- Returns
The project derived from the input dataset.
- roocs_utils.project_utils.get_project_name(dset)[source]
Gets the project from an input dset.
- Parameters
dset – dset input of type described by DatasetMapper.
- Returns
The project derived from the input dataset.
- roocs_utils.project_utils.get_projects()[source]
Gets all the projects available in the config.
- roocs_utils.project_utils.map_facet(facet, project)[source]
Return mapped facet value from config or facet name if not found.
- roocs_utils.project_utils.switch_dset(dset)[source]
Switches between dataset path and ds id.
- Parameters
dset – either dataset path or dataset ID.
- Returns
either dataset path or dataset ID - switched from the input.
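The relationship between a dataset path and a dataset ID can be sketched as follows, assuming the ID is simply the path relative to the project base directory with “/” replaced by “.”. The helper names and the explicit `base_dir` argument are hypothetical (the library reads the base directory from its config), and the base directory “/badc/cmip5/data” is only inferred from the example paths shown for DatasetMapper.

```python
import posixpath


def path_to_id(datapath, base_dir):
    """Hypothetical helper: "/base/a/b/c" with base "/base" -> "a.b.c"."""
    relative = posixpath.relpath(datapath, base_dir)
    return relative.replace("/", ".")


def id_to_path(ds_id, base_dir):
    """Hypothetical helper: "a.b.c" with base "/base" -> "/base/a/b/c"."""
    return posixpath.join(base_dir, *ds_id.split("."))


BASE_DIR = "/badc/cmip5/data"  # assumed base directory, for illustration only
path = "/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/atmos/Amon/r1i1p1/latest/tas"

ds_id = path_to_id(path, BASE_DIR)
print(ds_id)
print(id_to_path(ds_id, BASE_DIR))
```

This simple mapping would break for facets that themselves contain a dot, which is one reason a real implementation needs project-specific configuration.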
- roocs_utils.project_utils.url_to_file_path(url)[source]
Convert input url of an original file to a file path
Xarray Utils
Other utilities
- roocs_utils.utils.common.parse_size(size)[source]
Parse size string into number of bytes.
- Parameters
size – (str) size to parse in any unit
- Returns
(int) number of bytes
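A minimal sketch of parsing a size string into bytes is shown below. `size_to_bytes` and the `UNITS` table are hypothetical; in particular, the decimal multipliers (1 KB = 1000 bytes) are an assumption, and the library may support other units or binary multiples.

```python
import re

# Assumed decimal multipliers; the library may use a different unit table.
UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def size_to_bytes(size):
    """Hypothetical helper: "2.5 MB" or "10kb" -> integer number of bytes."""
    # Accept an optional space between the number and the unit, any case.
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*", size.upper())
    if match is None:
        raise ValueError(f"Cannot parse size: {size!r}")
    number, unit = match.groups()
    return int(float(number) * UNITS[unit])


print(size_to_bytes("1GB"))
print(size_to_bytes("2.5 MB"))
```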
- class roocs_utils.utils.time_utils.AnyCalendarDateTime(year, month, day, hour, minute, second)[source]
Bases:
object
A class to represent a datetime that could be of any calendar.
Has the ability to add and subtract a day from the input based on MAX_DAY, MIN_DAY, MAX_MONTH and MIN_MONTH
- DAY_RANGE = range(1, 32)
- HOUR_RANGE = range(0, 24)
- MINUTE_RANGE = range(0, 60)
- MONTH_RANGE = range(1, 13)
- SECOND_RANGE = range(0, 60)
- add_day()[source]
Add a day to the input datetime.
- sub_day(n=1)[source]
Subtract a day from the input datetime.
- validate_input(input, name, range)[source]
- property value
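The calendar-agnostic day arithmetic can be sketched with the documented DAY_RANGE and MONTH_RANGE. `AnyCalendarDate` is a hypothetical, date-only stand-in; it assumes that the maximum and minimum day and month are the ends of those ranges and that rollover cascades from day to month to year.

```python
class AnyCalendarDate:
    """Hypothetical, date-only stand-in illustrating the rollover logic."""

    DAY_RANGE = range(1, 32)
    MONTH_RANGE = range(1, 13)

    def __init__(self, year, month, day):
        if month not in self.MONTH_RANGE or day not in self.DAY_RANGE:
            raise ValueError("month or day out of range")
        self.year, self.month, self.day = year, month, day

    def add_day(self):
        self.day += 1
        if self.day > self.DAY_RANGE[-1]:
            # Past the maximum day: wrap to the first day of the next month.
            self.day = self.DAY_RANGE[0]
            self.month += 1
            if self.month > self.MONTH_RANGE[-1]:
                self.month = self.MONTH_RANGE[0]
                self.year += 1

    def sub_day(self):
        self.day -= 1
        if self.day < self.DAY_RANGE[0]:
            # Before the minimum day: wrap to the last day of the previous month.
            self.day = self.DAY_RANGE[-1]
            self.month -= 1
            if self.month < self.MONTH_RANGE[0]:
                self.month = self.MONTH_RANGE[-1]
                self.year -= 1

    @property
    def value(self):
        return f"{self.year}-{self.month:02d}-{self.day:02d}"


d = AnyCalendarDate(2000, 12, 31)
d.add_day()
print(d.value)
```

Because every month is treated as having up to 31 days, a result such as day 30 of month 2 is possible in this sketch; it is not tied to any particular calendar.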
- roocs_utils.utils.time_utils.str_to_AnyCalendarDateTime(dt, defaults=None)[source]
Takes a string representing a date/time and returns an AnyCalendarDateTime object. String formats should start with the year and go through to the second, but anything from the month onwards can be omitted.
- Parameters
dt – (str) string representing a date/time.
defaults – (list) The default values to use for year, month, day, hour, minute and second if they cannot be parsed from the string. A default value must be provided for each component. If defaults=None, [-1, 1, 1, 0, 0, 0] is used.
- Returns
AnyCalendarDateTime object
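How missing components fall back to the defaults can be sketched as below. `parse_components` is a hypothetical function that returns a plain list rather than an AnyCalendarDateTime object, and the set of separators it splits on is an assumption.

```python
import re

DEFAULTS = [-1, 1, 1, 0, 0, 0]  # year, month, day, hour, minute, second


def parse_components(dt, defaults=None):
    """Hypothetical helper: return six integers parsed from a date/time string."""
    defaults = list(defaults) if defaults is not None else list(DEFAULTS)
    if len(defaults) != 6:
        raise ValueError("A default value is required for each of the 6 components.")
    # Split on separators commonly found in ISO-like strings; drop empty parts.
    parts = [p for p in re.split(r"[-T :]", dt.strip().rstrip("Z")) if p]
    if len(parts) > 6:
        raise ValueError(f"Too many components in: {dt!r}")
    values = [int(p) for p in parts]
    # Components missing from the string are taken from the defaults.
    return values + defaults[len(values):]


print(parse_components("2085-01-01T12:00:00Z"))
print(parse_components("2085"))
```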
- roocs_utils.utils.time_utils.to_isoformat(tm)[source]
Returns an ISO 8601 string from a time object (of different types).
- Parameters
tm – Time object
- Returns
(str) ISO 8601 time string
- class roocs_utils.utils.file_utils.FileMapper(file_list, dirpath=None)[source]
Bases:
object
Class to represent a set of files that exist in the same directory as one object.
- Parameters
file_list – the list of files to represent. If dirpath is not provided, these should be full file paths.
dirpath – The directory path where the files exist. Default is None.
If dirpath is not provided it will be deduced from the file paths provided in file_list.
- file_list
list of file names of the files represented.
- file_paths
list of full file paths of the files represented.
- dirpath
The directory path where the files exist. Either deduced or provided.
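The way dirpath can be deduced from full file paths is sketched below. `SimpleFileMapper` is a hypothetical stand-in, not the library class; it uses posixpath so that the example behaves the same on any platform, and raising an error for files in different directories is an assumption.

```python
import posixpath


class SimpleFileMapper:
    """Hypothetical stand-in for a set of files that share one directory."""

    def __init__(self, file_list, dirpath=None):
        if dirpath is None:
            # Deduce the directory from the full paths; they must all agree.
            dirs = {posixpath.dirname(path) for path in file_list}
            if len(dirs) != 1:
                raise ValueError("All files must be in the same directory.")
            dirpath = dirs.pop()
        self.dirpath = dirpath
        # Store bare file names and rebuild the full paths from dirpath.
        self.file_list = [posixpath.basename(path) for path in file_list]
        self.file_paths = [posixpath.join(dirpath, name) for name in self.file_list]


fm = SimpleFileMapper(["/data/tas/a.nc", "/data/tas/b.nc"])
print(fm.dirpath, fm.file_list)
```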
- roocs_utils.utils.file_utils.is_file_list(coll)[source]
Checks whether a collection is a list of files.
- Parameters
coll – (list) collection to check.
- Returns
True if collection is a list of files, else returns False.