{
"cells": [
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"import roocs_utils"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['AreaParameter',\n",
" 'CONFIG',\n",
" 'CollectionParameter',\n",
" 'LevelParameter',\n",
" 'TimeParameter',\n",
" '__author__',\n",
" '__builtins__',\n",
" '__cached__',\n",
" '__contact__',\n",
" '__copyright__',\n",
" '__doc__',\n",
" '__file__',\n",
" '__license__',\n",
" '__loader__',\n",
" '__name__',\n",
" '__package__',\n",
" '__path__',\n",
" '__spec__',\n",
" '__version__',\n",
" 'area_parameter',\n",
" 'base_parameter',\n",
" 'collection_parameter',\n",
" 'config',\n",
" 'exceptions',\n",
" 'get_config',\n",
" 'level_parameter',\n",
" 'parameter',\n",
" 'parameterise',\n",
" 'roocs_utils',\n",
" 'time_parameter',\n",
" 'utils',\n",
" 'xarray_utils']"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dir(roocs_utils)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parameters"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Parameters classes are used to parse inputs of collection, area, time and level used as arguments in the subsetting operation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The area values can be input as:\n",
"* A string of comma separated values: “0.,49.,10.,65” \n",
"* A sequence of strings: (“0”, “-10”, “120”, “40”) \n",
"* A sequence of numbers: [0, 49.5, 10, 65]"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'lon_bnds': (0.0, 10.0), 'lat_bnds': (49.0, 65.0)}\n",
"(0.0, 49.0, 10.0, 65.0)\n"
]
}
],
"source": [
"area = roocs_utils.AreaParameter(\"0.,49.,10.,65\")\n",
"\n",
"# the lat/lon bounds can be returned in a dictionary\n",
"print(area.asdict())\n",
"\n",
"# the values can be returned as a tuple\n",
"print(area.tuple)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A collection can be input as \n",
"* A string of comma separated values: “cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga,cmip5.output1.MPI-M.MPI-ESM-LR.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga” \n",
"* A sequence of strings: e.g. (“cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”,“cmip5.output1.MPI-M.MPI-ESM-LR.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga”)"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"('cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga', 'cmip5.output1.MPI-M.MPI-ESM-LR.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga')\n"
]
}
],
"source": [
"collection = roocs_utils.CollectionParameter(\"cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga,cmip5.output1.MPI-M.MPI-ESM-LR.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga\")\n",
"\n",
"# the collection ids can be returned as a tuple\n",
"print(collection.tuple)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Level can be input as:\n",
"* A string of slash separated values: “1000/2000” \n",
"* A sequence of strings: e.g. (“1000.50”, “2000.60”) A sequence of numbers: e.g. (1000.50, 2000.60)\n",
"\n",
"Level inputs should be a range of the levels you want to subset over"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'first_level': 1000.5, 'last_level': 2000.6}\n",
"(1000.5, 2000.6)\n"
]
}
],
"source": [
"level = roocs_utils.LevelParameter((1000.50, 2000.60))\n",
"\n",
"# the first and last level in the range provided can be returned in a dictionary\n",
"print(level.asdict())\n",
"\n",
"# the values can be returned as a tuple\n",
"print(level.tuple)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Time can be input as:\n",
"* A string of slash separated values: “2085-01-01T12:00:00Z/2120-12-30T12:00:00Z” \n",
"* A sequence of strings: e.g. (“2085-01-01T12:00:00Z”, “2120-12-30T12:00:00Z”)\n",
"\n",
"Time inputs should be the start and end of the time range you want to subset over"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'start_time': '2085-01-01T12:00:00+00:00', 'end_time': '2120-12-30T12:00:00+00:00'}\n",
"('2085-01-01T12:00:00+00:00', '2120-12-30T12:00:00+00:00')\n"
]
}
],
"source": [
"time = roocs_utils.TimeParameter(\"2085-01-01T12:00:00Z/2120-12-30T12:00:00Z\")\n",
"\n",
"# the first and last time in the range provided can be returned in a dictionary\n",
"print(time.asdict())\n",
"\n",
"# the values can be returned as a tuple\n",
"print(time.tuple)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Parameterise parameterises inputs to instances of parameter classes which allows them to be used throughout roocs."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'collection': Datasets to analyse:\n",
" cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga,\n",
" 'area': Area to subset over:\n",
" (0.0, 49.0, 10.0, 65.0),\n",
" 'level': Level range to subset over\n",
" first_level: 1000.5\n",
" last_level: 2000.6,\n",
" 'time': Time period to subset over\n",
" start time: 2085-01-01T12:00:00+00:00\n",
" end time: 2120-12-30T12:00:00+00:00}"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"roocs_utils.parameter.parameterise(\"cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga\", \"0.,49.,10.,65\", (1000.50, 2000.60), \"2085-01-01T12:00:00Z/2120-12-30T12:00:00Z\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Xarray utils"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Xarray utils can bu used to identify the main variable in a dataset as well as idnetifying the type of a coordinate or returning a coordinate based on an attribute or a type"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"from roocs_utils.xarray_utils import xarray_utils as xu\n",
"import xarray as xr"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"ds = xr.open_mfdataset(\"../tests/mini-esgf-data/test_data/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/atmos/Amon/r1i1p1/latest/tas/*.nc\", use_cftime=True, combine=\"by_coords\")"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"main var = tas\n"
]
},
{
"data": {
"text/html": [
"
2010-12-04T13:50:30Z altered by CMOR: Treated scalar dimension: 'height'. 2010-12-04T13:50:30Z altered by CMOR: replaced missing value flag (-1.07374e+09) with standard missing value (1e+20).