California Department of Water Resources Historical Data¶
-
ulmo.cdec.historical.
get_stations
()¶ Fetches information on all CDEC sites.
Returns: df : pandas DataFrame
a pandas DataFrame (indexed on site id) with station information.
-
ulmo.cdec.historical.
get_sensors
(sensor_id=None)¶ Gets a list of sensor ids as a DataFrame indexed on sensor number. Can be limited by a list of numbers.
Usage example:
from ulmo import cdec
# to get all available sensor info
sensors = cdec.historical.get_sensors()
# or to get just one sensor
sensor = cdec.historical.get_sensors([1])
Parameters: sensor_id : iterable of integers or None
Returns: df : pandas DataFrame
a pandas DataFrame of sensor information, indexed on sensor number.
-
ulmo.cdec.historical.
get_station_sensors
(station_ids=None, sensor_ids=None, resolutions=None)¶ Gets available sensors for the given stations, sensor ids and time resolutions. If no station ids are provided, all available stations will be used (this is not recommended, and will probably take a really long time).
The list can be limited by a list of sensor numbers, or time resolutions if you already know what you want. If none of the provided sensors or resolutions are available, an empty DataFrame will be returned for that station.
Usage example:
from ulmo import cdec
# to get all available sensors
available_sensors = cdec.historical.get_station_sensors(['NEW'])
Parameters: station_ids : iterable of strings or
None
sensor_ids : iterable of integers or
None
Use the get_sensors() function to see a list of available sensor numbers.
resolutions : iterable of strings or
None
Possible values are ‘event’, ‘hourly’, ‘daily’, and ‘monthly’ but not all of these time resolutions are available at every station.
Returns: dict : a python dict
a python dict with site codes as keys with values containing pandas DataFrames of available sensor numbers and metadata.
-
ulmo.cdec.historical.
get_data
(station_ids=None, sensor_ids=None, resolutions=None, start=None, end=None)¶ Downloads data for a set of CDEC station and sensor ids. If either is not provided, all available data will be downloaded. Be really careful with choosing hourly resolution as the data sets are big, and CDEC’s servers are slow as molasses in winter.
Usage example:
from ulmo import cdec
dat = cdec.historical.get_data(['PRA'], resolutions=['daily'])
Parameters: station_ids : iterable of strings or
None
sensor_ids : iterable of integers or
None
Use the get_sensors() function to see a list of available sensor numbers.
resolutions : iterable of strings or
None
Possible values are ‘event’, ‘hourly’, ‘daily’, and ‘monthly’ but not all of these time resolutions are available at every station.
Returns: dict : a python dict
a python dict with site codes as keys. Values will be nested dicts containing all of the sensor/resolution combinations.
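The nested return value can be awkward to scan. A minimal sketch of walking it, assuming the outer keys are station codes and the inner keys identify one sensor/resolution combination each (the exact inner key format is not specified here, and `flatten_results` is a hypothetical helper, not part of ulmo):

```python
def flatten_results(results):
    """Yield (station_code, series_key, values) triples from a nested dict.

    `results` is assumed to be shaped like the documented return value:
    {station_code: {series_key: values}}.
    """
    for station_code, series in results.items():
        for series_key, values in series.items():
            yield station_code, series_key, values


# Stand-in data with the assumed shape; real values would come from
# cdec.historical.get_data(...). The inner keys here are made up.
example = {"PRA": {"series-a": [1.0, 2.0], "series-b": [3.0]}}
rows = list(flatten_results(example))
```

Flattening first makes it easy to filter or report on individual series without nested loops at every call site.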
Climate Prediction Center Weekly Drought¶
Climate Prediction Center Weekly Drought Index dataset
-
ulmo.cpc.drought.
get_data
(state=None, climate_division=None, start=None, end=None, as_dataframe=False)¶ Retrieves data.
Parameters: state : None or str
If specified, results will be limited to the state corresponding to the given 2-character state code.
climate_division : None or int
If specified, results will be limited to the climate division.
start : None or date (see note on dates and times)
Results will be limited to those after the given date. Default is the start of the current calendar year.
end : None or date (see note on dates and times)
If specified, results will be limited to data before this date.
as_dataframe : bool
If False (default), a dict with a nested set of dicts will be returned with data indexed by state, then climate division. If True, then a pandas.DataFrame object will be returned. The pandas DataFrame is used internally, so setting this to True is a little bit faster as it skips a serialization step.
Returns: data : dict or pandas.DataFrame
A dict or pandas.DataFrame representing the data. See the
as_dataframe
parameter for more.
CUAHSI WaterOneFlow¶
ulmo.cuahsi.his_central¶
CUAHSI HIS Central web services
-
ulmo.cuahsi.his_central.
get_services
(bbox=None)¶ Retrieves a list of services.
Parameters: bbox :
None
or 4-tupleOptional argument for a bounding box that covers the area you want to look for services in. This should be a tuple containing (min_longitude, min_latitude, max_longitude, and max_latitude) with these values in decimal degrees. If not provided then the full set of services will be queried from HIS Central.
Returns: services_dicts : list
A list of dicts that each contain information on an individual service.
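The element order of the bounding box is easy to get wrong. A minimal sketch of building and sanity-checking the tuple before passing it as `bbox`; `make_bbox` is a hypothetical helper, not part of ulmo, and the coordinates are illustrative only:

```python
def make_bbox(min_longitude, min_latitude, max_longitude, max_latitude):
    """Return a (min_lon, min_lat, max_lon, max_lat) tuple in decimal degrees."""
    if not (-180 <= min_longitude <= max_longitude <= 180):
        raise ValueError("longitudes must satisfy -180 <= min <= max <= 180")
    if not (-90 <= min_latitude <= max_latitude <= 90):
        raise ValueError("latitudes must satisfy -90 <= min <= max <= 90")
    return (min_longitude, min_latitude, max_longitude, max_latitude)


bbox = make_bbox(-99.0, 29.5, -97.0, 31.0)
# The tuple would then be passed along, for example:
# services = ulmo.cuahsi.his_central.get_services(bbox=bbox)
```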
ulmo.cuahsi.wof¶
CUAHSI WaterOneFlow web services
-
ulmo.cuahsi.wof.
get_sites
(wsdl_url, suds_cache=('default', ))¶ Retrieves information on the sites that are available from a WaterOneFlow service using a GetSites request. For more detailed information including which variables and time periods are available for a given site, use
get_site_info().
Parameters: wsdl_url : str
URL of a service’s web service definition language (WSDL) description. All WaterOneFlow services publish a WSDL description and this url is the entry point to the service.
suds_cache : None or tuple
SOAP local cache duration for WSDL description and client object. Pass a cache duration tuple like (‘days’, 3) to set a custom duration. Duration may be in months, weeks, days, hours, or seconds. If unspecified, the default duration (1 day) will be used. Use None to turn off caching.
Returns: sites_dict : dict
a python dict with site codes mapped to site information
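The `suds_cache` argument takes three shapes: `None`, the default `('default',)` tuple, or a `(unit, amount)` pair. A minimal sketch of a pre-flight check on that argument; `check_suds_cache` is a hypothetical helper, not part of ulmo, and ulmo itself may validate the value differently:

```python
ALLOWED_UNITS = ("months", "weeks", "days", "hours", "seconds")


def check_suds_cache(suds_cache):
    """Validate a suds_cache argument and return it unchanged."""
    if suds_cache is None:
        # Caching is turned off.
        return None
    if suds_cache == ("default",):
        # Fall back to the documented default duration (1 day).
        return suds_cache
    unit, amount = suds_cache  # raises ValueError if not a 2-item sequence
    if unit not in ALLOWED_UNITS:
        raise ValueError("unknown cache duration unit: %r" % (unit,))
    if amount <= 0:
        raise ValueError("cache duration must be positive")
    return (unit, amount)


three_days = check_suds_cache(("days", 3))
```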
-
ulmo.cuahsi.wof.
get_site_info
(wsdl_url, site_code, suds_cache=('default', ))¶ Retrieves detailed site information from a WaterOneFlow service using a GetSiteInfo request.
Parameters: wsdl_url : str
URL of a service’s web service definition language (WSDL) description. All WaterOneFlow services publish a WSDL description and this url is the entry point to the service.
site_code : str
Site code of the site you’d like to get more information for. Site codes MUST contain the network and be of the form <network>:<site_code>, as is required by WaterOneFlow.
suds_cache : None or tuple
SOAP local cache duration for WSDL description and client object. Pass a cache duration tuple like (‘days’, 3) to set a custom duration. Duration may be in months, weeks, days, hours, or seconds. If unspecified, the default duration (1 day) will be used. Use None to turn off caching.
Returns: site_info : dict
a python dict containing site information
-
ulmo.cuahsi.wof.
get_values
(wsdl_url, site_code, variable_code, start=None, end=None, suds_cache=('default', ))¶ Retrieves site values from a WaterOneFlow service using a GetValues request.
Parameters: wsdl_url : str
URL of a service’s web service definition language (WSDL) description. All WaterOneFlow services publish a WSDL description and this url is the entry point to the service.
site_code : str
Site code of the site you’d like to get values for. Site codes MUST contain the network and be of the form <network>:<site_code>, as is required by WaterOneFlow.
variable_code : str
Variable code of the variable you’d like to get values for. Variable codes MUST contain the vocabulary and be of the form <vocabulary>:<variable_code>, as is required by WaterOneFlow.
start : None or datetime (see note on dates and times)
Start of a date range for a query. If both start and end parameters are omitted, the entire time series available will be returned.
end : None or datetime (see note on dates and times)
End of a date range for a query. If both start and end parameters are omitted, the entire time series available will be returned.
suds_cache : None or tuple
SOAP local cache duration for WSDL description and client object. Pass a cache duration tuple like (‘days’, 3) to set a custom duration. Duration may be in months, weeks, days, hours, or seconds. If unspecified, the default duration (1 day) will be used. Use None to turn off caching.
Returns: site_values : dict
a python dict containing values
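Both site codes and variable codes must carry a prefix separated by a colon. A minimal sketch of building such codes from their parts; `qualify` is a hypothetical helper, not part of ulmo, and the prefix and code values below are placeholders rather than real identifiers:

```python
def qualify(prefix, code):
    """Join a network (or vocabulary) prefix and a bare code with a colon."""
    if not prefix or not code:
        raise ValueError("both prefix and code are required")
    if ":" in code:
        raise ValueError("code %r already looks qualified" % (code,))
    return "%s:%s" % (prefix, code)


site_code = qualify("NETWORK", "SITE123")      # <network>:<site_code>
variable_code = qualify("VOCAB", "VAR456")     # <vocabulary>:<variable_code>
```

The resulting strings are what `get_site_info()` and `get_values()` expect for their `site_code` and `variable_code` arguments.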
-
ulmo.cuahsi.wof.
get_variable_info
(wsdl_url, variable_code=None, suds_cache=('default', ))¶ Retrieves variable information from a WaterOneFlow service using a GetVariableInfo request.
Parameters: wsdl_url : str
URL of a service’s web service definition language (WSDL) description. All WaterOneFlow services publish a WSDL description and this url is the entry point to the service.
variable_code : None or str
If None (default), then information on all variables will be returned; otherwise, this should be set to the variable code of the variable you’d like to get more information on. Variable codes MUST contain the vocabulary and be of the form <vocabulary>:<variable_code>, as is required by WaterOneFlow.
suds_cache : None or tuple
SOAP local cache duration for WSDL description and client object. Pass a cache duration tuple like (‘days’, 3) to set a custom duration. Duration may be in months, weeks, days, hours, or seconds. If unspecified, the default duration (1 day) will be used. Use None to turn off caching.
Returns: variable_info : dict
a python dict containing variable information. If variable_code is None (default), then this will be a nested set of dicts keyed by <vocabulary>:<variable_code>.
Lower Colorado River Authority (LCRA) Hydromet Data¶
-
ulmo.lcra.hydromet.
get_sites_by_type
(site_type)¶ Gets a list of the hydromet site codes and descriptions for a site type.
Parameters: site_type : str
In all but lake sites, this is the parameter code collected at the site. For lake sites, it is ‘lake’. See site_types and PARAMETERS.
Returns: sites_dict : dict
A python dict with four-character site codes mapped to site information.
-
ulmo.lcra.hydromet.
get_site_data
(site_code, parameter_code, as_dataframe=True, start_date=None, end_date=None, dam_site_location='head')¶ Fetches a site’s parameter data.
Parameters: site_code : str
The LCRA site code (four characters long) of the site you want to query data for.
parameter_code : str
LCRA parameter code. See PARAMETERS.
start_date : None or datetime
Start of a date range for a query.
end_date : None or datetime
End of a date range for a query.
as_dataframe : True (default) or False
This determines what format values are returned as. If True (default), then the values will be a pandas.DataFrame object with the value timestamps as the index. If False, the format will be a Python dictionary.
dam_site_location : ‘head’ (default) or ‘tail’
The site location relative to the dam.
Returns: df : pandas.DataFrame or values_dict : dict
-
ulmo.lcra.hydromet.
get_all_sites
()¶ Returns a list of all LCRA hydromet sites as a GeoJSON FeatureCollection.
-
ulmo.lcra.hydromet.
get_current_data
(service, as_geojson=False)¶ Fetches the current (near real-time) river stage and flow values from the LCRA web service.
Parameters: service : str
The web service providing data. See current_data_services. Currently we have GetUpperBasin and GetLowerBasin.
as_geojson : True or False (default)
If True, the data is returned as a GeoJSON FeatureCollection; if False, the data is returned as a list of dicts.
Returns: current_values_dicts : a list of dicts, or current_values_geojson : a GeoJSON FeatureCollection.
Lower Colorado River Authority (LCRA) Water Quality Data¶
-
ulmo.lcra.waterquality.
get_sites
(source_agency=None)¶ Fetches a list of sites with location and available metadata.
Parameters: source_agency : str
The code LCRA uses for the agency that collects the data. There are sites whose sources are not listed, so this filter may not return all sites of a certain source. See source_map.
Returns: sites_geojson : GeoJSON FeatureCollection
-
ulmo.lcra.waterquality.
get_historical_data
(site_code, start=None, end=None, as_dataframe=False)¶ Fetches data for a site at a given date.
Parameters: site_code : str
The site code to fetch data for. A list of sites can be retrieved with get_sites().
date : None or date (see note on dates and times)
The date of the data to be queried. If date is None (default), then all data will be returned.
as_dataframe : bool
This determines what format values are returned as. If False (default), the values dict will be a dict with timestamps as keys mapped to a dict of gauge variables and values. If True, then the values dict will be a pandas.DataFrame object containing the equivalent information.
Returns: data_dict : dict
A dict containing site information and values.
-
ulmo.lcra.waterquality.
get_recent_data
(site_code, as_dataframe=False)¶ Fetches near real-time instantaneous water quality data for the LCRA bay sites.
Parameters: site_code : str
The bay site to fetch data for. See real_time_sites.
as_dataframe : bool
This determines what format values are returned as. If False (default), the values will be a list of value dicts. If True, then values are returned as a pandas.DataFrame.
Returns: a list of values or a pandas.DataFrame.
National Climatic Data Center Climate Index Reference Sequential (CIRS)¶
National Climatic Data Center Climate Index Reference Sequential (CIRS) drought dataset
-
ulmo.ncdc.cirs.
get_data
(elements=None, by_state=False, location_names='abbr', as_dataframe=False, use_file=None)¶ Retrieves data.
Parameters: elements : None, str or list
The element(s) to get data for. If None (default), then all elements are used. An individual element is a string, but a list or tuple of them can be used to specify a set of elements. Elements are:
- ‘cddc’: Cooling Degree Days
- ‘hddc’: Heating Degree Days
- ‘pcpn’: Precipitation
- ‘pdsi’: Palmer Drought Severity Index
- ‘phdi’: Palmer Hydrological Drought Index
- ‘pmdi’: Modified Palmer Drought Severity Index
- ‘sp01’: 1-month Standardized Precipitation Index
- ‘sp02’: 2-month Standardized Precipitation Index
- ‘sp03’: 3-month Standardized Precipitation Index
- ‘sp06’: 6-month Standardized Precipitation Index
- ‘sp09’: 9-month Standardized Precipitation Index
- ‘sp12’: 12-month Standardized Precipitation Index
- ‘sp24’: 24-month Standardized Precipitation Index
- ‘tmpc’: Temperature
- ‘zndx’: Palmer Z-Index
by_state : bool
If False (default), divisional data will be retrieved. If True, then regional data will be retrieved.
location_names : str or
None
This parameter defines what (if any) type of names will be added to the values. If set to ‘abbr’ (default), then abbreviated location names will be used. If ‘full’, then full location names will be used. If set to None, then no location name will be added and the only identifier will be the location_codes (this is the most memory-conservative option).
as_dataframe : bool
If False (default), a list of values dicts is returned. If True, a dict with element codes mapped to equivalent pandas.DataFrame objects will be returned. The pandas DataFrame is used internally, so setting this to True is faster as it skips a somewhat expensive serialization step.
use_file : None, file-like object or str
If None (default), then data will be automatically retrieved from the web. If a file-like object or a file path string, then the file will be used to read data from. This is intended to be used for reading in previously-downloaded versions of the dataset.
Returns: data : list or pandas.DataFrame
A list of value dicts or a pandas.DataFrame containing data. See the
as_dataframe
parameter for more.
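Because `elements` may be `None`, a single string, or a list, callers often want to normalize it before use. A minimal sketch using the element codes listed above; `CIRS_ELEMENTS` and `normalize_elements` are hypothetical names for illustration and are not part of ulmo:

```python
# Element codes and names as listed in the parameter description above.
CIRS_ELEMENTS = {
    "cddc": "Cooling Degree Days",
    "hddc": "Heating Degree Days",
    "pcpn": "Precipitation",
    "pdsi": "Palmer Drought Severity Index",
    "phdi": "Palmer Hydrological Drought Index",
    "pmdi": "Modified Palmer Drought Severity Index",
    "sp01": "1-month Standardized Precipitation Index",
    "sp02": "2-month Standardized Precipitation Index",
    "sp03": "3-month Standardized Precipitation Index",
    "sp06": "6-month Standardized Precipitation Index",
    "sp09": "9-month Standardized Precipitation Index",
    "sp12": "12-month Standardized Precipitation Index",
    "sp24": "24-month Standardized Precipitation Index",
    "tmpc": "Temperature",
    "zndx": "Palmer Z-Index (ZNDX)",
}


def normalize_elements(elements=None):
    """Return a list of element codes from None, a str, or an iterable."""
    if elements is None:
        return sorted(CIRS_ELEMENTS)
    if isinstance(elements, str):
        elements = [elements]
    elements = list(elements)
    unknown = [code for code in elements if code not in CIRS_ELEMENTS]
    if unknown:
        raise ValueError("unknown element code(s): %s" % ", ".join(unknown))
    return elements


drought_only = normalize_elements(["pdsi", "phdi"])
```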
National Climatic Data Center Global Historical Climate Network Daily¶
National Climatic Data Center Global Historical Climate Network - Daily dataset
-
ulmo.ncdc.ghcn_daily.
get_data
(station_id, elements=None, update=True, as_dataframe=False)¶ Retrieves data for a given station.
Parameters: station_id : str
Station ID to retrieve data for.
elements : None, str, or list of str
If specified, limits the query to given element code(s).
update : bool
If True (default), new data files will be downloaded if they are newer than any previously cached files. If False, then previously downloaded files will be used and new files will only be downloaded if there is not a previously downloaded file for a given station.
as_dataframe : bool
If False (default), a dict with element codes mapped to value dicts is returned. If True, a dict with element codes mapped to equivalent pandas.DataFrame objects will be returned. The pandas DataFrame is used internally, so setting this to True is a little bit faster as it skips a serialization step.
Returns: site_dict : dict
A dict with element codes as keys, mapped to collections of values. See the
as_dataframe
parameter for more.
-
ulmo.ncdc.ghcn_daily.
get_stations
(country=None, state=None, elements=None, start_year=None, end_year=None, update=True, as_dataframe=False)¶ Retrieves station information, optionally limited to specific parameters.
Parameters: country : str
The country code to use to limit station results. If set to None (default), then stations from all countries are returned.
state : str
The state code to use to limit station results. If set to None (default), then stations from all states are returned.
elements : None, str, or list of str
If specified, station results will be limited to the given element codes and only stations that have data for any of these elements will be returned.
start_year : int
If specified, station results will be limited to contain only stations that have data after this year. Can be combined with the end_year argument to get stations with data within a range of years.
end_year : int
If specified, station results will be limited to contain only stations that have data before this year. Can be combined with the start_year argument to get stations with data within a range of years.
update : bool
If True (default), new data files will be downloaded if they are newer than any previously cached files. If False, then previously downloaded files will be used and new files will only be downloaded if there is not a previously downloaded file for a given station.
as_dataframe : bool
If False (default), a dict with station IDs keyed to station dicts is returned. If True, a single pandas.DataFrame object will be returned. The pandas DataFrame is used internally, so setting this to True is a little bit faster as it skips a serialization step.
Returns: stations_dict : dict or pandas.DataFrame
A dict or pandas.DataFrame representing station information for stations matching the arguments. See the
as_dataframe
parameter for more.
National Climatic Data Center Global Summary of the Day¶
National Climatic Data Center Global Summary of the Day dataset
-
ulmo.ncdc.gsod.
get_data
(station_codes, start=None, end=None, parameters=None)¶ Retrieves data for a set of stations.
Parameters: station_codes : str or list
Single station code or iterable of station codes to retrieve data for.
start : None or date (see note on dates and times)
If specified, data are limited to values after this date.
end : None or date (see note on dates and times)
If specified, data are limited to values before this date.
parameters : None, str or list
If specified, data are limited to this set of parameter codes.
Returns: data_dict : dict
Dict with station codes keyed to lists of value dicts.
-
ulmo.ncdc.gsod.
get_stations
(country=None, state=None, start=None, end=None, update=True)¶ Retrieve information on the set of available stations.
Parameters: country : {None, str, or iterable}
If specified, results will be limited to stations with matching country codes.
state : {None, str, or iterable}
If specified, results will be limited to stations with matching state codes.
start : None or date (see note on dates and times)
If specified, results will be limited to stations which have data after this start date.
end : None or date (see note on dates and times)
If specified, results will be limited to stations which have data before this end date.
update : bool
If True (default), check for a newer copy of the stations file and download it if it is newer than the previously downloaded copy. If False, then a new stations file will only be downloaded if a previously downloaded file cannot be found.
Returns: stations_dict : dict
A dict with USAF-WBAN codes keyed to station information dicts.
Texas Weather Connection Daily Keetch-Byram Drought Index (KBDI)¶
ulmo.twc.kbdi.core¶
This module provides direct access to Texas Weather Connection Daily Keetch-Byram Drought Index (KBDI) dataset.
-
ulmo.twc.kbdi.
get_data
(county=None, start=None, end=None, as_dataframe=False, data_dir=None)¶ Retrieves data.
Parameters: county : None or str
If specified, results will be limited to the county corresponding to the given 5-character Texas county FIPS code, i.e. 48???.
end : None or date (see note on dates and times)
Results will be limited to data on or before this date. Default is the current date.
start : None or date (see note on dates and times)
Results will be limited to data on or after this date. Default is the start of the calendar year for the end date.
as_dataframe : bool
If False (default), a dict with a nested set of dicts will be returned with data indexed by 5-character Texas county FIPS code. If True, then a pandas.DataFrame object will be returned. The pandas DataFrame is used internally, so setting this to True is a little bit faster as it skips a serialization step.
data_dir : None or directory path
Directory for holding downloaded data files. If no path is provided (default), then a user-specific directory for holding application data will be used (the directory will depend on the platform/operating system).
Returns: data : dict or pandas.DataFrame
A dict or pandas.DataFrame representing the data. See the
as_dataframe
parameter for more.
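The `county` argument expects a 5-character code that starts with the Texas state prefix 48. A minimal sketch of checking that shape before making a request; `is_texas_county_fips` is a hypothetical helper, not part of ulmo, and it only checks the format, not whether the county actually exists:

```python
import re


def is_texas_county_fips(code):
    """Return True if code is five digits beginning with 48."""
    return re.fullmatch(r"48\d{3}", str(code)) is not None


ok = is_texas_county_fips("48453")         # right shape
wrong_state = is_texas_county_fips("06037")  # five digits, other prefix
too_short = is_texas_county_fips("4845")
```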
US Army Corps of Engineers - Tulsa District Water Control¶
United States Army Corps of Engineers Tulsa District Water Control
-
ulmo.usace.swtwc.
get_stations
()¶ Fetches a list of station codes and descriptions.
Returns: stations_dict : dict
a python dict with station codes mapped to station information
-
ulmo.usace.swtwc.
get_station_data
(station_code, date=None, as_dataframe=False)¶ Fetches data for a station at a given date.
Parameters: station_code: str
The station code to fetch data for. A list of stations can be retrieved with
get_stations()
date : None or date (see note on dates and times)
The date of the data to be queried. If date is None (default), then data for the current day is retrieved.
as_dataframe : bool
This determines what format values are returned as. If False (default), the values dict will be a dict with timestamps as keys mapped to a dict of gauge variables and values. If True, then the values dict will be a pandas.DataFrame object containing the equivalent information.
Returns: data_dict : dict
A dict containing station information and values.
USGS National Water Information System¶
USGS National Water Information System web services
-
ulmo.usgs.nwis.
get_sites
(sites=None, state_code=None, huc=None, bounding_box=None, county_code=None, parameter_code=None, site_type=None, service=None, input_file=None, **kwargs)¶ Fetches site information from USGS services. See the USGS Site Service documentation for a detailed description of options. For convenience, major options have been included with pythonic names. Options that are not listed below may be provided as extra kwargs (i.e. keyword=’argument’) and will be passed along with the web services request. These extra keywords must match the USGS names exactly. The USGS Site Service website describes available keyword names and argument formats.
Note
Only the options listed below have been tested and you may have mixed results retrieving data with extra options specified. Currently ulmo requests and parses data in the waterml format. Some options are not available in this format.
Parameters: service : {None, ‘instantaneous’, ‘iv’, ‘daily’, ‘dv’}
The service to use, either “instantaneous”, “daily”, or None (default). If set to None, then both services are used. The abbreviations “iv” and “dv” can be used for “instantaneous” and “daily”, respectively.
input_file : None, file path or file object
If None (default), then the NWIS web services will be queried, but if a file is passed then this file will be used instead of requesting data from the NWIS web services.
Returns: sites_dict : dict
a python dict with site codes mapped to site information
-
ulmo.usgs.nwis.
get_site_data
(site_code, service=None, parameter_code=None, statistic_code=None, start=None, end=None, period=None, modified_since=None, input_file=None, methods=None, **kwargs)¶ Fetches site data.
Parameters: site_code : str
The site code of the site you want to query data for.
service : {None, ‘instantaneous’, ‘iv’, ‘daily’, ‘dv’}
The service to use, either “instantaneous”, “daily”, or None (default). If set to None, then both services are used. The abbreviations “iv” and “dv” can be used for “instantaneous” and “daily”, respectively.
parameter_code : str
Parameter code(s) that will be passed as the parameterCd parameter.
statistic_code : str
Statistic code(s) that will be passed as the statCd parameter.
start : None or datetime (see note on dates and times)
Start of a date range for a query. This parameter is mutually exclusive with period (you cannot use both).
end : None or datetime (see note on dates and times)
End of a date range for a query. This parameter is mutually exclusive with period (you cannot use both).
period : {None, str, datetime.timedelta}
Period of time to use for requesting data. This will be passed along as the period parameter. This can either be ‘all’ to signal that you’d like the entire period of record, or a string in ISO 8601 period format (e.g. ‘P1Y2M21D’ for a period of one year, two months and 21 days), or it can be a datetime.timedelta object representing the period of time. This parameter is mutually exclusive with start/end dates.
modified_since : None or datetime.timedelta
Passed along as the modifiedSince parameter.
input_file : None, file path or file object
If None (default), then the NWIS web services will be queried, but if a file is passed then this file will be used instead of requesting data from the NWIS web services.
methods : None, str or Python dict
If None (default), it’s assumed that there is a single method for each parameter. This raises an error if more than one method id is encountered. If str, this is the method id for the requested parameter(s); “all” can be used if method ids are not known beforehand. If dict, provide the parameter_code to method id mapping. A parameter’s method id is specific to the site.
Returns: data_dict : dict
a python dict with parameter codes mapped to value dicts
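The `period` argument accepts either an ISO 8601 period string or a `datetime.timedelta`, and cannot be combined with `start`/`end`. A minimal sketch of how the two forms relate and of the mutual-exclusion rule; `to_iso8601_period` and `check_time_args` are hypothetical helpers, not part of ulmo, and this is not ulmo's own conversion (ulmo accepts the timedelta directly):

```python
from datetime import timedelta


def to_iso8601_period(delta):
    """Format a non-negative timedelta as an ISO 8601 period string.

    Only days, hours, minutes and whole seconds are emitted;
    microseconds are dropped.
    """
    if delta < timedelta(0):
        raise ValueError("period must not be negative")
    hours, remainder = divmod(delta.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    date_part = "%dD" % delta.days if delta.days else ""
    time_part = "".join(
        "%d%s" % (amount, unit)
        for amount, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if amount
    )
    if not date_part and not time_part:
        return "PT0S"
    return "P" + date_part + ("T" + time_part if time_part else "")


def check_time_args(start=None, end=None, period=None):
    """Raise if period is combined with start or end."""
    if period is not None and (start is not None or end is not None):
        raise ValueError("period is mutually exclusive with start/end")


one_week = to_iso8601_period(timedelta(days=7))
mixed = to_iso8601_period(timedelta(days=1, hours=2, minutes=30))
```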
-
ulmo.usgs.nwis.hdf5.
get_site
(site_code, path=None, complevel=None, complib=None)¶ Fetches previously-cached site information from an hdf5 file.
Parameters: site_code : str
The site code of the site you want to get information for.
path : None or file path
Path to the hdf5 file to be queried; if None, then the default path will be used. If a file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking.
complevel : None or int {0-9}
Open hdf5 file with this level of compression. If None (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is.
complib : None or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}
Open hdf5 file with this type of compression. If None (default), then the best compression library available on your system will be selected. If the complevel argument is set to 0 then no compression will be used.
Returns: site_dict : dict
a python dict containing site information
-
ulmo.usgs.nwis.hdf5.
get_site_data
(site_code, agency_code=None, parameter_code=None, path=None, complevel=None, complib=None)¶ Fetches previously-cached site data from an hdf5 file.
Parameters: site_code : str
The site code of the site you want to get data for.
agency_code : None or str
The agency code to get data for. This will need to be set if a site code is in use by multiple agencies (this is rare).
parameter_code : None, str, or list
List of parameters to read. If None (default), read all parameters. Otherwise only read specified parameters. Parameters should be specified with statistic code, i.e. daily streamflow is ‘00060:00003’.
path : None or file path
Path to the hdf5 file to be queried; if None, then the default path will be used. If a file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking.
complevel : None or int {0-9}
Open hdf5 file with this level of compression. If None (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is.
complib : None or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}
Open hdf5 file with this type of compression. If None (default), then the best compression library available on your system will be selected. If the complevel argument is set to 0 then no compression will be used.
Returns: data_dict : dict
a python dict with parameter codes mapped to value dicts
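Cached parameters are addressed with the parameter code and statistic code joined by a colon, as in ‘00060:00003’. A minimal sketch of composing and splitting such keys; `with_statistic` and `split_code` are hypothetical helpers for illustration, not part of ulmo:

```python
def with_statistic(parameter_code, statistic_code):
    """Join a parameter code and a statistic code, e.g. '00060:00003'."""
    return "%s:%s" % (parameter_code, statistic_code)


def split_code(code):
    """Split 'parameter:statistic' into its parts.

    The statistic part is None when the code carries no colon.
    """
    parameter_code, _, statistic_code = code.partition(":")
    return parameter_code, statistic_code or None


daily_streamflow = with_statistic("00060", "00003")
parts = split_code(daily_streamflow)
```

The combined string is the form to pass as `parameter_code` when reading from the hdf5 cache.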
-
ulmo.usgs.nwis.hdf5.
get_sites
(path=None, complevel=None, complib=None)¶ Fetches previously-cached site information from an hdf5 file.
Parameters: path : None or file path
Path to the hdf5 file to be queried; if None, then the default path will be used. If a file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking.
complevel : None or int {0-9}
Open hdf5 file with this level of compression. If None (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is.
complib : None or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}
Open hdf5 file with this type of compression. If None (default), then the best compression library available on your system will be selected. If the complevel argument is set to 0 then no compression will be used.
Returns: sites_dict : dict
a python dict with site codes mapped to site information
-
ulmo.usgs.nwis.hdf5.
remove_values
(site_code, datetime_dicts, path=None, complevel=None, complib=None, autorepack=True)¶ Remove values from hdf5 file.
Parameters: site_code : str
The site code of the site to remove records from.
datetime_dicts : a python dict with a list of datetimes for a given variable
(key) to set as NaNs.
path : file path to hdf5 file.
Returns: None :
None
-
ulmo.usgs.nwis.hdf5.
repack
(path, complevel=None, complib=None)¶ Repack the hdf5 file at path. This is the same as running the pytables ptrepack command on the file.
Parameters: path : file path
Path to the hdf5 file.
complevel : None or int {0-9}
Open hdf5 file with this level of compression. If None (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is.
complib : None or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}
Open hdf5 file with this type of compression. If None (default), then the best compression library available on your system will be selected. If the complevel argument is set to 0 then no compression will be used.
Returns: None :
None
-
ulmo.usgs.nwis.hdf5.
update_site_data
(site_code, start=None, end=None, period=None, path=None, methods=None, input_file=None, complevel=None, complib=None, autorepack=True)¶ Update cached site data.
Parameters: site_code : str
The site code of the site you want to query data for.
start :
None
or datetime (see note on dates and times)Start of a date range for a query. This parameter is mutually exclusive with period (you cannot use both).
end :
None
or datetime (see note on dates and times)End of a date range for a query. This parameter is mutually exclusive with period (you cannot use both).
period : {
None
, str, datetime.timedelta}Period of time to use for requesting data. This will be passed along as the period parameter. This can either be ‘all’ to signal that you’d like the entire period of record, or string in ISO 8601 period format (e.g. ‘P1Y2M21D’ for a period of one year, two months and 21 days) or it can be a datetime.timedelta object representing the period of time. This parameter is mutually exclusive with start/end dates.
path : None or file path
Path to the hdf5 file to be queried; if None then the default path will be used. If the file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking.
methods : None, str or Python dict
If None (default), it’s assumed that there is a single method for each parameter. This raises an error if more than one method id is encountered. If str, this is the method id for the requested parameter(s) and can be “all” if method ids are not known beforehand. If dict, provide the parameter_code to method id mapping. A parameter’s method id is specific to the site.
input_file : None, file path or file object
If None (default), then the NWIS web services will be queried, but if a file is passed then this file will be used instead of requesting data from the NWIS web services.
autorepack : bool
Whether or not to automatically repack the h5 file(s) after updating. There is a tradeoff between performance and disk space here: large files take a longer time to repack but also tend to grow larger faster. The default of True conserves disk space because untamed file growth can become quite destructive. If you set this to False, you can manually repack files with repack().
Returns: None :
None
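Usage sketch: because start/end and period are mutually exclusive, it can help to check the arguments before calling update_site_data(). The helper names build_update_kwargs and refresh_site, the site code, and the file name in the commented call are hypothetical; only update_site_data(), repack() and their parameters are taken from the signatures above.

```python
def build_update_kwargs(start=None, end=None, period=None):
    """Return keyword arguments for update_site_data().

    Hypothetical helper: it only enforces the documented rule that
    period cannot be combined with start/end.
    """
    if period is not None and (start is not None or end is not None):
        raise ValueError("period is mutually exclusive with start/end")
    kwargs = {}
    if period is not None:
        kwargs["period"] = period
    if start is not None:
        kwargs["start"] = start
    if end is not None:
        kwargs["end"] = end
    return kwargs


def refresh_site(site_code, path, **date_args):
    """Update one site's cache with autorepack off, then repack once."""
    # Imported lazily so build_update_kwargs works without ulmo installed.
    from ulmo.usgs.nwis import hdf5

    kwargs = build_update_kwargs(**date_args)
    hdf5.update_site_data(site_code, path=path, autorepack=False, **kwargs)
    hdf5.repack(path)


# Example call (needs ulmo and network access; the values are illustrative):
# refresh_site("01585200", "nwis_cache.h5", period="P7D")
```

Turning autorepack off and repacking once at the end is one way to handle the tradeoff described for autorepack when several updates are run in a row.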
-
ulmo.usgs.nwis.hdf5.
update_site_list
(sites=None, state_code=None, huc=None, bounding_box=None, county_code=None, parameter_code=None, site_type=None, service=None, input_file=None, complevel=None, complib=None, autorepack=True, path=None, **kwargs)¶ Update cached site information.
See ulmo.usgs.nwis.core.get_sites() for description of regular parameters, only extra parameters used for caching are listed below.
Parameters: path : None or file path
Path to the hdf5 file to be queried; if None then the default path will be used. If the file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking.
input_file : None, file path or file object
If None (default), then the NWIS web services will be queried, but if a file is passed then this file will be used instead of requesting data from the NWIS web services.
complevel : None or int {0-9}
Open hdf5 file with this level of compression. If None (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is.
complib : None or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}
Open hdf5 file with this type of compression. If None (default) then the best compression library available on your system will be selected. If the complevel argument is set to 0 then no compression will be used.
autorepack : bool
Whether or not to automatically repack the h5 file after updating. There is a tradeoff between performance and disk space here: large files take a longer time to repack but also tend to grow larger faster. The default of True conserves disk space because untamed file growth can become quite destructive. If you set this to False, you can manually repack files with repack().
Returns: None :
None
USGS Emergency Data Distribution Network services¶
-
ulmo.usgs.eddn.
get_data
(dcp_address, start=None, end=None, networklist='', channel='', spacecraft='Any', baud='Any', electronic_mail='', dcp_bul='', glob_bul='', timing='', retransmitted='Y', daps_status='N', use_cache=False, cache_path=None, as_dataframe=True)¶ Fetches GOES Satellite DCP messages from USGS Emergency Data Distribution Network.
Parameters: dcp_address : str, iterable of strings
DCP address or list of DCP addresses to be fetched; lists will be joined by a ‘,’.
start : {None, str, datetime, datetime.timedelta}
If None (default) then the start time is 2 days prior (or the date of the last data if the cache is used). If a datetime or datetime-like string is specified, it will be used as the start date. If a timedelta or a string in ISO 8601 period format (e.g. ‘P2D’ for a period of 2 days) is specified, then ‘now’ minus the timedelta will be used as the start. NOTE: The EDDN service does not specify how far back data is available. The service also imposes a maximum data limit of 25000 characters. If this limit is reached, multiple requests will be made until all available data is downloaded.
end : {None, str, datetime, datetime.timedelta}
If None (default) then the end time is ‘now’. If a datetime or datetime-like string is specified, it will be used as the end date. If a timedelta or a string in ISO 8601 period format (e.g. ‘P2D’ for a period of 2 days) is specified, then ‘now’ minus the timedelta will be used as the end. NOTE: The EDDN service does not specify how far back data is available. The service also imposes a maximum data limit of 25000 characters.
networklist : str,
‘’ (default). Filter by network.
channel : str,
‘’ (default). Filter by channel.
spacecraft : str,
East, West, Any (default). Filter by GOES East/West Satellite
baud : str,
‘Any’ (default). Filter by baud rate. See http://eddn.usgs.gov/msgaccess.html for options
electronic_mail : str,
‘’ (default) or ‘Y’
dcp_bul : str,
‘’ (default) or ‘Y’
glob_bul : str,
‘’ (default) or ‘Y’
timing : str,
‘’ (default) or ‘Y’
retransmitted : str,
‘Y’ (default) or ‘N’
daps_status : str,
‘N’ (default) or ‘Y’
use_cache : bool,
If True, use hdf file to cache data and retrieve new data on subsequent requests. The default, per the signature above, is False.
cache_path : {None, str},
If None, use the default ulmo location for cached files; otherwise use the specified path. Files are named using dcp_address.
as_dataframe : bool
If True (default) return data in a pandas dataframe otherwise return a dict.
Returns: message_data : {pandas.DataFrame, dict}
Either a pandas dataframe or a dict indexed by dcp message times
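The way start is interpreted can be illustrated with a small sketch. This is not ulmo's implementation: resolve_start and parse_period are hypothetical helpers that follow the rules described above (None means two days before now, a timedelta or ISO 8601 period is subtracted from now, anything else is treated as a date). The period parser is deliberately simplified to weeks, days, hours, minutes and seconds, and the cached "date of last data" case is not modelled.

```python
import datetime
import re

# Simplified ISO 8601 duration: weeks, days, hours, minutes, seconds only.
_PERIOD = re.compile(
    r"^P(?:(\d+)W)?(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$"
)


def parse_period(text):
    """Return a timedelta for strings such as 'P2D' or 'PT6H', else None."""
    match = _PERIOD.match(text)
    if not match or not any(match.groups()):
        return None
    weeks, days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return datetime.timedelta(
        weeks=weeks, days=days, hours=hours, minutes=minutes, seconds=seconds
    )


def resolve_start(start, now):
    """Turn a start argument into a concrete datetime relative to now."""
    if start is None:
        return now - datetime.timedelta(days=2)
    if isinstance(start, datetime.timedelta):
        return now - start
    if isinstance(start, datetime.datetime):
        return start
    period = parse_period(start)
    if period is not None:
        return now - period
    # Fall back to an ISO-style date string such as '2024-01-01'.
    return datetime.datetime.fromisoformat(start)
```

The same logic would apply to end, with ‘now’ as the value used for None.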
-
ulmo.usgs.eddn.
decode
(dataframe, parser, **kwargs)¶ decodes dcp message data in pandas dataframe returned by ulmo.usgs.eddn.get_data().
Parameters: dataframe : pandas.DataFrame
pandas.DataFrame returned by ulmo.usgs.eddn.get_data()
parser : {function, str}
function that acts on the dcp_message of each row of the dataframe and returns a new dataframe containing several rows of decoded data. This returned dataframe may have different (but derived) timestamps than the original row. If a string is passed then a matching parser function is looked up from ulmo.usgs.eddn.parsers
Returns: decoded_data : pandas.DataFrame
pandas dataframe, the format and parameters in the returned dataframe depend wholly on the parser used
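A custom parser might look like the following sketch. The message layout (comma-separated readings at a fixed interval, newest last) is invented purely for illustration, and the assumptions that each row exposes a ‘dcp_message’ field and that the row's name is the message timestamp are not confirmed by the text above. The decoding step is kept separate from the DataFrame wrapping so it can be checked on its own.

```python
import datetime


def decode_fixed_interval(message, message_time, interval_minutes=15):
    """Decode a hypothetical 'v1,v2,v3' message, newest value last.

    Returns (timestamp, value) pairs; the last value is stamped with
    message_time and earlier values step back by interval_minutes.
    """
    values = [float(part) for part in message.split(",") if part.strip()]
    step = datetime.timedelta(minutes=interval_minutes)
    newest = len(values) - 1
    return [
        (message_time - (newest - i) * step, value)
        for i, value in enumerate(values)
    ]


def fixed_interval_parser(row, interval_minutes=15):
    """Parser sketch for decode(): one input row in, several decoded rows out.

    Assumes row['dcp_message'] holds the raw message and row.name is the
    message timestamp (both are assumptions).
    """
    import pandas as pd  # lazy import; only needed alongside ulmo

    pairs = decode_fixed_interval(row["dcp_message"], row.name, interval_minutes)
    frame = pd.DataFrame(pairs, columns=["timestamp", "value"])
    return frame.set_index("timestamp")


# Possible use (needs ulmo, pandas and network access):
# decoded = ulmo.usgs.eddn.decode(message_data, fixed_interval_parser)
```

This shows how the returned rows can carry timestamps derived from, but different to, the timestamp of the original message.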
USGS Earth Resources Observation Systems (EROS) services¶
Earth Resources Observation and Science (EROS) Center application services (Raster)
-
ulmo.usgs.eros.
get_available_datasets
(bbox, attrs=None, as_dataframe=True)¶ retrieve available datasets for a given bounding box.
Parameters: bbox : (sequence of float|str)
bounding box, in geographic coordinates, of the area to download tiles for, in the format (min longitude, min latitude, max longitude, max latitude)
attrs: comma separated list of str
metadata attributes to retrieve, None (default) retrieves all
as_dataframe : True (default) or False
if True return pandas dataframe
Returns: datasets : dict or pandas DataFrame
returns available datasets
-
ulmo.usgs.eros.
get_themes
(as_dataframe=True)¶ retrieve list of data themes available
Parameters: as_dataframe : True (default) or False
if True return pandas dataframe
Returns: available data themes
-
ulmo.usgs.eros.
get_attribute_list
(as_dataframe=True)¶ retrieve list of metadata attributes for dataset
Parameters: as_dataframe : True (default) or False
if True return pandas dataframe
Returns: available metadata attributes
-
ulmo.usgs.eros.
get_available_formats
(product_key, as_dataframe=True)¶ retrieve list of data formats available for dataset
Parameters: product_key : str
dataset name. (see get_available_datasets for list)
as_dataframe : True (default) or False
if True return pandas dataframe
Returns: available data formats
-
ulmo.usgs.eros.
get_raster
(product_key, bbox, fmt=None, path=None, check_modified=False, mosaic=False)¶ downloads National Elevation Dataset raster tiles that cover the given bounding box for the specified data layer.
Parameters: product_key : str
dataset name. (see get_available_datasets for list)
bbox : (sequence of float|str)
bounding box, in geographic coordinates, of the area to download tiles for, in the format (min longitude, min latitude, max longitude, max latitude)
fmt : None or str
available formats vary in different datasets. If None, preference will be given to geotiff and then img, followed by whatever fmt is available
path : None or path
if None default path will be used
update_cache : True or False (default)
if False then tiles will not be re-downloaded if they exist in the path
check_modified : True or False (default)
if tile exists in path, check if newer file exists online and download if available.
mosaic : True or False (default)
if True, mosaic and clip downloaded tiles to the extents of the bbox provided. Requires rasterio package and GDAL.
Returns: raster_tiles : geojson FeatureCollection
metadata as a FeatureCollection. local url of downloaded data is in feature[‘properties’][‘file’]
-
ulmo.usgs.eros.
get_raster_availability
(product_key, bbox, fmt=None)¶ retrieve metadata for raster tiles that cover the given bounding box for the specified data layer.
Parameters: product_key : str
dataset layer name. (see get_available_layers for list)
bbox : (sequence of float|str)
bounding box, in geographic coordinates, of the area to download tiles for, in the format (min longitude, min latitude, max longitude, max latitude)
fmt : str
desired data format. if None, geotiff followed by img will be given preference
Returns: metadata : geojson FeatureCollection
returns metadata including download urls as a FeatureCollection
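Since bbox may be given as floats or strings and its order (longitude first) is easy to get wrong, a small check before calling the raster functions can be useful. normalize_bbox is a hypothetical helper, not part of ulmo, and it does not handle boxes that cross the antimeridian.

```python
def normalize_bbox(bbox):
    """Return (min_lon, min_lat, max_lon, max_lat) as floats.

    Raises ValueError if the sequence has the wrong length or the values
    are out of range or in the wrong order.
    """
    values = tuple(float(v) for v in bbox)
    if len(values) != 4:
        raise ValueError("bbox needs exactly four values")
    min_lon, min_lat, max_lon, max_lat = values
    if not (-180 <= min_lon <= max_lon <= 180):
        raise ValueError("longitudes must satisfy -180 <= min <= max <= 180")
    if not (-90 <= min_lat <= max_lat <= 90):
        raise ValueError("latitudes must satisfy -90 <= min <= max <= 90")
    return values


# Possible use (needs ulmo and network access; product_key is a placeholder):
# from ulmo.usgs import eros
# metadata = eros.get_raster_availability(product_key, normalize_bbox(bbox))
```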
USGS National Elevation Dataset (NED) services¶
National Elevation Dataset services (Raster)
-
ulmo.usgs.ned.
get_available_layers
()¶ return list of available data layers
-
ulmo.usgs.ned.
get_raster
(layer, bbox, path=None, check_modified=False, mosaic=False)¶ downloads National Elevation Dataset raster tiles that cover the given bounding box for the specified data layer.
Parameters: layer : str
dataset layer name. (see get_available_layers for list)
bbox : (sequence of float|str)
bounding box, in geographic coordinates, of the area to download tiles for, in the format (min longitude, min latitude, max longitude, max latitude)
path : None or path
if None default path will be used
check_modified : True or False (default)
if tile exists in path, check if newer file exists online and download if available.
mosaic : True or False (default)
if True, mosaic and clip downloaded tiles to the extents of the bbox provided. Requires rasterio package and GDAL.
Returns: raster_tiles : geojson FeatureCollection
metadata as a FeatureCollection. local url of downloaded data is in feature[‘properties’][‘file’]
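To collect the downloaded file locations from the returned FeatureCollection, something like the sketch below may work. It assumes the return value behaves like a plain GeoJSON dict with a ‘features’ list; local_files is a hypothetical helper.

```python
def local_files(feature_collection):
    """List the 'file' property of every feature that has one."""
    return [
        feature["properties"]["file"]
        for feature in feature_collection.get("features", [])
        if "file" in feature.get("properties", {})
    ]


# Possible use (needs ulmo and network access):
# from ulmo.usgs import ned
# layer = ned.get_available_layers()[0]
# tiles = ned.get_raster(layer, (-97.8, 30.2, -97.6, 30.4))
# print(local_files(tiles))
```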
-
ulmo.usgs.ned.
get_raster_availability
(layer, bbox)¶ retrieve metadata for raster tiles that cover the given bounding box for the specified data layer.
Parameters: layer : str
dataset layer name. (see get_available_layers for list)
bbox : (sequence of float|str)
bounding box, in geographic coordinates, of the area to download tiles for, in the format (min longitude, min latitude, max longitude, max latitude)
Returns: metadata : geojson FeatureCollection
returns metadata including download urls as a FeatureCollection
note on dates and times¶
Dates and times can be provided in a few different ways, depending on what is convenient. They can be given either as a string representation or as instances of date and datetime objects from Python’s datetime standard library module. For strings, the ISO 8601 format (‘YYYY-mm-dd HH:MM:SS’ or some abbreviated version) is accepted, as well as dates in ‘mm/dd/YYYY’ format.
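The accepted inputs can be illustrated with a short sketch. This is not ulmo's own parser: to_datetime is a hypothetical helper, and the list of abbreviated ISO forms it tries is an assumption about what "some abbreviated version" covers.

```python
import datetime

# Formats tried in order; the abbreviated ISO variants are assumptions.
_FORMATS = (
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M",
    "%Y-%m-%d",
    "%m/%d/%Y",
)


def to_datetime(value):
    """Convert a date, datetime or supported date string to a datetime."""
    # Check datetime first: datetime is a subclass of date.
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, datetime.date):
        return datetime.datetime(value.year, value.month, value.day)
    for fmt in _FORMATS:
        try:
            return datetime.datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognized date string: %r" % (value,))
```

All of ‘2013-05-01 06:30:00’, ‘2013-05-01’ and ‘05/01/2013’ are handled by this helper, as are date and datetime objects.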