Datetime Information

dataretrieval attempts to normalize time data to UTC when converting web service data into dataframes. To do this, it uses built-in pandas functions: pandas.to_datetime() during the initial conversion to datetime objects, or pandas.DataFrame.tz_localize() if the datetime objects already exist but are not UTC-localized. In most cases (single-site and multi-site), dataretrieval assigns the datetime information as the dataframe index. The exception is when only incomplete datetime information is available; in these cases integers are used as the dataframe index (see PR#58 for more details).
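As a rough illustration of the two pandas calls named above, the sketch below uses made-up timestamps and a hypothetical `value` column. It is not dataretrieval's internal code; it only shows how each call yields UTC-aware timestamps.

```python
import pandas as pd

# Hypothetical raw values, standing in for text returned by a web service.
raw = ["2021-10-04 19:54", "2021-11-16 14:28"]

# Case 1: parse strings and normalize to UTC in a single step.
parsed = pd.to_datetime(raw, utc=True)

# Case 2: datetime objects already exist but carry no time zone;
# tz_localize attaches UTC without shifting the clock time.
naive = pd.DataFrame({"value": [1.0, 2.0]}, index=pd.to_datetime(raw))
localized = naive.tz_localize("UTC")

print(parsed.tz, localized.index.tz)
```

In both cases the resulting timestamps are timezone-aware and refer to the same instants.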

Inspecting Timestamps

For single sites, the index of the returned dataframe contains pandas timestamps.

>>> import dataretrieval.nwis as nwis
>>> site = '03339000'
>>> df = nwis.get_record(sites=site, service='peaks',
...                      start='2015-01-01', end='2017-12-31')
>>> print(df)
                          agency_cd   site_no peak_tm  peak_va peak_cd  gage_ht  gage_ht_cd  year_last_pk  ag_dt  ag_tm  ag_gage_ht  ag_gage_ht_cd
datetime
2015-06-08 00:00:00+00:00      USGS  03339000   17:30    25100       C    22.83         NaN           NaN    NaN    NaN         NaN            NaN
2015-12-29 00:00:00+00:00      USGS  03339000   18:45    37600       C    26.66         NaN           NaN    NaN    NaN         NaN            NaN
2017-05-05 00:00:00+00:00      USGS  03339000   04:45    17000       C    18.47         NaN           NaN    NaN    NaN         NaN            NaN

Here the index of the dataframe df is made up of timezone-aware timestamps, each displayed in the format YYYY-MM-DD HH:MM:SS+HH:MM. Because these timestamps are localized to UTC, the offset (+HH:MM) is +00:00. These values can be converted to a local time zone of your choosing using pandas functionality.

>>> df.index = df.index.tz_convert(tz='America/New_York')
>>> print(df)
                          agency_cd   site_no peak_tm  peak_va peak_cd  gage_ht  gage_ht_cd  year_last_pk  ag_dt  ag_tm  ag_gage_ht  ag_gage_ht_cd
datetime
2015-06-07 20:00:00-04:00      USGS  03339000   17:30    25100       C    22.83         NaN           NaN    NaN    NaN         NaN            NaN
2015-12-28 19:00:00-05:00      USGS  03339000   18:45    37600       C    26.66         NaN           NaN    NaN    NaN         NaN            NaN
2017-05-04 20:00:00-04:00      USGS  03339000   04:45    17000       C    18.47         NaN           NaN    NaN    NaN         NaN            NaN

Above, the index was converted to New York local time. The timestamps in the updated index now have offsets of -04:00 or -05:00, because New York is either 4 or 5 hours behind UTC depending on the time of year (due to daylight saving time).
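The seasonal change in offset can be checked without a web request. The following is a minimal sketch that assumes a synthetic index built from the same three dates as the example above; the variable names are illustrative only.

```python
import pandas as pd

# Synthetic UTC timestamps using the dates from the peaks example.
idx = pd.to_datetime(["2015-06-08", "2015-12-29", "2017-05-05"], utc=True)

# Convert to New York local time; the instants themselves do not change.
local = idx.tz_convert("America/New_York")

# utcoffset() reports the offset from UTC that applies on each date.
offsets = [ts.utcoffset() for ts in local]
for ts, off in zip(local, offsets):
    print(ts, off)

# Converting back to UTC recovers the original timestamps.
roundtrip = local.tz_convert("UTC")
```

Because tz_convert() only changes how an instant is displayed, a value stored at midnight UTC appears on the previous calendar day in New York, as seen in the converted output above.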

When information for multiple sites is requested, dataretrieval creates a dataframe with a multi-index, with the first level containing the site number and the second containing the datetime information.

>>> import dataretrieval.nwis as nwis
>>> sites = ['180049066381200', '290000095192602']
>>> df = nwis.get_record(sites=sites, service='gwlevels',
...                      start='2021-10-01', end='2022-01-01')
>>> df
                                          agency_cd site_tp_cd      lev_dt lev_tm lev_tz_cd  ...  lev_dt_acy_cd  lev_acy_cd  lev_src_cd  lev_meth_cd lev_age_cd
site_no         datetime                                                                     ...
180049066381200 2021-10-04 19:54:00+00:00      USGS         GW  2021-10-04  19:54     +0000  ...              m         NaN           S            S          A
                2021-11-16 14:28:00+00:00      USGS         GW  2021-11-16  14:28     +0000  ...              m         NaN           S            S          A
                2021-12-09 10:43:00+00:00      USGS         GW  2021-12-09  10:43     +0000  ...              m         NaN           S            S          A
290000095192602 2021-12-08 19:07:00+00:00      USGS         GW  2021-12-08  19:07     +0000  ...              m         NaN           S            S          P

[4 rows x 15 columns]

Note that the datetime level of the index returned by default is again UTC-localized, so the offset values are +00:00.
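With a multi-index, the time zone conversion has to target the datetime level rather than the index as a whole. One way to do this is the `level` argument of pandas.DataFrame.tz_convert(). The sketch below assumes a small synthetic dataframe shaped like the multi-site result above; the time zone 'America/Chicago' is an arbitrary choice for illustration.

```python
import pandas as pd

# Synthetic stand-in for a multi-site result: (site_no, datetime) index.
index = pd.MultiIndex.from_arrays(
    [
        ["180049066381200", "180049066381200", "290000095192602"],
        pd.to_datetime(
            ["2021-10-04 19:54", "2021-11-16 14:28", "2021-12-08 19:07"],
            utc=True,
        ),
    ],
    names=["site_no", "datetime"],
)
df = pd.DataFrame({"agency_cd": ["USGS"] * 3}, index=index)

# Convert only the datetime level; the site_no level is left untouched.
local = df.tz_convert("America/Chicago", level="datetime")

print(local.index.get_level_values("datetime"))
```

The call returns a new dataframe, so the original df keeps its UTC-localized index.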