Fills missing data in a time series using linear interpolation, linear interpolation in the log scale, or fixed-interval smoothing. The Daily functions call this function if fill=TRUE. Used as a standalone function, it accepts data downloaded directly from `dataRetrieval::read_waterdata_daily`.
Usage
fill_missing_daily(df, fill_type, maxgap = 21, value_col = "value",
  qualifier_col = "qualifier")
Arguments
- df
Data frame with at least value and qualifier columns. The names of those columns are defined by value_col and qualifier_col. The data frame is expected to be a complete and uniform time series, for example one row per day or one row per X interval. The values in value_col are filled on the assumption that the series is uniformly spaced and that any missing data is set to `NA`.
- fill_type
Character string defining the method used to fill missing data. Options are "interpolation" (linear interpolation using `zoo::na.approx`) or "log_interp" (linear interpolation in log space). Only used if fill is set to TRUE.
- maxgap
Maximum number of consecutive NA days in a gap that will be interpolated. Default is 21. Only used if fill is set to TRUE.
- value_col
Character, name of value column.
- qualifier_col
Character, name of qualifier column.
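Because df is expected to be a complete and uniform time series, a record with absent dates needs to be padded before filling. The following is a minimal base R sketch of one way to do that, not part of the package itself. The data frame `obs` and its values are hypothetical, and the padded rows receive `NA` in the qualifier column as well as the value column; whether that suits a given workflow is an assumption to check.

```r
# Hypothetical input: a daily record in which two dates are absent entirely.
obs <- data.frame(
  time = as.Date(c("2001-01-01", "2001-01-02", "2001-01-05")),
  value = c(1.2, 1.4, 2.0),
  qualifier = c("", "", "")
)

# Build the full daily sequence spanning the record.
full_dates <- data.frame(
  time = seq(from = min(obs$time), to = max(obs$time), by = "day")
)

# Keep every date in the full sequence; days without observations get NA.
complete <- merge(full_dates, obs, by = "time", all.x = TRUE)
complete <- complete[order(complete$time), ]

# One row per day, with NA values on 2001-01-03 and 2001-01-04.
print(complete)
```

The resulting `complete` data frame has the column layout used in the Examples (time, value, qualifier) with one row per day.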
Examples
Date <- seq(from = as.Date("2001/1/1"),
            to = as.Date("2002/1/2"),
            by = "day")
Qualifier <- rep("", 367)
Q <- 2 + sin(seq(from = 0, to = 2 * pi, length.out = 367))
Q <- jitter(Q, factor = 500)
plot(Q, ylim = c(0, 3.2))
dataInput <- data.frame(time = Date,
                        value = Q,
                        qualifier = Qualifier)
# Set some values to NA to simulate missing data:
dataInput$value[4:5] <- NA
dataInput$value[10:20] <- NA
# Linear interpolation:
interp1 <- fill_missing_daily(df = dataInput,
                              fill_type = "interpolation")
plot(interp1$time[1:30],
     interp1$value[1:30],
     col = as.factor(interp1$qualifier[1:30]),
     type = "b", pch = 16, ylim = c(0, 3.2),
     main = "Linear Interpolation")
# Add a gap that is too big to deal with (56 days, above the default maxgap of 21):
dataInput$value[200:255] <- NA
df_interp <- fill_missing_daily(dataInput,
                                fill_type = "interpolation")
plot(df_interp$time, df_interp$value,
     col = as.factor(df_interp$qualifier),
     main = "Linear Interpolation",
     type = "b", pch = 16, ylim = c(0, 3.2))
plot(df_interp$time[1:50], df_interp$value[1:50],
     col = as.factor(df_interp$qualifier[1:50]),
     main = "Linear Interpolation",
     type = "b", pch = 16, ylim = c(0, 3.2))
df_log_interp <- fill_missing_daily(dataInput,
                                    fill_type = "log_interp")
plot(df_log_interp$time, df_log_interp$value,
     col = as.factor(df_log_interp$qualifier),
     main = "Linear Interpolation in Log Scale",
     type = "b", pch = 16, ylim = c(0, 3.2))
plot(df_log_interp$time[1:50], df_log_interp$value[1:50],
     col = as.factor(df_log_interp$qualifier[1:50]),
     main = "Linear Interpolation in Log Scale",
     type = "b", pch = 16, ylim = c(0, 3.2))
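
The difference between the two fill types used above, and the effect of maxgap, can be illustrated with a small base R sketch. `fill_sketch` is a hypothetical helper written for this illustration; it is not the package's implementation (which the argument description says relies on `zoo::na.approx`) and it does not touch qualifiers. It assumes that gaps longer than maxgap are left as `NA`, and that values are strictly positive when interpolating in log space.

```r
# Hypothetical helper, not part of the package: fill interior NA runs of
# length <= maxgap by linear interpolation, optionally in log space.
fill_sketch <- function(x, maxgap = 21, log_space = FALSE) {
  y <- if (log_space) log(x) else x
  idx <- seq_along(y)
  ok <- !is.na(y)

  # Interpolate every interior NA; leading and trailing NAs stay NA (rule = 1).
  filled <- approx(idx[ok], y[ok], xout = idx, rule = 1)$y

  # Measure each run of NAs and restore NA where the run exceeds maxgap.
  runs <- rle(!ok)
  run_len <- rep(runs$lengths, runs$lengths)
  filled[!ok & run_len > maxgap] <- NA

  if (log_space) exp(filled) else filled
}

# A gap of 2 (filled) followed by a gap of 4 (left as NA when maxgap = 2).
x <- c(1, NA, NA, 8, NA, NA, NA, NA, 16)

lin <- fill_sketch(x, maxgap = 2)
# approximately 1, 3.33, 5.67, 8, NA, NA, NA, NA, 16

lg <- fill_sketch(x, maxgap = 2, log_space = TRUE)
# approximately 1, 2, 4, 8, NA, NA, NA, NA, 16
```

Interpolating in log space draws a straight line between the logarithms of the neighboring values, so the filled points follow a geometric rather than an arithmetic progression. That can be a better fit for quantities such as streamflow that vary over orders of magnitude, but it cannot be applied across zero or negative values.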
