New R tool winfapReader reviewed
/ Sean Turner
Yesterday on Twitter I came across a new R package from Ilaria Prosdocimi. The package makes it super-easy to obtain annual maxima and peak-over-threshold data for UK rivers. Here’s a quick review and demo of the package.
You can install easily using devtools
:
devtools::install_github("ilapros/winfapReader")
For this analysis I’m gonna load winfapReader
along with a few other libraries.
library(winfapReader)
library(dplyr)
library(purrr)
library(lubridate) # for dates
library(ggplot2) # for plots
I couldn’t find any functionality in winfapReader
for identifying stations of interest, so I browsed the National River Flow Archive data viewer to get a few stations in Renfrewshire, Scotland (where I grew up). The stations are:
- 84011 Gryfe at Craigend
- 84017 Black Cart Water at Milliken Park
- 84012 White Cart Water at Hawkhead
Next, a table with the station IDs and names:
tribble(
~Station, ~name,
84011, "Gryfe at Craigend",
84017, "Black Cart Water at Milliken Park",
84012, "White Cart Water at Hawkhead"
) -> stations
The get_amax
function in winfapReader
returns the time series of annual maxima. The function takes station ids as the argument. You can pump multiple station ids into the function, but I’m gonna set up the vector of station ids and then use purrr::map_dfr
to get results a single table (rather than a list), which will be easier to work with.
stations[["Station"]] %>%
map_dfr(get_amax) %>%
as_tibble() ->
get_amax_output
get_amax_output
## # A tibble: 138 x 6
## Station WaterYear Date Flow Stage Rejected
## <dbl> <dbl> <date> <dbl> <dbl> <lgl>
## 1 84011 1963 1963-11-24 56.3 2.3 FALSE
## 2 84011 1964 1964-12-11 66.4 2.42 FALSE
## 3 84011 1965 1965-10-31 94.2 2.71 FALSE
## 4 84011 1966 1966-10-05 65.5 2.41 FALSE
## 5 84011 1967 1968-03-26 77.4 2.54 FALSE
## 6 84011 1968 1968-10-09 56.3 2.3 FALSE
## 7 84011 1969 1970-02-01 73.6 2.5 FALSE
## 8 84011 1970 1970-11-03 64.7 2.4 FALSE
## 9 84011 1971 1971-10-10 71.8 2.48 FALSE
## 10 84011 1972 1972-12-11 64.7 2.4 FALSE
## # … with 128 more rows
(Not sure what the flow units are—I assume cumecs)
Next I’m gonna identify the season of each annual max and neaten up the table:
get_amax_output %>%
left_join(stations) %>%
mutate(month = month(Date),
season = case_when(
month %in% c(12, 1, 2) ~ "winter",
month %in% 3:5 ~ "spring",
month %in% 6:8 ~ "summer",
month %in% 9:11 ~ "autumn"
)) %>%
select(name, WaterYear, Flow, season) -> amax_table
amax_table
## # A tibble: 138 x 4
## name WaterYear Flow season
## <chr> <dbl> <dbl> <chr>
## 1 Gryfe at Craigend 1963 56.3 autumn
## 2 Gryfe at Craigend 1964 66.4 winter
## 3 Gryfe at Craigend 1965 94.2 autumn
## 4 Gryfe at Craigend 1966 65.5 autumn
## 5 Gryfe at Craigend 1967 77.4 spring
## 6 Gryfe at Craigend 1968 56.3 autumn
## 7 Gryfe at Craigend 1969 73.6 winter
## 8 Gryfe at Craigend 1970 64.7 autumn
## 9 Gryfe at Craigend 1971 71.8 autumn
## 10 Gryfe at Craigend 1972 64.7 winter
## # … with 128 more rows
And finally plot using gglot2
:
amax_table %>%
ggplot(aes(WaterYear, Flow)) +
geom_smooth(se = FALSE, col = "lightgrey",
method = "loess",
formula = "y ~ x") +
geom_point(aes(col = season)) +
facet_wrap(~name, scales = "free_y", ncol = 1) +
expand_limits(y = 0) +
theme_classic() +
theme(strip.background = element_blank())+
labs(title = "Annual Maximum Flow analysis",
subtitle = "Three stations in Renfrewshire, Scotland",
y = "Flow (cumecs)", x = "Water year") +
geom_vline(xintercept = 1994, alpha = 0.5, linetype = 2)
There we have it. Looks like there was a big flood in the winter of 1994. I must say I don’t recall that particular winter (I was ten). Unsurprisingly the annual max almost always occurs in autumn or winter. No strong trend emerging from any of the stations, although it looks like the White Cart has been regulated somewhat in recent years (maybe new operations at a dam upstream).
Overall winfapReader
is a fine wee package that makes analysis of UK floods easier than catching COVID-19 in New York City. A great addition would be an internal vector of the station ids, which would allow one to easily grab all of the metadata straight up (maybe there’s some way of doing this that I missed).
Shout out to Ilaria Prosdocimi and colleagues for providing this tool.