At the municipality I work for, we increasingly want to analyse and publish real-time data. As probably every organization does =). This is a small practice project in which I gather live data, analyse it, and present it in a live, self-updating dashboard.
The data comes from sensors placed in the city of Nijmegen for the Smart Emission project. Citizens of Nijmegen and the municipality placed several cheap sensors around town that measure temperature, the content of the air (PM2.5, NO2, etc.) and sometimes even sound. The project provides several APIs to access the data.
The dashboard I made presents the measurements of sensor ‘20080007’ over the last 12 hours, plus a forecast for the next 6 hours based on the last 1000 measurements. It tries to update itself every 5 minutes, and it might take a minute to load.
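The script below pulls everything in one request and picks the temperature datastream by its position in the API response. If you want to verify that index yourself, or reuse the script for another station, you can first list the datastream names. A minimal sketch, assuming the response has the same shape as in the script below:
library(jsonlite)
url <- "https://data.smartemission.nl/gost/v1.0/Things?$filter=name%20eq%20%2720080007%27&$expand=Datastreams/Observations($top=1000)"
tmp <- fromJSON(url, flatten = TRUE)
# list the datastream names for this station; the position of the temperature
# stream in this vector is the index used later as Observations[[5]]
tmp$value$Datastreams[[1]]$name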
Below is the script for making this simple dashboard.
library(shiny)
library(jsonlite)
library(tidyverse)
library(forecast)
library(tseries)
library(janitor)
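# SensorThings API request: select the Thing named '20080007' and expand each
# of its datastreams with (up to) its 1000 latest observations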
url <- "https://data.smartemission.nl/gost/v1.0/Things?$filter=name%20eq%20%2720080007%27&$expand=Datastreams/Observations($top=1000)"
ui <- fluidPage(
plotOutput("plot_temp")
)
server <- function(input, output, session){
# Function to get new observations
get_temp <- function(){
## fetch the latest observations from the Smart Emission API
tmp <- fromJSON(url, flatten = TRUE)
# the fifth datastream of this station holds the temperature observations
df <- tmp$value$Datastreams[[1]]$Observations[[5]] %>%
clean_names() %>%
mutate(date_result = max(as.Date(result_time)), # date of the newest observation, used in the plot title
time = substr(result_time, 12, 19)) # clock time (HH:MM:SS) of each observation
Sys.sleep(0.1) # brief pause before returning the data
df
}
output$plot_temp <- renderPlot({
## make this function to update itself every 300000 ms, or 5 mins
invalidateLater(300000, session)
# Initialize df
df <- get_temp()
## timestamps for the forecast; observations arrive newest first,
## so the first row below is the latest measurement
last_obs <- df %>%
select(time) %>%
head(1)
times <- seq(as.POSIXct(last_obs$time, format = "%H:%M:%S") + 3600, length.out = 6, by = "hour")
## making a time series for forecasting
ts_temp <- df %>% # named ts_temp to avoid masking stats::ts()
arrange(result_time) %>% # oldest observation first
pull(result) %>% # plain numeric vector of temperatures
ts(frequency = 24)
## forecasting with auto.arima
fit <- auto.arima(ts_temp)
fc <- as.data.frame(forecast(fit, h = 6)) %>%
## add times for plotting later
bind_cols(time = times) %>%
mutate(time = strftime(time, format = "%H:%M:%S")) %>%
rename(pt_fc = `Point Forecast`) %>%
select(time, pt_fc)
## make the 24h of a day for plotting later
h24 <- strftime(seq(as.POSIXct("00:00:00", format = "%H:%M:%S") , length.out = 24, by = "hour"), format = "%H:%M:%S")
date_title <- df$date_result[1] # date of the newest observation, created in get_temp()
# Plot the 12 most recent values
df %>%
head(12) %>%
mutate(type = "Temperature measurements") %>%
## add forecast points to data
bind_rows(fc) %>%
## add 24h times for better plotting
full_join(data.frame(time = h24), by = "time") %>%
## format time to better fit the plot
mutate(time = format(as.POSIXct(time, format = "%H:%M:%S"), "%H:%M")) %>%
## start the plot
ggplot() +
# add measurements
geom_line(aes(x = time, y = result, group = 1, linetype = "Temperature measurements"), colour = "orangered1", size = 2) +
# add forecast
geom_point(aes(x = time, y = pt_fc, shape = "Forecast"), colour = "steelblue1", size = 6) +
xlab(" ") +
ylab("Temperature") +
ggtitle(label = paste0("Temperature for station with id '20080007' on ", date_title, ". Will try to update every 5 minutes.")) +
theme(legend.title=element_blank())
})
}
shinyApp(ui = ui, server = server)
I’m sure there is some tweaking possible to make it faster or better. For me it was good practice to work with continuous, streaming data, and to work with Shiny again. It also opens up a lot of possibilities for fun future projects =).
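One example: in the script above the API call lives inside renderPlot(), so every replot refetches the data. Moving the fetch into its own reactive expression would decouple fetching from plotting, and any other output could then reuse the same data. A sketch of such a refactor (hypothetical, not part of the script above):
server <- function(input, output, session){
# fetch once every 5 minutes; every output that calls temp_data() shares the result
temp_data <- reactive({
invalidateLater(300000, session)
get_temp()
})
output$plot_temp <- renderPlot({
df <- temp_data()
# ... build the plot as above ...
})
}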
One thing I’ll definitely look at is making the model less dependent on a long time series, and letting the last few measurements update it in small increments instead. That will be a lot faster.
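The forecast package already offers a building block for this: Arima() can apply a previously estimated model to new data without re-estimating any parameters, which is much cheaper than running auto.arima() on every refresh. A minimal sketch, assuming fit is the model from the dashboard above and new_values is a hypothetical vector holding the latest measurements:
library(forecast)
# refit the existing model to fresh data; the coefficients are kept as-is,
# so this skips the expensive auto.arima() model search
new_ts <- ts(new_values, frequency = 24)
fit_updated <- Arima(new_ts, model = fit)
forecast(fit_updated, h = 6)
Have fun with your own streaming data and let me know your ideas!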