A colleague of mine, Vivien Roussez, wrote a nice R library for predicting time series. The package is called “autoTS” and provides a high-level interface for univariate time series predictions. It implements many algorithms, most of them provided by the forecast package. You can find the package as an open source project on GitHub. Over the last few weeks, we have seen a lot of data science work prompted by the coronavirus pandemic. One of the challenges in this work is choosing the right forecasting method.
Introduction to autoTS
by Vivien Roussez
The autoTS package provides a high-level interface for univariate time series predictions. It implements many algorithms, most of them provided by the forecast package. The main goals of the package are:
- Simplify the preparation of the time series;
- Train the algorithms and compare their results, to choose the best one;
- Gather the results in a final tidy data frame.
What are the inputs?
The package is designed to work on one time series at a time. Parallel calculations can be put on top of it (see example below). The user has to provide two simple vectors:
- One with the dates (such that the lubridate package can parse them)
- One with the corresponding values
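To make the expected inputs concrete, here is a minimal sketch in base R of two aligned vectors, plus a hand-rolled conversion of quarter labels such as "2000Q1" into dates. The numbers are made up for illustration, and the manual conversion is only a stand-in for what a lubridate parser does; it is not part of autoTS.

```r
# Hypothetical toy series: 8 quarters starting in Q1 2000 (values are made up)
dates  <- seq(as.Date("2000-01-01"), by = "quarter", length.out = 8)
values <- c(100, 104, 99, 110, 103, 108, 102, 115)

# The two vectors must line up one-to-one
stopifnot(length(dates) == length(values))

# Quarter labels such as "2000Q1" can also be converted by hand in base R
labels  <- c("2000Q1", "2000Q2", "2000Q3", "2000Q4")
year    <- as.integer(substr(labels, 1, 4))
quarter <- as.integer(substr(labels, 6, 6))
# First month of each quarter: 1, 4, 7, 10
parsed  <- as.Date(sprintf("%d-%02d-01", year, (quarter - 1L) * 3L + 1L))
print(parsed)
```

Each date marks the first day of its quarter, which matches the date format visible in the prepared object later in this post.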
Warnings
This package implements each algorithm with a single, fixed parametrization, meaning that the user cannot tweak the algorithms (e.g. modify SARIMA-specific parameters).
Example on real-world data
Before getting started, you need to install the required package “autoTS”. Once it is installed, the following code sets the chunk options and loads the packages used in this example:
knitr::opts_chunk$set(warning = F,message = F,fig.width = 8,fig.height = 5)
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(lubridate))
library(autoTS)
For this example, we will use the quarterly GDP data of the European countries provided by Eurostat. The database can be downloaded from this page: choose “GDP and main components (output, expenditure and income) (namq_10_gdp)”, adjust the time dimension to select all available data, and download the result as a csv file with the correct formatting (1 234.56). The csv is in the “Data” folder of this notebook.
dat <- read.csv("Data/namq_10_gdp_1_Data.csv")
str(dat)
## 'data.frame': 93456 obs. of 7 variables:
## $ TIME : Factor w/ 177 levels "1975Q1","1975Q2",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ GEO : Factor w/ 44 levels "Albania","Austria",..: 15 15 15 15 15 15 15 15 15 15 ...
## $ UNIT : Factor w/ 3 levels "Chain linked volumes (2010), million euro",..: 2 2 2 2 3 3 3 3 1 1 ...
## $ S_ADJ : Factor w/ 4 levels "Calendar adjusted data, not seasonally adjusted data",..: 4 2 1 3 4 2 1 3 4 2 ...
## $ NA_ITEM : Factor w/ 1 level "Gross domestic product at market prices": 1 1 1 1 1 1 1 1 1 1 ...
## $ Value : Factor w/ 19709 levels "1 008.3","1 012.9",..: 19709 19709 19709 19709 19709 19709 19709 19709 19709 19709 ...
## $ Flag.and.Footnotes: Factor w/ 5 levels "","b","c","e",..: 1 1 1 1 1 1 1 1 1 1 ...
head(dat)
## TIME GEO
## 1 1975Q1 European Union - 27 countries (from 2019)
## 2 1975Q1 European Union - 27 countries (from 2019)
## 3 1975Q1 European Union - 27 countries (from 2019)
## 4 1975Q1 European Union - 27 countries (from 2019)
## 5 1975Q1 European Union - 27 countries (from 2019)
## 6 1975Q1 European Union - 27 countries (from 2019)
## UNIT
## 1 Chain linked volumes, index 2010=100
## 2 Chain linked volumes, index 2010=100
## 3 Chain linked volumes, index 2010=100
## 4 Chain linked volumes, index 2010=100
## 5 Current prices, million euro
## 6 Current prices, million euro
## S_ADJ
## 1 Unadjusted data (i.e. neither seasonally adjusted nor calendar adjusted data)
## 2 Seasonally adjusted data, not calendar adjusted data
## 3 Calendar adjusted data, not seasonally adjusted data
## 4 Seasonally and calendar adjusted data
## 5 Unadjusted data (i.e. neither seasonally adjusted nor calendar adjusted data)
## 6 Seasonally adjusted data, not calendar adjusted data
## NA_ITEM Value Flag.and.Footnotes
## 1 Gross domestic product at market prices :
## 2 Gross domestic product at market prices :
## 3 Gross domestic product at market prices :
## 4 Gross domestic product at market prices :
## 5 Gross domestic product at market prices :
## 6 Gross domestic product at market prices :
Data preparation
First, we have to clean the data (it is not too ugly, though). The first step is to convert the TIME column into a well-known date format that lubridate can handle. In this example, the yq function can parse the dates without any modification of the column. Then, we have to remove the blank that separates thousands in the values. Finally, we only keep data since 2000 and the unadjusted series in current prices.
After that, we should get one time series per country.
dat <- mutate(dat,dates=yq(as.character(TIME)),
values = as.numeric(stringr::str_remove(Value," "))) %>%
filter(year(dates)>=2000 &
S_ADJ=="Unadjusted data (i.e. neither seasonally adjusted nor calendar adjusted data)" &
UNIT == "Current prices, million euro")
filter(dat,GEO %in% c("France","Austria")) %>%
ggplot(aes(dates,values,color=GEO)) + geom_line() + theme_minimal() +
labs(title="GDP of (completely) random countries")
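One detail of the cleaning step is worth a small illustration: stringr::str_remove drops only the first match, so a value with two thousands separators would not parse. The sketch below uses base R's sub() and gsub(), which behave like str_remove and str_remove_all respectively for this purpose. The raw strings are made up, and whether the downloaded file actually contains values with more than one blank is an assumption that is not checked here.

```r
# Made-up raw values in the "1 234.56" style, plus the ":" missing-value marker
raw <- c("1 234.5", "3 456 789.0", ":")

# sub() replaces only the first blank, like stringr::str_remove()
first_only <- suppressWarnings(as.numeric(sub(" ", "", raw, fixed = TRUE)))

# gsub() replaces every blank, like stringr::str_remove_all()
all_blanks <- suppressWarnings(as.numeric(gsub(" ", "", raw, fixed = TRUE)))

print(first_only)  # the value with two blanks cannot be parsed
print(all_blanks)  # all numeric strings are parsed, ":" stays missing
```

If some series turn out to be entirely missing after cleaning, checking for a second blank in the raw values is a cheap first thing to try.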

Now we’re good to go!
Prediction on a random country
Let’s see how to use the package on one time series:
- Extract the dates and values of the time series you want to work on
- Create the object containing everything you need afterwards
- Train the algorithms and determine which one is the best (over the last known year)
- Run the best algorithm on the full data
ex1 <- filter(dat,GEO=="France")
preparedTS <- prepare.ts(ex1$dates,ex1$values,"quarter")
## What is in this new object ?
str(preparedTS)
## List of 4
## $ obj.ts : Time-Series [1:77] from 2000 to 2019: 363007 369185 362905 383489 380714 ...
## $ obj.df :'data.frame': 77 obs. of 2 variables:
## ..$ dates: Date[1:77], format: "2000-01-01" "2000-04-01" ...
## ..$ val : num [1:77] 363007 369185 362905 383489 380714 ...
## $ freq.num : num 4
## $ freq.alpha: chr "quarter"
plot.ts(preparedTS$obj.ts)

ggplot(preparedTS$obj.df,aes(dates,val)) + geom_line() + theme_minimal()
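The obj.ts element shown above is a regular R time series with a frequency of 4. As a rough illustration of what such an object looks like, here is a base R sketch that builds a quarterly ts by hand. The values are made up, and this is not necessarily how prepare.ts constructs its output.

```r
# Made-up quarterly values covering two years
values <- c(100, 104, 99, 110, 103, 108, 102, 115)

# A quarterly series starting in the first quarter of 2000
x <- ts(values, start = c(2000, 1), frequency = 4)

print(x)             # printed as a year-by-quarter table
print(frequency(x))  # number of observations per year
print(cycle(x))      # quarter index of each observation
```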

Get the best algorithm for this time series:
## What is the best model for prediction ?
best.algo <- getBestModel(ex1$dates,ex1$values,"quarter",graph = F)
names(best.algo)
## [1] "prepedTS" "best" "train.errors" "res.train" "algos"
## [6] "graph.train"
print(paste("The best algorithm is",best.algo$best))
## [1] "The best algorithm is my.ets"
best.algo$graph.train

The result of this function contains:
- The name of the best model
- The errors of each algorithm on the test set
- The graphic of the train step
- The prepared time series
- The list of algorithms used (which you can customize)
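The idea of picking a winner on the last known year can be sketched in a few lines of base R. This is only an illustration of hold-out evaluation under my own assumptions: the two candidate forecasters are deliberately naive stand-ins rather than autoTS models, the series is simulated, and RMSE is an assumed error metric (the package may use a different one).

```r
# Simulated quarterly series with a trend and a seasonal pattern
set.seed(1)
n <- 40
values <- 100 + 2 * seq_len(n) + rep(c(-5, 3, -2, 6), length.out = n) + rnorm(n)

h     <- 4                               # hold out the last year (4 quarters)
train <- values[seq_len(n - h)]
test  <- values[(n - h + 1):n]

# Two simple candidate forecasters (stand-ins, not autoTS algorithms)
candidates <- list(
  mean_fc     = function(y, h) rep(mean(y), h),        # historical mean
  seasonal_nv = function(y, h) tail(y, 4)[seq_len(h)]  # same quarter last year
)

# Root mean squared error of each candidate on the held-out year
rmse <- sapply(candidates, function(f) sqrt(mean((f(train, h) - test)^2)))
best <- names(which.min(rmse))

print(rmse)
print(best)
```

Once the winner is known, it is refitted on the full series to produce the actual forecast, which is the step handled by the prediction function described next.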
The result of this function can be used as direct input for the my.predictions function:
## Build the predictions
final.pred <- my.predictions(bestmod = best.algo)
tail(final.pred,24)
## # A tibble: 24 x 4
## dates type actual.value ets
## <date> <chr> <dbl> <dbl>
## 1 2015-04-01 <NA> 548987 NA
## 2 2015-07-01 <NA> 541185 NA
## 3 2015-10-01 <NA> 566281 NA
## 4 2016-01-01 <NA> 554121 NA
## 5 2016-04-01 <NA> 560873 NA
## 6 2016-07-01 <NA> 546383 NA
## 7 2016-10-01 <NA> 572752 NA
## 8 2017-01-01 <NA> 565221 NA
## 9 2017-04-01 <NA> 573720 NA
## 10 2017-07-01 <NA> 563671 NA
## # … with 14 more rows
ggplot(final.pred) + geom_line(aes(dates,actual.value),color="black") +
geom_line(aes_string("dates",stringr::str_remove(best.algo$best,"my."),linetype="type"),color="red") +
theme_minimal()

Not too bad, right?
Scaling predictions
Let’s say we want to make a prediction for each country at the same time, and as fast as possible: let’s combine the package’s functions with parallel computing. We have to reshape the data to get one column per country and then iterate over the columns of the data frame.
Prepare data
suppressPackageStartupMessages(library(tidyr))
dat.wide <- select(dat,GEO,dates,values) %>%
group_by(dates) %>%
spread(key = "GEO",value = "values")
head(dat.wide)
## # A tibble: 6 x 45
## # Groups: dates [6]
## dates Albania Austria Belgium `Bosnia and Her… Bulgaria Croatia Cyprus
## <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2000-01-01 NA 50422. 62261 NA 2941. 5266. 2547.
## 2 2000-04-01 NA 53180. 65046 NA 3252. 5811 2784.
## 3 2000-07-01 NA 53881. 62754 NA 4015. 6409. 2737.
## 4 2000-10-01 NA 56123. 68161 NA 4103. 6113 2738.
## 5 2001-01-01 NA 52911. 64318 NA 3284. 5777. 2688.
## 6 2001-04-01 NA 54994. 67537 NA 3669. 6616. 2946.
## # … with 37 more variables: Czechia <dbl>, Denmark <dbl>, Estonia <dbl>, `Euro
## # area (12 countries)` <dbl>, `Euro area (19 countries)` <dbl>, `Euro area
## # (EA11-2000, EA12-2006, EA13-2007, EA15-2008, EA16-2010, EA17-2013,
## # EA18-2014, EA19)` <dbl>, `European Union - 15 countries (1995-2004)` <dbl>,
## # `European Union - 27 countries (from 2019)` <dbl>, `European Union - 28
## # countries` <dbl>, Finland <dbl>, France <dbl>, `Germany (until 1990 former
## # territory of the FRG)` <dbl>, Greece <dbl>, Hungary <dbl>, Iceland <dbl>,
## # Ireland <dbl>, Italy <dbl>, `Kosovo (under United Nations Security Council
## # Resolution 1244/99)` <dbl>, Latvia <dbl>, Lithuania <dbl>,
## # Luxembourg <dbl>, Malta <dbl>, Montenegro <dbl>, Netherlands <dbl>, `North
## # Macedonia` <dbl>, Norway <dbl>, Poland <dbl>, Portugal <dbl>,
## # Romania <dbl>, Serbia <dbl>, Slovakia <dbl>, Slovenia <dbl>, Spain <dbl>,
## # Sweden <dbl>, Switzerland <dbl>, Turkey <dbl>, `United Kingdom` <dbl>
Compute bulk predictions
library(doParallel)
pipeline <- function(dates,values)
{
pred <- getBestModel(dates,values,"quarter",graph = F) %>%
my.predictions()
return(pred)
}
doMC::registerDoMC(parallel::detectCores()-1) # parallel backend (for UNIX)
system.time({
res <- foreach(ii=2:ncol(dat.wide),.packages = c("dplyr","autoTS")) %dopar%
pipeline(dat.wide$dates,pull(dat.wide,ii))
})
## user system elapsed
## 342.339 3.405 66.336
names(res) <- colnames(dat.wide)[-1]
str(res)
## List of 44
## $ Albania :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ stlm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Austria :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 50422 53180 53881 56123 52911 ...
## ..$ sarima : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Belgium :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 62261 65046 62754 68161 64318 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Bosnia and Herzegovina :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ stlm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Bulgaria :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 2941 3252 4015 4103 3284 ...
## ..$ tbats : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Croatia :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 5266 5811 6409 6113 5777 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Cyprus :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 2547 2784 2737 2738 2688 ...
## ..$ sarima : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Czechia :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 15027 16430 17229 18191 16677 ...
## ..$ tbats : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Denmark :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 42567 44307 43892 47249 44143 ...
## ..$ prophet : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Estonia :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 1391 1575 1543 1662 1570 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Euro area (12 countries) :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ prophet : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Euro area (19 countries) :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ prophet : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Euro area (EA11-2000, EA12-2006, EA13-2007, EA15-2008, EA16-2010, EA17-2013, EA18-2014, EA19):Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ prophet : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ European Union - 15 countries (1995-2004) :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ prophet : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ European Union - 27 countries (from 2019) :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ prophet : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ European Union - 28 countries :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ prophet : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Finland :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 31759 33836 34025 36641 34474 ...
## ..$ bats : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ France :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 363007 369185 362905 383489 380714 ...
## ..$ ets : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Germany (until 1990 former territory of the FRG) :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 515500 523900 536120 540960 530610 ...
## ..$ sarima : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Greece :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 33199 34676 37285 37751 35237 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Hungary :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 11516 12630 13194 13955 12832 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Iceland :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 2304 2442 2557 2447 2232 ...
## ..$ stlm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Ireland :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 25583 26751 27381 28666 29766 ...
## ..$ tbats : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Italy :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 292517 309098 298655 338996 309967 ...
## ..$ ets : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Kosovo (under United Nations Security Council Resolution 1244/99) :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ ets : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Latvia :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 1848 2165 2238 2382 2005 ...
## ..$ ets : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Lithuania :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 2657 3124 3267 3505 2996 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Luxembourg :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 5646 5730 5689 6015 5811 ...
## ..$ sarima : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Malta :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 979 1110 1158 1152 1031 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Montenegro :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ ets : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Netherlands :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 109154 113124 110955 118774 118182 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ North Macedonia :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 901 1052 1033 1108 986 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Norway :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 44900 43730 46652 50638 48355 ...
## ..$ sarima : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Poland :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 41340 44210 46944 54163 47445 ...
## ..$ bagged : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Portugal :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 30644 31923 32111 33788 31927 ...
## ..$ stlm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Romania :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 7901 9511 11197 11630 8530 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Serbia :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 0 0 0 0 0 ...
## ..$ sarima : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Slovakia :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 5100 5722 5764 5752 5343 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Slovenia :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 5147 5591 5504 5667 5407 ...
## ..$ bats : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Spain :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 153378 162400 158526 171946 166204 ...
## ..$ bagged : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Sweden :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 67022 73563 68305 73399 66401 ...
## ..$ bats : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Switzerland :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 70048 72725 74957 77476 76092 ...
## ..$ bats : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ Turkey :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 59944 70803 82262 82981 60075 ...
## ..$ shortterm : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
## $ United Kingdom :Classes 'tbl_df', 'tbl' and 'data.frame': 85 obs. of 4 variables:
## ..$ dates : Date[1:85], format: "2000-01-01" "2000-04-01" ...
## ..$ type : chr [1:85] NA NA NA NA ...
## ..$ actual.value: num [1:85] 438090 440675 446918 462127 441157 ...
## ..$ bagged : num [1:85] NA NA NA NA NA NA NA NA NA NA ...
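The bulk computation above registers a doMC backend, which the code comment flags as UNIX-only. As a possible alternative, here is a sketch using a socket cluster from R's built-in parallel package, which also runs on Windows. The toy_pipeline function and the wide table are hypothetical stand-ins so that the example runs without autoTS or the Eurostat file.

```r
library(parallel)

# Hypothetical stand-in for the per-series pipeline (no autoTS call here)
toy_pipeline <- function(values) {
  data.frame(n = length(values), mean = mean(values, na.rm = TRUE))
}

# Toy wide table: one column per series (made-up numbers)
wide <- data.frame(A = c(1, 2, 3, 4), B = c(10, 20, NA, 40))

cl <- makeCluster(2)                      # socket cluster with two workers
# In real use, packages would have to be loaded on each worker first,
# for example with clusterEvalQ(cl, library(autoTS))
res_toy <- parLapply(cl, as.list(wide), toy_pipeline)  # one task per column
stopCluster(cl)

print(names(res_toy))
print(res_toy$A)
```

The result is a named list with one element per column, the same shape as the res object shown above.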
There is no free lunch…
There is no best algorithm in general: it depends on the data!
sapply(res,function(xx) colnames(select(xx,-dates,-type,-actual.value)) ) %>% table()
## .
## bagged bats ets prophet sarima shortterm stlm tbats
## 3 4 5 7 6 12 4 3
sapply(res,function(xx) colnames(select(xx,-dates,-type,-actual.value)) )
## Albania
## "stlm"
## Austria
## "sarima"
## Belgium
## "shortterm"
## Bosnia and Herzegovina
## "stlm"
## Bulgaria
## "tbats"
## Croatia
## "shortterm"
## Cyprus
## "sarima"
## Czechia
## "tbats"
## Denmark
## "prophet"
## Estonia
## "shortterm"
## Euro area (12 countries)
## "prophet"
## Euro area (19 countries)
## "prophet"
## Euro area (EA11-2000, EA12-2006, EA13-2007, EA15-2008, EA16-2010, EA17-2013, EA18-2014, EA19)
## "prophet"
## European Union - 15 countries (1995-2004)
## "prophet"
## European Union - 27 countries (from 2019)
## "prophet"
## European Union - 28 countries
## "prophet"
## Finland
## "bats"
## France
## "ets"
## Germany (until 1990 former territory of the FRG)
## "sarima"
## Greece
## "shortterm"
## Hungary
## "shortterm"
## Iceland
## "stlm"
## Ireland
## "tbats"
## Italy
## "ets"
## Kosovo (under United Nations Security Council Resolution 1244/99)
## "ets"
## Latvia
## "ets"
## Lithuania
## "shortterm"
## Luxembourg
## "sarima"
## Malta
## "shortterm"
## Montenegro
## "ets"
## Netherlands
## "shortterm"
## North Macedonia
## "shortterm"
## Norway
## "sarima"
## Poland
## "bagged"
## Portugal
## "stlm"
## Romania
## "shortterm"
## Serbia
## "sarima"
## Slovakia
## "shortterm"
## Slovenia
## "bats"
## Spain
## "bagged"
## Sweden
## "bats"
## Switzerland
## "bats"
## Turkey
## "shortterm"
## United Kingdom
## "bagged"
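Because each element of the result list stores its forecast in a column named after the winning algorithm, stacking the elements into one table needs a small renaming step. The sketch below shows one way to do it in base R on a toy list that mimics the structure printed by str(res). The numbers and the "forecast" label in the type column are placeholders that I made up, not values produced by autoTS.

```r
# Toy stand-in for the result list: the forecast column name differs per element
res_toy <- list(
  France  = data.frame(dates = as.Date(c("2019-10-01", "2020-01-01")),
                       type = c(NA, "forecast"),
                       actual.value = c(100, NA),
                       ets = c(NA, 101),
                       stringsAsFactors = FALSE),
  Austria = data.frame(dates = as.Date(c("2019-10-01", "2020-01-01")),
                       type = c(NA, "forecast"),
                       actual.value = c(50, NA),
                       sarima = c(NA, 52),
                       stringsAsFactors = FALSE)
)

# Stack all elements, moving the algorithm name into its own column
tidy <- do.call(rbind, lapply(names(res_toy), function(nm) {
  df   <- res_toy[[nm]]
  algo <- setdiff(colnames(df), c("dates", "type", "actual.value"))
  data.frame(country = nm, algo = algo, dates = df$dates,
             actual.value = df$actual.value, prediction = df[[algo]],
             stringsAsFactors = FALSE)
}))

print(tidy)
```

With one row per country and date, the combined table can be filtered or plotted with the same dplyr and ggplot2 code used earlier in this post.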
We hope you enjoy working with this package to build your time series predictions in the future. You should now be able to extend your data science work on the coronavirus with time series predictions. If you want to learn more about data science, I recommend this tutorial.