What is a Time Series?
In this guide we use a simple definition of a time series: a collection of data points that we can order by “time”.
Sometimes I will refer to a time series as a random process, which is defined as a collection of random variables over time.
That can be stock prices, weather data, sales data, sensor readings, etc.
They usually look something like this:
I’m writing this guide while getting back into the topic myself for the Allora forge competitions.
So we will look at simpler time series, but also a lot at cryptocurrency and token prices like the one above.
I believe a hands-on example that you can continuously adapt helps with building intuition and fortifying knowledge.
I will not discuss whether it is possible to forecast prices.
My goal here is to consolidate my learning and provide a resource on how one would attempt forecasting with any time series.
We will also look at weather, football players, lake elevation levels and more.
Time Series is a wonderful, interesting and surprising topic.
A time series has a mean function, an autocovariance function and an autocorrelation function that define it partially or completely.
The Mean Function
The mean function μ(t) = E[X_t] returns the expected value of the random variable at time t. I.e. at time t, what is the average value we can expect.
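As a quick sketch (the simulated process and all the names here are my own choices, not anything from a library), we can estimate a mean function by averaging many independent realizations of a process at each time step:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate many realizations of a simple noisy process X_t = 0.5 * t + noise.
n_realizations, n_steps = 10_000, 50
t = np.arange(n_steps)
X = 0.5 * t + rng.normal(0.0, 1.0, size=(n_realizations, n_steps))

# Estimate the mean function mu(t) = E[X_t] by averaging across realizations.
mu_hat = X.mean(axis=0)
```

With enough realizations, `mu_hat[t]` approaches 0.5 · t, the deterministic part of this toy process.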
The Autocovariance Function
The autocovariance function γ(s, t) = Cov(X_s, X_t) of a random process returns the covariance between the random variables at two different times s and t.
If we scale the autocovariance function to the range [−1, 1], we get the autocorrelation function ρ(s, t) = γ(s, t) / √(γ(s, s) γ(t, t)).
Autocorrelation Function or ACF
The ACF is one of the tools we will use the most to analyze our time series.
We will look at these building blocks and how to use them in detail later.
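To make the ACF concrete already, here is a minimal sketch of computing a sample ACF with NumPy; the function name and the white-noise example are my own:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Lag-0 autocovariance, used to scale everything into [-1, 1].
    gamma0 = np.dot(x, x) / n
    return np.array([np.dot(x[:n - k], x[k:]) / n / gamma0
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
white_noise = rng.normal(size=5000)
acf = sample_acf(white_noise, max_lag=5)
# acf[0] is exactly 1; for white noise the other lags hover near 0.
```

For a real analysis you would typically reach for `statsmodels.tsa.stattools.acf`, which implements the same idea with extras like confidence intervals.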
Drift and trend
A time series may have a constant added to each time step. We will refer to this as drift or trend.
Stationarity is an important concept that many of our analysis and forecasting tools will require. Intuitively, you can say that a time series or random process is (weakly) stationary if there is no change in the mean or variance when you compare two windows of the time series: the average and the variation stay the same across different times or different “lags”.
A random process is strictly stationary if the joint distribution of any collection of its random variables is unchanged when all of their times are shifted by the same amount. That is a strong requirement, and we mostly only need two weaker conditions:
- The mean function is constant.
- The autocovariance function is a function of the lag only.
Together these two define a weakly stationary time series, which is the form we will rely on.
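As an informal sketch of the intuition above (the helper and the example data are my own illustration, not a formal test), we can split a series into windows and compare their means and variances:

```python
import numpy as np

def window_stats(x, n_windows=4):
    """Mean and variance of each of n_windows consecutive chunks of x."""
    chunks = np.array_split(np.asarray(x, dtype=float), n_windows)
    return [(c.mean(), c.var()) for c in chunks]

rng = np.random.default_rng(2)
stationary = rng.normal(0.0, 1.0, size=4000)               # white noise
trending = np.arange(4000) * 0.01 + rng.normal(size=4000)  # trend -> not stationary

stats_stationary = window_stats(stationary)
stats_trending = window_stats(trending)
# Window means stay close together for white noise,
# but drift far apart when a trend is present.
```

A formal alternative would be a unit-root test such as the augmented Dickey–Fuller test (`statsmodels.tsa.stattools.adfuller`).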
Let’s start looking at some simple time series models and get a feel for the concepts discussed above. I think the simplest time series model we can make is:

X_t = X_{t−1}

assuming the random variables simply repeat, i.e. the series stays constant. If we change the equation above to make it less boring, we can add a multiplier m (you may remember this as the slope from the equation of a line):

X_t = m · X_{t−1}
Now this is a regression model as we know and love, and if you’re unfamiliar, I have a pocket guide for it.
Let’s pick a starting value for X_0. If you tweak the slope m in:

X_t = m · X_{t−1}

you can see how our values curve upwards for m > 1. It looks like a curve with more points, but each segment between consecutive points is just piecewise linear.
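A minimal sketch of this model in Python, assuming the recursive form X_t = m · X_{t−1} with a starting value X_0 = 1 of my own choosing:

```python
import numpy as np

def simulate(m, x0=1.0, n_steps=20):
    """Iterate X_t = m * X_{t-1} starting from X_0 = x0."""
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        x[t] = m * x[t - 1]
    return x

path = simulate(m=1.1)
# With m > 1 each step multiplies the previous value, so the path curves
# upward even though each plotted segment between points is a straight line.
```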
Drift
We introduced the drift before: a constant b added at each time step,

X_t = m · X_{t−1} + b

Change it below to get a feeling for how this impacts your model.
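Extending the same sketch with a drift term b (again, the function name and starting value are my own choices):

```python
import numpy as np

def simulate_with_drift(m, b, x0=1.0, n_steps=20):
    """Iterate X_t = m * X_{t-1} + b: the drift b adds a constant each step."""
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        x[t] = m * x[t - 1] + b
    return x

flat = simulate_with_drift(m=1.0, b=0.0)   # stays constant at x0
drift = simulate_with_drift(m=1.0, b=0.5)  # climbs 0.5 per step
```

With m = 1 the drift alone produces a straight line; combine it with m ≠ 1 to see both effects interact.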
Noise
Last but not least, we operate in the real world. To make things less boring, our measurements have noise, and nature itself (I believe) has some inherent noise. At each time step we add iid Gaussian noise ε_t (independent and identically distributed: the noise has the same distribution at every time step, and the noise values at different time steps don’t influence one another) as:

X_t = m · X_{t−1} + b + ε_t, with ε_t ~ N(0, σ²)
And now you can play with a complete autoregressive model. In particular, this is an AR(1) model, as we only look at the previous value X_{t−1}.
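Putting it all together, here is a minimal sketch of simulating this AR(1) model, X_t = m · X_{t−1} + b + ε_t, with parameters of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_ar1(m, b, sigma, x0=0.0, n_steps=500):
    """AR(1): X_t = m * X_{t-1} + b + eps_t, with iid eps_t ~ N(0, sigma^2)."""
    x = np.empty(n_steps)
    x[0] = x0
    noise = rng.normal(0.0, sigma, size=n_steps)
    for t in range(1, n_steps):
        x[t] = m * x[t - 1] + b + noise[t]
    return x

path = simulate_ar1(m=0.8, b=1.0, sigma=0.5)
# For |m| < 1 the process is stationary, fluctuating around
# the long-run mean b / (1 - m) = 5 in this example.
```

Try m close to 1 or above 1 to watch stationarity break down, which ties back to the windows-comparison intuition from earlier.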