DatAnalytics

Time Series Prediction of Daily Total Female Births in California – January, 1960

By Dr Gwinyai Nyakuengama

(3 October 2018)

 

KEY WORDS

ARFIMA; Time Series; Daily female births in California; Stata; R package – Prophet

 

ACKNOWLEDGEMENT

We gratefully acknowledge:

Collectively, these parties not only inspired but underpinned this blog.

 

OBJECTIVE

To predict the daily total female births in California over a 30-day horizon, January 1960.

 

METHOD

In this study:

 

RESULTS

This Stata plot of the daily female births in California for 1959 shows that the data are highly volatile.

This was suggestive of:

 

These Stata auto-correlation and partial auto-correlation plots also suggested the presence of serial correlation in the female daily birth time series.
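As a rough illustration of what an auto-correlation plot summarises, the sketch below computes sample autocorrelations in plain Python and flags lags outside an approximate 95% white-noise band. The function name `sample_acf` and the toy counts are hypothetical stand-ins, not the Stata output or the actual births data.

```python
def sample_acf(series, max_lag):
    """Return sample autocorrelations r_0 .. r_max_lag of a numeric sequence."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    total_ss = sum(d * d for d in deviations)
    acf = []
    for lag in range(max_lag + 1):
        cross = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
        acf.append(cross / total_ss)
    return acf


# Hypothetical toy counts (NOT the real California births series).
toy_births = [40, 42, 38, 45, 47, 41, 39, 44, 50, 46, 43, 48]
acf = sample_acf(toy_births, max_lag=5)

# Approximate 95% band for a white-noise series: +/- 1.96 / sqrt(n).
band = 1.96 / len(toy_births) ** 0.5
for lag, r in enumerate(acf):
    flag = "outside band" if lag > 0 and abs(r) > band else ""
    print(f"lag {lag}: {r:+.3f} {flag}")
```

Lags that fall outside the band are the ones a correlogram would typically show as significant spikes.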

 

Based on these Stata Dickey-Fuller test results, we failed to reject the null hypothesis of a random walk with a possible drift in the female daily births.
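To make the logic of the test concrete, here is a minimal sketch of the simplest Dickey-Fuller regression (a constant, no lagged differences) in plain Python. It is an assumption-laden illustration: the exact options used in the Stata run are not shown in this post, the function name `dickey_fuller_tstat` is hypothetical, and the data are simulated rather than the births series.

```python
import math
import random


def dickey_fuller_tstat(series):
    """t-statistic on beta in: diff(y)_t = alpha + beta * y_(t-1) + e_t."""
    lagged = series[:-1]
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    n = len(lagged)
    lag_mean = sum(lagged) / n
    diff_mean = sum(diffs) / n
    sxx = sum((x - lag_mean) ** 2 for x in lagged)
    sxy = sum((x - lag_mean) * (d - diff_mean) for x, d in zip(lagged, diffs))
    beta = sxy / sxx
    alpha = diff_mean - beta * lag_mean
    residuals = [d - alpha - beta * x for x, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)
    return beta / math.sqrt(sigma2 / sxx)


# Simulated examples: stationary white noise versus a random walk.
random.seed(1)
noise = [random.gauss(0, 1) for _ in range(300)]
walk = [0.0]
for _ in range(299):
    walk.append(walk[-1] + random.gauss(0, 1))

t_noise = dickey_fuller_tstat(noise)
t_walk = dickey_fuller_tstat(walk)
print(f"white noise t-statistic: {t_noise:.2f}")
print(f"random walk t-statistic: {t_walk:.2f}")
```

The statistic is compared with Dickey-Fuller critical values, not the usual t table; with a constant in the regression the large-sample 5% value is roughly -2.86. A statistic above that value means the unit-root null is not rejected.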

 

In Stata, the commonly used criteria for choosing an appropriate lag order are the Final Prediction Error (FPE), Akaike’s information criterion (AIC), the Hannan and Quinn information criterion (HQIC) and Schwarz’s Bayesian information criterion (SBIC). Ivanov and Kilian (2001) report that AIC works well on monthly data.

The above results from Stata’s vector auto-regression lag-order selection command (varsoc) indicate that the second lag (ar2) was picked by most decision criteria (i.e. FPE, AIC and HQIC). However, the first lag (ar1) was selected by the SBIC criterion.
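The way the criteria can disagree is easier to see in a small sketch. The code below fits AR(p) models by Yule-Walker (Levinson-Durbin recursion) and scores each order with AIC, HQIC and SBIC in the common form n·ln(residual variance) + penalty × number of parameters. This is an assumption for illustration only: Stata computes these criteria from the log likelihood of least-squares fits, so the numbers will not match, and the helper names and simulated series are hypothetical.

```python
import math
import random


def autocovariances(series, max_lag):
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def innovation_variances(series, max_lag):
    """Residual variance of Yule-Walker AR(p) fits for p = 0 .. max_lag."""
    gamma = autocovariances(series, max_lag)
    phi = []                      # coefficients of the current AR(p-1) fit
    variances = [gamma[0]]
    for p in range(1, max_lag + 1):
        num = gamma[p] - sum(phi[j] * gamma[p - 1 - j] for j in range(p - 1))
        reflection = num / variances[-1]   # partial autocorrelation at lag p
        phi = [phi[j] - reflection * phi[p - 2 - j] for j in range(p - 1)]
        phi.append(reflection)
        variances.append(variances[-1] * (1 - reflection ** 2))
    return variances


def select_lag(series, max_lag):
    """Return the lag order minimising each criterion."""
    n = len(series)
    variances = innovation_variances(series, max_lag)
    penalties = {
        "AIC": 2.0,
        "HQIC": 2.0 * math.log(math.log(n)),
        "SBIC": math.log(n),
    }
    chosen = {}
    for name, penalty in penalties.items():
        # p AR coefficients plus a constant -> p + 1 parameters.
        scores = [n * math.log(v) + penalty * (p + 1)
                  for p, v in enumerate(variances)]
        chosen[name] = scores.index(min(scores))
    return chosen


# Simulated AR(2) series (NOT the births data).
random.seed(0)
toy = [0.0, 0.0]
for _ in range(500):
    toy.append(0.5 * toy[-1] + 0.3 * toy[-2] + random.gauss(0, 1))

chosen = select_lag(toy, max_lag=4)
print(chosen)
```

Because SBIC penalises each extra parameter more heavily than AIC once the sample has more than about eight observations, it never selects a longer lag than AIC does, which is consistent with SBIC picking the shorter lag here.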

 

The Stata dfgls command (a module to compute the Dickey-Fuller/GLS unit root test):

Based on these results:

 

The above Stata ARFIMA regression results suggested:

 

This Stata plot shows:

Just focusing on the 30-day prediction from the Stata ARFIMA model:

 

The Stata ARFIMA model’s 30-day predictions in January 1960 show:

 

We also predicted the births using the R package – Prophet, with the uncertainty interval set to 90%, the same level as in Stata.

This plot from the R package – Prophet shows:

The average root-mean-square error (RMSE) from R was also around seven daily female births (7.2, to be precise).
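For readers unfamiliar with the metric, the RMSE is the square root of the mean squared difference between actual and predicted values, so it is expressed in the same units as the data (births per day). A minimal sketch follows; the function name `rmse` and the toy numbers are hypothetical and unrelated to the results reported here.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy values (NOT the births data or the model output).
toy_actual = [40, 45, 38, 50]
toy_predicted = [42, 41, 41, 46]
toy_rmse = rmse(toy_actual, toy_predicted)
print(f"RMSE: {toy_rmse:.2f} births per day")
```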

 

Just focusing on the 30-day prediction in January 1960, these two plots from the R package – Prophet show:

 

CONCLUSION

 

BIBLIOGRAPHY

Becketti, S. (2013): Introduction to Time Series Using Stata, 1st Edition, Stata Press. https://www.amazon.com/Introduction-Time-Using-Stata-Becketti/dp/1597181323

Ivanov, V. and Kilian, L. (2001): ‘A Practitioner’s Guide to Lag-Order Selection for Vector Autoregressions’, CEPR Discussion Paper no. 2685, London, Centre for Economic Policy Research. http://www.cepr.org/pubs/dps/DP2685.asp

Prophet: https://facebook.github.io/prophet/docs/quick_start.html

Prophet R package (June 15, 2018): https://cran.r-project.org/web/packages/prophet/prophet.pdf

StataCorp (2013): Stata Time-Series Reference Manual. https://www.stata.com/manuals/ts.pdf

 
