%------------------------------------------------------------------------------
%\documentclass[reqno]{amsart}
\documentclass[12pt]{amsart}
%\setcounter{page}{6}
\usepackage[top=1in, bottom=1in, left=1in, right=1in]{geometry}
\usepackage[colorlinks=true, urlcolor=blue]{hyperref}
\usepackage{amssymb}
\usepackage{amsmath,mathrsfs}
%\oddsidemargin -0in \evensidemargin .5in
%\topmargin=1.25in
%\headheight 10pt \headsep 10pt \footheight 10pt \footskip 24pt
%\textheight 10in \textwidth 6.5in \columnsep 10pt \columnseprule 0pt
%\font\namefont=cmr10 scaled\magstep2
\font\namefont=cmr8 scaled\magstep2
\newcommand{\myindent}{\leftskip=.4in}
%\voffset=-.75in
\parskip=11pt %extra vertical distance for new paragraph
\parindent=0in
%\newtheorem{theorem}{Theorem}[section]
%\newtheorem{lemma}[theorem]{Lemma}
%\newtheorem{lemma}{Lemma}[section]
%\newtheorem{theorem}{Theorem}[section]
%\newtheorem{lemma}[theorem]{Lemma}
%\newtheorem{prop}[theorem]{Proposition}
%\newtheorem{cor}[theorem]{Corollary}
%\newtheorem{conj}{Conjecture}
\usepackage{graphicx}
\usepackage{color}
\usepackage{subfigure}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{colonequals}
\usepackage{hyperref}
%\usepackage{showlabels}
%\usepackage[all]{xypic}
%\entrymodifiers={+!!<0pt,\fontdimen22\textfont2>}
\theoremstyle{definition}
\newtheorem{exercise}{Exercise}
\newtheorem{remark}{Remark}
\renewcommand{\setminus}{\smallsetminus}
\addtolength{\footskip}{17pt}
%\numberwithin{table}{section}
\renewcommand{\subset}{\subseteq}
\renewcommand{\supset}{\supseteq}
\renewcommand{\epsilon}{\varepsilon}
\newcommand{\abs}[1]{\left|#1\right|} % Absolute value notation
\newcommand{\absf}[1]{|#1|} % small absolute value signs
\newcommand{\vnorm}[1]{\left|\left|#1\right|\right|} % norm notation
\newcommand{\vnormf}[1]{||#1||} % norm notation, forced to be small
\newcommand{\im}[1]{\mbox{im}#1} % Pieces of English for math mode
\newcommand{\tr}[1]{\mbox{tr}#1}
\newcommand{\Proj}[1]{\mbox{Proj}#1}
\newcommand{\Vol}[1]{\mbox{Vol}#1}
\newcommand{\Z}{\mathbf{Z}} % Blackboard notation
\newcommand{\N}{\mathbf{N}}
\newcommand{\E}{\mathbf{E}}
\renewcommand{\P}{\mathbf{P}}
\newcommand{\R}{\mathbf{R}}
\newcommand{\C}{\mathbf{C}}
\newcommand{\Q}{\mathbf{Q}}
\newcommand{\figoneawidth}{.5\textwidth} % Image formatting parameters
\newcommand{\lbreak}{\\} % Linebreak
\newcommand{\italicize}[1]{\textit {#1}} % formatting commands for bibliography
%\newcommand{\embolden}[1]{\textbf {#1}}
\newcommand{\embolden}[1]{{#1}}
\newcommand{\undline}[1]{\underline {#1}}
\newcommand{\e}{\varepsilon}
\renewcommand{\epsilon}{\varepsilon}
\renewcommand{\Re}{\mathrm{Re}}
%\renewcommand{\colonequals}{=}
\thispagestyle{empty}
\begin{document}
\bigskip
Graduate Time Series \hfill Steven Heilman\\
\noindent\rule{6.5in}{0.4pt}
\begin{center}
{\Large Spring 2020, Math 545, Take-Home Final Exam}
\end{center}
Due: 5PM PST on May 8th, 2020.
%\vspace{.2cm}
\textbf{Instructions}.
\begin{itemize}
\item You \textbf{can} use your book, notes, and previous homeworks on this exam.
\item You \textbf{cannot use the internet or internet search resources} such as google, mathoverflow, etc. to complete the exam.
\item You are required to show your work on each problem on the exam.
\item For exercises involving coding, submit all of your code, and submit all output that is relevant to that exercise.
\item The exam is \textbf{due at 5PM PST on May 8th, 2020}.
\item You must submit the exam, via email, to \verb!stevenmheilman@gmail.com!. This is the only email address at which the exam will be accepted.
\item The exam must be submitted as a \textbf{single PDF file}. Submitting separate files will be penalized.
\item \textbf{If you use a theorem or proposition from class or the notes or the book you must explicitly cite the theorem/proposition number from the book/notes} and explain
why the theorem may be applied. (Since this particular exam is open note and open book, this extra requirement should be reasonable.)
\item \textbf{Organize your work}, in a reasonably neat and coherent way. Work scattered all over the page without a clear ordering will
receive very little credit.
\item \textbf{Mysterious or unsupported answers will not receive full
credit}. A correct answer, unsupported by calculations, explanation,
or algebraic work will receive no credit; an incorrect answer supported
by substantially correct calculations and explanations might still receive
partial credit.
\end{itemize}
\begin{exercise}
Let $0<\sigma<\infty$. Let $\{Z_{n}\}_{n\in\Z}$ be real-valued $WN(0,\sigma^{2})$. Let $\{X_{n}\}_{n\in\Z}$ be a real-valued mean zero ARMA$(1,1)$ process defined by
$$X_{n}-\frac{1}{3}X_{n-1}=Z_{n}+\frac{1}{3}Z_{n-1},\qquad\forall\,n\in\Z.$$
\begin{itemize}
\item Prove that there exist $c_{0},c_{1},\ldots\in\R$ such that
$$X_{n}=\sum_{j\geq0}c_{j}Z_{n-j},\qquad\forall\,n\in\Z.$$
Also, explicitly find the constants $c_{0},c_{1},\ldots$.
\item Write down an explicit formula for the autocovariance function $\gamma\colon\Z\to\R$ of $\{X_{n}\}_{n\in\Z}$.
\item Find the value of the partial autocorrelation function $\alpha(1)$. Also, find the one-step prediction error $\E(X_{2}-Y_{2})^{2}$.
\item Write down an explicit formula for the spectral density function $f\colon\R/\Z\to\C$ of $\{X_{n}\}_{n\in\Z}$.
\end{itemize}
\end{exercise}
\begin{exercise}[Financial Data Analysis, Version 2]
In this exercise, we will attempt to model financial data by ARMA processes. You can create a spreadsheet of financial data if you open up a spreadsheet in google spreadsheets, and input the following command into one of the cells in the spreadsheet (and then press enter):
\begin{verbatim}
=GOOGLEFINANCE("GOOG", "price", "1/1/2010", "12/31/2019", "DAILY")
\end{verbatim}
The first argument in the command is the stock price ticker symbol (which corresponds to google stock). The closing price of the stock is then listed every day, starting from January 1st, 2010, and ending on December 31st, 2019. Once you have the spreadsheet data, you should be able to import it into Matlab or your favorite mathematical software.
A classic barometer of the US economy is the ten year US treasury note yield. You can access this data with the command
\begin{verbatim}
=GOOGLEFINANCE("TNX", "price", DATE(2005,1,1), DATE(2019,12,31), "DAILY")
\end{verbatim}
That is, we are analyzing this data from 2005 to 2019. Let $j$ be the (non-weekend) day occurring after January 1, 2005, so e.g. $j=6$ corresponds to the sixth non-weekend day after January 1, 2005, and denote $Y_{j}$ as the corresponding ten year US treasure note yield on day $j$. There should be $3,767$ entries in the data.
In a previous homework exercise, we examined the Yule-Walker estimators and autoregressive parameters for this data. In this exercise, we will try to find the trend and seasonal components of the data.
$\bullet$ In order to find the trend and seasonal components, first take the Fourier transform of the data, i.e. examine the function
$$\widehat{Y}(s)=\sum_{j=1}^{3767}Y_{j}e^{2\pi isj},\qquad\forall\,s\in\R/\Z$$
It should just have a large spike near zero, indicating a trend component, with perhaps no observed seasonal component. Plot the inverse Fourier transform of the large spike. That is, determine the frequencies $s$ where the large spike is supported in the Fourier transform (say it is an interval of the form $[-a,a]$), and then plot
$$
S_{j}\colonequals \int_{-a}^{a}\widehat{Y}(r)e^{-2\pi irj}dr,\qquad\forall\,1\leq j\leq 3767.
$$
Your choice of $a$ should make $S_{j}$ resemble the original data, but $a$ should not be so large that it matches the small oscillations of the data too closely. (The exact choice of $a$ might be a subjective choice.)
$\bullet$ The Fourier transform $\widehat{Y}(s)$ should look fairly ``jagged.'' So, let's smooth it out to get a better picture of it (and verify whether or not a seasonal component exists). Recall that the $n^{th}$ \textbf{periodogram} of $Y_{1},\ldots,Y_{n}$ is the function $I_{n}\colon\R/\Z\to[0,\infty)$ defined by
$$I_{n}(t)\colonequals\frac{1}{n}\abs{\sum_{j=1}^{n}Y_{j}e^{2\pi itj}}^{2},\qquad\forall\,t\in\R/\Z.$$
Define $w_{n}\colon\Z\to[0,\infty)$ so that $w_{n}(j)\colonequals c$ if $\abs{j}\leq\sqrt{n}$, $w_{n}(j)\colonequals0$ if $\abs{j}>\sqrt{n}$, and $c>0$ is chosen so that $\sum_{j\in\Z}w_{n}(j)=1$. Recall that the \textbf{smoothed periodogram} is defined to be
$$I_{n,w}(t)\colonequals\sum_{j\in\Z}w_{n}(j)I_{n}(t-j/n),\qquad\forall\,t\in\R.$$
Plot the smoothed periodogram of the data. Does the result help you determine where the large spike is in the Fourier transform of the data? If so, amend the first part of the problem. Do you observe any seasonal component, i.e. a spike in the Fourier transform away from $0$? If so, plot the seasonal component on its own.
$\bullet$ Finally, plot the data, with the trend (and seasonal component, if you found one) subtracted from the data. Call the resulting data $\{W_{j}\}_{j=1,2,\ldots,3767}$. We anticipate that $\{W_{j}\}_{j=1,2,\ldots,3767}$ should then be weakly stationary. Try to fit an autoregressive model AR$(p)$ to $\{W_{j}\}_{j=1,2,\ldots,3767}$. That is, use the Yule-Walker estimators for the autoregressive parameters. Examine the values of the partial autocorrelation function, to try to determine a good value of $p$ for the model. (Optional: use the AICC criterion to estimate the values of $p,q$ in an ARMA$(p,q)$ model for the data.)
$\bullet$ The (smoothed) periodogram of $\{W_{j}\}_{j=1,2,\ldots,3767}$ gives a good estimate for the spectral density of $\{W_{j}\}_{j=1,2,\ldots,3767}$. Using the MLARMA estimate (or any other estimate for the ARMA parameters of the process which are then plugged into an estimate of $\phi$ and $\theta$, which then determine an estimate of the spectral density of the form $\sigma^{2}\abs{\theta/\phi}^{2}$), compare that estimate of the spectral density of $\{W_{j}\}_{j=1,2,\ldots,3767}$ to the smoothed periodogram of $\{W_{j}\}_{j=1,2,\ldots,3767}$. Do they appear similar to each other? To get a third comparison, plot the lag window spectral density estimator
$$F_{n,u}(t)\colonequals\sum_{j\in\Z}u(j/m)\Gamma_{n}(j)e^{2\pi itj},\qquad\forall\,t\in\R/\Z,$$
with $u(x)\colonequals 2(1-\abs{2x})1_{\abs{x}<1/2}$ for any $x\in\R$, and $m_{n}\colonequals\lfloor\sqrt{n}\rfloor$ for all $n\geq1$. Does this estimate of the spectral density resemble the other two? If not, adjust the parameters of the estimators, such as the smoothing function $w_{n}$ for the smoothed periodogram, try changing the function $m_{n}$, or try other choices of $u$, as described in the book. Ideally, all of the estimators will be in agreement with each other, and they should agree with Theorem 8.9 in the notes, but in practice, this might not happen.
$\bullet$ Repeat all steps above for the logarithmically differenced google stock data from 2005 to 2019, i.e. using the command
\begin{verbatim}
=GOOGLEFINANCE("GOOG", "price", "1/1/2005", "12/31/2019", "DAILY")
\end{verbatim}
That is, if $X_{n}$ denotes the google stock price on day $n$, then consider
$$Y_{n}\colonequals\log(X_{n+1}/X_{n})=\log X_{n+1}-\log X_{n}$$
and apply the above steps to $Y_{n}$. The behavior you examine here might be different than the behavior for the ten year treasury note yield.
\end{exercise}
\begin{exercise}[\embolden{Sunspot Data, Version 4}]
This exercise deals with sunspot data from the following files (the same data appears in different formats)
\href{http://www.sidc.be/silso/DATA/SN_d_tot_V2.0.txt}{txt file}\qquad\qquad\qquad\href{http://www.sidc.be/silso/INFO/sndtotcsv.php}{csv (excel) file}
These files are taken from \href{http://www.sidc.be/silso/datafiles#total}{http://www.sidc.be/silso/datafiles\#total}
To work with this data, e.g. in Matlab you can use the command
\begin{verbatim}
x=importdata('SN_d_tot_V2.0.txt')
\end{verbatim}
to import the .txt file.
The format of the data is as follows.
\begin{itemize}
\item Columns 1-3: Gregorian calendar date (Year, Month, then Day)
\item Column 4: Date in fraction of year
\item Column 5: Daily total number of sunspots observed on the sun. A value of -1 indicates that no number is available for that day (missing value).
\item Column 6: Daily standard deviation of the input sunspot numbers from individual stations.
\item Column 7: Number of observations used to compute the daily value.
\item Column 8: Definitive/provisional indicator. A blank indicates that the value is definitive. A '*' symbol indicates that the value is still provisional and is subject to a possible revision (Usually the last 3 to 6 months)
\end{itemize}
In two previous exercises, we examined the number of sunspots $U_{t}$ at time $t$, where $t$ is measured in years. We also took the Fourier transform of $U$, and defined
$$\widehat{U}(r)\colonequals\sum_{t\in\Z/365}U_{t}e^{2\pi itr},\qquad\forall\,r\in\R/365\Z.$$
We found that the seasonal component of $U$ closely matched the following function of $t\in\Z/365$
$$
S_{t}\colonequals \frac{1}{365}\int_{.08}^{.105}\widehat{U}(r)e^{-2\pi irt}dr+\frac{1}{365}\int_{-.105}^{-.08}\widehat{U}(r)e^{-2\pi irt}dr.
$$
We also found that the trend component of $U$ closely matched the following function of $t\in\Z/365$
$$
M_{t}\colonequals\frac{1}{365}\int_{-.016}^{.016}\widehat{U}(r)e^{-2\pi irt}dr.
$$
In this exercise, we will smooth out the Fourier transform to get a better picture of it.
$\bullet$ Plot the periodogram and smoothed periodogram of the sunspot data (using the same smoothing function $w_{n}$ as in the previous Exercise). Compare the two plots. Does the smoothed periodogram have a spike around frequency $1/11$? What is the exact frequency of the highest value of the spike in the Fourier transform?
$\bullet$ The (smoothed) periodogram of $U_{t}-S_{t}-M_{t}$ gives a good estimate for the spectral density of $U_{t}-S_{t}-M_{t}$. Using the MLARMA estimate (or any other estimate for the ARMA parameters of the process which are then plugged into an estimate of $\phi$ and $\theta$, which then determine an estimate of the spectral density of the form $\sigma^{2}\abs{\theta/\phi}^{2}$), compare that estimate of the spectral density to the smoothed periodogram. Do they appear similar to each other? To get a third comparison, plot the lag window spectral density estimator
$$F_{n,u}(t)\colonequals\sum_{j\in\Z}u(j/m)\Gamma_{n}(j)e^{2\pi itj},\qquad\forall\,t\in\R/\Z,$$
with $u(x)\colonequals 2(1-\abs{2x})1_{\abs{x}<1/2}$ for any $x\in\R$, and $m_{n}\colonequals\lfloor\sqrt{n}\rfloor$ for all $n\geq1$. Does this estimate of the spectral density resemble the other two? If not, adjust the parameters of the estimators, such as the smoothing function $w_{n}$ for the smoothed periodogram, try changing the function $m_{n}$, or try other choices of $u$, as described in the book. Ideally, all of the estimators will be in agreement with each other, and they should agree with Theorem 8.9 in the notes, but in practice, this might not happen.
\end{exercise}
\end{document}