%------------------------------------------------------------------------------
%\documentclass[reqno]{amsart}
\documentclass[12pt]{amsart}
%\setcounter{page}{6}
\usepackage[top=1in, bottom=1in, left=1in, right=1in]{geometry}
\usepackage[colorlinks=true, urlcolor=blue]{hyperref}
\usepackage{amssymb}
\usepackage{amsmath,mathrsfs}
%\oddsidemargin -0in \evensidemargin .5in
%\topmargin=1.25in
%\headheight 10pt \headsep 10pt \footheight 10pt \footskip 24pt
%\textheight 10in \textwidth 6.5in \columnsep 10pt \columnseprule 0pt
%\font\namefont=cmr10 scaled\magstep2
\font\namefont=cmr8 scaled\magstep2
\newcommand{\myindent}{\leftskip=.4in}
%\voffset=-.75in
\parskip=11pt %extra vertical distance for new paragraph
\parindent=0in
%\newtheorem{theorem}{Theorem}[section]
%\newtheorem{lemma}[theorem]{Lemma}
%\newtheorem{lemma}{Lemma}[section]
%\newtheorem{theorem}{Theorem}[section]
%\newtheorem{lemma}[theorem]{Lemma}
%\newtheorem{prop}[theorem]{Proposition}
%\newtheorem{cor}[theorem]{Corollary}
%\newtheorem{conj}{Conjecture}
\usepackage{graphicx}
\usepackage{color}
\usepackage{subfigure}
\usepackage{colonequals}
%\usepackage{showlabels}
%\usepackage[all]{xypic}
%\entrymodifiers={+!!<0pt,\fontdimen22\textfont2>}
\theoremstyle{definition}
\newtheorem{exercise}{Exercise}
\newtheorem{remark}{Remark}
\renewcommand{\setminus}{\smallsetminus}
\addtolength{\footskip}{17pt}
%\numberwithin{table}{section}
\renewcommand{\subset}{\subseteq}
\renewcommand{\supset}{\supseteq}
\renewcommand{\epsilon}{\varepsilon}
\newcommand{\abs}[1]{\left|#1\right|} % Absolute value notation
\newcommand{\absf}[1]{|#1|} % small absolute value signs
\newcommand{\vnorm}[1]{\left\|#1\right\|} % norm notation
\newcommand{\vnormf}[1]{\|#1\|} % norm notation, forced to be small
\newcommand{\im}[1]{\mbox{im}#1} % Pieces of English for math mode
\newcommand{\tr}[1]{\mbox{tr}#1}
\newcommand{\Proj}[1]{\mbox{Proj}#1}
\newcommand{\Vol}[1]{\mbox{Vol}#1}
\newcommand{\Z}{\mathbf{Z}} % Blackboard notation
\newcommand{\N}{\mathbf{N}}
\newcommand{\E}{\mathbf{E}}
\renewcommand{\P}{\mathbf{P}}
\newcommand{\R}{\mathbf{R}}
\newcommand{\C}{\mathbf{C}}
\newcommand{\Q}{\mathbf{Q}}
\newcommand{\figoneawidth}{.5\textwidth} % Image formatting parameters
\newcommand{\lbreak}{\\} % Linebreak
\newcommand{\italicize}[1]{\textit {#1}} % formatting commands for bibliography
%\newcommand{\embolden}[1]{\textbf {#1}}
\newcommand{\embolden}[1]{{#1}}
\newcommand{\undline}[1]{\underline {#1}}
\newcommand{\e}{\varepsilon}
%\renewcommand{\colonequals}{=}
\thispagestyle{empty}
\begin{document}
\bigskip
Graduate Mathematical Statistics \hfill Steven Heilman\\
\noindent\rule{6.5in}{0.4pt}
%\vspace{.2cm}
Please provide complete and well-written solutions to the following exercises.
Due January 20, 9AM, to be submitted on Blackboard, under the Assignments tab.
\vspace{.5cm}
\begin{center}
{\Large Homework 1}
\end{center}
\vspace{.5cm}
\begin{exercise}
As needed, refresh your knowledge of proofs and logic by reading the following document by Michael Hutchings: \href{http://math.berkeley.edu/~hutching/teach/proofs.pdf}{http://math.berkeley.edu/$\sim$hutching/teach/proofs.pdf}
\end{exercise}
\begin{exercise}
As needed, take the following quizzes on logic and set theory:
\href{http://scherk.pbworks.com/w/page/14864234/Quiz\%3A\%20Logic}{http://scherk.pbworks.com/w/page/14864234/Quiz\%3A\%20Logic}\\
\href{http://scherk.pbworks.com/w/page/14864241/Quiz\%3A\%20Sets}{http://scherk.pbworks.com/w/page/14864241/Quiz\%3A\%20Sets}\\
%\href{http://scherk.pbworks.com/w/page/14864227/Quiz\%3A\%20Functions}{http://scherk.pbworks.com/w/page/14864227/Quiz\%3A\%20Functions}
(These quizzes are just for your own benefit; you don't need to record your answers anywhere.)
\end{exercise}
\begin{exercise}
Two people take turns throwing darts at a board. Person $A$ goes first, and each of her throws has a probability of $1/4$ of hitting the bullseye. Person $B$ goes next, and each of her throws has a probability of $1/3$ of hitting the bullseye. Then Person $A$ goes, and so on. With what probability will Person $A$ hit the bullseye before Person $B$ does?
\end{exercise}
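(Not part of the assignment.) If you want to check your exact answer numerically, the dart game is easy to simulate. A minimal Monte Carlo sketch in Python; the trial count is illustrative:

```python
import random

def a_wins(p_a=1/4, p_b=1/3):
    """Simulate one game; return True if Person A hits a bullseye first."""
    while True:
        if random.random() < p_a:   # Person A throws
            return True
        if random.random() < p_b:   # Person B throws
            return False

random.seed(0)
trials = 200_000
estimate = sum(a_wins() for _ in range(trials)) / trials
print(estimate)  # compare with your exact answer
```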
\begin{exercise}
Two people are flipping fair coins. Let $n$ be a positive integer. Person $I$ flips $n+1$ coins. Person $II$ flips $n$ coins. Show that the following event has probability $1/2$: Person $I$ has more heads than Person $II$.
\end{exercise}
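(Not part of the assignment.) Before writing the proof, you can confirm the claim by exact computation for small $n$: condition on Person $II$'s head count and sum binomial probabilities. A sketch in Python:

```python
from math import comb

def prob_more_heads(n):
    """P(Binomial(n+1, 1/2) > Binomial(n, 1/2)), computed exactly."""
    total = 0.0
    for k in range(n + 1):                      # Person II gets k heads
        p_ii = comb(n, k) / 2**n
        p_i = sum(comb(n + 1, j) for j in range(k + 1, n + 2)) / 2**(n + 1)
        total += p_ii * p_i
    return total

for n in (1, 2, 5, 10):
    print(n, prob_more_heads(n))   # each value is 1/2
```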
\begin{exercise}
Suppose a test for a disease is $99.9\%$ accurate. That is, if you have the disease, the test will be positive with $99.9\%$ probability. And if you do not have the disease, the test will be negative with $99.9\%$ probability. Suppose also that the disease is fairly rare, so that roughly $1$ in $20{,}000$ people have the disease. If you test positive for the disease, with what probability do you actually have the disease?
\end{exercise}%
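(Not part of the assignment.) A generic Bayes' rule helper you can use to check your answer; the function name and the illustrative numbers below are ours, not the exercise's:

```python
def posterior(sensitivity, specificity, prevalence):
    """P(disease | positive test), by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Illustrative check (different numbers from the exercise): a 99%-accurate
# test for a condition with 1% prevalence gives a posterior of 1/2.
print(posterior(0.99, 0.99, 0.01))  # approximately 0.5
```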
\begin{exercise}[\embolden{Inclusion-Exclusion Formula}]
In the properties of probability laws, we showed that $\P(A\cup B)=\P(A)+\P(B)-\P(A\cap B)$. The following equality generalizes this fact. Let $\Omega$ be a discrete sample space, and let $\P$ be a probability law on $\Omega$. Let $A_{1},\ldots,A_{n}\subset\Omega$. Prove that:
\begin{flalign*}
\P(\cup_{i=1}^{n}A_{i})&=\sum_{i=1}^{n}\P(A_{i})-\sum_{1\leq i<j\leq n}\P(A_{i}\cap A_{j})+\sum_{1\leq i<j<k\leq n}\P(A_{i}\cap A_{j}\cap A_{k})\\
&\qquad-\cdots+(-1)^{n+1}\P(A_{1}\cap\cdots\cap A_{n}).
\end{flalign*}
\end{exercise}
\begin{exercise}[\embolden{Stein Identity}]
Let $X$ be a standard Gaussian random variable. Let $g\colon\R\to\R$ be a differentiable function such that there exist $a,b>0$ such that $\abs{g(x)},\abs{g'(x)}\leq a(1+\abs{x})^{b}$, $\forall$ $x\in\R$. Prove the \textbf{Stein identity}
$$\E Xg(X)=\E g'(X).$$
Using this identity, recursively compute $\E X^{k}$ for any positive integer $k$.
Alternatively, for any $t>0$, show that $\E e^{tX}=e^{t^{2}/2}$, i.e. compute the \textbf{moment generating function} of $X$. Then, using $\frac{d^{k}}{dt^{k}}|_{t=0}\E e^{tX}=\E X^{k}$ and using the power series expansion of the exponential, compute $\E X^{k}$ directly from the identity $\E e^{tX}=e^{t^{2}/2}$.
\end{exercise}
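(Not part of the assignment.) A Monte Carlo sanity check of the Stein identity, not a proof, using the test function $g(x)=x^{3}$ (so $g'(x)=3x^{2}$, and $g$ satisfies the growth condition with $a=1$, $b=3$); the sample size is illustrative:

```python
import random

random.seed(1)
samples = [random.gauss(0.0, 1.0) for _ in range(200_000)]
lhs = sum(x * x**3 for x in samples) / len(samples)   # estimates E[X g(X)]
rhs = sum(3 * x**2 for x in samples) / len(samples)   # estimates E[g'(X)]
print(lhs, rhs)   # the two estimates should roughly agree
```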
\begin{exercise}[\embolden{MAX-CUT}]
The probabilistic method is a very useful way to prove the existence of something satisfying some properties. This method is based upon the following elementary statement: If $\alpha\in\R$ and if a random variable $X\colon\Omega\to\R$ satisfies $\E X\geq\alpha$, then there exists some $\omega\in\Omega$ such that $X(\omega)\geq\alpha$. We will demonstrate this principle in this exercise.
Let $G=(V,E)$ be an undirected graph on the vertex set $V=\{1,\ldots,n\}$, so that the edge set $E$ is a set of unordered pairs $\{i,j\}$ with $i,j\in V$ and $i\neq j$. Let $S\subset V$ and denote $S^{c}\colonequals V\setminus S$. We refer to $(S,S^{c})$ as a cut of the graph $G$. The goal of the MAX-CUT problem is to maximize the number of edges going between $S$ and $S^{c}$, over all cuts of the graph $G$.
Prove that there exists a cut $(S,S^{c})$ of the graph such that the number of edges going between $S$ and $S^{c}$ is at least $\abs{E}/2$. (Hint: define a random $S\subset V$ such that, for every $i\in V$, $\P(i\in S)=1/2$, and the events $1\in S,2\in S,\ldots,n\in S$ are all independent. If $\{i,j\}\in E$, show that $\P(i\in S,j\notin S)=1/4$. So, what is the expected number of edges $\{i,j\}\in E$ such that $i\in S$ and $j\notin S$?)
\end{exercise}
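(Not part of the assignment.) The hint can be rendered in code: on a small illustrative graph (here the complete graph on $8$ vertices), random cuts average $\abs{E}/2$ crossing edges, so some cut must reach at least $\abs{E}/2$. A sketch:

```python
import random
from itertools import combinations

random.seed(2)
n = 8
edges = list(combinations(range(n), 2))    # complete graph: |E| = 28

def random_cut_size():
    """Place each vertex in S independently with probability 1/2."""
    in_s = [random.random() < 0.5 for _ in range(n)]
    return sum(in_s[i] != in_s[j] for i, j in edges)  # edges crossing the cut

sizes = [random_cut_size() for _ in range(20_000)]
avg = sum(sizes) / len(sizes)
print(avg, len(edges) / 2, max(sizes))     # average is close to |E|/2
```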
\begin{exercise}\label{ex0}
Let $n\geq2$ be a positive integer. Let $x=(x_{1},\ldots,x_{n})\in\R^{n}$. For any $x,y\in\R^{n}$, define $\langle x,y\rangle\colonequals\sum_{i=1}^{n}x_{i}y_{i}$ and $\vnorm{x}\colonequals\langle x,x\rangle^{1/2}$. Let $S^{n-1}\colonequals\{x\in\R^{n}\colon\vnorm{x}=1\}$ be the sphere of radius $1$ centered at the origin. Let $x\in S^{n-1}$ be fixed. Let $v$ be a random vector that is uniformly distributed in $S^{n-1}$. Prove:
$$\E \abs{\langle x,v\rangle}\geq\frac{1}{10\sqrt{n}}.$$
\end{exercise}
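(Not part of the assignment.) A numerical sanity check, not a proof: a uniform point on the sphere can be sampled by normalizing a standard Gaussian vector, and by symmetry we may take $x$ to be the first standard basis vector, so that $\abs{\langle x,v\rangle}=\abs{v_{1}}$. The values of $n$ and the trial count below are illustrative:

```python
import math
import random

random.seed(3)
n = 20
trials = 50_000

def unit_vector(n):
    """A uniformly distributed point on the sphere S^{n-1}."""
    g = [random.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(t * t for t in g))
    return [t / norm for t in g]

# With x = e_1, |<x, v>| is just |v_1|.
avg = sum(abs(unit_vector(n)[0]) for _ in range(trials)) / trials
print(avg, 1 / (10 * math.sqrt(n)))   # the average exceeds the bound
```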
\begin{exercise}[\embolden{The Power Method}]
This exercise gives an algorithm for finding the eigenvectors and eigenvalues of a symmetric matrix. In modern statistics, this is often a useful thing to do. The Power Method described below is not the best algorithm for this task, but it is perhaps the easiest to describe and analyze.
Let $A$ be an $n\times n$ real symmetric matrix. Let $\lambda_{1}\geq\cdots\geq\lambda_{n}$ be the (unknown) eigenvalues of $A$, and let $v_{1},\ldots,v_{n}\in\R^{n}$ be the corresponding (unknown) eigenvectors of $A$ such that $\vnorm{v_{i}}=1$ and such that $A v_{i}=\lambda_{i}v_{i}$ for all $1\leq i\leq n$.
Given $A$, our first goal is to find $v_{1}$ and $\lambda_{1}$. For simplicity, assume that $1/2<\lambda_{1}<1$, and $0\leq \lambda_{n}\leq\cdots\leq\lambda_{2}<1/4$. Suppose we have found a vector $v\in\R^{n}$ such that $\vnorm{v}=1$ and $\abs{\langle v,v_{1}\rangle}>1/n$. (From Exercise \ref{ex0}, a randomly chosen $v$ satisfies this property.) Let $k$ be a positive integer. Show that
$$A^{k}v$$
approximates $v_{1}$ well as $k$ becomes large. More specifically, show that for all $k\geq1$,
$$\vnorm{A^{k}v - \langle v,v_{1}\rangle\lambda_{1}^{k}v_{1}}^{2}\leq\frac{n-1}{16^{k}}.$$
(Hint: use the spectral theorem for symmetric matrices.)
Since $\abs{\langle v,v_{1}\rangle}\lambda_{1}^{k}>2^{-k}/n$, this inequality implies that $A^{k}v$ is approximately an eigenvector of $A$ with eigenvalue $\lambda_{1}$. That is, by the triangle inequality,
$$\vnorm{A(A^{k}v)-\lambda_{1}(A^{k}v)}
\leq\vnorm{A^{k+1}v-\langle v,v_{1}\rangle\lambda_{1}^{k+1}v_{1}}
+\lambda_{1}\vnorm{\langle v,v_{1}\rangle\lambda_{1}^{k}v_{1}-A^{k}v}\leq 2\frac{\sqrt{n-1}}{4^{k}}.$$
Moreover, by the reverse triangle inequality,
$$\vnorm{A^{k}v}=\vnorm{A^{k}v-\langle v,v_{1}\rangle\lambda_{1}^{k}v_{1}+\langle v,v_{1}\rangle\lambda_{1}^{k}v_{1}}
\geq\frac{1}{n}2^{-k}-\frac{\sqrt{n-1}}{4^{k}}.$$
In conclusion, if we take $k$ to be large (say $k>10\log n$), and if we define $z\colonequals A^{k}v$, then $z$ is approximately an eigenvector of $A$, that is
$$\vnorm{A\frac{A^{k}v}{\vnorm{A^{k}v}}-\lambda_{1}\frac{A^{k}v}{\vnorm{A^{k}v}}}\leq 4n^{3/2}2^{-k}\leq 4n^{-4}.$$
And to approximately find the first eigenvalue $\lambda_{1}$, we simply compute
$$\frac{z^{T}Az}{z^{T}z}.$$
That is, we have approximately found the first eigenvector and eigenvalue of $A$.
\textit{Remarks}. To find the second eigenvector and eigenvalue, we can repeat the above procedure, where we start by choosing $v$ such that $\langle v,v_{1}\rangle=0$, $\vnorm{v}=1$ and $\abs{\langle v,v_{2}\rangle}>1/(10\sqrt{n})$. To find the third eigenvector and eigenvalue, we can repeat the above procedure, where we start by choosing $v$ such that $\langle v,v_{1}\rangle=\langle v,v_{2}\rangle=0$, $\vnorm{v}=1$ and $\abs{\langle v,v_{3}\rangle}>1/(10\sqrt{n})$. And so on.
Google's PageRank algorithm uses the power method to rank websites very rapidly. In particular, they let $n$ be the number of websites on the internet (so that $n$ is roughly $10^{9}$). They then define an $n\times n$ matrix $C$ where $C_{ij}=1$ if there is a hyperlink between websites $i$ and $j$, and $C_{ij}=0$ otherwise. Then, they let $B$ be an $n\times n$ matrix such that $B_{ij}$ is $1$ divided by the number of $1$'s in the $i^{th}$ row of $C$, if $C_{ij}=1$, and $B_{ij}=0$ otherwise. Finally, they define
$$A=(.85)B+(.15)D/n$$
where $D$ is an $n\times n$ matrix all of whose entries are $1$.
The power method finds the eigenvector $v_{1}$ of $A$, and the size of the $i^{th}$ entry of $v_{1}$ is proportional to the ``rank'' of website $i$.
\end{exercise}
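(Not part of the assignment.) The power method fits in a few lines of plain Python. The matrix below is illustrative: its top eigenpair is known (eigenvalue $3$, eigenvector $(1,1)/\sqrt{2}$), so the output can be checked; normalizing at every step changes nothing mathematically but avoids overflow and underflow:

```python
def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def normalize(v):
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

A = [[2.0, 1.0],
     [1.0, 2.0]]                 # eigenvalues 3 and 1
v = normalize([1.0, 0.3])        # any start with <v, v_1> != 0 works
for _ in range(50):              # iterate v <- A v, renormalizing
    v = normalize(mat_vec(A, v))

# Rayleigh quotient z^T A z / (z^T z) approximates lambda_1.
lam = sum(a * x for a, x in zip(mat_vec(A, v), v))
print(v, lam)   # v close to [0.7071..., 0.7071...], lam close to 3
```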
\begin{exercise}\label{ex8}
Let $X_{1},Y_{1}$ be random variables with joint PDF $f_{X_{1},Y_{1}}$. Let $X_{2},Y_{2}$ be random variables with joint PDF $f_{X_{2},Y_{2}}$. Let $T\colon\R^{2}\to\R^{2}$ and $S\colon\R^{2}\to\R^{2}$ be continuously differentiable functions such that $ST(x,y)=(x,y)$ and $TS(x,y)=(x,y)$ for every $(x,y)\in\R^{2}$. Let $J(x,y)$ denote the determinant of the Jacobian of $S$ at $(x,y)$. Assume that $(X_{2},Y_{2})=T(X_{1},Y_{1})$. Using the change of variables formula from multivariable calculus, show that
$$f_{X_{2},Y_{2}}(x,y)=f_{X_{1},Y_{1}}(S(x,y))\abs{J(x,y)}.$$
\end{exercise}
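(Not part of the assignment.) A Monte Carlo illustration of the formula, with a deliberately simple linear map of our choosing: take $(X_{1},Y_{1})$ uniform on $[0,1]^{2}$ and $T(x,y)=(2x,3y)$, so $S(x,y)=(x/2,y/3)$ and $\abs{J}=1/6$; the formula then predicts density $1/6$ on $[0,2]\times[0,3]$:

```python
import random

random.seed(4)
trials = 100_000
hits = 0
for _ in range(trials):
    x1, y1 = random.random(), random.random()
    x2, y2 = 2 * x1, 3 * y1          # (X2, Y2) = T(X1, Y1)
    if x2 <= 1 and y2 <= 1:          # test region of area 1
        hits += 1
frac = hits / trials
print(frac)   # close to 1/6, the predicted density times the area
```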
\begin{exercise}
Suppose I tell you that the following list of $20$ numbers is a random sample from a Gaussian random variable, but I don't tell you the mean or standard deviation.
$$5.1715,\quad 3.2925,\quad 5.2172,\quad 6.1302,\quad 4.9889,\quad 5.5347,\quad 5.2269,\quad 4.1966,\quad 4.7939,\quad 3.7127$$
$$5.3884,\quad 3.3529,\quad 3.4311,\quad 3.6905,\quad 1.5557,\quad 5.9384,\quad 4.8252,\quad 3.7451,\quad 5.8703,\quad 2.7885$$
To the best of your ability, determine the mean and standard deviation of this random variable. (This question is a bit open-ended, so there could be more than one correct way of justifying your answer.)
\end{exercise}
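(Not part of the assignment, and not the only defensible answer.) The most common point estimates are the sample mean and the unbiased sample standard deviation, which Python's standard library computes directly:

```python
from statistics import mean, stdev

data = [5.1715, 3.2925, 5.2172, 6.1302, 4.9889, 5.5347, 5.2269, 4.1966,
        4.7939, 3.7127, 5.3884, 3.3529, 3.4311, 3.6905, 1.5557, 5.9384,
        4.8252, 3.7451, 5.8703, 2.7885]
m, s = mean(data), stdev(data)   # sample mean, unbiased sample stdev
print(m, s)
```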
\begin{exercise}
Suppose I tell you that the following list of $20$ numbers is a random sample from a Gaussian random variable, but I don't tell you the mean or standard deviation. Also, around one or two of the numbers was corrupted by noise, computational error, tabulation error, etc., so that it is totally unrelated to the actual Gaussian random variable.
$$-1.2045,\, -1.4829,\, -0.3616,\, -0.3743,\, -2.7298,\, -1.0601,\, -1.3298,\, 0.2554,\, 6.1865,\, 1.2185$$
$$-2.7273,\, -0.8453,\, -3.4282,\, -3.2270,\, -1.0137,\, 2.0653,\, -5.5393,\, -0.2572,\, -1.4512,\, 1.2347$$
To the best of your ability, determine the mean and standard deviation of this random variable. Supposing you instead had a billion numbers, and $5$ or $10$ percent of them were corrupted samples, can you come up with some automatic way of throwing out the corrupted samples? (Once again, there could be more than one right answer here; the question is intentionally open-ended.)
\end{exercise}
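(Not part of the assignment, and only one of several reasonable ideas.) The median and the median absolute deviation (MAD) are far less sensitive to a few corrupted samples than the mean and standard deviation are; the factor $1.4826$ rescales the MAD to estimate a Gaussian standard deviation. A sketch:

```python
from statistics import median

data = [-1.2045, -1.4829, -0.3616, -0.3743, -2.7298, -1.0601, -1.3298,
        0.2554, 6.1865, 1.2185, -2.7273, -0.8453, -3.4282, -3.2270,
        -1.0137, 2.0653, -5.5393, -0.2572, -1.4512, 1.2347]
med = median(data)
mad = median(abs(x - med) for x in data)   # median absolute deviation
sigma_hat = 1.4826 * mad                   # MAD-to-sigma factor for a Gaussian
print(med, sigma_hat)
# For the billion-sample version: one automatic scheme is to discard
# samples farther than a few sigma_hat from med, then recompute.
```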
\begin{exercise}
Let $b_{1},\ldots,b_{n}$ be distinct numbers, representing the quality of $n$ people. Suppose $n$ people arrive to interview for a job, one at a time, in a random order. That is, every possible arrival order of these people is equally likely. We can think of an arrival ordering of the people as an ordered list of the form $a_{1},\ldots,a_{n}$, where the list $a_{1},\ldots,a_{n}$ is a permutation of the numbers $b_{1},\ldots,b_{n}$. Moreover, we interpret $a_{1}$ as the quality of the first person to arrive, $a_{2}$ as the quality of the second person to arrive, and so on. And all possible permutations of the numbers $b_{1},\ldots,b_{n}$ are equally likely to occur.
For each $i\in\{1,\ldots,n\}$, upon interviewing the $i^{th}$ person, if $a_{i}>a_{j}$ for all $1\leq j<i$, then the $i^{th}$ person is the best candidate interviewed so far.
\end{exercise}
\end{document}