Applied Statistics (ECS764P) – Lab 2
1 Theory
1. Normal distributions have the following two properties:
- the sum of two independent normals is normal: N(µ1, σ1) + N(µ2, σ2) = N(µ1 + µ2, √(σ1² + σ2²))
- re-scaling a normal gives a normal: for any α > 0, αN(µ, σ) = N(αµ, ασ)
Use these two facts to compute the distribution of sample means for independent, identically and normally distributed samples of length n. Specifically, compute the distribution of the sample mean (X1 + ... + Xn)/n, where X1, ..., Xn are independent draws from N(µ, σ).
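Once you have derived the distribution by hand, a simulation is one way to sanity-check it. The sketch below draws many samples of length n and reports the empirical mean and standard deviation of their sample means. The values of mu, sigma, n and n_repeats are arbitrary illustrative choices, not part of the lab sheet.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Illustrative values only (assumptions, not taken from the lab sheet).
mu, sigma, n = 2.0, 3.0, 25
n_repeats = 100_000

# Each row is one sample of length n; averaging along axis 1
# gives one sample mean per row.
samples = rng.normal(loc=mu, scale=sigma, size=(n_repeats, n))
sample_means = samples.mean(axis=1)

empirical_mean = sample_means.mean()
empirical_std = sample_means.std(ddof=1)

print(f"empirical mean of the sample means: {empirical_mean:.4f}")
print(f"empirical std of the sample means:  {empirical_std:.4f}")
```

Compare the two printed numbers with the parameters of the distribution you derived, and try a few different values of n to see how the spread changes.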
2. Consider the family of distributions
Poisson (λ), λ ∈ R, λ > 0
Show that the MLE of λ is given by the sample mean.
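A numerical check can accompany the proof. The sketch below minimises the negative log-likelihood of some synthetic Poisson data and compares the minimiser with the sample mean. The data, the rate 4.2 and the search bounds are assumptions made for illustration; the check does not replace the derivation.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
data = rng.poisson(lam=4.2, size=500)  # synthetic data; 4.2 is an arbitrary rate

def neg_log_likelihood(lam):
    # Negative log-likelihood of i.i.d. Poisson(lam) observations.
    return -stats.poisson.logpmf(data, lam).sum()

# Search for the minimiser over a bounded interval of positive rates.
result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 50.0), method="bounded")
lam_hat = result.x

print("numerical MLE:", lam_hat)
print("sample mean:  ", data.mean())
```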
3. Using the definition of the sum of two probability measures given during the lectures, show that the sum of two identical and independent Bernoulli distributions Bern (p) is given by a binomial distribution Binom (2, p). Formally show that
Bern (p) + Bern (p) = Binom (2, p)
(Hint: What is the support of Bern (p) + Bern (p)? What is the support of Binom (2, p)? Do the two probability measures agree on every element of their support? If yes, then they are equal.)
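The hint can be mirrored numerically for one value of p. The sketch below assumes that, for independent discrete variables, the lecture definition of the sum reduces to the discrete convolution of the two probability mass functions. The value p = 0.3 is arbitrary, and a check at one value is not a proof.

```python
import numpy as np
from scipy import stats

p = 0.3  # arbitrary illustrative value

# pmf of Bern(p) on its support {0, 1}
bern_pmf = stats.bernoulli.pmf([0, 1], p)

# Discrete convolution of the two pmfs; the result is indexed by
# the possible values 0, 1, 2 of the sum.
sum_pmf = np.convolve(bern_pmf, bern_pmf)

# pmf of Binom(2, p) on its support {0, 1, 2}
binom_pmf = stats.binom.pmf([0, 1, 2], 2, p)

print("Bern(p) + Bern(p):", sum_pmf)
print("Binom(2, p):      ", binom_pmf)
```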
2 Practice
If you’re unfamiliar with how to define a function in Python, read https://www.w3schools.com/python/python_functions.asp.
1. Import scipy.stats in order to access the scipy.stats.beta distribution. Using the cdf method of scipy.stats.beta, define a function called beta_measure which takes two arguments a, b and returns the probability mass of the interval [a, b] under the probability measure Beta (3, 7), i.e.
Beta (3, 7) ([a, b])
Test your function by printing the result of:
(a) beta_measure(0,1)
(b) beta_measure(0,0)
(c) beta_measure(0.25,0.75)
(d) beta_measure(0,0.5)
(e) beta_measure(0.5,1)
Plot the pdf of Beta (3, 7) to check visually if your answers make sense.
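A minimal sketch of one possible beta_measure is given below. It relies on the fact that, for a continuous distribution, the mass of [a, b] is cdf(b) minus cdf(a). The loop over the five test intervals is just one convenient way to print the results.

```python
from scipy import stats

def beta_measure(a, b):
    """Probability mass of the interval [a, b] under Beta(3, 7)."""
    dist = stats.beta(3, 7)  # frozen Beta distribution with shape parameters 3 and 7
    return dist.cdf(b) - dist.cdf(a)

# The five test intervals (a) to (e) from the question.
for a, b in [(0, 1), (0, 0), (0.25, 0.75), (0, 0.5), (0.5, 1)]:
    print(f"beta_measure({a}, {b}) = {beta_measure(a, b):.6f}")
```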
2. Using the pdf method of scipy.stats.beta, define a function called beta_pdf which takes one argument x and returns the pdf of the probability measure Beta (3, 7) evaluated at x. Import the integration routine quad from scipy.integrate, and have a look at the documentation https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.quad.html to see how it works. Use quad to compute and print the following integrals
Compare your answers with those of the previous question.
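As a sketch of how quad can be used here, the block below integrates beta_pdf over a single interval and compares the result with the cdf difference. The interval [0.25, 0.75] is an assumption chosen for illustration; apply the same pattern to each integral you are asked for.

```python
from scipy import stats
from scipy.integrate import quad

def beta_pdf(x):
    """Density of Beta(3, 7) evaluated at x."""
    return stats.beta.pdf(x, 3, 7)

# quad returns a pair: the value of the integral and an estimate
# of the absolute error.
value, abs_err = quad(beta_pdf, 0.25, 0.75)

# The same probability obtained from the cdf, for comparison.
cdf_value = stats.beta.cdf(0.75, 3, 7) - stats.beta.cdf(0.25, 3, 7)

print("quad result:    ", value)
print("error estimate: ", abs_err)
print("cdf difference: ", cdf_value)
```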
3. Recall from the lectures that if a probability distribution d1 has density f1 and a probability distribution d2 has density f2, then the density of the sum d1 + d2 is given by the convolution of the two densities, viz.
f1+2(t) = ∫ f1(x) f2(t − x) dx, where the integral runs over all real x.
In this question we consider the sum of Beta (3, 7) + Beta (7, 3). What is the support of Beta (3, 7)? What is the support of Beta (7, 3)? Therefore, what is the support of Beta (3, 7) + Beta (7, 3)?
Write a function which implements the integrand of the integral above, that is to say that implements f1(x)f2(t−x), where f1 is the density of Beta (3, 7) and f2 is the density of Beta (7, 3). (Hint: this function will need two arguments.)
Next, generate 100 points (t1, . . . , t100) along the support of Beta (3, 7) + Beta (7, 3) (using numpy’s linspace function), and using a for loop, compute the pdf f1+2(ti) at these 100 points using quad.
(Hint: the documentation of quad has an example showing how to integrate a function with two arguments along its first argument.) Plot your result.
Finally, generate 10000 samples from Beta (3, 7) and, independently, 10000 samples from Beta (7, 3), add them, and plot the histogram of these sums along with the pdf computed in the previous step. What do you observe?
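The steps of this question can be sketched as follows. The block assumes the sum is supported on [0, 2] and narrows the integration limits to the region where both densities are non-zero, which is an implementation choice rather than a requirement. The names integrand, ts, pdf_sum and sums are illustrative, and plotting is left out so that the block runs without a display.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

f1 = stats.beta(3, 7).pdf  # density of Beta(3, 7)
f2 = stats.beta(7, 3).pdf  # density of Beta(7, 3)

def integrand(x, t):
    # Integrand of the convolution; quad integrates over the first argument x.
    return f1(x) * f2(t - x)

ts = np.linspace(0.0, 2.0, 100)
pdf_sum = np.empty_like(ts)
for i, t in enumerate(ts):
    # f1(x) vanishes unless 0 <= x <= 1 and f2(t - x) unless t - 1 <= x <= t.
    lower, upper = max(0.0, t - 1.0), min(1.0, t)
    pdf_sum[i], _ = quad(integrand, lower, upper, args=(t,))

# Independent samples from each distribution, added pairwise.
rng = np.random.default_rng(0)
sums = (stats.beta(3, 7).rvs(size=10_000, random_state=rng)
        + stats.beta(7, 3).rvs(size=10_000, random_state=rng))

# Normalised histogram, compared with the computed pdf at the bin centres.
hist, edges = np.histogram(sums, bins=40, range=(0.0, 2.0), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
max_gap = np.max(np.abs(hist - np.interp(centres, ts, pdf_sum)))
print("largest gap between histogram and computed pdf:", max_gap)
```

To produce the requested figure, pass ts and pdf_sum to a line plot and sums to a density-normalised histogram on the same axes.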
4. Install pandas-datareader (run pip install pandas-datareader in your terminal). With this library it is very easy to download data from Yahoo Finance (and other providers too). Download the last 10 years of Microsoft stock using
from pandas_datareader import data
my_data = data.DataReader('MSFT', 'yahoo', '2012-11-02', '2022-11-02')
Keep the “Close” column and use it to compute the time series of (percentage) daily returns using the formula
Warning: do not make a local copy of this data! It is easier, cleaner and less error-prone to access it directly from Yahoo Finance using pandas-datareader.
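As a sketch of the return computation, the block below applies pandas' pct_change to a tiny made-up price series. It assumes the intended formula is the simple return (P_t − P_{t−1}) / P_{t−1}; multiply by 100 if you want the values in percent. The prices and dates are invented for illustration, and in the lab you would use my_data["Close"] instead.

```python
import pandas as pd

# Made-up closing prices, used only to illustrate the computation.
close = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.to_datetime(["2022-10-28", "2022-10-31", "2022-11-01", "2022-11-02"]),
    name="Close",
)

# pct_change computes (P_t - P_{t-1}) / P_{t-1}; the first entry has no
# predecessor, so it is NaN and is dropped.
returns = close.pct_change().dropna()
print(returns)
```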
Plot the histogram of daily returns. Find a family of distributions which you think would model this distribution well. (Hint: what is the support of the daily returns? Is it symmetric or skewed? Has it got fat tails/positive excess kurtosis?).
The continuous distributions in scipy.stats have a method called fit which, given some data, computes the Maximum Likelihood Estimators for the parameters of the distribution. Use this method to find the optimal probability distribution in the family you have chosen, and plot the corresponding pdf alongside the histogram of observed daily returns.
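The mechanics of fit can be sketched on synthetic data. Student's t is used below purely as an example family, and the data are simulated rather than real returns; both are assumptions for illustration, and choosing a suitable family for the Microsoft data remains part of the exercise.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-in for daily returns (not real market data).
fake_returns = stats.t.rvs(df=4, loc=0.001, scale=0.01, size=2500, random_state=rng)

# fit returns the shape parameter(s) first, followed by loc and scale.
df_hat, loc_hat, scale_hat = stats.t.fit(fake_returns)
print("fitted df, loc, scale:", df_hat, loc_hat, scale_hat)

# Evaluate the fitted pdf on a grid, ready to be drawn over a
# density-normalised histogram of the data.
fitted = stats.t(df_hat, loc=loc_hat, scale=scale_hat)
xs = np.linspace(fake_returns.min(), fake_returns.max(), 200)
pdf_values = fitted.pdf(xs)
```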
Finally, plot the QQ plot of the daily returns data versus the model you have just fitted. Comment on the quality of your fit.
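One way to obtain the points of a QQ plot is scipy.stats.probplot, sketched below on synthetic data. The Student's t model and its parameters df, loc and scale are hypothetical stand-ins for whatever you fitted. With plot left at its default, probplot only returns the numbers and draws nothing.

```python
from scipy import stats
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fitted parameters and synthetic data drawn from that model.
df, loc, scale = 5.0, 0.0, 0.01
data = stats.t.rvs(df, loc=loc, scale=scale, size=2000, random_state=rng)

# sparams carries the shape parameter followed by loc and scale, so the
# theoretical quantiles are expressed in the same units as the data.
(theoretical_q, ordered_data), (slope, intercept, r) = stats.probplot(
    data, sparams=(df, loc, scale), dist=stats.t
)

print("slope:", slope, "intercept:", intercept, "r:", r)
```

Plotting ordered_data against theoretical_q gives the QQ plot. Points lying close to the line y = x suggest a good fit, while systematic bending in the tails suggests the model's tails are too light or too heavy.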