Assignment #4 STA355H1S

due Friday April 5, 2019

Instructions: Solutions to problems 1 and 2 are to be submitted on Quercus (PDF files only) – the deadline is 11:59pm on April 5. You are strongly encouraged to do problems 3 and 4 but these are not to be submitted for grading.

Problems to hand in:

1. In class, we discussed a general class of estimators of g(x) in the non-parametric regression model

Y_i = g(x_i) + ε_i for i = 1, · · · , n

of the form

ĝ(x) = w₁(x)Y₁ + · · · + w_n(x)Y_n

where we required that w₁(x) + · · · + w_n(x) = 1 for each x. However, it is often desirable for the weights {w_i(x)} to satisfy other constraints. For example, if Var(ε_i) is very small or even 0 and g(x) is a smooth function, then we would hope that ĝ(x_i) ≈ g(x_i) for i = 1, · · · , n.

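(a) Suppose that Y_i = g(x_i) = β₀ + Σ_{k=1}^p β_k φ_k(x_i) for i = 1, · · · , n where β₀, β₁, · · · , β_p are some nonzero constants and φ₁, · · · , φ_p are some functions. Define the n × n smoothing matrix

A = (a_{ij}) with a_{ij} = w_j(x_i) for i, j = 1, · · · , n.

Then

(ĝ(x₁), · · · , ĝ(x_n))^T = A (Y₁, · · · , Y_n)^T.

If ĝ(x_i) = g(x_i) for i = 1, · · · , n for all β₀, β₁, · · · , β_p, show that

(1, · · · , 1)^T and (φ_k(x₁), · · · , φ_k(x_n))^T for k = 1, · · · , p

are eigenvectors of A with eigenvalues all equal to 1.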

(b) A smoothing matrix A typically has a fixed number r of eigenvalues λ₁, · · · , λ_r equal to 1, with the remaining eigenvalues λ_{r+1}, · · · , λ_n lying in the interval (−1, 1). Usually, by varying a smoothing parameter, we can make λ_{r+1}, · · · , λ_n closer to 0 (which will make ĝ(x) smoother) or closer to 1 (which can result in ĝ(x) being very non-smooth).

If A is symmetric then A = ΓΛΓ^T where Λ is a diagonal matrix whose elements are the eigenvalues of A and Γ is an orthogonal matrix with Γ⁻¹ = Γ^T; we will assume that Γ does not depend on the smoothing parameter (i.e. the eigenvalues of A depend on the smoothing parameter but its eigenvectors do not). In lecture, we gave the bias-variance decomposition

E[‖ĝ − g‖²] = E[‖Aε‖²] + ‖(A − I)g‖²

where ĝ = (ĝ(x₁), · · · , ĝ(x_n))^T, g = (g(x₁), · · · , g(x_n))^T and ε = (ε₁, · · · , ε_n)^T.

The decomposition A = ΓΛΓ^T makes the effect of the smoothing parameter on the bias and variance components very transparent.

Show that as λ_{r+1}, · · · , λ_n shrink towards 0,

(i) ‖Aε‖² = ε^T A^T Aε decreases;

(ii) ‖(A − I)g‖² increases if λ_{r+1}, · · · , λ_n ≥ 0 unless g is an eigenvector of A with eigenvalue 1.

(c) In practice, the form of the smoothing matrix is typically hidden from the user and in fact, is often never explicitly computed. However, for a given smoothing procedure, we can recover the j-th column of the smoothing matrix by applying the smoothing procedure to pseudo-data (x₁, y₁*), · · · , (x_n, y_n*) where y_j* = 1 and y_i* = 0 for i ≠ j.

Consider estimating g using smoothing splines in the model

Y_i = g(x_i) + ε_i for i = 1, · · · , 20

where x_i = i/21. The following R code computes the smoothing matrix A for a given value of its degrees of freedom (i.e. its trace) and then computes its eigenvalues and eigenvectors:

> x <- c(1:20)/21
> A <- NULL
> for (i in 1:20) {
+ y <- c(rep(0,i-1),1,rep(0,20-i))
+ r <- smooth.spline(x,y,df=4)
+ A <- cbind(A,r$y)
+ }
> r <- eigen(A,symmetric=T)
> round(r$values,3)
[1] 1.000 1.000 0.913 0.579 0.262 0.115 0.055 0.029 0.016 0.010 0.006 0.004
[13] 0.003 0.002 0.002 0.001 0.001 0.001 0.001 0.001

The matrix A is by definition symmetric and using the option symmetric=T in the R function eigen ensures that the eigenvalues will be computed as real numbers (i.e. no imaginary component). (Note that A has 2 eigenvectors with eigenvalue equal to 1; this means that the smoothing method will recover linear functions exactly.) Repeat the procedure above using different values of df between 3 and 10. What do you notice about the eigenvalues as df varies?
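
The column-by-column recovery described above can be wrapped in a small helper, which makes it easier to rerun for other values of df and to check the remark about linear functions numerically. This is only a sketch: the function name smoothing_matrix is hypothetical and is not part of the assignment code.

```r
# Hypothetical helper: recover the smoothing matrix column by column by
# smoothing unit pseudo-data vectors, as in the code above.
smoothing_matrix <- function(x, df) {
  n <- length(x)
  A <- matrix(0, n, n)
  for (j in seq_len(n)) {
    y <- numeric(n)
    y[j] <- 1                                 # y_j* = 1, y_i* = 0 for i != j
    A[, j] <- smooth.spline(x, y, df = df)$y  # fitted values = j-th column
  }
  A
}

x <- (1:20) / 21
A <- smoothing_matrix(x, df = 4)

# Constant and linear functions should be reproduced up to numerical error.
max(abs(A %*% rep(1, 20) - 1))
max(abs(A %*% x - x))

# Eigenvalues for this choice of df; change df and rerun to compare.
round(eigen(A, symmetric = TRUE)$values, 3)
```

Calling smoothing_matrix(x, df) for several values of df gives the matrices needed for the comparison asked for in this part.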

(d) The smoothing matrices in part (c) are not only symmetric but centro-symmetric; that is, if A = (a_{ij}) then a_{ij} = a_{n−i+1,n−j+1} for all i and j.

Suppose that v = (v₁ v₂ · · · v_n)^T is an eigenvector of A. Show that v is either symmetric (i.e. v_j = v_{n−j+1} for all j) or skew-symmetric (i.e. v_j = −v_{n−j+1} for all j).

(Hint: Define J to be the matrix that reverses the elements of a vector:

J (x₁ x₂ · · · x_n)^T = (x_n x_{n−1} · · · x₁)^T.

A is centro-symmetric if AJx = JAx. It suffices to show that if v is an eigenvector of A then Jv is also an eigenvector.)
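
The reversal matrix J in the hint is easy to build in R, and the identity AJ = JA can be checked numerically. The sketch below uses a toy symmetric Toeplitz matrix as a stand-in (an assumption for illustration; it is not the smoothing matrix from part (c)), and it is a numerical check rather than a proof.

```r
# Toy centro-symmetric matrix (assumed for illustration only):
# a symmetric Toeplitz matrix with entries exp(-|i - j|).
n <- 6
A <- exp(-abs(outer(1:n, 1:n, "-")))

# J reverses the elements of a vector: ones on the anti-diagonal.
J <- diag(n)[n:1, ]

# Centro-symmetry is equivalent to AJ = JA.
max(abs(A %*% J - J %*% A))

# For each eigenvector v, compare Jv with v (symmetric) and -v (skew).
V <- eigen(A, symmetric = TRUE)$vectors
sym_err  <- apply(V, 2, function(v) max(abs(J %*% v - v)))
skew_err <- apply(V, 2, function(v) max(abs(J %*% v + v)))
round(cbind(sym_err, skew_err), 10)
```

Each row of the printed table corresponds to one eigenvector; exactly one of the two columns should be numerically zero when the eigenvalues are distinct.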

(e) (Optional but recommended) Take a look at the eigenvectors of A in part (c) (for a particular value of df). This can be done using the following R code:

> r <- eigen(A,symmetric=T)
> for (i in 1:20) {
+ devAskNewPage(ask = T)
+ plot(x,r$vectors[,i],type="b")
+ }

(The command devAskNewPage(ask = T) prompts you to move to the next plot.) Note that the first few eigenvectors are quite smooth but the later ones become progressively less smooth.

2. Suppose that D₁, · · · , D_n are random directions – we can think of these random variables as coming from a distribution on the unit circle {(x, y) : x² + y² = 1} and represent each observation as an angle so that D₁, · · · , D_n come from a distribution on [0, 2π).

A simple family of distributions for these circular data is the von Mises distribution whose density on [0, 2π) is

f(θ; κ, µ) = exp[κ cos(θ − µ)] / (2πI₀(κ))

where 0 ≤ µ < 2π, κ ≥ 0, and I₀(κ) is a 0-th order modified Bessel function of the first kind. In this problem, we want to derive tests of the null hypothesis H₀ : κ = 0 versus the alternative H₁ : κ > 0. Note that under H₀, D₁, · · · , D_n have a uniform distribution on the interval [0, 2π).
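
For reference, the standard von Mises density exp[κ cos(θ − µ)] / (2πI₀(κ)) can be evaluated in base R, where besselI(kappa, nu = 0) gives I₀(κ). The sketch below uses a hypothetical helper name, dvm, and simply checks that κ = 0 gives the uniform density and that the density integrates to 1.

```r
# Hypothetical helper: von Mises density on [0, 2*pi).
# besselI(kappa, nu = 0) is the modified Bessel function I_0(kappa).
dvm <- function(theta, mu, kappa) {
  exp(kappa * cos(theta - mu)) / (2 * pi * besselI(kappa, nu = 0))
}

# kappa = 0 reduces to the uniform density 1/(2*pi) for every theta.
dvm(c(0, 1, 2), mu = 1, kappa = 0)
1 / (2 * pi)

# The density should integrate to 1 over [0, 2*pi) for any mu and kappa.
integrate(dvm, lower = 0, upper = 2 * pi, mu = 1, kappa = 2)$value
```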

(a) Suppose that we want to test H₀ : κ = 0 versus H₁ : κ > 0. Consider a likelihood ratio test of H₀ : κ = 0 versus H₁′ : κ = κ₁ > 0. Show that this LR test rejects H₀ : κ = 0 for large values of the statistic

T = cos(D₁ − µ̂) + · · · + cos(D_n − µ̂)

where µ̂ is the MLE of µ (derived in Assignment #3).

(b) The Rayleigh test of H₀ : κ = 0 versus H₁ : κ > 0 uses a test statistic closely related to the LR statistic in part (a). Define

R = (n^{−1/2} Σ_{i=1}^n cos(D_i))² + (n^{−1/2} Σ_{i=1}^n sin(D_i))².

Show that the limiting distribution of R when H₀ is true is Exponential with mean 1. (Hint: Use the bivariate CLT to find the joint limiting distribution of the random variables inside the parentheses.)

(c) A test of H₀ versus H₁ should be invariant under shifts of the angles D₁, · · · , D_n; in other words, if T = T(D₁, · · · , D_n) is a test statistic for testing H₀ then for any φ, T(D₁, · · · , D_n) and T(D₁ + φ, · · · , D_n + φ) should have the same distribution under H₀. Show that this invariance condition holds for the tests in parts (a) and (b). (Hint: It’s sufficient to show that T(D₁, · · · , D_n) = T(D₁ + φ, · · · , D_n + φ) for any φ.)
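
A short numerical sketch of the Rayleigh test may help here. It assumes the usual form of the Rayleigh statistic, R = (1/n)[(Σ cos D_i)² + (Σ sin D_i)²], which is the scaling under which the limit is Exponential with mean 1. The function name rayleigh_stat and the simulated angles are illustrative only; they are not the honey bee data.

```r
# Rayleigh statistic for angles d in radians (assumed form, see lead-in).
rayleigh_stat <- function(d) {
  (sum(cos(d))^2 + sum(sin(d))^2) / length(d)
}

set.seed(1)                          # simulated data, for illustration only
d <- runif(200, 0, 2 * pi)           # angles drawn under H0 (uniform)
R <- rayleigh_stat(d)

# Approximate p-value from the Exponential(1) limit: P(Exp(1) > R).
p_value <- exp(-R)
c(R = R, p_value = p_value)

# Shift invariance: adding the same phi to every angle leaves R unchanged
# (up to rounding error).
phi <- 0.7
abs(rayleigh_stat(d + phi) - R)
```

Data recorded in degrees would first be converted with d * pi / 180 before calling the function.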

(d) The file txt contains dance directions of 279 honey bees viewing a zenith patch of artificially polarized light. (The data are given in degrees; you should convert them to radians.) Use the Rayleigh test to assess whether it is plausible that the directions are uniformly distributed.

Supplemental problems (not to hand in):

3. Suppose that (X₁, · · · , X_n) have a joint density f(x₁, · · · , x_n) where f is either f₀ or f₁ (where both f₀ and f₁ have no unknown parameters).

We put a prior distribution on the possible densities {f₀, f₁}: π(f₀) = π₀ > 0 and π(f₁) = π₁ > 0 where π₀ + π₁ = 1. (This is a Bayesian formulation of the Neyman-Pearson hypothesis testing setup.)

(a) Show that the posterior distribution of {f₀, f₁} is

π(f_k|x₁, · · · , x_n) = τ (x₁, · · · , x_n)π_kf_k(x₁, · · · , x_n) for k = 0, 1

and give the value of the normalizing constant τ(x₁, · · · , x_n). (Note that π(f₀|x₁, · · · , x_n) + π(f₁|x₁, · · · , x_n) must equal 1.)

(b) When will π(f₀|x₁, · · · , x_n) > π(f₁|x₁, · · · , x_n)? What effect do the prior probabilities π₀ and π₁ have?

(c) Suppose now that X₁, · · · , X_n are independent random variables with common density g where g is either g₀ or g₁ so that

f_k(x₁, · · · , x_n) = g_k(x₁)g_k(x₂) × · · · × g_k(x_n) for k = 0, 1.

If g₀ is the true density of X₁, · · · , X_n and π₀ > 0, show that

π(f₀|x₁, · · · , x_n) −→ 1 as n → ∞.

(Hint: Look at n⁻¹ ln(π(f₀|x₁, · · · , x_n)/π(f₁|x₁, · · · , x_n)) and use the WLLN.)
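
The convergence in part (c) can be seen in a small simulation. The sketch below assumes, purely for illustration, that g₀ is the N(0, 1) density and g₁ is the N(1, 1) density; the function name posterior_f0 is hypothetical. The posterior is computed from the log posterior odds, the quantity that appears in the hint.

```r
# Posterior probability of f0 given i.i.d. data x, computed on the log-odds
# scale for numerical stability. g0 = N(0, 1) and g1 = N(1, 1) are assumed
# here for illustration only.
posterior_f0 <- function(x, prior0 = 0.5) {
  log_odds <- log(prior0 / (1 - prior0)) +
    sum(dnorm(x, mean = 0, log = TRUE) - dnorm(x, mean = 1, log = TRUE))
  plogis(log_odds)                   # = 1 / (1 + exp(-log_odds))
}

set.seed(1)
x <- rnorm(500)                      # simulated data generated from g0

# Posterior probability of f0 for increasing n, with a small prior pi_0.
sapply(c(5, 20, 100, 500), function(n) posterior_f0(x[1:n], prior0 = 0.1))
```

Changing prior0 shifts the log odds by a constant, which does not grow with n.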

4. Suppose that X₁, · · · , X_n are independent Exponential random variables with parameter λ. Let X_(1) < · · · < X_(n) be the order statistics and define the normalized spacings

D₁ = nX_(1)

and D_k = (n − k + 1)(X_(k) − X_(k−1)) (k = 2, · · · , n).

As stated in class, D₁, · · · , D_n are also independent Exponential random variables with parameter λ.
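
The normalized spacings are straightforward to compute from a sample. The following sketch (the function name normalized_spacings and the simulated data are illustrative assumptions) uses the convention X_(0) = 0 so that D₁ = nX_(1) fits the same formula as the other D_k.

```r
# Normalized spacings D_1, ..., D_n from a sample x, with X_(0) = 0.
normalized_spacings <- function(x) {
  n <- length(x)
  (n:1) * diff(c(0, sort(x)))        # (n - k + 1) * (X_(k) - X_(k-1))
}

set.seed(1)
x <- rexp(50, rate = 2)              # simulated Exponential data, lambda = 2
D <- normalized_spacings(x)

# The spacings sum to the sample total: D_1 + ... + D_n = n * mean(x).
all.equal(sum(D), sum(x))
```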

(a) Let X̄_n be the sample mean of X₁, · · · , X_n and define for integers r ≥ 2

T_n = (1/n)(D₁^r + · · · + D_n^r) / (X̄_n)^r.

Use the Delta Method to show that √n(T_n − r!) −→d N(0, σ²(r)) where

σ²(r) = (2r)! − (r² + 1)(r!)²

and so √n(ln(T_n) − ln(r!)) −→d N(0, σ²(r)/(r!)²).

(Hint: Note that D₁ + · · · + D_n = nX̄_n. You will need to find the joint limiting distribution of

√n((1/n)(D₁^r + · · · + D_n^r) − r!/λ^r) and √n(X̄_n − 1/λ)

and then apply the Delta Method; a note on the Delta Method in two (or higher) dimensions can be found on Quercus. You can compute the elements of the limiting variance-covariance matrix using the fact that E(D_i^k) = k!/λ^k for k = 1, 2, 3, · · · which will allow you to compute Var(D_i^r) and Cov(D_i^r, D_i).)

Note:

We can use the statistic T_n (or ln(T_n) for which the normal approximation is slightly better) defined in part (a) to test for exponentiality, that is, the null hypothesis that X₁, · · · , X_n come from an exponential distribution; for an α-level test, we reject the null hypothesis for T_n > c_α where we can approximate c_α using a normal approximation to the distribution of T_n or ln(T_n). The success of this test depends on T_n −→ a(F) where a(F) > r! for non-exponential distributions F. Assume that F concentrates all of its probability mass on the positive real line and has a density f with f(x) > 0 for all x > 0; without loss of generality, assume that the mean of F is 1. If k/n ≈ t then D_k = (n − k + 1)(X_(k) − X_(k−1)) is approximately exponentially distributed with mean (1 − t)/f(F⁻¹(t)), which is constant for 0 < t < 1 if (and only if) F is an exponential distribution. Since the mean of F is 1,

a(F) = r! ∫₀¹ [(1 − t)/f(F⁻¹(t))]^r dt.

By Hölder’s inequality, it follows that

∫₀¹ [(1 − t)/f(F⁻¹(t))]^r dt ≥ [∫₀¹ (1 − t)/f(F⁻¹(t)) dt]^r = [∫₀^∞ (1 − F(x)) dx]^r = 1

since we assume that the mean of F is 1. Thus a(F) ≥ r!. Moreover, if a(F) = r! then (1 − t)/f(F⁻¹(t)) = 1 for 0 < t < 1 or (1 − F(x))/f(x) = 1 for all x > 0, which implies that F(x) = 1 − exp(−x).
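
The test described in this note can be sketched in R as follows. The form of T_n used here, mean(D^r) / mean(x)^r, is an assumption chosen to be consistent with T_n −→ r! under exponentiality; the function names sigma2 and exp_test are hypothetical, and the data are simulated rather than the air conditioning data.

```r
# Limiting variance sigma^2(r) = (2r)! - (r^2 + 1) (r!)^2.
sigma2 <- function(r) factorial(2 * r) - (r^2 + 1) * factorial(r)^2

# Sketch of the exponentiality test based on ln(T_n); T_n is assumed to be
# mean(D^r) / mean(x)^r, with D the normalized spacings.
exp_test <- function(x, r = 2) {
  n <- length(x)
  D <- (n:1) * diff(c(0, sort(x)))   # normalized spacings
  Tn <- mean(D^r) / mean(x)^r
  se <- sqrt(sigma2(r)) / factorial(r)
  z <- sqrt(n) * (log(Tn) - log(factorial(r))) / se
  # Reject for large T_n, so the p-value is an upper-tail probability.
  c(Tn = Tn, z = z, p_value = pnorm(z, lower.tail = FALSE))
}

set.seed(1)
x <- rexp(200)                       # simulated Exponential data
exp_test(x, r = 2)
exp_test(x, r = 3)
```

Because T_n is a ratio of an r-th moment to an r-th power of the mean, rescaling the data leaves it unchanged, so no estimate of λ is needed.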

(b) Use the test suggested in part (a) on the air conditioning data (from Assignment #1) taking r = 2 and r = 3 (using the normal approximation to ln(T_n)) to assess whether an exponential model is reasonable for these data.
