代写R语言金融分析留学生CS题目：MATH11176Assignment2 IoannisPapastathopoulos

因特定公式，本编辑器不能很好兼容，故显示不完全，见谅！

AT H 11176 Assignm ent 2

Ioannis Papastathopoulos

November 21, 2017

• Subm ission: The submission details for the number of attempts is set to Single At t em pt . All answers related to calculation of posterior distributions must be written in a pdf or wor d

document called mat r i cul at i onnumber A2mat h. Your code must be written separately in

script scr i pt A2. R which is available on Lear n. When submitting, rename the script from scr i pt A2. R to mat r i cul at i onnumber A2. R where mat r i cul at i onnumber refers to your ma- triculation number. Failure to rename the script file will incur a 5% penalty.

You have t o subm it a single script file and a single p df/ word docum ent . Failure to comply with this will incur a 5% penalty. Guidance on what your answer must be is given in each question. Your answer for each question must be included in the corresponding section of scr i pt A2. R file or the corresponding section of mat r i cul at i onnumber A2mat h. For example, your answer/ code for question 1.1 must be included in the section below

## ; ;

## Q1: – – add your code bel ow

## ; ;

## 1. 1

code goes her e

• G uidance – Assessm ent crit eria.

– □ A marking scheme is given. Additionally to the marking scheme, your code will be

assessed according to the following criteria:

□ Style: follow ht t ps: / / googl e. gi t hub. i o/ st yl egui de/ Rgui de. xml with care;

□ Writ ing of funct ions: avoid common pitfalls of local vs global assignments; wrap your code in a coherent set of instructions and try to make it as generic as possible; Also, functions that are meant to be optimized with opt i mmust be written accordingly, see ?opt i m.

□ Execut ability: your code must be executable and should not require additional code in order to run. A common pitfall is failure to load Rpackages required by your code.

• D eadline: Sunday 3rd December 23:59.

• Individual feedback will be given.

1 Quest ion 1

Consider a set of data relating two score tests, LSAT and GPA, at a sample of 15 American law

schools. Of interest is the correlation = cor (l sat ; gpa) between these measurements and the

variance ratio = var(l sat )=var(gpa).

l i st ( l sat = c( 576, 635, 558, 578, 666, 580, 555,

661, 651, 605, 653, 575, 545, 572, 594) ,

gpa = c( 3. 39, 3. 30, 2. 81, 3. 03, 3. 55, 3. 07, 3. 00,

3. 43, 3. 36, 3. 13, 3. 12, 2. 74, 2. 76, 2. 88, 2. 96) , n = 15)

1. Write a function in R called CI . cor that returns 95% bootstrap confidence intervals for the

correlation parameter using the basic bootstrap interval method and the percentile interval

method.

Your answer should contain the R function CI . var . cor .

(12 m arks)

2. Write a function in R called CI . var . r at i o. that returns 95% bootstrap confidence intervals

for the variance ratio using the basic bootstrap interval method and the percentile interval

method.

Your answer should contain the R function CI . var . r at i o.

(13 m arks

2 Quest ion 2

Let X1; X2; : : : be a sequence of independent and identically distributed random variables. An asymptotically justified model for the distribution function of the excesses Xi j Xi > u ab ove a large t hreshold u is given by the generalized P aret o distribution function

FX i jX i > u (x) = 1

{

1 +

( x

u ) }

x > u;

where

2 R, > 0 and x+ = max(x; 0).

In all subsequent exercises, you are allowed to use the following function called f i t . gpd which

fits the generalized Pareto distribution to exceedances of data x above the threshold u = t hr esh.

f i t . gpd <- f unct i on( x, t hr esh, t ol . xi . l i mi t =5e- 2, . . . )

{

l l i k. gpd <- f unct i on( par , x, t hr esh, t ol . xi . l i mi t =5e- 2)

{

y <- x[ x>t hr esh]

si gma <- exp( par [ 1] )

xi <- par [ 2]

n <- l engt h( y)

i f ( abs( xi ) <=t ol . xi . l i mi t )

{

l l i k <- n*l og( si gma) + sum( ( y– t hr esh) / si gma)

r et ur n( l l i k)

}

par . l og. max <- l og( pmax( 1+xi *( y– t hr esh) / si gma, 0 ) )

l l i k <- – n*l og( si gma ) – ( 1+( 1/ xi ) ) *sum( par . l og. max )

l l i k <- – i f el se( l l i k > – I nf , l l i k, – 1e40 )

r et ur n( l l i k)

}

f i t <- opt i m( par = c( 0, 0) , f n = l l i k. gpd,

x=x, t hr esh=t hr esh,

cont r ol =l i st ( maxi t =10000, . . . ) )

si gmahat <- exp( f i t $par [ 1] )

xi hat <- f i t $par [ 2]

r et ur n( c( si gmahat , xi hat ) )

}

> f i t . gpd( par =c( 0, 0) ,

x=bur l i ngt on$Pr eci pi t at i on,

t hr esh=quant i l e( bur l i ngt on$Pr eci pi t at i on, 0. 8) )

[ 1] 1. 4869389 0. 2784732

2.1 P aram et ric b oot st rap

If X1; : : : ; Xn is a sequence of independent and identically distributed random variables and pu = Pr (Xi > u), it then follows that the number of exceedances above u, say Nu = # f Xi > ug, has

probability mass function

Pr (Nu = k) =

( n)

pku (1 pu )n k ; k = 0; : : : ; n;

that is, Nu Binomial (n; pu ). This suggests the following

PARAMETRIC BOOTSTRAP ALGORITHM

1. Fit by maximum likelihood the generalized Pareto model to exceedances above a threshold u. Set counter I = 0, bootstrap sample size R;

2. Increment I to I + 1. Simulate Nu

from Binomial(n; p^u );

3. Simulate Nu

exceedances from the fitted generalized Pareto model;

4. Fit the generalized Pareto model to the simulated exceedances to get

= (^ I ; ^ I );

5. If I < R go back to step 2. Otherwise return

( 1; : : : ; R ):

Write a function called par boot . gpd that takes as inputs a vector of numeric values for the raw data, a threshold argument t hr esh and bootstrap size R. The function should return a named list with elements

• $ml e: a vector containing the maximum likelihood estimates of , and pu ;

• $bi as: a vector containing the estimated bias of ^, ^ and p^u ;

• $se: a vector containing the standard error of ^, ^ and p^u ;

• $di st n: a matrix of dimension R by 3 with the bootstrap distribution of ^, ^ and p^u .

Your answer should contain the R function par boot . gpd.

(10 m arks)

2.2 N on-param et ric b oot st rap

NON-PARAMETRIC BOOTSTRAP ALGORITHM

1. Fit by maximum likelihood the generalized Pareto model to exceedances above a threshold u. Set counter I = 0, bootstrap sample size R;

2. Increment I to I + 1. Simulate n variates from the empirical distribution function

^(x) =

∑n

i= 1

1(Xi x):

3. Fit the generalized Pareto model to the exceedances above u to get

= (^ I ; ^ I );

4. If I < R go back to step 2. Otherwise return

( 1; : : : ; R ):

Write a function called npboot . gpd that takes as inputs a vector of numeric values for the raw data, a threshold argument t hr esh and bootstrap size R. The function should return a named list with elements

1. $ml e: a vector containing the maximum likelihood estimates of , , and pu

2. $bi as: a vector containing the estimated bias of ^, ^ and p^u

3. $se: a vector containing the standard error of ^, ^ and p^u

4. $di st n: a matrix of dimension R by 3 with the bootstrap distribution of ^, ^ and p^u .

Your answer should contain the R function npboot . gpd.

(10 m arks)

2.3 C onfidence int ervals of ret urn levels

Use the functions par boot . gpd and npboot . gpd to draw R=10000 bootstrap samples for fixed

threshold u equal to the 90% empirical quantile of bur l i ngt on$Pr eci pi t at i on data. Use

the output to obtain the pametric bootstrap and non-parametric bootstrap distributions of the T = 100; 500; 1000; 10000 return level estimate which defined by

x^T = u + ^^

{

(T p^u )^

}

1 :

Construct 95% confidence intervals for all return levels using the percentile method. Your answer should contain the code used to obtain the bootstrap distributions of all return levels and the code used to obtain all confidence intervals.

(5 m arks)

3 Quest ion 3

Consider the following dataset

l i st ( t = c( 94. 3, 15. 7, 62. 9, 126, 5. 24, 31. 4, 1. 05, 1. 05, 2. 1, 10. 5) ,

x = c( 5, 1, 5, 14, 3, 19, 1, 1, 4, 22) ,

n = 10)

on observed failures xi of n = 10 power plant pumps. Here ti denotes the length of operation time of the pump (in 1000s of hours). The number of failures Xi is assumed to follow a Poisson distribution

Xi j i Poisson( i ti ). Consider the hierarchical model

i j ; Gamma( ; )

where

Exp( ); Gamma( ; ); = 1 and = = 0:01:

1. What are the conditional posterior distributions of

– i j i ; ; ; x ;

–

;

; x ;

; x ?

Write your answer in a section called Quest ion 3 (a) of the file.

(10 m arks)

2. Write a function in R called mcmc. pumps that implements a general McMC algorithm with

components ; ; where and are updated from the posterior conditional distribution and

log is updated via a random walk Metropolis step with normal increments. The function

should return samples from the posterior distribution of , and as well as an estimate of

the probability of accepting .

Your answer must contain the code of the function mcmc. pumps.

(10 m arks)

3. Write a function in R called pr edi ct i ve. pumps that returns estimates of the predictive dis- tribution of failures for the i-th pump for a given length of operation time ti , using Monte Carlo integration.

Your answer must contain the code of the function pr edi ct i ve. pumps.

(5 m arks)

4 Quest ion 4

Let x1; : : : ; xn be a sample of independent and identically distributed observations assumed to have

been generated from a t-distribution with degrees of freedom located at , i.e.,

f (x j ) /

{

1 + 1(x )2

} ( + 1)=2

x 2 R: (1)

Assume = 10 and suppose N (0; 1).

1. Write down, up to proportionality constant, the posterior j x . Is this a recognisable density?

Write a function in R called mcmc. t that implements an McMC algorithm for evaluating the

posterior distribution of where the updating is done using random walk Metropolis with a

normal candidate generator.

Your answer should contain the posterior density of j x in a section called Quest ion 4 (1)

of the = matriculationA2math= file and the function mcmc. t separately in and scr i pt A2. R

(5 m arks)

2. Using the fact that the t-distribution is a scale mixture of normals, the sampling model in (1) can alternatively be represented as

xi j zi N ( 1=zi )

where zi are a priori independent of and

f (zi ) / z( =2) 1 expf ( =2) zi g:

Using this representation together with the prior distribution proportionality constant, the conditional posterior densities of

N (0; 1), write down, up to and of each zi , i = 1; : : : ; n

and identify the distributions corresponding to each of these densities. Write a function in R

called mcmc. gi bbs that implements a Gibbs sampling algorithm for computing the posterior distribution of .

Your answer should contain the conditional posterior densities of and each of the zi s in the

file in a section called Quest ion 4 (2) of the = matriculationA2math= file and the function

mcmc. gi bbs separately in and scr i pt A2. R

(10 m arks)

3. Let x denote a future observable. Write a function in R called pr edi ct i ve. t that returns estimates of the predictive distribution x j x using Monte Carlo integration.

Your answer must contain the code of the function pr edi ct i ve. t .

(5 m arks)

4. Write a function in R called pr edi ct i ve. quant i l e that estimates the pth quantile of the predictive distribution

xp : Pr (x xp j x ) = p

and use it to construct a 95% predictive interval.

Your answer must contain the code of the function pr edi ct i ve. quant i l e as well the code used to obtain the 95% predictive interval.

(5 m arks)

关键字：