Description
• For the sake of effective learning, if you submit the solution to the assignment as a group, then each member of the group agrees to have participated fully (100%) in performing every part of every question in the assignment.
• If you submit the solution to the assignment (by yourself or as a group), then you agree that every line of code and every line in the report is your (or your group’s) own, and isn’t a copied/modified version of any other source online (on the internet) or offline (in electronic form or paper form or any other form).
• Submit your solution to each problem, i.e., (i) the code, (ii) the results, e.g., graphs or other data, and (iii) the report (in Adobe PDF format), for each question, through Moodle. Put the code within the folder “code”, the results within the folder “results”, and the report within the folder “report”.
• Submit all code that allows the TAs to regenerate your results, exactly as they appear in the report.
• Submit a single zip file that contains the solutions to all problems in the assignment.
• To get any possible partial credit for the code, ensure that the code is very well documented. To get partial credit for the derivations, include all derivation steps in their full details.
• To avoid non-deterministic results in each program run, and to make the results reproducible during test time, use rng(seed) where seed is a fixed hard-coded integer in your code.
• If the question suggests the use of some function in Matlab, then you can use a corresponding function in other coding frameworks/languages.
• 5 points are reserved for submission in the proper format.
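The reproducibility requirement above (a fixed, hard-coded seed) can be sketched as follows for anyone working outside Matlab. This is a minimal illustration using Python's standard library, not a required approach; the names SEED and draw_sample and the seed value 0 are assumptions made for the example.

```python
import random

SEED = 0  # fixed, hard-coded integer; the value itself is an arbitrary choice


def draw_sample(n, seed=SEED):
    """Return n standard-normal draws from a generator seeded with `seed`."""
    rng = random.Random(seed)  # local generator, so other code cannot disturb its state
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


first_run = draw_sample(5)
second_run = draw_sample(5)
print(first_run == second_run)  # the same seed reproduces the same numbers
```

Using a dedicated generator object rather than the module-level functions keeps the random stream independent of any other code that also draws random numbers.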
1. (10 points) Use the Matlab function randn() to generate a data sample of N points drawn from a Gaussian distribution with mean $\mu_{\text{true}} = 10$ and standard deviation $\sigma_{\text{true}} = 4$. Consider the problem of using the data to get an estimate $\hat{\mu}$ of this Gaussian mean, assuming it is unknown, when the standard deviation $\sigma_{\text{true}}$ is known.
Consider using one of the following two prior distributions on the mean: (i) a Gaussian prior with mean $\mu_{\text{prior}} = 10.5$ and standard deviation $\sigma_{\text{prior}} = 1$ and (ii) a uniform prior over [9.5, 11.5].
Consider various sample sizes $N = 5, 10, 20, 40, 60, 80, 100, 500, 10^3, 10^4$. For each sample size N, repeat the following experiment $M \geq 100$ times: generate the data, get the maximum likelihood estimate $\hat{\mu}_{\text{ML}}$, get the maximum-a-posteriori estimates $\hat{\mu}_{\text{MAP1}}$ and $\hat{\mu}_{\text{MAP2}}$, and measure the relative errors $|\hat{\mu} - \mu_{\text{true}}| / \mu_{\text{true}}$ for all three estimates.
• (8 points) Plot a single graph that shows the relative errors for each value of N as a box plot (use the Matlab boxplot() function), for each of the three estimates.
• (2 points) Interpret what you see in the graph. (i) What happens to the error as N increases? (ii) Which of the three estimates would you prefer, and why?
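The experiment loop in question 1 could be organised along the following lines. This is a rough sketch in standard-library Python, not a reference solution. The function names are hypothetical. The two MAP expressions are the standard results for a Gaussian likelihood with known standard deviation, and they should be checked against your own derivation before use.

```python
import random
import statistics

MU_TRUE, SIGMA_TRUE = 10.0, 4.0
MU_PRIOR, SIGMA_PRIOR = 10.5, 1.0   # Gaussian prior (i)
PRIOR_LO, PRIOR_HI = 9.5, 11.5      # uniform prior (ii)


def estimates(sample):
    """Return (ML, MAP under the Gaussian prior, MAP under the uniform prior)."""
    n = len(sample)
    mu_ml = statistics.fmean(sample)  # ML estimate of a Gaussian mean is the sample mean
    # Gaussian prior, known sigma: precision-weighted average of prior mean and data.
    mu_map1 = (SIGMA_TRUE**2 * MU_PRIOR + SIGMA_PRIOR**2 * n * mu_ml) / (
        SIGMA_TRUE**2 + n * SIGMA_PRIOR**2
    )
    # Uniform prior: the posterior is the likelihood restricted to the interval,
    # so its maximiser is the ML estimate clipped to [PRIOR_LO, PRIOR_HI].
    mu_map2 = min(max(mu_ml, PRIOR_LO), PRIOR_HI)
    return mu_ml, mu_map1, mu_map2


def relative_errors(n, m, seed=0):
    """Repeat the experiment m times for sample size n; return m error triples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    errors = []
    for _ in range(m):
        sample = [rng.gauss(MU_TRUE, SIGMA_TRUE) for _ in range(n)]
        errors.append(tuple(abs(e - MU_TRUE) / MU_TRUE for e in estimates(sample)))
    return errors
```

Calling relative_errors(n, 100) for each sample size N yields the per-estimate error lists that a box-plot routine would consume.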
2. (15 points) Use the Matlab function rand() to generate a data sample of N points from the uniform distribution on [0,1]. Transform the resulting data x to generate a transformed data sample where each datum $y := (-1/\lambda)\log(x)$ with $\lambda = 5$. The transformed data y will have some distribution with parameter $\lambda$; what is its analytical form? Use a Gamma prior on the parameter $\lambda$, where the Gamma distribution has parameters $\alpha = 5.5$ and $\beta = 1$.
Consider various sample sizes $N = 5, 10, 20, 40, 60, 80, 100, 500, 10^3, 10^4$. For each sample size N, repeat the following experiment $M \geq 100$ times: generate the data, get the maximum likelihood estimate $\hat{\lambda}_{\text{ML}}$, get the Bayesian estimate as the posterior mean $\hat{\lambda}_{\text{PosteriorMean}}$, and measure the relative errors $|\hat{\lambda} - \lambda_{\text{true}}| / \lambda_{\text{true}}$ for both the estimates.
• (5 points) Derive a formula for the posterior mean.
• (8 points) Plot a single graph that shows the relative errors for each value of N as a box plot (use the Matlab boxplot() function), for both the estimates.
• (2 points) Interpret what you see in the graph. (i) What happens to the error as N increases? (ii) Which of the two estimates would you prefer, and why?
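The data-generation step of question 2 might look like the sketch below, written in standard-library Python under the assumption that you are not using Matlab. The helper names are hypothetical. It covers only the transform and the relative-error measure; the estimators themselves are left to your derivation.

```python
import math
import random

LAMBDA_TRUE = 5.0


def transformed_sample(n, rng):
    """Draw x uniformly and return y = (-1/lambda) * log(x) for each draw."""
    ys = []
    for _ in range(n):
        # rng.random() lies in [0, 1); subtracting from 1 moves it to (0, 1],
        # which keeps log(x) finite.
        x = 1.0 - rng.random()
        ys.append(-math.log(x) / LAMBDA_TRUE)
    return ys


def relative_error(estimate, truth=LAMBDA_TRUE):
    """Relative error |estimate - truth| / truth."""
    return abs(estimate - truth) / truth


rng = random.Random(0)  # fixed seed for reproducibility
y = transformed_sample(10_000, rng)
print(min(y) >= 0.0)    # every transformed value is non-negative
print(sum(y) / len(y))  # sample mean; the expectation of y is 1/lambda = 0.2
```

Comparing the sample mean with 1/λ is a quick sanity check on the transform before running the full experiment.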
3. (20 points) Suppose random variable X has a uniform distribution over $[0, \theta]$, where the parameter $\theta$ is unknown. Consider a Pareto distribution prior on $\theta$, with a scale parameter $\theta_m > 0$ and a shape parameter $\alpha > 1$, as $P(\theta) \propto (\theta_m/\theta)^{\alpha}$ for $\theta \geq \theta_m$ and $P(\theta) = 0$ otherwise.
• (5 points) Find the maximum-likelihood estimate $\hat{\theta}_{\text{ML}}$ and the maximum-a-posteriori estimate $\hat{\theta}_{\text{MAP}}$.
• (8 points) Does $\hat{\theta}_{\text{MAP}}$ tend to $\hat{\theta}_{\text{ML}}$ as the sample size tends to infinity? Is this desirable or not?
• (5 points) Find an estimator of the mean of the posterior distribution $\hat{\theta}_{\text{PosteriorMean}}$.
• (2 points) Does $\hat{\theta}_{\text{PosteriorMean}}$ tend to $\hat{\theta}_{\text{ML}}$ as the sample size tends to infinity? Is this desirable or not?
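Question 3 is analytical, but a numerical check can help validate whatever closed forms you derive. The sketch below (standard-library Python) evaluates the log-likelihood and the unnormalised log-posterior on a grid and reports where each peaks. The function names, the toy sample, and the values chosen for θ_m and α are all assumptions made purely for illustration.

```python
import math


def log_likelihood(theta, sample):
    """Log-likelihood of theta for i.i.d. draws from Uniform[0, theta]."""
    if theta <= 0 or theta < max(sample):
        return float("-inf")  # some observation would fall outside [0, theta]
    return -len(sample) * math.log(theta)


def log_pareto_prior(theta, theta_m, alpha):
    """Unnormalised log prior: alpha * log(theta_m / theta) for theta >= theta_m."""
    if theta < theta_m:
        return float("-inf")
    return alpha * math.log(theta_m / theta)


def grid_argmax(fn, lo, hi, steps=5000):
    """Return the grid point in [lo, hi] at which fn is largest."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=fn)


sample = [0.5, 1.2, 2.0]    # toy data, chosen only for illustration
theta_m, alpha = 3.0, 2.0   # hypothetical prior parameters

ml_peak = grid_argmax(lambda t: log_likelihood(t, sample), 0.0, 5.0)
map_peak = grid_argmax(
    lambda t: log_likelihood(t, sample) + log_pareto_prior(t, theta_m, alpha),
    0.0,
    5.0,
)
print(ml_peak, map_peak)
```

Re-running the check with different values of theta_m and with larger samples gives a quick feel for how the prior and the data interact, which is what the limiting-behaviour parts of the question ask about.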