## Description

ISyE 6420

1. Carpal Tunnel Syndrome Tests. Carpal tunnel syndrome is the most common entrapment neuropathy. The cause of this syndrome is hard to determine, but it can include trauma, repetitive maneuvers, certain diseases, and pregnancy.

Three commonly used tests for carpal tunnel syndrome are Tinel’s sign, Phalen’s test, and the nerve conduction velocity test. Tinel’s sign and Phalen’s test are both highly sensitive (0.97 and 0.92, respectively) and specific (0.91 and 0.88, respectively). The sensitivity and specificity of the nerve conduction velocity test are 0.93 and 0.87, respectively. Assume that the tests are conditionally independent.

Calculate the sensitivity and specificity of a combined test if combining is done

(a) in a serial manner;

(b) in a parallel manner.

(c) Find the positive predictive value (PPV) for the tests in (a) and (b) if the prevalence of carpal tunnel syndrome in the general population is approximately 50 cases per 1000 subjects.
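The combination rules can be sketched in Python under the usual conventions, which are assumptions here: a serial combination is positive only if every test is positive, and a parallel combination is positive if at least one test is positive. The function names are illustrative and not part of the course materials.

```python
from math import prod

def serial(sens, spec):
    """Combined positive only if ALL tests are positive (assumed convention)."""
    se = prod(sens)                      # every test must detect the disease
    sp = 1 - prod(1 - s for s in spec)   # false positive needs all tests to err
    return se, sp

def parallel(sens, spec):
    """Combined positive if AT LEAST ONE test is positive (assumed convention)."""
    se = 1 - prod(1 - s for s in sens)   # missed only if every test misses
    sp = prod(spec)                      # negative only if all tests are negative
    return se, sp

def ppv(se, sp, prevalence):
    """Positive predictive value via Bayes' rule."""
    true_pos = se * prevalence
    false_pos = (1 - sp) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Values from the problem statement: Tinel, Phalen, nerve conduction velocity.
sens = [0.97, 0.92, 0.93]
spec = [0.91, 0.88, 0.87]
prevalence = 50 / 1000

for name, combine in [("serial", serial), ("parallel", parallel)]:
    se, sp = combine(sens, spec)
    print(f"{name}: sensitivity={se:.6f}, specificity={sp:.6f}, "
          f"PPV={ppv(se, sp, prevalence):.4f}")
```

Both rules rely on the stated conditional independence of the tests, which is what allows the per-test probabilities to be multiplied.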

2. A Simple Naïve Bayes Classifier: 6420 Students going to Beach. Assume that each instance comes with covariates $x_1, \dots, x_p$ and belongs to one of $I$ different classes $\{1, 2, \dots, I\}$. The Bayes classifier assigns the class with the maximum probability

$$\mathbb{P}(\text{Class } i \mid x_1, \dots, x_p) \propto \mathbb{P}((x_1, \dots, x_p) \mid \text{Class } i) \times \mathbb{P}(\text{Class } i), \quad i = 1, \dots, I,$$

provided that the probabilities on the right-hand side can be assessed or elicited. The symbol $\propto$ stands for the proportionality relation; the exact probabilities $\mathbb{P}(\text{Class } i \mid x_1, \dots, x_p)$ satisfy

$$\sum_{i=1}^{I} \mathbb{P}(\text{Class } i \mid x_1, \dots, x_p) = 1.$$

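Under the naïve Bayes assumption that the covariates are conditionally independent given the class,

$$\mathbb{P}((x_1, \dots, x_p) \mid \text{Class } i) = \mathbb{P}(x_1 \mid \text{Class } i) \times \mathbb{P}(x_2 \mid \text{Class } i) \times \dots \times \mathbb{P}(x_p \mid \text{Class } i) = \prod_{j=1}^{p} \mathbb{P}(x_j \mid \text{Class } i).$$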

The conditional probabilities $\mathbb{P}(x_j \mid \text{Class } i)$ are usually easier to assess. If we have a training sample, these probabilities can be taken as the relative frequencies of items with covariate $x_j$ among the items in class $i$.

Thus, the class $i$ is selected for which

$$\mathbb{P}(\text{Class } i) \times \prod_{j=1}^{p} \mathbb{P}(x_j \mid \text{Class } i), \quad i = 1, \dots, I,$$

is maximum.
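The estimation and selection steps can be sketched in Python for binary (0/1) covariates. The tiny training set below is hypothetical and serves only to exercise the code; `fit` and `posterior` are illustrative names. No smoothing is applied, so a class in which a covariate is always 0 or always 1 yields zero factors.

```python
def fit(rows, labels):
    """Estimate class priors and P(x_j = 1 | class) as relative frequencies."""
    n = len(labels)
    priors, cond = {}, {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        priors[c] = len(members) / n
        # Relative frequency of x_j = 1 among the items in class c.
        cond[c] = [sum(r[j] for r in members) / len(members)
                   for j in range(len(rows[0]))]
    return priors, cond

def posterior(x, priors, cond):
    """Normalize prior * product of per-covariate likelihoods over classes."""
    scores = {}
    for c in priors:
        score = priors[c]
        for xj, pj in zip(x, cond[c]):
            score *= pj if xj == 1 else 1 - pj
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Hypothetical training sample: two binary covariates, classes 0 and 1.
rows = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 1), (0, 0)]
labels = [1, 1, 1, 0, 0, 0]

priors, cond = fit(rows, labels)
post = posterior((1, 1), priors, cond)
best = max(post, key=post.get)   # class with the maximum posterior
print(post, best)
```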

To illustrate a very simple naïve Bayes classifier (covariates take values true/false, i.e., 1/0), assume the following scenario:

Imaginary data for 100 students are available in the file naive.csv|txt. The following variables are recorded: Satisfied with ISyE6420 Midterm results (0 no/1 yes), Personal finances good (0 no/1 yes), Friends joined (0 no/1 yes), Weather forecast good (0 no/1 yes), Gender (0 male/1 female), Went to Beach (0 no/1 yes).

The conditional probabilities $\mathbb{P}(x_j \mid \text{Class } i)$ can be estimated by the relative frequencies of items with covariate $x_j$ among the items in class $i$ (yellow in the image). For example,


Jane is happy with her 6420 Midterm results and is doing well financially; however, her friends will not go to the beach and the weather forecast does not look good. Then, for example, $\mathbb{P}(\text{Financially doing well} \mid \text{Went to Beach}) = 29/40 = 0.725$, etc.

> pbpropto = 0.875*0.725*(1-0.775)*(1-0.825)*0.225 * 0.4 %0.002248066406250

> pnbpropto = 0.45 * 0.416667 * (1-0.116667) * (1-0.716667) * 0.383333* 0.6 %0.010793211644998

> pbeach = pbpropto/(pbpropto + pnbpropto) %0.172380835483743

> pnbeach = pnbpropto/(pbpropto + pnbpropto) %0.827619164516257

Thus, after normalizing the products, the naïve Bayes classifier assigns Jane a probability of 0.17238 for the class 'Going to Beach'.
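The same calculation can be reproduced in Python as a rough sketch. The relative frequencies are read off the products above; the covariate order (midterm, finances, friends, weather, female) and the helper name `unnormalized` are assumptions made for illustration.

```python
# P(x_j = 1 | class), taken from the factors in the products above.
# Assumed covariate order: midterm, finances, friends, weather, female.
p_beach = [0.875, 0.725, 0.775, 0.825, 0.225]
p_no_beach = [0.45, 0.416667, 0.116667, 0.716667, 0.383333]
prior_beach, prior_no_beach = 0.4, 0.6

def unnormalized(x, probs, prior):
    """Prior times the product of per-covariate likelihoods."""
    score = prior
    for xj, pj in zip(x, probs):
        score *= pj if xj == 1 else 1 - pj
    return score

# Jane: happy with midterm, good finances, friends not going,
# bad weather forecast, female.
jane = [1, 1, 0, 0, 1]
pb = unnormalized(jane, p_beach, prior_beach)
pnb = unnormalized(jane, p_no_beach, prior_no_beach)

p_jane_beach = pb / (pb + pnb)   # normalize so the two classes sum to 1
print(round(p_jane_beach, 5))
```

Other students can be scored by replacing the 0/1 vector `jane` with their own covariate values.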

Classify the following two students:

(a) Michael did poorly on his Midterm, he already owes some money, his friends will go to the beach, and the weather forecast looks fine.

(b) Melissa did well on the Midterm, her finances look good, the weather prognosis looks good, but her friends will not go to the beach.

3. Multiple Choice Exam. A student answers a multiple choice examination with two questions that have four possible answers each. Suppose that the probability that the student knows the answer to a question is 0.80 and the probability that the student guesses is 0.20. If the student guesses, the probability of guessing the correct answer is 0.25. The questions are independent, that is, knowing the answer on one question is not influenced by the other question.

(a) What is the probability that both questions will be answered correctly?

(b) If both questions are answered correctly, what is the probability that the student really knew the correct answer to both questions?

(c) How would you generalize the above from 2 to n questions, that is, what are the answers to (a) and (b) if the test has n independent questions? What happens to the probabilities in (a) and (b) as n → ∞?
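One way to explore the setup numerically is the Python sketch below. It assumes that a student who knows the answer always answers correctly, which the problem implies but does not state outright; the function names are illustrative.

```python
def p_all_correct(n, p_know=0.80, p_guess_right=0.25):
    """Probability that all n independent questions are answered correctly."""
    # Per question: knows it (assumed always correct) or guesses and gets lucky.
    p_correct = p_know + (1 - p_know) * p_guess_right
    return p_correct ** n

def p_knew_all_given_correct(n, p_know=0.80, p_guess_right=0.25):
    """P(student knew every answer | all n answers correct), by Bayes' rule."""
    return p_know ** n / p_all_correct(n, p_know, p_guess_right)

# Tabulate both probabilities for a few test lengths.
for n in (1, 2, 5, 10, 50):
    print(n, p_all_correct(n), p_knew_all_given_correct(n))
```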
