$ 20.99


TAs: Annie, Alisa, Sebastian, Erin, Bhargav, Abhi, Andrew, Siva, Monica, Emily, Kevin, Neural, Markov
START HERE: Instructions
• Late Submission Policy: See the late submission policy here: http://www.cs.cmu.edu/ ˜mgormley/courses/10601/syllabus.html
• Submitting your work: You will use Gradescope to submit answers to all questions and code. Please follow instructions at the end of this PDF to correctly submit all your code to Gradescope.
– Programming: You will submit your code for programming questions on the homework to Gradescope. After uploading your code, our grading scripts will autograde your assignment by running your program on a virtual machine (VM). You are only permitted to use the Python Standard Library modules and numpy. Ensure that the version number of your programming language environment (i.e. Python 3.9.12) and versions of permitted libraries (i.e. numpy 1.23.0) match those used on Gradescope. You have 10 free Gradescope programming submissions, after which you will begin to lose points from your total programming score. We recommend debugging your implementation on your local machine (or the Linux servers) and making sure your code is running correctly first before submitting your code to Gradescope. The submission limit is true for future assignments, but this one allows unlimited submissions.
• Materials: The data and reference output that you will need in order to complete this assignment is posted along with the writeup and template on the course website.
Instructions for Specific Problem Types
For “Select One” questions, please fill in the appropriate bubble completely:
Select One: Who taught this course?
Matt Gormley
⃝ Marie Curie
⃝ Noam Chomsky
Select One: Who taught this course?
Henry Chai
⃝ Marie Curie
@@ Noam Chomsky
For “Select all that apply” questions, please fill in all appropriate squares completely:
Select all that apply: Which are scientists?
■ Stephen Hawking
■ Albert Einstein
■ Isaac Newton
2 I don’t know
Select all that apply: Which are scientists?
■ Stephen Hawking
■ Albert Einstein
■ Isaac Newton
@■ I don’t know
Fill in the blank: What is the course number?
1 Programming: Majority Vote Classifier [30 Points]
1.1 Introduction
The goal of this assignment is to ensure that you:
1. Have a way to edit and test your code (i.e. a text editor and compiler/interpreter)
2. Are familiar with submitting to Gradescope
3. Are familiar with file I/O and standard output in Python
1.2 Majority Vote Classifier
1.2.1 Algorithm
The training procedure should store the label used for prediction at test time. In the case of a tie, output the value that is numerically higher (or comes last alphabetically- ie. in a tie between Apples and Bananas, you would return Bananas). At test time, each example should be passed through the classifier. Its predicted label becomes the label most commonly occurring in the train set.
Looking ahead: This simple algorithm acts as a small component of the Decision Tree that you will implement in the next homework assignment. We hope that you will employ best practices when coding so that you can re-use your own code here in the next assignment. A Majority Vote Classifier is simply a decision tree of depth zero (it predicts a class label for the input instance based on the most commonly occurring label present in the data).
1.2.2 The Datasets
Materials Download the zip file from course website, which contains all the data that you will need in order to complete this assignment.
Datasets The handout contains two datasets. Each one contains attributes and labels and is already split into training and testing data. The first row of each .tsv file contains the name of each attribute, and the class label is always the last column.
1. heart: The first task is to predict whether a patient has been (or will be) diagnosed with heart disease, based on available patient information. The attributes (aka. features) are:
(a) sex: The sex of the patient—1 if the patient is male, and 0 if the patient is female.
(b) chest_pain: 1 if the patient has chest pain, and 0 otherwise.
(c) high_blood_sugar: 1 if the patient has high blood sugar (>120 mg/dl fasting), or 0 otherwise.
(d) abnormal_ecg: 1 if exercise induced angina in the patient, and 0 otherwise. Angina is a type of severe chest pain.
(e) flat_ST: 1 if the patient’s ST segment (a section of an ECG) was flat during exercise, or 0 if it had some slope.
(f) fluoroscopy: 1 if a physician used fluoroscopy, and 0 otherwise. Fluoroscopy is an imaging technique used to see the flow of blood through the heart.
(h) heart_disease: 1 if the patient was diagnosed with heart disease, and 0 otherwise. This is the class label you should predict.
The training data is in heart_train.tsv, and the test data in heart_test.tsv.
The handout zip file also contains the predictions and metrics from a reference implementation of a Majority Vote Classifier for the heart and education datasets (see subfolder example output). You can check your own output against these to see if your implementation is correct.
Note: For simplicity, all attributes are discretized into just two categories. This applies to all the datasets in the handout, as well as the additional datasets on which we will evaluate your Majority Vote Classifier.
1.2.3 Command Line Arguments
The autograder runs and evaluates the output from the files generated, using the following command:
$ python majority_vote.py [args…]
Where above [args…] is a placeholder for five command-line arguments: <train input> <test input> <train out> <test out> <metrics out>. These arguments are described in detail below:
1. <train input>: path to the training input .tsv file
2. <test input>: path to the test input .tsv file
3. <train out>: path of output .txt file to which the predictions on the training data should be written
4. <test out>: path of output .txt file to which the predictions on the test data should be written
5. <metrics out>: path of the output .txt file to which metrics such as train and test error should be written
As an example, the following command line would run your program on the heart dataset. The train predictions would be written to heart_train_labels.txt, the test predictions to heart_test_labels.txt, and the metrics to heart_metrics.txt.
$ python majority_vote.py heart_train.tsv heart_test.tsv heart_train_labels.txt heart_test_labels.txt heart_metrics.txt
1.2.4 Output: Labels Files
Your program should write two output .txt files containing the predictions of your model on training data (<train out>) and test data (<test out>). Each should contain the predicted labels for each example printed on a new line. Use ‘ ’ to create a new line.
Your labels should exactly match those of a reference majority vote classifier implementation—this will be checked by the autograder by running your program and evaluating your output file against the reference solution.
The first few lines of an example output file is given below for the heart dataset:
0 …
1.2.5 Output: Metrics File
Generate another file where you should report the training error and testing error. This file should be written to the path specified by the command line argument <metrics out>. Your reported numbers should be within 0.0001 of the reference solution. You do not need to round your reported numbers! The autograder will automatically incorporate the right tolerance for float comparisons. The file should be formatted as follows:
error(train): 0.490000 error(test): 0.402062
1.3 Command Line Arguments
In this and future programming assignments, we will use command line arguments to run your programs with different parameters. Below, we provide some simple examples for how to do this in Python. In the examples below, suppose your program takes two arguments: an input file and an output file.
import sys
if __name__ == ’__main__’: infile = sys.argv[1] outfile = sys.argv[2]
print(f”The input file is: {infile}”) print(f”The output file is: {outfile}”)
1.4 Code Submission
You must submit a single file named majority_vote.py. Any other files will be deleted. The autograder is case sensitive. You must submit this file to the corresponding homework link on Gradescope.
Written Questions (61 points)
1 LATEX Bonus Point and Template Alignment (1 points)
1. (1 point) Select one: Did you use LATEX for the entire written portion of this homework?
⃝ Yes
⃝ No
2. (0 points) Select one: I have ensured that my final submission is aligned with the original template given to me in the handout file and that I haven’t deleted or resized any items or made any other modifications which will result in a misaligned template. I understand that incorrectly responding yes to this question will result in a penalty equivalent to 2% of the points on this assignment.
Note: Failing to answer this question will not exempt you from the 2% misalignment penalty.
⃝ Yes
2 Course Policies (9 points)
This section covers important course policies that every student should know and understand. These questions MUST be finished in order for the whole homework to be considered for grading.
1. (1 point) Select one: Assignment turned in late without prior approval will incur a daily penalty. How much is the penalty? Up to 1 day: Up to 2 day: Up to 3 day: Up to 4 day:
⃝ 5%, 10%, 15%, 20%
⃝ 10%, 20%, 30%, 40%
⃝ 25%, 50%, 75%, 100%
⃝ 20%, 40%, 60%, 80%
⃝ As many as I want; Of course!
⃝ 6; No
⃝ 6; Yes
⃝ 4; No
⃝ 4; Yes
3. (1 point) Select all that apply: Seeking help from other students in understanding course materials needed to solve homework problems is ALLOWED under which of the following conditions?
2 Any written notes are taken on an impermanent surface (e.g. whiteboard, chalkboard) and discarded before writing up one’s solution alone.
2 Learning is facilitated not circumvented; i.e., the purpose of seeking help is to learn and understand the problem instead of merely getting an answer
2 Help both given and received is reported in collaboration questions in the homework
2 The student updates his/her collaborative questions even if it is after submitting their own
2 None of the above
4. (1 point) Select all that apply: Which of the following is (are) strictly forbidden in solving and submitting homework?
2 Searching on the internet for solutions or sample codes
2 Consulting people outside this class who have seen or solved the problem before
2 Turning in someone else’s homework
2 Using anyone else’s, or allowing other classmates to use your computer or Gradescope account in connection with this course
2 None of the above
5. (1 point) Select one: If you solved your assignment completely on your own, you can skip the collaboration questions at the end of each homework.
⃝ True
⃝ False
6. What is (are) the consequence(s) of being caught cheating in this course?
(a) (1 point) Select all that apply: First instance of cheating:
2 Failure of the course
2 Negative 100% on the assignment
2 None of the above
(b) (1 point) Select all that apply: Second instance of cheating:
2 Failure of the course
2 Negative 100% on the assignment
2 None of the above
7. (1 point) Select one: Assume a difficult situation arises in the middle of the semester (e.g. medical, personal etc.) that might prevent you from submitting assignments on time or working as well as you would like. What should you do?
⃝ Do not speak to the course staff, try to finish the class, reach out to the course staff in the end of the semester explaining your special situation
3 Probability and Statistics (15 points)
1. (1 point) Select one: For events A and B, where A ∩ B indicates A AND B, and A ∪ B indicates A
P(A ∩ B) = P(A) + P(B) − P(A ∪ B)
⃝ True
⃝ False
2. (1 point) Select one: For events A1,A2,A3,
P(A1 ∩ A2 ∩ A3) = P(A3|A2 ∩ A1)P(A2|A1)P(A1)
⃝ True
⃝ False
P(R) = 0.4
P(W|R) = 0.8 P(W|¬R) = 0.2
What is the probability that your car is wet in the morning?
⃝ 0.64
⃝ 0.56
⃝ 0.44
⃝ 0.4
4. Consider the following joint probability table where both X and Y are binary variables:
X Y Probability
0 0 0.1
0 1 0.4
1 0 0.2
1 1 0.3
(b) (1 point) Select one: What is P(X = 1|Y = 1)?

(c) (1 point) Select one: What is P(Y = 0)?
⃝ 0.2
⃝ 0.6
⃝ 0.5
⃝ 0.3
5. Let D be a distribution with mean 1 and variance 2. Now, consider 16 independently identically distributed random variables X1,X2,··· ,X16 ∼ D. Let .
(a) (1 point) Select one: What is E[Y ]?

(b) (1 point) Select one: What is V ar[Y ]?

6. Let A, B, and C be random variables with discrete probability distributions. Consider the following two joint probability tables: one relating A and B, and the other relating B and C.

AB b1 b2 b3
a1 0.1 0.05 0.15
a2 0.1 0.05 0.3
a3 0.05 0.15 0.05

BC c1 c2 c3 c4
b1 0.02 0.14 0.06 0.03
b2 0.03 0.05 0 0.17
b3 0.35 0.04 0 0.11

(a) (1 point) Select all that apply: Which of the following statements are necessarily false? Note X Y indicates that random variable X is independent of random variable Y .
2 AB
2 BC
2 CB
2 None of the above.
(b) (1 point) What is P(B = b1|A = a2,C = c4)? If this value cannot be computed, write N/A. Ignore when P(A = a2,C = c4) = 0 for this problem, if such a case is possible.

(c) (1 point) What is P(B = b2|A = a3,C = c3)? If this value cannot be computed, write N/A. Ignore when P(A = a3,C = c3) = 0 for this problem, if such a case is possible.

(d) (2 points) Select one: True or False: P3i=1 P(B = bi|C = c1) = Pj4=1 P(C = cj|B = b1)
⃝ True
⃝ False
7. (2 points) Select one: Consider two random variables X,Y . Assume that we have
for x ∈ Z≥1 (integers greater than or equal to 1) and .
Assume n is a fixed positive integer constant. What is E[Y ]?

8. (1 point) Please match the probability density function of the random variable X to its corresponding distribution name.
B) P(X = x) = λe−λx when x ≥ 0; 0 otherwise
D) when a ≤ x ≤ b; 0 otherwise
E) P(X = x) = px(1 − p)1−x

4 Linear Algebra (7 points)
 2
1. (1 point) Select one: The matrix A = −3
⃝ True
⃝ False 1
2 3 4 
0  has an inverse. −2
2. (2 points) Select one: Consider two vectors x and y . Let z = xTy. What is ?
⃝ y2
⃝ x2
⃝ x
⃝ y
3 4 2
3. (2 points) Select one: Given matrix X = 1 6 2 and the column vector y, what is the
1 4 4
eigenvalue of X associated with y? (Recall an eigenvector of a matrix A ∈ Rn×n is a nonzero vector v ∈ Rn such that Av = λv where we call the scalar λ the associated eigenvalue for v.)
⃝ −5
⃝ −3
⃝ 2
⃝ 1.5
4. (2 points) Select one: While preparing for her linear algebra final, Ada is finding eigenvectors and eigenvalues for different matrices. Ada finds out that some matrix A (not given) has eigenvalues −3
−1 −3
and 2. She also notices that both  0  and  0  are solutions to the equation Av = 2v, and
2 6
concludes that both of these vectors are eigenvectors corresponding to the eigenvalue of 2. Which statement regarding her conclusion is true?
⃝ It must be wrong because there cannot be multiple eigenvectors corresponding to a single eigenvalue.
⃝ It must be wrong because the eigenvectors are linearly dependent and thus cannot both be solutions to Av = 2v.
⃝ It is correct because both vectors are solutions to Av = 2v.

5 Calculus (8 points)
(2 points) Evaluate the derivative of y with respect to x, where .

2. (2 points) Select one: Find the partial derivative of y with respect to x, where y = 3×2 sin(z)e−x
⃝ 3xsin(z)e−x(2 + x) ⃝ −6xsin(z)e−x
⃝ 3xsin(z)e−x(2 − x)
⃝ 6xcos(z)e−x
3. (2 points) Select one: For the function f(x) = 4×3−5×2−2x, the value sets the derivative to be 0. Additionally, the second order derivative of is negative. What can you say about f(x) at the point :
⃝ a local minimum
⃝ a local maximum
⃝ a local minimum or a local maximum
⃝ None of the above
4. (2 points) Select one: Suppose that f(x|θ) = xTθ, where x,θ ∈ Rn. The function g(θ) is defined as for some x(1) ∈ Rn and y(1) ∈ R. What is the function type of g(θ):
⃝ g : Rn → R
⃝ g : R → R
⃝ g : R → Rn
⃝ g : (Rn × Rn) → R
6 Geometry (6 points)
Consider the vector w and the line wTx + b = 0. Assume x and w are both two-dimensional column vectors and that wT indicates the transpose of the column vector w.
(a) (2 points) Select one: Consider any two points x′ and x′′ on the line wTx + b = 0. Select the correct statement describing the relationship between wT and x′ − x′′.
⃝ The inner product wT(x′ − x′′) is 0
⃝ The inner product wT(x′ − x′′) is 1
⃝ The inner product wT(x′ − x′′) is b
(b) (2 points) Select one: What relationship does the vector w share with the line wTx + b = 0? (assume x and w are both two dimensional column vectors, and wT indicates the transpose of the column vector w.)
⃝ parallel
⃝ orthogonal
⃝ depends on the value of b
2. (2 points) Select one: Let w,x be d-dimensional vectors. What is the distance from the origin to the line wTx + b = 0?

7 CS Foundations (15 points)
Consider the following functions for computing the nth Fibonacci number.
def fib_1(n): if n == 0 or n == 1: return n
return fib_1(n – 1) + fib_1(n – 2)
d = {0: 0, 1: 1} def fib_2(n): if n in d:
return d[n]
d[n] = fib_2(n – 1) + fib_2(n – 2)
return d[n]
(a) (1 point) Select one: Which of the following is the tightest upper bound on the time complexity of computing fib_1(n)?
⃝ O(n)
⃝ O(nlogn)
⃝ O(2n)
⃝ O(n!)
(b) (1 point) Select one: Which of the following is the tightest upper bound on the time complexity of computing fib_2(n)?
⃝ O(n)
⃝ O(nlogn)
⃝ O(2n)
⃝ O(n!)

Figure 1: Britain’s Royal Family
2. (1 point) Select one: Using the tree shown in Figure 1, how many nodes would depth-first-search visit in finding Mia Tindall (including her node)? Assume we search left-to-right and top-down.
⃝ 3
⃝ 12
⃝ 15
⃝ 18
3. Figure 2 is a Binary Tree with indexed nodes. Assume root node is node 1.

Figure 2: A Binary Tree with indexed nodes
(a) (1 point) What is the node-visit order of DFS and BFS of the above Binary Tree?
A depth-first search (DFS) traversal of a binary tree starts with visiting the root node, and recur-sively searches down the left subtree (i.e., the tree rooted at the left node) before going to search the right subtree (i.e., the tree rooted at the right node) until the traversal is done.
Note: Alternatively, we can also look right subtree before left subtree too, for the question please consider left to right order!

(b) (1 point) What is the node-visit order of BFS of the above Binary Tree?
A breadth-first search (BFS) traversal of a binary tree visits every node (assuming a left-to-rightorder) on a level (with the same distance to the root) before going to a lower level until the traversal is done.

4. (4 points) Fill in the blanks in the Python code for key search using recursive depth-first search (DFS) traversal. (Note: Please put your answer in the boxes below, not on the lines.)
class TreeNode:
def __init__(self, key): self.key = key self.leftNode = None self.rightNode = None
# (a) the left/right node is denoted as
# node.leftNode/node.rightNode
# (b) left/right node are of type TreeNode
# (c) the key of the node is denoted as node.key # (d) the left node is searched before the right node
def find_val(node, key):
if node is None: return None
if (1) :

return node
result = (2)

if result is None:
result = (3)

5. Consider an M ×N grid where each cell has a cost associated with passing through it. From any given cell, you can either travel to the right (same row, next column) or down (same column, next row). You need to return the minimum possible cost of traveling from the start point [0,0] in the top left of the grid to the end point [M − 1,N − 1] in the bottom right of the grid. In Figure 3, the minimum cost path is highlighted in gold and the cost is 24. Assume the cost of traveling outside the grid is infinite.

Figure 3: An example grid containing the costs for each cell and the directions one can travel from a cell.
We will first attempt to solve this using recursion.
# grid is of type List[List[int]] m, n = len(grid), len(grid[0])
def minCostPath(r, c): if (1) :

return (2)

if r == m-1 and c == n-1: return grid[r][c]
return (3)

minCostPath(0, 0)
(a) (3 points) Fill in the blanks to complete the above function.
(b) (3 points) From Figure 4, we can deduce that using recursion is inefficient because we recalculate the costs of different partial routes from the same intermediate cells to the endpoint.
The following figure demonstrates the inefficiency of using recursion to find the minimum cost path through the grid because it recalculates the cost of partial routes. For example, we can see that the red and black paths share the partial route from [1,2] to [2,2]. The cost of this partial route is calculated once for the red path and once again for the black path.

Figure 4
To solve this problem, we can utilize memoization by creating a dictionary cache with each intermediate r, c state and its corresponding value of the minCostPath function.
Fill in the blanks below. Assume fields (1) and (2) use your same solutions from above.
# grid is of type List[List[int]] m, n = len(grid), len(grid[0]) cache = {}
def minCostPath(r, c): if (1) :
return (2)
if r == m-1 and c == n-1: return grid[r][c]
key = ( (4) , (5) ) if key not in cache: cache[key] = (6) return cache[key]
minCostPath(0, 0)
9 Collaboration Questions
1. Did you receive any help whatsoever from anyone in solving this assignment? If so, include full details.
2. Did you give any help whatsoever to anyone in solving this assignment? If so, include full details.
3. Did you find or come across code that implements any part of this assignment ? If so, include full details.


There are no reviews yet.

Be the first to review “10601 – HOMEWORK 1: BACKGROUND Solved”

Your email address will not be published. Required fields are marked *