CSCI 347 (Solution)

Homework 02
Show your work. Include any code snippets you used to generate an answer, using comments in the code to clearly indicate which problem each snippet corresponds to.
Consider the following data matrix:
D =
      X1      X2   X3
x1    red     yes  north
x2    blue    no   south
x3    yellow  no   east
x4    yellow  no   west
x5    red     yes  north
x6    yellow  yes  north
x7    blue    no   west
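As a starting point for the problems below, the data matrix can be held as a plain list of rows. This is only a sketch: the row values are an assumed reading of the matrix above (rows x1 to x7, columns X1, X2, X3), and the names `column` and `plot_counts` are hypothetical helpers, not part of the assignment. The plotting helper imports matplotlib lazily, so the counting part runs without it.

```python
from collections import Counter

# Assumed reading of D: one tuple per instance, in the order (X1, X2, X3).
D = [
    ("red", "yes", "north"),     # x1
    ("blue", "no", "south"),     # x2
    ("yellow", "no", "east"),    # x3
    ("yellow", "no", "west"),    # x4
    ("red", "yes", "north"),     # x5
    ("yellow", "yes", "north"),  # x6
    ("blue", "no", "west"),      # x7
]


def column(data, j):
    """Return the j-th attribute (0-based) of every instance."""
    return [row[j] for row in data]


# Count how often each value of X2 occurs.
x2_counts = Counter(column(D, 1))


def plot_counts(counts, name):
    """Draw a labeled bar plot of category counts (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported here so counting works without it

    labels = sorted(counts)
    fig, ax = plt.subplots()
    ax.bar(labels, [counts[k] for k in labels])
    ax.set_xlabel(name)
    ax.set_ylabel("Count")
    ax.set_title("Counts of " + name)
    plt.show()
```

Calling `plot_counts(x2_counts, "X2")` would then display the bar plot with both axes labeled.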
Answer the following:
1. (5 points) Use matplotlib to create a bar plot for the counts of the variable X2. Make sure to label the axis.
2. (2 points) Use one-hot encoding to transform all the categorical attributes to numerical values. Write down the transformed data matrix. (In what follows, we will refer to the transformed data matrix as Y.)
3. (2 points) What is the Euclidean distance between instance x2 (second row) and x7 (seventh row) after applying one-hot encoding?
4. (2 points) What is the cosine similarity (cosine of the angle) between data instance x2 and data instance x7 after applying one-hot encoding?
5. (2 points) What is the Hamming distance between data instance x2 and data instance x7 after applying one-hot encoding?
6. (2 points) What is the Jaccard similarity between data instance x2 and x7 after applying one-hot encoding?
7. (2 points) What is the multi-dimensional mean of Y?
8. (2 points) What is the estimated variance of the first column of Y?
9. (2 points) What is the resulting matrix after applying standard (z-score) normalization to the matrix Y? In the following, we will call this matrix Z.
10. (2 points) What is the multi-dimensional mean of Z?
11. (2 points) Let zi be the i-th row of Z. What is the Euclidean distance between z2 and z7?
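The quantities asked for above can be sketched in plain Python as follows. Several choices here are assumptions rather than part of the assignment: the row values are an assumed reading of the data matrix, the one-hot columns are ordered alphabetically within each attribute (so "first column of Y" depends on that ordering), the Jaccard similarity is computed on the 1-entries only, and the "estimated variance" is taken to be the sample variance with an n - 1 denominator. All function names are hypothetical.

```python
import math

# Assumed reading of D: one tuple per instance, in the order (X1, X2, X3).
D = [
    ("red", "yes", "north"), ("blue", "no", "south"),
    ("yellow", "no", "east"), ("yellow", "no", "west"),
    ("red", "yes", "north"), ("yellow", "yes", "north"),
    ("blue", "no", "west"),
]


def one_hot(data):
    """One-hot encode every attribute; categories sorted alphabetically."""
    n_cols = len(data[0])
    cats = [sorted({row[j] for row in data}) for j in range(n_cols)]
    encoded = [
        [1 if row[j] == c else 0 for j in range(n_cols) for c in cats[j]]
        for row in data
    ]
    return encoded, cats


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    return dot / (math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b)))


def hamming(a, b):
    """Number of positions where the two vectors differ."""
    return sum(p != q for p, q in zip(a, b))


def jaccard(a, b):
    """Shared 1-entries divided by positions where at least one vector is 1."""
    both = sum(1 for p, q in zip(a, b) if p == 1 and q == 1)
    either = sum(1 for p, q in zip(a, b) if p == 1 or q == 1)
    return both / either


def col_means(M):
    return [sum(col) / len(M) for col in zip(*M)]


def col_sample_vars(M):
    """Sample variance per column (n - 1 denominator, an assumption)."""
    means = col_means(M)
    return [
        sum((v - m) ** 2 for v in col) / (len(M) - 1)
        for col, m in zip(zip(*M), means)
    ]


def z_score(M):
    """Subtract each column mean and divide by the column sample std."""
    means = col_means(M)
    stds = [math.sqrt(v) for v in col_sample_vars(M)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in M]


Y, categories = one_hot(D)
Z = z_score(Y)
y2, y7 = Y[1], Y[6]  # rows are 0-indexed, so x2 is Y[1] and x7 is Y[6]
```

Under these assumptions Y has nine columns, and x2 and x7 differ only in their X3 value, so exactly two one-hot positions differ between `y2` and `y7`. `z_score` divides by each column's standard deviation, which works here because no column of Y is constant; switching to a population variance (dividing by n) changes the answers to problems 8, 9, and 11 but not the mean of Z.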
Acknowledgements: Homework problems adapted from assignments of Veronika Strnadova-Neeley.
