Description
Lab 3: More Linear Algebra
Lecturer: Abir De
Important: Please read the instructions mentioned in the questions carefully. We have provided boilerplate code for each question. Please ensure that you make changes in the areas marked with TODO. Please read the comments in the code carefully for detailed information regarding input and output format.
1 Discrete Cosine Transform
Discrete Cosine Transform (DCT) is a widely used transformation technique in signal processing and data compression. There are around 8 variations of DCT – depending on the normalization coefficients and boundary conditions.
For this lab, we will use the 2-Dimensional DCT-II. Consider an image X of size N1 × N2. The DCT of this image is also a matrix Y of size N1 × N2 given by the formula
The computation of Y can be succinctly captured by a series of matrix multiplications, resulting in a much faster implementation when coded in python.
Implement the vectorized version of 2-D DCT by completing the function vectorized_dct
2 Document Ranking by Query
Information retrieval can be defined as the task of obtaining the most relevant resources according to some requirements from a collection of resources. Document ranking is an important problem in this domain. We are given a collection of documents and a query. We need to order the documents according to their relevance with their query i.e. the document most relevant to the query must be a the top and the document least relevant must be at the bottom.
In this problem, each query is represented as a list of vectors vi ∈ rw. The length of each list is m.
Each vector in the list can be though of as the representation of a word in the query.
We are also given an collection of k documents where each document is a list of n vectors aj ∈ Rw. Again, each vector in the list can be though of as the representation of a word in the document.
The document at the ith position has docID = i where i = 0,1,2,3…k − 1
Your task is to return a list of docIDs ordered according to relevance to the query.
The relevance R(d,q) between a document d and a query q is calculated as follows –
For a vector vi in the query, compute maxajviT aj over the vectors aj in the document. Let the value be si. This denotes the maximum similarity a word in the query has with a word in the document.
Now, sum all these values of si for each vector vi in the query. This gives the relevance score R(d,q).
Complete the function get_document_ranking to implement this.
3 Markov Chains
Markov Chains: It is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. We can represent it using a digraph, where the nodes represent states & edges represent transition probabilities.
Example of Markov chain:
0.5 0.8
1 1
0.5
In this problem you have to find an optimal way to compute reaching a particular state from a given state using transition probabilites matrix. Assume that each transition takes a unit time.
Given a markov chain matrix Mnxn where Mij denotes transition probability from state Si given Sj = P(Sj|Si). Find the probability of reaching state ST at time t = T if we start from state Sa at time t = 0.
• Complete the function compute_state_probability(M, S_T, T, S_a) to return computed probability upto 4 decimal places.
• Complete the function compute_state_probability_vec(M, S_T, T, S_a) to return vectorized computed probability upto 4 decimal places.
Note: Assume n (number of states) is large.
Complete the functions in assignment.py. Keep the file in a folder named <ROLL_NUMBER>_L3 and compress it to a tar file named <ROLL_NUMBER>_L3.tar.gz using the command tar -zcvf <ROLL_NUMBER>_L3.tar.gz <ROLL_NUMBER>_L3
Submit the tar file on Moodle.
The directory structure should be –
<ROLL_NUMBER>_L3
| – – – – assignment.py
Replace ROLL_NUMBER with your own roll number. If your Roll number has alphabets, they should be in “small” letters.
Reviews
There are no reviews yet.