Description
Lab 4: Developing multi-threaded applications
Objectives
1. To develop multi-threaded application programs
2. To demonstrate the use of threads in matrix multiplication
Guidelines
As noted in the class text book there is an increasing importance of parallel processing that led to the development of lightweight user-level thread implementations. Even so, the topic of whether threads are a better programming model than processes or other alternatives remains open. Several prominent operating systems researchers have argued that normal programmers should almost never use threads because (a) it is just too hard to write multi-threaded programs that are correct and (b) most things that threads are commonly used for can be accomplished in other, safer ways. These are important arguments to understand—even if you agree or disagree with them, researchers point out pitfalls with using threads that are important to avoid. The most important pitfall is the concurrent access of threads to shared memory. When threads concurrently read/write to shared memory, program behavior is undefined. This is because thread schedule on the CPU is nondeterministic. The program behavior completely changes when you rerun the program. This should have been obvious to you by now with you the examples you have run in Lab2.
To ensure a deterministic behavior of programs where threads cooperate to access shared memory, a synchronization mechanism needs to be implemented. Consequently,
1. The program behavior will be related to a specific function of input, not of the sequence of which thread runs first on the CPU
2. The program behavior will be deterministic and will not vary from run to run
3. These facts should not be IGNORED, otherwise the compiler will mess up the results and will not be according what is thought would happen
This lab is designed to give you the first hands-on programming experience on developing multi-threaded applications. In the coming labs, you will
C Program with threads (problems 1, 2, 7, and 8 of the class textbook)
Step 1. Download the slightly revised threadHello.c program from Camino, then compile and run several times. The comment at the top of program explains how to compile and run the program.
Explain what happens when you run the threadHello.c program? Do you get the same result if you run it multiple times? What if you are also running some other demanding processes (e.g., compiling a big program, playing a Flash game on a website, or watching streaming video) when you run this program?
The function go() has the parameter arg passed a local variable. Are these variables per-thread or shared state? Where does the compiler store these variables’ states?
The main() has local variable i. Is this variable per-thread or shared state? Where does the compiler store this variable?
Step 2. Delete the second for loop in threadHello.c program so that the main routine simply creates NTHREADS threads and then prints “Main thread done.” What are the possible outputs of the program now. Explain.
Matrix multiplication with threads (problem 5 of the class text book)
Step 3. Write a program that uses threads to perform a parallel matrix multiply. To multiply two matrices, C =
A*B, the result entry C(i,j) is computed by taking the dot product of the ith row of A and the jth column of
B: Ci,j = SUM A(i,k)*B(k,j) for k = 0 to N-1, where N is the matrix size. We can divide the work by creating one thread to compute each value (or each row) in C, and then executing those threads on different processors in parallel on multi-processor systems. As shown in the following figure, each is cell in the resulting matrix is the sum of the multiplication of row elements of the first matrix by the column elements of the second matrix.
srand(time(NULL)); for (int i = 0; i < N; i++) for (int j = 0; j < M; j++)
matrixA[i][j] = rand();
srand(time(NULL)); for (int i = 0; i < M; i++) for (int j = 0; j < L; j++)
matrixB[i][j] = rand();
The following are important notes:
– The number of columns of the first matrix must be equal to the number of rows in the second matrix. – The values of N, M, and L must be large to exploit parallelism (e.g. N, M, L = 1024).
– The output matrix would be defined double matrixC[N][L];
– The Matrix multiplication is a loosely coupled problem and so decomposable, i.e. multiplication of each row of matrixA with all columns of matrix can be performed independently and in parallel
– The number of threads to be created in the program equals to N and each thread i would be performing the following task:
for (int j = 0; j < L; j++){ double temp = 0; for (int k = 0; k < M; k++){
temp += matrixA[i][k] * matrixB[k][j];
} matrixC[i][j] = temp;
}
– The main thread needs to wait for all other threads before it displays the resulting matrixC
Step 4. [Bonus] Modify your program in Step 3 to creat N*L threads, each computing ith row * jth column.
Requirements to complete the lab
2. Submit your answers to questions, observations, and notes as .txt file and upload to Camino
3. Submit the source code for all your programs as .c file(s) and upload to Camino.
Be sure to retain copies of your .c and .txt files. You will want these for study purposes and to resolve any grading questions (should they arise)
Please start each program/ text with a descriptive block that includes minimally the following information:
# Name: <your name>
# Title: Lab4 – task
# Description: This program computes … <you should
# complete an appropriate description here.>
Reviews
There are no reviews yet.