I’m working on a programming project and need the explanation and answer to help me learn.
The goal of this assignment is to write a program in a language that restricts the programming model to simplify dealing with concurrency: Google’s Go. You will use channels, go-routines, and signaling to compute binary search tree equivalence.
I have uploaded the attachments for more details
10/10/23, 9:36 PM
CS378 Assignment #3
CS380P: Parallel Systems
Lab #3
The goal of this assignment is to write a program in a language that restricts the programming
model to simplify dealing with concurrency: Google’s Go. You will use channels, go-routines,
and signaling to compute binary search tree equivalence.
It is highly recommended that you read all instructions for this lab before you start coding.
Binary Search Trees
A binary search tree (BST) is a simple data structure where each node contains a data value and
pointers to left and right subtrees. Each subtree is constrained such that the left substree contains
only values smaller than its parent and the right subtree contains only values larger than its
parent. While this constraint generally helps the average search time, it also limits where new
values can be inserted (without displacing old values). This means the organization of the tree is
dependent on the order in which values are inserted; two “identical” trees can have have very
different layouts despite containing the same values.
Consider the two example trees below, each of which encodes the sort order 1, 2, 4, 8, 16. The
tree on left, is produced by insertion orders in which 8 is inserted first, while the one on the
right is produced by insertion orders in which 2 is inserted first. However, despite the fact that
the data structures have different node-level organizations, an in-order traversal of the trees
produces the same sort order, and the BSTs are therefore said to be “equivalent.”
Algorithm
Given a set of binary search trees, we would like to determine which trees contain the same
values. We can do this naively by comparing each tree to every other tree to see if they are the
same (if traversing them in order results in the same values). However, this is highly inefficient.
Instead, we will first compute a hash of each tree by traversing it in order. We can then compare
the hashes of the trees rather than the trees themselves. If the hashes of two trees are different,
https://www.cs.utexas.edu/~rossbach/cs380p/lab/go-cs380p.html
1/4
10/10/23, 9:36 PM
CS378 Assignment #3
no further work is necessary. If the hashes of the trees are the same, the trees must be compared
to see if they are really the same.
Deliverables will be detailed below, but the focus is on a writeup that provides performance
measurements as graphs, and answers (perhaps speculatively) a number of questions. Spending
some time setting yourself up to quickly and easily collect and visualize performance data is a
worthwhile time investment since it will come up over and over in this lab and for the rest of the
course.
Input files
We have prepared a few input files to help you with this assignment: simple.txt, coarse.txt, and
fine.txt. Simple.txt is intended to be small to simplify debugging, especially via print statements.
Coarse.txt contains a small number of very large trees, which should allow you to observe the
benefits of parallelism without high amounts of overhead. Fine.txt contains many very small
trees and should start to test the overhead and scalability of your implementation.
Step 1: Create a sequential solution
In step 1 of the lab, you will write a program that accepts command-line parameters to specify
the following:
-hash-workers= integer-valued number of threads
-data-workers= integer-valued number of threads
-comp-workers= integer-valued number of threads
-input= string-valued path to an input file.
While the options allow you to specify a number of workers, your initial solution should just
implement a single-threaded version of the algorithm. This greatly simplifies debugging and
gives you a baseline against which to measure subsequent parallel versions. At this point, your
implementation should:
contruct BSTs as specified by the file
each line represents a different BST and ends in a newline character
the numbers on each line, which are separated by spaces, are the values contained in
the BST
numbers should be inserted into the BST in the order provided
generate a hash of each BST
store the hashes and identify potential duplicates
compare trees with identical hashes to determine their equality
for each tree, store the IDs of each identical tree
a BST’s ID is just the index of it in the input file
Step 2: Parallelize hash operations
https://www.cs.utexas.edu/~rossbach/cs380p/lab/go-cs380p.html
2/4
10/10/23, 9:36 PM
CS378 Assignment #3
There are three main steps that you should parallelize, the first of which is computing the BST
hashes. During your first implementation, this should be done by spawning a goroutine for each
BST. Once you have done this, try spawning hash-workers goroutines (threads) and having
them iterate over the available BSTs. Try your code with several different values of hashworkers on both coarse.txt and fine.txt. Which implementation is faster (and by how much)?
Can Go manage goroutines well enough that you don’t have to worry about how many threads to
spawn anymore?
For simplicity, we recommend you use the faster of your implementations before proceeding.
With your hashing parallelized, you will now need to place these hashes in a shared location and
identify duplicates. We believe the easiest way to do both of these is to use a map from hashes to
BST IDs with that hash. However, since appending a new BST ID is not atomic, we must protect
the shared data structure. For your first implementation, spawn a single thread to insert data into
the map and have your hashing threads communicate with it via a channel by sending (hash,
BST ID) pairs through it. Since channels are thread safe and only one thread is modifying the
data structure at a time, this approach is entirely safe. Once you have done this, make an
alternate implementation where the data structure is protected by a single lock and each thread
must acquire the lock to add their result to the data structure (i.e. they now fight for the right to
do the work of the helper thread before). Compare the speed of both implementations. Which
approach has more overhead? How much faster are they compared to a single thread? Which
approach do you find simpler?
(Optional) For some extra credit, make two more implementations which use fine grain
synchonization to allow up to data-workers threads to access the data structure at once.
Since multiple threads will be accessing the data structure, you will have to make sure that only
one thread is modifying the entry for a particular hash at a time. Evaluate your performance on
both coarse.txt and fine.txt. Was access to the shared data structure a bottleneck before? How do
channels and locks compare now that there is more parallelism?
Step 3: Parallelize tree comparisons
For simplicity, we recommend you use the faster of your implementations before proceeding.
With your hashes parallelized, it is now time to parallelize the final tree comparisons. To keep
things simple, you can store BST equivalence using a 2D adjacency matrix where array[i]
[j] == true implies the BST with ID i is equivalent to the BST with ID j. After you have
populated your map from hashes to BST IDs and initialized the adjacency matrix to false,
have a single thread traverse it. If a hash is associated with only one BST ID, k, then you can
simply assign true to array[k][k] and proceed. However, if a hash has multiple BST IDs
associated with it, you will have to compare all of their trees for similarity. For your first
implementation, just spawn a goroutine to do each comparison and write the result to the
adjacency matrix if a match is found.
For your second implementation, you will spawn comp-workers threads to do the
comparisons and use a concurrent buffer to communicate with them (work can be represented
with a (BST ID, BST ID) pair of trees to compare). Since this is not a data structures course, it
https://www.cs.utexas.edu/~rossbach/cs380p/lab/go-cs380p.html
3/4
10/10/23, 9:36 PM
CS378 Assignment #3
doesn’t matter how the buffer is implemented as long as you make sure that it only holds up to
comp-workers items at a time and isn’t slow. Your buffer should contain a mutex to prevent
concurrency errors when multiple threads try to access the buffer at once. The buffer should also
contain two conditions to handle the cases where the main thread tries to insert work when the
buffer is full and when the worker threads try to remove work when the buffer is empty. The
threads should not “spin” and repeatedly check if the buffer is no longer empty or full. Evaluate
your performance on coarse.txt (you can also test your code on fine.txt, but the large number of
trees will result in an extremely large adjacency matrix, which will take significantly longer to
access). How do the performance and complexity of both approaches compare with each other?
How do they scale compared to a single thread? Is the additional complexity of managing a
thread pool worthwhile?
Hints
Go includes a lot of useful packages. Argument parsing is trivial with the flag package. Highprecision timing in Go can also be quite easy using the built-in time package. The io/ioutil
package makes reading in file data easy. Once you strip the EOF character, you can parse the file
with the strings and strconv packages. The sync package includes a lot of handy tools, including
mutexes and conditions. It also includes the WaitGroup, which greatly simplifies waiting on a
group of goroutines to finish. Finally, Go contains many familiar data structures, like arrays and
maps. However, you may also want to check out the slice. It combines the power of lists, arrays,
and slicing into a single data structure.
Deliverables
Please follow these supplemental instructions and submission guidelines.
https://www.cs.utexas.edu/~rossbach/cs380p/lab/go-cs380p.html
4/4
10/10/23, 9:37 PM
cs.utexas.edu/~rossbach/cs380p/lab/go-submission-cs380p.html
CS380P: Parallel Systems
Lab #3: BST Comparison
Submission Guide
Read this page first for detailed instructions about how to do this lab.
When computing the hash of a tree, we will use the following function:
// initial hash value
hash = 1;
for each value in tree.in_order_traversal() {
new_value = value + 2;
hash = (hash * new_value + new_value) % 1000
}
The given input data (simple.txt, coarse.txt, and fine.txt) is used for exploring the difference between
implementations to prepare the writeup. Hidden input data will be used by the autograder to test performance
and correctness of each implementation. Your program will be tested for each step of the algorithm. Flags are
used to select different implementations by the autograder. Make sure your program accepts the exact flag(s) for
a particular implementation as detailed below.
You should submit your code and report packaged into a tar file on Canvas. If you do not complete the optional
implementation in step 2.2, name your main go program BST.go. If you do complete it, name it BST_opt.go
instead.
1. Computation of Hash of Each Tree
1.1 Flags
The following flag controls the number of goroutines to compute the hash of each BST
-hash-workers=: spawn hash-workers goroutines to iterate
over the available BSTs to compute the hash.
When -hash-workers is set to 1, you do not have to spawn any goroutines. You can simply compute the hash of
all trees in the main thread. When -hash-workers is the only flag provided, your program should only compute
the hash of each BST without performing the other two steps (identification of hash duplicates and tree
comparison). This is for the autograder to collect performance for this step only.
1.2 Output
For any number of hash workers, your program should output the time it takes to compute the hashes of all
BSTs. Make sure that the time is measured after all goroutines are synchronized. The output should have the
following form
hashTime: xxx
https://www.cs.utexas.edu/~rossbach/cs380p/lab/go-submission-cs380p.html
1/3
10/10/23, 9:37 PM
cs.utexas.edu/~rossbach/cs380p/lab/go-submission-cs380p.html
Note that the unit of the output time is second. Do not include tree construction time.
1.3 Testing
Your program will be tested for the following numbers of hash workers
(1,2,4,8,16,N)
where N is the total number of BSTs.
2. Computation of Hash Groups
2.1 Flags
The following flag controls the synchronization mechanism used by hash-workers goroutines to update the map
-data-workers=: there will be data-workers goroutines
trying to update the map.
Your program should select the proper implementation according to the following combinations of flags:
1. -hash-workers=1 -data-workers=1: This is the sequential implementation. Your program can compute
the hash of all BSTs and update the map in the main thread.
2. -hash-workers=i -data-workers=1(i>1): This implementation spawns i goroutines to compute the
hashes of the input BSTs. Each goroutine sends its (hash, BST ID) pair(s) to a central manager goroutine
using a channel. The central manager updates the map.
3. -hash-workers=i -data-workers=i(i>1): This implementation spawns i goroutines to compute the
hashes of the input BSTs. Each goroutine updates the map individually after acquiring the mutex.
The following combination of flags are for the optional implementation:
-hash-workers=i -data-workers=j(i>j>1): This implementation spawns i goroutines to compute the
hashes of the input BSTs. Then j goroutines are spawned to update the map.
It is up to you how the hash workers are communicating with the j data workers. You can use a
semaphore so that j of the i hash workers can update the map. Or you can use j central managers to
collect (hash, BST ID) pairs from each hash worker via j channels.
When j goroutines try to update the map, make sure each hash entry is not updated by more than
two threads at the same time.
2.2 Output
For each implementation, your program should output the elapsed time as well as the hash groups in the
following form:
hashGroupTime: xxxxxx
hash0: id00 id01 ..
hash1: id10 id11 ..
..
hashi is the ith hash value and idi0, idi1,… are tree ids that have the same hash value hashi. Ids are separated
by spaces. Do not print hash groups that have only one tree. Note that hashGroupTime includes the time for hash
computations.
https://www.cs.utexas.edu/~rossbach/cs380p/lab/go-submission-cs380p.html
2/3
10/10/23, 9:37 PM
cs.utexas.edu/~rossbach/cs380p/lab/go-submission-cs380p.html
Do not show any output from the tree comparison if -comp-workers is not specified.
2.3 Correctness
The autograder will check the output hash groups to verify that both the hash values and corresponding tree ids
match the reference output. Therefore, make sure you are using the hash algorithm from the top of this page to
compute the hash of each tree’s in-order traversal.
2.4 Performance
The autograder will run each implementation 10 times to obtain an average elapsed time to measure
performance. Possible i values include 2, 4, 8, 16, N, where N is the number of BSTs in the input file. For the
optional implementation, possible j values include 2, 4, 8, 16.
3. Tree Comparison
3.1 Flags
The following flag controls the number of goroutines doing the tree comparisons.
-comp-workers=: there will be comp-workers
goroutines spawned to compare pairs of trees.
-comp-workers=1 corresponds to the sequential implementation, in which your program can simply do tree
comparisons in the main thread.
3.2 Output
Your program should measure how much time it takes to compare trees (compareTreeTime) and output the tree
groups in the following form:
compareTreeTime: yyyyyy
group 0: id00 id01 …
group 1: id10 id11 …
..
Do not print groups that have only one tree.
It is ok that your program also outputs the hash groups from the previous step.
3.3 Correctness
The autograder will compare the tree groups with the reference output. Group ids do not have to match.
3.4 Performance
The autograder will execute each implementation 10 times and compute an average elapsed time to measure the
performance of tree comparison. Possible comp-workers values include 1,2,4,8,16.
https://www.cs.utexas.edu/~rossbach/cs380p/lab/go-submission-cs380p.html
3/3