CSCI-UA.0480-003 Parallel Computing
Homework Assignment 2
1.
We briefly discussed how caches are designed. One cache design characteristic is whether a cache is write-back (a modified cache block is written to the lower-level cache only when the block is replaced) or write-through (every update to a cache block also updates the copy in the lower-level cache). Discuss the pros and cons of each.
2.
Suppose a processor has to wait 10 cycles for its memory system to provide a 64-bit word. What can we do to reduce this delay when we have several loads coming from the processor to the memory? Ignore the existence of caches for now.
3.
Suppose we have a system with three levels of caches: L1 is closest to the processor, L2 is below it, and L3 is the last level before main memory. We know that two main characteristics of cache performance are cache access latency (how long does the cache take to respond with a hit or miss?) and cache hit rate (how many of the cache accesses are hits?). As we go from L1 to L2 to L3, which of the two characteristics becomes more important, and why?
4.
A sequential application in which 20% of the work must be executed sequentially needs to be accelerated three-fold. How many CPUs are required for this task? What about a five-fold speedup?
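As a reminder (not stated in the question itself), this kind of problem is governed by Amdahl's law, which, for a serial fraction s and p processors, gives the speedup as:

```latex
% Amdahl's law: speedup S with p processors when a fraction s of the
% work is inherently serial and the remaining (1 - s) parallelizes
% perfectly across the p processors.
S(p) = \frac{1}{\,s + \dfrac{1-s}{p}\,}
```

Solving S(p) = k for p with s = 0.2 yields the required processor count (or shows that no finite p suffices, since S(p) is bounded above by 1/s as p grows).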
5.
How does the cache coherence protocol affect performance, and why?
6.
In slide 11 of the performance analysis lecture (lecture 6), we saw that the speedup increases as the number of processes increases.
a. Why does the “double size” curve seem better than the other two?
b. If we keep increasing the number of processes (i.e., extending the x-axis), what do you think the rest of the curve will look like?
7.
In slide 12 of the performance analysis lecture (lecture 6), efficiency decreases as the number of processes increases. Why is that?
8.
What is the relationship between synchronization points in a parallel program and load balancing? Explain why this relationship exists.
9.
In slide 12 of lecture 5, we found that parallelism = 6.25, yet the graph shows 8 paths that can be executed in parallel. How can you explain this discrepancy?
10.
Suppose we have two threads doing the same operations but on different data. Also suppose these two threads execute on two different cores, and each core runs nothing but its assigned thread. If the two threads start at the same time, can we assume they will always finish at the same time? Justify your answer.
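One way to build intuition for this question is to try it empirically. The sketch below (a hypothetical experiment, not part of the assignment; note that CPython's GIL serializes the threads and we do not pin them to cores, so it is only an approximation of the scenario described) times two threads doing identical work on different data:

```python
# Time two threads running the same operation on different data and
# compare how long each one takes to finish.
import threading
import time

durations = [0.0, 0.0]
results = [0, 0]

def work(data, idx):
    # Same operation on different data: a sum of squares.
    t0 = time.perf_counter()
    results[idx] = sum(x * x for x in data)
    durations[idx] = time.perf_counter() - t0

a = list(range(100_000))
b = list(range(100_000, 200_000))

t1 = threading.Thread(target=work, args=(a, 0))
t2 = threading.Thread(target=work, args=(b, 1))
t1.start(); t2.start()
t1.join(); t2.join()

# Even for identical instruction streams, the measured durations
# typically differ slightly from run to run (OS scheduling, interrupts,
# cache behavior, DVFS), so equal finish times cannot be assumed.
print(durations[0], durations[1])
```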