Midterm Exam Preparation Questions

Covers Lesson 1 through Lesson 6

The purpose of these sample questions is to prepare you for success on the midterm exam. These questions have been used on previous exams in this course, and represent the range of difficulty and depth you should expect on the actual midterm. You should expect some exam questions this semester to be similar to these questions, and some to be dissimilar.

We recommend that you use these questions as a companion to your lesson viewing sessions, and use them to help solidify your understanding as you progress through the course. This is generally more effective than reviewing these questions all at once just before the exam. We will release the solutions to these test preparation questions the week of the midterm exam.

Introduction to Software Analysis

Question 1.Static vs. Dynamic Analysis

a.State one advantage of static analysis over dynamic analysis.

b.State one advantage of dynamic analysis over static analysis.

Question 2. We may classify each program as either safe or unsafe. For instance, we may say that a program is safe if it does not dereference a null pointer in any execution, and unsafe if there exists an execution in which it will dereference a null pointer.

A software analysis is considered sound if whenever it reports a program as safe, the program is indeed safe. However, due to the undecidability of software analysis, a sound analysis may reject some safe programs. A sound analysis A is more precise than a sound analysis B if whenever analysis B accepts a program, analysis A also accepts that program.

Consider the above figure. Inside of the dotted (---) oval lies the set of all safe programs, and the outside of the oval denotes the set of all unsafe programs. The inside of each solid oval, labeled

A1 through A5, denotes the set of programs accepted by the corresponding analysis (e.g., these are five different null dereference checking analyses). Each analysis rejects all programs outside its corresponding oval.

Which of these five analyses are sound? List the sound analyses ordered by precision, i.e A6 > A7 > A8 indicates that A6, A7, and A8 are all sound, and A6 is more precise than A7, which is more precise than A8.

Introduction to Software Testing

Question 3. Consider a program P, and two test suites, X and Y for P. Test suite X covers set of branches Q in the program, while test suite Y covers set of branches R in the program. Suppose Q is a proper subset of R (Q  R).Which of the below statements are necessarily true?

A. Whenever a test in Y reaches a statement, some test in X also reaches that statement.

B. Whenever a test in X reaches a statement, some test in Y also reaches that statement.

C. Test suite Y has strictly higher path coverage than test suite X.

D. Test suite Y has strictly higher branch coverage than test suite X.

Question 4.Consider the Java function:

void copy(int[] src, int[] dst, int N) {

for (int i = 0; i < N; i++)

dst[i] = src[i];

}

Which of the below predicates is the function’s weakest possible precondition that prevent any null-pointer or array-out-of-bounds exceptions from being thrown?

NOTE:The expression X => Y is read: “X implies Y”. It is equivalent to (!X) OR Y.

A. N ≥ 0  src != null  dst != null  N < src.length  N < dst.length

B. N ≥ 0  src != null  dst != null  N < src.length  dst.length = src.length

C. N > 0  (src != null  dst != null  N < src.length  N < dst.length)

D. N > 0  (src != null  dst != null  N < src.length  dst.length = src.length)

Question 5. Consider a test suite consisting of three deterministic tests T1, T2, T3 for a correct program P. Since P is correct, it passes all the three tests. Suppose we have three mutants M1, M2, M3 of P such that: M1 fails T1 and T2; M2 fails none; and M3 fails T2 and T3.

Let M = P denote that M and P are equivalent. Likewise, let M != P denote that they are NOT equivalent.

a. Could it be possible that M1 = P? Justify your answer.

b. Mutation analysis will report M2 to the tester since it does not fail any test in the test suite. Which of the following actions are plausible for the tester to take given this information?

A. Determine if M2 == P, and if so, devise a new test case T4 on which M2 passes but P fails.

B. Determine if M2 != P, and if so, then ignore M2.

C. Determine if M2 != P, and if so, devise a new test case T4 on which P passes but M2 fails.

D. Determine if M2 == P, and if so, ignore M2.

Random Testing

Question 6. Consider the following concurrent program with two threads and shared variable x. Variables tmp1 and tmp2 are local to the respective threads. This program has a concurrency bug: it can lose an update to x.

1: tmp1=x 4: tmp2=x

2: tmp1=tmp1+1 5: tmp2=tmp2+1

3: x=tmp1 6: x=tmp2

a.Write one possible execution of the six statements that does not cause a concurrency bug.

b.Write one possible execution of the six statements that does trigger a concurrency bug.

c.What is the depth of the concurrency bug?

d.Specify the ordering constraints needed to trigger the bug.

Question 7. Consider the following pseudo-Java function, in which HashMap is used. A HashMap is a data structure that associates a value of type V to a key of type K. The value

v associated with a key k can be set with the API call put(k, v), and the value associated with the key k is returned by the API call get(k). For this problem, if no value has been associated

with k,then assumeget(k)returns0.

double charRatio(String s, char a, char b) { int N = s.length();

HashMap

char c = s.charAt(i);

int v = counts.get(c);

counts.put(c, v+1);

}

return counts.get(a) / counts.get(b);

}

Describe how you could use a fuzzer to test this function. What bugs would you expect a fuzzer to identify in this function? What bugs would be more challenging for a fuzzer to identify? Explain your reasoning fully, including any assumptions you are making.

Automated Test Generation

Question 8.Consider the following class:

public class Lock {

private final static int UNLOCKED = 0;

private final static int LOCKED = 1;

private int state;

public Lock() { state = UNLOCKED; }

public void acquire() { assert(state == UNLOCKED); state = LOCKED; } public void release() { assert(state == LOCKED); state = UNLOCKED; } public int getState() { return state; }

}

Consider each of the below five tests individually. Answer whether that test can ever possibly be generated by Randoop. If yes, explain whether Randoop 1. Discards the test as illegal, or 2. Reports the test as a bug, or 3.Adds the test to components for future extension.

For simplicity, assume that Randoop does not check for redundancy, and that the contracts it checks (not shown for brevity) are standard ones (e.g., equals and hashCode).

 # Test Can ever be generated? If yes, what outcome? 1. Lock l = new Lock(); A.Yes B.No 1.Discards 2.Reports 3.Adds l.acquire(); l.release(); Lock l = new Lock(); 2. int s = l.getState(); A.Yes B.No 1.Discards 2.Reports 3.Adds if (s == 0) l.acquire(); 3. Lock l = new Lock(); A.Yes B.No 1.Discards 2.Reports 3.Adds l.release(); 4. Lock l = new Lock(); A.Yes B.No 1.Discards 2.Reports 3.Adds l.release(); l.acquire(); Lock l = new Lock(); 5. l.acquire(); A.Yes B.No 1.Discards 2.Reports 3.Adds l.release(); l.acquire();

Dataflow Analysis

Question 9. Show the final result of performing reaching-definitions analysis on the given control-flow graph of the following program. That is, list the contents of IN[k] and OUT[k] for each k from 1 to 8.

a := 0; b := 1; c := 0;

while (a != b) {

b := a;

c := c + 1;

if (c < 100)

a := a + 1;

}

 Node IN OUT 1 a, ? , b, ? , c, ? 2 3 4 5 6 7 8

Question 10. Answer the following questions about dataflow analysis for the simple WHILE language from the lecture on dataflow analysis.

a. The order in which the nodes in the control-flow graph are processed by the inner loop “for (each node n)” of the chaotic iteration algorithm can affect which of the following aspects of the analysis?

A. Soundness B. Completeness C. Performance D. Termination

b. What is the most efficient order in which to visit the statements in the below program (i.e., nodes in its control-flow graph) for a reaching-definitions analysis?

S1; { if (...) then S2 else S3 }; S4

A. S1, S2, S3, S4 B. S4, S2, S3, S1 C. S1, S4, S2, S3 D. S2, S3, S1, S4

c. What is the most efficient order in which to visit the statements in the below program (i.e., nodes in its control-flow graph) for a live-variables analysis?

S1; { if (...) then S2 else S3 }; S4

A. S1, S2, S3, S4 B. S4, S2, S3, S1 C. S4, S2, S1, S3 D. S1, S2, S4, S3

d. Suppose the given program has N statements (i.e., nodes in the control-flow graph) and V variables. How much memory does a live-variables analysis consume on this program in the worst case at any instant? Write the size of memory as an expression in terms of N and V. Give as tight a bound as possible. Remember that the analysis must store both the “in” and “out” information for each statement.

Pointer Analysis

Question 11. Consider a flow-insensitive, context-insensitive, but field-sensitive pointer analysis of the following program:

 class Node { boolean add(Node this, int q) { int v; int p = this.v; Node left, right; if (q < p) { Node(int v) { this.v = v; } Node n = this.left; } if (n == null) { Node l = new Node(q); // h2 this.left = l; static void main() { return true; Node root = new Node(5);   // h1 } for (int i = 0; i < 10; i++) return add(n, q); add(root, i); } } if (q > p) { Node m = this.right; if (m == null) { Node r = new Node(q); // h3 this.right = r; return true; } return add(m, q); } return false; }

a. A flow-insensitive pointer analysis analyzes an (unordered) set of simple statements in the given program. Write the statements contained in this set for the above program (in the main and add functions together). Recall that this set includes only 4 kinds of statements: allocation (v = new h, where h is the commented label), copy (v1 = v2), field read (v1 = v2.f), and field write (v1.f = v2). The allocation statements are listed below for your convenience. You will lose points for listing statements not analyzed by this analysis.

Relevant allocation statements: root = new h1, l = new h2, r = new h3

Relevant copy statements: __________________________________________________

Relevant field-write statements: __________________________________________________

b.For each pair of expressions below, write all correct choices among 1-3 below.

1. They do NOT point to the same object in any (concrete) run of the program.

2. The above pointer analysis proves using allocation-site based heap abstraction that they CANNOT point to the same object in any run of the program.

3. The above pointer analysis proves using the type based heap abstraction that they CANNOT point to the same object in any run of the program.

 A. root, root.right 1 2 3 B. root.left, root.right 1 2 3 C. root.left.right, root.right.left 1 2 3 D. root.left.left, root.right.right 1 2 3 E. root.left.right, root.right.right 1 2 3 F. root.left.right.left, root.right.left 1 2 3

Solutions to Midterm Exam Preparation Questions

In the following document, the text in blue is the solution to the question above. In some cases, there may be other responses that would be considered correct or partially correct.

Introduction to Software Analysis

Question 1.Static vs. Dynamic Analysis

a.State one advantage of static analysis over dynamic analysis.

Answer: Static analysis may achieve soundness - it is possible to design an analysis that does not miss an error in a program, even if some of errors reported may be false positives.

Since static analysis does not require the program to be run, the cost to analyze a piece of software is proportional to its code size, not it’s runtime.

Since static analysis does not require the program to be run, it can be performed on a machine without requiring that machine to be able to run the code.

b.State one advantage of dynamic analysis over static analysis.

Answer: Dynamic analysis may achieve completeness - it is possible to design an analysis that will only report errors that can be triggered in a concrete execution of the program, even if some errors may be undiscovered due to limitations on the number of runs inspected.

Since dynamic analysis is performed by observing a concrete run of a program, bugs found using this method are typically easy to report / reproduce.

Note that there may be other acceptable answers to both parts of this question.

Question 2. We may classify each program as either safe or unsafe. For instance, we may say that a program is safe if it does not dereference a null pointer in any execution, and unsafe if there exists an execution in which it will dereference a null pointer.

A software analysis is considered sound if whenever it reports a program as safe, the program is indeed safe. However, due to the undecidability of software analysis, a sound analysis may reject some safe programs. A sound analysis A is more precise than a sound analysis B if whenever analysis B accepts a program, analysis A also accepts that program.

Consider the above figure. Inside of the dotted (---) oval lies the set of all safe programs, and the outside of the oval denotes the set of all unsafe programs. The inside of each solid oval, labeled A1 through A5, denotes the set of programs accepted by the corresponding analysis (e.g., these are five different null dereference checking analyses). Each analysis rejects all programs outside its corresponding oval.

Which of these five analyses are sound? List the sound analyses ordered by precision, i.e A6 > A7 > A8 indicates that A6, A7, and A8 are all sound, and A6 is more precise than A7, which is more precise than A8.

Answer: A1 and A2 are the only analyses that can be considered sound. The other analyses include unsafe programs (their oval covers regions outside the dotted oval) in the set of programs they will accept, which means the analysis is unsound.

A2 > A1, because A2 will accept any program that is safe that will also be accepted by A1, and more (A2 completely contains A1 in the diagram).

Introduction to Software Testing

Question 3. Consider a program P, and two test suites, X and Y for P. Test suite X covers set of branches Q in the program, while test suite Y covers set of branches R in the program. Suppose Q is a proper subset of R (Q  R).Which of the below statements are necessarily true?

A. Whenever a test in Y reaches a statement, some test in X also reaches that statement.

B. Whenever a test in X reaches a statement, some test in Y also reaches that statement.

C. Test suite Y has strictly higher path coverage than test suite X.

D. Test suite Y has strictly higher branch coverage than test suite X.

C is not a correct answer. It is possible for Y to cover more branches in a program than X without covering more paths through the program.

Question 4.Consider the Java function:

void copy(int[] src, int[] dst, int N) {

for (int i = 0; i < N; i++)

dst[i] = src[i];

}

Which of the below predicates is the function’s weakest possible precondition that prevent any null-pointer or array-out-of-bounds exceptions from being thrown?

NOTE:The expression X => Y is read: “X implies Y”. It is equivalent to (!X) OR Y.

A. N ≥ 0  src != null  dst != null  N < src.length  N < dst.length

B. N ≥ 0  src != null  dst != null  N < src.length  dst.length = src.length

C. N > 0  (src != null  dst != null  N < src.length  N < dst.length)

D. N > 0  (src != null  dst != null  N < src.length  dst.length = src.length)

Question 5. Consider a test suite consisting of three deterministic tests T1, T2, T3 for a correct program P. Since P is correct, it passes all the three tests. Suppose we have three mutants M1, M2, M3 of P such that: M1 fails T1 and T2; M2 fails none; and M3 fails T2 and T3.

Let M = P denote that M and P are equivalent. Likewise, let M != P denote that they are NOT equivalent.

a. Could it be possible that M1 = P? Justify your answer.

b. Mutation analysis will report M2 to the tester since it does not fail any test in the test suite. Which of the following actions are plausible for the tester to take given this information?

A. Determine if M2 == P, and if so, devise a new test case T4 on which M2 passes but P fails.

B. Determine if M2 != P, and if so, then ignore M2.

C. Determine if M2 != P, and if so, devise a new test case T4 on which P passes but M2 fails.

D. Determine if M2 == P, and if so, ignore M2.

Random Testing

Question 6. Consider the following concurrent program with two threads and shared variable x. Variables tmp1 and tmp2 are local to the respective threads. This program has a concurrency bug: it can lose an update to x.

1: tmp1=x 4: tmp2=x

2: tmp1=tmp1+1 5: tmp2=tmp2+1

3: x=tmp1 6: x=tmp2

a.Write one possible execution of the six statements that does not cause a concurrency bug.

Answer: Any order which performs a read and a write before the second read, e.g., 1 2 3 4 5 6 OR 4 5 6 1 2 3.

b.Write one possible execution of the six statements that does trigger a concurrency bug.

Answer: Any order which performs both reads before any write, e.g., 1 4 2 5 3 6.

c.What is the depth of the concurrency bug?

d.Specify the ordering constraints needed to trigger the bug.

Question 7. Consider the following pseudo-Java function, in which HashMap is used. A HashMap is a data structure that associates a value of type V to a key of type K. The value

v associated with a key k can be set with the API call put(k, v), and the value associated with the key k is returned by the API call get(k). For this problem, if no value has been associated

with k,then assumeget(k)returns0.

double charRatio(String s, char a, char b) { int N = s.length();

HashMap

char c = s.charAt(i);

int v = counts.get(c);

counts.put(c, v+1);

}

return counts.get(a) / counts.get(b);

}

Describe how you could use a fuzzer to test this function. What bugs would you expect a fuzzer to identify in this function? What bugs would be more challenging for a fuzzer to identify? Explain your reasoning fully, including any assumptions you are making.

Answer: There are three main bugs in this program:

the possibility of a null dereference if s == null,

the possibility of a division by zero when b doesn't appear in the input string, and

the possibility of counts.get(a) / counts.get(b) not equaling the correct ratio of a's to b's in the input string due to integer division.

Fuzzing would likely quickly detect the division-by-zero error by generating a string s with no instances of the char b. This would depend on the implementation of the fuzzer. If the fuzzer only generated '0' and '1's, then it would likely be difficult for b not to appear in s. On the other hand, if the fuzzer uniformly generated legal strings of length 100 from the ASCII character set (and b were uniformly chosen from the ASCII character set), then there would be a 45% chance of the bug being triggered. (As the length of the input string grows, the probability of selecting b so that b were not in s would vanish quickly, though.)

The null-dereference error would also likely be caught by the fuzzer using a similar argument.

The integer division bug would be harder to detect, as fuzzing does not generally entail matching the output of a function against an expected output: we usually just give the function random strings until we detect a crash.

Automated Test Generation

Question 8.Consider the following class:

public class Lock {

private final static int UNLOCKED = 0;

private final static int LOCKED = 1;

private int state;

public Lock() { state = UNLOCKED; }

public void acquire() { assert(state == UNLOCKED); state = LOCKED; } public void release() { assert(state == LOCKED); state = UNLOCKED; } public int getState() { return state; }

}

Consider each of the below five tests individually. Answer whether that test can ever possibly be generated by Randoop. If yes, explain whether Randoop 1. Discards the test as illegal, or 2. Reports the test as a bug, or 3.Adds the test to components for future extension.

For simplicity, assume that Randoop does not check for redundancy, and that the contracts it checks (not shown for brevity) are standard ones (e.g., equals and hashCode).

 # Test Can ever be generated? If yes, what outcome? 1. Lock l = new Lock(); A.YesB.No 1.Discards  2.Reports3.Adds l.acquire(); l.release();

 Lock l = new Lock(); 2. int s = l.getState(); A.Yes B.No 1.Discards 2.Reports 3.Adds if (s == 0) l.acquire(); 3. Lock l = new Lock(); A.Yes B.No 1.Discards 2.Reports 3.Adds l.release(); 4. Lock l = new Lock(); A.Yes B.No 1.Discards 2.Reports 3.Adds l.release(); l.acquire(); Lock l = new Lock(); 5. l.acquire(); A.Yes B.No 1.Discards 2.Reports 3.Adds l.release(); l.acquire();

1. A, 3

2. B

3. A, 1

4. B

5. A, 3

Dataflow Analysis

Question 9. Show the final result of performing reaching-definitions analysis on the given control-flow graph of the following program. That is, list the contents of IN[k] and OUT[k] for each k from 1 to 8.

a := 0; b := 1; c := 0;

while (a != b) {

b := a;

c := c + 1;

if (c < 100)

a := a + 1;

}

 Node IN OUT 1 a, ? , b, ? , c, ? a,1 , b, ? , c, ? 2 a, 1 , b, ? , c, ? a, 1 , b, 2 , c, ? 3 a, 1 , b, 2 , c, ? a, 1 , b, 2 , c, 3 4 a, 1 , b, 2 , c, 3 , a, 1 , b, 2 , c, 3 , a, 8 , b, 5 , c, 6 a, 8 , b, 5 , c, 6

 5 a, 1 , b, 2 , c, 3 , a, 1 , c, 3 , a, 8 , b, 5 , c, 6 a, 8 , b, 5 , c, 6 6 a, 1 , c, 3 , a, 1 , a, 8 , b, 5 , c, 6 a, 8 , b, 5 , c, 6 7 a, 1 , a, 1 , a, 8 , b, 5 , c, 6 a, 8 , b, 5 , c, 6 8 a, 1 , a, 8 , b, 5 , c, 6 a, 8 , b, 5 , c, 6

Question 10. Answer the following questions about dataflow analysis for the simple WHILE language from the lecture on dataflow analysis.

a. The order in which the nodes in the control-flow graph are processed by the inner loop “for (each node n)” of the chaotic iteration algorithm can affect which of the following aspects of the analysis?

A. Soundness B. Completeness C. Performance D. Termination

b. What is the most efficient order in which to visit the statements in the below program (i.e., nodes in its control-flow graph) for a reaching-definitions analysis?

S1; { if (...) then S2 else S3 }; S4

A. S1, S2, S3, S4 B. S4, S2, S3, S1 C. S1, S4, S2, S3 D. S2, S3, S1, S4

c. What is the most efficient order in which to visit the statements in the below program (i.e., nodes in its control-flow graph) for a live-variables analysis?

S1; { if (...) then S2 else S3 }; S4

A. S1, S2, S3, S4 B. S4, S2, S3, S1 C. S4, S2, S1, S3 D. S1, S2, S4, S3

d. Suppose the given program has N statements (i.e., nodes in the control-flow graph) and V variables. How much memory does a live-variables analysis consume on this program in the worst case at any instant? Write the size of memory as an expression in terms of N and V. Give as tight a bound as possible. Remember that the analysis must store both the “in” and “out” information for each statement.

Answer: 2 * N * V

Pointer Analysis

Question 11. Consider a flow-insensitive, context-insensitive, but field-sensitive pointer analysis of the following program:

 class Node { boolean add(Node this, int q) { int v; int p = this.v; Node left, right; if (q < p) { Node(int v) { this.v = v; } Node n = this.left; } if (n == null) { Node l = new Node(q); // h2 this.left = l; static void main() { return true; Node root = new Node(5);   // h1 } for (int i = 0; i < 10; i++) return add(n, q); add(root, i); } } if (q > p) { Node m = this.right; if (m == null) { Node r = new Node(q); // h3 this.right = r; return true; } return add(m, q); } return false; }

a. A flow-insensitive pointer analysis analyzes an (unordered) set of simple statements in the given program. Write the statements contained in this set for the above program (in the main and add functions together). Recall that this set includes only 4 kinds of statements: allocation (v = new h, where h is the commented label), copy (v1 = v2), field read (v1 = v2.f), and field write (v1.f = v2). The allocation statements are listed below for your convenience. You will lose points for listing statements not analyzed by this analysis.

Relevant allocation statements: root = new h1, l = new h2, r = new h3

Relevant copy statements: __________________________________________________

Relevant field-write statements: __________________________________________________

Relevant copy statements: this = root,

this = n,

this = m

Relevant field-read statements: n = this.left,

m = this.right

Relevant field-write statements:this.left = l,

this.right = r

b.For each pair of expressions below, write all correct choices among 1-3 below.

1. They do NOT point to the same object in any (concrete) run of the program.

2. The above pointer analysis proves using allocation-site based heap abstraction that they CANNOT point to the same object in any run of the program.

3. The above pointer analysis proves using the type based heap abstraction that they CANNOT point to the same object in any run of the program.

 A. root, root.right 1 2 3 B. root.left, root.right 1 2 3 C. root.left.right, root.right.left 1 2 3 D. root.left.left, root.right.right 1 2 3 E. root.left.right, root.right.right 1 2 3 F. root.left.right.left, root.right.left 1 2 3

A. 1, 2

B. 1, 2

C. 1, 2

D. 1, 2

E. 1

F. 1