Lab 4: Datalog
Datalog代写 n this lab, you will implement two static analyses using a declarative logic programming language, Datalog.Declarative
Spring Semester 2018
Due: March 19, 8:00 a.m. Eastern Time Corresponding Lecture: Lesson 7 (Constraint Based Analysis)
In this lab, you will implement two static analyses using a declarative logic programming language, Datalog. Declarative languages like Datalog are different from imperative languages like C in that they specify what the program should accomplish but not how the program should accomplish it.
For those without experience in declarative languages, Datalog relations are similar to SQL (Structured Query Language) queries. The analyses you will implement are modified-reference (mod-ref) analysis and method-escape analysis. Both analyses are flow-insensitive, context-insensitive, may (as opposed to must), and interprocedural (requiring reasoning across method boundaries).
- Petablox repository:
- Petablox user guide:
- How to write an analysis in Datalog:
- Predefined relations:
- Download zipfrom Canvas. When uncompressed, it will produce the directory
- Build the example Java program to be analyzed (TestCase) by running antin the directory Datalog/examples/test_case/. This will produce the subdirectory classes/.
- Write your mod-ref analysis in the file Datalog/src/modref.dlogand your method-escape analysis in the file Datalog/src/escape.dlog. See below for more details on these analyses.Datalog代写
- To run your analyses, run the following commands in Datalog/:
- You can find the output of the analyses under the directory
- Sample outputs for the test cases are provided inDatalog/sample_output/.
- We have provided a script shthat you can use to verify your analyses against the provided sample output. From the Datalog/ directory, grant the script execute permissions with the command chmod +x compare.sh and then use the command./compare.sh modref or ./compare.sh escape to execute the script. The script output will indicate how closely your most recent analysis run matches the sample solutions. Note that you may receive an error for test cases you have not yet executed.Datalog代写
- Both analyses will use a pre-existing, flow- and context-insensitive pointer analysis with on-the-fly call graph construction (henceforth called pointer/call-graph analysis). This analysis is provided in Datalog/cipa_0cfa.dlog. You can study it to learn how to write your own analyses in Datalog, however you should not spend time trying to understand each relation defined in this analysis. It declares all the relations (both input and output) that you will need to write your analyses. Of particular interest are:
- reachable methods in the relation – reachableM
- the call graph in the relationIM
- points-to information in the relations VH, FH, andHFH
- the relation specifying the containing method of each call site -MI
NOTE: Any changes you make to this file will not be reflected when you run your analyses. This file is provided as reference only. Petablox uses an analogous file in petablox.jar for the purpose of executing analyses.
Part 1: Modified-Reference (mod-ref) Analysis
The goal of mod-ref analysis is to determine which memory locations may be modified and which memory locations may be referenced during the execution of a method. The execution of a method includes the execution of any methods called by that method, directly or transitively.Datalog代写
Your analysis must consider each reachable method (that is, for each method in relationreachableM).
The output of your analysis must be in the form of the following four relations:
- The relation refStatField⊆ (M × F) containing each tuple (m, f) such that pointer-typed static field f may be read during the execution of method m.
- The relation modStatField⊆ (M × F) containing each tuple (m, f) such that
pointer-typed static field f may be written to during the execution of method m.
- The relation modInstField⊆ (M × H × F) containing each tuple (m, h, f) such that the pointer-typed instance field f of an object allocated at site h may be written to during the execution of method m. Note that f may be the element 0 in the domainF, in which case it denotes some pointer-typed element of an array allocated at site h.Datalog代写
- Therelation refInstField ⊆ (M × H × F) containing each tuple (m, h, f) such that pointer-typed instance field f of an object allocated at site h may be read during the execution of method m. As in the preceding item, f may denote a pointer-typed array element.
Use the following relations to write your analysis.They are also used as input relations in the pointer/call-graph analysis provided to you:
- The relation MgetStatFldInst⊆ (M × V × F) contains each tuple (m, v, f) such that the body of method m contains the instruction v = f, where f is a static field of pointer type.
- The relation MputStatFldInst⊆ (M × F × V) contains each tuple (m, f, v) such that the
body of method m contains the instruction f = v, where f is a static field of pointer type.
- The relation MgetInstFldInst⊆ (M × V × V × F) contains each tuple (m, v, b, f) such that the body of method m contains the instruction v = b.f, where f is either an instance field of pointer type or it is the distinguished hypothetical field (denoted by element 0 in domain F) that designates any array element.Datalog代写
- The relation MputInstFldInst⊆ (M × V × F × V) contains each tuple (m, b, f, v) such that the body of method m contains the instruction f = v, where f is either an instance field of pointer type or it is the distinguished hypothetical field (denoted by element 0 in domain F) that designates any array element.
You can also use the output relations of the pointer/call-graph analysis provided to you. You will need to use both call-graph information (to determine method reachability) and points-to information (to determine the sites that may allocate objects whose instance fields or array elements may be read/written by a particular method). You can find the descriptions of these relations at the link provided in the Resources section above.
Datalog supports recursive relations. A relation which is used in its own definition is recursive. Recursive relations must also have a base case. For example, a recursive relation P defined in terms of other relations Q and R may look something like this (the second relation is the base case):
P(i) :- P(i), Q(i). P(i) :- R(i). Datalog代写
The execution of a method includes the execution of any methods that are called by it directly or transitively. You can use relations MI and IM recursively to determine such methods.
For this lab, we will ignore reporting static fields, instance fields, and array elements ofnon-pointer type (called primitive types in Java, e.g., boolean, int, double, etc.) that may be read or written during method execution. The above relations such as MgetStatFldInst, etc. already ignore such fields.
Part 2: Method-Escape Analysis Datalog代写
sis is to determine for each reachable method m and for each object allocation site h in the body of m, whether any object allocated at h in any invocation of m may outlive that invocation. If so, the site is called method-escaping; if not, it is called method-local and all objects created at that site at runtime can be allocated on the method invocation stack (instead of being allocated in the heap, which is more expensive).
An object may outlive the invocation of its allocating method if any of the following conditions hold:
- An object allocated at site h may become reachable from some global variable (i.e., a static field in Java).
- An object allocated at site h may become reachable from some argument of method m.
- Anobject allocated at site h may become reachable from some return variable of method m.Datalog代写
Your analysis must consider each reachable method (that is, for each method in relation reachableM). The output of your analysis must be a relation localMH ⊆ (M × H) containing each tuple (m, h) such that allocation site h contained in the body of reachable method m is method-local.
Use the following relations to write your analysis.Datalog代写
- The relation MmethArg⊆ (M × Z × V) contains each tuple (m, z, v) such that variable v is the zth actual argument of method m. (Z denotes the set of integers.)
- The relation MmethRet⊆ (M × Z × V) contains each tuple (m, z, v) such that variable v is
the return value of method m (you can ignore z).
In addition, you may use the output relations of the pointer/call-graph analysis provided to you and described above. You will need to use both call-graph information (to determine method reachability) and points-to information (to determine the sites that may allocate objects that may escape). You can find the descriptions of these relations at the link provided in the Resources section above.
Note that this analysis requires reasoning about reachability in the heap (as exemplified by the use of the word “reachable” in the above three items); use the relation HFH recursively for this purpose. The relations VH and FH will provide the base cases for the recursion.
Items to Submit Datalog代写
Submit the following files in a single compressed file (.zip format) named lab4.zip:
- (50 points)dlog
- (50 points)dlog
Generally, credit for this lab is awarded as follows:
If the correct output set for Analysis X contains n entries and your output set contains a of those correct entries and b spurious entries, then you will receive max(0, (a-b)/n) of the points for Analysis X, using floating-point division.
Your datalog analyses will be graded against several programs; the three provided benchmark programs will constitute at least half of your grade but other programs will be run against your analysis as well. In general, the programs used in grading but not provided as part of the lab are of the same order of complexity as the provided programs.Datalog代写
While efficiency is important, it is entirely secondary to correctness. While you may lose some points for an inefficient yet correct algorithm, you will not gain points for an efficient yet
incorrect algorithm. There is a time limit several times longer than the expected execution time for an efficient algorithm in the grader. Any analysis that has not completed at the end of this time limit will not be considered correct as will any analysis that causes a crash.