
COMP122 Assessment 2

Worth 25% of the final course mark.

Learning Outcomes. This assessment addresses the following learning outcomes of COMP122:

Describe the concept of object polymorphism in theory and demonstrate this concept in practice.

Design and code iterators for collection-based data management.

Identify and describe the task and issues involved in the process of developing interactive products for people, and the techniques used to perform these tasks.

Part I (50% of assessment mark)

Introduction

Note: Students who took COMP105 will recognize this problem to some extent. We are going to solve it here without the use of Haskell’s fancy functional programming methods. However, all the background you need about this problem is given here, so don’t worry if you haven’t seen this before.

The Caesar cipher. The Caesar cipher is an ancient method of encrypting text, i.e. attempting to transform text into a format that is unreadable by others. The Caesar cipher is defined by a “shift”, i.e. a number that is used to shift letters elsewhere in the alphabet. (Caesar would have been operating in Latin, we will be working in English.)

To encrypt your text using a given (positive) shift, you translate letters by that many places later in the alphabet.

For example, if your text to encrypt is “Meet me at midnight under the bridge” and your shift is 5, the encrypted text is “Rjjy rj fy rnisnlmy zsijw ymj gwnilj”. In this case, the letter “m” gets translated five places later in the alphabet, to the letter “r”, the “e” to a “j” (five places later in the alphabet), etc. When we run past the end of the alphabet we “wrap around”, so a “z” gets changed to an “e” given a shift of 5.

We can interpret a negative value for the shift as translating letters backwards (e.g. an “f” gets encrypted as the letter “b” if the shift is -4).

(The above figure was taken from https://phoenixsic.wordpress.com/2011/11/29/the-art-of-deciphering-encrypted-codes-cryptanalysis/.)

It is believed that Caesar actually used such a “shift cipher”, but this was likely also aided by the fact that many people at that time would have been illiterate, or barely literate, so they might think messages were written in a foreign language.

Unfortunately for Caesar, this type of cipher is easily broken by the use of “frequency analysis”. In other words, we know the general frequency of the occurrence of letters in the English alphabet, e.g. “e” is the most common letter, then “t”, etc. Caesar would have been working in Latin, whereas we are working in English, but the ideas are the same. We would want to know, or guess, the language that the person is using, since letter frequencies can vary significantly across different languages.

Cracking the Caesar cipher. How could we go about cracking this cipher? Suppose we are given a “cipher text”, i.e. text that has already been encrypted with some (unknown) shift, and we want to determine the original unencrypted text (typically referred to as the “plaintext”). For example, given the cipher text “Hxotm euax ycuxj yotik ck gzzgiq gz jgct”, how can we reconstruct the original text?

1. Try decoding the text with a particular shift.

2. Compute the letter frequencies for that decoded text.

3. Check to see if the frequencies are close to the English frequencies.

4. Examine all 26 possible shifts, and take the one that is “closest” to English frequencies.

How do we measure “closeness” of the letter frequencies in the given text to those of regular English? If freq denotes the frequency of a letter in our text, and English is the corresponding English letter frequency, we use the χ² score (that is the Greek letter “chi”, so it’s the “chi-squared score”). This score is defined as follows:

$$\chi^2 = \sum_{\alpha = a}^{z} \frac{(\text{freq}_\alpha - \text{English}_\alpha)^2}{\text{English}_\alpha}$$

In other words, we sum the fraction over all 26 possible letters to determine this score. The χ² score will be lower when the frequencies are closer to English. Note that when we do this, we are ignoring the case of letters (we want to treat upper and lower case equally for our purposes).

Note that we don’t have to actually perform the shift and then compute the frequencies, we can compute the frequencies of the cipher text and then shift those frequencies.

We will be using the following known frequencies for the letters in English, conveniently given here in a Java-style array (arranged in the usual alphabetic order):

```
double[] knownFreq = {0.0855, 0.0160, 0.0316, 0.0387, 0.1210,
                      0.0218, 0.0209, 0.0496, 0.0733, 0.0022,
                      0.0081, 0.0421, 0.0253, 0.0717, 0.0747,
                      0.0207, 0.0010, 0.0633, 0.0673, 0.0894,
                      0.0268, 0.0106, 0.0183, 0.0019, 0.0172,
                      0.0011};
```

Requirements

For this problem, you want to develop a program that will crack a given Caesar cipher text and display the original plain text on the screen. You may assume that the cipher text has been produced using a Caesar cipher by some (unknown) shift. This shift has only been applied to letters, not anything else that may appear in the text, so spaces are spaces, any punctuation is left untouched, etc. Lower case letters are transformed to lower case letters, and upper case letters to upper case letters.

I don’t want to tell you exactly how to go about doing this, but you will need to develop a method to encrypt (or decrypt) a string given a shift. (After all, you need to give the original unencrypted plaintext.)

In Java, char and int variables are (more or less) interchangeable. A Java statement like

`int diff = 'e' - 'b';`

is perfectly legal, i.e. Java can interpret the “difference of two letters” with no problem, and this will give an integer value. If the two letters are of the same case, then this will give a value between -25 and 25. In particular, if ch is a lower case (char) letter, then

`int diff = ch - 'a';`

tells you how many places after ‘a’ that letter is in the alphabet. (The value of diff will be between 0 and 25 inclusive.)

We can use this idea in encrypting/decrypting letters. Assuming that shift is a nonnegative integer, we can encrypt the (lower case) letter ch by doing the following:

```
char newChar = (char) ((ch - 'a' + shift) % 26 + 'a');
```

What is this doing? First we find out the number of characters after ‘a’ for the letter ch, and add the shift. The % operator is “mod”, so we get the remainder left over after dividing by 26. That is doing the “wrap around”. Then we turn this back into a char by “adding” the letter ‘a’ and typecasting to a char variable.

Using the above procedure, we can then encrypt any lower case letter. To encrypt an upper case letter, we can do the same except replace ‘a’ by ‘A’.

How do we encrypt both lower and upper case letters? Let ch be any character (which could also be a space or other non-alphabetic character). If ch - 'a' is between 0 and 25 (inclusive), then ch is a lower case letter, and we encrypt as above.

Alternatively, if ch - 'A' is between 0 and 25 (inclusive), we encrypt ch similarly to get a new upper case letter.

Otherwise, ch is not a letter we want to change, so we leave it alone.

How do we put the ideas together to encrypt (or decrypt) a whole string? Recall a previous practical where we talked about some Java String methods. In particular, if str is a String, then str.charAt(i) gives you the char at index i in str. (Also remember that Java strings use 0-based indices.)

Can you write a Java method to encrypt (or decrypt) a String given a shift? Note that if the shift is a negative number, add 26 to it until you get a positive number. (This is because the Java “mod” operator % returns a negative result when its left operand is negative, which would break the wrap-around.) Note that shifting by -5, say, is the same as shifting by 21, i.e. shifting back 5 places is the same as shifting forward by 21 places.
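As a rough illustration of the ideas so far, here is a minimal sketch of a method that shifts a whole string. The class and method names (`CaesarShiftSketch`, `encrypt`) are assumptions made for this example, and the shift is normalised with a double `%` rather than by repeatedly adding 26, which gives the same value in the range 0 to 25.

```java
class CaesarShiftSketch {
    // Encrypt text with the given shift; a negative shift moves letters backwards.
    static String encrypt(String text, int shift) {
        // Bring the shift into the range 0..25 so that % below never sees a negative value.
        int s = ((shift % 26) + 26) % 26;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch >= 'a' && ch <= 'z') {
                out.append((char) ((ch - 'a' + s) % 26 + 'a'));
            } else if (ch >= 'A' && ch <= 'Z') {
                out.append((char) ((ch - 'A' + s) % 26 + 'A'));
            } else {
                out.append(ch); // spaces, digits and punctuation are left untouched
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Shift forward by 5, then undo it by shifting by -5.
        String cipher = encrypt("Meet me at midnight under the bridge", 5);
        System.out.println(cipher);
        System.out.println(encrypt(cipher, -5));
    }
}
```

Decryption needs no separate method here: calling `encrypt` with the negated shift reverses the encryption.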

Once you have a method to encrypt/decrypt with a shift, you’re halfway done. You need to count letter frequencies. Remember that frequencies are fractional values. In the string “Mississippi moon”, the frequency of the letter “m” is 2/15, while the frequency of the letter “s” is 4/15. As stated before, we consider upper and lower case to be the same in this case, and we also only count the letters (that’s why the denominator is 15, not 16). To compare to English frequencies, we need to find the frequency for all 26 letters to compute the χ² score. (Many will be 0, but we still need all of them.)
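One possible way to count the frequencies is sketched below. The names `LetterFrequencySketch` and `letterFrequencies` are hypothetical; the method returns an array of 26 fractions indexed from ‘a’ (0) to ‘z’ (25), counting letters only and ignoring case.

```java
class LetterFrequencySketch {
    // Returns the fraction of letters in text that are 'a', 'b', ..., 'z' (case ignored).
    static double[] letterFrequencies(String text) {
        int[] counts = new int[26];
        int total = 0; // number of letters seen, not the length of the string
        for (int i = 0; i < text.length(); i++) {
            char ch = Character.toLowerCase(text.charAt(i));
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++;
                total++;
            }
        }
        double[] freq = new double[26];
        for (int i = 0; i < 26; i++) {
            // Guard against a text with no letters at all.
            freq[i] = (total == 0) ? 0.0 : (double) counts[i] / total;
        }
        return freq;
    }
}
```

For “Mississippi moon” this gives 2/15 at index `'m' - 'a'` and 4/15 at index `'s' - 'a'`, matching the hand calculation above.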

Contrary to what I said earlier, you don’t need to “shift and then count letter frequencies”. You can actually count letter frequencies of the cipher text, and then shift those frequencies when searching for the right shift. (Shift frequencies, and then compute the χ² score for that shift.)
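The search over all 26 shifts can be sketched as follows, assuming the cipher-text frequencies are already available as an array of 26 values. The names `BestShiftSketch`, `score` and `bestShift` are made up for this example. The key observation is that, for an encryption shift s, plain letter i ends up at position (i + s) % 26 in the cipher text, so its frequency is read from there.

```java
class BestShiftSketch {
    // English letter frequencies from the assignment text, in the order a..z.
    static final double[] KNOWN_FREQ = {
        0.0855, 0.0160, 0.0316, 0.0387, 0.1210, 0.0218, 0.0209, 0.0496, 0.0733,
        0.0022, 0.0081, 0.0421, 0.0253, 0.0717, 0.0747, 0.0207, 0.0010, 0.0633,
        0.0673, 0.0894, 0.0268, 0.0106, 0.0183, 0.0019, 0.0172, 0.0011};

    // Chi-squared score of the cipher-text frequencies, assuming the text was
    // encrypted with the given shift (0..25).
    static double score(double[] cipherFreq, int shift) {
        double sum = 0.0;
        for (int i = 0; i < 26; i++) {
            double diff = cipherFreq[(i + shift) % 26] - KNOWN_FREQ[i];
            sum += diff * diff / KNOWN_FREQ[i];
        }
        return sum;
    }

    // Try all 26 shifts and return the one with the lowest score.
    static int bestShift(double[] cipherFreq) {
        int best = 0;
        double bestScore = score(cipherFreq, 0);
        for (int shift = 1; shift < 26; shift++) {
            double s = score(cipherFreq, shift);
            if (s < bestScore) {
                bestScore = s;
                best = shift;
            }
        }
        return best;
    }
}
```

The value returned is the shift that was (most plausibly) used to encrypt, so the plaintext would be recovered by shifting the cipher text by its negative.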

Finally, write your application to take input from the command line. Your string will be in the variable args[0]. If your input has spaces in it, put that input inside of (single or double) quotation marks. You should print out an error message if you invoke your program with no string specified on the command line.

Sample output

Here’s some sample output from my version of the program.

java BreakCaesar "htcs aplntgh, vjch, pcs bdctn"

==============================

htcs aplntgh, vjch, pcs bdctn

==============================

send lawyers, guns, and money

java BreakCaesar "aqw’nn pgxgt wpfgtuvcpf vjku"

==============================

aqw’nn pgxgt wpfgtuvcpf vjku

==============================

you’ll never understand this

java BreakCaesar "Wsccsccszzs novdk lveoc"

==============================

Wsccsccszzs novdk lveoc

==============================

Mississippi delta blues

java BreakCaesar "Vlcha siol mqilx.  Qy unnuwe un xuqh."

==============================

Vlcha siol mqilx.  Qy unnuwe un xuqh.

==============================

Bring your sword.  We attack at dawn.

java BreakCaesar "Hvs eiwqy pfckb tcl xiadsr cjsf hvs zonm rcug."

==============================

Hvs eiwqy pfckb tcl xiadsr cjsf hvs zonm rcug.

==============================

The quick brown fox jumped over the lazy dogs.

java BreakCaesar

Oops, you haven’t given enough parameters!

Usage: java BreakCaesar "string"

Questions to consider

You don’t have to write any code, just comment on these questions I ask here.

What would we do differently if we know the language we’re examining isn’t English but some other language (e.g. suppose we know the people communicating via this Caesar cipher usually write/speak in Polish)?

Suppose we (somehow) know that the person doing the encryption uses one shift value for lowercase letters, and a different shift value for uppercase letters. What would we have to do differently? How would that affect our calculations, or how would we have to alter our program/calculations to account for this?

Part II

Introduction

Continuing along the lines we started in the course notes, let’s consider a little more in the area of text analysis or “natural language processing”.

Given a set of documents, we want to find the “important” words that appear in this set of documents. Firstly, we note that the meaning of the word “document” is dependent upon the collection we are considering.

For example, if we are looking at a set of tweets from Twitter, then it’s probably sensible to consider each tweet as its own separate entity, so in this case the word “document” means one tweet.

If we are looking at a set of books, then one “document” could mean one book. Similarly, if we consider a set of (academic) journal papers, one “document” might likely refer to an individual paper.

On the other hand, if we are examining a single book, it might make more sense to consider each chapter as a “document”. Or perhaps we go down to the level where a “document” is a single paragraph of the book. And so forth. . .

For the purposes of this exercise, we will consider a set of text files, and each “document” will be one of these text files, but in the discussion that follows, keep the idea in mind that the word “document” can depend upon the particular collection we consider.

Also keep in mind in what follows that we are always going to want to ignore the case of letters, so we will want to begin our analysis by converting everything to lower case (say).

Term frequency-inverse document frequency

In order to define what is an “important” word in our set of documents, we will use a measure that is referred to as “term frequency-inverse document frequency”. This is the product of two numbers, one being the “term frequency” and the other being “inverse document frequency”.

To continue our definitions, suppose that our collection of documents is numbered, so that we have documents d1, d2, ..., dn. Therefore, there are n documents in our collection.

“Term frequency” is defined for a term (or word) and a particular document. Suppose that t is a term in a document d. The term frequency, tf(t, d), of t in d is defined as

$$\mathrm{tf}(t, d) = \frac{\text{number of times } t \text{ appears in } d}{\text{total number of terms in } d}$$

We can calculate these term frequencies for each term in each document. (We have essentially talked about how to do this in class. In that example we were calculating the numerators of tf(t, d). We can calculate the term frequencies by summing up those numerators and dividing by the sum. Keep in mind these will give you decimal numbers, which is what we want.)

Term frequency is obviously a measure of how often a term appears in a particular document. If a term t does not appear in the document di, then (by definition) we have tf(t, di) = 0.

Side note: There are other definitions of “term frequency” that are sometimes used. These can include the raw frequencies (the numerators of tf(t, d)), boolean frequencies (these are 1/0 variables depending upon whether a word exists/not exists in the document), etc.
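A minimal sketch of a term frequency calculation for one document is given below. The names are hypothetical, and the way the text is split into words (lower-casing, then treating every maximal run of the letters a to z as a word) is an assumption of this example; the assignment does not prescribe a tokenisation.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class TermFrequencySketch {
    // Returns tf(t, d) for every term t in the given document text.
    static Map<String, Double> termFrequencies(String text) {
        // Assumption: a "word" is a maximal run of the letters a-z after lower-casing.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^a-z]+");
        Map<String, Integer> counts = new HashMap<>();
        int total = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split can produce a leading empty string
            }
            counts.merge(token, 1, Integer::sum); // raw frequency (the numerator)
            total++;
        }
        Map<String, Double> tf = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            tf.put(e.getKey(), e.getValue() / (double) total);
        }
        return tf;
    }
}
```

Terms that do not occur in the document are simply absent from the map, which corresponds to a tf value of 0.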

“Inverse document frequency” is defined for a term (word) over the whole set of documents. Inverse document frequency is a measure of how much “information” the term provides, i.e. how common or rare the term is across all documents. One of the most common definitions of inverse document frequency is this one (where t is a term, and D is referring to the collection of all documents d1, ..., dn):

$$\mathrm{idf}(t, D) = \log \frac{n}{|\{d \in D : t \text{ appears in } d\}|}$$

As defined here, idf(t, D) only makes sense if the term t appears in at least one of the documents, otherwise you are dividing by 0. To account for this, sometimes idf(t, D) is defined by adding 1 to the denominator to avoid a “division by zero” error. We will use the definition given above, and assume that we only consider terms that appear in at least one of the documents in our collection.

For a given term t, idf(t, D) is largest if t appears in only one of the documents. On the other hand, if t appears in all n documents, it means that idf(t, D) is zero. So common words like “the”, “and”, “or” will often have idf values of 0, or close to 0, assuming we are looking at a large collection of documents. These words give little information about the meaning of a document, and therefore we want to discount them. This is what idf is supposed to do for us.
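One way the idf values might be computed is sketched here, assuming the tf values of each document are already stored in one map per document. The names `IdfSketch` and `inverseDocumentFrequencies` are hypothetical, and the base-10 logarithm is used.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdfSketch {
    // tfPerDoc holds one (term -> tf) map per document in the collection.
    static Map<String, Double> inverseDocumentFrequencies(List<Map<String, Double>> tfPerDoc) {
        int n = tfPerDoc.size(); // number of documents
        // For each term, count the number of documents it appears in.
        Map<String, Integer> docCounts = new HashMap<>();
        for (Map<String, Double> tf : tfPerDoc) {
            for (String term : tf.keySet()) {
                docCounts.merge(term, 1, Integer::sum);
            }
        }
        Map<String, Double> idf = new HashMap<>();
        for (Map.Entry<String, Integer> e : docCounts.entrySet()) {
            // Every term here appears in at least one document, so no division by zero.
            idf.put(e.getKey(), Math.log10((double) n / e.getValue()));
        }
        return idf;
    }
}
```

Because only terms that occur somewhere in the collection are ever added to `docCounts`, the denominator is always at least 1.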

The “term frequency-inverse document frequency” of a term t is defined on a per-document basis for a document di, over the whole collection of documents D:

$$\text{tf-idf}(t, d_i, D) = \mathrm{tf}(t, d_i) \cdot \mathrm{idf}(t, D)$$

A high value of tf-idf is achieved by a high term frequency (in the given document) and a low frequency of that term in the whole collection of documents. Therefore, these tf-idf values tend to filter out common terms (i.e. the tf-idf values are low), and the terms with high values are often the terms of interest in further analysis of the text.

These tf-idf numbers are what we are interested in computing.
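Since the measure is the product of a term frequency and an inverse document frequency, combining the two for one document could look like the sketch below. The names are hypothetical, and the code assumes that the idf map contains an entry for every term of the document.

```java
import java.util.HashMap;
import java.util.Map;

class TfIdfSketch {
    // tf: term -> tf value for one document; idf: term -> idf value over the collection.
    static Map<String, Double> tfIdf(Map<String, Double> tf, Map<String, Double> idf) {
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : tf.entrySet()) {
            // Assumption: idf has an entry for every term that occurs in this document.
            result.put(e.getKey(), e.getValue() * idf.get(e.getKey()));
        }
        return result;
    }
}
```

A term that appears in every document has an idf of 0, so its product is 0 regardless of how frequent it is in the document.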

Note: If we have only a single document in our collection D, it doesn’t make sense to adjust the tf values by multiplying by the idf value of that term. This is because for each term in the (single) document, the idf value will always be 0 (since log(1/1) = 0). Therefore, in the special case that we are considering a single document, we revert to using tf values. In that case, however, we need to proceed carefully (and perhaps differently). It’s likely that the high tf values will correspond to common (non-interesting) words like “the”, “or”, etc.

You can find a small example of computing tf-idf values on the Wikipedia page for tf-idf. There is also a similar explanation to what I have given here for the definition of tf-idf on that webpage.

It should be clear that you can find many other resources online that will talk more about the concept of tf-idf values (weights). Keep in mind that these other resources might be using different definitions for “term frequency” or “inverse document frequency”. They should be similar, but could result in different values in the calculations.

Requirements

As mentioned, we want to compute tf-idf values for a collection of documents. Or, more specifically, you need to write a Java program that will compute tf-idf values.

We note the following for this assessment (what follows are requirements for the assessment, as well as some hints/suggestions):

1. Requirement: Call your application program “TFIDF.java”.

2. Requirement: Filenames of the text files will be supplied to the program via command line arguments. In other words, this TFIDF program will be invoked with a call like

```
java TFIDF example1.txt example2.txt
```

This means that the filenames will be in the args array, i.e. the filenames are the values args[0], args[1], etc.

3. For the purposes of this exercise, you can assume that the text files exist (you don’t have to check for existence/non-existence of files).

4. You can assume that at least one filename is given to the program. (But it’s possible that only one filename will be given. So what do you do in that case? See the “Note” above, right before the “Requirements” section began.)

5. Things will likely be much easier if you use HashMap variables in this part of the assessment (or some similar data structure). We used a HashMap in the example in the course notes when counting word frequencies in a single document. Recall that we were looking at the number of occurrences of a word in a document, so those are integer values. We want to compute something different (but related) here.

6. Java does not allow you to make an array of HashMap objects. (Ok, you might be able to do this, but it’s kind of difficult to do, and not recommended.) You can, however, make an ArrayList of HashMap objects. Here’s how you can declare such a thing and (start to) initialize and use it. (Obviously this is only partial Java code, not a full-on application.)

```
import java.util.ArrayList;
import java.util.HashMap;

// Declare an ArrayList of HashMap objects.  Note that each
// HashMap uses the same (key,value) types.
ArrayList<HashMap<String, Double>> list = new ArrayList<HashMap<String, Double>>();

// Declare an individual HashMap.
HashMap<String, Double> stuff = new HashMap<String, Double>();

// Add (key,value) pairs to the HashMap.
stuff.put("thing", 12.5);
stuff.put("more things", 20.0);

// Add the HashMap to the ArrayList so that it can be retrieved by index.
list.add(stuff);

// Retrieve the first HashMap in the ArrayList, and the value corresponding
// to the key "thing". (This assumes that key exists in the HashMap.)
list.get(0).get("thing");
```

7. We have already (essentially) discussed how to compute the tf values. Ok, almost. We talked about how to count the “raw frequencies” (number of occurrences of each word) for a set of words in one text file. To convert those to tf values, we would want to divide those raw frequency values by the total number of words in the file. How do you perform that task?

This is why I suggest using an appropriate HashMap to store these tf values, one for each file.

8. Once you find the tf values for each file, how are you going to compute the idf values? Once again, this is (more or less) the same as computing frequency counts in a file, except that instead of a file, you have a collection of (previously) computed word frequencies for each text file.

For each term (word) t that appears in *any* of the text files, how do you count how many files that term appears in? If you can figure out how to do that, you are nearly done computing the idf values. How do you determine n, the number of documents in the collection we consider?

9. Note that the logarithm in the definition of idf is the “base 10 logarithm”. You can get this function in Java with the method Math.log10(x) for a positive number x.

(Actually, it doesn’t really matter much what base you use for the logarithm, since logarithms with different bases differ by a constant factor. Use “base 10” so it’s easy to compare answers with mine.)

10. Finally, given the tf values for each document and the idf values for each word (over the set of documents), how do you combine these to compute tf-idf values for each word, in each document?

11. Keep in mind the special case when there is only one text file (document) supplied to your program. What do you do then? (Again, I have already answered this question. See the “Note” above.)

12. Requirement: I want you to print out the word with highest tf-idf value for each of the text documents. If there are several such words, print out any one of those words with the highest tf-idf value.
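For the final requirement, picking out the word with the highest value from a map of scores might be sketched like this. The names `MaxTermSketch` and `maxTerm` are hypothetical; the same method would work on tf values in the single-document case.

```java
import java.util.Map;

class MaxTermSketch {
    // Returns a term with the largest score, or null if the map is empty.
    static String maxTerm(Map<String, Double> scores) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            // Strict comparison: on ties, whichever term is seen first is kept.
            if (e.getValue() > bestScore) {
                bestScore = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Because a HashMap has no defined iteration order, which of several tied words is returned is arbitrary, which is all the requirement asks for.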

Sample Output

You can find the files that I used on the module website. Look for the file called “Samples.zip”, download, and unzip that file. The files included there are:

Chap-4-Frank.txt (Chapter 4 of Frankenstein, or The Modern Prometheus by Mary Shelley)

Adv-3.txt (“A Case of Identity”, included in The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle)

Dunwich.txt (The Dunwich Horror by H.P. Lovecraft)

time.txt (The Time Machine by H.G. Wells)

Note: These texts are not “perfect”, in the sense that there could be misspellings present, or words that run together because spaces are missing, etc. That doesn’t matter for our purposes here. Just use the texts as given.

All of these files were downloaded from Project Gutenberg. (You should really check out this website if you do not know about it. Lots of classic books available for free.) Having said that, I am also obligated to say the following:

These eBooks are for the use of anyone anywhere in the United States and most other parts of the world at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included with these eBooks or online at www.gutenberg.org. If you are not located in the United States, you’ll have to check the laws of the country where you are located before using these eBooks.

The Project Gutenberg license is available here.

java TFIDF Adv-3.txt Dunwich.txt Chapter-4-Frank.txt time.txt

Max TFIDF value for each file.

==========

Adv-3.txt

==========

holmes 0.003912241785716382

==========

Dunwich.txt

==========

whateley 0.002523302562145693

==========

Chapter-4-Frank.txt

==========

feelings 0.0014205111867745869

==========

time.txt

==========

weena 9.886042550541253E-4

Note that the tf-idf values have picked out the name (or surname) of a main character in three out of the four documents, which makes sense as those names appear many times in each of their respective texts, but in none of the others (so their idf values are high). (If you’re familiar with these works, you recognize those names...)

java TFIDF Adv-3.txt Dunwich.txt Chapter-4-Frank.txt

Max TFIDF value for each file.

==========

Adv-3.txt

==========

holmes 0.0031003782620574196

==========

Dunwich.txt

==========

whateley 0.001999669969487269

==========

Chapter-4-Frank.txt

==========

pursuit 0.001313349895020699

It’s interesting (to me at least) how removing one of the files changes which word has the highest tf-idf value in one of the other files (obviously the idf values of words have changed, but the tf values haven’t in each document).

java TFIDF Chapter-4-Frank.txt

Max TF value for this file.

==========

Chapter-4-Frank.txt

==========

the 0.053480141565080616

As stated, for a single document, we output a tf value instead of a tf-idf value, since the idf value would always be zero. In this case, the highest tf value picks out a rather common word in the document. (The same word “the” will be selected for each of the individual files, i.e. it has the highest frequency in each single document. Hence its idf value (and tf-idf value) is 0 for any set of more than one of these documents.)

java TFIDF when.txt Chapter-4-Frank.txt

Max TFIDF value for each file.

==========

when.txt

==========

necessary 0.012542916485999216

==========

Chapter-4-Frank.txt

==========

0.012192721019815203

The comparison of a short document and a long one gives a slightly odd result. The word “I” is contained in one of them and not the other, so it is the word with the highest tf-idf value in that document. Its tf value is large in “Chapter-4-Frank.txt” and its idf value is non-zero since “I” doesn’t appear in “when.txt”.

A note on testing

Make some small files on which you test your application program. A single file means you are (or should be) calculating and showing tf values. So it’s easy to make a small file, and check that the term frequencies are being calculated correctly. Similarly, for two (small) files, it’s relatively easy to check that the tf-idf values are correct. Do those calculations by hand to verify they are correct!

If tfidf is a HashMap variable in a Java program, then this Java statement will print out the entire HashMap, which will give you large output if the HashMap is big.

HashMap<String, Double> tfidf = new HashMap<String, Double>();

...

...

System.out.println(tfidf);

The toString method of the HashMap class has been overridden to give this output. But you can print out the whole HashMap for small examples to verify the tf-idf values are correct in those cases.

Here’s a small(ish) example (where I have added some line breaks to the output to fit it onto the page).

We can see that terms that appear in both documents have a tf-idf value of 0 (because the idf value is 0).

java TFIDF when.txt fear.txt

Max TFIDF value for each file.

==========

when.txt

==========

necessary 0.012542916485999216

{which=0.0, necessary=0.012542916485999216, in=0.012542916485999216, one=0.012542916485999216, another=0.012542916485999216, for=0.012542916485999216, political=0.012542916485999216, them=0.012542916485999216, it=0.012542916485999216, bands=0.012542916485999216, when=0.012542916485999216, people=0.012542916485999216, becomes=0.012542916485999216, the=0.0, connected=0.012542916485999216, with=0.012542916485999216, of=0.0, dissolve=0.012542916485999216, have=0.0, course=0.012542916485999216, to=0.0, human=0.012542916485999216, events=0.012542916485999216}

==========

fear.txt

==========

fear 0.0177076468037636

{needed=0.0088538234018818, unreasoning=0.0088538234018818, convert=0.0088538234018818, unjustified=0.0088538234018818, we=0.0088538234018818, advance=0.0088538234018818, firm=0.0088538234018818, that=0.0088538234018818, into=0.0088538234018818, assert=0.0088538234018818, of=0.0, me=0.0088538234018818, only=0.0088538234018818, have=0.0, let=0.0088538234018818, nameless=0.0088538234018818, so=0.0088538234018818, belief=0.0088538234018818, fear=0.0177076468037636, all=0.0088538234018818, which=0.0, terror=0.0088538234018818, paralyzes=0.0088538234018818, is=0.0088538234018818, my=0.0088538234018818, the=0.0, retreat=0.0088538234018818, itself=0.0088538234018818, efforts=0.0088538234018818, to=0.0, thing=0.0088538234018818, first=0.0088538234018818}
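The printed values can be checked by hand. With two documents, any word that appears in only one of them has an idf of log(2/1) in base 10. Dividing the printed values by this number suggests (this is an inference from the output, not something stated in the assignment) that a word occurring once has a tf value of 1/24 in when.txt and 1/34 in fear.txt:

$$\log_{10}\frac{2}{1} \approx 0.30103, \qquad \frac{1}{24} \times 0.30103 \approx 0.0125429, \qquad \frac{1}{34} \times 0.30103 \approx 0.0088538$$

The value shown for “fear” is exactly twice 0.0088538234018818, which is consistent with that word having twice the term frequency of the other words in fear.txt.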

A question to consider

You don’t need to write any code for this question, just comment on what I am asking here.

If we are considering only a single document (or file in our case), as stated the idf value doesn’t make any sense, because it will always be 0 for any word in the document. (There is a single document, so n = 1, and any word in that document obviously appears in the document so idf(t, D) = log(1/1) = log(1) = 0.)

So how could we proceed in this case? Instead of tf-idf values, I suggested considering tf values alone, but then words with high tf values are likely to be non-interesting words like “and”, “the”, and “or”. (See the above example where I had only one file, and the word “the” had the highest tf value.)

Would you have any suggestions how we might change our approach? (I’ve actually hinted at one suggestion elsewhere in this assessment. . . ) (Note that there isn’t necessarily one “right answer” to this question.)

Submission Instructions

Your submission should consist of a report (a PDF file) and implementation (source code) files. Be certain to include all Java source files (i.e. the “.java” files) needed to run your application.

Submit one compressed file, using only the “zip” format for compression, that includes all files (report and Java source code) for your submission.

The report (a PDF file) should consist of

Requirements: Summary of the above requirements statement in your own words. Do this for each part of the assessment.

Analysis and Design: A short (one paragraph) description of your analysis of the problem including a Class Diagram outlining the class structure for your proposed solution, and pseudocode for methods. Again, do this for each part of the assessment. Pseudocode need not be a line-by-line description of what you do in each method, but an overview of how methods operate, what parameters they take as input, and what they return. Ideally, pseudocode should be detailed enough to allow me (or someone else) to implement a method in any programming language of their choosing, but not rely on language-specific constructs/instructions. You can, of course, say that a method returns a Java HashMap (or something else), but then someone familiar with that data structure knows the meaning (so you don’t need to explain what a HashMap is).

For Part I you must submit your Java file(s). In my solution I actually only have one class file that includes several methods in it. You may arrange your solution differently than mine (possibly using more than one class file), and that is fine.

For Part II you must again submit your Java file(s).

Testing: A set of proposed test cases presented in tabular format including expected output, and evidence of the results of testing (the simplest way of doing this is to cut and paste the result of running your test cases into your report).

Note that your programs will be tested against other inputs than the ones provided in my examples, but the test cases will satisfy the specifications outlined earlier (e.g. for Part II, the files given to the program will exist, so you don’t need to check for existence/non-existence of the files).

The implementation should consist of

Your Java source files, i.e. the relevant .java files, not the class (.class) files.

https://sam.csc.liv.ac.uk/COMP/Submissions.pl

Friday, 13 April 2018, 5:00pm (Friday of Week 8)

Notes/Penalties

Because submission is handled electronically, ANY FILE submitted past the deadline will be considered at least ONE DAY late (penalty days include weekends since assessment is performed electronically). Late submissions are subject to the University Policy (see Section 6 of the Code of Practice on Assessment).

Please make sure your Java classes successfully compile and run on ANY departmental computer system. For this reason, the use of IDEs like NetBeans and Eclipse is discouraged. If your Java source files do not successfully compile/run, you could suffer a penalty of 5 marks from your grade. This penalty could be applied for Part I and/or Part II of the assessment, but only once per part. (However, this will not turn a passing grade into a failing grade.)

If your report is not a PDF file, you will lose 5 marks from your final grade. (However, this will not turn a passing grade into a failing grade.)

If you use any form of compression other than “zip”, you risk your files not being read by the assessors, resulting in a mark of 0! If your files can be read, you will still lose 5 marks from your final grade. (As before, this will not turn a passing grade into a failing grade.)

Note this is an individual piece of work. Please note the University Guidelines on Academic Integrity (see Appendix L of the Code of Practice on Assessment). You should expect that plagiarism detection software will be run on your submission to compare your work to that of other students. (Obviously in this case, the ThreeDice.java class file will be ignored as it has been supplied to you.)

Mark Scheme

Marks for each part of the assessment will be awarded for:

Analysis and Design 10% (This includes things like UML diagrams for your classes/application, justification for how/why you structure your classes, appropriate pseudocode for methods, etc.)

Implementation 25% (This includes correctness of your programs, appropriate level of comments, correct indentation so your code is readable, having useful identifiers, etc.)

Testing 10% (Have you done suitable testing in terms of checking different paths through your code, checking what you do with “unexpected” inputs, a sufficient number of tests, etc.?)

Extra questions in Part I or Part II 5%

Please see the module web page for the feedback form.