Cardiff School of Computer Science and Informatics
Coursework Assessment Pro-forma
Module Code: CMT209 Module Title: Informatics Lecturer: A. Kimmig
Assessment Title: Metadata-based Image Retrieval
Assessment Number: Date Set: 1.11.18
Submission Date and Time: 7.12.18 at 9:30am.
Return Date: 9.1.19
This assignment is worth 50% of the total marks available for this module. The penalty for late or non-submission is an award of zero marks.
Your submission must include the official Coursework Submission Cover sheet, which can be found here:
|Description|Status|File type|Filename|
|Cover sheet|Compulsory|One PDF (.pdf) file|[student number].pdf|
|Report|Compulsory|One PDF (.pdf) file|cmt209_report_[student number].pdf|
Program source listings for tasks 2 and 3 must be included as appendices in the main PDF report file.
Any deviation from the submission instructions above (including the number and types of files submitted) may result in a mark of zero for the assessment or question part.
In this assignment, you will develop different methods to query a collection of images using the meta-information associated with the images. To restrict the scope of image tags and to abstract from low-level vision details, we will use images and annotations from the “abstract scenes” dataset by Zitnick et al. [1]. These abstract scenes depict children playing outdoors, and have been constructed from a limited set of clip-art objects through crowdsourcing.
[1] C. L. Zitnick and D. Parikh, Bringing Semantics into Focus Using Visual Abstraction, in CVPR, 2013.
The archive file cmt209-coursework-2018.zip on Learning Central contains all the data (images + their annotations) needed for this assignment, together with a detailed description of the data.
For tasks 2 and 3, you may use any popular modern programming language.
In tasks 2 & 3, each retrieval method must return the 5 images with highest similarity to the query, as a list of image identifiers together with the similarity, sorted in non-increasing order of similarity.
Actually displaying the image along with the identifier is optional, but recommended for easier inspection of results.
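To make the required output format concrete, here is a minimal sketch; the identifiers and similarity scores below are invented purely for illustration.

```python
# Hypothetical example of the required output: the top 5 image identifiers
# paired with their similarity scores, sorted in non-increasing order of
# similarity. These particular identifiers and scores are made up.
results = [("Scene101_3", 0.92), ("Scene101_0", 0.85),
           ("Scene214_2", 0.85), ("Scene007_1", 0.60),
           ("Scene055_4", 0.41)]

for image_id, similarity in sorted(results, key=lambda r: r[1], reverse=True):
    print(f"{image_id}\t{similarity:.2f}")
```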
The total length of the text in your report should be 1000 to 1800 words.
Task 1: Controlled Vocabulary [20% of coursework marks]
The first task is to develop a controlled vocabulary for describing the abstract scene images. This will be used in the second task to make the image search more flexible. The controlled vocabulary has to satisfy the following criteria:
- It includes the 58 predefined tags (as listed in the data description file), and synonym rings (with one or more synonyms) for at least five of these tags.
- It is hierarchical, with at least three layers below the topmost layer.
In your report, you should
- list this vocabulary, using a visual layout that clearly shows the hierarchy and distinguishes the main term for each concept from the synonyms;
- briefly motivate the choices you made (both for the structure and for the synonyms), providing examples of how these choices can help with image retrieval.
Task 2: Keyword-based Search
The second task is to develop and implement the following methods to retrieve images based on textual queries in the form of one or more keywords. Subtasks b), c) and d) build upon solutions to previous subtasks, but can also be solved individually by building upon the solution to subtask a). Remember to use the output format specified in the introduction above.
- [20% of coursework marks] Method 1 takes one or more words as input, and returns the images whose predefined tags (as given in image_tags.csv) best fit those keywords, i.e., the more similar an image’s set of tags is to the set of words given as input, the higher the image appears in the result. This method ignores input words that are not part of the predefined tag list.
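One plausible choice of similarity for this kind of set matching (the brief leaves the measure open) is the Jaccard coefficient between the query's keyword set and an image's tag set. A minimal sketch, assuming a hypothetical `tag_index` dict mapping image identifiers to tag sets built from image_tags.csv:

```python
def jaccard(query_words, image_tags):
    """Set overlap between query keywords and an image's predefined tags:
    |intersection| / |union|. One possible measure, not the prescribed one."""
    q, t = set(query_words), set(image_tags)
    if not q or not t:
        return 0.0
    return len(q & t) / len(q | t)

def rank_images(query_words, tag_index, known_tags, top_k=5):
    """Rank images by similarity, ignoring query words outside the tag list.

    tag_index: assumed dict of image identifier -> set of tags.
    known_tags: the 58 predefined tags from the data description.
    """
    valid = [w for w in query_words if w in known_tags]
    scored = [(img, jaccard(valid, tags)) for img, tags in tag_index.items()]
    scored.sort(key=lambda s: s[1], reverse=True)
    return scored[:top_k]
```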
- [10% of coursework marks] Method 2 extends Method 1 to be robust against typos, i.e., if an input term is not part of the tag list, it uses a string similarity measure such as Levenshtein edit distance to map the input to the most similar existing term. For instance, the queries “appletree balloons” and “appeltree balloon” should both find images showing the apple tree and the balloons. For standard measures of string similarity you can use an existing programming library, such as the jellyfish Python library (https://pypi.python.org/pypi/jellyfish). The report should list any such libraries used.
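The typo-correction step can be sketched as follows; a plain dynamic-programming Levenshtein distance is written out here so that the sketch is self-contained and needs no external library such as jellyfish.

```python
def levenshtein(a, b):
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                   # deletion
                           cur[j - 1] + 1,                # insertion
                           prev[j - 1] + (ca != cb)))     # substitution
        prev = cur
    return prev[-1]

def correct_word(word, known_tags):
    """Map an out-of-vocabulary query word to its closest predefined tag."""
    if word in known_tags:
        return word
    return min(known_tags, key=lambda t: levenshtein(word, t))
```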
- [10% of coursework marks] Method 3 extends Method 2 using the controlled vocabulary developed in Task 1. That is, in addition to queries using (slight variants of) the predefined tags, it can answer queries that use known synonyms and concepts higher up in the hierarchy. For instance, if there is a concept “pet” in the hierarchy that contains “cat” and “dog”, a query for pet should retrieve images with a cat or a dog.
- [10% of coursework marks] Method 4 extends Method 3 to also return images that match conceptually similar tags, i.e., tags that are close to the input term in the hierarchy, where the most similar tags achieve the highest ranks. For instance, given the query “oaktree pie owl”, pictures with these three objects should be ranked higher than pictures with an oak tree, a pizza and an owl, which in turn should be ranked higher than pictures with an oak tree and an owl, but no food at all.
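One way to score how close two terms are in the hierarchy is by path distance through their lowest common ancestor. A sketch under the assumption that the vocabulary is stored as parent links; the sample terms below are illustrative only, not the required Task 1 vocabulary.

```python
# Tiny illustrative slice of a controlled vocabulary as child -> parent links.
PARENT = {"cat": "pet", "dog": "pet", "pet": "animal",
          "duck": "bird", "owl": "bird", "bird": "animal",
          "pie": "food", "pizza": "food",
          "animal": "thing", "food": "thing"}

def ancestors(term):
    """The chain from a term up to the hierarchy's root."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def hierarchy_similarity(a, b):
    """1 / (1 + number of edges between a and b); 1.0 for identical terms,
    0.0 when the terms share no ancestor. One possible measure, not the
    prescribed one."""
    ca, cb = ancestors(a), ancestors(b)
    common = set(ca) & set(cb)
    if not common:
        return 0.0
    dist = min(ca.index(t) + cb.index(t) for t in common)
    return 1.0 / (1.0 + dist)
```

With this measure, "pie" is closer to "pizza" (siblings under "food") than to "owl", matching the ranking behaviour Method 4 asks for.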
In your report, you should
- for each method, define the similarity function it uses (how do you rank the images?), and briefly motivate your choice (why is this a good way to rank?);
- list the top 5 answers (image identifier and similarity) of each method for the three queries
  Q1: appletree cat
  Q2: baseballglove
  Q3: tree hat duck
- briefly discuss the key strengths and weaknesses of the methods based on these answers.
Task 3: Image-based Search
The final task is to retrieve images based on their similarity to a given image rather than a textual query, again using the same output format.
- [10% of coursework marks] As a first step, write a method that, given an image identifier, constructs a textual query from that image’s tags and uses the best method you developed in Task 2 to find similar images.
- [20% of coursework marks] Write a method that, for a given image identifier, combines the text-based similarity used in 3a) with a second similarity measure based on the spatial information. For instance, if the crown in the input image is close to the girl’s head, the spatial similarity measure (on its own) should prefer images where the crown is close to the girl’s head over those where it is far away. Hint: have a look at the closeness values of sunglasses and hats to get a better idea of the range of these values before defining your similarity measure.
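One possible shape for the combination (not prescribed by the brief): compare closeness values of object pairs present in both images, then take a weighted mix with the text score. The `max_diff` bound and the weight `alpha` are assumptions to tune against the data, e.g. after inspecting the sunglasses and hat closeness values as hinted above.

```python
def spatial_similarity(close_a, close_b, max_diff=2):
    """Compare closeness values of object pairs shared by two images.

    close_a, close_b: assumed dicts of (object, object) pair -> closeness.
    max_diff: assumed upper bound on the closeness scale, used to
    normalise differences into [0, 1]; check the data to pick it.
    """
    shared = set(close_a) & set(close_b)
    if not shared:
        return 0.0
    diffs = [abs(close_a[k] - close_b[k]) / max_diff for k in shared]
    return 1.0 - sum(diffs) / len(diffs)

def combined_similarity(text_sim, spatial_sim, alpha=0.5):
    """Weighted average of the two measures; alpha is a tuning choice."""
    return alpha * text_sim + (1.0 - alpha) * spatial_sim
```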
In your report, you should
- define the spatial similarity measure you use, specify how it is combined with the text-based one, and briefly motivate your choices;
- for each of the two methods, list the top 5 answers (image identifier and similarity) for the images Scene339_0, Scene335_0 and Scene313_0, as well as precision and recall for the top 5 answers, using the image class as the ground truth, i.e., for image SceneX_Y, all images SceneX_* are considered similar, and all other images not similar;
- briefly discuss the key strengths and weaknesses of the two methods based on these results.
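The scene-class ground truth described above can be turned into precision and recall at 5 as follows. This is a sketch; whether the query image itself counts as relevant is a choice the brief leaves open, and here it is excluded.

```python
def precision_recall_at_5(query_image, retrieved, all_images):
    """Precision and recall over the top-5 answers, with the SceneX_ prefix
    as the image class. The query image is excluded from the relevant set
    (an assumption, not stated in the brief)."""
    cls = query_image.split("_")[0]
    relevant = {img for img in all_images
                if img.split("_")[0] == cls and img != query_image}
    hits = sum(1 for img in retrieved[:5] if img in relevant)
    precision = hits / 5
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```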
Learning Outcomes Assessed
- Understand the performance and limitations of salient algorithms associated with different aspects of informatics.
- Understand, through the conduct of case studies, the practical application of modern informatics techniques to a number of problem areas.
Criteria for assessment
Credit will be awarded against the following criteria.
- Quality of solutions (vocabulary in task 1, top 5 answers in tasks 2 & 3)
- Clarity of explanation, informativeness and justification of decisions
Feedback and suggestion for future learning
Feedback on your coursework will address the above criteria. Feedback and marks will be returned on 9.1.19 via Learning Central. Where requested, this will be supplemented with individual oral feedback.
Feedback from this assignment will be useful for other modules requiring report writing.