## APLNG 578


### Instructions

Upload a single document in .pdf (preferred), .txt or .doc format to the Assignment 3 Dropbox in Canvas by 6pm, Thursday, 10/25/2018. You will need to use the Stanford POS Tagger, the TreeTagger, LCA, the Stanford Parser, Tregex, and L2SCA.

1. Collect two small written “corpora” of two different genres (e.g., news articles, blog entries, academic texts, literary texts, etc.), with 5-10 short texts in each “corpus”. Save each “corpus” as a single plain text (i.e., .txt) file, so that you have a total of 2 text files.
2. Lexical analysis

a. Tag each corpus using the TreeTagger. Save the output files.

b. Convert the output files to the lemma_tag format.

c. Analyze the lexical complexity of the two corpora using the Lexical Complexity Analyzer. Save the output file. Discuss any interesting differences you observe.
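
To illustrate the reformatting in step 2b, here is a minimal Python sketch. It rests on several assumptions that are not stated in this handout: that the TreeTagger output has three tab-separated columns per line (token, tag, lemma), that sentence-final punctuation carries the tag `SENT`, that unknown lemmas appear as `<unknown>`, and that the target format is whitespace-separated `lemma_tag` items with one sentence per line. The function name `treetagger_to_lemma_tag` is hypothetical; check your own tagger output and the analyzer's documentation before relying on it.

```python
def treetagger_to_lemma_tag(lines, sentence_end_tag="SENT"):
    """Convert TreeTagger lines (token<TAB>tag<TAB>lemma) to lemma_tag sentences.

    Assumes three tab-separated columns per line; other lines are skipped.
    """
    sentences = []
    current = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        token, tag, lemma = parts
        if lemma == "<unknown>":
            lemma = token  # assumption: fall back to the surface form
        current.append(f"{lemma}_{tag}")
        if tag == sentence_end_tag:
            # Close the sentence at the (assumed) sentence-final tag
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))  # keep any unterminated remainder
    return sentences


sample = ["The\tDT\tthe", "dogs\tNNS\tdog", "barked\tVBD\tbark", ".\tSENT\t."]
for sentence in treetagger_to_lemma_tag(sample):
    print(sentence)
```

To process a real file, the same function can be given an open file handle, since it only iterates over lines.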

3. Syntactic analysis

a. Parse each corpus using the Stanford Parser.

b. Define one syntactic structure you want to identify, formulate a Tregex pattern for that structure, and query the two parsed corpora separately to retrieve instances of that structure. Discuss any interesting differences you observe.

c. Analyze the syntactic complexity of the two corpora using L2SCA. Save the output file. Discuss any interesting differences you observe.

4. Semantic field analysis

a. Name two semantic fields (either any of the 21 major fields or any of the subfields) that you think may occur more frequently in one corpus than the other and explain why you think so.

b. Tag each corpus using USAS (http://ucrel.lancs.ac.uk/usas/tagger.html). Use AntConc to check whether your hypothesis holds. Report and explain your results.

5. What to include in the file you submit

- 1. A brief description of each corpus (genre, source of texts, number of texts, and number of words).

- 2a. Command you used to run the TreeTagger.

- 2b. Command you used to reformat the output files.

- 2c. Command you used to get the lexical complexity of the corpora. Copy and paste the output of the lexical complexity analyzer. Discussion of 2c (1-2 paragraphs).

- 3a. Command you used to parse the corpora.

- 3b. The syntactic structure you defined, the Tregex pattern you formulated, and the actual command you used to query the parsed corpora. Discussion of 3b (1-2 paragraphs).

- 3c. Command you used to get the syntactic complexity of the corpora. Copy and paste the output of the syntactic complexity analyzer. Discussion of 3c (1-2 paragraphs).

- 4a. Your two fields and your hypotheses.

- 4b. Report and discuss your results.