
LIN 451/651: Morphological Analysis: Corpus Search Resources: Home

About this Guide

This guide contains links to various corpora, databases, and other web resources for searching, visualization, and analysis.

If you have questions or problems using these resources, you can contact me directly:

Patrick Williams
315.443.9520
jpwill03@syr.edu

Corpus Search Resources

  • BYU hosts a number of open access corpora, maintained by Mark Davies. Each has its own features, scope, and limitations, but they share a common interface. A few highlights:
  • The Oxford English Dictionary, the online version of the long-standing authoritative dictionary, with access to the Historical Thesaurus.
     
  • WordSpy, a database of new terms that have some traction in the language, meaning that they’ve appeared at least three times in print or online, ideally in significant sources such as newspapers, magazines, books, and websites.
     
  • Google Books Ngram Viewer lets us search a corpus of several million digitized books drawn from Google Books to visualize word frequency over time.
     
  • Urban Dictionary is one of the largest collections of slang words, idioms, definitions, and usage examples on the web. It features more than two dozen languages and is compiled by the public.
     
  • Nineteenth Century Collections Online offers a unique visualization tool for the materials in its collections. Click the Term Frequency tab to use it.
     
  • The University of Glasgow Historical Thesaurus of English contains almost 800,000 words from Old English to the present day arranged into detailed hierarchies within broad conceptual categories such as Thought or Music. It is primarily based on the second edition of the Oxford English Dictionary and its Supplements, with additional materials from A Thesaurus of Old English, and was published in print as the Historical Thesaurus of the OED by Oxford University Press in October 2009.
     
  • Robots Reading Vogue is a corpus of the archives of Vogue Magazine, compiled for creative analysis and visualization by Peter Leonard of Yale University Library.
     
  • Credo Reference is an integrated resource of hundreds of dictionaries, subject encyclopedias, and other reference books that is cross-searchable by entries and allows for semantic visualizations.