Syracuse University Libraries

Digital Humanities Resources: Tools

Resources for Computational Text Analysis

HathiTrust Digital Library

HathiTrust Digital Library is the product of HathiTrust, a partnership of major research institutions and libraries working together to preserve our cultural record of print materials. As of January 2013, the digital library comprises over 10 million volumes, over 3.2 million of which are in the public domain, and includes almost half the print holdings at SU Libraries. HathiTrust provides its members with full-text searching across the entire repository, full-text PDF downloads for items in the public domain or not otherwise under copyright, and full-text access to brittle out-of-print items in SU Libraries. Researchers can conduct computational analysis of works in the HathiTrust Digital Library through the HathiTrust Research Center (HTRC), which offers a suite of tools and services for text-based, data-driven research, such as HTRC Algorithms and Data Capsule.

Gale's Nineteenth Century Collections

A multi-year global digitization and publishing program focusing on primary source collections of the long nineteenth century. Researchers can check the number of documents relevant to key terms over a specific period of time through the Term Frequency function.
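
The idea behind a term-frequency view can be sketched in a few lines of Python. This is only an illustration of counting matching documents per year, not Gale's implementation; the function name `documents_per_year` and the sample documents are hypothetical.

```python
from collections import Counter

def documents_per_year(documents, term, start_year, end_year):
    """Count documents per year whose text contains `term` (case-insensitive)."""
    needle = term.lower()
    counts = Counter()
    for year, text in documents:
        if start_year <= year <= end_year and needle in text.lower():
            counts[year] += 1
    # Report every year in the range, including years with no matches.
    return {year: counts[year] for year in range(start_year, end_year + 1)}

# Hypothetical sample data: (publication year, text) pairs.
sample = [
    (1851, "The railway opened to great excitement."),
    (1851, "Harvest reports from the county."),
    (1852, "New railway line proposed; railway shares rise."),
    (1853, "A quiet year in the parish."),
]

print(documents_per_year(sample, "railway", 1851, 1853))
```

Each document counts once per year no matter how often the term appears in it, which matches a "number of documents" measure rather than a raw word count.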

Europeana Newspapers

The Europeana Newspapers project has converted 10 million historic newspaper pages to full text for Europeana. It has also developed a number of open source software tools, such as the Named Entity Recognition Tool for Europeana Newspapers.

LC for Robots

The Library of Congress Lab provides a list of APIs, bulk downloads, and tutorials that help researchers explore machine-readable access to its digital collections.

Digital Public Library of America

The Digital Public Library of America (DPLA) aims to provide public access to digital holdings within America’s libraries, archives, museums, and other cultural heritage institutions. DPLA offers a public API and bulk downloads that grant access to all of DPLA’s records under a permissive license.

New York Times APIs

NYT offers ten APIs to facilitate a wide range of uses, from custom link lists to complex visualizations. 

Open American National Corpus

An open access linguistic corpus consisting of 15 million words of American English automatically annotated for logical structure, word and sentence boundaries, part of speech (multiple tag sets), shallow parse (noun and verb chunks), and named entities.

Reddit API

Reddit provides an API to access data from its posts, threads, comments, users, and more. Historic Reddit data can be downloaded from this website.

Documenting the Now

An organization and a set of tools, materials, and media for chronicling historically significant events via social media.

Handy Analytics Tools (No Programming Needed)

Data Basic

A suite of easy-to-use web tools for beginners that introduce concepts of working with data.

Lexos

A web-based platform for analysis and visualization of multiple text files.

Voyant Tools

A web-based environment for simple text analysis.

SimplyAnalytics

A web-based data visualization platform for creating thematic maps and reports with demographic and socio-economic data of the United States.

Tableau Public

Free software that lets anyone connect to a spreadsheet or file and create interactive data visualizations for the web.

Dandelion API

An online text analysis platform which provides services such as Text Similarity Analysis, Entity Extraction, and Text Classification.

Tutorials for Programming Tools

The Programming Historian

Novice-friendly, peer-reviewed tutorials that help humanists learn a wide range of digital tools, techniques, and workflows to facilitate research and teaching.

Text Mining with R

An introduction to sentiment analysis, word and document frequency analysis, and topic modeling with the tidytext R package.
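
For readers who want a feel for word and document frequency analysis before working through the tutorial, here is a minimal Python sketch of tf-idf using one common definition (term frequency times the natural log of inverse document frequency). It is not code from the tutorial or from tidytext; the function name `tf_idf` and the two toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return {doc_name: {word: tf-idf score}} for a dict of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: the number of documents each word appears in.
    doc_freq = Counter()
    for words in docs.values():
        doc_freq.update(set(words))
    scores = {}
    for name, words in docs.items():
        counts = Counter(words)
        total = len(words)
        scores[name] = {
            # Term frequency (share of the document) times inverse document frequency.
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores

# Hypothetical toy corpus, already split into lowercase words.
docs = {
    "a": "the whale swam past the ship".split(),
    "b": "the ship sailed at dawn".split(),
}
scores = tf_idf(docs)
print(scores["a"])
```

Words that occur in every document, such as "the" here, score zero, while words distinctive to one document score highest.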

GIS in R

Tutorials designed for users with some familiarity with R; no knowledge of spatial analysis is required.

GIS in Python

Tutorial materials for managing GIS data with Python.

Humanities Programming

Teaches command-line, HTML, and Python basics for humanists.