Rotunda Press Digital Archives
Thanks to Patricia Searl for the following:
http://www.upress.virginia.edu/rotunda
https://founders.archives.gov/
Data Services
UVA Library Research Data Services -- Data Discovery and Acquisition: http://data.library.virginia.edu/datasources/
Thanks to Christine Ruotolo for the following:
Collections of Datasets
DH Toychest (Alan Liu):
http://dhresourcesforprojectbuilding.pbworks.com/w/page/69244469/Data%20Collections%20and%20Datasets
Includes demo corpora, which are “sample or toy collections of texts that are ready-to-go for demonstration purposes or hands-on tutorials--e.g., for teaching text analysis, topic modeling, etc.”
Journal of Open Humanities Data
https://openhumanitiesdata.metajnl.com/
“features peer reviewed publications describing humanities data or techniques with high potential for reuse”
Kaggle Datasets
https://www.kaggle.com/datasets
Eclectic collection of hundreds of datasets in many fields
Digital Text Collections
HathiTrust
Featured data sets: https://babel.hathitrust.org/cgi/mb?colltype=featured
https://www.hathitrust.org/datasets
Eighteenth Century Collections Online:
https://github.com/Text-Creation-Partnership/ECCO-TCP (in TEI)
https://old.datahub.io/dataset/tcp-ecco-18th-century-texts (in plain text)
Over 2,000 texts made available by the ECCO Text Creation Partnership
Internet Archive
Modern English Collection: Public domain texts digitized by the UVA Library
(link to download will be posted soon; titles can be browsed here)
Twitter Datasets
Documenting the Now:
(datasets of tweet IDs that can be rehydrated back into full tweets)
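Rehydrating means sending the stored tweet IDs back to a lookup service in batches and collecting the full tweets it returns. A minimal sketch of the batching step follows; it is not the method of any particular tool. The batch size of 100 is an assumption, and fetch_batch is a hypothetical function you would supply yourself to call whatever lookup service you have access to.

```python
from typing import Callable, Iterable, Iterator, List

# Assumption: the lookup service accepts a limited number of IDs per request.
# 100 is used here for illustration; check the limit of the service you use.
BATCH_SIZE = 100


def read_tweet_ids(lines: Iterable[str]) -> List[str]:
    """Keep one numeric tweet ID per line, skipping blanks, junk, and duplicates."""
    seen = set()
    ids = []
    for line in lines:
        tweet_id = line.strip()
        if tweet_id.isdigit() and tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


def batch_ids(ids: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def rehydrate(
    ids: List[str],
    fetch_batch: Callable[[List[str]], List[dict]],
    size: int = BATCH_SIZE,
) -> List[dict]:
    """Call the (hypothetical) fetch_batch once per batch and collect the results."""
    tweets = []
    for batch in batch_ids(ids, size):
        tweets.extend(fetch_batch(batch))
    return tweets
```

To use it, pass an open text file of IDs to read_tweet_ids and hand the result to rehydrate along with your own fetch_batch. Deleted or protected tweets are typically not returned, so the output may be shorter than the ID list.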
Museum Datasets
https://github.com/caesar0301/awesome-public-datasets#museums
Data Repositories
Dataverse:
https://dataverse.lib.virginia.edu/ (UVA’s Dataverse)
https://dataverse.harvard.edu/ (Harvard’s Dataverse; includes data from many institutions)
Share, publish, and archive your data. Find and cite data across all research fields.
Tools for Cleaning Your Data
OpenRefine: http://openrefine.org/