Comparison & Similarity: Using Machine Learning to Study a Large Collection of Russian Diaries
This talk explores how machine learning, specifically transformer-based large language models (LLMs), can be used to analyze an extensive collection of Russian historical diaries (1800-2018). By representing texts numerically as embeddings, LLMs enable computational methods such as semantic text similarity and clustering. These methods can surface significant topics and subjects within the diaries, such as prices and weather, offering new possibilities for digital scholarship.
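
To make the idea of similarity and clustering over numerical representations concrete, here is a minimal sketch in Python. It is not the method used in the talk. The three-dimensional vectors, the entry labels, the 0.8 threshold, and the `greedy_clusters` helper are all hypothetical stand-ins; real embeddings would be produced by a transformer model and have hundreds of dimensions.

```python
import math

# Hypothetical toy vectors standing in for model-generated embeddings
# of diary entries. The labels and values are invented for illustration.
toy_embeddings = {
    "bread_price": [0.9, 0.1, 0.0],
    "market_cost": [0.8, 0.2, 0.1],
    "heavy_snow": [0.1, 0.9, 0.2],
    "cold_wind": [0.0, 0.8, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def greedy_clusters(embeddings, threshold=0.8):
    """Put each entry in the first cluster whose seed entry is similar enough.

    A deliberately simple, hypothetical grouping rule; the threshold is an
    assumed value, not one taken from the talk.
    """
    clusters = []  # each cluster is a list of labels; the first label is its seed
    for name, vector in embeddings.items():
        for cluster in clusters:
            seed_vector = embeddings[cluster[0]]
            if cosine_similarity(vector, seed_vector) >= threshold:
                cluster.append(name)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([name])
    return clusters


clusters = greedy_clusters(toy_embeddings)
for cluster in clusters:
    print(cluster)
```

With these toy vectors, the two price-related entries fall into one group and the two weather-related entries into another, which mirrors on a very small scale how topics such as prices and weather could emerge from a diary collection.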