Tag: Digital Humanities - Jan Švec

LLM-Based Metadata Extraction from NATO Scanned Documents

June 2, 2026

Large Language Models, Digital Humanities

Historical archives contain valuable evidence, but scanned documents are difficult to search when their metadata is incomplete or inconsistent. At the C4DHI Anniversary Workshop, I presented a workflow that uses large language models to extract structured metadata from scanned NATO archival documents. The talk focused on noisy OCR, multilingual records and the need to preserve evidence for human review.

Agentic AI for Digital Humanities

26 April - 12 May 2026

Applied AI, Large Language Models, Digital Humanities

Complex research tasks do not fit into a single prompt: they need tools, intermediate checks and an inspectable sequence of steps. These workshop materials introduce agentic AI as a workflow for digital humanities and archival research. They connect an Oxford research stay, a CLARIN collaboration and practical work with NATO archival documents.
