Overview Publications Github Contribute Annotation Examples

Isabelle Parallel Corpus

Overview

A parallel corpus is a collection of scripts in one language paired to their counterparts in another language.

The Isabelle Parallel Corpus (IPC) is a community-driven initiative to create a parallel corpus of Isabelle artefacts. The IPC pairs formal artefacts (e.g., theorems, lemmata, definitions etc) in Isabelle to their natural language counterparts.

Entries in the IPC have multiple levels of data annotation. On the first level, natural language statements of an artefact (e.g., statement of a theorem) and the natural language source (e.g., a textbook) are logged. Level two annotation pairs sentences in the natural language proof to statements in the corresponding Isabelle proof script. Level three annotation allows for the alignment of words in the natural language to tokens in the Isabelle proof script.

The IPC is a living corpus with ongoing contributions from the community and with credited authorship (see our Hall of Contributors).

If you are actively working on formalising mathematical results from textbooks, papers or even simple problems found online, please consider making a contribution to the IPC.

We accept formalised problems of all difficulties. To learn more what information we need with your contribution, please see the contribute section.