Enron sample — enron.sample • idiolect

A small sample of the Enron email corpus comprising ten authors, each with approximately the same amount of text. The data were pre-processed with the POSnoise algorithm to mask content (see contentmask()).

Usage

enron.sample

Format

A quanteda corpus object.
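
Because the object is a quanteda corpus, it can be inspected with the standard quanteda accessors. The following is a minimal sketch, assuming the idiolect and quanteda packages are installed. The names of the document variables are not stated on this page, so the code lists them instead of assuming any particular column.

```r
# Minimal sketch: inspect the sample corpus with standard quanteda accessors.
# Assumes idiolect and quanteda are installed; enron.sample ships with idiolect.
library(idiolect)
library(quanteda)

# Number of documents (texts) in the sample
ndoc(enron.sample)

# Names of the document-level variables attached to the corpus
# (not documented on this page, so list them rather than assume)
names(docvars(enron.sample))

# Per-document token and sentence counts for the first five texts
summary(enron.sample, n = 5)
```

Once the document variables are known, a call such as docvars(enron.sample, "<name>") returns a single variable, which can be tabulated to check how texts are distributed across the ten authors.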

Source

Halvani, Oren. 2021. Practice-Oriented Authorship Verification. Technical University of Darmstadt PhD Thesis. https://tuprints.ulb.tu-darmstadt.de/19861/