Playlist "28C3: behind enemy lines"

Deceiving Authorship Detection

Michael Brennan and Rachel Greenstadt

Stylometry is the art of identifying the author of a document from the linguistic style of its text. As authorship recognition methods based on machine learning have improved, they have also become a threat to privacy and anonymity. We have developed two open-source tools, Stylo and Anonymouth, which we will release at 28C3 and introduce in this talk. Anonymouth helps individuals obfuscate documents to protect their identity from authorship analysis. Stylo is a machine-learning-based authorship detection research tool that provides the basis for Anonymouth's decision making. We will also review the problem of stylometry and its privacy implications; present new research on detecting writing style deception and on threats to anonymity in short message services like Twitter; examine the implications for languages other than English; and release a large adversarial stylometry corpus for linguistic and privacy research.