Posted on 2022-07-25, 00:16. Authored by F Bohnert, Y Seroussi, I Zukerman.
Authorship attribution deals with identifying the authors of anonymous texts. Recently, we found that the Latent Dirichlet Allocation (LDA) topic model can be used to improve authorship attribution accuracy. We build on this finding and show that a previously suggested Author-Topic (AT) model outperforms LDA in scenarios with many authors. In addition, we define a model that combines LDA and AT by representing authors and documents over two disjoint topic sets, and show that our model outperforms LDA, AT and support vector machines on datasets with two to 19,320 authors.