AI-Generated Text Detection

Twenty-eight statistical features decide human vs machine. Paste text or pick a sample and watch the classifier reason.

top discriminative features / mRMR

Whitespace ratio, readability, and vocabulary richness carry the most signal.

the pipeline

From raw data to a verifiable result

01 / dataset
HC3 human vs ChatGPT
The Human ChatGPT Comparison Corpus pairs human-expert and ChatGPT answers across open QA, finance, medicine, wiki/CS-AI, and Reddit ELI5. After filtering, 84,391 texts remain: 57,552 human and 26,839 AI.
02 / features
28 stylometric features
Every text is reduced to 28 model-agnostic surface features across four categories, capturing vocabulary richness, punctuation habits, readability, and distributional structure.
8 lexical, 7 syntactic, 5 readability, 8 distributional
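
To make the idea of a surface feature concrete, here is a minimal sketch of three of the features named on this page: whitespace ratio, vocabulary richness, and Zipf's coefficient. The exact definitions behind the demo are not given here, so the formulas below are assumptions: vocabulary richness is taken as a plain type-token ratio, Zipf's coefficient as the least-squares slope of log frequency against log rank, and the tokenizer is a simple regex. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def whitespace_ratio(text: str) -> float:
    """Fraction of characters that are whitespace."""
    if not text:
        return 0.0
    return sum(ch.isspace() for ch in text) / len(text)


def type_token_ratio(tokens: list[str]) -> float:
    """Distinct tokens divided by total tokens (assumed richness measure)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def zipf_coefficient(tokens: list[str]) -> float:
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        return 0.0  # a slope needs at least two distinct word types
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov_xy / var_x


def extract_features(text: str) -> dict[str, float]:
    """Reduce a text to a small feature dict (3 of the 28, as an illustration)."""
    tokens = re.findall(r"[a-z']+", text.lower())  # assumed tokenizer
    return {
        "whitespace_ratio": whitespace_ratio(text),
        "type_token_ratio": type_token_ratio(tokens),
        "zipf_coefficient": zipf_coefficient(tokens),
    }
```

Calling `extract_features("the cat sat on the mat")` returns one value per feature. None of these functions depends on which model produced the text, which is what "model-agnostic" means here.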
03 / selection
mRMR and RFE agree
Two independent methods, a filter (mRMR) and a wrapper (RFE), each select 15 features. Their agreement on whitespace ratio, vocabulary richness, and Zipf's coefficient is strong evidence those features genuinely matter.
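
The filter side can be sketched as a greedy loop: at each step, pick the feature with the highest relevance to the label minus its average redundancy with the features already chosen. This page does not say which mRMR variant the demo uses. Many formulations score both terms with mutual information; the sketch below swaps in absolute Pearson correlation so it needs no dependencies. The function names and toy data are hypothetical. (For the wrapper side, scikit-learn provides `sklearn.feature_selection.RFE`.)

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def mrmr_select(columns, labels, k):
    """Greedy mRMR (difference form) with |correlation| as an assumed stand-in score.

    columns: dict mapping feature name -> list of values
    labels:  list of 0/1 class labels
    k:       number of features to keep
    """
    relevance = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    selected = []
    remaining = list(columns)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(columns[name], columns[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up for illustration): x1_dup is a rescaled copy of x1.
demo_labels = [0, 0, 0, 0, 1, 1, 1, 1]
demo_columns = {
    "x1": [1, 2, 1, 2, 3, 4, 3, 4],
    "x1_dup": [2, 4, 2, 4, 6, 8, 6, 8],
    "x2": [1, 1, 2, 2, 2, 2, 3, 3],
}
demo_selected = mrmr_select(demo_columns, demo_labels, k=2)
```

In the toy data the duplicated column is as relevant as the original, but the redundancy penalty pushes it behind `x2` on the second pick. That penalty is the point of mRMR compared with ranking by relevance alone.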
04 / training
Four classifiers, grid search
SVM, AdaBoost, Decision Tree, and Random Forest, each tuned via 5-fold stratified GridSearchCV over a StandardScaler pipeline.
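
A training setup of this shape could look roughly like the following scikit-learn sketch. The hyperparameter grids, the scoring metric, and the synthetic data are all assumptions, since the actual values are not listed on this page. The one structural point it illustrates is that the scaler sits inside the pipeline, so each cross-validation fold fits `StandardScaler` on its own training split only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real feature matrix (not the HC3 data).
# The class weights roughly mirror the 57,552 / 26,839 human/AI split.
X, y = make_classification(
    n_samples=300, n_features=28, n_informative=8,
    weights=[0.68, 0.32], random_state=0,
)

# Hypothetical, deliberately small grids; the real ones are not given.
candidates = {
    "svm": (SVC(), {"clf__C": [0.1, 1.0, 10.0]}),
    "adaboost": (AdaBoostClassifier(random_state=0),
                 {"clf__n_estimators": [25, 50]}),
    "decision_tree": (DecisionTreeClassifier(random_state=0),
                      {"clf__max_depth": [3, 5, None]}),
    "random_forest": (RandomForestClassifier(random_state=0),
                      {"clf__n_estimators": [50, 100]}),
}

# Stratified folds keep the class ratio the same in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, (estimator, grid) in candidates.items():
    pipeline = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipeline, grid, cv=cv, scoring="accuracy")  # metric assumed
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)
```

After the loop, `results` maps each classifier name to its best mean cross-validated score and the parameters that produced it.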
05 / inference
Score a passage
Below: paste text or choose a sample. The demo surfaces the nearest curated example, its verdict, and the feature signals that drove it.
06 / evaluation
The generalization cliff
In-domain the model is excellent. Move to a different generator (Bloomz) and it drops below chance. The feature fingerprints are generator-specific, not universal.
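
"Below chance" only has meaning relative to a baseline. A minimal sketch of that comparison follows, assuming accuracy as the metric and the majority-class share as the chance level; neither choice is stated on this page. The labels and predictions are made-up toy values, not results from the demo.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_true):
    """Accuracy of always predicting the most common class (assumed chance level)."""
    return max(Counter(y_true).values()) / len(y_true)


def report(y_true, y_pred):
    """Summarize accuracy against the majority-class baseline."""
    acc = accuracy(y_true, y_pred)
    base = majority_baseline(y_true)
    return {"accuracy": acc, "baseline": base, "below_chance": acc < base}


# Toy values for illustration only (0 = human, 1 = AI).
in_domain = report(
    y_true=[0, 0, 1, 1, 0, 1, 0, 1],
    y_pred=[0, 0, 1, 1, 0, 1, 1, 1],
)
cross_generator = report(
    y_true=[0, 1, 0, 1, 0, 1, 0, 1],
    y_pred=[1, 0, 1, 1, 0, 0, 1, 0],
)
```

A score under the baseline means the classifier is systematically wrong rather than merely uninformative. That is consistent with learned feature directions that do not carry over to the new generator.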
