AI-Generated Text Detection
Twenty-eight statistical features decide human vs machine. Paste text or pick a sample and watch the classifier reason.
- HC3 accuracy: 97.40% (Random Forest)
- macro F1: 0.9700 (Random Forest)
- cross-gen: 20-30% (on Bloomz, below chance)
- top feature: whitespace (MI 0.255)
- train corpus: HC3 (84,391 texts, 5 domains)
- features: 28 (lexical / syntactic / readability / distributional)
- selection: mRMR + RFE (each picks 15)
- compute: CPU-only (no GPU inference)
interactive / live in your browser
top discriminative features / mRMR
Whitespace ratio, readability, and vocabulary richness carry the most signal.
the pipeline
From raw data to a verifiable result
- 01 / dataset
HC3 human vs ChatGPT
The Human ChatGPT Comparison Corpus pairs human-expert and ChatGPT answers across open QA, finance, medicine, wiki/CS-AI, and Reddit ELI5. After filtering, the corpus holds 84,391 texts: 57,552 human and 26,839 AI.
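The counts above imply roughly a 2:1 class imbalance, which may be part of why macro F1 is reported next to accuracy. A small sketch of the arithmetic, using only the figures quoted above:

```python
# Counts taken from the corpus description above.
counts = {"human": 57_552, "ai": 26_839}
total = sum(counts.values())

# Share of each class, and the accuracy of always guessing the majority class.
shares = {label: n / total for label, n in counts.items()}
majority_baseline = max(shares.values())

print(total)
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"majority-class baseline accuracy: {majority_baseline:.1%}")
```

A classifier that always answers "human" would already land near the majority share, so the in-domain score is best read against that baseline rather than against 50%.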
- 02 / features
28 stylometric features
Every text is reduced to 28 model-agnostic surface features across four categories, capturing vocabulary richness, punctuation habits, readability, and distributional structure.
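To make "surface features" concrete, here is a minimal sketch of three of them: whitespace ratio, a type-token ratio as a vocabulary-richness proxy, and a Zipf coefficient fitted by least squares. The exact definitions, the tokenizer regex, and the function names are assumptions for illustration; the page does not spell out how each of the 28 features is computed.

```python
import math
import re
from collections import Counter


def whitespace_ratio(text: str) -> float:
    """Fraction of characters that are whitespace."""
    if not text:
        return 0.0
    return sum(ch.isspace() for ch in text) / len(text)


def type_token_ratio(tokens: list[str]) -> float:
    """Distinct tokens divided by total tokens (a vocabulary-richness proxy)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def zipf_coefficient(tokens: list[str]) -> float:
    """Negated least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        return 0.0
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


def extract_features(text: str) -> dict[str, float]:
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "whitespace_ratio": whitespace_ratio(text),
        "type_token_ratio": type_token_ratio(tokens),
        "zipf_coefficient": zipf_coefficient(tokens),
    }


features = extract_features("the cat sat on the mat")
print(features)
```

None of these functions needs a language model or a GPU, which fits the CPU-only claim above.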
8 lexical, 7 syntactic, 5 readability, 8 distributional
- 03 / selection
mRMR and RFE agree
Two independent methods, a filter (mRMR) and a wrapper (RFE), each select 15 features. Their agreement on whitespace ratio, vocabulary richness, and Zipf's coefficient is strong evidence those features genuinely matter.
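The filter-versus-wrapper comparison can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the data is synthetic, the greedy `mrmr_select` helper is hypothetical and approximates redundancy with absolute Pearson correlation rather than feature-to-feature mutual information, and a Random Forest is assumed as the RFE estimator.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, mutual_info_classif


def mrmr_select(X, y, k, random_state=0):
    """Greedy mRMR sketch: relevance minus mean redundancy with picks so far."""
    relevance = mutual_info_classif(X, y, random_state=random_state)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        scores = [relevance[j] - corr[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected


# Synthetic stand-in for the 28-column feature matrix (not HC3 data).
X, y = make_classification(n_samples=400, n_features=28, n_informative=8, random_state=0)

# Filter method: rank by mutual information, penalize redundancy.
mrmr_idx = set(mrmr_select(X, y, k=15))

# Wrapper method: repeatedly fit the model and drop the weakest feature.
rfe = RFE(RandomForestClassifier(n_estimators=50, random_state=0), n_features_to_select=15)
rfe.fit(X, y)
rfe_idx = set(np.flatnonzero(rfe.support_).tolist())

agreed = sorted(mrmr_idx & rfe_idx)
print(f"{len(agreed)} features chosen by both methods: {agreed}")
```

Two 15-feature subsets of 28 must share at least two features by counting alone, so the interesting signal is which features overlap and how highly both methods rank them, not the mere fact of overlap.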
- 04 / training
Four classifiers, grid search
SVM, AdaBoost, Decision Tree, and Random Forest are each tuned with GridSearchCV using 5-fold stratified cross-validation, inside a pipeline that applies StandardScaler before the classifier.
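A minimal sketch of that setup for one of the four classifiers, using scikit-learn. The parameter grid, the `f1_macro` scoring choice, and the synthetic data are assumptions; the page does not list the values that were searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the selected feature columns (not HC3 data).
X, y = make_classification(n_samples=300, n_features=15, weights=[0.68, 0.32], random_state=0)

# Scaling lives inside the pipeline so each fold is scaled on its own training split.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid, chosen only to keep the example fast.
param_grid = {
    "clf__n_estimators": [50, 100],
    "clf__max_depth": [None, 10],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="f1_macro",
)
search.fit(X, y)

print(search.best_params_)
print(f"cross-validated macro F1: {search.best_score_:.3f}")
```

Swapping in SVM, AdaBoost, or a Decision Tree would mean replacing the `clf` step and its grid keys; the scaler and the cross-validation object stay the same.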
- 05 / inference
Score a passage
Below: paste text or choose a sample. The demo surfaces the nearest curated example, its verdict, and the feature signals that drove it.
- 06 / evaluation
The generalization cliff
In-domain, the model is excellent: Random Forest reaches 97.40% accuracy on HC3. On text from a different generator (Bloomz), accuracy falls to 20-30%, below chance. The feature fingerprints are generator-specific, not universal.
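For reference, the two headline metrics can be computed as below. The labels are toy values invented for illustration (they are not evaluation outputs from this project); they are arranged so a balanced set scores below the 50% chance level, which is the shape of failure described above.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy balanced set (1 = AI, 0 = human); only 2 of 10 predictions are correct.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [0, 0, 0, 0, 1, 1, 1, 0, 1, 1]

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

On a two-class problem, scoring well below 50% means the predictions are systematically inverted rather than random, which is consistent with features that encode one generator's habits instead of a general human-versus-machine signal.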
evaluation artifacts