Detection accuracy is measured against all 18 HIPAA Safe Harbor identifier categories using a synthetic annotated benchmark corpus. Data below are from the full Tenet pipeline: piiranha ML + regex augmentation + Presidio fallback + clinical NER.
All benchmarks use synthetic data. Real-world performance may vary. Biometric identifiers and full-face photographs are not covered.
AUC scores from ROC analysis across the full detection pipeline
| HIPAA Category | AUC | Method | |
|---|---|---|---|
| Dates (DOB) | 1.000 | ML + regex | |
| SSN | 0.999 | ML | |
| Email | 0.994 | ML | |
| Account Numbers | 0.991 | ML | |
| Names (Given + Surname) | 0.975 | ML + Presidio | |
| Phone/Fax | 0.949 | ML | |
| Geographic (Street, City, ZIP) | 0.932 | ML | |
| Other Unique IDs (Tax, CC) | 0.885 | ML + regex | |
| License Numbers | 0.650 | ML | |
| Device Identifiers | 0.642 | ML | |
| IP Addresses | — | Regex-only | |
| VIN | — | Regex-only | |
| Medical Record # | — | Regex-only | |
| Web URLs | — | Regex-only | |
| Biometric | — | Not covered |
8 of 10 ML-scored categories exceed AUC 0.88. The 6 HIPAA-critical types average 0.98 AUC.
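The summary figures can be rechecked directly from the table. The sketch below is illustrative only: it copies the ten ML-scored AUC values, counts how many exceed 0.88, and averages the six highest. The source does not say which six types count as HIPAA-critical, so treating them as the six highest-scoring categories is an assumption.

```python
# AUC values copied from the ML-scored rows of the table above.
auc_by_category = {
    "Dates (DOB)": 1.000,
    "SSN": 0.999,
    "Email": 0.994,  # the 0.994 row, named Email in the ROC panel list
    "Account Numbers": 0.991,
    "Names (Given + Surname)": 0.975,
    "Phone/Fax": 0.949,
    "Geographic (Street, City, ZIP)": 0.932,
    "Other Unique IDs (Tax, CC)": 0.885,
    "License Numbers": 0.650,
    "Device Identifiers": 0.642,
}

AUC_FLOOR = 0.88

# Categories whose AUC is strictly above the floor.
above = [name for name, auc in auc_by_category.items() if auc > AUC_FLOOR]

# ASSUMPTION: "HIPAA-critical" is taken to mean the six highest-scoring rows.
top_six = sorted(auc_by_category.values(), reverse=True)[:6]
top_six_mean = sum(top_six) / len(top_six)

print(f"{len(above)} of {len(auc_by_category)} categories exceed AUC {AUC_FLOOR}")
print(f"mean AUC of the six highest-scoring categories: {top_six_mean:.2f}")
```

Under that assumption the table values are consistent with both summary claims.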
Configured thresholds balance recall (catching PHI) against false positive rate (flagging clean data)
| Category | Threshold | Recall (TPR) | False Positive Rate |
|---|---|---|---|
| Dates | 0.94 | 100% | 0.0% |
| SSN | 0.97 | 100% | 0.4% |
| Email | 0.66 | 98.8% | 0.0% |
| Account Numbers | 0.82 | 98.1% | 0.0% |
| Names | 0.45 | 96.0% | 1.0% |
| Phone/Fax | 0.44 | 90.5% | 0.4% |
| Geographic | 0.51 | 86.7% | 0.8% |
Lower thresholds on Names and Geographic reflect inherent ambiguity: these categories have higher irreducible false positive rates than structured identifiers.
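To make the recall versus false positive trade-off concrete, here is a minimal sketch of how both rates follow from a single score threshold. The function name `rates_at_threshold`, the `>=` comparison, and the toy scores are assumptions for illustration; they are not taken from the Tenet pipeline or its benchmark data.

```python
from typing import Sequence


def rates_at_threshold(
    scores: Sequence[float], labels: Sequence[bool], threshold: float
) -> tuple[float, float]:
    """Return (recall, false_positive_rate) when scores >= threshold are flagged."""
    tp = fp = fn = tn = 0
    for score, is_phi in zip(scores, labels):
        flagged = score >= threshold
        if is_phi and flagged:
            tp += 1  # PHI correctly flagged
        elif is_phi:
            fn += 1  # PHI missed
        elif flagged:
            fp += 1  # clean text wrongly flagged
        else:
            tn += 1  # clean text correctly passed
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return recall, fpr


# Toy detector scores, invented for illustration (True = span really is PHI).
scores = [0.98, 0.91, 0.60, 0.47, 0.30, 0.52, 0.20, 0.10]
labels = [True, True, True, True, True, False, False, False]

strict = rates_at_threshold(scores, labels, 0.90)
lenient = rates_at_threshold(scores, labels, 0.45)
print(f"threshold 0.90: recall={strict[0]:.2f}, FPR={strict[1]:.2f}")
print(f"threshold 0.45: recall={lenient[0]:.2f}, FPR={lenient[1]:.2f}")
```

On the toy data, lowering the threshold raises recall and also raises the false positive rate, which is the trade-off the configured thresholds are balancing.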
Separate model evaluated on diagnosis, treatment, and lab test extraction from clinical notes
| Clinical Entity | AUC | Positives | Negatives |
|---|---|---|---|
| Diagnosis | 0.906 | 48 | 122 |
| Treatment | 0.903 | 41 | 129 |
| Lab Test | 0.745 | 14 | 156 |
The Lab Test AUC rests on only 14 positive samples, so interpret it with caution. Diagnosis and treatment detection are production-ready.
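One way to quantify how much the small positive count matters is the Hanley and McNeil (1982) approximation for the standard error of an AUC. The source does not say this method was used; it is applied here only as a rough sketch using the counts from the table, and the function name is hypothetical.

```python
import math


def hanley_mcneil_se(auc: float, n_pos: int, n_neg: int) -> float:
    """Approximate standard error of an AUC (Hanley & McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc**2)
        + (n_neg - 1) * (q2 - auc**2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


# (AUC, positives, negatives) copied from the clinical entity table.
clinical = {
    "Diagnosis": (0.906, 48, 122),
    "Treatment": (0.903, 41, 129),
    "Lab Test": (0.745, 14, 156),
}

for name, (auc, n_pos, n_neg) in clinical.items():
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    print(f"{name}: AUC {auc:.3f}, approx. SE {se:.3f}")
```

Under this approximation the Lab Test estimate carries a noticeably larger standard error than Diagnosis or Treatment, which supports reading it with caution.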
Full receiver operating characteristic curves for each HIPAA identifier category and clinical entity type
[Figure: ROC curves. All HIPAA Categories: Names (AUC 0.975), Dates / DOB (AUC 1.000), Email (AUC 0.994), SSN (AUC 0.999), Phone / Fax (AUC 0.949), Geographic (AUC 0.932), Account Numbers (AUC 0.991)]
[Figure: ROC curves. All Clinical Entities: Diagnosis (AUC 0.906), Treatment (AUC 0.903), Lab Test (AUC 0.745)]
The benchmark suite fails CI if any of these thresholds are not met
70% minimum — HIPAA-critical types are hard failures
15% maximum including adversarial samples
60% minimum across all entity types
Minimum 8 of 18 HIPAA identifier categories
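A gate of this kind can be sketched as a small table of bounds checked against measured values. This is not the project's actual CI code. This section does not state which metric each threshold applies to, so the gate names below are hypothetical labels derived from the wording of each line, and the example measurements are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    name: str     # hypothetical label; the underlying metric is not stated here
    limit: float
    kind: str     # "min": value must be >= limit, "max": value must be <= limit

    def passes(self, value: float) -> bool:
        return value >= self.limit if self.kind == "min" else value <= self.limit


# Bounds copied from the list above; names are assumptions.
GATES = [
    Gate("hipaa_critical_types", 0.70, "min"),
    Gate("with_adversarial_samples", 0.15, "max"),
    Gate("all_entity_types", 0.60, "min"),
    Gate("hipaa_categories_covered", 8, "min"),
]


def failed_gates(measured: dict[str, float]) -> list[str]:
    """Return the names of gates whose value is missing or out of bounds."""
    failures = []
    for gate in GATES:
        value = measured.get(gate.name)
        if value is None or not gate.passes(value):
            failures.append(gate.name)
    return failures


# Invented measurements, for illustration only.
example = {
    "hipaa_critical_types": 0.93,
    "with_adversarial_samples": 0.04,
    "all_entity_types": 0.58,
    "hipaa_categories_covered": 14,
}

failures = failed_gates(example)
print("FAIL: " + ", ".join(failures) if failures else "PASS")
```

In a CI job, a non-empty `failures` list would typically be turned into a non-zero exit code so the build stops. Treating a missing measurement as a failure is a design choice made here so that a silently skipped benchmark cannot pass the gate.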