v1.0 alpha: Results and Statistics

Kathleen Muenzen edited this page May 23, 2018 · 2 revisions


The tag-specific recalls, category-specific recalls, false negatives, and false positives for v1.0-alpha of PHIlter can be found on this Google Sheet. Of the remaining false negatives, the following word types were not considered "critical PHI" for this release of PHIlter:

  1. Days of the week
  2. Years
  3. Holidays
  4. Device models
  5. Provider and patient initials
  6. Hospital department names
  7. Some hospital names
  8. Organization names
  9. Isolated state abbreviations
  10. Isolated street numbers

Performance Statistics

  • True Negatives: 306284
  • True Positives: 20402
  • False Negatives: 267
  • False Positives: 6993

  • Global Recall (before category correction): 98.71%
  • Global Recall (after category correction): 98.99%
  • Global Precision: 74.47%
  • Global Retention: 97.77%
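The global figures can be recomputed from the four counts. The sketch below is not taken from the PHIlter code base: the function name `summarize` and the formulas are assumptions, using the standard definitions of recall and precision and treating retention as TN / (TN + FP). With the counts reported on this page, those formulas reproduce the 98.71% recall (before category correction), 74.47% precision, and 97.77% retention. The after-correction recall depends on which false negatives were reclassified, which is not listed here, so it is not recomputed.

```python
def summarize(tn: int, tp: int, fn: int, fp: int) -> dict:
    """Compute global metrics from token-level counts.

    Assumed definitions (not confirmed by the PHIlter source):
      recall    = TP / (TP + FN)   share of PHI tokens that were caught
      precision = TP / (TP + FP)   share of removed tokens that were PHI
      retention = TN / (TN + FP)   share of non-PHI tokens that were kept
    """
    return {
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "retention": tn / (tn + fp),
    }


# Counts as reported for v1.0-alpha on this page.
stats = summarize(tn=306284, tp=20402, fn=267, fp=6993)

for name, value in stats.items():
    # Print each metric as a percentage with two decimals.
    print(f"Global {name.capitalize()}: {value * 100:.2f}%")
```

Keeping the raw counts next to the percentages makes it straightforward to recompute the metrics if the counts change in a later release.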