Issues reproducing Precision/Recall/F1/F2 on the i2b2 dataset #11

soulaven · 2021-02-19T20:17:53Z

Hi,

Thank you for developing and releasing this package. I followed steps 0, 2a, 1b, and 1c using the PHI config file, and then 2d with prod=True. To calculate the scores, following my understanding of the paper, I split all PHI text at the word level and sanitized edge cases such as "," and "." at the end of words (otherwise the stats are much lower). However, I was only able to achieve Precision 0.696, Recall 0.915, F1 0.791, and F2 0.861 on the test set, which is some way off the statistics reported for the i2b2 test set in the paper. I am most likely missing something, but I am unsure what it is.
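For reference, here is a minimal sketch of the word-level scoring described above, so the calculation itself can be checked. It is not the package's evaluation code: the function names `phi_tokens`, `f_beta`, and `score` are hypothetical, and matching tokens as a bag of words (ignoring their position in the note) is a simplifying assumption that may differ from how the paper counts matches.

```python
import string
from collections import Counter


def phi_tokens(spans):
    """Split PHI text spans into words and strip punctuation from word edges."""
    tokens = []
    for span in spans:
        for word in span.split():
            cleaned = word.strip(string.punctuation)  # "Smith," -> "Smith"
            if cleaned:  # drop tokens that were punctuation only
                tokens.append(cleaned)
    return tokens


def f_beta(precision, recall, beta):
    """Standard F-beta score; beta=1 gives F1, beta=2 weights recall higher."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def score(gold_spans, predicted_spans):
    """Word-level precision/recall/F1/F2 (assumption: bag-of-words matching)."""
    gold = Counter(phi_tokens(gold_spans))
    pred = Counter(phi_tokens(predicted_spans))
    tp = sum((gold & pred).values())  # tokens present in both multisets
    fp = sum(pred.values()) - tp      # predicted tokens not in gold
    fn = sum(gold.values()) - tp      # gold tokens that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f_beta(precision, recall, 1),
        "f2": f_beta(precision, recall, 2),
    }
```

Plugging the precision and recall above into `f_beta` gives the same F1 and F2 to three decimals, so the F-score arithmetic is at least internally consistent; any gap to the paper would have to come from how true positives, false positives, and false negatives are counted.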

RedChrists · 2022-06-01T20:31:14Z

In addition to step 0, a manual review of the results may be necessary to confirm that the missed i2b2 tags are actual PHI according to HIPAA.
