Background
needs_human_review is triggered by two conditions: leakage_mass > flag_leakage_mass_above (default 2.0) or any_high_leaked=True. Investigation of TAB runs showed all flagged rows were triggered by any_high_leaked, not leakage mass.
The entities flagged as HIGH sensitivity were:
- court_name (e.g. 'District Court', 'Budapest District Court'), fixed by the domain supplements in PR #94 (feat: Improve Sensitivity Disposition supplements)
- legal_role (e.g. 'defendant_in_recovery_action')
- former_senator (occupation type)
These are quasi-identifiers at most. Flagging them as HIGH causes needs_human_review=True even when leakage mass is low and utility is good (e.g. utility=0.97, leakage=0.96).
Questions to resolve
- Should any_high_leaked alone be sufficient to trigger needs_human_review, or should it require a minimum leakage mass too?
Related
- court_name via LEGAL domain supplements
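To make the open question concrete, here is a minimal sketch of the two rules side by side. It is not the project's actual code. RowMetrics, needs_human_review_with_floor, and min_mass_for_high are hypothetical names, and the 1.0 floor is an arbitrary placeholder. Only leakage_mass, any_high_leaked, and flag_leakage_mass_above (default 2.0) come from the issue text. The sample row assumes that "leakage=0.96" in the Background refers to leakage mass.

```python
from dataclasses import dataclass


@dataclass
class RowMetrics:
    """Hypothetical per-row summary; field names mirror the issue text."""
    leakage_mass: float
    any_high_leaked: bool


def needs_human_review(row, flag_leakage_mass_above=2.0):
    """Rule as described in Background: either condition alone flags the row."""
    return row.leakage_mass > flag_leakage_mass_above or row.any_high_leaked


def needs_human_review_with_floor(row, flag_leakage_mass_above=2.0,
                                  min_mass_for_high=1.0):
    """Possible alternative: a HIGH leak only counts above a minimum leakage mass.

    min_mass_for_high is a hypothetical parameter; 1.0 is a placeholder value.
    """
    over_mass = row.leakage_mass > flag_leakage_mass_above
    high_and_material = (row.any_high_leaked
                         and row.leakage_mass >= min_mass_for_high)
    return over_mass or high_and_material


# Row resembling the example from Background: low leakage mass, one HIGH entity.
row = RowMetrics(leakage_mass=0.96, any_high_leaked=True)
print(needs_human_review(row))             # True
print(needs_human_review_with_floor(row))  # False
```

Under the second rule, rows like the TAB example would stop being flagged, while rows exceeding flag_leakage_mass_above are still flagged regardless of sensitivity. Whether a floor is appropriate, and at what value, is exactly what the question above leaves open.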