Suggestion
Currently, Popper implements the MDL cost function
$cost_{B, E}(h) = size(h) + fn_{E,B}(h) + fp_{E,B}(h)$.
As proposed by Hocquette et al., 2024, this can be extended to
$cost_{B, E}(h) = \alpha size(h) + \beta fn_{E,B}(h) + \gamma fp_{E,B}(h)$
where $\alpha$, $\beta$ and $\gamma$ are parameters set by the user. With $\alpha = \beta = \gamma = 1$, this reduces to the current cost function. For use cases where false negatives are worse than false positives (or vice versa), this would allow users to adjust how "general" (fewer FNs, more FPs) or "specific" (fewer FPs, more FNs) the learned programs become.
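To illustrate the effect of the weights, here is a minimal sketch of the weighted cost. It is not Popper's actual code: `CostWeights`, `weighted_cost`, and the two candidate programs with their counts are hypothetical and only serve to show how a higher $\beta$ can change which program is preferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostWeights:
    """Hypothetical container for the user-set weights (not part of Popper's API)."""
    alpha: float = 1.0  # weight on program size
    beta: float = 1.0   # weight on false negatives
    gamma: float = 1.0  # weight on false positives


def weighted_cost(size: int, fn: int, fp: int,
                  weights: CostWeights = CostWeights()) -> float:
    """Return alpha * size + beta * fn + gamma * fp."""
    return weights.alpha * size + weights.beta * fn + weights.gamma * fp


# Two made-up candidate programs, each given as (size, fn, fp).
general = (4, 1, 6)   # few false negatives, many false positives
specific = (4, 5, 1)  # many false negatives, few false positives

default = CostWeights()               # alpha = beta = gamma = 1, the current cost
recall_first = CostWeights(beta=3.0)  # penalise false negatives more heavily

# With the default weights, the "specific" program is cheaper (10.0 vs 11.0).
print(weighted_cost(*general, default), weighted_cost(*specific, default))

# With beta = 3, the "general" program is cheaper (13.0 vs 20.0).
print(weighted_cost(*general, recall_first), weighted_cost(*specific, recall_first))
```

In this toy example, the default weights favour the "specific" program, while raising $\beta$ flips the preference to the "general" one.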
Let me know if you think this would be a worthwhile addition to Popper.
Implementation
I have already implemented this on my fork, https://github.com/sfluegel05/Popper/, and used it for some experiments. With higher $\beta$ values, I get higher recall; with higher $\gamma$ values, I get better precision.
Since my fork is not based on version 5.0.0 of Popper, I would have to check whether it can still be merged directly into Popper. If not, I could create a fresh PR instead.