Parameterise the cost function #143

@sfluegel05

Suggestion

Currently, Popper implements the MDL cost function

$cost_{B, E}(h) = size(h) + fn_{E,B}(h) + fp_{E,B}(h)$.

As proposed by Hocquette et al., 2024, this can be extended to

$cost_{B, E}(h) = \alpha \cdot size(h) + \beta \cdot fn_{E,B}(h) + \gamma \cdot fp_{E,B}(h)$

where $\alpha$, $\beta$ and $\gamma$ are parameters set by the user. For use cases where false negatives are worse than false positives (or vice versa), this would allow users to adjust how "general" (fewer FNs, more FPs) or "specific" (fewer FPs, more FNs) the learned programs become.
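To make the weighted formula concrete, here is a minimal Python sketch. The function name `weighted_cost` and its keyword arguments are placeholders for illustration, not Popper's actual API, and the counts are made up.

```python
def weighted_cost(size, fn, fp, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted MDL-style cost: alpha * size + beta * fn + gamma * fp.

    With alpha = beta = gamma = 1 this reduces to the unweighted cost
    size + fn + fp.
    """
    return alpha * size + beta * fn + gamma * fp


# Hypothetical hypothesis: size 6, 3 false negatives, 1 false positive.
default_cost = weighted_cost(6, 3, 1)            # 6 + 3 + 1
fn_heavy_cost = weighted_cost(6, 3, 1, beta=4)   # 6 + 4*3 + 1

print(default_cost, fn_heavy_cost)
```

With the default weights the three terms count equally; raising `beta` makes each false negative more expensive relative to program size and false positives.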

Let me know if you think this would be a worthwhile addition to Popper.

Implementation

I have already implemented this on my fork (https://github.com/sfluegel05/Popper/) and used it for some experiments. With higher $\beta$ values I get higher recall; with higher $\gamma$ values I get better precision.
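As a toy illustration of why the weights shift the precision/recall trade-off, the sketch below picks the cheapest of two candidate hypotheses under different weights. The `Candidate` class, the `best` helper, and all counts are invented for this example; they are not taken from the fork or from the experiments mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of a hypothesis: its size and confusion counts."""
    name: str
    size: int
    tp: int
    fn: int
    fp: int

    def recall(self):
        return self.tp / (self.tp + self.fn)

    def precision(self):
        return self.tp / (self.tp + self.fp)


def best(candidates, alpha=1, beta=1, gamma=1):
    """Return the candidate minimising alpha*size + beta*fn + gamma*fp."""
    return min(candidates, key=lambda c: alpha * c.size + beta * c.fn + gamma * c.fp)


# Made-up candidates over 20 positive examples.
CANDIDATES = [
    Candidate("general", size=4, tp=18, fn=2, fp=6),
    Candidate("specific", size=4, tp=14, fn=6, fp=1),
]

for weights in ({}, {"beta": 3}, {"gamma": 3}):
    c = best(CANDIDATES, **weights)
    print(weights, c.name, round(c.recall(), 2), round(c.precision(), 2))
```

In this toy setup the default weights select the "specific" candidate, while `beta=3` penalises its six false negatives enough that the "general" candidate, which has higher recall, becomes cheaper.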

Since my fork is not based on version 5.0.0 of Popper, I would have to check whether the fork can still be merged directly into Popper. Otherwise, I could also create a fresh PR.
