Skip to content

fix(data): align filter() kwarg between Python stub and pybind#1511

Open
Ace3Z wants to merge 1 commit into
facebookresearch:mainfrom
Ace3Z:fix/filter-predicate-kwarg
Open

fix(data): align filter() kwarg between Python stub and pybind#1511
Ace3Z wants to merge 1 commit into
facebookresearch:mainfrom
Ace3Z:fix/filter-predicate-kwarg

Conversation

@Ace3Z

@Ace3Z Ace3Z commented May 25, 2026

Copy link
Copy Markdown

Follow up to #1510 for the same bug class.

The Python stub at src/fairseq2/data/data_pipeline.py:275 declares def filter(self, predicate: ...), but the pybind binding at native/python/src/fairseq2n/bindings/data/data_pipeline.cc:516 exposed the arg as py::arg("fn"). So calling pipeline.filter(predicate=...) blows up:

TypeError: filter(): incompatible function arguments. The following argument types are supported:
    1. (..., fn: Callable[[Any], bool]) -> ...
Invoked with: ...; kwargs: predicate=<lambda>

Renamed the pybind kwarg from fn to predicate to match the documented Python signature. I grep'd every .filter( call across src/, tests/, examples/, and recipes/: there are 17 callers, all of them pass the predicate positionally, none use fn=. So no internal breakage.

Added a regression test in tests/unit/data/data_pipeline/test_filter.py that calls filter with predicate= as a kwarg. Built fairseq2n from source on Linux + Python 3.12 + Torch 2.12 and ran the A/B cycle: test fails against the unpatched binding with the exact TypeError above, passes against the patched one. The other two tests in TestFilterOp are unchanged and still pass.

As promised in the description of #1510.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 25, 2026
@Ace3Z Ace3Z force-pushed the fix/filter-predicate-kwarg branch 2 times, most recently from 12a1b86 to e61bf98 Compare May 25, 2026 10:16
The Python stub `DataPipelineBuilder.filter` declared `predicate` while
the C++ pybind binding used `py::arg("fn")`, so calling
`pipeline.filter(predicate=...)` (the documented form) raised
`TypeError: incompatible function arguments`.

Same bug class as facebookresearch#1103 and the fix that landed in facebookresearch#1510 for
`dynamic_bucket`. Rename the pybind keyword from `fn` to `predicate`
so the runtime matches the documented Python signature. No internal
callsite uses the keyword form, so this is source-compatible for all
existing callers.

Follow-up to facebookresearch#1510.
@Ace3Z Ace3Z force-pushed the fix/filter-predicate-kwarg branch from e61bf98 to b1e1e2c Compare May 25, 2026 10:16
@Ace3Z

Ace3Z commented May 29, 2026

Copy link
Copy Markdown
Author

Friendly ping. This is the sibling of #1510, the same one line kwarg fix but applied to filter(). @cirquit @cbalioglu a quick look when you get a chance would be great.

@Ace3Z

Ace3Z commented May 29, 2026

Copy link
Copy Markdown
Author

Friendly ping. @cbalioglu, would you have a moment to look at this? Tiny pybind/stub alignment fix. Happy to address any feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant