THRIFT-6069: python: use a flat fastbinary encode buffer#3596
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR adds coverage for Thrift’s accelerated binary protocol round-tripping a large TApplicationException, and refactors the Python C-extension encode buffer implementation away from std::vector to a manual malloc/realloc-backed buffer.
Changes:
- Add a unit test that exercises
_fast_encode/_fast_decodeviaTBinaryProtocolAcceleratedFactoryfor a large exception payload. - Replace
EncodeBuffer’sstd::vector<char>storage with a manual heap buffer and update protocol write logic to usememcpy. - Add required C/C++ headers for the new buffer implementation.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| lib/py/test/thrift_TBinaryProtocol.py | Adds accelerated large-message roundtrip test coverage for the fastbinary path. |
| lib/py/src/ext/types.h | Refactors EncodeBuffer to raw malloc/realloc storage with destructor-based cleanup. |
| lib/py/src/ext/protocol.tcc | Updates encode buffer allocation and write path to use the new EncodeBuffer API. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
As long as they adhere to AGENTS.md all is fine. |
Code reviewFound 1 issue:
Lines 57 to 71 in 35c1a53 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
|
Hi, any plans to continue here? |
Client: py Replace the std::vector<char> back_inserter writes in the Python 3 fastbinary encoder with a malloc/realloc-backed buffer so encoded bytes can be copied in bulk while preserving the existing growth behavior. EncodeBuffer is non-copyable (its move/copy ctors are deleted), accepts a zero initial capacity without depending on implementation-defined malloc(0) behavior, and guards capacity growth against size_t overflow on both the running total and the doubling step. Allocation failures surface as Python MemoryError. Add a fastbinary round-trip test for a large TApplicationException payload to exercise the new buffer-growth path. The Thrift spec metadata used by the test is an immutable tuple. Performance (50k iterations, warmed): | Workload | Baseline | This commit | Speedup | |---------------------------|----------|-------------|---------| | encode simple (30B) | 0.60 us | 0.53 us | 1.13x | | encode 10-string (182B) | 1.44 us | 1.25 us | 1.15x | | encode complex (395B) | 3.02 us | 2.63 us | 1.15x | Larger object encode throughput: | Message size | apache/master | branch | Speedup | |--------------|---------------|-----------|---------| | 8 KiB | 12.19 us | 1.11 us | 11.0x | | 64 KiB | 101.58 us | 5.48 us | 18.5x | | 256 KiB | 564.34 us | 25.22 us | 22.4x | | 1 MiB | 2188.97 us | 136.22 us | 16.1x | Decode performance is unchanged; only the encode path is affected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2a00614 to
9f7ec3d
Compare
Hi - I figured I'd share a few perf optimizations we are using internally. We are still on an older thrift version (😢 ), so I did use some agent magic to port these to the head of this repo. My testing was primarily on my branch, so buyer beware on that front!
This one is 1 of 3 PRs
Replace std::vector back_inserter writes in the fastbinary encoder with a malloc/realloc buffer so encoded bytes can be copied in bulk while preserving the existing growth behavior.
Performance (50k iterations, warmed)
Decode performance is unchanged because this only affects the encode path.