Add streamhg extractor #279
Walkthrough
Adds a new StreamHGExtractor that resolves HLS destinations, extends the schema to accept "StreamHG" as a host, and registers the extractor in the ExtractorFactory mapping.
Changes
StreamHG Extractor Implementation
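As a rough illustration of the registration described in the walkthrough, the sketch below shows one way a host-name-to-class mapping and a host schema could fit together. Everything in it is a stand-in: `BaseExtractor`, the `_extractors` dict, `get_extractor`, and `HostName` are hypothetical names chosen for illustration and are not taken from the mediaflow-proxy source.

```python
from typing import Dict, Literal, Type, get_args

# Hypothetical schema type: the set of host names a request may specify.
HostName = Literal["Mixdrop", "StreamWish", "StreamHG"]


class BaseExtractor:
    """Stand-in base class; the real project's base class may differ."""


class StreamHGExtractor(BaseExtractor):
    """Stand-in for the extractor added in this PR."""


class ExtractorFactory:
    # Hypothetical mapping from host name to extractor class.
    _extractors: Dict[str, Type[BaseExtractor]] = {
        "StreamHG": StreamHGExtractor,
    }

    @classmethod
    def get_extractor(cls, host: str) -> BaseExtractor:
        """Look up the class registered for `host` and instantiate it."""
        extractor_class = cls._extractors.get(host)
        if extractor_class is None:
            raise ValueError(f"Unsupported host: {host}")
        return extractor_class()
```

With this shape, adding a host means touching two places: the schema literal, so requests naming the host validate, and the mapping, so the factory can find the class.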
Estimated Code Review Effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@mediaflow_proxy/extractors/streamhg.py`:
- Line 8: The module docstring in streamhg.py incorrectly says "Mixdrop URL
extractor"; update it to accurately describe this extractor (e.g., "StreamHG URL
extractor" or "StreamHG extractor") so the top-of-file docstring in streamhg.py
reflects the actual implementation and purpose of the module.
- Line 12: Remove the trailing whitespace on the blank line in the module
streamhg.py that is causing Ruff W293; open streamhg.py, locate the empty line
that contains trailing spaces and delete those spaces so the line is truly blank
(no spaces or tabs), then save and re-run linting to confirm the W293 warning is
resolved.
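To locate the offending line before editing, a small helper like the one below can list whitespace-only lines. It is a sketch that mirrors what W293 reports (a blank line that contains whitespace), not Ruff's own implementation; `find_whitespace_only_lines` is a hypothetical name.

```python
from typing import List


def find_whitespace_only_lines(source: str) -> List[int]:
    """Return 1-based numbers of lines that contain only whitespace."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # A truly blank line is ""; a line that is non-empty but strips
        # to "" consists solely of whitespace characters.
        if line and not line.strip():
            flagged.append(number)
    return flagged
```

Running it on the contents of streamhg.py before and after the edit gives a quick check that no such lines remain, in addition to re-running the linter.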
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 22a5ef8e-ae6b-4f28-9efa-e3230ddd60ce
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
mediaflow_proxy/extractors/factory.py
mediaflow_proxy/extractors/streamhg.py
mediaflow_proxy/schemas.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@mediaflow_proxy/extractors/streamhg.py`:
- Around line 18-24: Wrap the call to eval_solver(self, url, headers, patterns)
in a try/except that catches UnpackingError and re-raises an ExtractorError with
a clear message and the original exception attached; move the final_url
assignment into the try block, and ensure ExtractorError is imported/used
(consistent with StreamWish) so the extractor returns a consistent error type
instead of propagating UnpackingError.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2d316d28-3548-4553-9017-7115412b9d8f
📒 Files selected for processing (1)
mediaflow_proxy/extractors/streamhg.py
        final_url = await eval_solver(self, url, headers, patterns)

        return {
            "destination_url": final_url,
            "request_headers": self.base_headers,
            "mediaflow_endpoint": self.mediaflow_endpoint,
        }
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify extractor contract consistency for unresolved destinations.
# Expected: Other extractors either raise on unresolved URLs or return a guaranteed non-empty destination_url.
fd -i 'extractors' mediaflow_proxy | sed 's#^`#Found`: #'
# Find extract() implementations and nearby return/guard logic.
rg -n --type=py -C3 'async def extract\(' mediaflow_proxy/extractors
rg -n --type=py -C3 '"destination_url"\s*:' mediaflow_proxy/extractors
rg -n --type=py -C3 'eval_solver\(' mediaflow_proxy/extractors
rg -n --type=py -C3 'raise .*extract|raise .*ValueError|if not .*url|if .* is None' mediaflow_proxy/extractors
Repository: mhdzumair/mediaflow-proxy
Length of output: 50381
🏁 Script executed:
# Check eval_solver implementation and return type
fd -type f -name '*.py' -path '*/extractors/*' | xargs grep -l 'def eval_solver'
# Check what exception types are raised in extractors
rg -n 'raise ExtractorError' mediaflow_proxy/extractors | head -20
# Verify eval_solver usage in streamwish for comparison
rg -B5 -A10 'if not final_url' mediaflow_proxy/extractors/streamwish.py
Repository: mhdzumair/mediaflow-proxy
Length of output: 3707
🏁 Script executed:
# Find eval_solver definition
rg -n 'def eval_solver' mediaflow_proxy/
# Check eval_solver in other extractors (filelions, mixdrop, fastream) for comparison
rg -B3 -A10 'final_url = await eval_solver' mediaflow_proxy/extractors/
Repository: mhdzumair/mediaflow-proxy
Length of output: 4862
🏁 Script executed:
# Check the eval_solver implementation
rg -A 50 'async def eval_solver' mediaflow_proxy/utils/packed.py
Repository: mhdzumair/mediaflow-proxy
Length of output: 1075
🏁 Script executed:
# Check if streamhg imports or handles UnpackingError
rg -n 'UnpackingError\|import' mediaflow_proxy/extractors/streamhg.py
# Check how streamhg handles exceptions if at all
cat -n mediaflow_proxy/extractors/streamhg.py
# Verify what UnpackingError is and where it's defined
rg -n 'class UnpackingError' mediaflow_proxy/
Repository: mhdzumair/mediaflow-proxy
Length of output: 1079
🏁 Script executed:
# Check if eval_solver can return empty string or always raises
rg -B2 'return extracted_url\|raise UnpackingError' mediaflow_proxy/utils/packed.py | head -30
# Check how streamwish handles the exception from eval_solver
rg -B5 -A15 'try:' mediaflow_proxy/extractors/streamwish.py | grep -A15 'eval_solver'
# Verify if mixdrop and filelions handle exceptions
rg -B5 -A5 'await eval_solver' mediaflow_proxy/extractors/mixdrop.py
rg -B5 -A5 'await eval_solver' mediaflow_proxy/extractors/filelions.py
Repository: mhdzumair/mediaflow-proxy
Length of output: 1357
🏁 Script executed:
# Check how other extractors handle UnpackingError
rg -n 'UnpackingError' mediaflow_proxy/extractors/
# Check if extract() methods are supposed to catch all exceptions
rg -B10 -A20 'async def extract' mediaflow_proxy/extractors/base.py
# Check if there's error handling at a higher level
rg -n 'try.*extract\|except.*UnpackingError' mediaflow_proxy/
Repository: mhdzumair/mediaflow-proxy
Length of output: 889
Catch and handle exceptions from eval_solver() with proper error reporting.
eval_solver() raises UnpackingError if pattern extraction fails. Currently this exception propagates unhandled, crashing the extractor. The suggested fix should use ExtractorError (not ValueError) for consistency with other extractors like StreamWish.
Suggested fix
         final_url = await eval_solver(self, url, headers, patterns)
+        if not final_url:
+            raise ExtractorError("Failed to resolve StreamHG destination URL")
         return {
             "destination_url": final_url,
             "request_headers": self.base_headers,
             "mediaflow_endpoint": self.mediaflow_endpoint,
         }
Alternatively, wrap the call in try-except to catch UnpackingError:
+        try:
             final_url = await eval_solver(self, url, headers, patterns)
+        except Exception as e:
+            raise ExtractorError(f"Failed to resolve StreamHG destination URL: {e}")
         return {
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-        final_url = await eval_solver(self, url, headers, patterns)
-        return {
-            "destination_url": final_url,
-            "request_headers": self.base_headers,
-            "mediaflow_endpoint": self.mediaflow_endpoint,
-        }
+        final_url = await eval_solver(self, url, headers, patterns)
+        if not final_url:
+            raise ExtractorError("Failed to resolve StreamHG destination URL")
+        return {
+            "destination_url": final_url,
+            "request_headers": self.base_headers,
+            "mediaflow_endpoint": self.mediaflow_endpoint,
+        }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@mediaflow_proxy/extractors/streamhg.py` around lines 18 - 24, Wrap the call
to eval_solver(self, url, headers, patterns) in a try/except that catches
UnpackingError and re-raises an ExtractorError with a clear message and the
original exception attached; move the final_url assignment into the try block,
and ensure ExtractorError is imported/used (consistent with StreamWish) so the
extractor returns a consistent error type instead of propagating UnpackingError.
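The try/except approach described in the prompt above might look roughly like the following. This is a self-contained sketch under stated assumptions, not the project's code: `UnpackingError` and `ExtractorError` are redefined here as stand-ins, `eval_solver` is a stub with a simplified signature that always fails, and `resolve_destination` is a hypothetical helper name.

```python
import asyncio


class UnpackingError(Exception):
    """Stand-in for the error eval_solver raises when extraction fails."""


class ExtractorError(Exception):
    """Stand-in for the project's extractor-level error type."""


async def eval_solver(url: str) -> str:
    # Stub with a simplified signature; it always fails so that the
    # error path below is exercised.
    raise UnpackingError("no pattern matched")


async def resolve_destination(url: str) -> str:
    try:
        final_url = await eval_solver(url)
    except UnpackingError as exc:
        # "from exc" keeps the original exception attached as __cause__.
        raise ExtractorError(
            f"Failed to resolve StreamHG destination URL: {exc}"
        ) from exc
    if not final_url:
        # Also reject an empty result, as in the first suggested fix.
        raise ExtractorError("Failed to resolve StreamHG destination URL")
    return final_url


if __name__ == "__main__":
    try:
        asyncio.run(resolve_destination("https://example.invalid/e/abc"))
    except ExtractorError as err:
        print(f"{err} (caused by {type(err.__cause__).__name__})")
```

Catching the specific `UnpackingError` rather than a bare `Exception` keeps unrelated failures, such as network errors, from being relabelled as unpacking problems; whether that distinction matters here depends on how callers handle `ExtractorError`.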
This commit adds a new extractor.
Summary by CodeRabbit