Add streamhg extractor #279

Merged
mhdzumair merged 2 commits into mhdzumair:main from UrloMythus:main
May 17, 2026
Conversation

@UrloMythus UrloMythus commented May 16, 2026

Contributor

This commit adds a new extractor.

Summary by CodeRabbit

  • New Features
    • Added support for StreamHG as a streaming source, enabling extraction and proxying of HLS manifests from this host.
    • Extended URL host options to include StreamHG so requests for that host are recognized and routed to the new extractor.

coderabbitai Bot commented May 16, 2026

Contributor

Walkthrough

Adds a new StreamHGExtractor that resolves HLS destinations, extends the schema to accept "StreamHG" as a host, and registers the extractor in the ExtractorFactory mapping.
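
As a rough illustration of what resolving an HLS destination from a pattern can look like, here is a minimal sketch. The PR's actual `hls2` pattern and the internals of `eval_solver()` are not shown in this conversation, so the regular expression, the `first_match` helper, and the sample text below are all hypothetical.

```python
import re

# Hypothetical pattern: the real one used by the extractor is not shown here.
HLS2_PATTERNS = [r'"hls2"\s*:\s*"(?P<url>[^"]+)"']


def first_match(text: str, patterns: list[str]) -> str:
    """Return the URL captured by the first matching pattern, or raise."""
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group("url")
    raise ValueError("no pattern matched the unpacked script")


# Made-up sample of unpacked player JavaScript.
sample = 'var links={"hls2":"https://cdn.example/master.m3u8","hls4":"/x"};'
resolved = first_match(sample, HLS2_PATTERNS)
```

Trying patterns in order and failing loudly when none match keeps a changed page layout from silently producing an empty destination.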

Changes

StreamHG Extractor Implementation

| Layer / File(s) | Summary |
| --- | --- |
| StreamHGExtractor implementation and schema registration: mediaflow_proxy/extractors/streamhg.py, mediaflow_proxy/schemas.py | Implements StreamHGExtractor (sets mediaflow_endpoint, extract() uses eval_solver() with an hls2 pattern to compute destination_url) and adds "StreamHG" to ExtractorURLParams.host allowed literals. |
| Factory import and registration: mediaflow_proxy/extractors/factory.py | Imports StreamHGExtractor and registers it in ExtractorFactory._extractors under the "StreamHG" key so get_extractor() will instantiate it for that host. |
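
The registration described above can be sketched roughly as follows. This is a self-contained stand-in, not the project's code: the `BaseExtractor` constructor, the `mediaflow_endpoint` value, the error raised for unknown hosts, and the `Host` alias are assumptions made for illustration.

```python
from typing import Literal, get_args

# Stand-in for the literals allowed on ExtractorURLParams.host (other hosts omitted).
Host = Literal["Mixdrop", "StreamHG"]


class BaseExtractor:
    """Minimal stand-in for the project's extractor base class."""

    def __init__(self, request_headers: dict):
        self.base_headers = request_headers


class StreamHGExtractor(BaseExtractor):
    """Placeholder; the real class resolves an HLS URL in extract()."""

    mediaflow_endpoint = "hls_manifest_proxy"  # assumed value


class ExtractorFactory:
    # Host name -> extractor class, mirroring the "StreamHG" key in the summary.
    _extractors = {"StreamHG": StreamHGExtractor}

    @classmethod
    def get_extractor(cls, host: str, request_headers: dict) -> BaseExtractor:
        extractor_class = cls._extractors.get(host)
        if extractor_class is None:
            raise ValueError(f"Unsupported host: {host}")
        return extractor_class(request_headers)


extractor = ExtractorFactory.get_extractor("StreamHG", {"user-agent": "demo"})
```

Both the schema literal and the factory key have to carry the same string. If only one of them is updated, requests are either rejected at validation or fail at lookup.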

Possibly Related PRs

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble at manifests in the fog,
HLS threads gleam like a log,
StreamHG found, routed through the gate,
A tiny extractor, swift and straight,
Hop—now streams arrive on my plate!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add streamhg extractor' accurately summarizes the main change: adding a new StreamHGExtractor class and its integration into the factory. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
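
For a sense of how a docstring coverage figure like the one in the failed check can arise, here is a small sketch using Python's `ast` module. The counting rule (classes and functions only, module docstring ignored) is an assumption; CodeRabbit's actual method is not described in this conversation.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Percentage of classes and functions in source that have a docstring."""
    tree = ast.parse(source)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    if not nodes:
        return 100.0
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return 100.0 * documented / len(nodes)


# Made-up module: one documented class, one undocumented method.
sample = '''
class Demo:
    """Documented class."""

    async def extract(self, url):
        return url
'''
coverage = docstring_coverage(sample)
```

Under this rule, adding a docstring to the undocumented method would bring the sample to 100%, above an 80% threshold.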

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@mediaflow_proxy/extractors/streamhg.py`:
- Line 8: The module docstring in streamhg.py incorrectly says "Mixdrop URL
extractor"; update it to accurately describe this extractor (e.g., "StreamHG URL
extractor" or "StreamHG extractor") so the top-of-file docstring in streamhg.py
reflects the actual implementation and purpose of the module.
- Line 12: Remove the trailing whitespace on the blank line in the module
streamhg.py that is causing Ruff W293; open streamhg.py, locate the empty line
that contains trailing spaces and delete those spaces so the line is truly blank
(no spaces or tabs), then save and re-run linting to confirm the W293 warning is
resolved.
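
The whitespace finding above can be checked locally with a few lines of Python. This is only a sketch of the idea behind W293 (a blank line that still contains spaces or tabs), not Ruff's implementation, and the sample text is made up.

```python
def whitespace_only_lines(source: str) -> list[int]:
    """Return 1-based numbers of lines that contain only spaces or tabs."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if line and not line.strip()
    ]


# Made-up file content: line 2 holds four spaces, line 4 is truly empty.
sample = '"""StreamHG URL extractor."""\n    \nimport re\n\n'
flagged = whitespace_only_lines(sample)
```

Only line 2 is flagged here, because a truly empty line has nothing to strip.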

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 22a5ef8e-ae6b-4f28-9efa-e3230ddd60ce

📥 Commits

Reviewing files that changed from the base of the PR and between 0ee1fac and bdc12b9.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • mediaflow_proxy/extractors/factory.py
  • mediaflow_proxy/extractors/streamhg.py
  • mediaflow_proxy/schemas.py

Comment thread mediaflow_proxy/extractors/streamhg.py Outdated
Comment thread mediaflow_proxy/extractors/streamhg.py

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@mediaflow_proxy/extractors/streamhg.py`:
- Around line 18-24: Wrap the call to eval_solver(self, url, headers, patterns)
in a try/except that catches UnpackingError and re-raises an ExtractorError with
a clear message and the original exception attached; move the final_url
assignment into the try block, and ensure ExtractorError is imported/used
(consistent with StreamWish) so the extractor returns a consistent error type
instead of propagating UnpackingError.
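
The wrapping described in this comment can be sketched as below. The exception classes and `eval_solver_stub` are local stand-ins, not the project's real `UnpackingError`, `ExtractorError`, or `eval_solver()`, and the error messages are illustrative.

```python
import asyncio


class UnpackingError(Exception):
    """Stand-in for the error raised when no pattern matches."""


class ExtractorError(Exception):
    """Stand-in for the project's common extractor error type."""


async def eval_solver_stub(url: str) -> str:
    # Hypothetical stand-in that always fails, to exercise the error path.
    raise UnpackingError("no pattern matched")


async def resolve(url: str) -> str:
    try:
        final_url = await eval_solver_stub(url)
    except UnpackingError as exc:
        # "from exc" keeps the original exception available as __cause__.
        raise ExtractorError("Failed to resolve StreamHG destination URL") from exc
    if not final_url:
        raise ExtractorError("StreamHG returned an empty destination URL")
    return final_url


def run_demo() -> ExtractorError:
    """Run resolve() once and return the ExtractorError it raises."""
    try:
        asyncio.run(resolve("https://example.invalid/e/abc"))
    except ExtractorError as err:
        return err
    raise AssertionError("expected ExtractorError")


caught = run_demo()
```

Catching the specific exception rather than `Exception` leaves unrelated failures, such as network errors, visible with their own types, while callers still see one error type for unresolved URLs.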

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2d316d28-3548-4553-9017-7115412b9d8f

📥 Commits

Reviewing files that changed from the base of the PR and between bdc12b9 and 21927b4.

📒 Files selected for processing (1)
  • mediaflow_proxy/extractors/streamhg.py

Comment on lines +18 to +24
        final_url = await eval_solver(self, url, headers, patterns)

        return {
            "destination_url": final_url,
            "request_headers": self.base_headers,
            "mediaflow_endpoint": self.mediaflow_endpoint,
        }

Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify extractor contract consistency for unresolved destinations.
# Expected: Other extractors either raise on unresolved URLs or return a guaranteed non-empty destination_url.

fd -i 'extractors' mediaflow_proxy | sed 's#^#Found: #'

# Find extract() implementations and nearby return/guard logic.
rg -n --type=py -C3 'async def extract\(' mediaflow_proxy/extractors
rg -n --type=py -C3 '"destination_url"\s*:' mediaflow_proxy/extractors
rg -n --type=py -C3 'eval_solver\(' mediaflow_proxy/extractors
rg -n --type=py -C3 'raise .*extract|raise .*ValueError|if not .*url|if .* is None' mediaflow_proxy/extractors

Repository: mhdzumair/mediaflow-proxy

Length of output: 50381


🏁 Script executed:

# Check eval_solver implementation and return type
fd -type f -name '*.py' -path '*/extractors/*' | xargs grep -l 'def eval_solver'

# Check what exception types are raised in extractors
rg -n 'raise ExtractorError' mediaflow_proxy/extractors | head -20

# Verify eval_solver usage in streamwish for comparison
rg -B5 -A10 'if not final_url' mediaflow_proxy/extractors/streamwish.py

Repository: mhdzumair/mediaflow-proxy

Length of output: 3707


🏁 Script executed:

# Find eval_solver definition
rg -n 'def eval_solver' mediaflow_proxy/

# Check eval_solver in other extractors (filelions, mixdrop, fastream) for comparison
rg -B3 -A10 'final_url = await eval_solver' mediaflow_proxy/extractors/

Repository: mhdzumair/mediaflow-proxy

Length of output: 4862


🏁 Script executed:

# Check the eval_solver implementation
rg -A 50 'async def eval_solver' mediaflow_proxy/utils/packed.py

Repository: mhdzumair/mediaflow-proxy

Length of output: 1075


🏁 Script executed:

# Check if streamhg imports or handles UnpackingError
rg -n 'UnpackingError\|import' mediaflow_proxy/extractors/streamhg.py

# Check how streamhg handles exceptions if at all
cat -n mediaflow_proxy/extractors/streamhg.py

# Verify what UnpackingError is and where it's defined
rg -n 'class UnpackingError' mediaflow_proxy/

Repository: mhdzumair/mediaflow-proxy

Length of output: 1079


🏁 Script executed:

# Check if eval_solver can return empty string or always raises
rg -B2 'return extracted_url\|raise UnpackingError' mediaflow_proxy/utils/packed.py | head -30

# Check how streamwish handles the exception from eval_solver
rg -B5 -A15 'try:' mediaflow_proxy/extractors/streamwish.py | grep -A15 'eval_solver'

# Verify if mixdrop and filelions handle exceptions
rg -B5 -A5 'await eval_solver' mediaflow_proxy/extractors/mixdrop.py
rg -B5 -A5 'await eval_solver' mediaflow_proxy/extractors/filelions.py

Repository: mhdzumair/mediaflow-proxy

Length of output: 1357


🏁 Script executed:

# Check how other extractors handle UnpackingError
rg -n 'UnpackingError' mediaflow_proxy/extractors/

# Check if extract() methods are supposed to catch all exceptions
rg -B10 -A20 'async def extract' mediaflow_proxy/extractors/base.py

# Check if there's error handling at a higher level
rg -n 'try.*extract\|except.*UnpackingError' mediaflow_proxy/

Repository: mhdzumair/mediaflow-proxy

Length of output: 889


Catch and handle exceptions from eval_solver() with proper error reporting.

eval_solver() raises UnpackingError if pattern extraction fails. Currently this exception propagates unhandled, crashing the extractor. The suggested fix should use ExtractorError (not ValueError) for consistency with other extractors like StreamWish.

Suggested fix
         final_url = await eval_solver(self, url, headers, patterns)
+        if not final_url:
+            raise ExtractorError("Failed to resolve StreamHG destination URL")

         return {
             "destination_url": final_url,
             "request_headers": self.base_headers,
             "mediaflow_endpoint": self.mediaflow_endpoint,
         }

Alternatively, wrap the call in try-except to catch UnpackingError:

+        try:
             final_url = await eval_solver(self, url, headers, patterns)
+        except Exception as e:
+            raise ExtractorError(f"Failed to resolve StreamHG destination URL: {e}")

         return {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        final_url = await eval_solver(self, url, headers, patterns)
-        return {
-            "destination_url": final_url,
-            "request_headers": self.base_headers,
-            "mediaflow_endpoint": self.mediaflow_endpoint,
-        }
+        final_url = await eval_solver(self, url, headers, patterns)
+        if not final_url:
+            raise ExtractorError("Failed to resolve StreamHG destination URL")
+        return {
+            "destination_url": final_url,
+            "request_headers": self.base_headers,
+            "mediaflow_endpoint": self.mediaflow_endpoint,
+        }

@mhdzumair mhdzumair merged commit 4d23460 into mhdzumair:main May 17, 2026
2 of 3 checks passed