Fix: Migrate YouTube channel loader to pytubefix by lalithdabilpuram01 · Pull Request #5451 · crewAIInc/crewAI

lalithdabilpuram01 · 2026-04-14T16:56:15Z

Problem

The YoutubeChannelLoader was failing to load YouTube channels and extract video definitions because the pytube library was broken by YouTube backend changes. As a result, tools silently returned "No relevant content found".

Solution

Swapped the core dependency from pytube to pytubefix in crewai-tools/pyproject.toml.
Updated youtube_channel_loader.py to correctly extract watch URLs from the pytubefix video objects via a list comprehension:
video_urls = [v.watch_url if hasattr(v, 'watch_url') else str(v) for v in raw_video_urls]
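The list comprehension above amounts to a small normalization step. The following is a minimal sketch of that idea, not the loader's actual code: `to_watch_url` and `StubVideo` are hypothetical names introduced for illustration, and the stub stands in for a pytubefix video object so the example runs without network access.

```python
def to_watch_url(item):
    """Return a URL string for a plain string or an object exposing watch_url."""
    # Assumption: items are either URL strings or objects with a watch_url attribute.
    return item.watch_url if hasattr(item, "watch_url") else str(item)


class StubVideo:
    """Hypothetical stand-in for a pytubefix video object."""

    def __init__(self, watch_url):
        self.watch_url = watch_url


# A mixed list: one object with watch_url, one plain URL string.
raw_video_urls = [
    StubVideo("https://www.youtube.com/watch?v=aaa"),
    "https://www.youtube.com/watch?v=bbb",
]
video_urls = [to_watch_url(v) for v in raw_video_urls]
print(video_urls)
```

Accepting both shapes means the same code path should keep working whether the library yields video objects or plain URL strings.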

Summary by CodeRabbit

Bug Fixes
- Updated YouTube loading to work with a more reliable backend, improving channel video discovery and transcript fetching.
- Improved video URL handling so items are read more safely when loading channel content.
- Refreshed the missing-package message to guide users to the correct installation step.
Chores
- Updated a project dependency used for YouTube-related functionality.

github-actions · 2026-06-09T13:15:22Z

This PR is stale because it has been open for 45 days with no activity.

coderabbitai · 2026-06-24T05:40:47Z

📝 Walkthrough

Walkthrough

The PR replaces pytube~=15.0.0 with pytubefix>=8.4.0 in the pyproject.toml dependency list and updates YoutubeChannelLoader to import Channel and YouTube from pytubefix. The video URL extraction is also updated to safely read each item's watch_url attribute before falling back to string conversion.

Changes

pytube → pytubefix migration

Layer / File(s)	Summary
Dependency declaration update `lib/crewai-tools/pyproject.toml`	Replaces `pytube~=15.0.0` with `pytubefix>=8.4.0` in the project dependencies list.
YoutubeChannelLoader import and URL extraction migration `lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_channel_loader.py`	Changes `Channel` and `YouTube` imports to `pytubefix`, updates the `ImportError` install hint, and replaces direct `channel.video_urls` slicing with a mapping that prefers each item's `watch_url` attribute, falling back to `str()`.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the main change: migrating the YouTube channel loader from pytube to pytubefix.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai-tools/pyproject.toml`:
- Line 11: The YouTube loader dependency is mismatched with its import, causing
metadata extraction to silently fail. Update youtube_video_loader.py so it
imports YouTube from pytubefix, or alternatively restore pytube as a declared
dependency; make sure the YouTubeLoader class can successfully enter the
metadata block and populate title, author, length_seconds, and description
instead of falling into the broad except.

In `@lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_channel_loader.py`:
- Around line 60-65: Materialize channel.video_urls once and reuse the same list
in YoutubeChannelLoader to avoid repeating the network-backed enumeration. In
the loader logic where raw_video_urls and metadata are built, keep the existing
watch_url fallback for pytubefix YouTube objects, but compute total_videos from
the cached list instead of calling list(channel.video_urls) again. Use the
nearby YoutubeChannelLoader variables raw_video_urls, video_urls, and metadata
to locate the change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d11d8a5e-bdac-42c5-b45b-39c21900f85c

📥 Commits

Reviewing files that changed from the base of the PR and between a046e6a and 6b53b45.

📒 Files selected for processing (2)

lib/crewai-tools/pyproject.toml
lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_channel_loader.py

coderabbitai · 2026-06-24T05:44:16Z

 requires-python = ">=3.10, <3.14"
 dependencies = [
-    "pytube~=15.0.0",
+    "pytubefix>=8.4.0",


🗄️ Data Integrity & Integration | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find any remaining pytube (non-pytubefix) imports/usages
rg -nP '\bpytube\b(?!fix)' lib/crewai-tools/src

Repository: crewAIInc/crewAI

Length of output: 306
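To illustrate what the search pattern in the script above selects, here is a small sketch using Python's `re` module, which supports the same negative lookahead syntax. The sample lines are invented for the example and are not taken from the repository.

```python
import re

# Same pattern as in the rg command: the word "pytube" not followed by "fix".
PATTERN = re.compile(r"\bpytube\b(?!fix)")

# Hypothetical sample lines, for illustration only.
sample_lines = [
    "from pytube import YouTube",
    "from pytubefix import Channel, YouTube",
    'raise ImportError("pip install pytubefix")',
    "import pytube.cli",
]

matching_lines = [line for line in sample_lines if PATTERN.search(line)]
for line in matching_lines:
    print(line)
```

Only the lines that reference the bare `pytube` package match. The trailing `\b` already prevents a match inside `pytubefix`, since there is no word boundary between `pytube` and `fix`.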

🏁 Script executed:

#!/bin/bash
set -euo pipefail
printf '\n== pyproject snippet ==\n'
sed -n '1,80p' lib/crewai-tools/pyproject.toml
printf '\n== youtube_video_loader.py relevant section ==\n'
sed -n '1,180p' lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_video_loader.py

Repository: crewAIInc/crewAI

Length of output: 6418

🌐 Web query:

pytubefix import path pytube compatibility YouTube documentation

💡 Result:

Pytubefix is an actively maintained fork of the original Pytube library, designed to serve as a drop-in replacement [1][2]. Because it maintains the same interface and class structure, transitioning from Pytube to Pytubefix generally requires only minimal changes to your import statements [3][2].

Import Path Changes

To migrate, replace your existing Pytube import paths with Pytubefix import paths [3]. For example, change:

    from pytube import YouTube
    from pytube.cli import on_progress

to:

    from pytubefix import YouTube
    from pytubefix.cli import on_progress

Compatibility and Migration

- Compatibility: Pytubefix preserves the original Pytube API, meaning your existing logic for streams, downloading, and playlist management will function identically [3][2].
- Migration Status: As of June 2026, the original Pytube repository is considered unmaintained, and the community has widely adopted Pytubefix as the standard alternative to resolve ongoing YouTube API compatibility issues, such as HTTP 400 errors [1][2].
- Documentation: Detailed documentation and usage examples for Pytubefix are available at its official Read the Docs site [4][5].

If you are starting a new project or require more robust, long-term stability, many developers also consider using yt-dlp, which is frequently updated to handle YouTube's evolving API [2].

Citations:

1: Pytube is no longer maintained. Time to migrate to PYTUBEFIX pytube/pytube#2167

2: https://dev-solve.com/posts/4a8a09b

3: pytubefix pytube/pytube#1958

4: https://pytubefix.readthedocs.io/en/stable/user/quickstart.html

5: https://pytubefix.readthedocs.io/en/stable/index.html

🌐 Web query:

pytubefix import path pytube compatibility YouTube documentation

💡 Result:

Pytubefix is an actively maintained fork of the now-unmaintained pytube library [1][2]. It is designed as a drop-in replacement, meaning you can migrate existing projects by changing your import statements and installing the pytubefix package [3][2].

To transition from pytube to pytubefix:

1. Installation: Uninstall the original pytube package (if present) and install pytubefix via pip: pip install pytubefix [4][5]
2. Updating Imports: Change your Python import statements from pytube to pytubefix. For example, replace from pytube import YouTube with from pytubefix import YouTube [6][2]

Because pytubefix maintains a nearly identical API to pytube, your existing code structure for downloading streams, accessing video metadata, or using callbacks generally remains the same [6][2]. Official documentation for pytubefix is available on Read the Docs, which confirms this usage pattern [7][8][9].

Citations:

1: Pytube is no longer maintained. Time to migrate to PYTUBEFIX pytube/pytube#2167

2: https://roundproxies.com/blog/pytubefix/

3: https://www.reddit.com/r/learnpython/comments/1edm1q5/pytube_not_working_please_help/

4: https://pypi.org/project/pytubefix/

5: https://github.com/JuanBindez/pytubefix

6: pytubefix pytube/pytube#1958

7: https://pytubefix.readthedocs.io/en/stable/user/quickstart.html

8: https://pytubefix.readthedocs.io/en/stable/index.html

9: https://pytubefix.readthedocs.io/en/latest/

Update youtube_video_loader.py to use pytubefix or keep pytube as a dependency.
lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_video_loader.py still imports `from pytube import YouTube`. With only pytubefix declared, that import fails inside the metadata block, so the block is skipped and title, author, length_seconds, and description remain unset because the broad except swallows the import failure.
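One way to keep such a failure from going unnoticed is to separate the import from the metadata lookup and to record errors instead of discarding them. The sketch below is an illustration under stated assumptions, not the code in youtube_video_loader.py: `import_youtube_class`, `collect_metadata`, `StubYouTube`, and the `metadata_error` key are hypothetical names, and mapping `length_seconds` to the video object's `length` attribute is an assumption.

```python
import importlib


def import_youtube_class(module_name="pytubefix"):
    """Import the YouTube class, raising a descriptive ImportError if missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for YouTube metadata. "
            f"Install it with: pip install {module_name}"
        ) from exc
    return module.YouTube


def collect_metadata(video_url, youtube_cls):
    """Populate metadata and record, rather than hide, any failure."""
    metadata = {"source": video_url}
    try:
        video = youtube_cls(video_url)
        metadata["title"] = video.title
        metadata["author"] = video.author
        metadata["length_seconds"] = video.length  # assumed attribute mapping
        metadata["description"] = video.description
    except Exception as exc:  # still broad, but the error is kept for inspection
        metadata["metadata_error"] = repr(exc)
    return metadata


class StubYouTube:
    """Hypothetical stand-in so the sketch runs offline."""

    def __init__(self, url):
        self.title = "stub title"
        self.author = "stub author"
        self.length = 42
        self.description = "stub description"


stub_metadata = collect_metadata("https://www.youtube.com/watch?v=aaa", StubYouTube)
print(stub_metadata)
```

In real use, the class returned by `import_youtube_class()` would be passed in place of `StubYouTube`, so a missing package surfaces as an explicit ImportError with an install hint rather than as silently empty metadata.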

coderabbitai · 2026-06-24T05:44:17Z

+            raw_video_urls = list(channel.video_urls)[:max_videos]
+            video_urls = [v.watch_url if hasattr(v, 'watch_url') else str(v) for v in raw_video_urls]
+            # -----------------------------------------------------------------
+
            metadata["num_videos_loaded"] = len(video_urls)
            metadata["total_videos"] = len(list(channel.video_urls))


🚀 Performance & Scalability | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In pytubefix 8.4.0, does Channel.video_urls yield URL strings or YouTube objects, and does each item have a watch_url attribute?

💡 Result:

In pytubefix version 8.4.0, the Channel.video_urls property yields YouTube objects rather than URL strings [1]. Because these items are YouTube objects, they do possess a watch_url attribute, which provides the URL string [2][1]. Note that this behavior was a recognized bug in the library [1]. A fix to ensure that Channel.video_urls returns strings directly was merged into the repository in May 2026 [3][4], appearing in later versions of the library [5]. However, for version 8.4.0 specifically, you must access the watch_url attribute of the yielded YouTube objects to retrieve the URL strings [1].

Citations:

1: [BUG] Channel.video_urls yields YouTube instances instead of str JuanBindez/pytubefix#577

2: https://pytubefix.readthedocs.io/en/stable/user/channel.html

3: Return strings from Channel.video_urls JuanBindez/pytubefix#630

4: JuanBindez/pytubefix@5fa1e82

5: https://pytubefix.readthedocs.io/en/stable/api.html

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "== target file =="
sed -n '1,140p' lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_channel_loader.py
echo
echo "== dependency references =="
rg -n "pytubefix|youtube_channel_loader|video_urls|watch_url|channel\.videos" -S .

Repository: crewAIInc/crewAI

Length of output: 7705

🏁 Script executed:

#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import pathlib, re
path = pathlib.Path('lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_channel_loader.py')
text = path.read_text()
for i, line in enumerate(text.splitlines(), 1):
    if 'video_urls' in line or 'watch_url' in line or 'FIX' in line or 'metadata["total_videos"]' in line:
        print(f"{i}: {line}")
PY

Repository: crewAIInc/crewAI

Length of output: 685

🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '1,140p' lib/crewai-tools/src/crewai_tools/rag/loaders/youtube_channel_loader.py
printf '\n---\n'
rg -n "pytubefix|video_urls|watch_url|channel\.videos" -S lib/crewai-tools

Repository: crewAIInc/crewAI

Length of output: 7398

🌐 Web query:

pytubefix Channel video_urls implementation generator list watch_url issue 577 8.4.0

💡 Result:

In the pytubefix library, issue 577 identified a bug where the Channel.video_urls property incorrectly yielded YouTube objects instead of string-based watch URLs [1]. This behavior was inconsistent with expected usage and required users to manually extract the watch_url attribute from each YouTube object [1].

This issue was officially resolved in pull request #630, which was merged on May 16, 2026 [2][3]. Following this update, the property was corrected to return a list of strings, aligning with the behavior documented in later versions of the library (e.g., version 10.8.1) [4].

If you are encountering this issue on an older version of pytubefix, such as 8.4.0 or other versions prior to the fix, you can implement a workaround by manually extracting the watch_url from the YouTube objects:

    from pytubefix import Channel, YouTube

    channel = Channel("YOUR_CHANNEL_URL")

    # Workaround for older versions
    def get_channel_video_urls(channel: Channel):
        for video in channel.video_urls:
            yield video.watch_url

    # Use the generator or cast to list
    urls = list(get_channel_video_urls(channel))

[1]

It is highly recommended to upgrade to the latest stable version of pytubefix to benefit from this fix and other ongoing maintenance [4].

Citations:

1: [BUG] Channel.video_urls yields YouTube instances instead of str JuanBindez/pytubefix#577

2: JuanBindez/pytubefix@5fa1e82

3: Return strings from Channel.video_urls JuanBindez/pytubefix#630

4: https://pytubefix.readthedocs.io/en/stable/api.html

Materialize channel.video_urls once and reuse it.
list(channel.video_urls) is evaluated twice here, so the loader repeats the same network-backed enumeration just to compute total_videos. Keep the watch_url fallback: in pytubefix 8.4.0, video_urls yields YouTube objects, so the channel.videos suggestion and the # FIX wording are misleading.
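The suggestion above can be sketched as follows. This is an illustrative outline rather than the loader's real implementation: `StubChannel` and `load_channel_urls` are hypothetical names, and the stub counts how often `video_urls` is enumerated in place of a real network-backed pytubefix `Channel`.

```python
class StubChannel:
    """Hypothetical stand-in for a channel whose video_urls is costly to enumerate."""

    def __init__(self, items):
        self._items = items
        self.enumerations = 0  # how many times video_urls was read

    @property
    def video_urls(self):
        self.enumerations += 1
        return iter(self._items)


def load_channel_urls(channel, max_videos):
    """Enumerate the channel once, then derive both the slice and the totals."""
    all_items = list(channel.video_urls)  # single enumeration, cached in a list
    raw_video_urls = all_items[:max_videos]
    video_urls = [
        v.watch_url if hasattr(v, "watch_url") else str(v) for v in raw_video_urls
    ]
    metadata = {
        "num_videos_loaded": len(video_urls),
        "total_videos": len(all_items),  # reuses the cached list
    }
    return video_urls, metadata


channel = StubChannel([f"https://www.youtube.com/watch?v={i}" for i in range(5)])
video_urls, metadata = load_channel_urls(channel, max_videos=2)
print(video_urls, metadata, channel.enumerations)
```

The full list is still built once because total_videos needs the complete count, but the second enumeration is avoided.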

Migrate YouTube channel loader to pytubefix

4613d23

lalithdabilpuram01 mentioned this pull request Apr 14, 2026

[ Bug]: YoutubeChannelSearchTool Fails for Valid Channel Inputs #5429

Closed

Merge branch 'main' into fix/youtube-loader-pytubefix

a20e735

github-actions Bot added the no-pr-activity label Jun 9, 2026

Merge branch 'main' into fix/youtube-loader-pytubefix

6b53b45

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

github-actions Bot removed the no-pr-activity label Jun 24, 2026

lalithdabilpuram01 wants to merge 3 commits into crewAIInc:main from lalithdabilpuram01:fix/youtube-loader-pytubefix
