feat: support for local file storage#1001
Conversation
|
Warning Review limit reached
More reviews will be available in 13 minutes and 4 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (12)
📝 WalkthroughWalkthroughThis PR implements a GCS-vs-local storage mode split by introducing a configurable 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c3e8c36413
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Nitpick comments (3)
daras_ai_v2/image_tools.py (1)
66-70: 💤 Low valueType hint should include
Nonefor robustness.
cv2.imdecodereturnsNoneon decode failure (called frombytes_to_cv2_imgline 60), soimg_existsis invoked withNonewhen the guard on line 61 evaluates. The function handlesNonecorrectly at runtime (isinstance(None, np.ndarray)→ False →bool(None)→ False), but the type signature should reflect this.📝 Update type hint
-def img_exists(img: np.ndarray | str) -> bool: +def img_exists(img: np.ndarray | str | None) -> bool: if isinstance(img, np.ndarray): return bool(len(img)) else: return bool(img)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@daras_ai_v2/image_tools.py` around lines 66 - 70, The type hint for img_exists should include None because cv2.imdecode can return None (bytes_to_cv2_img may pass None); update the signature of img_exists to accept np.ndarray | str | None (or Optional[np.ndarray | str]) and keep the same runtime checks/returns inside the function so callers and static checkers recognize None is allowed.functions/inbuilt_tools.py (1)
32-32: Move theComposioimport under theCOMPOSIO_API_KEYcheck
functions/inbuilt_tools.py:get_inbuilt_tools()importsComposiounconditionally (beforeif settings.COMPOSIO_API_KEY:), so a trimmed/misconfigured install can still fail withImportErrorwhile building the live tool registry even though Composio integration is intended to be disabled. (In this repocomposiois declared as a regular dependency inpyproject.toml, so this is mainly defensive hardening.)Suggested fix
def get_inbuilt_tools(state: dict) -> typing.Iterable[BaseLLMTool]: - from composio import Composio - variables = state.get("variables") or {} @@ - if settings.COMPOSIO_API_KEY: + if settings.COMPOSIO_API_KEY: + from composio import Composio + for tool_spec in Composio().tools.get_raw_composio_tools( tools=composio_tools, limit=50 ): yield ComposioLLMTool(tool_spec, composio_tools[tool_spec.slug])🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@functions/inbuilt_tools.py` at line 32, The module currently imports Composio unconditionally which can raise ImportError even when COMPOSIO_API_KEY is unset; move the from composio import Composio statement inside the conditional that checks settings.COMPOSIO_API_KEY in get_inbuilt_tools() and guard it with a try/except ImportError so the function skips or logs composio-related setup if the package is missing; update any code in get_inbuilt_tools() that references Composio to only run after the import succeeds.daras_ai/image_input.py (1)
41-50: ⚡ Quick winSwitch these new signatures to postponed annotations.
These signatures still need quoted forward refs and an implicit-Optional default on Line 45, so they are fighting both the repo convention and Ruff.
from __future__ import annotationswould let this module useWorkspace | None,AppUser | None,Blob, andstr | Noneconsistently without the extratyping.TYPE_CHECKINGindirection.As per coding guidelines, "When a module would otherwise need a
typing.TYPE_CHECKINGguard, addfrom __future__ import annotationsat the top instead to lazily evaluate annotations and use unquoted forward references instead of string literals," and Ruff is already reporting that "PEP 484 prohibits implicitOptional" on Line 45.Also applies to: 98-107
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@daras_ai/image_input.py` around lines 41 - 50, Add "from __future__ import annotations" to the top of the module and update the indicated function signatures (e.g., temp_upload_file_from_bytes and the other signature at lines ~98-107) to use postponed annotations (use Workspace | None, AppUser | None, Blob, str | None, etc.) instead of quoted forward refs and typing.Optional; remove the typing.TYPE_CHECKING indirection for these types and ensure defaults remain explicit None where appropriate so Ruff no longer reports implicit-Optional.Sources: Coding guidelines, Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@daras_ai_v2/doc_search_settings_widgets.py`:
- Around line 20-25: The current _user_media_url_prefix is built with
os.path.join which strips URL-leading slashes and uses OS path separators;
replace that with URL-safe concatenation (e.g., use urllib.parse.urljoin or
manual join logic) to construct an absolute HTTP URL for both branches so it
doesn't collapse to "/media/". Specifically, change the non-GCS branch to build
the prefix from settings.APP_BASE_URL and settings.MEDIA_URL using
APP_BASE_URL.rstrip('/') + '/' + MEDIA_URL.lstrip('/') (or urljoin(APP_BASE_URL,
MEDIA_URL)), and for the GCS branch ensure the prefix starts with
"https://storage.googleapis.com/" + GS_BUCKET_NAME + '/' + GS_MEDIA_PATH (using
posix-style slashes), then keep is_user_uploaded_url() comparing against that
normalized _user_media_url_prefix.
In `@daras_ai_v2/gpu_server.py`:
- Around line 67-70: The tuple unpacking of signed URLs in
call_celery_task_outfile_with_ret is wrong because generate_signed_url yields
(upload_url, public_url) per context and the code does zip(<single iterable of
pairs>) instead of transposing; fix by collecting the pair contexts into an
iterable and then unpacking with zip(*pairs) (e.g., build the context-managed
pairs via stack.enter_context(generate_signed_url(...)) for _ in
range(num_outputs), then do upload_urls, public_urls = zip(*pairs) and convert
to lists if needed) so upload_urls and public_urls get the correct shapes; refer
to the function call_celery_task_outfile_with_ret and the contextmanager
generate_signed_url.
In `@daras_ai_v2/settings.py`:
- Line 248: The setting variable ENABLE_GCS_STROAGE is misspelled and used in
the startup assertion, causing ENABLE_GCS_STORAGE to be ignored; rename the
canonical setting to ENABLE_GCS_STORAGE and update any startup assertion or
checks that reference ENABLE_GCS_STROAGE to use ENABLE_GCS_STORAGE, but for
rollout compatibility read the old misspelled key as a fallback (e.g., check
config("ENABLE_GCS_STORAGE", ...) and if absent fall back to
config("ENABLE_GCS_STROAGE", ...)); update references to the symbol
ENABLE_GCS_STROAGE and the startup assertion lines so all checks use
ENABLE_GCS_STORAGE as the primary flag.
- Around line 282-287: When GCS storage is enabled, ensure a valid bucket name
is provided instead of falling back to ".appspot.com": update settings.py to not
silently construct GS_BUCKET_NAME from an empty GCP_PROJECT and add a startup
validation that if ENABLE_GCS_STORAGE (or similar feature flag) is true then
either GS_BUCKET_NAME is non-empty/valid or GCP_PROJECT is set so you can derive
a sane bucket name; change the GS_BUCKET_NAME default to empty (or None),
compute a derived name only if GCP_PROJECT is present, and raise a clear
configuration error (or exit) during app initialization if neither
GS_BUCKET_NAME nor GCP_PROJECT is provided.
- Around line 290-291: The MEDIA_ROOT and MEDIA_URL settings are hardcoded and
must instead read from environment variables; update the MEDIA_ROOT and
MEDIA_URL assignments to read their values from env (e.g., MEDIA_ROOT_ENV /
MEDIA_URL_ENV) with sensible fallbacks — for MEDIA_ROOT fallback to BASE_DIR /
"media" and for MEDIA_URL fallback to "/media/" — preserving MEDIA_ROOT as a
Path object if BASE_DIR is a pathlib.Path and ensuring MEDIA_URL ends with a
trailing slash; use os.getenv or the existing config/env loader in the project
to implement this change in the MEDIA_ROOT and MEDIA_URL definitions.
In `@daras_ai/image_input.py`:
- Around line 136-149: generate_signed_url is GCS-only because it always calls
gcs_blob_for/blob.generate_signed_url, causing local-storage deployments to
raise UserError; update generate_signed_url to branch on the configured storage
backend: if GCS (keep current behavior using gcs_blob_for,
blob.generate_signed_url and register_blob), otherwise implement a local-storage
path that creates a local upload target (e.g., create a temp file path or local
blob object, register it with register_blob or a new register_blob_local) and
return an upload endpoint/identifier plus a public URL (or a file:// path)
compatible with callers in daras_ai_v2/gpu_server.py and
recipes/TextToSpeech.py; alternatively, if local uploads are unsupported,
explicitly raise a descriptive UserError early from generate_signed_url so
callers are gated as GCS-only. Ensure the function references:
generate_signed_url, gcs_blob_for, register_blob, and blob.generate_signed_url
when implementing the branch.
- Around line 124-133: The save_local_file_from_bytes function currently accepts
raw filenames and can be tricked into path-traversal; update
save_local_file_from_bytes to sanitize the filename (reuse the same
safe_filename logic used by gcs_blob_for), build the path, then resolve and
assert the resulting path is a child of settings.MEDIA_ROOT before writing;
likewise ensure delete_uploaded_url validates that the URL maps back inside
MEDIA_ROOT before unlinking. Make generate_signed_url safe for non-GCS
deployments by not forcing gcs_blob_for/gcs_bucket when
settings.ENABLE_GCS_STORAGE is false (either return a local file URL or raise a
controlled error only when GCS is required), and update any call sites that
assume GCS-only behavior. Finally fix Ruff typing RUF013 by changing parameters
like content_type: str = None to content_type: str | None = None (or
Optional[str]) where present.
In `@recipes/GoogleImageGen.py`:
- Around line 8-10: The selection loop in GoogleImageGen.py currently catches
(IOError, ConnectionError, ValueError) but not the new UserError raised by
bytes_to_cv2_img used via resize_img_scale, so a single bad Google Images result
aborts the workflow; fix by importing UserError (from daras_ai_v2.image_tools or
its defining module) and add UserError to the except clause in the selection
loop that wraps calls to resize_img_scale/bytes_to_cv2_img so the loop will skip
that candidate and continue.
In `@routers/static_pages.py`:
- Around line 29-30: The /internal/webflow-upload/ admin route and its helpers
call gcs_bucket() which raises a UserError when settings.ENABLE_GCS_STROAGE is
false, so add a feature-flag guard to short-circuit and return a 404/403 instead
of letting the error bubble: in the route handler for /internal/webflow-upload/
(and in helpers render_webflow_upload, extract_zip_to_gcloud, upload_zipfile)
check settings.ENABLE_GCS_STROAGE at the top and raise
HTTPException(status_code=404) (or 403) if false, or alternatively avoid
registering the route when the flag is off; ensure no code path calls
gcs_bucket() when the flag is disabled.
In `@server.py`:
- Around line 72-75: The app currently mounts StaticFiles only when
settings.MEDIA_ROOT.exists(), causing /media routes to 404 until restart; remove
that conditional so app.mount(settings.MEDIA_URL,
StaticFiles(directory=settings.MEDIA_ROOT), name="media") always runs, and
ensure the directory exists at startup by creating settings.MEDIA_ROOT if
missing (use settings.MEDIA_ROOT.mkdir(parents=True, exist_ok=True)) before
mounting; update the startup block in server.py that references
settings.ENABLE_GCS_STROAGE, settings.MEDIA_ROOT, settings.MEDIA_URL and
StaticFiles, and note that save_local_file_from_bytes already creates the dir on
demand so this just guarantees availability without restart.
---
Nitpick comments:
In `@daras_ai_v2/image_tools.py`:
- Around line 66-70: The type hint for img_exists should include None because
cv2.imdecode can return None (bytes_to_cv2_img may pass None); update the
signature of img_exists to accept np.ndarray | str | None (or
Optional[np.ndarray | str]) and keep the same runtime checks/returns inside the
function so callers and static checkers recognize None is allowed.
In `@daras_ai/image_input.py`:
- Around line 41-50: Add "from __future__ import annotations" to the top of the
module and update the indicated function signatures (e.g.,
temp_upload_file_from_bytes and the other signature at lines ~98-107) to use
postponed annotations (use Workspace | None, AppUser | None, Blob, str | None,
etc.) instead of quoted forward refs and typing.Optional; remove the
typing.TYPE_CHECKING indirection for these types and ensure defaults remain
explicit None where appropriate so Ruff no longer reports implicit-Optional.
In `@functions/inbuilt_tools.py`:
- Line 32: The module currently imports Composio unconditionally which can raise
ImportError even when COMPOSIO_API_KEY is unset; move the from composio import
Composio statement inside the conditional that checks settings.COMPOSIO_API_KEY
in get_inbuilt_tools() and guard it with a try/except ImportError so the
function skips or logs composio-related setup if the package is missing; update
any code in get_inbuilt_tools() that references Composio to only run after the
import succeeds.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7b230375-54ae-44b3-92da-03df8a3c9f9e
📒 Files selected for processing (23)
configuration.mddaras_ai/image_input.pydaras_ai_v2/doc_search_settings_widgets.pydaras_ai_v2/extract_face.pydaras_ai_v2/gpu_server.pydaras_ai_v2/image_tools.pydaras_ai_v2/language_model.pydaras_ai_v2/repositioning.pydaras_ai_v2/settings.pydaras_ai_v2/stable_diffusion.pydaras_ai_v2/vcard.pyfunctions/composio_tools.pyfunctions/inbuilt_tools.pyglossary_resources/tests.pylivekit_agent.pyrecipes/GoogleImageGen.pyrecipes/ImageSegmentation.pyrecipes/ModelTrainer.pyrecipes/ObjectInpainting.pyrecipes/QRCodeGenerator.pyrecipes/TextToSpeech.pyrouters/static_pages.pyserver.py
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (6)
daras_ai_v2/settings.py (1)
275-276:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRead
MEDIA_ROOTandMEDIA_URLfrom env.These assignments still pin every deployment to
BASE_DIR / "media"and"/media/", which breaks the local-storage config contract described in the PR summary for self-hosted installs.Suggested fix
-MEDIA_ROOT = BASE_DIR / "media" -MEDIA_URL = "/media/" +MEDIA_ROOT = Path(config("MEDIA_ROOT", default=str(BASE_DIR / "media"))) +MEDIA_URL = config("MEDIA_URL", default="/media/")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@daras_ai_v2/settings.py` around lines 275 - 276, MEDIA_ROOT and MEDIA_URL are hardcoded; change them to be read from environment variables with sensible fallbacks so deployments can override local-storage paths. Update the MEDIA_ROOT and MEDIA_URL assignments to read os.environ (e.g., MEDIA_ROOT = Path(os.environ.get("MEDIA_ROOT", BASE_DIR / "media")) and MEDIA_URL = os.environ.get("MEDIA_URL", "/media/")) ensuring you import os and use BASE_DIR and Path consistently; preserve existing defaults if env vars are not set.server.py (1)
72-75:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAlways mount local media when local storage is enabled.
save_local_file_from_bytes()createssettings.MEDIA_ROOTon demand, but this route only exists if the directory is already present at startup. Fresh local-storage deployments will therefore 404/media/...URLs until the process restarts.Suggested fix
-if not settings.GS_BUCKET_NAME and settings.MEDIA_ROOT.exists(): +if not settings.GS_BUCKET_NAME: + settings.MEDIA_ROOT.mkdir(parents=True, exist_ok=True) app.mount( settings.MEDIA_URL, StaticFiles(directory=settings.MEDIA_ROOT), name="media" )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server.py` around lines 72 - 75, The media route is only mounted if settings.MEDIA_ROOT.exists(), causing fresh local-storage deployments to 404; remove the existence check so the mount always happens when local storage is enabled (i.e., when not settings.GS_BUCKET_NAME), and ensure the directory is created at startup (call settings.MEDIA_ROOT.mkdir(parents=True, exist_ok=True) or equivalent) before calling app.mount(settings.MEDIA_URL, StaticFiles(directory=settings.MEDIA_ROOT), name="media"); reference settings.GS_BUCKET_NAME, settings.MEDIA_ROOT, settings.MEDIA_URL, app.mount, StaticFiles and save_local_file_from_bytes() when making the change.daras_ai/image_input.py (3)
140-153:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
generate_signed_url()still hard-fails in local-storage mode.This helper still unconditionally goes through
gcs_blob_for()/gcs_bucket(), so local deployments hitUserError("This feature is only available with GCS"). That contradicts the backend-agnostic upload contract introduced elsewhere in this PR.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@daras_ai/image_input.py` around lines 140 - 153, generate_signed_url currently always calls gcs_blob_for and fails in local-storage mode; change it to branch on the storage backend instead of assuming GCS. Inside generate_signed_url (which currently calls gcs_blob_for and register_blob), detect whether GCS is enabled (or catch the UserError from gcs_blob_for) and: if GCS, keep the existing flow (generate_signed_url on the GCS blob and yield (upload_url, blob.public_url)); if local storage, obtain the local blob (e.g. local_blob_for or equivalent), do not call generate_signed_url, and yield a direct upload endpoint/path plus blob.public_url (or the same tuple shape) while still using register_blob for lifecycle tracking. Ensure the returned tuple shape and register_blob usage remain consistent so callers are backend-agnostic.
66-80:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winConstrain local-file paths to
MEDIA_ROOT.
save_local_file_from_bytes()still writesfilenameverbatim underMEDIA_ROOT / <uuid> / filename, so../or absolute names can escape the media root.delete_uploaded_url()has the inverse problem:/media/../../...survives the lexicalrelative_to()call andunlink()can target files outsideMEDIA_ROOT.Suggested fix
def delete_uploaded_url(url: str): from daras_ai_v2.doc_search_settings_widgets import is_user_uploaded_url @@ else: try: relpath = Path(furl(url).pathstr).relative_to(settings.MEDIA_URL) except ValueError: return - abspath = settings.MEDIA_ROOT / relpath + abspath = (settings.MEDIA_ROOT / relpath).resolve() + abspath.relative_to(settings.MEDIA_ROOT.resolve()) abspath.unlink(missing_ok=True) @@ def save_local_file_from_bytes(filename: str, data: bytes) -> tuple[Path, str]: - path = settings.MEDIA_ROOT / str(uuid.uuid1()) / filename + safe_name = safe_filename(Path(filename).name) or "upload.bin" + relpath = Path(str(uuid.uuid1())) / safe_name + path = (settings.MEDIA_ROOT / relpath).resolve() + path.relative_to(settings.MEDIA_ROOT.resolve()) path.parent.mkdir(parents=True, exist_ok=True) path.write_bytes(data) url = str( furl(settings.APP_BASE_URL) / settings.MEDIA_URL - / str(path.relative_to(settings.MEDIA_ROOT)) + / str(relpath) ) return path, urlAlso applies to: 191-200
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@daras_ai/image_input.py` around lines 66 - 80, delete_uploaded_url and save_local_file_from_bytes allow path traversal because they use Path.relative_to() and raw filenames; fix by normalizing and validating paths: in save_local_file_from_bytes strip any directory components from the provided filename (use Path(filename).name) and build the target under settings.MEDIA_ROOT; after building the path call .resolve() and verify the resolved path startswith(settings.MEDIA_ROOT.resolve()) before writing. In delete_uploaded_url, convert the URL path to a Path, join with settings.MEDIA_ROOT, call .resolve() and ensure the resolved path is inside settings.MEDIA_ROOT.resolve() before calling unlink(); apply the same validation to the other occurrence at lines ~191-200. Ensure any failure to validate aborts without touching the filesystem.
83-92:⚠️ Potential issue | 🟡 MinorFix implicit
Optionalforcontent_typeparameters flagged byRUF013.
- In
daras_ai/image_input.py,temp_upload_file_from_bytesusescontent_type: str = None(line 87); change tocontent_type: str | None = None.- Apply the same change to the other
content_type: str = Noneparameters also flagged by Ruff (lines 119 and 168).Suggested fix
def temp_upload_file_from_bytes( filename: str, data: bytes, - content_type: str = None, + content_type: str | None = None,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@daras_ai/image_input.py` around lines 83 - 92, Change implicit Optional annotations for content_type to use explicit union syntax; update the function signature for temp_upload_file_from_bytes to accept content_type: str | None = None and make the same change for the other functions in the file that currently declare content_type: str = None (the two additional functions flagged by Ruff). Locate the functions by name (temp_upload_file_from_bytes and the other two functions that accept a content_type parameter) and replace the parameter typing to use str | None so the typing is explicit and Ruff013 is satisfied.Source: Linters/SAST tools
daras_ai_v2/doc_search_settings_widgets.py (1)
20-25:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBuild a URL prefix here, not an OS path.
In the local branch,
os.path.join(settings.APP_BASE_URL, settings.MEDIA_URL)collapses to"/media/"becauseMEDIA_URLis absolute. That makesis_user_uploaded_url()treat unrelated URLs containing/media/as user-uploaded.Suggested fix
if settings.GS_BUCKET_NAME: - _user_media_url_prefix = os.path.join( - "storage.googleapis.com", settings.GS_BUCKET_NAME, settings.GS_MEDIA_PATH - ) + _user_media_url_prefix = ( + f"https://storage.googleapis.com/" + f"{settings.GS_BUCKET_NAME.strip('/')}/" + f"{settings.GS_MEDIA_PATH.strip('/')}/" + ) else: - _user_media_url_prefix = os.path.join(settings.APP_BASE_URL, settings.MEDIA_URL) + _user_media_url_prefix = ( + f"{settings.APP_BASE_URL.rstrip('/')}/{settings.MEDIA_URL.strip('/')}/" + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@daras_ai_v2/doc_search_settings_widgets.py` around lines 20 - 25, The code builds _user_media_url_prefix using os.path.join which treats MEDIA_URL as an OS path and collapses absolute MEDIA_URL ("/media/") causing incorrect prefixes and false positives in is_user_uploaded_url(); fix by constructing a proper URL string instead of os.path.join: for the GCS branch assemble "https://storage.googleapis.com" + bucket + media path ensuring single slashes, and for the local branch build the prefix by trimming trailing slash from settings.APP_BASE_URL and leading slash from settings.MEDIA_URL and concatenating with a single '/' (or use urllib.parse functions correctly) so _user_media_url_prefix is a valid URL that won't be collapsed. Ensure references to _user_media_url_prefix, settings.APP_BASE_URL, settings.MEDIA_URL, and is_user_uploaded_url() are used to locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@daras_ai/image_input.py`:
- Around line 108-113: The finally block that does path.unlink() should be made
idempotent to avoid raising when the file has already been removed (caller may
call delete_uploaded_url()); update the cleanup in the generator (the block that
calls save_local_file_from_bytes and yields public_url) to catch and ignore
FileNotFoundError (and optionally PermissionError) around path.unlink() so
deletion is attempted but failures due to missing file do not propagate.
---
Duplicate comments:
In `@daras_ai_v2/doc_search_settings_widgets.py`:
- Around line 20-25: The code builds _user_media_url_prefix using os.path.join
which treats MEDIA_URL as an OS path and collapses absolute MEDIA_URL
("/media/") causing incorrect prefixes and false positives in
is_user_uploaded_url(); fix by constructing a proper URL string instead of
os.path.join: for the GCS branch assemble "https://storage.googleapis.com" +
bucket + media path ensuring single slashes, and for the local branch build the
prefix by trimming trailing slash from settings.APP_BASE_URL and leading slash
from settings.MEDIA_URL and concatenating with a single '/' (or use urllib.parse
functions correctly) so _user_media_url_prefix is a valid URL that won't be
collapsed. Ensure references to _user_media_url_prefix, settings.APP_BASE_URL,
settings.MEDIA_URL, and is_user_uploaded_url() are used to locate the change.
In `@daras_ai_v2/settings.py`:
- Around line 275-276: MEDIA_ROOT and MEDIA_URL are hardcoded; change them to be
read from environment variables with sensible fallbacks so deployments can
override local-storage paths. Update the MEDIA_ROOT and MEDIA_URL assignments to
read os.environ (e.g., MEDIA_ROOT = Path(os.environ.get("MEDIA_ROOT", BASE_DIR /
"media")) and MEDIA_URL = os.environ.get("MEDIA_URL", "/media/")) ensuring you
import os and use BASE_DIR and Path consistently; preserve existing defaults if
env vars are not set.
In `@daras_ai/image_input.py`:
- Around line 140-153: generate_signed_url currently always calls gcs_blob_for
and fails in local-storage mode; change it to branch on the storage backend
instead of assuming GCS. Inside generate_signed_url (which currently calls
gcs_blob_for and register_blob), detect whether GCS is enabled (or catch the
UserError from gcs_blob_for) and: if GCS, keep the existing flow
(generate_signed_url on the GCS blob and yield (upload_url, blob.public_url));
if local storage, obtain the local blob (e.g. local_blob_for or equivalent), do
not call generate_signed_url, and yield a direct upload endpoint/path plus
blob.public_url (or the same tuple shape) while still using register_blob for
lifecycle tracking. Ensure the returned tuple shape and register_blob usage
remain consistent so callers are backend-agnostic.
- Around line 66-80: delete_uploaded_url and save_local_file_from_bytes allow
path traversal because they use Path.relative_to() and raw filenames; fix by
normalizing and validating paths: in save_local_file_from_bytes strip any
directory components from the provided filename (use Path(filename).name) and
build the target under settings.MEDIA_ROOT; after building the path call
.resolve() and verify the resolved path
startswith(settings.MEDIA_ROOT.resolve()) before writing. In
delete_uploaded_url, convert the URL path to a Path, join with
settings.MEDIA_ROOT, call .resolve() and ensure the resolved path is inside
settings.MEDIA_ROOT.resolve() before calling unlink(); apply the same validation
to the other occurrence at lines ~191-200. Ensure any failure to validate aborts
without touching the filesystem.
- Around line 83-92: Change implicit Optional annotations for content_type to
use explicit union syntax; update the function signature for
temp_upload_file_from_bytes to accept content_type: str | None = None and make
the same change for the other functions in the file that currently declare
content_type: str = None (the two additional functions flagged by Ruff). Locate
the functions by name (temp_upload_file_from_bytes and the other two functions
that accept a content_type parameter) and replace the parameter typing to use
str | None so the typing is explicit and Ruff013 is satisfied.
In `@server.py`:
- Around line 72-75: The media route is only mounted if
settings.MEDIA_ROOT.exists(), causing fresh local-storage deployments to 404;
remove the existence check so the mount always happens when local storage is
enabled (i.e., when not settings.GS_BUCKET_NAME), and ensure the directory is
created at startup (call settings.MEDIA_ROOT.mkdir(parents=True, exist_ok=True)
or equivalent) before calling app.mount(settings.MEDIA_URL,
StaticFiles(directory=settings.MEDIA_ROOT), name="media"); reference
settings.GS_BUCKET_NAME, settings.MEDIA_ROOT, settings.MEDIA_URL, app.mount,
StaticFiles and save_local_file_from_bytes() when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e77eb9d4-cccf-43a1-bd2a-fbd633cbbdee
📒 Files selected for processing (14)
.env.exampleconfiguration.mddaras_ai/image_input.pydaras_ai_v2/doc_search_settings_widgets.pydaras_ai_v2/gpu_server.pydaras_ai_v2/settings.pyfunctions/composio_tools.pyfunctions/inbuilt_tools.pyglossary_resources/tests.pylivekit_agent.pyrecipes/ModelTrainer.pyrecipes/TextToSpeech.pyrouters/static_pages.pyserver.py
💤 Files with no reviewable changes (1)
- .env.example
✅ Files skipped from review due to trivial changes (1)
- configuration.md
🚧 Files skipped from review as they are similar to previous changes (6)
- routers/static_pages.py
- functions/inbuilt_tools.py
- glossary_resources/tests.py
- daras_ai_v2/gpu_server.py
- livekit_agent.py
- recipes/TextToSpeech.py
Split auth into base/firebase/local routers selected by
settings.ENABLE_FIREBASE_AUTH so self-hosters can run without Firebase.
Add an AppUser.password field, django admin password reset, and
LoginForm/ForgotPasswordForm components for the local login flow.
Co-authored-by: Kaustubh M 37668193+nikochiko@users.noreply.github.com### Q/A checklist
How to check import time?
You can visualize this using tuna:
To measure import time for a specific library:
To reduce import times, import libraries that take a long time inside the functions that use them instead of at the top of the file:
Legal Boilerplate
Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.