fix(community): resolve document_ids MIME types via files().get (#1867) by lcmartinezdev · Pull Request #1868 · langchain-ai/langchain-google

Luis Martinez (lcmartinezdev) · 2026-06-22T23:59:29Z

Description

GoogleDriveLoader.load() fails with HTTP 400 whenever document_ids is set. _load_documents_from_ids resolves MIME types with files().list(q="id = '...'"), but the Drive API q parameter has no id field, so the request is rejected with "Invalid Value". As a result, loading by document_ids is broken against the real API on every release since #1644 (4.0.0, 5.0.0, and main); 3.x is not affected. The mocked unit tests did not catch it because the Drive service is stubbed.

Fix

Resolve each id with files().get(fileId=..., fields="mimeType") — the correct way to fetch a file by id. The per-type dispatch added in #1644 (Sheets → _load_sheet_from_id, PDF → _load_file_from_id, else → _load_document_from_id) is unchanged.
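
For illustration only, the lookup and dispatch described above could be sketched as follows. This is not the loader's actual code: `resolve_mime_type` and `pick_loader_name` are hypothetical helpers, `service` is assumed to be a Drive v3 client built with google-api-python-client, and the real change may pass additional parameters to files().get.

```python
# Standard MIME type strings for Google Sheets and PDF files.
SHEET_MIME = "application/vnd.google-apps.spreadsheet"
PDF_MIME = "application/pdf"


def resolve_mime_type(service, file_id):
    """Fetch one file's MIME type by id with files().get (hypothetical helper)."""
    metadata = service.files().get(fileId=file_id, fields="mimeType").execute()
    return metadata["mimeType"]


def pick_loader_name(service, file_id):
    """Return the name of the loader method the per-type dispatch would choose."""
    mime_type = resolve_mime_type(service, file_id)
    if mime_type == SHEET_MIME:
        return "_load_sheet_from_id"
    if mime_type == PDF_MIME:
        return "_load_file_from_id"
    return "_load_document_from_id"
```

The broken call, files().list(q="id = '...'"), has no counterpart here because the q parameter cannot filter on id.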

Tests

Updated the mock service to stub files().get() (the call the loader now makes).
Added a regression test asserting MIME resolution uses files().get(fileId=...) and never files().list.
All unit tests pass; ruff and mypy are clean.
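
A rough sketch of the shape such a regression test might take, using unittest.mock. It exercises a hypothetical stand-in function, `resolve_mime_type`, rather than the real GoogleDriveLoader, so the names and structure here are assumptions and not the test added in this PR.

```python
from unittest.mock import MagicMock


def resolve_mime_type(service, file_id):
    """Stand-in for the loader's lookup (hypothetical, not the real method)."""
    return service.files().get(fileId=file_id, fields="mimeType").execute()["mimeType"]


def test_mime_resolution_uses_get_not_list():
    service = MagicMock()
    files = service.files.return_value
    # Stub the response of files().get(...).execute().
    files.get.return_value.execute.return_value = {"mimeType": "application/pdf"}

    assert resolve_mime_type(service, "abc123") == "application/pdf"

    # The lookup must go through files().get(fileId=...) ...
    files.get.assert_called_once_with(fileId="abc123", fields="mimeType")
    # ... and must never fall back to files().list.
    files.list.assert_not_called()
```

Asserting that files().list is never called is what guards against the original bug returning, since a stubbed list call would otherwise succeed silently.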

Note

This makes one files().get call per id instead of a single files().list. That is required, since files().list cannot query by id, and document_ids lists are typically small.

Fixes #1867

…chain-ai#1867) GoogleDriveLoader._load_documents_from_ids resolved each file's MIME type with files().list(q="id = '...'"). The Drive API `q` parameter has no `id` field, so that request is rejected with HTTP 400 "Invalid Value" against the real API, and load() fails for every document_id. Mocked tests did not catch it because the Drive service is stubbed. Resolve each id with files().get(fileId=..., fields="mimeType") instead, which is the correct way to fetch a file by id. The per-type dispatch from langchain-ai#1644 (Sheets -> _load_sheet_from_id, PDF -> _load_file_from_id, else -> _load_document_from_id) is unchanged. Fixes langchain-ai#1867

Luis Martinez (lcmartinezdev) · 2026-06-23T00:05:30Z

Why `files.get` per id (and not a single request)

Per the Drive API docs, id is not a valid files.list query term. The supported q terms are:

name, fullText, mimeType, modifiedTime, viewedByMeTime, trashed, starred, parents, owners, writers, readers, sharedWithMe, createdTime, properties, appProperties, visibility, shortcutDetails.targetId

So there is no "fixed query" version of files.list(q="id = '...'") — you fetch a file by id with files.get, not with search. That leaves two shapes:

- files.get per id (this fix): N calls, simple and standard.
- Batch HTTP request: one round-trip for up to 100 ids, but more error-handling code, and it does not reduce quota. The docs state "n requests batched together count as n requests, not one."

Batching would only save round-trip latency, not API usage, so it adds complexity for a benefit most callers don't need (id lists are usually short). I went with files.get per id: it is the documented way to fetch by id and keeps the change minimal. A batched variant can be added later if a caller needs to load many ids at once.
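
If a batched variant is ever wanted, it might look roughly like the sketch below. This is an assumption-laden illustration, not part of this PR: `resolve_mime_types_batched` is a hypothetical helper, `service` is assumed to be a Drive v3 client from google-api-python-client (which provides `new_batch_http_request`), and the chunk size of 100 follows the per-batch limit mentioned above.

```python
def resolve_mime_types_batched(service, file_ids, chunk_size=100):
    """Resolve MIME types for many ids using batched files().get calls.

    Hypothetical helper. Returns (mime_types, errors), both keyed by file id.
    Ids must be unique, because each id doubles as the batch request id.
    """
    mime_types = {}
    errors = {}

    def on_response(request_id, response, exception):
        # The client invokes this callback once per request in the batch.
        if exception is not None:
            errors[request_id] = exception
        else:
            mime_types[request_id] = response["mimeType"]

    for start in range(0, len(file_ids), chunk_size):
        batch = service.new_batch_http_request(callback=on_response)
        for file_id in file_ids[start:start + chunk_size]:
            request = service.files().get(fileId=file_id, fields="mimeType")
            batch.add(request, request_id=file_id)
        batch.execute()  # one HTTP round-trip per chunk

    return mime_types, errors
```

The per-request error dictionary is the extra error-handling cost noted above: one failing id no longer raises on its own, so the caller has to decide what to do with partial results.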

Docs: file-specific query terms, batching.

Luis Martinez (lcmartinezdev) wants to merge 1 commit into langchain-ai:main from lcmartinezdev:fix/drive-loader-document-ids-files-get
