…chain-ai#1867)
GoogleDriveLoader._load_documents_from_ids resolved each file's MIME type
with files().list(q="id = '...'"). The Drive API `q` parameter has no `id`
field, so that request is rejected with HTTP 400 "Invalid Value" against the
real API, and load() fails for every document_id. Mocked tests did not catch
it because the Drive service is stubbed.
Resolve each id with files().get(fileId=..., fields="mimeType") instead, which
is the correct way to fetch a file by id. The per-type dispatch from langchain-ai#1644
(Sheets -> _load_sheet_from_id, PDF -> _load_file_from_id, else ->
_load_document_from_id) is unchanged.
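As a rough illustration of the lookup and dispatch described above (a sketch, not the loader's actual code): the helper names `resolve_mime_type` and `pick_loader_name` are hypothetical, `service` is assumed to be a Drive v3 client built with google-api-python-client, and the two MIME-type constants are the standard Drive values, assumed here to be what the dispatch compares against.

```python
from typing import Any

# Standard MIME types; assumed to match what the loader's dispatch checks.
SHEET_MIME = "application/vnd.google-apps.spreadsheet"
PDF_MIME = "application/pdf"


def resolve_mime_type(service: Any, file_id: str) -> str:
    """Fetch a single file's MIME type by id (hypothetical helper).

    files().get addresses the file directly by id, so no `q` search
    expression is involved.
    """
    metadata = service.files().get(fileId=file_id, fields="mimeType").execute()
    return metadata["mimeType"]


def pick_loader_name(mime_type: str) -> str:
    """Return the name of the loader method the per-type dispatch selects."""
    if mime_type == SHEET_MIME:
        return "_load_sheet_from_id"
    if mime_type == PDF_MIME:
        return "_load_file_from_id"
    # Everything else is handled as a Google Doc.
    return "_load_document_from_id"
```

The rejected form, by contrast, passed `q="id = '...'"` to `files().list`, which the API refuses because `id` is not a queryable field.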
Fixes langchain-ai#1867
Description
`GoogleDriveLoader.load()` fails with HTTP 400 for any `document_ids`. `_load_documents_from_ids` resolves MIME types with `files().list(q="id = '...'")`, but the Drive API `q` parameter has no `id` field, so the request is rejected with "Invalid Value". Loading by `document_ids` is broken against the real API on every release since #1644 (4.0.0, 5.0.0, and `main`). 3.x is not affected. Mocked unit tests did not catch it because the Drive service is stubbed.
Fix
Resolve each id with `files().get(fileId=..., fields="mimeType")`, the correct way to fetch a file by id. The per-type dispatch added in #1644 (Sheets → `_load_sheet_from_id`, PDF → `_load_file_from_id`, else → `_load_document_from_id`) is unchanged.
Tests
- `files().get()` (the call the loader now makes).
- `files().get(fileId=...)` and never `files().list`.
- `ruff` and `mypy` are clean.
Note
This makes one `files().get` call per id instead of a single `files().list`. That is required, since `files().list` cannot query by id, and `document_ids` lists are typically small.
Fixes #1867
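The call pattern described under Tests and Note (one `files().get` per id, and no `files().list`) could be checked against a stubbed service roughly as follows. This is a sketch under stated assumptions: `resolve_mime_types` is a hypothetical stand-in for the loader's lookup, `make_stub_service` is a hypothetical helper, and the ids are made up; the PR's actual test code is not shown here.

```python
from typing import Any, Dict, List
from unittest.mock import MagicMock


def resolve_mime_types(service: Any, document_ids: List[str]) -> Dict[str, str]:
    """Hypothetical stand-in for the loader's lookup: one files().get per id."""
    files = service.files()
    return {
        doc_id: files.get(fileId=doc_id, fields="mimeType").execute()["mimeType"]
        for doc_id in document_ids
    }


def make_stub_service(mime_by_id: Dict[str, str]) -> MagicMock:
    """Build a fake Drive service whose files().get answers from a dict."""
    service = MagicMock()
    files = service.files.return_value

    def fake_get(fileId: str, fields: str) -> MagicMock:
        # Mimic the request object returned by files().get(...).
        request = MagicMock()
        request.execute.return_value = {"mimeType": mime_by_id[fileId]}
        return request

    files.get.side_effect = fake_get
    return service


# Made-up ids for illustration only.
stub = make_stub_service(
    {
        "doc-1": "application/pdf",
        "sheet-1": "application/vnd.google-apps.spreadsheet",
    }
)
resolved = resolve_mime_types(stub, ["doc-1", "sheet-1"])

stub_files = stub.files.return_value
assert stub_files.get.call_count == 2  # one request per id
stub_files.list.assert_not_called()    # the search endpoint is never used
```

A stub like this only verifies which methods are called. It cannot reproduce the HTTP 400 itself, which is why the original bug passed the mocked tests.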