Summary
gemini-3.5-flash can take 50-120s, return MAX_TOKENS, or return MALFORMED_FUNCTION_CALL when generating a large string inside a function-call argument.
This reproduces with a direct @google/genai call, without any agent framework or application wrapper.
Minimal repro gist:
https://gist.github.com/uriva/6bc9e20d1915c8d655ad3b3558796977
Environment
- SDK: @google/genai@2.4.0
- Runtime: Deno
- Model: gemini-3.5-flash
- API: Gemini API via API key
- Call: ai.models.generateContent(request)
- Streaming: not required to reproduce
Repro shape
The repro sends one user message and one declared tool:
- Tool: run_command
- Args schema:
  - command: string
  - params: any
- Prompt asks the model to call run_command with command: code_execution/write_file
- params.content should contain a complete single-file HTML landing page
Request size is very small:
- About 1.5KB serialized request
- About 268 prompt tokens in one observed run
- One user content item
- One function declaration
- No previous turns
- No thought signatures in input
Config:
thinkingConfig: {
  includeThoughts: false,
  thinkingLevel: ThinkingLevel.LOW,
},
maxOutputTokens: 16000,
toolConfig: { functionCallingConfig: {} },
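For readers who do not open the gist, the request shape described above can be sketched as a plain object. This is an approximation, not the gist's exact code: the prompt wording and tool description are paraphrased, the use of parametersJsonSchema with an unconstrained params property is an assumption about how "params: any" is declared, and the string "LOW" is assumed to match ThinkingLevel.LOW from the SDK.

```typescript
// Sketch of the repro request as plain data. The gist is the authoritative version.
// Assumptions: prompt text and description are paraphrased; parametersJsonSchema and
// the "LOW" string (standing in for ThinkingLevel.LOW) are not confirmed by this report.
const request = {
  model: "gemini-3.5-flash",
  contents: [
    {
      role: "user",
      parts: [
        {
          text:
            "Call run_command with command code_execution/write_file. " +
            "params.content should contain a complete single-file HTML landing page.",
        },
      ],
    },
  ],
  config: {
    thinkingConfig: { includeThoughts: false, thinkingLevel: "LOW" },
    maxOutputTokens: 16000,
    toolConfig: { functionCallingConfig: {} },
    tools: [
      {
        functionDeclarations: [
          {
            name: "run_command",
            description: "Run a named command with arbitrary params.",
            parametersJsonSchema: {
              type: "object",
              properties: {
                command: { type: "string" },
                params: {}, // unconstrained, standing in for "any"
              },
              required: ["command", "params"],
            },
          },
        ],
      },
    ],
  },
};

// Character count of the serialized request (equal to bytes for ASCII-only content).
const serializedChars: number = JSON.stringify(request).length;
```

In the actual repro this object would be passed to ai.models.generateContent(request); the sketch stops short of the network call so it can be inspected offline.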
Observed behavior
Across repeated runs of the same direct SDK request, outcomes vary:
- Slow successful function call
  - ~50-60s latency
  - finishReason: STOP
  - returned run_command function call
  - generated ~48KB-56KB JSON function-call args
- Token exhaustion
  - ~120s latency
  - finishReason: MAX_TOKENS
  - returned partial/large run_command function call args around ~56KB-60KB
- Malformed function call
  - ~122s latency
  - finishReason: MALFORMED_FUNCTION_CALL
  - no content parts exposed by the SDK response
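The three outcomes above can be told apart mechanically when tallying repeated runs. The sketch below assumes each run has been reduced to the summary shape used in this report's logged examples (elapsedMs, finishReason, parts); that shape is the repro's own logging format, not the SDK response type, and the 30-second "slow" threshold is a hypothetical cutoff chosen only for illustration.

```typescript
// Summary shape mirrors the logged examples in this report, not the SDK's response type.
type RunSummary = {
  elapsedMs: number;
  finishReason: string;
  parts: { functionCall?: string; argBytes?: number }[];
};

type Outcome = "ok" | "slow-success" | "token-exhaustion" | "malformed" | "other";

// Hypothetical cutoff: a successful run slower than this is counted as "slow".
const SLOW_MS = 30_000;

function classify(run: RunSummary): Outcome {
  if (run.finishReason === "MALFORMED_FUNCTION_CALL") return "malformed";
  if (run.finishReason === "MAX_TOKENS") return "token-exhaustion";
  const hasCall = run.parts.some((p) => p.functionCall !== undefined);
  if (run.finishReason === "STOP" && hasCall) {
    return run.elapsedMs > SLOW_MS ? "slow-success" : "ok";
  }
  return "other";
}

// Tally outcomes across a batch of runs.
function tally(runs: RunSummary[]): Record<Outcome, number> {
  const counts: Record<Outcome, number> = {
    "ok": 0,
    "slow-success": 0,
    "token-exhaustion": 0,
    "malformed": 0,
    "other": 0,
  };
  for (const run of runs) counts[classify(run)] += 1;
  return counts;
}
```

Checking finishReason before inspecting parts matters here because a malformed run exposes no parts at all, so a parts-first check would misfile it as "other".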
Example observed malformed run:
{
  "elapsedMs": 122344,
  "finishReason": "MALFORMED_FUNCTION_CALL",
  "usage": {
    "promptTokenCount": 268,
    "totalTokenCount": 268,
    "serviceTier": "standard"
  },
  "parts": []
}
Example observed slow successful run:
{
  "elapsedMs": 50557,
  "finishReason": "STOP",
  "usage": {
    "promptTokenCount": 234,
    "candidatesTokenCount": 12581,
    "totalTokenCount": 13288
  },
  "parts": [
    {
      "functionCall": "run_command",
      "argBytes": 48015
    }
  ]
}
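The numbers in the slow successful run allow a rough back-of-the-envelope check of how close a normal run sits to the 16000-token output budget. The sketch below assumes that essentially all candidate tokens went into the function-call args and that generation rate is roughly constant; both are simplifications, and the function and field names are illustrative only. Under those assumptions the run works out to roughly 3.8 bytes per token and about 250 tokens per second, which puts the budget at around 61KB of args and around 64 seconds of generation. That is consistent with MAX_TOKENS runs stopping at ~56KB-60KB, while the observed ~120s latency for those runs is about twice what the successful run's rate would predict.

```typescript
// Figures copied from the slow successful run logged above.
const slowRun = { elapsedMs: 50557, candidatesTokenCount: 12581, argBytes: 48015 };
const maxOutputTokens = 16000;

type Estimate = {
  bytesPerToken: number;     // arg bytes produced per candidate token
  tokensPerSecond: number;   // observed generation rate
  maxArgBytes: number;       // arg size reachable before hitting the token budget
  secondsToBudget: number;   // time to exhaust the budget at the observed rate
};

// Assumes all candidate tokens became function-call args and the rate is constant.
function estimate(
  run: { elapsedMs: number; candidatesTokenCount: number; argBytes: number },
  maxTokens: number,
): Estimate {
  const bytesPerToken = run.argBytes / run.candidatesTokenCount;
  const tokensPerSecond = run.candidatesTokenCount / (run.elapsedMs / 1000);
  return {
    bytesPerToken,
    tokensPerSecond,
    maxArgBytes: bytesPerToken * maxTokens,
    secondsToBudget: maxTokens / tokensPerSecond,
  };
}

const budgetEstimate = estimate(slowRun, maxOutputTokens);
```

The practical reading is that a "complete single-file HTML landing page" of ~48KB-56KB already consumes most of the output budget, so modest variation in output length is enough to tip a run from STOP into MAX_TOKENS.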
Expected behavior
Large string arguments in function calls should either:
- complete reliably within reasonable latency, or
- fail quickly with an actionable error, or
- expose enough partial output/diagnostics to understand why the function call became malformed.
A large file write via a tool argument is a valid agentic workflow. The issue seems specific to generating large structured function-call JSON arguments, not to selecting the tool.
Notes
- This is not caused by our app's agent wrapper; the gist calls GoogleGenAI directly.
- This is not caused by prior thought signatures; the minimal repro has no prior model turns.
- This is not about many tools; the minimal repro has one declared function.
- The model often does find the intended tool and starts generating params.content correctly, but generation is slow/flaky and sometimes ends as MAX_TOKENS or MALFORMED_FUNCTION_CALL.