Gemini 3.5 Flash slow or malformed when generating large function-call arguments #1619

## Summary

`gemini-3.5-flash` can take 50-120s, return `MAX_TOKENS`, or return `MALFORMED_FUNCTION_CALL` when generating a large string inside a function-call argument.

This reproduces with a direct `@google/genai` call, without any agent framework or application wrapper.

Minimal repro gist:
https://gist.github.com/uriva/6bc9e20d1915c8d655ad3b3558796977

## Environment

- SDK: `@google/genai@2.4.0`
- Runtime: Deno
- Model: `gemini-3.5-flash`
- API: Gemini API via API key
- Call: `ai.models.generateContent(request)`
- Streaming: not required to reproduce

## Repro shape

The repro sends one user message and one declared tool:

- Tool: `run_command`
- Args schema:
  - `command: string`
  - `params: any`
- Prompt asks the model to call `run_command` with `command: code_execution/write_file`
- `params.content` should contain a complete single-file HTML landing page
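
For readers who have not opened the gist, the request described above can be approximated as plain data. This is a sketch, not the gist's exact code: the prompt wording, the tool description, the `buildRequest` and `serializedBytes` helper names, and the use of `parametersJsonSchema` with an empty schema to stand in for `params: any` are all assumptions.

```typescript
// Sketch of the repro request as plain data (no SDK import needed to build it).
// Assumptions: prompt text, helper names, and the JSON Schema form of the args.
interface ReproRequest {
  model: string;
  contents: { role: string; parts: { text: string }[] }[];
  config: {
    tools: { functionDeclarations: Record<string, unknown>[] }[];
  };
}

function buildRequest(): ReproRequest {
  return {
    model: "gemini-3.5-flash",
    contents: [
      {
        role: "user",
        parts: [{
          text:
            "Call run_command with command code_execution/write_file. " +
            "Put a complete single-file HTML landing page in params.content.",
        }],
      },
    ],
    config: {
      tools: [{
        functionDeclarations: [{
          name: "run_command",
          description: "Runs a named command with parameters.",
          parametersJsonSchema: {
            type: "object",
            properties: {
              command: { type: "string" },
              params: {}, // empty schema: accepts any JSON value
            },
            required: ["command", "params"],
          },
        }],
      }],
    },
  };
}

// Serialized size in bytes, to confirm that the request itself is tiny.
function serializedBytes(value: unknown): number {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}
```

With the SDK installed, an object like this (plus the `thinkingConfig`, `maxOutputTokens`, and `toolConfig` values from the Config block) would be what gets passed to `ai.models.generateContent`.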

The request itself is very small:

- About 1.5KB serialized request
- 268 prompt tokens in one observed run (234 in another)
- One user content item
- One function declaration
- No previous turns
- No thought signatures in input

Config:

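```ts
import { ThinkingLevel } from "@google/genai";

// Fields set on the request's `config` (tool declarations omitted here).
const config = {
  thinkingConfig: {
    includeThoughts: false,
    thinkingLevel: ThinkingLevel.LOW,
  },
  maxOutputTokens: 16000,
  toolConfig: { functionCallingConfig: {} }, // no calling mode set
};
```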

## Observed behavior

Across repeated runs of the same direct SDK request, outcomes vary:

1. Slow successful function call
   - ~50-60s latency
   - `finishReason: STOP`
   - returned `run_command` function call
   - generated ~48KB-56KB JSON function-call args

2. Token exhaustion
   - ~120s latency
   - `finishReason: MAX_TOKENS`
   - returned partial/large `run_command` function-call args of ~56KB-60KB

3. Malformed function call
   - ~122s latency
   - `finishReason: MALFORMED_FUNCTION_CALL`
   - no content parts exposed by the SDK response
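
The three outcomes can be told apart from the response alone. Below is a minimal sketch of the kind of summary used for the example runs in this report, assuming a response shaped like the SDK's `GenerateContentResponse` (`candidates[0].finishReason`, `candidates[0].content.parts[].functionCall`, `usageMetadata`). The `summarize` and `classify` names and the outcome labels are hypothetical and not taken from the gist.

```typescript
// Minimal structural types; only the fields this sketch reads are declared.
interface FunctionCallPart {
  functionCall?: { name?: string; args?: unknown };
}
interface ResponseLike {
  candidates?: {
    finishReason?: string;
    content?: { parts?: FunctionCallPart[] };
  }[];
  usageMetadata?: Record<string, unknown>;
}

type Outcome = "ok" | "truncated" | "malformed" | "other";

// Reduce a response to elapsed time, finish reason, usage, and arg sizes.
function summarize(response: ResponseLike, elapsedMs: number) {
  const candidate = response.candidates?.[0];
  const parts = (candidate?.content?.parts ?? [])
    .filter((p) => p.functionCall !== undefined)
    .map((p) => ({
      functionCall: p.functionCall?.name ?? "",
      // Size of the JSON-serialized arguments in UTF-8 bytes.
      argBytes: new TextEncoder()
        .encode(JSON.stringify(p.functionCall?.args ?? {})).length,
    }));
  return {
    elapsedMs,
    finishReason: candidate?.finishReason ?? "UNKNOWN",
    usage: response.usageMetadata ?? {},
    parts,
  };
}

// Map a finish reason (plus whether any function call came back) to an outcome.
function classify(finishReason: string, partCount: number): Outcome {
  if (finishReason === "STOP" && partCount > 0) return "ok";
  if (finishReason === "MAX_TOKENS") return "truncated";
  if (finishReason === "MALFORMED_FUNCTION_CALL") return "malformed";
  return "other";
}
```

Keeping the classification separate from the summary makes it easy to tally outcomes across repeated runs of the same request.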

Example observed malformed run:

```json
{
  "elapsedMs": 122344,
  "finishReason": "MALFORMED_FUNCTION_CALL",
  "usage": {
    "promptTokenCount": 268,
    "totalTokenCount": 268,
    "serviceTier": "standard"
  },
  "parts": []
}
```

Example observed slow successful run:

```json
{
  "elapsedMs": 50557,
  "finishReason": "STOP",
  "usage": {
    "promptTokenCount": 234,
    "candidatesTokenCount": 12581,
    "totalTokenCount": 13288
  },
  "parts": [
    {
      "functionCall": "run_command",
      "argBytes": 48015
    }
  ]
}
```
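
The successful run gives a rough bytes-per-token ratio, which hints at why slightly longer pages end in `MAX_TOKENS`. The following is a back-of-the-envelope sketch using only the numbers reported above. It assumes the ratio is similar across runs and ignores any tokens spent on thinking, so the result is an estimate at best.

```typescript
// Numbers taken from the slow successful run and the repro config.
const argBytes = 48015;
const candidatesTokenCount = 12581;
const maxOutputTokens = 16000;

// Average bytes of function-call argument JSON produced per output token.
const bytesPerToken = argBytes / candidatesTokenCount;

// Approximate largest argument payload that fits in the output budget.
const projectedMaxArgBytes = Math.round(bytesPerToken * maxOutputTokens);

console.log(bytesPerToken.toFixed(2), projectedMaxArgBytes);
```

Under these assumptions the ceiling comes out at roughly 61,000 bytes, which is broadly consistent with the ~56KB-60KB argument sizes seen in the `MAX_TOKENS` runs: the output budget appears to run out before the JSON is closed.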

## Expected behavior

Large string arguments in function calls should do at least one of the following:

- complete reliably within reasonable latency, or
- fail quickly with an actionable error, or
- expose enough partial output/diagnostics to understand why the function call became malformed.

A large file write via a tool argument is a valid agentic workflow. The issue seems specific to generating large structured function-call JSON arguments, not to selecting the right tool.

## Notes

- This is not caused by our app's agent wrapper; the gist calls `GoogleGenAI` directly.
- This is not caused by prior thought signatures; the minimal repro has no prior model turns.
- This is not about many tools; the minimal repro has one declared function.
- The model often does find the intended tool and starts generating `params.content` correctly, but generation is slow/flaky and sometimes ends as `MAX_TOKENS` or `MALFORMED_FUNCTION_CALL`.
