Streamed JSONL responses with no content-type header are returned unparsed #208

Description

@jimryan

Bug: messages.batches.results_streaming yields raw JSON strings instead of coerced model objects when the results response has no content-type header

Environment

  • anthropic gem: 1.49.0
  • Ruby: 4.0.3

Summary

Iterating a batch results stream yields raw JSONL line Strings instead of coerced Anthropic::Models::Messages::MessageBatchIndividualResponse objects.
Any attribute access on the yielded value raises NoMethodError.

client = Anthropic::Client.new

client.messages.batches.results_streaming("msgbatch_...").each do |result|
  result.custom_id
  # => NoMethodError: undefined method 'custom_id' for an instance of String
end

The yielded objects are the raw JSONL lines:

item = nil
client.messages.batches.results_streaming("msgbatch_...").each { |r| item = r; break }

item.class    # => String  (expected: Anthropic::Models::Messages::MessageBatchIndividualResponse)
item[0, 60]   # => "{\"custom_id\":\"foo-1\",\"result\":{\"type\":\"succeeded\",\"mes"

Expected

The stream should yield coerced MessageBatchIndividualResponse objects (.custom_id, .result.type, .result.message, etc.), as the type signature and docs imply.

Root cause

The batch-results endpoint returns the .jsonl body with no content-type response header:

require "net/http"
uri = URI("https://api.anthropic.com/v1/messages/batches/msgbatch_.../results")
req = Net::HTTP::Get.new(uri)
req["x-api-key"] = ENV["ANTHROPIC_API_KEY"]
req["anthropic-version"] = "2023-06-01"
req["accept"] = "application/x-jsonl"
res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { _1.request(req) }

res.code              # => "200"
res["content-type"]   # => nil

Anthropic::Internal::Util.decode_content (lib/anthropic/internal/util.rb) decides whether to parse JSONL by matching the response content-type against a regex:

def decode_content(headers, stream:, suppress_error: false)
  case (content_type = headers["content-type"])
  in Anthropic::Internal::Util::JSON_CONTENT
    return nil if (json = stream.to_a.join).empty?
    begin
      JSON.parse(json, symbolize_names: true)
    rescue JSON::ParserError => e
      raise e unless suppress_error
      json
    end
  in Anthropic::Internal::Util::JSONL_CONTENT
    lines = decode_lines(stream)
    chain_fused(lines) do |y|
      lines.each do
        next if _1.empty?
        y << JSON.parse(_1, symbolize_names: true)
      end
    end
  in %r{^text/event-stream}
    lines = decode_lines(stream)
    decode_sse(lines)
  else
    text = stream.to_a.join
    force_charset!(content_type, text: text)
    StringIO.new(text)
  end
end

Because content_type is nil, none of the case/in regex patterns match (Regexp === nil is always false), so it falls into the else branch and returns the unparsed body as a StringIO.
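The fall-through can be reproduced without the gem. The snippet below is a minimal sketch, not the gem's code: `JSONL_CONTENT` here is a hypothetical stand-in regex (the real constant may differ) and `branch_for` is an illustrative helper that only reports which branch a given header value would take.

```ruby
# Hypothetical stand-in for Anthropic::Internal::Util::JSONL_CONTENT;
# the gem's actual regex may differ.
JSONL_CONTENT = %r{^application/(?:x-)?jsonl}

# Illustrative helper: reports which branch a content-type value selects.
def branch_for(content_type)
  case content_type
  in JSONL_CONTENT then :jsonl
  in %r{^text/event-stream} then :sse
  else :raw
  end
end

JSONL_CONTENT === nil              # => false
branch_for("application/x-jsonl")  # => :jsonl
branch_for(nil)                    # => :raw
```

A missing header therefore behaves exactly like an unrecognised one, which is why the body comes back undecoded.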

JsonLStream#iterator (lib/anthropic/internal/json_l_stream.rb) then iterates that StringIO line by line and coerces each:

@iterator ||= Anthropic::Internal::Util.chain_fused(@stream) do |y|
  @stream.each do
    y << Anthropic::Internal::Type::Converter.coerce(@model, _1)
  end
end

Converter.coerce(MessageBatchIndividualResponse, "<raw json string>") can't convert a String to the model, so it leniently returns the string unchanged, and the raw strings reach the caller.

The request side already knows the format: Resources::Messages::Batches#results_streaming sets accept: application/x-jsonl and wraps the response in JsonLStream. The parse decision in decode_content, however, keys off the (absent) response content-type instead of the expected type.

Suggested fixes (any one resolves it)

  1. In decode_content, when the response is being consumed as a JsonLStream (or when accept/expected type is JSONL), force the JSONL decode path rather than relying solely on the response content-type header.
  2. Make JsonLStream#iterator JSON-parse raw String lines itself (defensive against an upstream that yields unparsed lines).
  3. Have results_streaming pass its known expected content type into the decode step so a missing response header falls back to JSONL rather than the raw else branch.
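As a rough illustration of option 2, the defensive parse could look something like the sketch below. It uses only the standard library. `each_jsonl_record` is a hypothetical helper name, not part of the gem, and the real fix would presumably live inside JsonLStream#iterator and hand the parsed Hash to Converter.coerce.

```ruby
require "json"
require "stringio"

# Hypothetical helper (not part of the gem): yields one parsed record per
# non-empty line. Values that are not Strings are assumed to be already
# decoded upstream and are passed through untouched.
def each_jsonl_record(stream)
  return enum_for(:each_jsonl_record, stream) unless block_given?

  stream.each do |line|
    unless line.is_a?(String)
      yield line
      next
    end

    stripped = line.strip
    next if stripped.empty?

    # Symbol keys, since coercion into the model requires them.
    yield JSON.parse(stripped, symbolize_names: true)
  end
end

# Simulates the undecoded body that the else branch wraps in a StringIO.
body = StringIO.new(
  %({"custom_id":"foo-1","result":{"type":"succeeded"}}\n\n) +
  %({"custom_id":"foo-2","result":{"type":"errored"}}\n)
)

records = each_jsonl_record(body).to_a
records.map { |r| r[:custom_id] }  # => ["foo-1", "foo-2"]
```

Because already-decoded values pass through unchanged, a guard like this would stay harmless if the endpoint later starts sending a content-type header.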

Workaround

Parse + coerce each line manually (note: coercion requires symbol keys):

MODEL = Anthropic::Models::Messages::MessageBatchIndividualResponse

client.messages.batches.results_streaming(batch_id).each do |raw|
  result =
    if raw.is_a?(String)
      Anthropic::Internal::Type::Converter.coerce(MODEL, JSON.parse(raw, symbolize_names: true))
    else
      raw
    end
  # result.custom_id, result.result.type, result.result.message, ...
end
