Bug: messages.batches.results_streaming yields raw JSON strings instead of coerced model objects when the results response has no content-type header
Environment
- anthropic gem: 1.49.0
- Ruby: 4.0.3
Summary
Iterating a batch results stream yields raw JSONL line Strings instead of coerced Anthropic::Models::Messages::MessageBatchIndividualResponse objects.
Any attribute access on the yielded value raises NoMethodError.
client = Anthropic::Client.new
client.messages.batches.results_streaming("msgbatch_...").each do |result|
result.custom_id
# => NoMethodError: undefined method 'custom_id' for an instance of String
end
The yielded objects are the raw JSONL lines:
item = nil
client.messages.batches.results_streaming("msgbatch_...").each { |r| item = r; break }
item.class # => String (expected: Anthropic::Models::Messages::MessageBatchIndividualResponse)
item[0, 60] # => "{\"custom_id\":\"foo-1\",\"result\":{\"type\":\"succeeded\",\"mes"
Expected
The stream should yield coerced MessageBatchIndividualResponse objects (.custom_id, .result.type, .result.message, etc.), as the type signature and docs imply.
Root cause
The batch-results endpoint returns the .jsonl body with no content-type response header:
require "net/http"
uri = URI("https://api.anthropic.com/v1/messages/batches/msgbatch_.../results")
req = Net::HTTP::Get.new(uri)
req["x-api-key"] = ENV["ANTHROPIC_API_KEY"]
req["anthropic-version"] = "2023-06-01"
req["accept"] = "application/x-jsonl"
res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { _1.request(req) }
res.code # => "200"
res["content-type"] # => nil
Anthropic::Internal::Util.decode_content (lib/anthropic/internal/util.rb) decides whether to parse JSONL by matching the response content-type against a regex:
def decode_content(headers, stream:, suppress_error: false)
  case (content_type = headers["content-type"])
  in Anthropic::Internal::Util::JSON_CONTENT
    return nil if (json = stream.to_a.join).empty?

    begin
      JSON.parse(json, symbolize_names: true)
    rescue JSON::ParserError => e
      raise e unless suppress_error
      json
    end
  in Anthropic::Internal::Util::JSONL_CONTENT
    lines = decode_lines(stream)
    chain_fused(lines) do |y|
      lines.each do
        next if _1.empty?

        y << JSON.parse(_1, symbolize_names: true)
      end
    end
  in %r{^text/event-stream}
    lines = decode_lines(stream)
    decode_sse(lines)
  else
    text = stream.to_a.join
    force_charset!(content_type, text: text)
    StringIO.new(text)
  end
end
Because content_type is nil, none of the case/in regex patterns match (Regexp === nil is always false), so it falls into the else branch and returns the unparsed body as a StringIO.
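The nil case is easy to reproduce in isolation. The following is a minimal sketch, not the gem's code: JSONL_PATTERN and branch_for are hypothetical names, and the gem's actual JSONL_CONTENT regex may be defined differently. It only shows that a nil header skips every regex branch of a case/in dispatch.

```ruby
# Hypothetical stand-in for the gem's JSONL content-type pattern;
# the real JSONL_CONTENT constant may be defined differently.
JSONL_PATTERN = %r{^application/(?:x-)?jsonl}

# Mirrors the shape of the case/in dispatch: each regex pattern is tested
# with ===, and Regexp === nil is false, so nil reaches the else branch.
def branch_for(content_type)
  case content_type
  in JSONL_PATTERN
    :jsonl
  in %r{^text/event-stream}
    :sse
  else
    :raw
  end
end

branch_for("application/x-jsonl") # => :jsonl
branch_for(nil)                   # => :raw
```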
JsonLStream#iterator (lib/anthropic/internal/jsonl_stream.rb) then iterates that StringIO line by line and coerces each line:
@iterator ||= Anthropic::Internal::Util.chain_fused(@stream) do |y|
  @stream.each do
    y << Anthropic::Internal::Type::Converter.coerce(@model, _1)
  end
end
Converter.coerce(MessageBatchIndividualResponse, "<raw json string>") can't convert a String to the model, so it leniently returns the string unchanged — and raw strings reach the caller.
So the request side already knows the format — Resources::Messages::Batches#results_streaming sets accept: application/x-jsonl and wraps the response in JsonLStream — but the parse decision in decode_content keys off the (absent) response content-type instead of the expected type.
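The pass-through behavior described above can be illustrated with a toy model. This is a sketch under stated assumptions: ToyResult and lenient_coerce are hypothetical stand-ins, not the gem's Converter or model classes, and they only mimic the "return the input unchanged when it cannot be converted" behavior this report describes.

```ruby
require "json"

# Hypothetical toy model standing in for MessageBatchIndividualResponse.
ToyResult = Struct.new(:custom_id, keyword_init: true)

# Hypothetical lenient coercer: builds the model from a Hash and returns
# anything else (such as a raw JSON String) unchanged.
def lenient_coerce(model, value)
  return value unless value.is_a?(Hash)

  model.new(**value.slice(*model.members))
end

line = '{"custom_id":"foo-1"}'

# An unparsed line passes straight through as a String.
lenient_coerce(ToyResult, line).class # => String

# Parsing with symbol keys first lets the coercion succeed.
parsed = JSON.parse(line, symbolize_names: true)
lenient_coerce(ToyResult, parsed).custom_id # => "foo-1"
```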
Suggested fixes (any one resolves it)
- In decode_content, when the response is being consumed as a JsonLStream (or when accept/expected type is JSONL), force the JSONL decode path rather than relying solely on the response content-type header.
- Make JsonLStream#iterator JSON-parse raw String lines itself (defensive against an upstream that yields unparsed lines).
- Have results_streaming pass its known expected content type into the decode step so a missing response header falls back to JSONL rather than the raw else branch.
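One possible shape for the fallback idea is sketched below. It is illustrative only: effective_content_type, decode_body, and the expected: keyword are hypothetical names that do not exist in the gem, and the real decode_content works on a stream rather than a String body.

```ruby
require "json"

# Hypothetical helper, not part of the gem: prefer the response header and
# fall back to the caller's expected type when the header is missing or blank.
def effective_content_type(headers, expected: nil)
  header = headers["content-type"]
  return expected if header.nil? || header.strip.empty?

  header
end

# Hypothetical decode step: parse JSONL when the effective type says so,
# otherwise return the body untouched.
def decode_body(headers, body, expected: nil)
  case effective_content_type(headers, expected: expected)
  in %r{^application/(?:x-)?jsonl}
    body.each_line.map(&:strip).reject(&:empty?).map do |line|
      JSON.parse(line, symbolize_names: true)
    end
  else
    body
  end
end

body = %({"custom_id":"foo-1"}\n{"custom_id":"foo-2"}\n)

# No content-type header, but the caller states what it expects:
# the result is an Array of two Hashes with symbol keys.
decode_body({}, body, expected: "application/x-jsonl")

# No header and no expectation: the raw String comes back, as in this report.
decode_body({}, body)
```

A present response header still wins in this sketch, so the fallback only changes behavior when the server omits content-type.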
Workaround
Parse + coerce each line manually (note: coercion requires symbol keys):
MODEL = Anthropic::Models::Messages::MessageBatchIndividualResponse
client.messages.batches.results_streaming(batch_id).each do |raw|
result = raw.is_a?(String) ? Anthropic::Internal::Type::Converter.coerce(MODEL, JSON.parse(raw, symbolize_names: true)) : raw
# result.custom_id, result.result.type, result.result.message, ...
end
Code references
- anthropic-sdk-ruby/lib/anthropic/internal/util.rb, lines 743 to 771 in 67f5d27 (decode_content)
- anthropic-sdk-ruby/lib/anthropic/internal/jsonl_stream.rb, lines 18 to 22 in 67f5d27 (JsonLStream#iterator)