Skip to content

Fix Column.string returning raw bytes for non-text column types#105

Open
ericraio wants to merge 1 commit into
apple:mainfrom
ericraio:eraio/fix-column-string-type-check
Open

Fix Column.string returning raw bytes for non-text column types#105
ericraio wants to merge 1 commit into
apple:mainfrom
ericraio:eraio/fix-column-string-type-check

Conversation

@ericraio

@ericraio ericraio commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Fix Column.string returning raw bytes for non-text column types

Motivation:

The DataStax C driver's cass_value_get_string has no type checking - it calls as_string_ref() and returns CASS_OK for any non-null value, regardless of the column's actual CQL type.

The Swift toString() helper called this unconditionally, meaning Column.string appeared to succeed for int, bigint, blob, uuid, boolean, float, double, UDT and frozen collection columns, returning their raw binary serialization decoded as lossy UTF-8.

This caused callers that try Column.string before falling back to typed accessors to silently receive binary garbage instead of nil, breaking display layers for frozen UDTs and complex types.

Modifications:

  • In toString(), call cass_value_type first and return nil early for any type that is not ASCII, TEXT, or VARCHAR.
  • Replace String(decoding:as:UTF8.self) (non-failable, silently replaces invalid UTF-8 with U+FFFD) with String(bytes:encoding:) (failable, returns nil for invalid UTF-8).
  • Add DataTests.swift with integration tests verifying Column.string returns a value for text-typed columns and nil for all other types.

Result:

Column.string now returns nil for non-text columns, matching the documented semantics. Typed accessors on those columns continue to work correctly. Callers that use Column.string as a fallback for unknown types will correctly fall through to bytes or other representations.

Motivation:

The DataStax C driver's cass_value_get_string has no type checking -
it calls as_string_ref() and returns CASS_OK for any non-null value,
regardless of the column's actual CQL type. The Swift toString() helper
called this unconditionally, meaning Column.string appeared to succeed
for int, bigint, blob, uuid, boolean, float, double, UDT, and frozen
collection columns, returning their raw binary serialization decoded as
lossy UTF-8.

This caused callers that try Column.string before falling back to typed
accessors to silently receive binary garbage instead of nil, breaking
display layers for frozen UDTs and complex types.

Modifications:

- In toString(), call cass_value_type first and return nil early for any
  type that is not ASCII, TEXT, or VARCHAR.
- Replace String(decoding:as:UTF8.self) (non-failable, silently replaces
  invalid UTF-8 with U+FFFD) with String(bytes:encoding:) (failable,
  returns nil for invalid UTF-8).
- Add DataTests.swift with integration tests verifying Column.string
  returns a value for text-typed columns and nil for all other types.

Result:

Column.string now returns nil for non-text columns, matching the
documented semantics. Typed accessors on those columns continue to work
correctly. Callers that use Column.string as a fallback for unknown
types will correctly fall through to bytes or other representations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant