Skip to content

fix(type): Cast IntervalDayTimeType::valueToString fields to int for snprintf#2164

Open
yingsu00 wants to merge 1 commit into
IBM:mainfrom
yingsu00:cast-perf-01-fix-interval-day-time-varargs
Open

fix(type): Cast IntervalDayTimeType::valueToString fields to int for snprintf#2164
yingsu00 wants to merge 1 commit into
IBM:mainfrom
yingsu00:cast-perf-01-fix-interval-day-time-varargs

Conversation

@yingsu00

Copy link
Copy Markdown
Collaborator

This is a fix for a pre-existing CastExprTest.intervalDayTimeToVarchar failure that surfaces on macOS arm64 while continuing to pass on CentOS / Linux x86_64. The bug was introduced by 6bde69d ("Fix cast(IntervalDayTime as Varchar) for negative intervals (facebookincubator#9871)").

IntervalDayTimeType::valueToString built its string with

  snprintf(buf, sizeof(buf), "%s%lld %02d:%02d:%02d.%03d",
           sign.c_str(), days, hours, minutes, seconds, remainMillis);

where the four arguments matched to %d / %02d / %03d were not ints:

  • hours, minutes, seconds were int64_t
  • remainMillis was int128_t (widened by 6bde69d to safely negate std::numeric_limits<int64_t>::min())

Passing a wider integer through varargs to a %d conversion is undefined behavior. The int64_t -> %d path happens to work on every common ABI because int64_t occupies one 8-byte varargs slot and %d reads its low 4 bytes (which on little-endian contain the small value for hours / minutes / seconds < 64). The int128_t -> %d path is where macOS arm64 and Linux x86_64 diverge:

  • Linux x86_64 (SysV, CentOS CI): the first six integer/pointer varargs go in registers (RDI..R9). For our snprintf, buf/size/ format/sign/days/hours fill those, so minutes, seconds and remainMillis spill to the stack. __int128 is laid out with its low 64 bits at the lower address, and the byte where printf reads %03d happens to coincide with the low 4 bytes of remainMillis (which holds a value < 1000, so the low 4 bytes are the correct value). The test passes by coincidence.

  • macOS arm64 (Apple variadic ABI): variadic arguments are passed exclusively on the stack regardless of available registers, with strict natural alignment. The stack layout for the six varargs is:

    off  0: sign       (char*,   8 bytes)
    off  8: days       (int64_t, 8 bytes)
    off 16: hours      (int64_t, 8 bytes)
    off 24: minutes    (int64_t, 8 bytes)
    off 32: seconds    (int64_t, 8 bytes)
    off 40: 8 bytes of padding (int128 needs 16-byte alignment)
    off 48: remainMillis (int128, 16 bytes)
    

    va_arg(list, int) on this ABI reads 4 bytes and advances the cursor by 8 (the minimum slot size). After processing %s, %lld, %02d, %02d, %02d the cursor sits at offset 40 - exactly the alignment padding. %03d reads 4 bytes of uninitialised stack and prints whatever happens to be there. The visible symptom:

    expected "1 00:00:00.000"
    got      "1 00:00:00.1803785600"
    

    Day, hours, minutes, seconds render correctly (int64_t -> %d still works); only the milliseconds field, which reads from the padding, is junk.

After the day divmod each of hours, minutes, seconds, milliseconds is bounded (<24, <60, <60, <1000), so narrowing to int is safe. Compute them as int locals and pass int to snprintf - eliminates the UB on every ABI. remainMillis stays int128_t for the intermediate arithmetic so the INT64_MIN negation introduced by 6bde69d still works.

CastExprTest.intervalDayTimeToVarchar now passes on macOS arm64 without regressing Linux x86_64; 56/56 CastExprTest tests pass.

…snprintf

This is a fix for a pre-existing CastExprTest.intervalDayTimeToVarchar
failure that surfaces on macOS arm64 while continuing to pass on
CentOS / Linux x86_64. The bug was introduced by 6bde69d ("Fix
cast(IntervalDayTime as Varchar) for negative intervals (facebookincubator#9871)") -
the cast-performance stack on this branch happens to also exercise
the failing test, so the fix is bundled here for convenience.

IntervalDayTimeType::valueToString built its string with

  snprintf(buf, sizeof(buf), "%s%lld %02d:%02d:%02d.%03d",
           sign.c_str(), days, hours, minutes, seconds, remainMillis);

where the four arguments matched to %d / %02d / %03d were not ints:
- hours, minutes, seconds were int64_t
- remainMillis was int128_t (widened by 6bde69d to safely negate
  std::numeric_limits<int64_t>::min())

Passing a wider integer through varargs to a %d conversion is
undefined behavior. The int64_t -> %d path happens to work on every
common ABI because int64_t occupies one 8-byte varargs slot and %d
reads its low 4 bytes (which on little-endian contain the small
value for hours / minutes / seconds < 64). The int128_t -> %d path
is where macOS arm64 and Linux x86_64 diverge:

* Linux x86_64 (SysV, CentOS CI): the first six integer/pointer
  varargs go in registers (RDI..R9). For our snprintf, buf/size/
  format/sign/days/hours fill those, so minutes, seconds and
  remainMillis spill to the stack. __int128 is laid out with its
  low 64 bits at the lower address, and the byte where printf reads
  %03d happens to coincide with the low 4 bytes of remainMillis
  (which holds a value < 1000, so the low 4 bytes are the correct
  value). The test passes by coincidence.

* macOS arm64 (Apple variadic ABI): variadic arguments are passed
  exclusively on the stack regardless of available registers, with
  strict natural alignment. The stack layout for the six varargs is:

      off  0: sign       (char*,   8 bytes)
      off  8: days       (int64_t, 8 bytes)
      off 16: hours      (int64_t, 8 bytes)
      off 24: minutes    (int64_t, 8 bytes)
      off 32: seconds    (int64_t, 8 bytes)
      off 40: 8 bytes of padding (int128 needs 16-byte alignment)
      off 48: remainMillis (int128, 16 bytes)

  va_arg(list, int) on this ABI reads 4 bytes and advances the
  cursor by 8 (the minimum slot size). After processing %s, %lld,
  %02d, %02d, %02d the cursor sits at offset 40 - exactly the
  alignment padding. %03d reads 4 bytes of uninitialised stack and
  prints whatever happens to be there. The visible symptom:

      expected "1 00:00:00.000"
      got      "1 00:00:00.1803785600"

  Day, hours, minutes, seconds render correctly (int64_t -> %d
  still works); only the milliseconds field, which reads from the
  padding, is junk.

After the day divmod each of hours, minutes, seconds, milliseconds
is bounded (<24, <60, <60, <1000), so narrowing to int is safe.
Compute them as int locals and pass int to snprintf - eliminates the
UB on every ABI. remainMillis stays int128_t for the intermediate
arithmetic so the INT64_MIN negation introduced by 6bde69d still
works.

CastExprTest.intervalDayTimeToVarchar now passes on macOS arm64
without regressing Linux x86_64; 56/56 CastExprTest tests pass.
@yingsu00 yingsu00 requested a review from xin-zhang2 June 19, 2026 05:02
@yingsu00 yingsu00 self-assigned this Jun 19, 2026
@yingsu00 yingsu00 added the bolt label Jun 19, 2026
@prestodb-ci prestodb-ci force-pushed the main branch 2 times, most recently from 10cb9b5 to 06e3606 Compare June 27, 2026 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant