Fix mathematical, bitwise and logical operator precedence in SQL #4221

Open
hazefully wants to merge 2 commits into FoundationDB:main from hazefully:fix-operator-precedence

@hazefully hazefully commented May 26, 2026

Contributor

This PR changes our ANTLR4 parser rules to respect the precedence of mathematical, bitwise, and logical operators when parsing expressions, which yields correct results when no explicit order of operations is enforced (for example, with parentheses). The operators are now listed as separate alternatives in the left-recursive expression rules, so ANTLR assigns them precedence according to the order of the alternatives: alternatives listed earlier bind more tightly.

This approach is based on section 5.4, "Dealing with Precedence, Left Recursion, and Associativity", of The Definitive ANTLR 4 Reference.

This fixes #4219.
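
To illustrate the idea outside of ANTLR, here is a minimal hand-written sketch of an evaluator whose operator groups are kept in an ordered list, with earlier groups binding tighter. It is not the PR's grammar and not how ANTLR generates its parser; the class name `PrecedenceSketch`, the whitespace-separated token format, and the exact ordering of the groups (arithmetic, then shifts, then `&`, then `^` and `|`) are assumptions made for the example.

```java
import java.util.List;

/** Hypothetical demo class; not part of the PR or of ANTLR. */
public class PrecedenceSketch {
    // Operator groups from tightest-binding to loosest. Reordering this list
    // changes the parse, much like reordering alternatives in a grammar rule.
    private static final List<List<String>> LEVELS = List.of(
            List.of("*", "/"),
            List.of("+", "-"),
            List.of("<<", ">>"),
            List.of("&"),
            List.of("^", "|"));

    private final String[] tokens;
    private int pos = 0;

    private PrecedenceSketch(String[] tokens) {
        this.tokens = tokens;
    }

    /** Evaluates a whitespace-separated expression such as "6 & 4 + 4". */
    public static long eval(String expression) {
        PrecedenceSketch parser = new PrecedenceSketch(expression.trim().split("\\s+"));
        long value = parser.parseLevel(LEVELS.size() - 1);
        if (parser.pos != parser.tokens.length) {
            throw new IllegalArgumentException("unexpected token: " + parser.tokens[parser.pos]);
        }
        return value;
    }

    // Parses a left-associative chain of operators from one group; the
    // operands of that chain come from the next-tighter group.
    private long parseLevel(int level) {
        if (level < 0) {
            return parseAtom();
        }
        long left = parseLevel(level - 1);
        while (pos < tokens.length && LEVELS.get(level).contains(tokens[pos])) {
            String op = tokens[pos++];
            long right = parseLevel(level - 1);
            left = apply(op, left, right);
        }
        return left;
    }

    // An atom is an integer literal or a parenthesized expression.
    private long parseAtom() {
        if (pos >= tokens.length) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        String token = tokens[pos++];
        if (token.equals("(")) {
            long value = parseLevel(LEVELS.size() - 1);
            if (pos >= tokens.length || !tokens[pos].equals(")")) {
                throw new IllegalArgumentException("missing closing parenthesis");
            }
            pos++;
            return value;
        }
        return Long.parseLong(token);
    }

    private static long apply(String op, long a, long b) {
        switch (op) {
            case "*": return a * b;
            case "/": return a / b;
            case "+": return a + b;
            case "-": return a - b;
            case "<<": return a << b;
            case ">>": return a >> b;
            case "&": return a & b;
            case "^": return a ^ b;
            case "|": return a | b;
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("6 & 4 + 4"));     // prints 0
        System.out.println(eval("( 6 & 4 ) + 4")); // prints 8
    }
}
```

With this ordering, `6 & 4 + 4` evaluates as `6 & (4 + 4)`. Moving the `&` group ahead of the `+`/`-` group in `LEVELS` would instead give `(6 & 4) + 4`, which is the kind of change in results this PR is about.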

@hazefully hazefully requested a review from hatyo May 26, 2026 15:21
@hazefully hazefully added the bug fix Change that fixes a bug label May 26, 2026
@hazefully hazefully force-pushed the fix-operator-precedence branch from b7a3074 to d2e65d5 Compare May 26, 2026 18:04
@github-actions

📊 Metrics Diff Analysis Report

Summary

  • New queries: 3
  • Dropped queries: 0
  • Plan changed + metrics changed: 0
  • Plan unchanged + metrics changed: 0
ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

  • New queries: Queries added in this PR
  • Dropped queries: Queries removed in this PR. These should be reviewed to ensure we are not losing coverage.
  • Plan changed + metrics changed: The query plan has changed along with planner metrics.
  • Plan unchanged + metrics changed: Same plan but different metrics

The last category in particular may indicate planner regressions that should be investigated.

New Queries

Count of new queries by file:

  • yaml-tests/src/test/resources/operator-precedence.metrics.yaml: 3

Comment on lines +1251 to +1253
| left=expressionAtom operator=(BIT_SHIFT_LEFT_OP | BIT_SHIFT_RIGHT_OP) right=expressionAtom #bitExpressionAtom // done
| left=expressionAtom operator=BIT_AND_OP right=expressionAtom #bitExpressionAtom // done
| left=expressionAtom operator=(BIT_XOR_OP | BIT_OR_OP) right=expressionAtom #bitExpressionAtom // done

Contributor Author

@hatyo Do you know if it is intentional that we used to have a higher precedence for bitwise operators over arithmetic operators? I think most languages (with few exceptions like Pascal) give a higher precedence to arithmetic operators so the expression 6 & 4 + 4 would be zero instead of 8, but it looks like we have some unit tests that assert we actually give precedence to the bitwise operators:

void cachingQueryWithComplexGroupByExpressionSubsumingIndexExpressionBehavesCorrectlyCase2() throws Exception {
    final var ticker = new FakeTicker();
    final var cache = getCache(ticker);
    final var c17Equals2 = equalsConstraint(17, 2);
    final var c17Int = ofTypeInt(17);
    final var c10Int = ofTypeInt(10);
    final var c8Int = ofTypeInt(8);
    final var c17IntNotNull = isNotNullInt(17);
    final var c8IntNotNull = isNotNullInt(8);
    final var c10IntNotNull = isNotNullInt(10);
    final var c8Equalsc15 = covsEqualsConstraints(8, 17);
    planQuery(cache, "SELECT MAX(score), game & 2 + 10 FROM score GROUP BY game & 2", BitAndScore2);
    cacheShouldBe(cache, Map.of("SELECT MAX ( \"SCORE\" ) , \"GAME\" & ? + ? FROM \"SCORE\" GROUP BY \"GAME\" & ? ",
            Map.of(ppe(cons(and(and(c17Equals2, c17Equals2), and(c17Int, c8Int, c10Int, c8Equalsc15, c17IntNotNull, c8IntNotNull, c10IntNotNull)))), BitAndScore2)));
    final var c17Equals4 = equalsConstraint(17, 4);
    planQuery(cache, "SELECT MAX(score), game & 4 + 10 FROM score GROUP BY game & 4", BitAndScore4);
    cacheShouldBe(cache, Map.of("SELECT MAX ( \"SCORE\" ) , \"GAME\" & ? + ? FROM \"SCORE\" GROUP BY \"GAME\" & ? ",
            Map.of(
                    ppe(cons(and(and(c17Equals2, c17Equals2), and(c17Int, c8Int, c10Int, c8Equalsc15, c17IntNotNull, c8IntNotNull, c10IntNotNull)))), BitAndScore2,
                    ppe(cons(and(and(c17Equals4, c17Equals4), and(c17Int, c8Int, c10Int, c8Equalsc15, c17IntNotNull, c8IntNotNull, c10IntNotNull)))), BitAndScore4)));
}

Contributor

This is not intentional. Put differently, we did not make an explicit decision to give bitwise operations higher precedence over the other operators. The referenced unit test does not verify precedence; it only verifies that plan constraints are created exactly as they should be. That's all.

Contributor

I think most languages (with few exceptions like Pascal) give a higher precedence to arithmetic operators so the expression 6 & 4 + 4 would be zero instead of 8

I mean sure, but we should align with DB vendors (since bitwise operators are not part of the SQL standard), who, for the most part, align with programming-language operator precedence conventions.

Bitwise operators usually have lower precedence than basic arithmetic operators and higher precedence than logical operators. We should align with that. For example, in PgSQL:

SELECT
1 + 3 & 5       AS actual,       -- 4 (proves + binds first)
(1 + 3) & 5     AS arith_first,  -- 4 (same as actual)
1 + (3 & 5)     AS bitwise_first -- 2 (different)
;
 actual | arith_first | bitwise_first 
--------+-------------+---------------
      4 |           4 |             2


SELECT
3 & 2 != 0                      AS actual,          -- true (proves & binds first)
(3 & 2) != 0                    AS bitwise_first,   -- true (same as actual)
3 & (2 != 0)::int               AS comparison_first -- 1 (different)
;
 actual | bitwise_first | comparison_first
--------+---------------+------------------
 true   | true          |                1

Doing so may break some customer queries, though, so we have to be extra careful.

(once #4199 is ready, we can use it to introduce these illustrative examples).
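
For comparison with the PgSQL output above, the same groupings can be spelled out in Java. This is only a sketch: the class and method names are hypothetical, and it says nothing about what our parser does. Note that Java agrees with PgSQL on `+` versus `&`, but it ranks `!=` above `&`, so an unparenthesized `3 & 2 != 0` does not compile in Java (it would be `int & boolean`). The second pair of methods therefore uses explicit parentheses for both readings.

```java
/** Hypothetical demo class; names are illustrative only. */
public class GroupingCheck {
    // Java ranks + above &, so this parses as (1 + 3) & 5.
    public static int unparenthesized() {
        return 1 + 3 & 5;
    }

    // Arithmetic evaluated first: 4 & 5.
    public static int arithFirst() {
        return (1 + 3) & 5;
    }

    // Bitwise evaluated first: 1 + 1.
    public static int bitwiseFirst() {
        return 1 + (3 & 5);
    }

    // Bitwise AND first, then the comparison: 2 != 0.
    public static boolean bitwiseThenCompare() {
        return (3 & 2) != 0;
    }

    // Comparison first, converted to 0/1 like the ::int cast, then AND: 3 & 1.
    public static int compareThenBitwise() {
        return 3 & ((2 != 0) ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(unparenthesized());    // prints 4
        System.out.println(arithFirst());         // prints 4
        System.out.println(bitwiseFirst());       // prints 2
        System.out.println(bitwiseThenCompare()); // prints true
        System.out.println(compareThenBitwise()); // prints 1
    }
}
```

The difference on `&` versus `!=` suggests that "programming-language conventions" and vendor behaviour do not always coincide, so the vendor results are probably the safer reference when picking the order of the grammar alternatives.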

.put("dot_product_distance", argumentsCount -> BuiltInFunctionCatalog.resolve("dot_product_distance", argumentsCount))
.put("not", argumentsCount -> BuiltInFunctionCatalog.resolve("not", argumentsCount))
.put("and", argumentsCount -> BuiltInFunctionCatalog.resolve("and", argumentsCount))
.put("&&", argumentsCount -> BuiltInFunctionCatalog.resolve("and", argumentsCount))

Contributor

This does not seem relevant to this PR. Can you please revert it (and line 137)? We can decide later whether we want to add support for the non-standard && and ||.

| EXISTS '(' query ')' #existsExpressionAtom // done
| expressionAtom predicate? #predicatedExpression
| expression logicalOperator expression #logicalExpression // done
| left=expressionAtom comparisonOperator right=expressionAtom #binaryComparisonExpression // done

Contributor

Should these be expression comparisonOperator expression so we take advantage of left-recursion elimination and define clear precedence rules for groups of logical operators?

;

// Expressions, predicates
// Mathematical and logical expressions must be listed as different alternatives to the left-recursive

Contributor

Can you please rewrite the comment to better illustrate the intent and guide future modifications to the rules below? Something like:

Put comparison and logical operators higher so they have lower precedence than the bitwise and arithmetic operators below. The order of rules matters in order to leverage inter-rule precedence as supported by ANTLR4.

Comment on lines +34 to +36
- unorderedResult: [ { 20 } ]
- initialVersionAtLeast: !current_version
- unorderedResult: [ { 8 } ]

Contributor

We should probably mark this PR as a breaking change.

Development

Successfully merging this pull request may close these issues.

Order of operations when evaluating expressions is incorrect

2 participants