Merged
Original file line number Diff line number Diff line change
Expand Up @@ -310,8 +310,7 @@ mysql> SELECT cast('[1, 2]' as json);
```

### Key Differences and Notes:
- - CAST(string AS JSON): Used to parse strings that conform to JSON syntax.
- - CAST(string AS JSON): For Number types, it will only parse Int8, Int16, Int32, Int64, Int128, and Double types, not Decimal type.
+ - CAST(string AS JSON): Used to parse strings that conform to JSON syntax. For Number types, it will only parse Int8, Int16, Int32, Int64, Int128, and Double types, not Decimal type.
- Unlike most other JSON implementations, Doris's JSONB type supports up to Int128 precision. Numbers exceeding Int128 precision may overflow.
- If the input number string is 12.34, it will be parsed as a Double; if there's no decimal point, it will be parsed as an integer (if the size exceeds Int128 range, it will be converted to Double but with precision loss)
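
The number-parsing rule in the last bullet can be sketched in Python. This is only a model of the behaviour the note describes, not Doris's parser: the function name `classify_json_number` and the constants are hypothetical, and the sketch handles plain decimal digit strings only (no exponent notation).

```python
# Model of the documented rule: a decimal point yields a Double; otherwise the
# value is an integer unless it falls outside the Int128 range.
INT128_MIN = -(2**127)
INT128_MAX = 2**127 - 1


def classify_json_number(text: str) -> str:
    """Return 'double' or 'integer' for a plain decimal number string."""
    if "." in text:
        float(text)  # raises ValueError if the text is not a number
        return "double"
    value = int(text)  # raises ValueError if the text is not an integer
    if INT128_MIN <= value <= INT128_MAX:
        return "integer"
    # Outside the Int128 range: converted to Double, so digits may be lost.
    return "double"
```

The precision loss mentioned in the note shows up in the same model: `float(2**127 + 1)` and `float(2**127)` are the same Double.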

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,12 @@
{
"title": "STRING",
"language": "en",
- "description": "STRING (M) A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),"
+ "description": "A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),"
}
---

## STRING
### Description
- STRING (M)
A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),and the length of the String type is also limited by the configuration string_type_length_soft_limit_bytes(a soft limit of string type length) of be. the String type can only be used in the value column, not in the key column and the partition and bucket columns

Note: Variable length strings are stored in UTF-8 encoding, so usually English characters occupies 1 byte, and Chinese characters occupies 3 bytes.
Expand Down
2 changes: 1 addition & 1 deletion docs/sql-manual/basic-element/variables.md
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ User-defined variables are a mechanism for temporarily storing data within a ses
| allow_partition_column_nullable | true | true | 0 |
| analyze_timeout | 43200 | 43200 | 0 |
| version | 5.7.99 | 5.7.99 | 0 |
- | version_comment | Doris version doris0.0.0--de61c5823 | Doris version doris-0.0--de61c5823 | 0 |
+ | version_comment | Doris version doris-0.0.0--de61c58223 | Doris version doris-0.0.0--de61c58223 | 0 |
| wait_full_block_schedule_times | 2 | 2 | 0 |
| wait_timeout | 28800 | 28800 | 0 |
| workload_group | | | 0 |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -107,7 +107,7 @@ INSERT INTO product_reviews VALUES

Using AI_AGG to summarize and evaluate:
```sql
- SET default_ai_resoure = 'ai_resource_name';
+ SET default_ai_resource = 'ai_resource_name';
SELECT
product_id,
AI_AGG(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,3 @@ select array_agg(c2) from test_doris_array_agg where c1 is null;
| [] |
+---------------+
```
- | 1 | ["a","b"] |
-
-
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,6 @@ Returns a value of Bitmap type. If there is no valid data in the group, returns

## Example

## Example

```sql
-- setup
CREATE TABLE user_tags (
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ BOOL_OR(<expr>)

## Return Value

- The return value is BOOLEAN. It returns TRUE when all non-NULL values exist, otherwise it returns FALSE.
+ The return value is BOOLEAN. It returns TRUE when at least one non-NULL value is TRUE, otherwise it returns FALSE.

If all values of the expression are NULL or the expression is empty, the function returns NULL.
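
The corrected return-value wording can be modelled in Python, with `None` standing in for SQL NULL. The function name `bool_or` below is a hypothetical stand-in that mirrors the documented semantics; it is not Doris code.

```python
from typing import Iterable, Optional


def bool_or(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """Model of BOOL_OR: NULLs are ignored; no non-NULL input gives NULL."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        # All values are NULL, or the input is empty.
        return None
    # TRUE when at least one non-NULL value is TRUE, otherwise FALSE.
    return any(non_null)
```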

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,3 @@ select id, covar_samp(x, y) from baseall group by id;
| 5 | NULL |
+------+------------------+
```
- | 4 | NULL |
- | 5 | NULL |
- +------+------------------+
- ```
Original file line number Diff line number Diff line change
Expand Up @@ -126,8 +126,8 @@ Query result description:
Field description:
- num_buckets: The number of buckets
- buckets: All buckets
- - lower: Upper bound of the bucket
- - upper: Lower bound of the bucket
+ - lower: Lower bound of the bucket
+ - upper: Upper bound of the bucket
- count: The number of elements contained in the bucket
- pre_sum: The total number of elements in the front bucket
- ndv: The number of different values in the bucket
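
To make the corrected field descriptions concrete, here is a small Python sketch that derives the per-bucket fields from values already grouped into buckets. The names `Bucket` and `describe_buckets` are hypothetical, the sketch does not model how Doris assigns values to buckets, and it assumes `pre_sum` means the number of elements in all preceding buckets.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Bucket:
    lower: int    # lower bound of the bucket
    upper: int    # upper bound of the bucket
    count: int    # number of elements in the bucket
    pre_sum: int  # assumed: elements in all preceding buckets
    ndv: int      # number of distinct values in the bucket


def describe_buckets(groups: Sequence[Sequence[int]]) -> List[Bucket]:
    """Build bucket descriptions from non-empty, pre-grouped values."""
    buckets = []
    seen = 0
    for values in groups:
        buckets.append(
            Bucket(min(values), max(values), len(values), seen, len(set(values)))
        )
        seen += len(values)
    return buckets
```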
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -76,14 +76,14 @@ FROM sales_data;
```

```sql
- select percentile(sale_price, NULL) from sales_data;
+ select percentile_reservoir(sale_price, NULL) from sales_data;
```

If all input values are NULL, returns NULL.

```text
+------------------------------+
- | percentile(sale_price, NULL) |
+ | percentile_reservoir(sale_price, NULL) |
+------------------------------+
| NULL |
+------------------------------+
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,12 @@
{
"title": "STRING",
"language": "en",
- "description": "STRING (M) A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),"
+ "description": "A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),"
}
---

## STRING
### Description
- STRING (M)
A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),and the length of the String type is also limited by the configuration string_type_length_soft_limit_bytes(a soft limit of string type length) of be. the String type can only be used in the value column, not in the key column and the partition and bucket columns

Note: Variable length strings are stored in UTF-8 encoding, so usually English characters occupies 1 byte, and Chinese characters occupies 3 bytes.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ User-defined variables are a mechanism for temporarily storing data within a ses
| allow_partition_column_nullable | true | true | 0 |
| analyze_timeout | 43200 | 43200 | 0 |
| version | 5.7.99 | 5.7.99 | 0 |
- | version_comment | Doris version doris0.0.0--de61c5823 | Doris version doris-0.0--de61c5823 | 0 |
+ | version_comment | Doris version doris-0.0.0--de61c58223 | Doris version doris-0.0.0--de61c58223 | 0 |
| wait_full_block_schedule_times | 2 | 2 | 0 |
| wait_timeout | 28800 | 28800 | 0 |
| workload_group | | | 0 |
Expand Down
Original file line number Diff line number Diff line change
@@ -1,14 +1,13 @@
---

Check notice on line 1 in versioned_docs/version-3.x/sql-manual/basic-element/sql-data-types/string-type/STRING.md

View workflow job for this annotation

GitHub Actions / Build Check

i18n-sync-locale-candidate

Japanese docs are report-only. Generate a candidate translation from the changed files and merge it only after human review. Owner: @apache/doris-website-maintainers

Check notice on line 1 in versioned_docs/version-3.x/sql-manual/basic-element/sql-data-types/string-type/STRING.md

View workflow job for this annotation

GitHub Actions / Build Check

i18n-sync-locale-counterpart

Chinese 3.x counterpart exists. Confirm whether the change is supported in 3.x before leaving it unsynced. Owner: @apache/doris-website-maintainers
{
"title": "STRING",
"language": "en",
- "description": "STRING (M) A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),"
+ "description": "A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),"
}
---

## STRING
### Description
- STRING (M)
A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),and the length of the String type is also limited by the configuration string_type_length_soft_limit_bytes(a soft limit of string type length) of be. the String type can only be used in the value column, not in the key column and the partition and bucket columns

Note: Variable length strings are stored in UTF-8 encoding, so usually English characters occupies 1 byte, and Chinese characters occupies 3 bytes.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
---

Check notice on line 1 in versioned_docs/version-3.x/sql-manual/basic-element/variables.md

View workflow job for this annotation

GitHub Actions / Build Check

i18n-sync-locale-candidate

Japanese docs are report-only. Generate a candidate translation from the changed files and merge it only after human review. Owner: @apache/doris-website-maintainers

Check notice on line 1 in versioned_docs/version-3.x/sql-manual/basic-element/variables.md

View workflow job for this annotation

GitHub Actions / Build Check

i18n-sync-locale-counterpart

Chinese 3.x counterpart exists. Confirm whether the change is supported in 3.x before leaving it unsynced. Owner: @apache/doris-website-maintainers
{
"title": "Variables",
"language": "en",
Expand Down Expand Up @@ -109,7 +109,7 @@
| allow_partition_column_nullable | true | true | 0 |
| analyze_timeout | 43200 | 43200 | 0 |
| version | 5.7.99 | 5.7.99 | 0 |
- | version_comment | Doris version doris0.0.0--de61c5823 | Doris version doris-0.0--de61c5823 | 0 |
+ | version_comment | Doris version doris-0.0.0--de61c58223 | Doris version doris-0.0.0--de61c58223 | 0 |
| wait_full_block_schedule_times | 2 | 2 | 0 |
| wait_timeout | 28800 | 28800 | 0 |
| workload_group | | | 0 |
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
---

Check notice on line 1 in versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/JSON.md

View workflow job for this annotation

GitHub Actions / Build Check

i18n-sync-locale-candidate

Japanese docs are report-only. Generate a candidate translation from the changed files and merge it only after human review. Owner: @apache/doris-website-maintainers

Check notice on line 1 in versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/JSON.md

View workflow job for this annotation

GitHub Actions / Build Check

i18n-sync-version-candidate

A 3.x counterpart exists. Confirm whether the change is supported in 3.x before leaving it unsynced. Owner: @apache/doris-website-maintainers
{
"title": "JSON | Semi Structured",
"language": "en",
Expand All @@ -11,7 +11,7 @@

## JSON Introduction

JSON (JavaScript Object Notation) is an open standard file format and data interchange format that uses human-readable text to store and transmit data. According to the official specification [RFC7159](https://datatracker.ietf.org/doc/html/rfc7159), JSON supports the following basic types:

- Bool
- Null
- Number
Expand All @@ -19,7 +19,7 @@
- Array
- Object

The JSON data type stores [JSON](https://www.rfc-editor.org/rfc/rfc8785) data efficiently in a binary format and allows access to its internal fields through JSON functions.


By default, it supports up to 1048576 bytes (1MB), and can be increased up to 2147483643 bytes (2GB). This can be adjusted via the `string_type_length_soft_limit_bytes` configuration.

Expand Down Expand Up @@ -219,7 +219,7 @@
```sql
json_column_name JSON
```

**Insertion:**
- Using `INSERT INTO VALUES` with the format as a string surrounded by quotes. For example:
```sql
Expand Down Expand Up @@ -310,8 +310,7 @@
```

### Key Differences and Notes:
- - CAST(string AS JSON): Used to parse strings that conform to JSON syntax.
- - CAST(string AS JSON): For Number types, it will only parse Int8, Int16, Int32, Int64, Int128, and Double types, not Decimal type.
+ - CAST(string AS JSON): Used to parse strings that conform to JSON syntax. For Number types, it will only parse Int8, Int16, Int32, Int64, Int128, and Double types, not Decimal type.
- Unlike most other JSON implementations, Doris's JSONB type supports up to Int128 precision. Numbers exceeding Int128 precision may overflow.
- If the input number string is 12.34, it will be parsed as a Double; if there's no decimal point, it will be parsed as an integer (if the size exceeds Int128 range, it will be converted to Double but with precision loss)

Expand Down Expand Up @@ -392,19 +391,19 @@
7. Nested structure handling:
- Objects and arrays support unlimited nesting levels
- Each nesting level processed recursively using the same rules

## Number Precision Issues

When converting Doris internal types to JSONB using to_json, no precision loss occurs.
When using Doris internal JSON functions, if the return value is also a JSONB type, no precision loss occurs.
However, converting Doris JSONB to plain text and then back to JSONB can cause precision loss.

Example: Doris JSON type object

```
Object{
"a": (Decimal 18446744073709551616.123)
}
```


Converted to plain text:
```
Expand All @@ -419,9 +418,9 @@
```

## Configuration and Limitations
- JSON supports 1,048,576 bytes (1 MB) by default

- Size limit can be adjusted via the BE configuration parameter string_type_length_soft_limit_bytes
- Maximum adjustment up to 2,147,483,643 bytes (approximately 2 GB)

- In Doris JSON type Objects, key length cannot exceed 255 bytes

## Usage Example
Expand All @@ -443,7 +442,7 @@
PROPERTIES("replication_num" = "1");
```

#### Load data


##### stream load test_json.csv test data

Expand Down Expand Up @@ -473,7 +472,7 @@
19 ''
20 'abc'
21 abc
22 100x

23 6.a8
24 {x
25 [123, abc]
Expand All @@ -497,7 +496,7 @@
"LoadTimeMs": 48,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 1,
"ReadDataTimeMs": 0,

"WriteDataTimeMs": 45,
"CommitAndPublishTimeMs": 0,
"ErrorURL": "http://172.21.0.5:8840/api/_load_error_log?file=__shard_2/error_log_insert_stmt_95435c4bf5f156df-426735082a9296af_95435c4bf5f156df_426735082a9296af"
Expand All @@ -522,7 +521,7 @@
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 2,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 45,

"CommitAndPublishTimeMs": 19,
"ErrorURL": "http://172.21.0.5:8840/api/_load_error_log?file=__shard_0/error_log_insert_stmt_a1463f98a7b15caf-c79399b920f5bfa3_a1463f98a7b15caf_c79399b920f5bfa3"
}
Expand Down
Original file line number Diff line number Diff line change
@@ -1,14 +1,13 @@
---

Check notice on line 1 in versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/string-type/STRING.md

View workflow job for this annotation

GitHub Actions / Build Check

i18n-sync-locale-candidate

Japanese docs are report-only. Generate a candidate translation from the changed files and merge it only after human review. Owner: @apache/doris-website-maintainers
{
"title": "STRING",
"language": "en",
- "description": "STRING (M) A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),"
+ "description": "A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),"
}
---

## STRING
### Description
- STRING (M)
A variable length string. Default support is 1048576 bytes (1M), adjustable up to 2147483643 bytes (2G),and the length of the String type is also limited by the configuration string_type_length_soft_limit_bytes(a soft limit of string type length) of be. the String type can only be used in the value column, not in the key column and the partition and bucket columns

Note: Variable length strings are stored in UTF-8 encoding, so usually English characters occupies 1 byte, and Chinese characters occupies 3 bytes.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ User-defined variables are a mechanism for temporarily storing data within a ses
| allow_partition_column_nullable | true | true | 0 |
| analyze_timeout | 43200 | 43200 | 0 |
| version | 5.7.99 | 5.7.99 | 0 |
- | version_comment | Doris version doris0.0.0--de61c5823 | Doris version doris-0.0--de61c5823 | 0 |
+ | version_comment | Doris version doris-0.0.0--de61c58223 | Doris version doris-0.0.0--de61c58223 | 0 |
| wait_full_block_schedule_times | 2 | 2 | 0 |
| wait_timeout | 28800 | 28800 | 0 |
| workload_group | | | 0 |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -107,7 +107,7 @@ INSERT INTO product_reviews VALUES

Using AI_AGG to summarize and evaluate:
```sql
- SET default_ai_resoure = 'ai_resource_name';
+ SET default_ai_resource = 'ai_resource_name';
SELECT
product_id,
AI_AGG(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,3 @@ select array_agg(c2) from test_doris_array_agg where c1 is null;
| [] |
+---------------+
```
- | 1 | ["a","b"] |
-
-
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,6 @@ Returns a value of Bitmap type. If there is no valid data in the group, returns

## Example

## Example

```sql
-- setup
CREATE TABLE user_tags (
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ BOOL_OR(<expr>)

## Return Value

- The return value is BOOLEAN. It returns TRUE when all non-NULL values exist, otherwise it returns FALSE.
+ The return value is BOOLEAN. It returns TRUE when at least one non-NULL value is TRUE, otherwise it returns FALSE.

If all values of the expression are NULL or the expression is empty, the function returns NULL.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,3 @@ select id, covar_samp(x, y) from baseall group by id;
| 5 | NULL |
+------+------------------+
```
- | 4 | NULL |
- | 5 | NULL |
- +------+------------------+
- ```
Original file line number Diff line number Diff line change
Expand Up @@ -126,8 +126,8 @@ Query result description:
Field description:
- num_buckets: The number of buckets
- buckets: All buckets
- - lower: Upper bound of the bucket
- - upper: Lower bound of the bucket
+ - lower: Lower bound of the bucket
+ - upper: Upper bound of the bucket
- count: The number of elements contained in the bucket
- pre_sum: The total number of elements in the front bucket
- ndv: The number of different values in the bucket
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@

## Description

This function applies [reservoir sampling](https://en.wikipedia.org/wiki/Reservoir_sampling) with a reservoir size up to 8192 and a random number generator for sampling. This used to calculate approximate percentiles at position `p`.

Check notice on line 11 in versioned_docs/version-4.x/sql-manual/sql-functions/aggregate-functions/percentile_reservoir.md

View workflow job for this annotation

GitHub Actions / Build Check

link-external-report-only

External link is report-only and was not fetched: https://en.wikipedia.org/wiki/Reservoir_sampling. Owner: @zclllyybb
The value of `p` is between `0` and `1`.
Note that this is not the average of the two numbers.
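
The sampling idea in this description can be sketched in Python. This is a minimal illustration of reservoir sampling (Algorithm R) followed by a percentile lookup, not the Doris implementation: the function names are hypothetical, and linear interpolation between neighbouring sample values is an assumption, since the page does not spell out the exact method.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(
    stream: Iterable[float], k: int = 8192, rng: Optional[random.Random] = None
) -> List[float]:
    """Keep a uniform random sample of at most k items from the stream."""
    rng = rng or random.Random()
    reservoir: List[float] = []
    for i, x in enumerate(stream):
        if i < k:
            reservoir.append(x)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                reservoir[j] = x
    return reservoir


def approx_percentile(
    stream: Iterable[float],
    p: float,
    k: int = 8192,
    rng: Optional[random.Random] = None,
) -> Optional[float]:
    """Approximate percentile at position p, where 0 <= p <= 1."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    sample = sorted(reservoir_sample(stream, k, rng))
    if not sample:
        return None
    pos = p * (len(sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sample) - 1)
    # Assumed: linear interpolation between the two neighbouring values.
    return sample[lo] + (sample[hi] - sample[lo]) * (pos - lo)
```

When the input has no more than `k` values the sample is the whole input, so the result is exact; beyond that it depends on the random draw.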

Expand Down Expand Up @@ -76,14 +76,14 @@
```

```sql
- select percentile(sale_price, NULL) from sales_data;
+ select percentile_reservoir(sale_price, NULL) from sales_data;
```

If all input values are NULL, returns NULL.

```text
+------------------------------+
- | percentile(sale_price, NULL) |
+ | percentile_reservoir(sale_price, NULL) |
+------------------------------+
| NULL |
+------------------------------+
Expand Down