Fix double-free when removing text from a degenerate text range#103
Conversation
Problem
-------
A text range is two positions: start and end. Internally a position has
both a "paragraph index" (which paragraph it lives in) and a
"global text offset" (how far into the whole document it is). For
well-formed inputs these two agree: a later paragraph also has a
later global offset.
skb_rich_text_get_paragraph_range_from_text_range orders its two
positions by global offset alone, then hands the ordered pair to
skb__rich_text_replace. But skb__rich_text_replace doesn't only use
the global offset — it indexes paragraphs directly and assumes
start.paragraph_idx <= end.paragraph_idx.
Two positions can tie on global offset while still sitting in
different paragraphs. When that happens, the ordering-by-global step
silently picks the wrong order, and skb__rich_text_replace runs with
start.paragraph_idx > end.paragraph_idx. From there:
- removed_paragraphs_count = end.paragraph_idx + 1 - start.paragraph_idx
goes negative.
- new_paragraphs_count = old - removed + source therefore grows the
array instead of shrinking it.
- The follow-up memmove leaves two paragraph slots holding the same
`attributes` pointer.
- The next edit or reset walks those slots and calls free() on that
pointer twice. Boom.
How a tie shows up in practice
------------------------------
The simplest way to produce a tie is to pass a position whose
text offset is negative. The position lookup clamps negative offsets
to {paragraph_idx=0, text_offset=0, global_text_offset=0}. If the other
endpoint of the range happens to also resolve to global offset 0 but
in a later paragraph (for instance because the leading paragraphs are
empty), the two positions tie on global offset 0 but live in
paragraph 0 and paragraph N respectively, and the sort picks the
later paragraph as "start".
A concrete way to hit this from the editor: press Delete in a
document made up of empty paragraphs. The Delete handler builds its
remove range as
{ start = selection.end,
end = skb_rich_text_get_next_grapheme_pos(selection.end) }
and in that document next_grapheme_pos returns offset = -1 (its EOL
branch computes paragraph->global_text_offset + text_count - 1, and
both are zero). The negative offset, the global-only sort, and the
paragraph-index-based replace path then combine to corrupt the
paragraph array.
Fix
---
Sort the two positions by (paragraph_idx, text_offset) instead of by
global offset. That ordering matches what every downstream consumer
already assumes, so the returned range always satisfies
start.paragraph_idx <= end.paragraph_idx and the rest of the pipeline
behaves correctly.
|
Looks like there might be other issues too that are bringing up this issue. The I think it should be: There seems to be a bunch of other cases with Only the last paragraph is allowed the truly empty, all the others should have at least one character, \n. Similar test with global text offset is also done in:
I suggest following:
Let's fix the |
|
Thanks for the prompt response. I've made some edits. Please don't hesitate to give more direction if needed. |
memononen
left a comment
There was a problem hiding this comment.
I missed the < vs <=, let's change that and it's good to go.
| result.start = start_pos; | ||
| result.end = end_pos; | ||
| } else { | ||
| if (skb_paragraph_position_less_than(end_pos, start_pos)) { |
There was a problem hiding this comment.
I think this should be (start_pos, end_pos)
| /** Returns true if a comes strictly before b in document order. Compares by | ||
| * (paragraph_idx, text_offset); do not compare global_text_offset directly, as | ||
| * it can tie across paragraph boundaries. */ | ||
| static inline bool skb_paragraph_position_less_than(skb_paragraph_position_t a, skb_paragraph_position_t b) |
There was a problem hiding this comment.
Looking at the changed examples, I think we should make this <= (*_less_or_equal()), not less than.
It's more a stylistic thing (a code I would write), than correctness in this case. My bad not catching it earlier.
| RUN_TEST(image_atlas_tests); | ||
| RUN_TEST(cpp_tests); | ||
| RUN_TEST(attributed_text_tests); | ||
| RUN_TEST(rich_text_tests); |
|
No worries at all. I've made the changes, and have double-checked the logic. Things largely map better to less_or_equal. But there are a couple of exceptions in skb_editor.c, where I had to Don't hesitate to ask for more iterations. :) |
| const skb_paragraph_position_t initial_end = skb_rich_text_get_paragraph_position_from_text_position(&editor->rich_text, editor->drag_initial_selection.end, SKB_AFFINITY_USE); | ||
|
|
||
| if (sel_start.global_text_offset < initial_start.global_text_offset) { | ||
| if (!skb_paragraph_position_less_or_equal(initial_start, sel_start)) { |
There was a problem hiding this comment.
These are confusing, but we have comments! :) If these get annoying later we can add *_less and *_greater.
|
Thanks for the fix! |
Hey Mikko, I found this problem while doing fuzz-testing on papaya.io. I've used Claude Code to help detect and fix this issue. Ofc, I have also reviewed this PR myself, and verified that this fixes the issue.
Problem
A text range is two positions: start and end. Internally a position has both a "paragraph index" (which paragraph it lives in) and a "global text offset" (how far into the whole document it is). For well-formed inputs these two agree: a later paragraph also has a later global offset.
skb_rich_text_get_paragraph_range_from_text_rangeorders its two positions by global offset alone, then hands the ordered pair toskb__rich_text_replace. Butskb__rich_text_replacedoesn't only use the global offset — it indexes paragraphs directly and assumesstart.paragraph_idx <= end.paragraph_idx.Two positions can tie on global offset while still sitting in different paragraphs. When that happens, the ordering-by-global step silently picks the wrong order, and
skb__rich_text_replaceruns withstart.paragraph_idx > end.paragraph_idx. From there:removed_paragraphs_count = end.paragraph_idx + 1 - start.paragraph_idxgoes negative.new_paragraphs_count = old - removed + sourcetherefore grows the array instead of shrinking it.memmoveleaves two paragraph slots holding the sameattributespointer.free()on that pointer twice. Boom.How a tie shows up in practice
The simplest way to produce a tie is to pass a position whose text offset is negative. The position lookup clamps negative offsets to
{paragraph_idx=0, text_offset=0, global_text_offset=0}. If the other endpoint of the range happens to also resolve to global offset 0 but in a later paragraph (for instance because the leading paragraphs are empty), the two positions tie on global offset 0 but live in paragraph 0 and paragraph N respectively, and the sort picks the later paragraph as "start".A concrete way to hit this from the editor: press Delete in a document made up of empty paragraphs. The Delete handler builds its remove range as
and in that document
next_grapheme_posreturnsoffset = -1(its EOLbranch computes
paragraph->global_text_offset + text_count - 1, andboth are zero). The negative offset, the global-only sort, and the
paragraph-index-based replace path then combine to corrupt the
paragraph array.
Fix
Sort the two positions by
(paragraph_idx, text_offset)instead of by global offset. That ordering matches what every downstream consumer already assumes, so the returned range always satisfiesstart.paragraph_idx <= end.paragraph_idxand the rest of the pipeline behaves correctly.