Skip to content

Fix double-free when removing text from a degenerate text range#103

Merged
memononen merged 4 commits into
memononen:mainfrom
ApoorvaJ:fix-paragraph-range-inversion
May 17, 2026
Merged

Fix double-free when removing text from a degenerate text range#103
memononen merged 4 commits into
memononen:mainfrom
ApoorvaJ:fix-paragraph-range-inversion

Conversation

@ApoorvaJ

Copy link
Copy Markdown
Contributor

Hey Mikko, I found this problem while doing fuzz-testing on papaya.io. I've used Claude Code to help detect and fix this issue. Ofc, I have also reviewed this PR myself, and verified that this fixes the issue.

Problem

A text range is two positions: start and end. Internally a position has both a "paragraph index" (which paragraph it lives in) and a "global text offset" (how far into the whole document it is). For well-formed inputs these two agree: a later paragraph also has a later global offset.

skb_rich_text_get_paragraph_range_from_text_range orders its two positions by global offset alone, then hands the ordered pair to skb__rich_text_replace. But skb__rich_text_replace doesn't only use the global offset — it indexes paragraphs directly and assumes start.paragraph_idx <= end.paragraph_idx.

Two positions can tie on global offset while still sitting in different paragraphs. When that happens, the ordering-by-global step silently picks the wrong order, and skb__rich_text_replace runs with start.paragraph_idx > end.paragraph_idx. From there:

  • removed_paragraphs_count = end.paragraph_idx + 1 - start.paragraph_idx goes negative.
  • new_paragraphs_count = old - removed + source therefore grows the array instead of shrinking it.
  • The follow-up memmove leaves two paragraph slots holding the same attributes pointer.
  • The next edit or reset walks those slots and calls free() on that pointer twice. Boom.

How a tie shows up in practice

The simplest way to produce a tie is to pass a position whose text offset is negative. The position lookup clamps negative offsets to {paragraph_idx=0, text_offset=0, global_text_offset=0}. If the other endpoint of the range happens to also resolve to global offset 0 but in a later paragraph (for instance because the leading paragraphs are empty), the two positions tie on global offset 0 but live in paragraph 0 and paragraph N respectively, and the sort picks the later paragraph as "start".

A concrete way to hit this from the editor: press Delete in a document made up of empty paragraphs. The Delete handler builds its remove range as

    { start = selection.end,
      end   = skb_rich_text_get_next_grapheme_pos(selection.end) }

and in that document next_grapheme_pos returns offset = -1 (its EOL
branch computes paragraph->global_text_offset + text_count - 1, and
both are zero). The negative offset, the global-only sort, and the
paragraph-index-based replace path then combine to corrupt the
paragraph array.

Fix

Sort the two positions by (paragraph_idx, text_offset) instead of by global offset. That ordering matches what every downstream consumer already assumes, so the returned range always satisfies start.paragraph_idx <= end.paragraph_idx and the rest of the pipeline behaves correctly.

Problem
-------
A text range is two positions: start and end. Internally a position has
both a "paragraph index" (which paragraph it lives in) and a
"global text offset" (how far into the whole document it is). For
well-formed inputs these two agree: a later paragraph also has a
later global offset.

skb_rich_text_get_paragraph_range_from_text_range orders its two
positions by global offset alone, then hands the ordered pair to
skb__rich_text_replace. But skb__rich_text_replace doesn't only use
the global offset — it indexes paragraphs directly and assumes
start.paragraph_idx <= end.paragraph_idx.

Two positions can tie on global offset while still sitting in
different paragraphs. When that happens, the ordering-by-global step
silently picks the wrong order, and skb__rich_text_replace runs with
start.paragraph_idx > end.paragraph_idx. From there:

  - removed_paragraphs_count = end.paragraph_idx + 1 - start.paragraph_idx
    goes negative.
  - new_paragraphs_count = old - removed + source therefore grows the
    array instead of shrinking it.
  - The follow-up memmove leaves two paragraph slots holding the same
    `attributes` pointer.
  - The next edit or reset walks those slots and calls free() on that
    pointer twice. Boom.

How a tie shows up in practice
------------------------------
The simplest way to produce a tie is to pass a position whose
text offset is negative. The position lookup clamps negative offsets
to {paragraph_idx=0, text_offset=0, global_text_offset=0}. If the other
endpoint of the range happens to also resolve to global offset 0 but
in a later paragraph (for instance because the leading paragraphs are
empty), the two positions tie on global offset 0 but live in
paragraph 0 and paragraph N respectively, and the sort picks the
later paragraph as "start".

A concrete way to hit this from the editor: press Delete in a
document made up of empty paragraphs. The Delete handler builds its
remove range as
    { start = selection.end,
      end   = skb_rich_text_get_next_grapheme_pos(selection.end) }
and in that document next_grapheme_pos returns offset = -1 (its EOL
branch computes paragraph->global_text_offset + text_count - 1, and
both are zero). The negative offset, the global-only sort, and the
paragraph-index-based replace path then combine to corrupt the
paragraph array.

Fix
---
Sort the two positions by (paragraph_idx, text_offset) instead of by
global offset. That ordering matches what every downstream consumer
already assumes, so the returned range always satisfies
start.paragraph_idx <= end.paragraph_idx and the rest of the pipeline
behaves correctly.
@memononen

Copy link
Copy Markdown
Owner

Looks like there might be other issues too that are bringing up this issue.

The skb_rich_text_get_next_grapheme_pos() should always return valid position. I see that it returns paragraph->global_text_offset + text_count - 1 which can lead to negative offset of text_count == 0.

I think it should be:

.offset = paragraph->global_text_offset + skb_maxi(0, text_count - 1),

There seems to be a bunch of other cases with text_count - 1 that need require attention.

Only the last paragraph is allowed the truly empty, all the others should have at least one character, \n.

Similar test with global text offset is also done in:

  • skb_editor_get_selection_ordered_start()
  • skb_editor_get_selection_ordered_end()
  • skb_editor_process_mouse_drag()
  • skb_rich_layout_get_text_range_bounds() (
  • skb_rich_text_get_offset_range_from_text_range() (this might be ok, since it returns global text range)

I suggest following:

  • make a test that triggers the current issue (for example the one described above)
  • comment on skb_paragraph_position_t::global_text_offset that it's position can be ambiguous and should be only used to map from skb_paragraph_position_t -> offset, not to test ordering.
  • add skb_paragraph_position_less_than(a, b) which does the compare as per the fix
  • use skb_paragraph_position_less_than() instead of testing global_text_offset

Let's fix the text_count - 1 issues in separate change.

@ApoorvaJ

Copy link
Copy Markdown
Contributor Author

Thanks for the prompt response. I've made some edits. Please don't hesitate to give more direction if needed.

@memononen memononen left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed the < vs <=, let's change that and it's good to go.

Comment thread src/skb_rich_text.c Outdated
result.start = start_pos;
result.end = end_pos;
} else {
if (skb_paragraph_position_less_than(end_pos, start_pos)) {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be (start_pos, end_pos)

Comment thread include/skb_common.h Outdated
/** Returns true if a comes strictly before b in document order. Compares by
* (paragraph_idx, text_offset); do not compare global_text_offset directly, as
* it can tie across paragraph boundaries. */
static inline bool skb_paragraph_position_less_than(skb_paragraph_position_t a, skb_paragraph_position_t b)

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the changed examples, I think we should make this <= (*_less_or_equal()), not less than.

It's more a stylistic thing (a code I would write), than correctness in this case. My bad not catching it earlier.

Comment thread test/tester.c
RUN_TEST(image_atlas_tests);
RUN_TEST(cpp_tests);
RUN_TEST(attributed_text_tests);
RUN_TEST(rich_text_tests);

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch :)

@ApoorvaJ

Copy link
Copy Markdown
Contributor Author

No worries at all.

I've made the changes, and have double-checked the logic. Things largely map better to less_or_equal. But there are a couple of exceptions in skb_editor.c, where I had to !skb_paragraph_position_less_or_equal(), which is slightly clumsy, but I did it in the interest of keeping the change small.

Don't hesitate to ask for more iterations. :)

Comment thread src/skb_editor.c
const skb_paragraph_position_t initial_end = skb_rich_text_get_paragraph_position_from_text_position(&editor->rich_text, editor->drag_initial_selection.end, SKB_AFFINITY_USE);

if (sel_start.global_text_offset < initial_start.global_text_offset) {
if (!skb_paragraph_position_less_or_equal(initial_start, sel_start)) {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are confusing, but we have comments! :) If these get annoying later we can add *_less and *_greater.

@memononen memononen merged commit 64c9356 into memononen:main May 17, 2026
3 checks passed
@memononen

Copy link
Copy Markdown
Owner

Thanks for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants