
DUX-5110 Don't cancel startup before writing an error log #444

Merged: 9999years merged 1 commit into main from wiggles/dux-5110/zskw on Apr 29, 2026

@9999years (Member):

Previously, the `tokio::select!` in `ghci::manager` would see that the `ghci` process had exited and would cancel the `ghci.initialize()` job as a result.

However, we actually want the `ghci.initialize()` job to complete (even if it errors out), because it will read the `ghci` output and write (e.g.) compilation errors to the error log.

Usually, `ghci.initialize()` will fail with a `BrokenPipe` IO error in this case (trying to write to GHCi's stdin after the process has exited), so we catch those errors explicitly and ignore them (needs DUX-5106).
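
As an illustration of the last point, here is a minimal sketch of "treat a broken pipe as expected, propagate everything else". It assumes a plain `std::io::Result`; the helper name `ignore_broken_pipe` is hypothetical and does not appear in this PR. The actual change in `src/ghci/manager.rs` downcasts a wrapped error and logs at debug level instead.

```rust
use std::io;

/// Hypothetical helper (not from the PR): swallow `BrokenPipe`, which is
/// expected when the child process has already exited, and pass every
/// other result through unchanged.
fn ignore_broken_pipe(result: io::Result<()>) -> io::Result<()> {
    match result {
        // Writing to a closed stdin is expected here; treat it as success
        // and let the caller detect the exit separately.
        Err(err) if err.kind() == io::ErrorKind::BrokenPipe => Ok(()),
        // Any other error (or success) is returned as-is.
        other => other,
    }
}
```

The exit itself is still detected separately, so dropping the broken pipe error here does not hide the startup failure.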


9999years force-pushed the wiggles/dux-5110/zskw branch from e01d606 to 8d5e0ca on April 28, 2026 23:16
9999years marked this pull request as ready for review on April 28, 2026 23:27
9999years requested a review from a team as a code owner on April 28, 2026 23:27

@lf- (Contributor) left a comment:

I have thought about this a bit, and I think I am convinced you did fix a real race condition with this, so I think it's right :)

Comment thread src/ghci/manager.rs
.is_some_and(|io| io.kind() == std::io::ErrorKind::BrokenPipe)
});
if is_broken_pipe {
tracing::debug!("ghci stdin closed during startup (broken pipe): {err}");

Contributor:

Does this want to be a higher level than debug? Feels like the kind of thing that users maybe want to know by default? Or maybe not?

Member Author:

Nah, this is expected; broken pipe = ghci exited, which we'll detect right after with the `exited_sender`.

Comment thread src/ghci/manager.rs
}
}
};
let startup_exit: Option<ExitStatus> = exited_receiver.try_recv().ok();

Contributor:

Could simplify by matching on `let Ok` on the following line and deleting the `.ok()`, but shrug.
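
A minimal sketch of that suggestion, under loud assumptions: it uses `std::sync::mpsc` and an `i32` exit code as stand-ins, since the real channel and `ExitStatus` types in `manager.rs` are not shown in this thread, and `describe_startup_exit` is a hypothetical function name.

```rust
use std::sync::mpsc::{channel, Receiver};

/// Stand-in for the real exit status type (an assumption for this sketch).
type ExitCode = i32;

/// Hypothetical function: report whether the process exited during startup.
fn describe_startup_exit(exited_receiver: &Receiver<ExitCode>) -> String {
    // Match directly on the `Result` from `try_recv()` instead of first
    // converting it to an `Option` with `.ok()`.
    if let Ok(code) = exited_receiver.try_recv() {
        format!("ghci exited during startup with code {code}")
    } else {
        "ghci still running".to_string()
    }
}
```

Both forms behave the same; the `if let Ok(..)` version just avoids the intermediate `Option`.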

Comment thread src/ghci/manager.rs
Comment on lines +88 to +89
e.downcast_ref::<std::io::Error>()
.is_some_and(|io| io.kind() == std::io::ErrorKind::BrokenPipe)

Contributor:

do we have any way to test this? I am always scared of fucking up on these kinds of type id downcast based on my experience screwing up AnnotatedException in Haskell.

Member Author:

yes oops #449
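
To make the downcast concern in this thread concrete, here is a sketch of how a type-based downcast can miss an `io::Error` that sits behind a wrapper, and one way to unit test for it. `StartupError` and `chain_has_broken_pipe` are hypothetical names for illustration only; the error type actually used in `manager.rs` is not shown here and may downcast differently.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical wrapper error that carries an `io::Error` as its source.
#[derive(Debug)]
struct StartupError {
    source: io::Error,
}

impl fmt::Display for StartupError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to initialize ghci")
    }
}

impl Error for StartupError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Walk the `source()` chain and report whether any link is an
/// `io::Error` with kind `BrokenPipe`.
fn chain_has_broken_pipe(err: &(dyn Error + 'static)) -> bool {
    let mut current = Some(err);
    while let Some(e) = current {
        let is_broken_pipe = e
            .downcast_ref::<io::Error>()
            .is_some_and(|io_err| io_err.kind() == io::ErrorKind::BrokenPipe);
        if is_broken_pipe {
            return true;
        }
        // Move one level deeper into the chain.
        current = e.source();
    }
    false
}
```

A test can build a wrapped `BrokenPipe` error and assert both that a direct `downcast_ref::<io::Error>()` on the wrapper returns `None` and that the chain walk finds it, which is the kind of mistake the reviewer is worried about.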

9999years force-pushed the wiggles/dux-5110/zskw branch from 8d5e0ca to b3f9dbf on April 28, 2026 23:52
9999years enabled auto-merge (squash) on April 29, 2026 00:02
9999years merged commit 5751504 into main on Apr 29, 2026 (39 checks passed)
9999years deleted the wiggles/dux-5110/zskw branch on April 29, 2026 00:06

9999years added a commit that referenced this pull request on Apr 29, 2026:

#444 got auto-merged a bit early, oops!

I can confirm that if you comment out the change in #444 then a bunch of
stuff fails:

```
     Summary [  83.049s] 239 tests run: 227 passed (1 leaky), 12 failed, 0 skipped
        FAIL [  36.712s] ghciwatch::error_log error_log_startup_failure_9102
        FAIL [  24.564s] ghciwatch::error_log error_log_startup_failure_9122
        FAIL [  22.095s] ghciwatch::error_log error_log_startup_failure_967
        FAIL [  33.764s] ghciwatch::error_log error_log_startup_failure_984
        FAIL [  22.941s] ghciwatch::shutdown handles_repeated_startup_failures_9102
        FAIL [  24.139s] ghciwatch::shutdown handles_repeated_startup_failures_9122
        FAIL [  25.440s] ghciwatch::shutdown handles_repeated_startup_failures_967
        FAIL [  27.936s] ghciwatch::shutdown handles_repeated_startup_failures_984
        FAIL [  27.663s] ghciwatch::shutdown handles_repeated_startup_failures_before_restart_ghci_hook_9102
        FAIL [  25.208s] ghciwatch::shutdown handles_repeated_startup_failures_before_restart_ghci_hook_9122
        FAIL [  22.660s] ghciwatch::shutdown handles_repeated_startup_failures_before_restart_ghci_hook_967
        FAIL [  31.145s] ghciwatch::shutdown handles_repeated_startup_failures_before_restart_ghci_hook_984
```
3 participants