Skip to content

Add cmpctblock inputs#276

Merged
maflcko merged 1 commit into
bitcoin-core:mainfrom
Crypt-iQ:05092026/cmpctblock-inputs
May 14, 2026
Merged

Add cmpctblock inputs#276
maflcko merged 1 commit into
bitcoin-core:mainfrom
Crypt-iQ:05092026/cmpctblock-inputs

Conversation

@Crypt-iQ

@Crypt-iQ Crypt-iQ commented May 8, 2026

Copy link
Copy Markdown
Contributor

Ran my corpus with set_cover_merge=1 and use_value_profile=0. Coverage is here, it does not hit some cases because it has only run for a few days, but I can PR those inputs when it does.

@dergoegge

Copy link
Copy Markdown
Member

ACK 8dfbac8

waiting for CI

@maflcko

maflcko commented May 8, 2026

Copy link
Copy Markdown
Contributor
Run cmpctblock with args ['/home/runner/work/_temp/build/bin/fuzz', PosixPath('/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/cmpctblock')]Assertion failed: detected inconsistent lock order for 'cs' in txmempool.h:509 (in thread 'test'), details in debug log.
Error processing input "/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/cmpctblock/5f961ac0646e590d064fafe943ddfd24cf71b61f"

Assertion failed: detected inconsistent lock order for 'cs' in txmempool.h:509 (in thread 'test'), details in debug log.
Error processing input "/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/cmpctblock/5f961ac0646e590d064fafe943ddfd24cf71b61f"

⚠️ Failure generated from target with exit code 1: ['/home/runner/work/_temp/build/bin/fuzz', PosixPath('/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/cmpctblock')]

@Crypt-iQ

Crypt-iQ commented May 8, 2026

Copy link
Copy Markdown
Contributor Author
Run cmpctblock with args ['/home/runner/work/_temp/build/bin/fuzz', PosixPath('/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/cmpctblock')]Assertion failed: detected inconsistent lock order for 'cs' in txmempool.h:509 (in thread 'test'), details in debug log.
Error processing input "/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/cmpctblock/5f961ac0646e590d064fafe943ddfd24cf71b61f"

Assertion failed: detected inconsistent lock order for 'cs' in txmempool.h:509 (in thread 'test'), details in debug log.
Error processing input "/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/cmpctblock/5f961ac0646e590d064fafe943ddfd24cf71b61f"

⚠️ Failure generated from target with exit code 1: ['/home/runner/work/_temp/build/bin/fuzz', PosixPath('/home/runner/work/_temp/ci/scratch/qa-assets/fuzz_corpora/cmpctblock')]

This is because of ImmediateTaskRunner and can't happen during production. When BlockConnected gets called, mempool.cs is already locked in the single-thread mode and the callback isn't sent to the scheduler thread so the same thread locks m_tx_download_mutex. Then a later iteration of the fuzz loop sends a tx which locks m_tx_download_mutex and then mempool.cs. In production, the BlockConnected is instead sent to the scheduler thread.

Not really sure what to do here because I don't think I can modify the order of the locks here (and wouldn't if I could just for fuzz code).

@maflcko

maflcko commented May 11, 2026

Copy link
Copy Markdown
Contributor

Hmm, ImmediateTaskRunner is used by the kernel, so conceptually I wonder if we should worry about that. Hopefully the kernel tests cover all of this and can detect lock order inversions. Otherwise, it would seem there could be a deadlock in a multithreaded kernel app in the future?

For this fuzz test: Can it be fixed in a hacky way by turning ImmediateTaskRunner into a ImmediateBackgroundTaskRunner? E.g:

class ImmediateBackgroundTaskRunner : public TaskRunnerInterface
{
public:
    void insert(std::function<void()> func) override { std::thread(std::move(func)).join(); }
    void flush() override {}
    size_t size() override { return 0; }
};

@Crypt-iQ

Crypt-iQ commented May 11, 2026

Copy link
Copy Markdown
Contributor Author

Hopefully the kernel tests cover all of this and can detect lock order inversions. Otherwise, it would seem there could be a deadlock in a multithreaded kernel app in the future?

I am not familiar with the kernel code, it looks like the validation_signals in bitcoind use SerialTaskRunner, but then any app will use the ImmediateTaskRunner? Potentially this could cause double locking if the previously async callbacks are single-threaded? Maybe I don't understand kernel enough to know how an app calls into the kernel code.

The suggested patch does fix the issue, it should fix any current false positive deadlocks and also can't have false negative deadlocks. If we previously locked A->B->C and now split off B->C into a separate thread, we have a subset of the original lock orders. However double locking could technically go undetected since some callbacks are launched in the same thread (BlockChecked, ActiveTipChange, NewPoWValidBlock). I don't think there is currently any double locking, so maybe we can leave that to the functional tests to detect. Thanks for the change, I will test my full corpus on it and then PR if all is good.

@maflcko

maflcko commented May 12, 2026

Copy link
Copy Markdown
Contributor

Thanks for the change, I will test my full corpus on it and then PR if all is good.

One thing to consider is how much overhead it has to launch a std::thread for each function. If it is too much, one can consider adding a "pool" with one thread (src/test/fuzz/threadpool.cpp:ThreadPool g_pool{"fuzz"};) and using that in place of the std::thread.

The submit funciton will then probably be:

    void insert(std::function<void()> func) override { Assert(g_pool.Submit(std::move(func)))->wait(); }

@Crypt-iQ

Copy link
Copy Markdown
Contributor Author

One thing to consider is how much overhead it has to launch a std::thread for each function. If it is too much, one can consider adding a "pool" with one thread (src/test/fuzz/threadpool.cpp:ThreadPool g_pool{"fuzz"};) and using that in place of the std::thread.

Am currently benching this, probably should have started with profiling first though to see if it's indeed a bottleneck. The diff is a bit confusing since ThreadPool is passed to ImmediateBackgroundTaskRunner through TestOpts and I still need to modify insert to default to running the function in the same thread if no pool* is passed (for the rest of the fuzz tests not using a ThreadPool). The pool needs to be started and then stopped in .init so that the block validation callbacks don't cause a crash due to no workers being available. I think it's ok, but seems a little hacky?

diff
diff --git a/src/test/fuzz/cmpctblock.cpp b/src/test/fuzz/cmpctblock.cpp
index 3e4268cb65..2ca333c15b 100644
--- a/src/test/fuzz/cmpctblock.cpp
+++ b/src/test/fuzz/cmpctblock.cpp
@@ -36,6 +36,7 @@
 #include <txmempool.h>
 #include <uint256.h>
 #include <util/check.h>
+#include <util/threadpool.h>
 #include <util/time.h>
 #include <util/translation.h>
 #include <validation.h>
@@ -139,16 +140,34 @@ void ResetChainmanAndMempool(TestingSetup& setup)
 
 extern void MakeRandDeterministicDANGEROUS(const uint256& seed) noexcept;
 
+ThreadPool g_pool{"fuzz"};
+Mutex g_pool_mutex;
+size_t g_num_workers = 1;
+
+static void StartPoolIfNeeded() EXCLUSIVE_LOCKS_REQUIRED(!g_pool_mutex)
+{
+    LOCK(g_pool_mutex);
+    if (g_pool.WorkersCount() == g_num_workers) return;
+    g_pool.Start(g_num_workers);
+}
+
 void initialize_cmpctblock()
 {
-    static const auto testing_setup = MakeNoLogFileContext<TestingSetup>();
+    StartPoolIfNeeded();
+    static const auto testing_setup = MakeNoLogFileContext<TestingSetup>(ChainType::REGTEST, {.pool = &g_pool});
     g_setup = testing_setup.get();
     g_nBits = Params().GenesisBlock().nBits;
     ResetChainmanAndMempool(*g_setup);
+    // Stop the threads before the fuzzing engine forks.
+    LOCK(g_pool_mutex);
+    g_pool.Stop();
+    assert(g_pool.WorkersCount() == 0);
 }
 
-FUZZ_TARGET(cmpctblock, .init = initialize_cmpctblock)
+FUZZ_TARGET(cmpctblock, .init = initialize_cmpctblock) EXCLUSIVE_LOCKS_REQUIRED(!g_pool_mutex)
 {
+    StartPoolIfNeeded();
+
     SeedRandomStateForTest(SeedRand::ZEROS);
     FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
 
diff --git a/src/test/util/setup_common.cpp b/src/test/util/setup_common.cpp
index 39c691c336..4866ce5254 100644
--- a/src/test/util/setup_common.cpp
+++ b/src/test/util/setup_common.cpp
@@ -250,7 +250,7 @@ ChainTestingSetup::ChainTestingSetup(const ChainType chainType, TestOpts opts)
         m_node.validation_signals =
             // Use synchronous task runner while fuzzing to avoid non-determinism
             EnableFuzzDeterminism() ?
-                std::make_unique<ValidationSignals>(std::make_unique<util::ImmediateTaskRunner>()) :
+                std::make_unique<ValidationSignals>(std::make_unique<util::ImmediateBackgroundTaskRunner>(opts.pool)) :
                 std::make_unique<ValidationSignals>(std::make_unique<SerialTaskRunner>(*m_node.scheduler));
         {
             // Ensure deterministic coverage by waiting for m_service_thread to be running
diff --git a/src/test/util/setup_common.h b/src/test/util/setup_common.h
index 8b1b063218..916b25bd3a 100644
--- a/src/test/util/setup_common.h
+++ b/src/test/util/setup_common.h
@@ -22,6 +22,7 @@
 #include <util/fs.h>
 #include <util/signalinterrupt.h>
 #include <util/string.h>
+#include <util/threadpool.h>
 #include <util/vector.h>
 
 #include <functional>
@@ -53,6 +54,7 @@ struct TestOpts {
     bool setup_net{true};
     bool setup_validation_interface{true};
     bool min_validation_cache{false}; // Equivalent of -maxsigcachebytes=0
+    ThreadPool* pool;
 };
 
 /** Basic testing setup.
diff --git a/src/util/task_runner.h b/src/util/task_runner.h
index 951381823b..1f8ce156e9 100644
--- a/src/util/task_runner.h
+++ b/src/util/task_runner.h
@@ -7,6 +7,7 @@
 
 #include <cstddef>
 #include <functional>
+#include <util/threadpool.h>
 
 namespace util {
 
@@ -47,6 +48,18 @@ public:
     size_t size() override { return 0; }
 };
 
+class ImmediateBackgroundTaskRunner : public TaskRunnerInterface
+{
+public:
+    explicit ImmediateBackgroundTaskRunner(ThreadPool* pool) : pool{pool} {};
+    void insert(std::function<void()> func) override { Assert(pool->Submit(std::move(func)))->wait(); }
+    void flush() override {}
+    size_t size() override { return 0; }
+
+private:
+    ThreadPool* pool;
+};
+
 } // namespace util

 #endif // BITCOIN_UTIL_TASK_RUNNER_H

@maflcko

maflcko commented May 13, 2026

Copy link
Copy Markdown
Contributor

Ah, I didn't mean to globally replace it. Just in this single fuzz test that needs it call m_node.validation_signals = std::make_unique<ValidationSignals>(std::make_unique<util::ImmediateBackgroundTaskRunner>()) ;

This should allow to drop the diff in setup common.

Also, if you move class ImmediateBackgroundTaskRunner : public TaskRunnerInterface to the only file that needs it, it can either use a std::thread{_}.join() or Assert(pool->Submit(std::move(func)))->wait() whichever is faster for this fuzz target.

@maflcko

maflcko commented May 14, 2026

Copy link
Copy Markdown
Contributor

I guess this is now rfm, if CI passes, right? (After bitcoin/bitcoin#35284)

@maflcko maflcko closed this May 14, 2026
@maflcko maflcko reopened this May 14, 2026
@Crypt-iQ

Copy link
Copy Markdown
Contributor Author

I guess this is now rfm, if CI passes, right? (After bitcoin/bitcoin#35284)

Yup the format didn't change so these are still valid

@maflcko maflcko merged commit efb3589 into bitcoin-core:main May 14, 2026
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants