HASA-PAM: GPU-accelerated transcranial passive acoustic mapping

Introduction

HASA-PAM is a Julia project for GPU-accelerated transcranial passive acoustic mapping (PAM), developed for the ETH Zurich course "Solving PDEs in parallel on GPUs with Julia II" (Spring 2026 | ETHZ 101-0250-01).

The repository implements a reconstruction pipeline for localising acoustic sources inside the skull. The motivating application is non-invasive targeted drug delivery in the brain: if ultrasound-controllable drug carriers can be aggregated and uncaged at a chosen target, a future closed-loop system would also need feedback about whether drug-carrier or bubble clusters formed and where they are located.

This project addresses that feedback problem with transcranial PAM. While the primary focus is PAM, transcranial focusing (geometric and HASA delay estimation with optional k-Wave verification) is also fully implemented and accessible via CLI for both 2D and 3D domains. Receiver recordings are time-reversed through an acoustic model, and the reconstructed intensity field is used to estimate source locations. The implementation compares a fast homogeneous angular-spectrum baseline with a skull-aware Heterogeneous Angular Spectrum Approach (HASA) reconstruction that uses a CT-derived speed-of-sound model. The current benchmark is simulation-based and uses synthetic source emissions.

In the reported synthetic 3D benchmark, skull correction reduced mean localisation error from 1.77 mm to 0.49 mm, while the GPU implementation reduced corrected reconstruction march time from 13.28 s to 0.23 s.

What this repository does

The project provides tools for:

simulating sparse acoustic source emissions and receiver-plane RF data,
generating synthetic benchmark data with k-Wave,
reconstructing source distributions with homogeneous or skull-corrected HASA propagation,
accelerating reconstruction with CUDA/cuFFT batched frequency processing,
evaluating reconstructed peaks or masks against synthetic ground truth.

Background

Many neurological interventions would benefit from concentrating drugs at a known brain target rather than administering them systemically. Aggregation and Uncaging Focused Ultrasound Sequence (AU-FUS) provides one route toward this: Ozdas et al. demonstrated that ultrasound-controllable drug carriers can be aggregated at a target and then locally uncaged to release their payload (Ozdas et al., 2020). For closed-loop use, however, the system would need feedback about whether aggregation occurred at the intended target. If aggregated carriers or associated bubbles emit detectable acoustic signals, PAM can infer their locations from receiver recordings, but the skull distorts those signals through spatially varying sound speed and phase delay.

PAM reverses the physical emission path computationally. Starting from receiver recordings, the acoustic field is propagated back into the imaging volume, accumulated into an intensity map, and post-processed to estimate source locations. This uses acoustic time reversal: recorded wavefields can be re-emitted or backpropagated to refocus near their original source locations (Fink, 1993). A homogeneous angular-spectrum reconstruction is fast, but it treats the skull as water and can localise sources incorrectly.

The skull-corrected version instead propagates the received field through an individual speed-of-sound model of the skull. This project uses HASA, which was introduced for trans-skull imaging and focusing by Schoen & Arvanitis and later demonstrated for trans-skull volumetric PAM (Schoen & Arvanitis, 2020; Schoen, Dash & Arvanitis, 2022). The correction improves localisation, but it is only useful for feedback if the heterogeneous propagation is fast enough to run close to real time.

The videos below show the synthetic workflow: an oscillating source emits through the skull to a receiver plane, then the received data is propagated back inward either naively or with skull correction. The final intensity maps can be used to estimate cluster locations, for example by taking the maximum-intensity point.

1. Outward acoustic emission through the skull

2. Inward reconstruction: naive vs skull-corrected

Naive homogeneous propagation	With skull correction

Open MP4	Open MP4

Methods

Simulation and reconstruction pipeline

The pipeline has two stages. First, synthetic cluster emissions are propagated outward with k-Wave to generate receiver-plane RF data. Second, those recordings are time-reversed through either a homogeneous angular-spectrum model or a skull-aware HASA model. The homogeneous baseline is computationally cheap and shows how much localisation error is introduced by ignoring the skull.

At a high level, the reconstruction loop is:

receiver_rf = run_k_wave(cluster_sources, medium)
receiver_spectra = FFT_time(receiver_rf)

for frequency in selected_frequencies
    receiver_field = receiver_spectra[frequency]
    P = FFT_xy(receiver_field)

    for each depth plane
        P = homogeneous_angular_spectrum_step(P)

        if skull_correction_enabled
            p = IFFT_xy(P)
            C = FFT_xy(sound_speed_contrast[depth] * p)
            P += HASA_heterogeneity_correction(C)
        end

        reconstructed_image[depth] += |IFFT_xy(P)|²
    end
end

cluster_locations = peak_or_mask_analysis(reconstructed_image)

In the GPU implementation, selected frequency bins are batched into a single array. Batched cuFFT calls handle transverse FFT/IFFT operations, propagation and correction factors are precomputed or fused where possible, and intensity is accumulated on-device across frequencies.

Cluster source model

The source model is a proxy for detectable AU-FUS-related acoustic emissions, not a full model of aggregation dynamics, nonlinear bubble behaviour, or drug release. It assumes sparse, compact emitters whose finite-duration signals contain the driving frequency and selected harmonics, with optional harmonic-specific phase offsets.

For a single cluster, the emitted signal is modelled as a Tukey-windowed harmonic series:

$$s(t) = A \, w_{\mathrm{Tukey}}(t; T_{\mathrm{gate}}, r) \sum_{n \in \mathcal{H}} \alpha_n \cos \left(2\pi n f_0 t + \phi_n \right)$$

Here, A is the overall source amplitude, w_Tukey gates the signal over the emission duration T_gate with taper ratio r, f0 is the centre frequency, H is the selected set of harmonic indices, α_n controls the relative strength of each harmonic, and φ_n allows harmonic-specific phase offsets.

For multiple sparse clusters, the source field is represented as a sum of point emitters:

$$q(\mathbf{x}, t) = \sum_i s_i(t)\,\delta(\mathbf{x}-\mathbf{x}_i)$$

The resulting source field is propagated outward with k-Wave to produce synthetic receiver recordings. Those recordings are then treated like measured data and passed to the PAM reconstruction code.

HASA propagation

The inward reconstruction needs to propagate recorded pressure fields through a heterogeneous speed-of-sound map quickly enough to be useful for feedback. HASA represents each reconstructed frequency by its transverse angular spectrum and marches it axially through the volume, following the trans-skull angular-spectrum formulation introduced by Schoen & Arvanitis (Schoen & Arvanitis, 2020).

For one axial step from z_n to z_{n+1} = z_n + Δz, the pressure angular spectrum is updated as

$$P^{n+1} \approx \underbrace{P^n e^{i k_z \Delta z}}_{\text{homogeneous ASA propagation}} + \underbrace{ \frac{e^{i k_z \Delta z}}{2 i k_z} \left(P^n * \Lambda\right)\Delta z }_{\text{heterogeneous sound-speed correction}} .$$

Here P^n = P(k_x, k_y, z_n) is the transverse angular spectrum at the current depth, k_z is the axial wavenumber, and Λ is the spectrum of the local sound-speed heterogeneity term. The first term is the same homogeneous angular-spectrum update used by the uncorrected baseline. The second term is the HASA correction: it accounts for the skull by coupling the current pressure field with the local speed-of-sound contrast. In code, the convolution term is evaluated with FFT/IFFT operations: the current angular spectrum is transformed back to the transverse spatial grid, multiplied by the row-local heterogeneity field λ = k_0^2(1 - (c_0 / c)^2), transformed forward again, scaled by the HASA correction factor, and added to the homogeneous propagation step before marching to the next depth plane.

For derivation and implementation details, see the PAM algorithm documentation and the original HASA paper by Schoen & Arvanitis (2020).

Results

The repository uses synthetic data because suitable experimental datasets are limited. Benchmarks place known sources in a simulated medium, record the wavefield at the receiver plane with k-Wave, and reconstruct the source field with homogeneous or skull-corrected PAM. Accuracy is evaluated by comparing recovered peaks or predicted masks against synthetic ground truth.

Numerical validation against reference

The 2D PAM benchmark reproduces a point-source localisation sweep from Schoen & Arvanitis (2020). It simulates 54 point sources through a CT-derived skull slice with k-Wave and compares homogeneous ASA against skull-corrected HASA for 50 mm, 75 mm, and 100 mm apertures. The corrected HASA localisation errors match the reference paper within about 0.2 mm across all apertures: 1.1 ± 1.1 mm vs 1.2 ± 0.7 mm at 50 mm, 0.7 ± 0.5 mm vs 0.9 ± 0.5 mm at 75 mm, and 0.6 ± 0.4 mm vs 0.8 ± 0.4 mm at 100 mm.

Detailed validation numbers can be found in benchmark/2D_PAM_Accuracy/results.md.

3D reconstruction accuracy

For the single-source 3D benchmark, the reconstruction volume used a 50 mm depth, 50 × 50 mm transverse aperture, 0.2 mm grid spacing, and a CT-derived skull layer at 20 mm. Sources were simulated at 27 distinct positions and reconstructed with either homogeneous propagation only or skull-corrected HASA-PAM. Accounting for skull heterogeneity substantially improved localisation accuracy: the mean Euclidean localisation error dropped from 1.77 ± 0.88 mm to 0.49 ± 0.23 mm.

This result supports the central motivation of the project: skull-aware time reversal is not just a refinement, but a substantial improvement in source localisation quality for transcranial PAM.

Reconstructing multiple clusters

For the multi-cluster benchmark, synthetic clusters were seeded in a vessel-like mask. The reconstruction was converted into a predicted cluster mask by thresholding the intensity field at τ = -6 dB, and overlap with the ground truth was measured using the F1 score. In one preliminary setting with r = 1 mm, the corrected reconstruction reached F1 = 70%, compared with 2% for the uncorrected reconstruction. This suggests that skull-aware reconstruction can also improve distribution-level recovery, although the result is sensitive to the mask radius, threshold, aperture, and source-density assumptions.

Example corrected PAM reconstructions for sparse clustered sources:

Single vessel	Network of vessels in focal volume

Reconstruction speed

The same 3D benchmark was used to evaluate runtime on CPU and GPU. The reported configuration used a 50 × 50 mm aperture, 50 mm axial depth, 0.2 mm grid spacing, and a CT-derived skull layer at 20 mm, with runtimes reported as the median of 5 runs on an RTX 3080 / Ryzen 9 5900X system.

GPU acceleration reduced corrected march time from 13.28 s on CPU to 0.23 s on GPU. In a live workflow where setup is performed once, this corresponds to roughly 4 Hz feedback. The skull correction adds about 0.21 s over the uncorrected GPU march time of 0.02 s, while remaining in a real-time-relevant regime.

Scaling

Corrected HASA-PAM was benchmarked across reconstruction depths and transverse apertures on an RTX 3080 using a 0.2 mm grid, 4 axial substeps per cell, and 3 reconstructed frequencies.

Corrected HASA-PAM scaled predictably for 50–75 mm apertures, where march time was well described by t_march ≈ N_y² · (9.1e-7 + 2.3e-9 · n_rows) and effective memory bandwidth remained high (181–205 GB/s). The 25 mm case is dominated more by fixed overheads, while the 100 mm case leaves the most efficient memory regime and slows down substantially. Larger volumes are therefore mainly memory-limited rather than compute-limited.

Limitations and future work

The current benchmark is simulation-based. The source model is intentionally simplified and does not capture full bubble dynamics, realistic transducer geometry, drug-release physics, or experimental variability. The main computational bottleneck is GPU memory, so larger reconstruction volumes will require more memory-efficient operators or on-the-fly correction-factor generation.

Multi-GPU support was intentionally not implemented. The target use case is a closed-loop feedback system that should run near the acquisition setup on accessible single-workstation hardware; requiring several GPUs would make the system less portable without improving its practical usefulness for that feedback loop. For this application, reducing memory pressure on one GPU is the more relevant next step than distributing the current workload across multiple GPUs.

Planned extensions include:

reduce memory use during temporal FFT handling,
generate correction factors on demand,
support larger single-GPU reconstructions through streaming or chunked operators,
test against real acoustic recordings,
include realistic transducer geometry,
integrate reconstruction into a closed-loop targeting workflow.

Getting Started

For more detailed setup instructions, workflow notes, and CLI parameter references, see the hosted documentation: https://magisv.github.io/hasa-pam/.

Command line entry points

The repository exposes two primary command line entry points:

scripts/run_pam.jl — the main PAM workflow: simulate emissions, reconstruct source distributions, evaluate accuracy.
scripts/run_focus.jl — transcranial focusing: compute geometric or HASA transmit delays, optionally verify with k-Wave, for 2D and 3D domains.

Setup

Instantiate the Julia environment:

julia --project=. -e 'using Pkg; Pkg.instantiate()'

Python-side dependencies for k-Wave are managed through CondaPkg.toml. The first k-Wave run may resolve Python packages and k-Wave resources.

Data

The CT scan data used for skull-backed examples is private and is not distributed with the repository. Homogeneous water runs with --aberrator=none do not need CT data. For skull-backed runs, pass:

julia --project=. scripts/run_pam.jl --ct-path=/path/to/dicom-folder

Quick PAM example

Small homogeneous 2D point-source run:

julia --project=. scripts/run_pam.jl \
  --source-model=point \
  --sources-mm=30:0 \
  --aberrator=none \
  --recon-use-gpu=false

Simple 3D point-source run on CUDA:

julia --project=. scripts/run_pam.jl \
  --dimension=3 \
  --source-model=point \
  --sources-mm=30:2:-1 \
  --aberrator=none \
  --recon-use-gpu=true

Quick focusing examples

2D skull-backed HASA focus:

julia --project=. scripts/run_focus.jl \
  --estimator=hasa \
  --aberrator=skull \
  --slice-index=250

3D HASA focus in water (no k-Wave, fast smoke test):

julia --project=. scripts/run_focus.jl \
  --dimension=3 \
  --estimator=hasa \
  --aberrator=water \
  --focus-mm=60:0:0 \
  --receiver-aperture-y-mm=60 \
  --receiver-aperture-z-mm=60 \
  --run-kwave=false

Tests

Run the standard test suite:

julia --project=. -e 'using Pkg; Pkg.test()'

Enable optional k-Wave smoke tests:

TRANSCRANIALFUS_RUN_KWAVE_TESTS=1 julia --project=. -e 'using Pkg; Pkg.test()'

Enable the heavier CT-backed integration tests as well:

TRANSCRANIALFUS_RUN_INTEGRATION=1 TRANSCRANIALFUS_RUN_KWAVE_TESTS=1 julia --project=. -e 'using Pkg; Pkg.test()'

References

Fink, M. (1993). Time-reversal mirrors. Journal of Physics D: Applied Physics, 26(9), 1333. https://doi.org/10.1088/0022-3727/26/9/001
Ozdas, M. S., Shah, A. S., Johnson, P. M., Patel, N., Marks, M., Yasar, T. B., Stalder, U., Bigler, L., von der Behrens, W., Sirsi, S. R., & Yanik, M. F. (2020). Non-invasive molecularly-specific millimeter-resolution manipulation of brain circuits by ultrasound-mediated aggregation and uncaging of drug carriers. Nature Communications, 11, 4929. https://doi.org/10.1038/s41467-020-18059-7
Schoen, S., & Arvanitis, C. D. (2020). Heterogeneous Angular Spectrum Method for Trans-skull Imaging and Focusing. IEEE Transactions on Medical Imaging, 39(5), 1605–1614. https://doi.org/10.1109/TMI.2019.2953872
Schoen, S., Dash, P., & Arvanitis, C. D. (2022). Experimental Demonstration of Trans-Skull Volumetric Passive Acoustic Mapping With the Heterogeneous Angular Spectrum Approach. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 69(2), 534–542. https://doi.org/10.1109/TUFFC.2021.3125670

License

This project is licensed under the MIT License.

AI Usage

This project was developed with AI assistance. Claude and Codex were used during implementation for code generation, refactoring, debugging, documentation drafting, and test iteration. All AI-generated changes were reviewed and integrated by me.

Name		Name	Last commit message	Last commit date
Latest commit History 96 Commits
.github/workflows		.github/workflows
benchmark		benchmark
docs		docs
figures		figures
scripts		scripts
src		src
test		test
.gitignore		.gitignore
CondaPkg.toml		CondaPkg.toml
LICENSE		LICENSE
Manifest.toml		Manifest.toml
Project.toml		Project.toml
README.md		README.md
coverage.jl		coverage.jl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HASA-PAM: GPU-accelerated transcranial passive acoustic mapping

Table of Contents

Introduction

What this repository does

Background

1. Outward acoustic emission through the skull

2. Inward reconstruction: naive vs skull-corrected

Methods

Simulation and reconstruction pipeline

Cluster source model

HASA propagation

Results

Numerical validation against reference

3D reconstruction accuracy

Reconstructing multiple clusters

Reconstruction speed

Scaling

Limitations and future work

Getting Started

Command line entry points

Setup

Data

Quick PAM example

Quick focusing examples

Tests

References

License

AI Usage

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HASA-PAM: GPU-accelerated transcranial passive acoustic mapping

Table of Contents

Introduction

What this repository does

Background

1. Outward acoustic emission through the skull

2. Inward reconstruction: naive vs skull-corrected

Methods

Simulation and reconstruction pipeline

Cluster source model

HASA propagation

Results

Numerical validation against reference

3D reconstruction accuracy

Reconstructing multiple clusters

Reconstruction speed

Scaling

Limitations and future work

Getting Started

Command line entry points

Setup

Data

Quick PAM example

Quick focusing examples

Tests

References

License

AI Usage

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages