S3ObjectSummaryLookup lists without a delimiter, over-listing sibling prefixes #7224

@rnaidu-seqera

Bug report

Expected behavior and actual behavior

When resolving an S3 directory path (Files.exists / isDirectory / nf-schema format: path), S3ObjectSummaryLookup.lookup() lists by the bare key with no delimiter. Because S3Path strips the trailing slash, s3://bucket/sampleA/ is listed with prefix="sampleA", which also matches sibling prefixes such as sampleA-extra/. matchName() rejects those keys afterward, but only after every sibling key has been paginated at maxKeys=250.

On large layouts this inflates head-node S3 concurrency enough to reliably trigger the upstream CRT progress-accounting crash (aws/aws-sdk-java-v2#4790), which kills the event-loop thread and leaves the head job hung in submitted state with an empty .nextflow.log.

This is the only nf-amazon listing path missing the delimiter; S3Iterator.buildRequest() uses .prefix(key).delimiter("/").

// S3ObjectSummaryLookup.lookup()
request.prefix(s3Path.getKey());   // bare key, slash stripped
request.maxKeys(250);
// no .delimiter("/")
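
To make the over-listing concrete, here is a minimal in-memory sketch of how S3 matches a listing prefix (a plain string starts-with test). It makes no AWS calls; the class name, helper name, and object keys are hypothetical and are not part of nf-amazon.

```java
import java.util.List;
import java.util.stream.Collectors;

// Simulates S3 prefix matching in memory; no AWS calls are made.
final class PrefixOverListingDemo {

    // S3 returns every key that starts with the requested prefix (plain string match).
    static List<String> listByPrefix(List<String> keys, String prefix) {
        return keys.stream()
                .filter(k -> k.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical bucket contents with two sibling prefixes.
        List<String> keys = List.of(
                "sampleA/reads_1.fastq.gz",
                "sampleA/reads_2.fastq.gz",
                "sampleA-extra/reads_1.fastq.gz",
                "sampleA-extra/reads_2.fastq.gz");

        // Bare key, as lookup() sends it: keys under sampleA-extra/ are returned too.
        System.out.println(listByPrefix(keys, "sampleA"));

        // Slash-terminated prefix: only keys under sampleA/ itself.
        System.out.println(listByPrefix(keys, "sampleA/"));
    }
}
```

With the bare key, the number of keys paged through grows with the size of every sibling that shares the leading substring, not just with the directory being resolved.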

Steps to reproduce the problem

  1. Create sibling prefixes where one is a leading substring of the other, e.g. s3://mybucket/sampleA/ and s3://mybucket/sampleA-extra/, each with many objects.
  2. Run a pipeline that resolves s3://mybucket/sampleA/ on the head node (e.g. nf-schema input with format: path).
  3. Both prefixes are listed; the run hangs in the submitted state with an empty .nextflow.log.

Renaming to a lexicographically isolated prefix (e.g. sampleA-bc/) avoids the over-listing and the run starts normally.

Program output

Exception in thread "AwsEventLoop 7" java.lang.IllegalArgumentException: transferredBytes (481256976) must not be greater than totalBytes (481256866)
  at software.amazon.awssdk.transfer.s3.internal.progress.DefaultTransferProgressSnapshot.<init>(DefaultTransferProgressSnapshot.java:45)
  at software.amazon.awssdk.transfer.s3.internal.progress.TransferProgressUpdater.incrementBytesTransferred(TransferProgressUpdater.java:213)
  at software.amazon.awssdk.services.s3.internal.crt.S3CrtResponseHandlerAdapter.onProgress(S3CrtResponseHandlerAdapter.java:290)

Environment

  • Nextflow version: 25.10.5
  • Operating system: Linux (head job, Fusion-enabled CE)

Additional context

nf-amazon pins awssdk:s3 / s3-transfer-manager / aws-crt-client = 2.33.2. Likely fix: append / to the prefix and/or set .delimiter("/") as S3Iterator does; matchName() semantics are preserved. The CRT crash is upstream (aws/aws-sdk-java-v2#4790).
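
As a rough illustration of why setting a delimiter would bound the work, the sketch below simulates in memory how a delimited listing collapses keys into common prefixes. It is a simplified model, not the nf-amazon code or the AWS SDK: real S3 reports objects and common prefixes in separate response fields, whereas this sketch merges them into one set, and the class name, method name, and keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// In-memory model of a delimited listing; no AWS calls are made.
final class DelimiterRollupDemo {

    // Keys with no delimiter after the prefix are returned as-is; keys that contain
    // the delimiter after the prefix collapse into a single common-prefix entry.
    static Set<String> listWithDelimiter(List<String> keys, String prefix, String delimiter) {
        Set<String> entries = new TreeSet<>();
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                continue;
            }
            int idx = key.indexOf(delimiter, prefix.length());
            entries.add(idx < 0 ? key : key.substring(0, idx + delimiter.length()));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Hypothetical layout: a small directory next to a very large sibling.
        List<String> keys = new ArrayList<>();
        keys.add("sampleA/reads_1.fastq.gz");
        for (int i = 0; i < 10_000; i++) {
            keys.add("sampleA-extra/part-" + i);
        }

        // Even with the bare key as prefix, each sibling contributes one entry.
        System.out.println(listWithDelimiter(keys, "sampleA", "/"));
    }
}
```

In this model the large sibling contributes one entry however many objects it holds, so the listing for the bare key stays small and matchName() can still pick out the intended directory.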
