[weather-sp]: Skip creating split files that already exist #538

Description

@j9sh264

Problem

Currently in weather-sp, the file splitter skips an input file only when all of its split children already exist. If even one split file is missing (for example, due to a partial failure or an interrupted run), the splitter re-splits the entire file and overwrites every previously created child.

This leads to redundant processing and unnecessary I/O, especially when re-running pipelines to fill in missing data.

Proposed Solution

Add functionality to skip the creation of individual split files if they already exist in the output location.

Instead of an all-or-nothing approach at the input file level, the splitter should:

  1. Identify which specific split files are expected to be produced.
  2. Check if each file already exists.
  3. Only generate and upload the missing files, leaving the existing ones intact.
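
A minimal sketch of the three steps above, to make the intended behavior concrete. It is not weather-sp's actual code: `partition_outputs`, `write_missing_splits`, and the `write_split` callback are hypothetical names, and the existence check defaults to `os.path.exists`, so a cloud-aware check would need to be passed in for remote output locations.

```python
import os
from typing import Callable, Iterable, List, Tuple


def partition_outputs(
    expected_paths: Iterable[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> Tuple[List[str], List[str]]:
    """Split the expected output paths into (missing, already_present)."""
    missing: List[str] = []
    present: List[str] = []
    for path in expected_paths:
        # Step 2: check each expected split file individually.
        (present if exists(path) else missing).append(path)
    return missing, present


def write_missing_splits(
    expected_paths: Iterable[str],
    write_split: Callable[[str], None],
    exists: Callable[[str], bool] = os.path.exists,
) -> List[str]:
    """Call write_split only for paths that do not exist yet.

    Hypothetical helper: write_split stands in for whatever generates
    and uploads a single split file. Returns the paths that were written.
    """
    # Step 1 is assumed done by the caller: expected_paths lists every
    # split file the input would produce.
    missing, _ = partition_outputs(expected_paths, exists)
    # Step 3: generate only the missing files; existing ones are untouched.
    for path in missing:
        write_split(path)
    return missing
```

With this shape, the current all-or-nothing behavior is the special case where the input file is skipped only when `missing` is empty; the proposal extends the skip to each individual path.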
