assigning multiple files per worker in parallel_import #49

@thricedotted

(NB: Not sure if this is an upstream bug or upstream intended behavior, but I was directed here for this issue!)

My data is spread across more files than there are workers, and I had trouble using parallel_import to upload them to Myria. See the following queries:

https://rest.myria.cs.washington.edu:1776/query/query-70837 -- in this query, I assigned five files to three workers, and it fails with edu.washington.escience.myria.DbException: Query #70837.0 failed: ErrorCode: 0, SQLState: 42P07, Msg: ERROR: relation "public:adhoc:supertinyngramtest" already exists

https://rest.myria.cs.washington.edu:1776/query/query-70838 -- exactly the same query, except each file is assigned to a unique worker. This one runs successfully.

Both queries set "argOverwriteTable": true, since at first I thought I was double-ingesting -- however, an earlier query with it set to false also failed.
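
For illustration, here is a minimal Python sketch of the two assignments described above. The file names and worker IDs are made up, and this is not the Myria API or the actual query plan; it only shows which workers end up with more than one file in each case.

```python
from collections import defaultdict


def files_per_worker(assignment):
    """Group (worker_id, file_name) pairs by worker."""
    grouped = defaultdict(list)
    for worker_id, file_name in assignment:
        grouped[worker_id].append(file_name)
    return dict(grouped)


def workers_with_multiple_files(assignment):
    """Return the sorted IDs of workers that were given more than one file."""
    grouped = files_per_worker(assignment)
    return sorted(w for w, names in grouped.items() if len(names) > 1)


# Hypothetical file names; the real ones are in the linked queries.
files = [f"ngrams-part-{i}.csv" for i in range(5)]

# Failing shape: five files spread round-robin over three workers (IDs 1..3).
failing = [(1 + i % 3, name) for i, name in enumerate(files)]

# Working shape: each file goes to its own worker (IDs 1..5).
working = [(1 + i, name) for i, name in enumerate(files)]

print(workers_with_multiple_files(failing))
print(workers_with_multiple_files(working))
```

In the first shape some workers receive two files, which is the only difference from the second shape, where no worker does.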
