Performance issues in /scripts/imagenet_utils.py (by P3) #16

@DLPerf

Hello! I've found a performance issue in /scripts/imagenet_utils.py: batch() should be called before map(), so that the mapped function runs once per batch rather than once per element, which could make your program more efficient. Here is the TensorFlow documentation to support it.

Detailed description is listed below:

  • dataset.batch(batch_size) (here) should be called before dataset.map(_parse_function, num_parallel_calls=num_threads) (here).
  • dataset.batch(batch_size) (here) should be called before dataset.map(_parse_function, num_parallel_calls=num_threads) (here).

Besides, you need to check whether the function passed to map() (e.g., _parse_function in dataset.map(_parse_function, num_parallel_calls=num_threads)) is affected by this change, so that the modified code still works properly. For example, if _parse_function expects input with shape (x, y, z) before the fix, it will receive input with shape (batch_size, x, y, z) after the fix.
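To illustrate the shape change, here is a minimal sketch. It does not use the real _parse_function from /scripts/imagenet_utils.py; parse_single and parse_batched are hypothetical stand-ins that subtract a per-image channel mean, and the input tensor, batch_size and num_threads values are made up for the example. The point is only that the batched version has to shift its axes by one to give the same result.

```python
import tensorflow as tf


def parse_single(image):
    # Hypothetical per-element function: expects shape (x, y, z).
    image = tf.cast(image, tf.float32)
    mean = tf.reduce_mean(image, axis=[0, 1], keepdims=True)
    return image - mean


def parse_batched(images):
    # Hypothetical batched variant: expects shape (batch_size, x, y, z),
    # so the spatial axes move from (0, 1) to (1, 2).
    images = tf.cast(images, tf.float32)
    mean = tf.reduce_mean(images, axis=[1, 2], keepdims=True)
    return images - mean


# Made-up input: 8 "images" of shape (4, 4, 3).
images = tf.reshape(tf.range(8 * 4 * 4 * 3), (8, 4, 4, 3))
batch_size = 4
num_threads = 2

# Original order: map() is invoked once per element, then elements are batched.
map_then_batch = (
    tf.data.Dataset.from_tensor_slices(images)
    .map(parse_single, num_parallel_calls=num_threads)
    .batch(batch_size)
)

# Suggested order: batch first, then map() is invoked once per batch.
batch_then_map = (
    tf.data.Dataset.from_tensor_slices(images)
    .batch(batch_size)
    .map(parse_batched, num_parallel_calls=num_threads)
)
```

Both pipelines should yield batches of shape (batch_size, x, y, z) with the same values. Note that not every function can be rewritten this way: for instance, tf.io.decode_jpeg operates on a single encoded image, so a parse function that decodes JPEG data may need to keep that step per element.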

Looking forward to your reply. Btw, I would be glad to create a PR to fix it if you are too busy.
