Releases: cezbloch/pytorchures

Timing synchronization for accelerators

29 Oct 14:18
54d97f2

Timing is now synchronized after each Module call on any accelerator.

Previously only CUDA was synchronized. Now CPU, CUDA and XPU are all synchronized, so the measured time covers all work launched on the device and profiling timings are correct.
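Why synchronization matters for timing can be sketched with the standard library alone. The example below does not use pytorchures or PyTorch. FakeAsyncDevice, launch and time_layer are hypothetical names, and a background thread stands in for an accelerator that returns from a launch before the work has finished. In real PyTorch code the blocking step would be a call such as torch.cuda.synchronize().

```python
import threading
import time


class FakeAsyncDevice:
    """Hypothetical stand-in for an accelerator: launch() returns immediately."""

    def __init__(self):
        self._pending = []

    def launch(self, seconds):
        # Start the "kernel" in the background and return at once,
        # the way an asynchronous accelerator launch does.
        worker = threading.Thread(target=time.sleep, args=(seconds,))
        worker.start()
        self._pending.append(worker)

    def synchronize(self):
        # Block until all queued work has finished.
        for worker in self._pending:
            worker.join()
        self._pending.clear()


def time_layer(device, seconds, sync):
    """Time one launch, optionally synchronizing before the clock stops."""
    start = time.perf_counter()
    device.launch(seconds)
    if sync:
        device.synchronize()
    elapsed = time.perf_counter() - start
    device.synchronize()  # drain leftover work so runs do not overlap
    return elapsed


device = FakeAsyncDevice()
unsynced = time_layer(device, 0.05, sync=False)
synced = time_layer(device, 0.05, sync=True)
print(f"without sync: {unsynced * 1000:.1f} ms, with sync: {synced * 1000:.1f} ms")
```

Without the synchronize call the clock stops while the work is still running, so the reported time covers only the launch overhead.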

Multirun profiling to dictionary

04 Oct 14:16
2b0b3e9

Choose a tag to compare

Additions:

Models are now wrapped by the TimedLayer constructor.
Profiling data can be saved as a dictionary and dumped to a file as JSON.
Profiling results of multiple runs are stored.
The device on which each layer is located (e.g. cpu or cuda) is stored.
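As a rough illustration of the additions above, profiling data for several runs could be held in a dictionary and written out as JSON. The layout below (layer name mapped to a device and a list of per-run times) and the names profiling, times_ms and mean_times are assumptions made for this sketch, not the actual format produced by pytorchures, and the timing values are made up.

```python
import json
import os
import tempfile

# Hypothetical layout: one entry per layer, holding the device it lives on
# and one wall-time measurement per profiling run (illustrative values).
profiling = {
    "conv1": {"device": "cuda", "times_ms": [1.92, 1.85, 1.88]},
    "fc": {"device": "cpu", "times_ms": [0.41, 0.39, 0.40]},
}


def mean_times(data):
    """Average the per-run timings of every layer."""
    return {
        name: sum(entry["times_ms"]) / len(entry["times_ms"])
        for name, entry in data.items()
    }


# Dump the dictionary to a JSON file, then read it back.
path = os.path.join(tempfile.gettempdir(), "profiling.json")
with open(path, "w") as f:
    json.dump(profiling, f, indent=2)

with open(path) as f:
    restored = json.load(f)

for name, mean in mean_times(restored).items():
    print(f"{name} on {restored[name]['device']}: {mean:.2f} ms on average")
```

Keeping one list of timings per layer makes it straightforward to aggregate across runs after the file is reloaded.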

Basic wall time profiling

27 Sep 11:13
68dba26

Pre-release

Measure the wall time of every model layer and print it to the logs.
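The idea of per-layer wall-time logging can be sketched without PyTorch. TimedCallable below is a hypothetical wrapper written for this example, not the library's own class, and plain Python functions stand in for model layers.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("layer_timing")


class TimedCallable:
    """Hypothetical wrapper that times each call of the wrapped layer."""

    def __init__(self, name, layer):
        self.name = name
        self.layer = layer
        self.last_ms = None

    def __call__(self, *args, **kwargs):
        start = time.perf_counter()
        result = self.layer(*args, **kwargs)
        self.last_ms = (time.perf_counter() - start) * 1000.0
        # Print the measured wall time into the logs.
        logger.info("%s took %.3f ms", self.name, self.last_ms)
        return result


# Two stand-in "layers" applied one after the other.
layers = [
    TimedCallable("double", lambda x: x * 2),
    TimedCallable("add_one", lambda x: x + 1),
]

value = 3
for layer in layers:
    value = layer(value)
```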