This repository collects small projects from my journey learning GCN assembly.
Clone this repository, cd into it, and then run:
mkdir build
cd build
cmake .. -DCMAKE_CXX_COMPILER=hipcc -DCMAKE_C_COMPILER=hipcc -DCMAKE_PREFIX_PATH=/opt/rocm/lib/cmake
make -j

- Set up the host side to launch an assembly kernel, and use one thread to store a specified value from assembly code.
- Set up buffer resource descriptor
- How to set up buffer resource descriptor
- Set the exec mask to make lanes active or inactive
- Implement ReLU and leaky ReLU in both HIP and assembly code
- Demonstrate a parallel reduction that finds the maximum value using multiple GPU threads
- Please refer to my diagram to understand the assembly algorithm if you are new to assembly.
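
For readers who want something to check GPU results against, here is a minimal CPU-side sketch in plain C++ of the ReLU, leaky ReLU, and max-reduction ideas listed above. It is not the HIP or assembly code from this repository. The function names (`relu`, `leaky_relu`, `reduce_max`) and the `slope` parameter are hypothetical, and the reduction only imitates the usual "halve the stride each step" pattern in which lane `i` combines with lane `i + stride`. The actual kernels may organize the work differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Reference ReLU: negative inputs become 0, positive inputs pass through.
inline float relu(float x) { return x > 0.0f ? x : 0.0f; }

// Reference leaky ReLU: negative inputs are scaled by `slope`
// (hypothetical parameter name) instead of being clamped to 0.
inline float leaky_relu(float x, float slope) {
    return x > 0.0f ? x : slope * x;
}

// Tree-style max reduction, imitating what cooperating GPU lanes would do.
// Assumption: the input is non-empty. It is padded to a power of two with
// -infinity so that every "lane" always has a partner to compare against.
inline float reduce_max(std::vector<float> v) {
    assert(!v.empty());
    std::size_t n = 1;
    while (n < v.size()) n <<= 1;
    v.resize(n, -std::numeric_limits<float>::infinity());

    // Each pass halves the number of active lanes; lane i keeps the
    // larger of its own value and the value held by lane i + stride.
    for (std::size_t stride = n / 2; stride > 0; stride /= 2) {
        for (std::size_t i = 0; i < stride; ++i) {
            v[i] = std::max(v[i], v[i + stride]);
        }
    }
    return v[0];
}
```

Running the same inputs through these functions and through the GPU kernels gives a simple way to compare outputs. The inner loop over `i` corresponds to the lanes that stay active at each step, which is the part an exec mask would control on the GPU.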
