Update AMD build doc AND Make rocshmem device-bitcode arch configurable #174

Open
WhatGhost wants to merge 1 commit into ByteDance-Seed:main from WhatGhost:dev-amd-build

@WhatGhost


Summary

I followed the AMD build flow in docs/build.md on 8x AMD Instinct MI350X (gfx950) inside the
rocm/pytorch:rocm7.1_ubuntu24.04_py3.12_pytorch_release_2.7.1 container.
I hit a few errors along the way, and this PR fixes them:

  • shmem/rocshmem_bind/scripts/build_rocshmem_device_bc.sh hard-codes
    BITCODE_LIB_ARCH=gfx942 (MI300X), so the resulting rocshmem device
    bitcode does not fit MI350 (gfx950) or any other non-MI300 GPU.
    This PR makes the arch configurable and keeps gfx942 as the default, so MI300 users are not affected.

  • docs/build.md (AMD section): fix a couple of missing or mistyped steps:
    add a step 0 to detect and export BITCODE_LIB_ARCH=<gfxXXX>; quote
    'hip-python>=7.1' (otherwise the shell parses >=7.1 as a
    redirection); and run pip uninstall -y triton before pip install -e python
    to remove the originally installed triton.
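
The two fixes above can be illustrated with a small bash sketch. It is not the code from this PR: the helper names `resolve_arch` and `first_gfx_arch` are hypothetical, the sample line standing in for `rocminfo` output is illustrative, and the actual script and docs/build.md may detect the arch differently. It only shows the general pattern of an environment override with a gfx942 fallback, and why an unquoted `>=7.1` is treated as a redirection.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the sample text below are hypothetical.

# 1) Configurable arch with a backward-compatible default:
#    use BITCODE_LIB_ARCH if it is set and non-empty, else gfx942 (MI300X).
resolve_arch() {
  echo "${BITCODE_LIB_ARCH:-gfx942}"
}

# 2) Extract the first gfx target from text on stdin.
#    Assumed usage on a ROCm machine: rocminfo | first_gfx_arch
first_gfx_arch() {
  grep -o 'gfx[0-9a-f]*' | head -n1
}

unset BITCODE_LIB_ARCH
echo "default:  $(resolve_arch)"      # falls back to gfx942

sample='  Name:                    gfx950'   # illustrative line, not captured output
export BITCODE_LIB_ARCH="$(printf '%s\n' "$sample" | first_gfx_arch)"
echo "override: $(resolve_arch)"      # now uses the exported value

# 3) Why the requirement must be quoted: unquoted, the shell reads
#    "> =7.1" as a redirection into a file named "=7.1".
demo_dir="$(mktemp -d)"
( cd "$demo_dir" && echo hip-python>=7.1 )     # prints nothing, creates the file "=7.1"
( cd "$demo_dir" && echo 'hip-python>=7.1' )   # prints the full requirement string
ls "$demo_dir"                                 # lists the stray file
```

The same parsing applies to `pip install hip-python>=7.1`: pip would receive only `hip-python`, and its output would go to a file named `=7.1`.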

TEST

  • bash ./scripts/launch_amd.sh ./python/triton_dist/test/amd/test_ag_gemm_intra_node.py 8192 8192 29568
    Result: ✅ all close!
  • TRITON_DIST_SHMEM_BACKEND=mori_shmem bash ./scripts/launch_amd.sh ./python/triton_dist/test/amd/test_ep_a2a.py --check
    Result: RANK[*]: all checks passed.
