Benchmarks - Add deepseek megatron-lm benchmark by yukirora · Pull Request #713 · microsoft/superbenchmark

yukirora · 2025-05-23T11:57:09Z

Description
Add deepseek megatron-lm benchmark.

codecov · 2025-06-26T05:48:06Z

Codecov Report

Attention: Patch coverage is 88.42105% with 22 lines in your changes missing coverage. Please review.

Project coverage is 86.44%. Comparing base (a56356d) to head (55ffb8b).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...bench/benchmarks/model_benchmarks/megatron_gpt3.py | 88.42% | 22 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #713      +/-   ##
==========================================
+ Coverage   86.19%   86.44%   +0.25%     
==========================================
  Files         100      100              
  Lines        7263     7406     +143     
==========================================
+ Hits         6260     6402     +142     
- Misses       1003     1004       +1
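
The percentages in the diff above follow from the Hits and Lines rows. A minimal sketch of that arithmetic, assuming Codecov reports line coverage as hits divided by total lines; the helper name `coverage_percent` is hypothetical, and the 190-line patch size is only inferred from the reported patch coverage and 22 missing lines, not stated in the report:

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Line coverage as a percentage, rounded to two decimals."""
    return round(100.0 * hits / lines, 2)


# Figures copied from the coverage diff (base = main, head = PR #713).
base = coverage_percent(6260, 7263)
head = coverage_percent(6402, 7406)

# Compute the delta from the unrounded ratios, then round once.
delta = round(100.0 * 6402 / 7406 - 100.0 * 6260 / 7263, 2)

# Inference, not a reported figure: 22 missing lines at 88.42105% patch
# coverage implies roughly this many changed lines in the patch.
patch_lines = round(22 / (1 - 0.8842105))

print(base, head, delta, patch_lines)
```

Hits plus Misses equals Lines in both columns (6260 + 1003 = 7263 and 6402 + 1004 = 7406), which is a quick sanity check on the table.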

| Flag | Coverage Δ | |
|---|---|---|
| cpu-python3.10-unit-test | `72.20% <88.42%> (+0.54%)` | ⬆️ |
| cpu-python3.12-unit-test | `72.20% <88.42%> (+0.54%)` | ⬆️ |
| cpu-python3.7-unit-test | `71.79% <88.42%> (+0.54%)` | ⬆️ |
| cuda-unit-test | `83.95% <88.42%> (+0.30%)` | ⬆️ |


Description
Add release note for v0.12.0

# Main Features

## SuperBench Improvement

1. [x] Update Image Build Pipeline (#659)
2. [x] Add support for arm64 build (#660)
3. [x] Upgrade dependency versions in pipeline (#671)
4. [x] Fix installation and lint issues (#684)
5. [x] Update Flake8 repo (#683)
6. [x] Init latest python support. (#687)
7. [x] Add image build on arm64 arch (#690)
8. [x] Enhancement of ignoring errors for import pkg_resources (#692)
9. [x] Update label in the ROCm image build (#693)
10. [x] Support cuda12.8 for Blackwell arch (#682)
11. [x] Merge multi-arch image (#696)
12. [x] Update OS of runner to the latest. (#702)
13. [x] cuda arch flag for cublaslt (#701)

## Micro-benchmark Improvement

1. [x] Bug Fix - Fix numa error on grace cpu in gpu-copy (#658)
2. [x] Dependency - Bump onnxruntime-gpu version from 1.10.0 to 1.12.0 (#663)
3. [x] Benchmarks: micro benchmarks - add general CPU bandwidth and latency benchmark (#662)
4. [x] Benchmarks: micro benchmarks - add nvbandwidth build and benchmark (#665 and #669)
5. [x] Fix stderr message in gpu-copy benchmark (#673)
6. [x] Add arch support for 10.0 in gemm-flops (#680)
7. [x] Fix tensorrt-inference parsing (#674)
8. [x] nvbandwidth benchmark need to handle N/A value (#675)
9. [x] Avoid Unintended nvbandwidth Function Calls in All Benchmarks (#685)
10. [x] Add GPU Stream Micro Benchmark (#697)
11. [x] Cuda arch flag for cublaslt (#701)
12. [x] Support autotuning in cublaslt gemm (#706)
14. [x] Add FP4 GEMM FLOPS support for cublaslt_gemm benchmark (#711)
15. [x] CPU Stream Benchmark Revise (#712)
16. [x] Add cuda12.9 docker image (#716)
17. [x] Add Grace CPU support for CPU Stream (#719)

## Model Benchmark Improvement

1. [x] Add LLaMA-2 Models (#668)
2. [x] Fix typos in documentation and code files (#686)
3. [x] Add Mixture of Experts Model (#679)
4. [ ] Add DeepSeek Training Benchmark
5. [x] Add DeepSeek Inference Benchmark (AMD GPU) (#713)

## Documentation

1. [x] Update CODEOWNERS (#670)
2. [x] Update CODEOWNERS (#718)

## Result Analysis

1. [x] Enhance logging information for diagnosis rule op baseline errors. (#689)

yukirora and others added 5 commits May 23, 2025 06:09

init

7086eb4

add test

f30d255

fix bug

eaa1811

restore

fd784ae

fix bug

f6b4c46

yukirora requested a review from a team as a code owner May 23, 2025 11:57

polarG requested review from abuccts and guoshzhao June 4, 2025 23:40

abuccts approved these changes Jun 5, 2025

View reviewed changes

polarG self-assigned this Jun 5, 2025

guoshzhao approved these changes Jun 10, 2025

View reviewed changes

polarG and others added 2 commits June 25, 2025 21:36

Merge branch 'main' into yutji/rocm-ds

efbcba3

Fix lint and unittest issues.

55ffb8b

polarG enabled auto-merge (squash) June 26, 2025 05:40

polarG merged commit deef9a3 into main Jun 26, 2025
24 checks passed

polarG deleted the yutji/rocm-ds branch June 26, 2025 23:09

guoshzhao mentioned this pull request Jul 2, 2025

V0.12.0 Release Plan #710

Closed

40 tasks

polarG mentioned this pull request Aug 6, 2025

Docs - Upgrade version and release note #727

Merged

polarG merged 7 commits into main from yutji/rocm-ds

4 participants
