add vector scale a and b in sample code#1163

Open
tsenwang wants to merge 1 commit into ROCm:develop from tsenwang:develop2

Conversation

@tsenwang

Contributor

No description provided.

@tsenwang

Contributor Author

Running tests...
Test project /Workspace/hipBLASLt/build/release
Start 1: sample_hipblaslt_gemm
1/40 Test #1: sample_hipblaslt_gemm ............................... Passed 1.50 sec
Start 2: sample_hipblaslt_gemm_ext
2/40 Test #2: sample_hipblaslt_gemm_ext ........................... Passed 1.11 sec
Start 3: sample_hipblaslt_gemm_batched
3/40 Test #3: sample_hipblaslt_gemm_batched ....................... Passed 1.39 sec
Start 4: sample_hipblaslt_gemm_batched_ext
4/40 Test #4: sample_hipblaslt_gemm_batched_ext ................... Passed 1.83 sec
Start 5: sample_hipblaslt_gemm_tuning_splitk_ext
5/40 Test #5: sample_hipblaslt_gemm_tuning_splitk_ext ............. Passed 13.14 sec
Start 6: sample_hipblaslt_gemm_bias
6/40 Test #6: sample_hipblaslt_gemm_bias .......................... Passed 1.07 sec
Start 7: sample_hipblaslt_gemm_bias_ext
7/40 Test #7: sample_hipblaslt_gemm_bias_ext ...................... Passed 0.89 sec
Start 8: sample_hipblaslt_gemm_get_all_algos
8/40 Test #8: sample_hipblaslt_gemm_get_all_algos ................. Passed 13.67 sec
Start 9: sample_hipblaslt_gemm_get_all_algos_ext
9/40 Test #9: sample_hipblaslt_gemm_get_all_algos_ext ............. Passed 12.01 sec
Start 10: sample_hipblaslt_gemm_get_algo_by_index_ext
10/40 Test #10: sample_hipblaslt_gemm_get_algo_by_index_ext ......... Passed 11.57 sec
Start 11: sample_hipblaslt_gemm_alphavec_ext
11/40 Test #11: sample_hipblaslt_gemm_alphavec_ext .................. Passed 1.06 sec
Start 12: sample_hipblaslt_gemm_gelu_aux_bias
12/40 Test #12: sample_hipblaslt_gemm_gelu_aux_bias ................. Passed 0.85 sec
Start 13: sample_hipblaslt_gemm_gelu_aux_bias_ext
13/40 Test #13: sample_hipblaslt_gemm_gelu_aux_bias_ext ............. Passed 0.67 sec
Start 14: sample_hipblaslt_gemm_amax
14/40 Test #14: sample_hipblaslt_gemm_amax .......................... Passed 0.69 sec
Start 15: sample_hipblaslt_gemm_amax_ext
15/40 Test #15: sample_hipblaslt_gemm_amax_ext ...................... Passed 0.73 sec
Start 16: sample_hipblaslt_gemm_amax_with_scale
16/40 Test #16: sample_hipblaslt_gemm_amax_with_scale ............... Passed 0.71 sec
Start 17: sample_hipblaslt_gemm_amax_with_scale_ext
17/40 Test #17: sample_hipblaslt_gemm_amax_with_scale_ext ........... Passed 0.70 sec
Start 18: sample_hipblaslt_gemm_bgradb
18/40 Test #18: sample_hipblaslt_gemm_bgradb ........................ Passed 0.95 sec
Start 19: sample_hipblaslt_gemm_ext_bgradb
19/40 Test #19: sample_hipblaslt_gemm_ext_bgradb .................... Passed 0.67 sec
Start 20: sample_hipblaslt_gemm_dgelu_bgrad
20/40 Test #20: sample_hipblaslt_gemm_dgelu_bgrad ................... Passed 0.69 sec
Start 21: sample_hipblaslt_gemm_dgelu_bgrad_ext
21/40 Test #21: sample_hipblaslt_gemm_dgelu_bgrad_ext ............... Passed 1.01 sec
Start 22: sample_hipblaslt_gemm_is_tuned_ext
22/40 Test #22: sample_hipblaslt_gemm_is_tuned_ext .................. Passed 0.76 sec
Start 23: sample_hipblaslt_gemm_tuning_wgm_ext
23/40 Test #23: sample_hipblaslt_gemm_tuning_wgm_ext ................ Passed 12.93 sec
Start 24: sample_hipblaslt_gemm_with_scale_a_b
24/40 Test #24: sample_hipblaslt_gemm_with_scale_a_b ................ Passed 12.65 sec
Start 25: sample_hipblaslt_gemm_with_scale_a_b_ext
25/40 Test #25: sample_hipblaslt_gemm_with_scale_a_b_ext ............ Passed 0.84 sec
Start 26: sample_hipblaslt_groupedgemm_ext
26/40 Test #26: sample_hipblaslt_groupedgemm_ext .................... Passed 2.69 sec
Start 27: sample_hipblaslt_groupedgemm_fixed_mk_ext
27/40 Test #27: sample_hipblaslt_groupedgemm_fixed_mk_ext ........... Passed 11.65 sec
Start 28: sample_hipblaslt_groupedgemm_get_all_algos_ext
28/40 Test #28: sample_hipblaslt_groupedgemm_get_all_algos_ext ...... Passed 13.10 sec
Start 29: sample_hipblaslt_gemm_mix_precision
29/40 Test #29: sample_hipblaslt_gemm_mix_precision ................. Passed 0.71 sec
Start 30: sample_hipblaslt_gemm_mix_precision_ext
30/40 Test #30: sample_hipblaslt_gemm_mix_precision_ext ............. Passed 1.99 sec
Start 31: sample_hipblaslt_gemm_mix_precision_with_amax_ext
31/40 Test #31: sample_hipblaslt_gemm_mix_precision_with_amax_ext ... Passed 0.99 sec
Start 32: sample_hipblaslt_gemm_attr_tciA_tciB
32/40 Test #32: sample_hipblaslt_gemm_attr_tciA_tciB ................ Passed 0.65 sec
Start 33: sample_hipblaslt_ext_op_layernorm
33/40 Test #33: sample_hipblaslt_ext_op_layernorm ................... Passed 0.50 sec
Start 34: sample_hipblaslt_gemm_ext_deprecated
34/40 Test #34: sample_hipblaslt_gemm_ext_deprecated ................ Passed 1.07 sec
Start 35: sample_hipblaslt_ext_op_amax
35/40 Test #35: sample_hipblaslt_ext_op_amax ........................ Passed 0.53 sec
Start 36: sample_hipblaslt_ext_op_amax_with_scale
36/40 Test #36: sample_hipblaslt_ext_op_amax_with_scale ............. Passed 0.74 sec
Start 37: sample_hipblaslt_gemm_scale_a_b_vector
37/40 Test #37: sample_hipblaslt_gemm_scale_a_b_vector .............. Passed 1.14 sec
Start 38: sample_hipblaslt_gemm_scale_a_b_vector_ext
38/40 Test #38: sample_hipblaslt_gemm_scale_a_b_vector_ext .......... Passed 1.20 sec
Start 39: hipblaslt-bench-groupedgemm-fixed-mk
39/40 Test #39: hipblaslt-bench-groupedgemm-fixed-mk ................ Passed 236.44 sec
Start 40: hipblaslt-test
40/40 Test #40: hipblaslt-test ...................................... Passed 2602.43 sec

100% tests passed, 0 tests failed out of 40

Total Test time (real) = 2969.91 sec

if(returnedAlgoCount == 0)
{
    std::cerr << "No valid solution found!" << std::endl;
    return;

Collaborator

Please also free the allocated memory and the descriptor before the early return.

Contributor Author

I will create another PR that frees the allocated memory and the descriptor before the early return in every sample.
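
One way to make the requested cleanup automatic is to give each resource an owner whose destructor releases it, so every return path frees it. The following is a minimal host-only sketch, not the sample's actual code: `fake_alloc`, `fake_free`, `DeviceBuffer`, and `run_sample` are hypothetical names, and the stand-in allocator only counts live allocations so the behavior can be observed without a GPU. In a real sample the deleter would call the matching HIP or hipBLASLt release function instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <memory>

// Hypothetical stand-ins for device allocation. They only track how many
// allocations are still live so the cleanup can be checked on the host.
static int g_live_allocations = 0;

void* fake_alloc(std::size_t bytes)
{
    void* p = std::malloc(bytes);
    if(p != nullptr)
        ++g_live_allocations;
    return p;
}

void fake_free(void* p)
{
    if(p != nullptr)
    {
        assert(g_live_allocations > 0);
        std::free(p);
        --g_live_allocations;
    }
}

// Owning handle: the deleter runs whenever the handle leaves scope.
using DeviceBuffer = std::unique_ptr<void, void (*)(void*)>;

// Hypothetical sample body. Returns false on the early-return path.
bool run_sample(int returnedAlgoCount)
{
    DeviceBuffer d_a(fake_alloc(1024), &fake_free);
    DeviceBuffer d_workspace(fake_alloc(4096), &fake_free);

    if(returnedAlgoCount == 0)
    {
        std::cerr << "No valid solution found!" << std::endl;
        return false; // both buffers are released here by their destructors
    }

    // ... the GEMM would be initialized and run here ...
    return true; // and released here as well
}
```

With this shape, neither the early return nor the normal exit needs an explicit free call, which also covers the end-of-function case raised in the next thread.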

CHECK_HIPBLASLT_ERROR(gemm.initialize(heuristicResult[0].algo, d_workspace));
CHECK_HIPBLASLT_ERROR(gemm.run(stream));

std::cout << "Matrix multiplication completed successfully." << std::endl;

Collaborator

free memory

std::vector<float> scale_a(128, 0.5f); // Scale A vector setting
std::vector<float> scale_b(128, 2.0f); // Scale B vector setting

std::cout << "Running with Scale A = [0.5, ..., 0.5], Scale B = [2.0, ..., 2.0]" << std::endl;

Collaborator

It is better to use a different value for each scale A element, since this example demonstrates scale A/B vectors, not scalars.
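
A minimal sketch of what distinct per-element scale values could look like, with a host reference that shows why they matter. Everything here is illustrative: `make_scale_vector` and `scaled_gemm_reference` are hypothetical helpers, the start and step values are arbitrary, and the reference assumes one scale A entry per row of A and one scale B entry per column of B with column-major storage and no transposes. That layout is an assumption for the sketch, not something confirmed in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fills a vector with start, start + step, ...
// so that every element differs from its neighbors.
std::vector<float> make_scale_vector(std::size_t n, float start, float step)
{
    std::vector<float> v(n);
    for(std::size_t i = 0; i < n; ++i)
        v[i] = start + step * static_cast<float>(i);
    return v;
}

// Hypothetical host reference under the assumptions stated above:
//   D[i,j] = scale_a[i] * scale_b[j] * sum_p A[i,p] * B[p,j]
// A is m x k, B is k x n, D is m x n, all column-major.
std::vector<float> scaled_gemm_reference(const std::vector<float>& a,
                                         const std::vector<float>& b,
                                         const std::vector<float>& scale_a,
                                         const std::vector<float>& scale_b,
                                         std::size_t               m,
                                         std::size_t               n,
                                         std::size_t               k)
{
    assert(a.size() == m * k && b.size() == k * n);
    assert(scale_a.size() == m && scale_b.size() == n);

    std::vector<float> d(m * n, 0.0f);
    for(std::size_t j = 0; j < n; ++j)
    {
        for(std::size_t i = 0; i < m; ++i)
        {
            float acc = 0.0f;
            for(std::size_t p = 0; p < k; ++p)
                acc += a[i + p * m] * b[p + j * k];
            d[i + j * m] = scale_a[i] * scale_b[j] * acc;
        }
    }
    return d;
}
```

If every scale entry is 0.5 or 2.0, a kernel that wrongly applies only the first element to the whole matrix still produces the expected output. With values such as `make_scale_vector(128, 0.5f, 0.01f)`, that mistake shows up in a comparison against the reference.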


int main()
{
    // This is an example using hipblaslt extension API: ScaleA & ScaleB & ScaleC

Collaborator

no ScaleC in this example

@tsenwang tsenwang reopened this Nov 27, 2024
@tsenwang

tsenwang commented Dec 4, 2024

Contributor Author

root@banff-cyxtera-s73-1:/Workspace/hipBLASLt/build/release/clients# make test
Running tests...
Test project /Workspace/hipBLASLt/build/release/clients
Start 1: sample_hipblaslt_gemm
1/41 Test #1: sample_hipblaslt_gemm ............................... Passed 1.96 sec
Start 2: sample_hipblaslt_gemm_ext
2/41 Test #2: sample_hipblaslt_gemm_ext ........................... Passed 1.75 sec
Start 3: sample_hipblaslt_gemm_batched
3/41 Test #3: sample_hipblaslt_gemm_batched ....................... Passed 1.81 sec
Start 4: sample_hipblaslt_gemm_batched_ext
4/41 Test #4: sample_hipblaslt_gemm_batched_ext ................... Passed 1.71 sec
Start 5: sample_hipblaslt_gemm_tuning_splitk_ext
5/41 Test #5: sample_hipblaslt_gemm_tuning_splitk_ext ............. Passed 12.43 sec
Start 6: sample_hipblaslt_gemm_bias
6/41 Test #6: sample_hipblaslt_gemm_bias .......................... Passed 1.08 sec
Start 7: sample_hipblaslt_gemm_bias_ext
7/41 Test #7: sample_hipblaslt_gemm_bias_ext ...................... Passed 0.82 sec
Start 8: sample_hipblaslt_gemm_get_all_algos
8/41 Test #8: sample_hipblaslt_gemm_get_all_algos ................. Passed 11.65 sec
Start 9: sample_hipblaslt_gemm_get_all_algos_ext
9/41 Test #9: sample_hipblaslt_gemm_get_all_algos_ext ............. Passed 11.82 sec
Start 10: sample_hipblaslt_gemm_get_algo_by_index_ext
10/41 Test #10: sample_hipblaslt_gemm_get_algo_by_index_ext ......... Passed 10.31 sec
Start 11: sample_hipblaslt_gemm_alphavec_ext
11/41 Test #11: sample_hipblaslt_gemm_alphavec_ext .................. Passed 1.08 sec
Start 12: sample_hipblaslt_gemm_gelu_aux_bias
12/41 Test #12: sample_hipblaslt_gemm_gelu_aux_bias ................. Passed 1.03 sec
Start 13: sample_hipblaslt_gemm_gelu_aux_bias_ext
13/41 Test #13: sample_hipblaslt_gemm_gelu_aux_bias_ext ............. Passed 0.90 sec
Start 14: sample_hipblaslt_gemm_amax
14/41 Test #14: sample_hipblaslt_gemm_amax .......................... Passed 0.64 sec
Start 15: sample_hipblaslt_gemm_amax_ext
15/41 Test #15: sample_hipblaslt_gemm_amax_ext ...................... Passed 0.63 sec
Start 16: sample_hipblaslt_gemm_amax_with_scale
16/41 Test #16: sample_hipblaslt_gemm_amax_with_scale ............... Passed 0.63 sec
Start 17: sample_hipblaslt_gemm_amax_with_scale_ext
17/41 Test #17: sample_hipblaslt_gemm_amax_with_scale_ext ........... Passed 0.63 sec
Start 18: sample_hipblaslt_gemm_bgradb
18/41 Test #18: sample_hipblaslt_gemm_bgradb ........................ Passed 0.57 sec
Start 19: sample_hipblaslt_gemm_ext_bgradb
19/41 Test #19: sample_hipblaslt_gemm_ext_bgradb .................... Passed 0.59 sec
Start 20: sample_hipblaslt_gemm_dgelu_bgrad
20/41 Test #20: sample_hipblaslt_gemm_dgelu_bgrad ................... Passed 0.61 sec
Start 21: sample_hipblaslt_gemm_dgelu_bgrad_ext
21/41 Test #21: sample_hipblaslt_gemm_dgelu_bgrad_ext ............... Passed 0.62 sec
Start 22: sample_hipblaslt_gemm_is_tuned_ext
22/41 Test #22: sample_hipblaslt_gemm_is_tuned_ext .................. Passed 0.71 sec
Start 23: sample_hipblaslt_gemm_tuning_wgm_ext
23/41 Test #23: sample_hipblaslt_gemm_tuning_wgm_ext ................ Passed 12.17 sec
Start 24: sample_hipblaslt_gemm_with_scale_a_b
24/41 Test #24: sample_hipblaslt_gemm_with_scale_a_b ................ Passed 11.75 sec
Start 25: sample_hipblaslt_gemm_with_scale_a_b_ext
25/41 Test #25: sample_hipblaslt_gemm_with_scale_a_b_ext ............ Passed 0.58 sec
Start 26: sample_hipblaslt_groupedgemm_ext
26/41 Test #26: sample_hipblaslt_groupedgemm_ext .................... Passed 1.95 sec
Start 27: sample_hipblaslt_groupedgemm_fixed_mk_ext
27/41 Test #27: sample_hipblaslt_groupedgemm_fixed_mk_ext ........... Passed 13.80 sec
Start 28: sample_hipblaslt_groupedgemm_get_all_algos_ext
28/41 Test #28: sample_hipblaslt_groupedgemm_get_all_algos_ext ...... Passed 12.74 sec
Start 29: sample_hipblaslt_gemm_mix_precision
29/41 Test #29: sample_hipblaslt_gemm_mix_precision ................. Passed 0.63 sec
Start 30: sample_hipblaslt_gemm_mix_precision_ext
30/41 Test #30: sample_hipblaslt_gemm_mix_precision_ext ............. Passed 1.43 sec
Start 31: sample_hipblaslt_gemm_mix_precision_with_amax_ext
31/41 Test #31: sample_hipblaslt_gemm_mix_precision_with_amax_ext ... Passed 1.03 sec
Start 32: sample_hipblaslt_gemm_attr_tciA_tciB
32/41 Test #32: sample_hipblaslt_gemm_attr_tciA_tciB ................ Passed 1.01 sec
Start 33: sample_hipblaslt_ext_op_layernorm
33/41 Test #33: sample_hipblaslt_ext_op_layernorm ................... Passed 0.72 sec
Start 34: sample_hipblaslt_gemm_ext_deprecated
34/41 Test #34: sample_hipblaslt_gemm_ext_deprecated ................ Passed 1.18 sec
Start 35: sample_hipblaslt_ext_op_amax
35/41 Test #35: sample_hipblaslt_ext_op_amax ........................ Passed 0.47 sec
Start 36: sample_hipblaslt_ext_op_amax_with_scale
36/41 Test #36: sample_hipblaslt_ext_op_amax_with_scale ............. Passed 0.70 sec
Start 37: sample_hipblaslt_gemm_with_TF32
37/41 Test #37: sample_hipblaslt_gemm_with_TF32 ..................... Passed 0.58 sec
Start 38: sample_hipblaslt_gemm_scale_a_b_vector
38/41 Test #38: sample_hipblaslt_gemm_scale_a_b_vector .............. Passed 1.01 sec
Start 39: sample_hipblaslt_gemm_scale_a_b_vector_ext
39/41 Test #39: sample_hipblaslt_gemm_scale_a_b_vector_ext .......... Passed 1.01 sec
Start 40: hipblaslt-bench-groupedgemm-fixed-mk
40/41 Test #40: hipblaslt-bench-groupedgemm-fixed-mk ................ Passed 217.69 sec
Start 41: hipblaslt-test
41/41 Test #41: hipblaslt-test ...................................... Passed 890.54 sec

100% tests passed, 0 tests failed out of 41

Total Test time (real) = 1234.98 sec

@vin-huang vin-huang left a comment

Collaborator

LGTM

@eidenyoshida eidenyoshida added the noCI label (Disable testing on supported CI systems: math libraries CI has this feature enabled.) May 15, 2025
@jayhawk-commits

Contributor

Please resolve merge conflicts or close this PR to complete the task of importing PRs from this repo to the monorepo.

assistant-librarian Bot pushed a commit that referenced this pull request Aug 18, 2025
[hipblaslt] Docs: Add new FP6/FP4 data types to reference materials (#1163)

## Motivation

Adds information on the new hipBLASLt data types.

## Technical Details

Update the API reference and data type support docs. Import the header documentation to the data types reference page using Doxygen (required a formatting fix to hipblaslt.h to display properly).

## Test Plan

NA

## Test Result

NA

## Submission Checklist

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Labels

noCI (Disable testing on supported CI systems: math libraries CI has this feature enabled.)

5 participants