6 changes: 3 additions & 3 deletions .bazelrc
Original file line number Diff line number Diff line change
@@ -11,9 +11,9 @@ build:release --compilation_mode=opt
build:release --linkopt=-Wl,--strip-all
build:release --stamp

build:linux/x86_64 --platforms //riva:linux_x86_64
build:linux/aarch64 --platforms //riva:linux_aarch64
build:l4t/aarch64 --platforms //riva:linux_aarch64
build:linux/x86_64 --platforms //nemotronspeech:linux_x86_64
build:linux/aarch64 --platforms //nemotronspeech:linux_aarch64
build:l4t/aarch64 --platforms //nemotronspeech:linux_aarch64

build:asan --strip=never
build:asan --copt=-fsanitize=address
8 changes: 4 additions & 4 deletions .circleci/config.yml
@@ -40,10 +40,10 @@ jobs:
name: "Collect artifacts"
command: |
mkdir /tmp/artifacts;
cp bazel-bin/riva/clients/asr/riva_asr_client /tmp/artifacts/riva_asr_client-linux-amd64;
cp bazel-bin/riva/clients/asr/riva_streaming_asr_client /tmp/artifacts/riva_streaming_asr_client-linux-amd64;
cp bazel-bin/riva/clients/tts/riva_tts_client /tmp/artifacts/riva_tts_client-linux-amd64;
cp bazel-bin/riva/clients/tts/riva_tts_perf_client /tmp/artifacts/riva_tts_perf_client-linux-amd64;
cp bazel-bin/nemotronspeech/clients/asr/nemotron_asr_client /tmp/artifacts/nemotron_asr_client-linux-amd64;
cp bazel-bin/nemotronspeech/clients/asr/nemotron_streaming_asr_client /tmp/artifacts/nemotron_streaming_asr_client-linux-amd64;
cp bazel-bin/nemotronspeech/clients/tts/nemotron_tts_client /tmp/artifacts/nemotron_tts_client-linux-amd64;
cp bazel-bin/nemotronspeech/clients/tts/nemotron_tts_perf_client /tmp/artifacts/nemotron_tts_perf_client-linux-amd64;
- store_artifacts:
path: /tmp/artifacts

62 changes: 31 additions & 31 deletions README.md
@@ -1,42 +1,42 @@
[![CircleCI](https://circleci.com/gh/nvidia-riva/cpp-clients.svg?style=shield)](https://circleci.com/gh/nvidia-riva/cpp-clients) [![License](https://img.shields.io/badge/license-MIT-green)](https://opensource.org/licenses/MIT)
# NVIDIA Riva Clients
# NVIDIA Nemotron Speech Clients

NVIDIA Riva is a GPU-accelerated SDK for building Speech AI applications that are customized for your use case and deliver real-time performance. This repo provides performant client example command-line clients.
NVIDIA Nemotron Speech is a GPU-accelerated SDK for building Speech AI applications that are customized for your use case and deliver real-time performance. This repo provides performant client example command-line clients.

## Features

- **Automatic Speech Recognition (ASR)**
- `riva_streaming_asr_client`
- `riva_asr_client`
- `nemotron_streaming_asr_client`
- `nemotron_asr_client`
- **Speech Synthesis (TTS)**
- `riva_tts_client`
- `riva_tts_perf_client`
- `nemotron_tts_client`
- `nemotron_tts_perf_client`
- **Natural Language Processing (NLP)**
- `riva_nlp_punct`
- `nemotron_nlp_punct`

## Requirements

1. Meet the Quick Start [prerequisites](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html#prerequisites)
2. A NVIDIA Riva Server (Set one up using the [quick start guide](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html#local-deployment-using-quick-start-scripts))
2. A NVIDIA Nemotron Speech Server (Set one up using the [quick start guide](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html#local-deployment-using-quick-start-scripts))
3. Docker (for Docker build)
4. Bazel 5.0.0 (for local build)

## Build

### Docker

To avoid needing to manually build the clients yourself, Riva comes with a ready to use client docker image. This allows you to run the clients through an interactive docker container.
To avoid needing to manually build the clients yourself, Nemotron Speech comes with a ready to use client docker image. This allows you to run the clients through an interactive docker container.

The clients will need access to a Riva Server. If your server is running locally all you need to do is allow the client container access to your local network.
The clients will need access to a Nemotron Speech Server. If your server is running locally all you need to do is allow the client container access to your local network.
If your server is not running locally, all clients come with a command line option `--riva_uri`. This defaults to `localhost:50051`, which is also the default server configuration. As the server is not local, run the client using `--riva_uri [IP]:[PORT]` with your configuration.

To build the docker image simply run
```
DOCKER_BUILDKIT=1 docker build . --tag riva-client
DOCKER_BUILDKIT=1 docker build . --tag nemotronspeech-client
```
To start an interactive docker container, with access to your local network, you can then run
```
docker run -it --net=host riva-client
docker run -it --net=host nemotronspeech-client
```

Then you can run the clients as command line programs
@@ -57,65 +57,65 @@ bazel build ...

To build a specific client, you can run:
```
bazel build //riva/clients/[asr/tts/nlp]:[CLIENT_NAME]
bazel build //nemotronspeech/clients/[asr/tts/nlp]:[CLIENT_NAME]
```

For example, to build the `riva_streaming_asr_client` you would run:
For example, to build the `nemotron_streaming_asr_client` you would run:
```
bazel build //riva/clients/asr:riva_streaming_asr_client
bazel build //nemotronspeech/clients/asr:nemotron_streaming_asr_client
```

You can find the built binaries in `bazel-bin/riva/clients`
You can find the built binaries in `bazel-bin/nemotronspeech/clients`

## Usage

### Speech Recognition (ASR) Clients
Riva comes with 2 ASR clients:
1. `riva_asr_client` for offline usage. Using this client, the server will wait until it receives the full audio file before transcribing it and sending it back to the client.
2. `riva_streaming_asr_client` for online usage. Using this client, the server will start transcribing after it receives a sufficient amount of audio data, "streaming" intermediate transcripts as it goes on back to the client. By default, it is set to transcribe after every `100ms`, this can be changed using the `--chunk_duration_ms` command line flag.
Nemotron Speech comes with 2 ASR clients:
1. `nemotron_asr_client` for offline usage. Using this client, the server will wait until it receives the full audio file before transcribing it and sending it back to the client.
2. `nemotron_streaming_asr_client` for online usage. Using this client, the server will start transcribing after it receives a sufficient amount of audio data, "streaming" intermediate transcripts as it goes on back to the client. By default, it is set to transcribe after every `100ms`, this can be changed using the `--chunk_duration_ms` command line flag.
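
As a rough illustration of what `--chunk_duration_ms` implies for the amount of audio sent per chunk, the sketch below converts a chunk duration into samples and bytes. The helper names (`ChunkSamples`, `ChunkBytes`) are hypothetical and do not come from the client sources, and the 16 kHz, 16-bit, mono values are assumptions chosen only for the arithmetic.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; they are not part of the clients.

// Samples per channel contained in one chunk of the given duration.
constexpr std::int64_t ChunkSamples(std::int64_t chunk_duration_ms,
                                    std::int64_t sample_rate_hz)
{
  return chunk_duration_ms * sample_rate_hz / 1000;
}

// Raw PCM bytes in one chunk (samples * bytes per sample * channels).
constexpr std::int64_t ChunkBytes(std::int64_t chunk_duration_ms,
                                  std::int64_t sample_rate_hz,
                                  std::int64_t bytes_per_sample,
                                  std::int64_t channels)
{
  return ChunkSamples(chunk_duration_ms, sample_rate_hz) * bytes_per_sample *
         channels;
}

// Assumed example: the default 100 ms chunk at 16 kHz, 16-bit, mono.
static_assert(ChunkSamples(100, 16000) == 1600, "100 ms at 16 kHz");
static_assert(ChunkBytes(100, 16000, 2, 1) == 3200, "16-bit mono PCM");
```

Under these assumptions, halving the chunk duration halves the bytes per chunk and doubles the number of chunks sent for the same audio.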

To use the clients, simply pass in a folder containing audio files or an individual audio file name with the `audio_file` flag:
```
$ riva_streaming_asr_client --audio_file individual_audio_file.wav
$ nemotron_streaming_asr_client --audio_file individual_audio_file.wav
```
or
```
$ riva_asr_client --audio_file audio_folder
$ nemotron_asr_client --audio_file audio_folder
```

Note that only single-channel audio files in the `.wav` format are currently supported.
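
One way to check a file before passing it to the clients is to inspect its header. The sketch below is a minimal check, assuming a canonical RIFF/WAVE layout where the `fmt ` chunk immediately follows the 12-byte RIFF header, so the channel count sits at byte offset 22. `LooksLikeMonoWav` is a hypothetical helper, not the reader used by the clients, and files with extra chunks before `fmt ` would be rejected by it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper for illustration; not taken from the client sources.
// Returns true if the buffer starts with a canonical WAV header that
// declares exactly one channel.
inline bool LooksLikeMonoWav(const std::vector<std::uint8_t>& bytes)
{
  // Need at least the RIFF header plus the fmt fields up to the channel count.
  if (bytes.size() < 24) {
    return false;
  }
  if (std::memcmp(bytes.data(), "RIFF", 4) != 0) {
    return false;
  }
  if (std::memcmp(bytes.data() + 8, "WAVE", 4) != 0) {
    return false;
  }
  // Assumption: the "fmt " chunk comes first, as in the canonical layout.
  if (std::memcmp(bytes.data() + 12, "fmt ", 4) != 0) {
    return false;
  }
  // The channel count is a little-endian 16-bit value at offset 22.
  const unsigned channels = bytes[22] | (bytes[23] << 8);
  return channels == 1;
}
```

Reading the first 44 bytes of a file into a `std::vector<std::uint8_t>` and passing it to this function is enough for the check; a stereo file would return `false`.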

Other options and information can be found by running the clients with `-help`

### Speech Synthesis (TTS) Client
Riva comes with 2 TTS clients:
1. `riva_tts_client`
2. `riva_tts_perf_client`
Nemotron Speech comes with 2 TTS clients:
1. `nemotron_tts_client`
2. `nemotron_tts_perf_client`

Both clients support an `online` flag, which is similar to the `streaming` ASR client. Enabling the flag will stream the audio back to the client as soon as it is generated on the server, otherwise will send the entire batch at once.

Language can also be specified using a BCP-47 language tag, which is default to `en-US`

To use the `riva_tts_client` simply run the client passing in text with the `--text` flag:
To use the `nemotron_tts_client` simply run the client passing in text with the `--text` flag:
```
$ riva_tts_client --text="Text to be synthesized"
$ nemotron_tts_client --text="Text to be synthesized"
```

The `riva_tts_perf_client` performs the same as the `riva_tts_client` however provides additional information about latency and throughput. Run the client passing in a file containing the text input using the `--text_file` flag.
The `nemotron_tts_perf_client` performs the same as the `nemotron_tts_client` however provides additional information about latency and throughput. Run the client passing in a file containing the text input using the `--text_file` flag.
```
$ riva_tts_perf_client --text_file=/text_files/input.txt
$ nemotron_tts_perf_client --text_file=/text_files/input.txt
```

Other options and information can be found by running the clients with `-help`

### NLP Client

Riva comes with `riva_nlp_punct` NLP client for Punctuation. The `examples` folder contains example queries to test out the API.
Nemotron Speech comes with `nemotron_nlp_punct` NLP client for Punctuation. The `examples` folder contains example queries to test out the API.

To run the Punctuation client, simply pass in a text file containing queries using the `--queries` flag

```
$ riva_nlp_punct --queries=examples/punctuation_queries.txt
$ nemotron_nlp_punct --queries=examples/punctuation_queries.txt
Done sending 3 requests
1: Punct text: Do you have any red Nvidia shirts?
0: Punct text: Add punctuation to this sentence.
@@ -126,7 +126,7 @@ Other options and information can be found by running the client with `-help`

## Documentation

Additional documentation on the Riva Speech Skills SDK can be found [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/).
Additional documentation on the Nemotron Speech Skills SDK can be found [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/).


## License
6 changes: 3 additions & 3 deletions WORKSPACE
@@ -1,4 +1,4 @@
workspace(name = "com_nvidia_riva_api")
workspace(name = "com_nvidia_nemotronspeech_api")
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository", "new_git_repository")

@@ -76,8 +76,8 @@ grpc_extra_deps()

git_repository(
name = "nvriva_common",
remote = "https://github.com/nvidia-riva/common.git",
commit = "71df98266725320a6b6b3a9f32a6da832dc93691"
remote = "https://gitlab-master.nvidia.com/sarane/common.git",
commit = "d7276290d1031e69145140015542c6bd4c13f09a"
)

http_archive(
File renamed without changes.
46 changes: 23 additions & 23 deletions riva/clients/asr/BUILD → nemotronspeech/clients/asr/BUILD
@@ -13,7 +13,7 @@ package(

cc_library(
name = "asr_client_helper",
srcs = ["riva_asr_client_helper.h", "riva_asr_client_helper.cc"],
srcs = ["nemotron_asr_client_helper.h", "nemotron_asr_client_helper.cc"],
deps = select({
"@platforms//cpu:aarch64": [
"@alsa_aarch64//:libasound"
@@ -23,7 +23,7 @@ cc_library(
],
}) + [
"@com_github_grpc_grpc//:grpc++",
"@nvriva_common//riva/proto:riva_grpc_asr",
"@nvriva_common//nemotronspeech/proto:nemotron_grpc_asr",
],
)

@@ -40,9 +40,9 @@ cc_library(
}) + [
":asr_client_helper",
"@com_github_grpc_grpc//:grpc++",
"@nvriva_common//riva/proto:riva_grpc_asr",
"@nvriva_common//nemotronspeech/proto:nemotron_grpc_asr",
"@glog//:glog",
"//riva/utils/wav:reader",
"//nemotronspeech/utils/wav:reader",
],
)

@@ -54,8 +54,8 @@ cc_library(
deps = [
":asr_client_helper",
":client_call",
"//riva/utils/wav:reader",
"//riva/utils/opus",
"//nemotronspeech/utils/wav:reader",
"//nemotronspeech/utils/opus",
"@glog//:glog",
] + select({
"@platforms//cpu:aarch64": [
@@ -65,43 +65,43 @@ cc_library(
"@alsa//:libasound"
],
}) + [
"//riva/utils:stamping",
"//nemotronspeech/utils:stamping",
"@com_github_grpc_grpc//:grpc++",
"@com_github_gflags_gflags//:gflags",
"@nvriva_common//riva/proto:riva_grpc_asr",
"//riva/utils:thread_pool",
"@nvriva_common//nemotronspeech/proto:nemotron_grpc_asr",
"//nemotronspeech/utils:thread_pool",
],
)

cc_binary(
name = "riva_asr_client",
srcs = ["riva_asr_client.cc"],
name = "nemotron_asr_client",
srcs = ["nemotron_asr_client.cc"],
deps = [
":asr_client_helper",
":client_call",
"@nvriva_common//riva/proto:riva_grpc_asr",
"//riva/utils:stamping",
"//riva/utils/files:files",
"//riva/utils/wav:reader",
"@nvriva_common//nemotronspeech/proto:nemotron_grpc_asr",
"//nemotronspeech/utils:stamping",
"//nemotronspeech/utils/files:files",
"//nemotronspeech/utils/wav:reader",
"@glog//:glog",
"@com_github_grpc_grpc//:grpc++",
"@com_github_gflags_gflags//:gflags",
"//riva/clients/utils:grpc",
"//nemotronspeech/clients/utils:grpc",
]
)

cc_binary(
name = "riva_streaming_asr_client",
srcs = ["riva_streaming_asr_client.cc"],
name = "nemotron_streaming_asr_client",
srcs = ["nemotron_streaming_asr_client.cc"],
deps = [
":asr_client_helper",
":client_call",
":streaming_recognize_client",
"@nvriva_common//riva/proto:riva_grpc_asr",
"//riva/utils/files:files",
"//riva/utils/wav:reader",
"@nvriva_common//nemotronspeech/proto:nemotron_grpc_asr",
"//nemotronspeech/utils/files:files",
"//nemotronspeech/utils/wav:reader",
"@glog//:glog",
"//riva/clients/utils:grpc",
"//nemotronspeech/clients/utils:grpc",
] + select({
"@platforms//cpu:aarch64": [
"@alsa_aarch64//:libasound"
@@ -122,7 +122,7 @@ cc_test(
deps = [
":streaming_recognize_client",
"@googletest//:gtest_main",
"@nvriva_common//riva/proto:riva_grpc_asr",
"@nvriva_common//nemotronspeech/proto:nemotron_grpc_asr",
],
tags = ["needs_alsa"]
)
Original file line number Diff line number Diff line change
@@ -24,9 +24,9 @@
#include <string>
#include <thread>

#include "riva/proto/riva_asr.grpc.pb.h"
#include "riva/utils/wav/wav_reader.h"
#include "riva_asr_client_helper.h"
#include "nemotronspeech/proto/nemotron_asr.grpc.pb.h"
#include "nemotronspeech/utils/wav/wav_reader.h"
#include "nemotron_asr_client_helper.h"

using grpc::Status;
using grpc::StatusCode;
Original file line number Diff line number Diff line change
@@ -22,12 +22,12 @@
#include <string>
#include <thread>

#include "riva/clients/utils/grpc.h"
#include "riva/proto/riva_asr.grpc.pb.h"
#include "riva/utils/files/files.h"
#include "riva/utils/stamping.h"
#include "riva/utils/wav/wav_reader.h"
#include "riva_asr_client_helper.h"
#include "nemotronspeech/clients/utils/grpc.h"
#include "nemotronspeech/proto/nemotron_asr.grpc.pb.h"
#include "nemotronspeech/utils/files/files.h"
#include "nemotronspeech/utils/stamping.h"
#include "nemotronspeech/utils/wav/wav_reader.h"
#include "nemotron_asr_client_helper.h"

using grpc::Status;
using grpc::StatusCode;
@@ -448,7 +448,7 @@ main(int argc, char** argv)
FLAGS_logtostderr = 1;

std::stringstream str_usage;
str_usage << "Usage: riva_asr_client " << std::endl;
str_usage << "Usage: nemotron_asr_client " << std::endl;
str_usage << " --audio_file=<filename or folder> " << std::endl;
str_usage << " --automatic_punctuation=<true|false>" << std::endl;
str_usage << " --max_alternatives=<integer>" << std::endl;
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@
*/


#include "riva_asr_client_helper.h"
#include "nemotron_asr_client_helper.h"

#include <regex>

Original file line number Diff line number Diff line change
@@ -15,7 +15,7 @@

#include "absl/strings/str_replace.h"
#include "absl/strings/str_split.h"
#include "riva/proto/riva_asr.grpc.pb.h"
#include "nemotronspeech/proto/nemotron_asr.grpc.pb.h"

namespace nr = nvidia::riva;
namespace nr_asr = nvidia::riva::asr;
Original file line number Diff line number Diff line change
@@ -25,12 +25,12 @@
#include <thread>

#include "client_call.h"
#include "riva/clients/utils/grpc.h"
#include "riva/proto/riva_asr.grpc.pb.h"
#include "riva/utils/files/files.h"
#include "riva/utils/stamping.h"
#include "riva/utils/wav/wav_reader.h"
#include "riva_asr_client_helper.h"
#include "nemotronspeech/clients/utils/grpc.h"
#include "nemotronspeech/proto/nemotron_asr.grpc.pb.h"
#include "nemotronspeech/utils/files/files.h"
#include "nemotronspeech/utils/stamping.h"
#include "nemotronspeech/utils/wav/wav_reader.h"
#include "nemotron_asr_client_helper.h"
#include "streaming_recognize_client.h"

using grpc::Status;
@@ -122,7 +122,7 @@ main(int argc, char** argv)
FLAGS_logtostderr = 1;

std::stringstream str_usage;
str_usage << "Usage: riva_streaming_asr_client " << std::endl;
str_usage << "Usage: nemotron_streaming_asr_client " << std::endl;
str_usage << " --audio_file=<filename or folder> " << std::endl;
str_usage << " --audio_device=<device_id (such as hw:5,0)> " << std::endl;
str_usage << " --automatic_punctuation=<true|false>" << std::endl;
Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@

#include "streaming_recognize_client.h"

#include "riva/utils/opus/opus_client_decoder.h"
#include "nemotronspeech/utils/opus/opus_client_decoder.h"

#define clear_screen() printf("\033[H\033[J")
#define gotoxy(x, y) printf("\033[%d;%dH", (y), (x))