Telia ACE Audio Stream Forwarding API

This repository contains the Telia ACE Audio Stream Forwarding API gRPC service definition and a simulator of the sender (client). An example receiver (server) is also provided.

The primary purpose is to act as a resource for the consumers of the API, aiding them with the implementation and validation of their custom receiver (server) implementations.

Introduction

Telia ACE Audio Stream Forwarding API provides customers with the capability to receive copies of the audio streams involved in voice calls, enabling a wide variety of use cases. As an example, with the rise of generative AI, customers can use it to apply their own AI capabilities on the voice call audio streams for transcription, summarization, intent analysis, and more. Other possible use cases include fraud detection and custom recording solutions.

API Overview

Telia ACE Audio Stream Forwarding API is based on gRPC over HTTP/2 with enforced transport encryption using TLS. The gRPC service definition and the related RPC requests, responses, and messages are specified as Protocol Buffers version 3 (proto3) in ace.audio_stream_forwarding.v1.

Due to the sensitive nature of the data, authentication requires Mutual TLS (mTLS) with client certificates, where both parties mutually authenticate each other before any sensitive data is transmitted.

The receiving end on the customer’s side (receiver/server) must implement a gRPC service capable of bidirectional gRPC streaming RPC messages according to the specification. The sending end in Telia ACE (sender/client) connects to an endpoint provided by the customer and establishes secure, encrypted, and mutually authenticated TLS transports that carry gRPC channels.

graph LR
    A(Telia ACE) -- gRPC-encapsulated audio streams<br><br>gRPC/mTLS --> B(Customer Receiver)

API Flow

  1. Connection Establishment: The client (sender) establishes a secure, mutually authenticated TLS connection to the server (receiver).

  2. Stream Initialization: The client initiates a bidirectional gRPC streaming RPC (AudioStreamForward) and sends an initialization request (InitRequest) containing ACE contact data and audio configuration details.

  3. Initialization Acknowledgment: The client waits up to five (5) seconds for the server to indicate whether it wants to proceed with the audio stream. The server must respond with an InitResponse containing a status field (InitStatus) that tells the client whether it should start streaming audio (OK) or not (REJECTED). The server can use the ACE contact data provided in InitRequest to help decide.

  4. Audio Streaming: If the server's InitStatus is OK, the client streams audio data in chunks using AudioDataRequest to the server over the established stream. The server neither needs to, nor can, acknowledge or respond to an AudioDataRequest.

    The server may process, store, or analyze the audio data as needed. If the status is REJECTED, the client must not send audio data and will terminate the stream.

  5. Stream Termination: Either party can close the stream when the session is complete or if an error occurs. The client will close the connection gracefully when the session is complete. The server should close the connection gracefully should it wish to abort before completion. Non-graceful connection terminations will be considered as a network or service failure.
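
The accept-or-reject decision in step 3 can be sketched as a small pure function. This is only an illustration: the InitStatus enum below is a local stand-in rather than the type generated from the proto, and the "queue" contact data key and the allow-list are hypothetical. Whatever logic is used has to finish well within the five-second window.

```rust
use std::collections::HashMap;

/// Local stand-in for the InitStatus values named in the flow above.
/// A real receiver would use the type generated from the proto definition.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum InitStatus {
    Ok,
    Rejected,
}

/// Decide whether to accept a stream based on ACE contact data.
/// The "queue" key and the allow-list are hypothetical examples; use the
/// contact data that your own ACE setup actually provides.
pub fn decide_init(
    contact_data: &HashMap<String, String>,
    allowed_queues: &[&str],
) -> InitStatus {
    match contact_data.get("queue") {
        // Accept only when the key is present and its value is allowed.
        Some(queue) if allowed_queues.contains(&queue.as_str()) => InitStatus::Ok,
        // Missing key or unknown value: reject the stream.
        _ => InitStatus::Rejected,
    }
}
```

With contact data containing queue=support, calling decide_init with the allow-list ["support", "sales"] yields InitStatus::Ok, while an empty map yields InitStatus::Rejected.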

Sequence Diagram

sequenceDiagram
    participant Client as Sender
    participant Server as Receiver
    Client->>Server: Establish TLS connection (mTLS)
    Client->>Server: Start bi-directional gRPC stream:<br>AudioStreamForward
    Client->>Server: InitRequest
    Note over Client,Server: Five (5) seconds timeout window for the<br>server to decide to proceed or reject
    Server-->>Client: InitResponse (InitStatus: OK/REJECTED)
    alt InitStatus == OK
        loop Audio Streaming
            Client->>Server: AudioDataRequest (audio chunk)
            opt Status Updates
                Client->>Server: StatusRequest (STATUS_PAUSED)
                Note over Client: Client may pause streaming
                Client->>Server: StatusRequest (STATUS_ACTIVE)
                Note over Client: Client resumes streaming
            end
            Note over Client,Server: Either party may terminate the connection<br>gracefully at any time
        end
        alt Successful Completion
            Client->>Server: StatusRequest (STATUS_COMPLETED)
        else Failure
            Client->>Server: StatusRequest (STATUS_FAILED)
        end
    else InitStatus == REJECTED
        Client-->>Server: Terminate stream
    end
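
On the receiver side, the status updates in the diagram can be tracked with a small state holder. The following is a minimal sketch, not the example server's implementation: the Status enum is a local stand-in for the generated proto type, StreamTracker is a hypothetical helper, and ignoring messages that arrive after a terminal status is an assumption made here.

```rust
/// Local stand-in for the status values shown in the sequence diagram.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Status {
    Active,    // STATUS_ACTIVE
    Paused,    // STATUS_PAUSED
    Completed, // STATUS_COMPLETED
    Failed,    // STATUS_FAILED
}

/// Tracks where a single forwarded stream is in its lifecycle.
#[derive(Debug)]
pub struct StreamTracker {
    status: Status,
    chunks_received: u64,
}

impl StreamTracker {
    /// Create a tracker for a stream that the receiver has answered with OK.
    pub fn new() -> Self {
        StreamTracker { status: Status::Active, chunks_received: 0 }
    }

    /// Apply a status update. Returns false if the stream had already
    /// finished, in which case the update is ignored (an assumption).
    pub fn on_status(&mut self, new_status: Status) -> bool {
        if self.is_finished() {
            return false;
        }
        self.status = new_status;
        true
    }

    /// Count an audio chunk. Returns false if the stream had already finished.
    pub fn on_audio_chunk(&mut self) -> bool {
        if self.is_finished() {
            return false;
        }
        self.chunks_received += 1;
        true
    }

    /// True once STATUS_COMPLETED or STATUS_FAILED has been seen.
    pub fn is_finished(&self) -> bool {
        matches!(self.status, Status::Completed | Status::Failed)
    }

    pub fn status(&self) -> Status {
        self.status
    }

    pub fn chunks_received(&self) -> u64 {
        self.chunks_received
    }
}
```

A receiver would call on_audio_chunk for every AudioDataRequest and on_status for every StatusRequest, and could use is_finished to decide when to flush and close its own resources.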

Audio Data

ACE Audio Stream Forwarding API currently supports what is often called Linear16. More precisely, it is signed 16-bit little endian integer PCM, or pcm_s16le. This is indicated by the LINEAR16 value in AudioEncoding, a part of AudioConfig, provided in InitRequest.

The two channels in the audio data are sample-wise interleaved, with the first two bytes belonging to the first/left channel, next two to the second/right, etc.

The channel order is defined by the order of the Participants in InitRequest.
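
As an illustration of the layout described above, the sketch below splits interleaved two-channel pcm_s16le bytes into one sample vector per channel. The function name is hypothetical, and it assumes each buffer holds whole 4-byte frames; any trailing partial frame is ignored.

```rust
/// Split interleaved two-channel pcm_s16le bytes into per-channel samples.
/// Returns (first channel, second channel). Trailing bytes that do not
/// form a whole 4-byte frame are ignored.
pub fn deinterleave_stereo_s16le(data: &[u8]) -> (Vec<i16>, Vec<i16>) {
    let frames = data.len() / 4;
    let mut first = Vec::with_capacity(frames);
    let mut second = Vec::with_capacity(frames);
    for frame in data.chunks_exact(4) {
        // Bytes 0..2 belong to the first/left channel, little endian.
        first.push(i16::from_le_bytes([frame[0], frame[1]]));
        // Bytes 2..4 belong to the second/right channel, little endian.
        second.push(i16::from_le_bytes([frame[2], frame[3]]));
    }
    (first, second)
}
```

Which participant each returned vector belongs to follows the order of the Participants in InitRequest.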

The audio data originating from the ACE service will currently have a sample rate of either 8 kHz or 16 kHz. The sample rate is indicated by the sample_rate_hertz field in AudioConfig, provided in InitRequest. It is recommended to use these two specific sample rates in testing your implementation with the help of the simulator.
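
Given the sample rate and the two-channel 16-bit layout, the playback duration of a chunk follows directly from its size in bytes. The sketch below shows the arithmetic; the function name is hypothetical and the sample rate is assumed to be the value of sample_rate_hertz from AudioConfig.

```rust
use std::time::Duration;

const CHANNELS: u64 = 2; // two interleaved channels
const BYTES_PER_SAMPLE: u64 = 2; // 16-bit samples

/// Duration of audio represented by `byte_len` bytes of two-channel
/// pcm_s16le at the given sample rate. Returns None for a zero sample rate.
pub fn chunk_duration(byte_len: usize, sample_rate_hertz: u32) -> Option<Duration> {
    if sample_rate_hertz == 0 {
        return None;
    }
    let rate = sample_rate_hertz as u64;
    // One frame holds one sample per channel.
    let frames = byte_len as u64 / (CHANNELS * BYTES_PER_SAMPLE);
    let secs = frames / rate;
    // Convert the remaining frames to nanoseconds; always below one second.
    let nanos = (frames % rate) * 1_000_000_000 / rate;
    Some(Duration::new(secs, nanos as u32))
}
```

For example, 3200 bytes at 8 kHz are 800 frames, which is 100 ms of audio, and 64000 bytes at 16 kHz are exactly one second.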

The Simulator

Important

Use for evaluation and development purposes only. For support and production integration, reach out to your technical contact at Telia ACE.

Simulator Audio File and Audio Data Verification

The simulator expects a WAV-file containing two channels of pcm_s16le.
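
To check whether a WAV file matches this expectation before passing it to the simulator, the "fmt " chunk of the file can be inspected. The sketch below is a hypothetical helper, not part of the simulator: it walks the RIFF chunks using only the standard library and reads the basic format fields. Files using the extensible WAV format code would be reported as not matching by this simple check.

```rust
/// The fields of a WAV "fmt " chunk that matter for the simulator's input.
#[derive(Debug, PartialEq, Eq)]
pub struct WavFormat {
    pub audio_format: u16, // 1 = integer PCM
    pub channels: u16,
    pub sample_rate: u32,
    pub bits_per_sample: u16,
}

/// Find and decode the "fmt " chunk of a RIFF/WAVE file held in memory.
/// Returns None if the data is not a WAV file or the chunk is missing.
pub fn read_wav_format(bytes: &[u8]) -> Option<WavFormat> {
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return None;
    }
    let mut pos = 12;
    while pos + 8 <= bytes.len() {
        let id = &bytes[pos..pos + 4];
        let size = u32::from_le_bytes([
            bytes[pos + 4], bytes[pos + 5], bytes[pos + 6], bytes[pos + 7],
        ]) as usize;
        let body = pos + 8;
        if id == b"fmt " {
            // The basic format fields occupy the first 16 bytes of the chunk.
            let f = bytes.get(body..body + 16)?;
            return Some(WavFormat {
                audio_format: u16::from_le_bytes([f[0], f[1]]),
                channels: u16::from_le_bytes([f[2], f[3]]),
                sample_rate: u32::from_le_bytes([f[4], f[5], f[6], f[7]]),
                bits_per_sample: u16::from_le_bytes([f[14], f[15]]),
            });
        }
        // Skip this chunk; chunk bodies are padded to an even length.
        pos = body + size + (size & 1);
    }
    None
}

/// True if the format is two channels of 16-bit integer PCM.
pub fn is_stereo_s16le(fmt: &WavFormat) -> bool {
    fmt.audio_format == 1 && fmt.channels == 2 && fmt.bits_per_sample == 16
}
```

The file contents can be loaded with std::fs::read and passed to read_wav_format; is_stereo_s16le then reports whether the format matches.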

A sample audio file is included in data/audio-sample.wav.

See the README inside of the data directory for help with verifying the audio data handling.

Preparing TLS Certificates for the Simulator

You can use the following to create a certificate authority and to sign client and server certificates with their respective keys using openssl, for use with the simulator for development purposes.

The provided configuration supports server certificate verification for the DNS name localhost, IP address 127.0.0.1, and a DNS name of ace-audio-stream-forwarding-example-server.

Warning

In this configuration used for development, the same certificate authority (CA) is used to issue both client and server certificates. For a production setup that is neither required nor recommended.

# Check that `openssl` is available
which openssl > /dev/null 2>&1 || (echo "openssl is not installed. Please install it to generate certificates." && exit 1)

# Ensure the data/tls directory exists
mkdir -p data/tls

# CA
openssl genpkey -algorithm ed25519 -out data/tls/ca.key
openssl req -x509 -new -key data/tls/ca.key -out data/tls/ca.crt -days 3650 -subj "/CN=ACE Audio Stream Forwarding Simulator CA"   

# Client
openssl genpkey -algorithm ed25519 -out data/tls/client.key
echo "subjectAltName=DNS:ace-audio-stream-forwarding-simulator" > data/tls/client-ext.cnf
openssl req -new -key data/tls/client.key -out data/tls/client.csr -subj "/CN=ACE Audio Stream Forwarding Simulator" -config data/tls/client-ext.cnf
openssl x509 -req -in data/tls/client.csr -CA data/tls/ca.crt -CAkey data/tls/ca.key -CAcreateserial -out data/tls/client.crt -days 365 \
    -extfile data/tls/client-ext.cnf

# Server
openssl genpkey -algorithm ed25519 -out data/tls/server.key
echo "subjectAltName=DNS:localhost,IP:127.0.0.1,DNS:ace-audio-stream-forwarding-example-server,DNS:host.docker.internal" > data/tls/server-ext.cnf
openssl req -new -key data/tls/server.key -out data/tls/server.csr -subj "/CN=ACE Audio Stream Forwarding Example Server" \
    -config data/tls/server-ext.cnf
openssl x509 -req -in data/tls/server.csr -CA data/tls/ca.crt -CAkey data/tls/ca.key -CAcreateserial -out data/tls/server.crt -days 365 \
    -extfile data/tls/server-ext.cnf

Using the Simulator

Run the simulator with the --help flag to print the available options:

$ ace-audio-stream-forwarding-simulator --help

Usage: ace-audio-stream-forwarding-simulator [OPTIONS] --cert <PATH> --key <PATH> <--server-ca-cert <PATH>|--system-trust-roots> <AUDIO_FILE> <DESTINATION>

<output redacted for brevity>

It requires the following options and arguments:

--cert: path to the client certificate for mutual TLS authentication

--key: path to the key file matching the client certificate for mutual TLS authentication

--server-ca-cert: path to the certificate to use to verify the server certificate
or
--system-trust-roots: this enables the use of the system trust roots instead

and two positional arguments: the path to the audio file and the destination URL.

Additional options:

--contact-id: sets the ACE contact id, unique within a tenant

-c/--contact-data: sets an ACE contact data key-value pair; can be used multiple times

--agent-name <AGENT_NAME>: sets the agent name in the agent participant

$ ace-audio-stream-forwarding-simulator \
    --server-ca-cert data/tls/ca.crt \
    --cert data/tls/client.crt \
    --key data/tls/client.key \
    --contact-id 12345 \
    --contact-data A=B --contact-data C=D \
    -- \
    data/audio-sample.wav \
    https://localhost:50051

Pause Simulation

Simulating pauses can be done by adding one or more --pause <START_CHUNK>:<CHUNK_COUNT>.

Example:

--pause 5:2 --pause 15:3: pause at chunk 5 for 2 chunks, then at chunk 15 for 3 chunks

Audio chunks during pauses will be dropped by default. If you would like to preserve them to be resumed after the pause ends, supply --preserve-audio-during-pauses.

Each pause triggers a StatusRequest with status set to STATUS_PAUSED. When the pause ends, a StatusRequest with status set to STATUS_ACTIVE is issued. If audio chunks are dropped during pauses, which is the default, and the simulator runs out of chunks, the stream is terminated successfully with a StatusRequest with status set to STATUS_COMPLETED.
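
When generating simulator invocations from a test harness, it can help to validate pause values before passing them on. The following is a hypothetical sketch of parsing the <START_CHUNK>:<CHUNK_COUNT> form. It is not the simulator's own parser, and it makes no claim about how the simulator numbers chunks.

```rust
/// A parsed `--pause <START_CHUNK>:<CHUNK_COUNT>` value.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct PauseSpec {
    pub start_chunk: u64,
    pub chunk_count: u64,
}

/// Parse a value such as "5:2" into its two numeric parts.
pub fn parse_pause(spec: &str) -> Result<PauseSpec, String> {
    // Split at the first colon; both sides must be present.
    let (start, count) = spec
        .split_once(':')
        .ok_or_else(|| format!("expected <START_CHUNK>:<CHUNK_COUNT>, got {spec:?}"))?;
    let start_chunk = start
        .parse::<u64>()
        .map_err(|e| format!("invalid start chunk {start:?}: {e}"))?;
    let chunk_count = count
        .parse::<u64>()
        .map_err(|e| format!("invalid chunk count {count:?}: {e}"))?;
    Ok(PauseSpec { start_chunk, chunk_count })
}
```

Parsing "5:2" gives a start chunk of 5 and a chunk count of 2, while values such as "5" or "a:2" are reported as errors.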

Failed Status Simulation

You can trigger a failed status and terminate the stream by specifying --fail-at-chunk <CHUNK_NUMBER>. When the specified audio chunk is reached, a StatusRequest with status set to STATUS_FAILED will be issued and the streaming will be terminated.

Running the Example Server

Run the example server with the --help flag to print the available options. It can be started as follows:

$ ace-audio-stream-forwarding-example-server \
    --cert data/tls/server.crt \
    --key data/tls/server.key \
    --client-ca-cert data/tls/ca.crt

Logging Output

The simulator and the example server are implemented in Rust and adhere to the convention of configurable logging levels per crate, module, and even submodule, using the RUST_LOG environment variable. The available levels are error, warn, info, debug, trace, and off.

If not set, the applications will default to info for both our crates and our dependencies.

To set the logging level to trace for our own crates but info for our dependencies, you can set RUST_LOG as such:

RUST_LOG=ace_audio_stream_forwarding=trace,info

See the env_logger documentation for more information.

Building and Running Locally with Cargo

To build the simulator locally, ensure you have Rust and the Protocol Buffer compiler installed.

As per rust-toolchain.toml, this repository targets Rust 1.88.0.

Build the Simulator

From the repository root, run:

cargo build --release

The compiled simulator binary will be located at target/release/ace-audio-stream-forwarding-simulator, and the example server at target/release/ace-audio-stream-forwarding-example-server.

License

The code in this repository is licensed under the Apache 2.0 License.

Disclaimer

This is not intended for production use. For support and production integration, reach out to your technical contact at Telia ACE.
