NiCE CXone Transcription Hub and TTS Hub (Cloud TTS Hub) How To

STT Turn-by-Turn Transcription - How Does It Work?

Turn-by-Turn Transcription is a real-time speech-to-text service that transcribes conversation audio into segments, one for each party's turn in the conversation. It is well suited to virtual agents and other applications that need transcription segmented by utterance.

Transcription Process:

  • Transcribes a conversation one utterance at a time.

  • Sends the transcription in real time to the destination application.

  • Allows applications to analyze and respond to each utterance separately.
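
The per-utterance flow above can be sketched as a simple loop. This is a minimal illustration only: the Utterance class, handle_turns, and echo_bot are hypothetical names invented for this example and are not part of CXone or Studio.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Utterance:
    """One transcribed turn (hypothetical shape, not a CXone type)."""
    speaker: str  # for example "caller" or "agent"
    text: str     # transcript of this single turn


def handle_turns(utterances: Iterable[Utterance],
                 respond: Callable[[Utterance], str]) -> List[str]:
    """Call `respond` once per utterance, in arrival order."""
    replies = []
    for utterance in utterances:
        # Each turn is analyzed on its own, without waiting for the
        # rest of the conversation.
        replies.append(respond(utterance))
    return replies


def echo_bot(utterance: Utterance) -> str:
    """Stand-in for a text-only virtual agent."""
    return f"You said: {utterance.text}"


turns = [Utterance("caller", "I want to check my balance"),
         Utterance("caller", "Savings account")]
replies = handle_turns(turns, echo_bot)
```

In a real deployment the utterances would arrive one at a time from the transcription service rather than from a prepared list; the list here only simulates that sequence.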

Integration:

  • Can be used with various sources like voicemails, IVRs, and virtual agents.

  • Supports multiple languages via Google Transcription Service.

  • Provides an option to use Google’s enhanced model for better speech recognition.

Configuration:

  • Administrators create transcription profiles specifying language and provider.

  • Profiles are added to Studio scripts using the CLOUD TRANSCRIBE action.

  • Custom scripting is required for integrating with text-only virtual agents.

Hint Phrases:

  • Hint phrases can be specified to improve transcription accuracy.

  • Useful for scenarios where specific vocabulary is common.
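
How Transcription Hub passes hint phrases to the provider is not described here. As a rough sketch, assuming they end up in the fields that Google's Speech-to-Text recognition configuration uses for phrase hints (speech_contexts with a phrases list, plus use_enhanced for the enhanced model), a cleaned-up configuration could be assembled as follows. The function name build_recognition_config is hypothetical, and the result is a plain dictionary, not a call to any Google or CXone library.

```python
from typing import Dict, List


def build_recognition_config(language_code: str,
                             hint_phrases: List[str],
                             use_enhanced: bool = False) -> Dict:
    """Return a plain dict shaped like a Google RecognitionConfig."""
    # Drop blanks and case-insensitive duplicates, keeping the original order.
    seen = set()
    phrases = []
    for phrase in hint_phrases:
        cleaned = phrase.strip()
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            phrases.append(cleaned)

    config = {
        "language_code": language_code,
        "use_enhanced": use_enhanced,
    }
    if phrases:
        # Phrase hints bias recognition toward domain vocabulary.
        config["speech_contexts"] = [{"phrases": phrases}]
    return config


config = build_recognition_config(
    "en-US",
    ["CXone", "routing number", " cxone ", ""],
    use_enhanced=True,
)
```

Short, domain-specific phrases (product names, account terms) are the typical use; long lists of common words are unlikely to help.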

Why is it Important?

  • Improved Accuracy: By breaking down conversations into segments, it allows more accurate and context-aware transcription.

  • Enhanced Virtual Agent Performance: Enables text-only virtual agents to handle voice interactions effectively by providing real-time transcriptions.

  • Flexibility: Can be used with various applications and supports multiple languages, making it versatile for different business needs.

  • Customization: Supports hint phrases and enhanced models to tailor transcription accuracy to specific scenarios.

STT Continuous Stream Transcription - How Does It Work?

Continuous Stream Transcription is a real-time, self-service speech-to-text service that transcribes audio continuously throughout a conversation. It is designed for applications that need immediate access to the transcribed text, such as agent assist tools.

Transcription Process:

  • Transcribes conversations as they happen, sending the transcription in real time with minimal delay.

  • The transcription appears almost instantly in the destination application.
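
To contrast this with turn-by-turn delivery, the sketch below shows how a consuming application might keep a running transcript as text arrives. It assumes the text is delivered as a sequence of short fragments, which is an assumption made for illustration; running_transcript is a hypothetical helper, not a CXone function.

```python
from typing import Iterable, Iterator


def running_transcript(fragments: Iterable[str]) -> Iterator[str]:
    """Yield the transcript so far each time a new fragment arrives."""
    parts = []
    for fragment in fragments:
        cleaned = fragment.strip()
        if not cleaned:
            continue  # skip empty fragments
        parts.append(cleaned)
        # A display such as an agent assist panel would refresh here.
        yield " ".join(parts)


# Simulated fragments; a live integration would receive these over time.
views = list(running_transcript(["my order number", "is 4 5 1", "", "2 9"]))
```

Each yielded value is the full text received so far, so the display can be redrawn without tracking individual turns.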

Integration:

  • Requires setting up a transcription profile in the Transcription Hub, specifying the transcription service and language.

  • Integrates with agent assist applications, such as Salesforce Assist, which display transcriptions during interactions.

Provider Options:

  • Currently supports: NiCE CXone Transcription and Google Transcription.

  • A separate profile can be created for each language to be transcribed.

Why is it Important?

  • Real-Time Assistance: Provides immediate transcription, allowing agents to confirm details with customers instantly, improving interaction quality and accuracy.

  • Enhanced Efficiency: Helps agents and applications access and use conversation details as they occur, facilitating better service and decision-making.

  • Language Support: Offers transcription in multiple languages, ensuring support for a diverse customer base.

Cloud TTS Hub - How Does It Work?

Create Profiles

  • Administrators create TTS profiles specifying voice and language.

  • Profiles use providers like AWS Polly, Google TTS, or Google Custom Voice TTS.

Integration with Studio

  • Profiles are added to Studio scripts.

  • Different profiles can be used in the same or different scripts.

SSML Support

  • SSML (Speech Synthesis Markup Language) allows fine-tuning of speech attributes such as pronunciation, rate, pitch, and volume (support for each attribute depends on the TTS provider).
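
As an illustration of the kind of markup involved, the sketch below builds a small SSML string using the standard speak, prosody, and break elements. The helper build_ssml is hypothetical, and which attribute values a given voice honors depends on the provider, so treat the output as an example to adapt rather than markup guaranteed to work with every profile.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "medium", pitch: str = "medium",
               volume: str = "medium", pause_ms: int = 0) -> str:
    """Wrap `text` in <speak> and <prosody>, with an optional trailing pause."""
    attrs = " ".join(f"{name}={quoteattr(value)}" for name, value in
                     (("rate", rate), ("pitch", pitch), ("volume", volume)))
    # escape() protects characters such as & and < in the spoken text.
    body = f"<prosody {attrs}>{escape(text)}</prosody>"
    if pause_ms > 0:
        body += f'<break time="{pause_ms}ms"/>'
    return f"<speak>{body}</speak>"


ssml = build_ssml("Your balance is $25 & no fees are due.",
                  rate="slow", volume="loud", pause_ms=300)
ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
print(ssml)
```

Escaping the text before wrapping it matters in practice, because an unescaped ampersand in a prompt would make the whole SSML document invalid.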

[Figure: Transcription Hub configuration]

Why is it Important?

  • Customization: Offers a wide range of voices and languages, enhancing customer interactions with personalized and clear communication.

  • Multilingual Support: Essential for businesses operating in multiple regions, ensuring customers receive support in their preferred language.

  • Enhanced Customer Experience: Real-time, accurate speech synthesis improves IVR interactions and reduces the need for live agents.