The Conversational Insights devkit is a high-level API designed to extend the Google Python client for Conversational Insights. As a developer or maintainer, you can use the Conversational Insights devkit to automate and scale a wide range of tasks.
Use cases
Actions you can perform include the following:
- Ingest single conversations with metadata.
- Ingest many conversations in bulk with metadata.
- Transcribe mono audio files using Speech-to-Text (STT) V1.
- Create recognizers using STT V2.
- Set up BigQuery export.
- Change Conversational Insights global settings.
- Transform transcript data format from Genesys Cloud to Conversational Insights.
- Transform transcript data format from AWS to Conversational Insights.
Get started
To get started with the devkit, follow these steps for environment setup and authentication.
Step 1: Set up a virtual environment
Before using the devkit, follow these steps to set up your Google Cloud credentials and install the necessary dependencies.
- Use the following code to authenticate your account with the Google Cloud CLI.
gcloud auth login
gcloud auth application-default login
- Set your Google Cloud project with the following code. Replace PROJECT_ID with your actual Google Cloud project ID.
gcloud config set project PROJECT_ID
- Create and activate a Python virtual environment with the following code. You must use a virtual environment to manage your project dependencies.
python3 -m venv venv
source ./venv/bin/activate
- Install your dependencies. In your project directory, check for a requirements.txt file containing the devkit's dependencies, and install them by running the following code. (A quick verification sketch follows this list.)
pip install -r requirements.txt
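Optionally, verify the setup before moving on. The commands below are standard gcloud and pip commands; the google-cloud-contact-center-insights package name is only an assumption about what the devkit's requirements.txt pulls in, so substitute whichever dependency you expect to find.
gcloud config get-value project
pip show google-cloud-contact-center-insights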
Step 2: Account authentication
Authentication methods for the Python devkit vary depending on your environment.
Google Colaboratory
If you're using the Conversational Insights devkit in a Google Colaboratory notebook, you can authenticate by adding the following code to the top of your user-managed notebooks:
project_id = '<project_id>'
!gcloud auth application-default login --no-launch-browser
Doing so launches an interactive prompt that lets you authenticate with Google Cloud in a browser. After authenticating, set the project_id as your active quota project:
!gcloud auth application-default set-quota-project $project_id
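Alternatively, recent Colab runtimes ship a built-in auth helper. This is a minimal sketch and assumes your notebook runs in Colab; it only produces application-default credentials, so you still set the quota project as shown above.
from google.colab import auth

# Launches the Colab OAuth flow and writes application-default credentials
# for the notebook runtime.
auth.authenticate_user()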
Cloud Run functions
When using the Conversational Insights devkit with Cloud Run functions, the devkit automatically picks up the default environment credentials that these services use.
- Add the devkit to the requirements.txt file: Verify that it's listed in your Cloud Run functions service's requirements.txt file.
- Identity and Access Management role: Verify that the Cloud Run functions service account has the appropriate Dialogflow IAM role assigned (see the example command after this list).
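The exact role depends on your project's setup. The following sketch uses the standard gcloud IAM binding command; PROJECT_ID and SERVICE_ACCOUNT_EMAIL are placeholders, and roles/contactcenterinsights.editor is an assumed role, so adjust both before running.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
    --role="roles/contactcenterinsights.editor"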
Local Python environment
Similar to Cloud Run functions, this devkit picks up your local authentication credentials if you're using the gcloud CLI.
- Install the gcloud CLI. If you haven't already, install the Google Cloud SDK, which includes the gcloud CLI.
- Initialize gcloud CLI with the following code.
gcloud init
- Sign in to Google Cloud with the following code.
gcloud auth login
- Verify your active account with the following code. This command lists the accounts whose credentials are held by the gcloud CLI and marks the active one. The Conversational Insights devkit can then fetch the credentials from your setup (see the verification sketch after this list).
gcloud auth list
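To confirm that application-default credentials are discoverable from Python, which is what the devkit relies on locally, you can run a short check like the following. It uses the standard google-auth library rather than the devkit itself.
import google.auth

# Loads application-default credentials and the associated project, if any.
credentials, project = google.auth.default()
print(f"Found credentials for project: {project}")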
Devkit contents
The Conversational Insights devkit contains the following resources.
Core folder
The core folder corresponds to the core resource types used throughout the devkit's classes. It contains the high-level building blocks of the devkit and general functionality such as authentication and global configuration. These classes and methods are used to build higher-level methods or custom tools and applications.
Common folder
The common folder contains wrappers built around the methods implemented in the Google Cloud SDK. These wrappers add a layer of simplicity on top of the existing implementations (see the sketch after this list). You can perform the following actions with the common folder.
- Manipulate global configurations from Conversational Insights.
- Ingest conversations (single and bulk) with metadata.
- Create or list blobs in a Cloud Storage bucket.
- Create transcriptions from audio files using STT V1 and V2.
- Transcribe mono audio files with STT V1.
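As a point of comparison, the sketch below shows the kind of underlying google-cloud-storage calls that the blob wrappers sit on top of. It uses the standard Cloud Storage client directly, not the devkit's own classes, and the project, bucket, and file names are placeholders.
from google.cloud import storage

client = storage.Client(project="your-project-id")
bucket = client.bucket("your-bucket-name")

# Create (upload) a blob containing a JSON transcript.
blob = bucket.blob("transcripts/example.json")
blob.upload_from_string('{"entries": []}', content_type="application/json")

# List blobs under a prefix.
for item in client.list_blobs("your-bucket-name", prefix="transcripts/"):
    print(item.name)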
Workflows folder
The workflows folder contains classes and methods designed to perform actions that Conversational Insights doesn't support, such as the following (a sketch of the target transcript shape follows this list).
- Format transcripts from Genesys Cloud to Conversational Insights.
- Format transcripts from AWS to Conversational Insights.
- Recognize roles in a transcript using Gemini.
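The following is a minimal sketch of the shape a formatted transcript might take, assuming the standard Conversational Insights chat transcript schema (entries with text, role, user ID, and a start timestamp). Verify the field names against the Conversational Insights conversation data format reference before relying on them.
# Assumed target shape produced by the Genesys Cloud and AWS formatters.
formatted_transcript = {
    "entries": [
        {"text": "Hi, thanks for calling. How can I help?", "role": "AGENT",
         "user_id": 1, "start_timestamp_usec": 0},
        {"text": "I'd like to check my order status.", "role": "CUSTOMER",
         "user_id": 2, "start_timestamp_usec": 4000000},
    ]
}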
Annotate an audio conversation
The following Python script transcribes an audio file, assigns speaker roles, then sends the enriched conversation data to Conversational Insights for analysis.
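The script references a few module-level constants and devkit modules that aren't shown here. The values below are placeholders, and the comment about where speech, format, storage, rr, and insights come from is an assumption about the devkit's package layout; adjust both to your environment.
import uuid

# speech, format, storage, rr (role recognizer), and insights are devkit modules;
# import them from the devkit's core, common, and workflows packages as appropriate.

# Placeholder values. Replace with your own project, audio file, bucket, and parent resource.
_PROBER_PROJECT_ID = "your-project-id"
_MONO_SHORT_AUDIO_LOCATION = "/path/to/mono_audio.wav"
_TMP_PROBER_BUCKET = "your-temp-bucket"
_PARENT = "projects/your-project-number/locations/us-central1"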
def audio_with_role_recognition():
    # 1. Reset Insights Settings
    reset_insights_settings()
    # Verifies a clean state for Conversational Insights settings before starting.
    # This function is assumed to be defined elsewhere and handles resetting global configurations.

    # 2. Initialize Speech-to-Text V2 Client
    sp = speech.V2(
        project_id=_PROBER_PROJECT_ID
    )
    # Initializes a client for interacting with the Google Cloud Speech-to-Text API (V2).
    # _PROBER_PROJECT_ID: The Google Cloud project ID where the STT recognizer resides.

    # 3. Create Transcription
    transcript = sp.create_transcription(
        audio_file_path=_MONO_SHORT_AUDIO_LOCATION,
        recognizer_path='projects/<project_id>/locations/<region>/recognizers/<recognizer_id>'
    )
    # Sends an audio file for transcription using a specific STT V2 recognizer.
    # audio_file_path: Local path to the mono audio file to be transcribed.
    # recognizer_path: Full resource path of the STT V2 recognizer to use for transcription.
    # Example: 'projects/YOUR_PROJECT_NUMBER/locations/global/recognizers/YOUR_RECOGNIZER_ID'
    # The returned 'transcript' object is the response type from the Speech-to-Text V2 API.

    # 4. Format Transcription
    ft = format.Speech()
    # Initializes a formatting utility for speech-related data.
    transcript = ft.v2_recognizer_to_dict(transcript)
    # Transforms the STT V2 `RecognizeResponse` object into a more manageable Python dictionary,
    # which is easier to work with for subsequent steps like role recognition.

    # 5. Initialize Google Cloud Storage Client
    gcs = storage.Gcs(
        project_name=_PROBER_PROJECT_ID,
        bucket_name=_TMP_PROBER_BUCKET
    )
    # Initializes a Google Cloud Storage client.
    # project_name: The Google Cloud project ID.
    # bucket_name: The name of the Cloud Storage bucket where the processed transcript is temporarily stored.

    # 6. Generate Unique File Name
    file_name = f'{uuid.uuid4()}.json'
    # Creates a unique file name for the JSON transcript using a UUID (universally unique identifier).
    # This prevents naming conflicts when uploading multiple transcripts to the same bucket.

    # 7. Perform Role Recognition
    role_recognizer = rr.RoleRecognizer()
    # Initializes the RoleRecognizer component, which identifies speaker roles
    # (for example, agent and customer) within a conversation.
    roles = role_recognizer.predict_roles(conversation=transcript)
    # Predicts the roles of the speakers in the transcribed conversation.
    # conversation: The transcribed conversation in dictionary format.

    # 8. Combine Roles with the Transcript
    transcript = role_recognizer.combine(transcript, roles)
    # Integrates the recognized roles back into the original transcript data structure,
    # enriching the transcript with speaker role metadata.

    # 9. Upload Processed Transcript to Google Cloud Storage
    gcs.upload_blob(
        file_name=file_name,
        data=transcript
    )
    # Uploads the enriched transcript (as JSON data) to the specified bucket under the unique file name.
    # This Cloud Storage path is used as the source for Conversational Insights ingestion.

    # 10. Construct Google Cloud Storage Path
    gcs_path = f"gs://{_TMP_PROBER_BUCKET}/{file_name}"
    # Forms the full Cloud Storage URI of the uploaded transcript, which the
    # Conversational Insights ingestion API requires.

    # 11. Initialize Conversational Insights Ingestion
    ingestion = insights.Ingestion(
        parent=_PARENT,
        transcript_path=gcs_path
    )
    # Initializes an Ingestion client for Conversational Insights.
    # parent: The parent resource path for Conversational Insights, typically in the format
    # 'projects/YOUR_PROJECT_NUMBER/locations/YOUR_LOCATION'.
    # transcript_path: The Cloud Storage URI where the conversation transcript is stored.

    # 12. Ingest Single Conversation
    operation = ingestion.single()
    # Initiates ingestion of the single conversation into Conversational Insights.
    # Returns an operation object that can be used to monitor the status of the ingestion.
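The devkit's return type isn't documented here. Assuming it's a standard google.api_core long-running operation, you could wait for ingestion to finish by adding something like the following at the end of the function.
    # Hypothetical follow-up (assumption: `operation` is a google.api_core.operation.Operation).
    result = operation.result(timeout=600)  # Blocks until ingestion completes or raises on error/timeout.
    print(result)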
Contributions and feature requests
To make contributions or feature requests, follow these steps:
- Create a repository fork on GitHub.
- Create your feature branch with the following code.
git checkout -b feature/AmazingFeature
- Commit your changes with the following code.
git commit -m 'Add some AmazingFeature'
- Push changes to the branch with the following code.
git push origin feature/AmazingFeature
- Open a pull request and submit it from your feature branch to the main repository.
License
The Conversational Insights devkit is distributed under the Apache 2.0 License. For details, see the LICENSE file in the project repository.