Google’s Dialogflow CX allows developers to build powerful agents for advanced conversational scenarios. With the advent of generative AI, Dialogflow CX’s “generators” have become one of its most exciting features, enabling more dynamic, context-aware agent behavior at runtime.
What is a generator?
Generators use Google’s large language models (LLMs) to generate responsive behavior. Generators can be used to answer an end-user’s question, or to manage internal processes like information retrieval, escalation, and conversation summarization.
What is a text prompt?
A text prompt defines the generator’s function, and instructs the LLM to perform the task. Text prompts can include placeholders to factor information from the conversation into the instructions.
Working with generators requires developing text prompts that perform well across different conversational scenarios. In this post, we’ll demonstrate how to engineer prompts to align the generator’s performance with your expectations, and with those of your customers.
Developing a Prompt for Conversation Summarization
The prompt should state the generator’s goal in plain language. If we want to summarize the conversation, the core instruction might be:
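As an illustrative starting point (not necessarily the verbatim prompt from the original walkthrough), the instruction could be as simple as:

```
Summarize the conversation below in 2-3 sentences.

Conversation:
$conversation
```

Here, `$conversation` is Dialogflow CX’s built-in placeholder for the conversation history.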
Within HumanFirst, you can test that prompt on a handful of transcripts and assess whether the given summary captures the right information.
Running the prompt on real conversations and comparing each summary against its transcript shows where the prompt needs work. Here’s a second iteration:
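A refined version might spell out exactly what the summary should capture. Again, this is an illustrative sketch rather than the exact prompt from the demo:

```
Summarize the conversation between the agent and the customer below.
Your summary must mention:
- The customer's reason for contacting support
- Any resolution offered by the agent
- Any follow-up actions that remain open
Keep the summary to three sentences or fewer.

Conversation:
$conversation
```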
Testing the prompt in HumanFirst, and again in Dialogflow CX’s simulator, we can see if the summaries improve and continue to fine-tune the prompt until the output aligns with our expectations.
Using Generators to Handle Speech-to-Text Transcription Errors
If you’re using a voicebot as part of your support strategy, it’s probably collecting order numbers or addresses as part of the verification process. Speech-to-text transcription systems struggle to decipher alphanumeric strings. Order number “1234BC” gets garbled in transcription, and the case is escalated to a human agent to handle verification.
Hand-offs create friction on the customer’s end, and take the human agent away from more complex interactions.
In the video above, Stephen Broadhurst demonstrates an end-to-end process of engineering a prompt to remedy speech-to-text transcription errors using real conversation histories. This prompt is designed to be used in a route on an intent match when the user provides an address.
We started with the following prompt:
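The demo’s exact wording isn’t what matters here; a first attempt could look something like this (an illustrative sketch, with the order-number format inferred from the “1234BC” example above):

```
The following text was transcribed from speech and may contain
transcription errors. Extract the alphanumeric order number from the
user's input. Order numbers look like "1234BC": digits followed by
letters.

User input: $last-user-utterance
```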
Upon testing, we made an adjustment:
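The key change is an explicit fallback: telling the model exactly what to return when it can’t determine the number, so the output stays predictable. A sketch of the adjusted prompt (again illustrative, not the verbatim demo prompt):

```
The following text was transcribed from speech and may contain
transcription errors. Extract the alphanumeric order number from the
user's input. Order numbers look like "1234BC": digits followed by
letters.

If you cannot determine the order number with confidence, return 0.
Return only the order number (or 0) and nothing else.

User input: $last-user-utterance
```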
Instructing the LLM to return ‘0’ for an undetermined number will limit the variability of its response, so it doesn’t return an unexpected text string. When we re-ran that prompt, we found it performed the way we expected. We pasted it into the generator and tested it in the simulator to ensure it handled the transcription error correctly when the intent matched the scenario.
Once you’ve fine-tuned a prompt against real conversations, and you’re confident that the generator will perform, you can copy the prompt back to Dialogflow CX and adjust the placeholders to correspond with your session parameters. Dialogflow’s built-in placeholders include $conversation, to reference the conversation history, and $last-user-utterance, to reference the end-user’s most recent input. Session parameters can be customized to capture any information needed; see Google’s documentation.
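Putting the pieces together, a prompt back in Dialogflow CX might combine the built-in placeholders with a custom one. In this sketch, `$order-number` is a hypothetical custom placeholder you’d map to one of your own session parameters:

```
Given the conversation so far:
$conversation

The user's latest message was:
$last-user-utterance

Confirm whether the order number on file, $order-number, matches the
number the user provided. Answer "yes" or "no" only.
```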
Incorporating Hybrid Agents
Dialogflow CX offers developers the option of hybrid flows. Hybrid flows blend generators into structured dialogue paths to transition between predefined responses and LLM-generated content. The hybrid approach is particularly useful in scenarios where standard responses might fall short, enabling the agent to handle complex queries, provide detailed information, or adapt to user inputs in real time.
Dialogflow CX’s generators are a big step forward in building responsive conversations. With a workspace to design and test your prompts, you can build highly customized, effective conversational experiences. Whether you’re summarizing conversations, handling speech-to-text errors, or answering questions, generators can help you build agile interactions and deliver more efficient customer support.