Version: Latest

Dialogue Understanding

Dialogue Understanding aims to understand how the end user of an AI assistant wants to progress the conversation.

New in 3.7

The Command Generator is part of Rasa's new Conversational AI with Language Models (CALM) approach and is available starting with version 3.7.0.

CommandGenerator

The CommandGenerator component performs Dialogue Understanding. In the CALM approach, Dialogue Understanding aims to represent how the end user wants to progress the conversation. Currently, there are two command generators available: LLMCommandGenerator and NLUCommandAdapter. You can find all generated commands in the command reference below.

Using the LLMCommandGenerator

To use this component in your AI assistant, add the LLMCommandGenerator to your NLU pipeline in the config.yml file. Read more about the config.yml file here.

config.yml
pipeline:
# - ...
- name: LLMCommandGenerator
# - ...

The LLMCommandGenerator requires access to an LLM API. You can use any OpenAI model that supports the /chat endpoint, such as "gpt-3.5-turbo" or "gpt-4". We are working on expanding the list of supported models and model providers.

How the LLMCommandGenerator Works

The job of the LLMCommandGenerator is to ingest information about the conversation so far and output a sequence of commands that represent how the user wants to progress the conversation.

For example, if you defined a flow called transfer_money, and a user starts a conversation by saying "I need to transfer some money", the correct command output would be StartFlow("transfer_money").

If you asked the user a yes/no question (using a collect step) and they say "yes.", the correct command output is SetSlot(slot_name, True).

If the user answers the question but also requests something new, like "yes. Oh what's my balance?", the command output might be [SetSlot(slot_name, True), StartFlow("check_balance")].
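
For illustration, the commands above could correspond to flows sketched roughly as follows. This is a minimal, hypothetical example; the flow ids transfer_money and check_balance, the slot names, and the response name are assumptions made for this sketch, not required identifiers:

flows.yml
flows:
  transfer_money:
    description: This flow lets users send money to another account.
    steps:
      # each collect step asks the user for one slot value
      - collect: recipient
      - collect: amount
  check_balance:
    description: This flow tells users their current account balance.
    steps:
      # a hypothetical response that reads the balance back to the user
      - action: utter_current_balance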

By generating a sequence of commands, Dialogue Understanding is a better way to represent what the user wants than a classification-based NLU system.

To interpret the user's message in context, the current implementation of the LLMCommandGenerator uses in-context learning, information about the current state of the conversation, and the flows defined in your assistant. The description and slot definitions of each flow are included in the prompt as relevant information. However, to scale to a large number of flows, the LLMCommandGenerator includes only the flows that are relevant to the current state of the conversation.

Retrieving Relevant Flows

The ability to retrieve relevant flows has a training component attached to it. During training, all defined flows whose flow guards could potentially evaluate to true are transformed into documents containing flow descriptions and (optionally) slot descriptions and allowed slot values. These documents are then transformed into vectors using the embedding model and stored in a vector store.

When talking to the assistant, i.e. during inference, the current conversation context is transformed into a vector and compared against the flows in the vector store. This comparison identifies the flows that are most similar to the current conversation context and includes them in the prompt of the LLMCommandGenerator.

However, additional rules are applied to select or discard certain flows:

  • Any flow with a flow guard evaluating to False is excluded.
  • Any flow marked with the always_include_in_prompt property set to true is always included, provided that the flow guard (if defined) evaluates to true (see the sketch after this list).
  • All flows that are active during the current conversation context are always included.
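
As a rough, hypothetical sketch of how these properties can appear in a flow definition (the flow id and the guard condition are invented for illustration, and the exact condition syntax may differ in your Rasa version):

flows.yml
flows:
  report_lost_card:
    description: Let users block a lost or stolen card.
    # flow guard: the flow is only considered while this condition holds
    if: slots.is_logged_in
    # always offer this flow to the LLM, as long as the guard passes
    always_include_in_prompt: true
    steps:
      - collect: card_type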

This feature of retrieving only the relevant flows and including them in the prompt is enabled by default. Read more about configuring the options here.

The performance of the flow retrieval depends on the quality of flow descriptions. Good descriptions not only improve the differentiation among flows covering similar topics but also boost the alignment between the intended user actions and the flows. For tips on how to write good descriptions, you can check out our guidelines.

Prompt Template

The default prompt template serves as a dynamic framework that enables the CommandGenerator to render prompts. The template consists of a static component, as well as dynamic components that get filled in when rendering a prompt:

  • Current state of the conversation - This part of the template captures the ongoing dialogue.
  • Defined flows and slots - This part of the template provides the context and structure for the conversation. It outlines the overarching theme, guiding the model's understanding of the conversation's purpose.
  • Active flow and slot - Active elements within the conversation that require the model's attention.

Customization

You can customize the Command Generator as much as you wish.

LLM configuration

To specify the OpenAI model to use for the LLMCommandGenerator, set the llm.model_name property in the config.yml file:

config.yml
pipeline:
# - ...
- name: LLMCommandGenerator
  llm:
    model_name: "gpt-4"
    request_timeout: 7
    temperature: 0.0
# - ...

The model_name defaults to gpt-4 and should be set to an OpenAI chat model.

Similarly, you can specify the request_timeout and temperature parameters for the LLM. The request_timeout defaults to 7 seconds and the temperature defaults to 0.0.

If you want to use Azure OpenAI Service, configure the necessary parameters as described in the Azure OpenAI Service section.

Using Other LLMs

By default, OpenAI is used as the underlying LLM provider.

The LLM provider you want to use can be configured in the config.yml file. To use another provider, like cohere:

config.yml
pipeline:
# - ...
- name: LLMCommandGenerator
  llm:
    type: "cohere"
# - ...

For more information, see the LLM setup page on LLMs and embeddings.

Customizing The Prompt

Because the LLMCommandGenerator uses in-context learning, one of the primary ways to tweak or improve performance is to customize the prompt.

In most cases, you can achieve what you need by customizing the description fields in your flows. Every flow has its own description field; optionally, every step in your flow can also have one. If you notice a flow is triggered when it shouldn't, or a slot is not extracted correctly, adding more detail to the description will often solve the issue.

For example, if you have a transfer_money flow with a collect step for the slot amount, you can add a description to extract the value more reliably:

flows.yml
flows:
  transfer_money:
    description: |
      This flow lets users send money to friends
      and family, in US Dollars.
    steps:
      - collect: recipient
      - collect: amount
        description: the amount of money to send. extract only the numerical value, ignoring the currency.

Use the following guidelines to write informative and contextually rich flow descriptions.

  1. Provide information-dense descriptions: Ensure flow descriptions are precise and informative, directly outlining the flow's purpose and scope. Aim for a balance between brevity and the density of information, using imperative language and avoiding unnecessary words to prevent ambiguity. The goal is to convey essential information as clearly as possible.
  2. Use clear and standard language: Avoid unusual phrasing or choice of words. Stick to clear, universally understood language.
  3. Explicitly define context: Explicitly define the flow context to increase the model's situational awareness. The embedding model used for retrieving only the relevant flows lacks situational awareness; it can't figure out the context or read between the lines beyond what's directly described in the flow.
  4. Clarify implicit knowledge: Clarify any specialized knowledge in descriptions (e.g. if a brand name is mentioned, explain what the brand's domain is; if a product name is mentioned, explain what the product is about). The embedding model that is used for retrieving only the relevant flows is unlikely to produce good embeddings for brands and their products on its own.
  5. (Optional) Adding example user utterances: While not strictly required, adding example user utterances can add more context to the flow descriptions. This can also ensure that the embeddings will closely match the user inputs. This should be considered more as a remedy than a cure: if user utterances improve performance, it suggests they provide new information that could be directly incorporated into the flow descriptions (see the example after this list).
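
To make these guidelines concrete, here is a hedged example of a description that tries to follow them. The flow, the brand, and the product are invented for illustration:

flows.yml
flows:
  order_acme_sim:
    description: |
      Order a new SIM card from ACME Mobile, the company's prepaid
      phone service. A SIM card is the chip a customer needs to activate
      a phone plan. Collects the delivery address and the desired plan,
      then places the order.
      Example user messages: "I need a new SIM", "order an ACME SIM card".
    steps:
      - collect: delivery_address
      - collect: plan_type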

Customizing the Prompt Template

If you cannot get something to work by editing your YAML files, you can go one level deeper and customize the prompt template used to drive the LLMCommandGenerator. To do this, write your own prompt as a jinja2 template and provide it to the component as a file:

config.yml
pipeline:
- name: LLMCommandGenerator
  prompt: prompts/command-generator.jinja2

The prompt template also allows you to use variables to incorporate dynamic information. The following variables are available for use in your custom prompt template:

  • available_flows (List[Dict[str, Any]]) - A list of all flows available in the assistant.
  • current_conversation (str) - A readable representation of the current conversation. A simple example: USER: hello\nAI: hi\nUSER: I need to send money
  • current_flow (str) - ID of the current active flow. Example: transfer_money
  • current_slot (str) - Name of the currently asked slot. Example: transfer_money_recipient
  • current_slot_description (str) - Description of the currently asked collect step. Example: the name of the person
  • flow_slots (List[Dict[str, Any]]) - A list of slots from the current active flow.
  • user_message (str) - The latest message sent by the user. Example: I want to transfer money

  • Iterating over the flow_slots variable can be useful to create a prompt that lists all the slots of the current active flow:
{% for slot in flow_slots -%}
- name: {{ slot.name }}, value: {{ slot.value }}, type: {{ slot.type }}, description: {{ slot.description}}{% if slot.allowed_values %}, allowed values: {{ slot.allowed_values }}{% endif %}
{% endfor %}

Each item in flow_slots provides the following fields:

  • slot.name (str) - Name of the slot. Example: transfer_money_has_sufficient_funds
  • slot.description (str) - Description of the slot. Example: Checks if there is sufficient balance
  • slot.value (str) - Value of the slot. Example: True
  • slot.type (str) - Type of the slot. Example: bool
  • slot.allowed_values (List[str]) - List of allowed values for the slot. Example: [True, False]

  • Iterating over the available_flows variable can be useful to create a prompt that lists all the flows:
{% for flow in available_flows %}
{{ flow.name }}: {{ flow.description }}
{% for slot in flow.slots -%}
slot: {{ slot.name }}{% if slot.description %} ({{ slot.description }}){% endif %}{% if slot.allowed_values %}, allowed values: {{ slot.allowed_values }}{% endif %}
{% endfor %}
{%- endfor %}

Each item in available_flows provides the following fields:

  • flow.name (str) - Name of the flow. Example: transfer_money
  • flow.description (str) - Description of the flow. Example: This flow lets users send money.
  • flow.slots (List[Dict[str, Any]]) - A list of slots from the flow.

Customizing the maximum length of user input

To restrict the length of user messages, set the user_input.max_characters property (default: 420 characters).

config.yml
pipeline:
- name: LLMCommandGenerator
  user_input:
    max_characters: 420

Customizing flow retrieval

The ability to retrieve only the relevant flows for inclusion in the prompt at inference time is enabled by default. To configure it, you can modify the settings under the flow_retrieval property. The default configuration uses the text-embedding-ada-002 embedding model from OpenAI:

config.yml
pipeline:
- name: LLMCommandGenerator
  ...
  flow_retrieval:
    embeddings:
      type: openai
      model: text-embedding-ada-002
    ...
  ...

You can adjust the embedding provider and model. More on supported embeddings and how to configure those can be found here.

Additionally, you can also configure:

  • turns_to_embed - The number of conversation turns to be transformed into a vector and compared against the flows in the vector store. Setting the value to 1 means that only the latest conversation turn is used. Increasing the number of turns expands the conversation context window.
  • should_embed_slots - Whether to embed the slot descriptions along with the flow description during training (True / False).
  • num_flows - The maximum number of flows to be retrieved from the vector store.

Below is a configuration with default values:

config.yml
pipeline:
- name: LLMCommandGenerator
  ...
  flow_retrieval:
    turns_to_embed: 1
    should_embed_slots: true
    num_flows: 20
  ...
Number of retrieved flows

The number of flows specified by num_flows does not directly correspond to the actual number of flows included in the prompt. The total number of included flows also depends on the flows marked as always_include_in_prompt and those previously active. For more information, check the Retrieving Relevant Flows section.

The flow retrieval can also be disabled by setting the flow_retrieval.active field to false:

config.yml
pipeline:
- name: LLMCommandGenerator
  ...
  flow_retrieval:
    active: false
  ...

Warning

Disabling the ability to retrieve only the flows that are relevant to the current conversation context will restrict the command generator's capacity to manage a large number of flows. Due to the command generator's limited prompt size, exceeding this limit will lead to its inability to create effective commands, leaving the assistant unable to provide meaningful responses to user requests. Additionally, a high number of tokens in the prompt can result in increased costs and latency, further impacting the responsiveness of the system.

Using the NLUCommandAdapter

To use this component in your assistant, add the NLUCommandAdapter to your NLU pipeline in the config.yml file. You also need to have an intent classifier listed in your NLU pipeline. Read more about the config.yml file here.

config.yml
pipeline:
# - ...
- name: NLUCommandAdapter
# - ...

How the NLUCommandAdapter Works

The NLUCommandAdapter starts flows the classic way, using intents predicted by an intent classifier. It looks at the predicted intent and tries to find a flow with a corresponding NLU trigger defined. If a flow has an NLU trigger matching the predicted intent, and the prediction confidence is larger than the threshold defined in the NLU trigger, the NLUCommandAdapter returns a StartFlow command to begin the corresponding flow.
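
NLU triggers are defined on the flow itself. The sketch below shows the general shape, based on the behavior described above; the intent name and the confidence threshold value are assumptions made for this example:

flows.yml
flows:
  transfer_money:
    description: This flow lets users send money.
    nlu_trigger:
      # start this flow when the classifier predicts the transfer_money
      # intent with at least the given confidence
      - intent:
          name: transfer_money
          confidence_threshold: 0.8
    steps:
      - collect: recipient
      - collect: amount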

When to use the NLUCommandAdapter

We recommend using the NLUCommandAdapter in two scenarios:

  • You want to use NLU data containing intents and examples along with the CALM paradigm. Using the NLUCommandAdapter, you can initiate a flow based on a predicted intent, given that you already have a solid intent classifier in place. Once the flow is initiated, the business logic is executed as usual in the CALM paradigm, with commands predicted by the LLMCommandGenerator and policies predicting the next best action.
  • You want to minimize costs by avoiding an LLM API call for every message. Unlike the LLMCommandGenerator, the NLUCommandAdapter does not make any API calls to an LLM, which saves costs. Make sure you have a solid intent classifier in place when using the NLUCommandAdapter; otherwise, incorrect flows will be started.

Customization

To restrict the length of user messages, you can set the user_input.max_characters property (default: 420 characters).

config.yml
pipeline:
- name: NLUCommandAdapter
  user_input:
    max_characters: 420

Using multiple Command Generators

If you want to add both Command Generators to your NLU pipeline, add the NLUCommandAdapter before the LLMCommandGenerator.

config.yml
pipeline:
# - ...
- name: NLUCommandAdapter
- name: LLMCommandGenerator
# - ...

The components are executed one after another. If the first component (i.e. the NLUCommandAdapter) successfully predicts a StartFlow command, the LLMCommandGenerator will be skipped (i.e. no calls to the LLM are made).

In general, if the first Command Generator predicts a command, all other Command Generators that come next in the pipeline are skipped. Keep that in mind when adding a custom Command Generator to the pipeline.

Command reference

As its name indicates, the CommandGenerator generates "commands" that are then processed internally to trigger operations on the current conversation. Below is a reference of all supported commands; each command indicates that the AI assistant should:

Start Flow

Start a new flow.

Cancel Flow

Cancel the current flow. It powers the Conversation Repair's Cancellation use case.

Skip Question

Bypass the current collect step in the flow when the user's message indicates they want to skip it. It powers the Conversation Repair's Skipping collect steps use case.

Set Slot

Set a slot to a given value.

Correct Slots

Change the value of a given slot to a new value. It powers the Conversation Repair's Correction use case.

Clarify

Ask for clarification. It powers the Conversation Repair's Clarification use case.

Chit-Chat Answer

Respond with answers in a chitchat style, whether they are predefined or free-form. It powers the Conversation Repair's Chitchat use case.

Knowledge Answer

Reply with a knowledge-based, free-form answer. It works together with the Enterprise Search policy.

Human Handoff

Hand off the conversation to a human.

Error

This command indicates the AI assistant failed to handle the dialogue due to an internal error.

Cannot Handle

This command indicates that the command generator failed to generate any commands. It powers the Conversation Repair's Cannot handle use case. By default, this command is not included in the prompts provided to an LLM.