Rasa Pro Change Log
All notable changes to Rasa Pro will be documented on this page. This product adheres to Semantic Versioning starting with version 3.3 (initial version).
Rasa Pro consists of two deployable artifacts: Rasa Pro and Rasa Pro Services. You can read the change log for both artifacts below.
[3.12.7] - 2025-04-28
Rasa Pro 3.12.7 (2025-04-28)
Improvements
- Adds two optional properties on the Jambonz channel connector, username and password, which can be used to enable Basic Access Authentication.
- Added support for basic authentication in Twilio channels (Voice Ready and Voice Streaming). This allows users to authenticate their Twilio channels using basic authentication credentials, enhancing security and access control for voice communication. To use this feature, set username and password in the Twilio channel configuration.
  credentials.yaml
    twilio_voice:
      username: your_username
      password: your_password
      ...
    twilio_media_streams:
      username: your_username
      password: your_password
      ...
  At Twilio, configure the webhook URL to include the basic authentication credentials:
    # twilio voice webhook
    https://<username>:<password>@yourdomain.com/webhooks/twilio_voice/webhook
    # twilio media streams webhook
    https://<username>:<password>@yourdomain.com/webhooks/twilio_media_streams/webhook
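As a rough illustration of what Basic Access Authentication involves, the sketch below builds the Authorization header value a client derives from a username and password (such as those embedded in the webhook URLs above) and checks it on the receiving side. The helper names are hypothetical, and this is not the channel connector's actual implementation.

```python
import base64
import hmac


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def is_authorized(header_value: str, username: str, password: str) -> bool:
    """Check a received header value against the configured credentials."""
    expected = basic_auth_header(username, password)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(header_value.encode("utf-8"), expected.encode("utf-8"))


print(basic_auth_header("your_username", "your_password"))
```

Because Basic credentials are only base64-encoded, not encrypted, they should only ever be sent over HTTPS, as in the webhook URLs above.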
Bugfixes
- Fail rasa commands (rasa run, rasa inspect, rasa shell) when the model file path doesn't exist instead of defaulting to the latest model file from the default directory /models.
- Fix the behaviour of action_hangup on Voice Inspector (browser_audio channel). Display that the session has ended.
- Display a helpful error message in case of an invalid API key with Azure TTS.
- Fixes Audiocodes Channel's event to intent mapping. All Audiocodes events are now mapped to an intent in the format vaig_event_<event>. That is, if Audiocodes sends an event noUserInput, the assistant will receive the intent /vaig_event_noUserInput.
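The naming rule for Audiocodes events can be pictured with a one-line helper. This is only a sketch of the mapping described in the entry above; the function name is hypothetical and does not come from the Rasa codebase.

```python
def event_to_intent(event_name: str) -> str:
    """Map an event name to an intent of the form /vaig_event_<event>."""
    return f"/vaig_event_{event_name}"


print(event_to_intent("noUserInput"))
```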
[3.12.6] - 2025-04-15
Rasa Pro 3.12.6 (2025-04-15)
Deprecations and Removals
- Remove the behaviour handling digressions as eligible flows that can be started while handling an active collect step.
  These properties have been removed from the flow and collect step:
  - ask_confirm_digressions
  - block_digressions
  Remove the new pattern pattern_handle_digressions.
Features
- Introduce a new boolean property force_slot_filling for the collect flow step. This property allows you to suppress incorrect predictions of the command generator or user digressions that are not relevant to the current slot filling. To enable this behavior, set the force_slot_filling property to True in the collect step of your flow configuration.
    flows:
      order_pizza:
        name: order pizza
        description: user asks for a pizza
        steps:
          - collect: pizza_type
          - collect: quantity
          - collect: address
            force_slot_filling: true
  When force_slot_filling is set to True, the command generator will only process the SetSlot command for the specified slot. By default, the property is set to False.
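To make the effect of force_slot_filling concrete, here is a minimal sketch of the filtering it describes: when the flag is on, only a SetSlot command for the slot being collected survives. The Command class and filter_commands function are hypothetical stand-ins, not Rasa's internal types.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Command:
    """Hypothetical stand-in for a predicted command."""

    name: str                    # e.g. "SetSlot" or "StartFlow"
    slot: Optional[str] = None   # slot targeted by a SetSlot command


def filter_commands(
    commands: List[Command], collected_slot: str, force_slot_filling: bool
) -> List[Command]:
    """Keep only SetSlot commands for the collected slot when the flag is set."""
    if not force_slot_filling:
        return list(commands)
    return [c for c in commands if c.name == "SetSlot" and c.slot == collected_slot]


predicted = [Command("StartFlow"), Command("SetSlot", "address"), Command("SetSlot", "quantity")]
print(filter_commands(predicted, "address", force_slot_filling=True))
```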
Bugfixes
- Security patch for Audiocodes and Genesys Channel connector.
  Adds api_key (required) and client_secret (optional) properties to Genesys channel configuration.
  Adds token (optional) property to Audiocodes-Stream channel.
- Fix execution of custom validation action action_validate_slot_mappings by passing the extracted slot events to the tracker used when running this action.
- Make sure training always fails when the domain is invalid.
- Add channel name to UserMessage created by the Audiocodes channel.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.12.5] - 2025-04-07
Rasa Pro 3.12.5 (2025-04-07)
Bugfixes
- Fix ChitChatAnswerCommand command replaced with CannotHandleCommand if there are no e2e stories defined. Improve validation that IntentlessPolicy has applicable responses: either responses in the domain that are not part of any flow, or if there are e2e stories. This validation is performed during training time and during cleanup of the ChitChatAnswerCommand command.
- Fixes an issue with prompt rendering where minified JSON structures were displayed without properly escaping newlines, tabs, and quotes.
- Introduced a new Jinja filter to_json_encoded_string that escapes newlines (\n), tabs (\t), and quotes (\") for safe JSON rendering. The to_json_encoded_string filter preserves other special characters (e.g., umlauts) without encoding them.
- Updated the default prompts for gpt-4o and claude-sonnet-3.5.
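The escaping behaviour described for to_json_encoded_string can be approximated with the standard library. The function below is a sketch under that assumption, not the filter's real implementation; in particular, json.dumps also escapes backslashes and other control characters, which the actual filter may or may not do.

```python
import json


def to_json_encoded_string(value: str) -> str:
    """Escape newlines, tabs and double quotes so the text is safe inside a JSON string."""
    # ensure_ascii=False keeps characters such as umlauts unencoded.
    encoded = json.dumps(value, ensure_ascii=False)
    # json.dumps wraps the result in double quotes; drop them.
    return encoded[1:-1]


print(to_json_encoded_string('Grüße\n"hi"\t'))
```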
[3.12.4] - 2025-04-01
Rasa Pro 3.12.4 (2025-04-01)
Bugfixes
- Send error event to the Kafka broker, when the original message size is too large, above the configured broker limit. This error handling mechanism was added to prevent Rasa-Pro server crashes.
- Update the following dependencies to versions that contain the latest patches for security vulnerabilities:
jinja2
werkzeug
cryptography
pyarrow
langchain
langchain-community
- Fix intermittent crashes on Inspector app when Enterprise Search or Chitchat Policies were triggered
[3.12.3] - 2025-03-26
Rasa Pro 3.12.3 (2025-03-26)
Bugfixes
- Fixes a bug in Inspector that raised a TypeError when serialising numpy.float64 from Tracker.
[3.12.2] - 2025-03-25
Rasa Pro 3.12.2 (2025-03-25)
No significant changes.
[3.12.1] - 2025-03-21
Rasa Pro 3.12.1 (2025-03-21)
Bugfixes
- Fix filtering of StartFlow commands from LLM-based command generators during the overlap check with prior commands. When prior commands do not contain any StartFlow or HandleDigression commands, we should not filter out the StartFlow command from the LLM-based command generator.
- Remove:
- cancel command when digression handling is defined
- duplicate digression handling commands during command processing
Miscellaneous internal changes
Miscellaneous internal changes.
[3.12.0] - 2025-03-19
Rasa Pro 3.12.0 (2025-03-19)
Deprecations and Removals
- Deprecate MultiStepLLMCommandGenerator and schedule for removal in Rasa 4.0.0.
- Remove the beta feature flag check from the e2e testing with assertions feature. The RASA_PRO_BETA_E2E_ASSERTIONS environment variable is no longer needed as the feature is GA in 3.12.0.
- Deprecate the custom slot mapping type and its action slot mapping property which has been replaced with run_action_every_turn property name to retain backwards-compatible behavior.
- Deprecate the former list of dictionaries format for the condition key in a conditional response variation.
Features
- Add capability to use OAuth over Azure Entra ID for OpenAI instances deployed on Azure.
- Added Voice Stream Channel Connector for Genesys Cloud (AudioConnector Integration).
- Prevent unwanted digressions at collect flow steps by using one of the following new attributes available at both flow and collect step level:
  - ask_confirm_digressions: Asks the user to confirm whether to continue with the current flow. Can be set to true or to a list of flow ids for which this behaviour should be activated.
  - block_digressions: Blocks any digression from the current flow and informs the user that they will return to the digression once the current flow is completed. Can be set to true or to a list of flow ids for which this behaviour should be activated.
  The above-mentioned behaviour is governed by a new pattern pattern_handle_digressions which is triggered only when the above attributes are used.
- Implement real-time validation of slot values.
  Add an optional property validation to slots to configure validations that should be run immediately. This property expects a list of rejections and a refill_utter which will be used to prompt users to provide a new value when validation fails.
  These validations are limited to common, reusable and universal checks that work independently of conversation context. For more complex validations that depend on conversation state, business logic, or external data, implement them in your flow definitions or custom actions instead.
- Introducing the CompactLLMCommandGenerator component, an enhancement over the SingleStepLLMCommandGenerator. This new component utilizes the highest-performing prompts for the models gpt-4o-2024-11-20 and claude-3-5-sonnet-20240620.
  To incorporate the CompactLLMCommandGenerator into your pipeline, simply add the following:
    pipeline:
      ...
      - name: CompactLLMCommandGenerator
      ...
- Multi-language support was implemented to enable the assistant to deliver localized responses and flow names that dynamically adjust to the user's language preference. In particular:
  - The default language is defined using the language key in config.yml, while additional supported languages are specified under additional_languages.
  - A translation section was introduced for responses to provide language-specific versions of the response text.
  - A translation section was added for flows to define localized flow names.
  - The rephraser prompt now accommodates the selected language.
  - Validation mechanisms were implemented to ensure proper use of translations; the CLI command rasa validate data translations is available for verification.
  - A new slot type, StrictCategoricalSlot, was developed to restrict its values to a predefined set.
  - A built-in language slot of the StrictCategoricalSlot type was added for managing translations effectively.
- [beta] Added Voice Stream channel connector to Audiocodes (audiocodes_stream)
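The idea behind a slot whose values are restricted to a predefined set, as described for StrictCategoricalSlot in the multi-language entry above, can be sketched as follows. The class here is a hypothetical illustration only; it rejects unknown values with a ValueError, which is an assumption and may differ from how the real slot type behaves.

```python
from typing import Iterable, Optional


class StrictCategoricalValue:
    """Hypothetical holder that only accepts values from a fixed set."""

    def __init__(self, allowed_values: Iterable[str], initial: Optional[str] = None) -> None:
        self.allowed_values = frozenset(allowed_values)
        self.value: Optional[str] = None
        if initial is not None:
            self.set(initial)

    def set(self, value: str) -> None:
        """Store the value, refusing anything outside the allowed set."""
        if value not in self.allowed_values:
            raise ValueError(f"{value!r} is not one of {sorted(self.allowed_values)}")
        self.value = value


# Example: a language holder limited to three language codes.
language = StrictCategoricalValue(["en", "de", "fr"], initial="en")
language.set("de")
```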
Improvements
- Added validation to issue warnings for non-existent fixture/metadata names referenced in end-to-end tests.
- Implemented a fail fast mechanism that fails the end-to-end test case on the first failure, whether in user/bot turns or while using the assertions, to provide faster feedback.
- Added utterance_end_ms configuration to Deepgram ASR to handle noisy environments better.
- Replace the optional dependency mlflow leveraged in the beta release of E2E testing with assertions when evaluating generative answers with custom prompts for each of the two generative metrics: generative_response_is_relevant and generative_response_is_grounded. This change now enables the usage of different LLM model providers and allows for a more flexible evaluation of generative components.
  Additionally, these generative assertions can make use of a new property utter_source (i.e. Enterprise Search, Contextual Rephraser or Intentless). This enables the assertion to be applied to a specific bot message source. For example, it prevents phrases such as "Is there anything else i can help you with?" triggered by pattern_completed from being checked for groundedness when the assertion should not be applied to them. Remove applying the same generative assertion to multiple bot messages in the same turn; however, one bot message can be evaluated by multiple generative assertions in the same turn.
- Remove unnecessary deepcopy to improve performance in the undo_fallback_prediction method of FallbackClassifier.
- Add capability to control whether pattern_completed should execute when a flow completes its execution. To control this behavior, a new parameter run_pattern_completed is added to the flow definition. By default this parameter is set to True, which means pattern_completed will be executed when the flow completes its execution (backward compatible). If this parameter is set to False, pattern_completed will not be executed when the flow completes its execution.
- Allow slots to be filled by different slot extraction mechanisms (e.g. from_llm, predefined NLU-based mappings, custom actions etc.). Add new from_llm slot mapping boolean property allow_nlu_correction (by default set to False), which gives permission for LLM-issued SetSlot commands to correct slots previously filled via NLU-based mechanisms.
  Allow LLM-based command generators to issue other commands after NLUCommandAdapter has issued commands. Introduce a new LLM-based command generator config property minimize_num_calls (by default set to False) which maintains backwards compatibility with previous behaviour where LLM-based command generators were blocked from invoking the LLM after NLUCommandAdapter had issued commands.
  Update the default utterance utter_corrected_previous_input to use a new context property new_slot_values.
- Set the default priority for StartFlow commands issued by different command generator types, i.e. NLUCommandAdapter or LLM-based command generator: when the different command generators issue StartFlow commands for different flows in the same user turn, the NLUCommandAdapter will always take priority while the LLM-based start flow command will be discarded.
  Remove the limitation that the NLUCommandAdapter must always precede the LLM-based command generator in the config pipeline.
- Introduce a new slot mapping type controlled that can be assigned to slots that are set via button payloads, set_slots flow steps or custom actions.
  Slots that solely use the new controlled slot mapping will not be available to be filled probabilistically by the NLU or LLM components. Note that this slot mapping can still be used alongside the other slot mapping types; however, this comes with the risk of the slot being filled by the NLU or LLM components in a probabilistic manner.
- Slots with mappings of type controlled (formerly the now deprecated custom mapping type) can be set at every turn without their custom action having to be called explicitly by the user flow.
  If you are building a coexistence assistant where different controlled slots are set by custom actions in different subsystems, you must indicate which coexistence system is allowed to fill the slot. This is done by setting the coexistence_system property in the slot mapping configuration. This property is a string that must match one of the available categorical values: NLU, CALM, SHARED (when either system can set the slot).
- Support sending a preformatted message to the invoke_llm method. This lets the user switch between the user and system roles when invoking the LLM model.
- Add support for pypred predicates in conditional response variations, similar to the usage of predicates in flows. The condition key in a response variation can now also be a string predicate that supports only the slots namespace. One of many logical operators supported is not.
- Make current slot type and its allowed values available for the prompt template rendering in SingleStepLLMCommandGenerator and CompactLLMCommandGenerator classes. Now you can use {{ current_slot_type }} and {{ current_slot_allowed_values }} placeholders in your custom prompt template.
- Made Azure ASR endpoint and host, Azure TTS endpoint and Cartesia TTS endpoint configurable.
- Support usage of custom commands in the fine-tuning recipe.
- Add support for mstts markups on Azure TTS for improved SSML usage.
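The StartFlow priority rule from the improvements above (NLUCommandAdapter wins over the LLM-based command generator when both start flows in the same turn) can be sketched in a few lines. This is a simplified illustration with hypothetical names that operates on plain flow ids, not on Rasa's command objects.

```python
from typing import List


def resolve_start_flows(nlu_flows: List[str], llm_flows: List[str]) -> List[str]:
    """Pick which StartFlow targets to keep for one user turn.

    If the NLU-based generator proposed any flows, they take priority and the
    LLM-based proposals are discarded; otherwise the LLM-based ones are kept.
    """
    if nlu_flows:
        return list(nlu_flows)
    return list(llm_flows)


print(resolve_start_flows(["transfer_money"], ["check_balance"]))
```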
Bugfixes
- Add the possibility to pass a transform callable parameter when writing yaml. This allows passing a custom function to transform endpoints before uploading to Studio. This was required to fix the issue where yaml wraps in quotes any string that doesn't start with an alphabetic character such as unexpanded environment variables in the endpoints yml file.
- Fixed the accuracy calculation to prevent 100% assertion reporting when a test case fails before any assertions are reached.
- Fixed regression on training time for projects with a lot of YAML files.
- Fix AvailableEndpoints to read from the default endpoints.yaml, if no endpoint is specified.
- Update domain yaml schema for conditional response condition type key to specify valid enum type as slot only.
- Fixed an issue where the pattern_continue_interrupted was not correctly triggered when the flow digressed to a step containing a link.
- Add the flow ID as a prefix to step ID to ensure uniqueness. This resolves a rare bug where steps in a child flow with a structure similar to those in a parent flow (using a "call" step) could result in duplicate step IDs. In this case duplicates previously caused incorrect next step selection.
- Enable default action action_extract_slots to set slots that should be shared for coexistence in a NLU-based system, when the same slot can be requested and filled by a flow in the CALM system too.
- Fixed conversation stalling in AudioCodes channel by handling activities in background tasks. Previously, activities were processed synchronously which blocked responses to AudioCodes, causing request timeouts and activity retries. These retries would cancel ongoing processing and get rejected as duplicates. Now activities are processed asynchronously while responding immediately to AudioCodes requests.
- Improved error handling for Deepgram and Cartesia connection failures to display more meaningful error messages when authentication fails or other connection issues occur.
- Modify Enterprise Search Citation Prompt Template to use doc.text
- Fixes ClarifyCommand syntax in the fine-tuning recipe.
- Fixed a bug that led to the response to silence timeouts being cut off
- Fixed a bug in Voice Inspector where the tracker (hence the conversation transcript) was only updated after the prediction loop was complete. The bug resulted in a perceived delay in case of slow custom actions where transcript was rendered after processing the complete conversation turn. Now the tracker is sent to the Inspector app after every iteration of prediction loop, which conveys a more accurate conversation state and transcript on the inspector app
- Fix passing the incorrect input type (user question text instead of bot answer text) to the prompt used by the generative_response_is_relevant assertion. Add instructions to both relevance and groundedness prompts to not add any more explanations to the LLM output apart from the expected json output to prevent parsing errors.
- Make real-time validation work with all slot types.
  Fixes bug where the same ValidateSlotPatternFlowStackFrame was being triggered multiple times.
- Handle multiple duplicate digressing flows occurring within the same flow:
  - if action_block_digressions runs for a found duplicate digressing flow already on the stack, it will not add it again
  - if action_continue_digressions runs for a found duplicate digressing flow already on the stack, it first removes it from the stack before pushing it to the top of the stack.
- Do not push the clarification pattern when the top user frame is an interruption frame.
- Consider linked and called flows as active flows when processing StartFlow commands.
- Fixed slot value injection in translated responses by updating response keys to interpolate.
- Updated language code parsing to enforce BCP 47 standard.
- Fix validation check that ensures that the slot used in the response condition is defined in the domain file.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.11.7] - 2025-04-14
Rasa Pro 3.11.7 (2025-04-14)
Bugfixes
- Fix ChitChatAnswerCommand command replaced with CannotHandleCommand if there are no e2e stories defined. Improve validation that IntentlessPolicy has applicable responses: either responses in the domain that are not part of any flow, or if there are e2e stories. This validation is performed during training time and during cleanup of the ChitChatAnswerCommand command.
- Security patch for Audiocodes channel connector. Audiocodes channel was only checking for the existence of an authentication token. It now does a constant-time string comparison of the authentication token in the connection request with that provided in the channel configuration.
- Make sure training always fails when the domain is invalid.
- Add channel name to UserMessage created by the Audiocodes channel.
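The constant-time comparison mentioned in the Audiocodes security patch above can be illustrated with Python's standard library. This is a minimal sketch with a hypothetical function name; it shows the general technique, not the connector's actual code.

```python
import hmac
from typing import Optional


def token_matches(received: Optional[str], configured: str) -> bool:
    """Compare two tokens without short-circuiting on the first differing character."""
    if received is None:
        # A missing token never matches.
        return False
    # hmac.compare_digest takes time independent of where the inputs differ,
    # which makes timing attacks on the token harder.
    return hmac.compare_digest(received.encode("utf-8"), configured.encode("utf-8"))


print(token_matches("s3cret", "s3cret"))
```

Checking only that a token is present, as the previous behaviour did, would accept any non-empty value; comparing with a plain == would validate the value but can leak timing information.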
[3.11.6] - 2025-04-02
Rasa Pro 3.11.6 (2025-04-02)
Bugfixes
- Improved error handling for Deepgram and Cartesia connection failures to display more meaningful error messages when authentication fails or other connection issues occur.
- Modify Enterprise Search Citation Prompt Template to use doc.text
- Fixes ClarifyCommand syntax in the fine-tuning recipe.
- Send error event to the Kafka broker, when the original message size is too large, above the configured broker limit. This error handling mechanism was added to prevent Rasa-Pro server crashes.
- Update the following dependencies to versions that contain the latest patches for security vulnerabilities:
jinja2
werkzeug
requests
cryptography
pyarrow
langchain
langchain-community
[3.11.5] - 2025-02-18
Rasa Pro 3.11.5 (2025-02-18)
Bugfixes
- Updated Inspector dependent packages (cross-spawn, mermaid, dom-purify, vite, braces, ws, axios and rollup) to address security vulnerabilities.
- Enable default action action_extract_slots to set slots that should be shared for coexistence in a NLU-based system, when the same slot can be requested and filled by a flow in the CALM system too.
- Fixed conversation stalling in AudioCodes channel by handling activities in background tasks. Previously, activities were processed synchronously which blocked responses to AudioCodes, causing request timeouts and activity retries. These retries would cancel ongoing processing and get rejected as duplicates. Now activities are processed asynchronously while responding immediately to AudioCodes requests.
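The pattern behind the AudioCodes fix above (acknowledge the request right away, do the slow work in a background task) can be sketched with asyncio. All names below are hypothetical and the sleep merely stands in for slow message handling; this is not the channel's real code.

```python
import asyncio
from typing import List, Set

processed: List[str] = []


async def process_activity(activity: str) -> None:
    """Simulate slow handling of an incoming activity."""
    await asyncio.sleep(0.01)
    processed.append(activity)


async def handle_request(activity: str, background: Set[asyncio.Task]) -> str:
    """Schedule processing in the background and respond immediately."""
    task = asyncio.create_task(process_activity(activity))
    # Keep a reference so the task is not garbage collected before it finishes.
    background.add(task)
    task.add_done_callback(background.discard)
    return "accepted"


async def main() -> str:
    background: Set[asyncio.Task] = set()
    response = await handle_request("hello", background)
    # The response is available before the activity has been processed.
    await asyncio.gather(*background)
    return response


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Responding before processing completes is what prevents the caller from timing out and retrying the same activity.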
Miscellaneous internal changes
Miscellaneous internal changes.
[3.11.4] - 2025-01-30
Rasa Pro 3.11.4 (2025-01-30)
Improvements
- Remove unnecessary deepcopy to improve performance in the undo_fallback_prediction method of FallbackClassifier.
Bugfixes
- Fixed an issue where the pattern_continue_interrupted was not correctly triggered when the flow digressed to a step containing a link.
- Add the flow ID as a prefix to step ID to ensure uniqueness. This resolves a rare bug where steps in a child flow with a structure similar to those in a parent flow (using a "call" step) could result in duplicate step IDs. In this case duplicates previously caused incorrect next step selection.
- Optimized the DirectCustomActionExecutor by registering custom actions only once.
- Updated cryptography and anyio to resolve security vulnerabilities.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.11.3] - 2025-01-14
Rasa Pro 3.11.3 (2025-01-14)
Improvements
- Enhances YAML parser to validate environment variable resolution for sensitive keys.
Bugfixes
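- Add flow yaml validation when using the HTTP API /model/train endpoint. An invalid flow yaml will return a 400 response status code with a message describing the error.
- Make pattern_session_start work with rasa inspector to allow the assistant to proactively start the conversation with a user.
- Fix writing the test cases obtained via the e2e test case conversion command to file, where the test_cases key was written as a list item, instead of a dict key. This caused running the test cases to fail because it didn't comply with the e2e test schema. The fix writes the test cases as a dict key.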
key was written as a list item, instead of a dict key. This caused running the test cases to fail because it didn't comply with the e2e test schema. This PR fixes the issue by writing the test cases as a dict key. - Fixed Inspector's Tracker State view not updating in real-time by moving story fetch logic into WebSocket message handler. Previously, story updates were only triggered on session ID changes, causing stale tracker state after the first conversation turn.
- Add pre-training custom validation to the domain responses that would raise a Rasa Pro validation error when a domain response is an empty sequence.
- Fixes a critical security vulnerability with the jsonpickle dependency by upgrading to the patched version.
- Updated pymilvus and minio to address security vulnerability.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.11.2] - 2024-12-19
Rasa Pro 3.11.2 (2024-12-19)
Bugfixes
- Validate that api_type key is only used for supported providers (Azure and OpenAI).
- Enable asserting events returned by action_session_start when running end-to-end testing with assertions format. The following assertions can be used:
  - slot_was_set
  - slot_was_not_set
  - bot_uttered
  - bot_did_not_utter
  - action_executed
- Fixed voice inspector to work with any URL by dynamically constructing WebSocket URL from current domain. This enables voice testing in GitHub Codespaces and other remote environments.
- Fixed an error in rasa llm finetune prepare-data when using a subclass of SingleStepLLMCommandGenerator.
- Resolved an issue where rasa llm finetune prepare-data did not support model groups.
- Fix AvailableEndpoints to read from the default endpoints.yaml, if no endpoint is specified.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.11.1] - 2024-12-13
Rasa Pro 3.11.1 (2024-12-13)
Bugfixes
- Add the possibility to pass a transform callable parameter when writing yaml. This allows passing a custom function to transform endpoints before uploading to Studio. This was required to fix the issue where yaml wraps in quotes any string that doesn't start with an alphabetic character such as unexpanded environment variables in the endpoints yml file.
- Pass flow human-readable name instead of flow id when the cancel pattern stack frame is pushed during flow policy validation checks of collect steps.
- Fixed the accuracy calculation to prevent 100% assertion reporting when a test case fails before any assertions are reached.
- Fixed regression on training time for projects with a lot of YAML files.
[3.11.0] - 2024-12-11
Rasa Pro 3.11.0 (2024-12-11)
Deprecations and Removals
- Removed UnexpecTEDIntentPolicy from the default config.yml. It is an experimental policy and not suitable for default configuration.
- The reset_after_flow_ends property of collect steps is now deprecated and will be removed in Rasa Pro 4.0.0. Please use the persisted_slots property at the flow level instead.
Features
- Added Twilio Media Streams channel which can be configured to use arbitrary Text-To-Speech and Speech-To-Text services. Added Voice Stream Channel Interface which makes it easier to add voice channels that directly integrate with audio streams. Added support for Deepgram Speech-To-Text and Azure Text-To-Speech in Voice Stream Channels.
- Added default action action_hangup, which can be used to hang up a phone call from a flow. Added SessionEnded event and SessionEndCommand command. Updated Audiocodes, Jambonz and Twilio Voice channels to send /session_end if the phone call is disconnected by the user.
- Added support for Cartesia Text-To-Speech in Voice Stream Channels.
- Implement Rasa Pro native model service that takes care of training and running an assistant model in Studio. To find out more about this service, read more in the Studio documentation.
- Added a feature to be able to use voice to interact with the bot in the inspector.
- Multi-LLM Routing:
  - Decoupled LLM Configuration from Components
    - The previous integration of LLMs within CALM is closely tied to the components where they are used. However, this is no longer necessary, as we no longer perform training within the individual components that interact with external LLM endpoints.
    - As a result, LLM and embedding client configurations have been moved to endpoints.yml. To define LLM configurations in endpoints.yml, use the model_groups as shown below:
        model_groups:
          - id: gpt-4-direct
            models:
              - provider: openai
                model: gpt-4
                timeout: 7
                temperature: 0.0
          - id: text-embedding-3-small-direct
            models:
              - provider: openai
                model: text-embedding-3-small
    - These model_groups can then be referenced in config.yml as follows:
        pipeline:
          ...
          - name: SingleStepLLMCommandGenerator
            llm:
              model_group: gpt-4-direct
            flow_retrieval:
              embeddings:
                model_group: text-embedding-3-small-direct
          ...
  - Support for Multiple Subscription Deployments
    - Allows customers to use deployments from different subscriptions for the same provider.
    - Resolved the limitation of API key configuration being tied exclusively to a single environment variable.
    Example configuration in endpoints.yml for Azure deployments:
        model_groups:
          - id: azure-gpt-model-eu
            models:
              - provider: azure
                deployment: azure-eu-deployment
                api_base: https://api.azure-europe.example.com
                api_version: 2024-08-01-preview
                api_key: ${AZURE_API_KEY_EU}
                timeout: 7
                temperature: 0.0
                ...
          - id: azure-gpt-model-us
            models:
              - provider: azure
                deployment: azure-us-deployment
                api_base: https://api.azure-us.example.com
                api_version: 2024-08-01-preview
                api_key: ${AZURE_API_KEY_US}
                timeout: 7
                temperature: 0.0
                ...
          ...
  - Seamless Model Configuration Across Environments Without Retraining
    - Added support for using different model configurations in different environments, such as dev, staging, and prod, without requiring the bot to be retrained for each environment.
    - Extended the ${...} syntax to deployment, api_base, and api_version in model_groups, allowing these values to change dynamically based on the environment.
        model_groups:
          - id: azure-gpt-4
            models:
              - provider: azure
                deployment: ${AZURE_DEPLOYMENT_GPT4}
                api_base: ${AZURE_API_BASE_GPT4}
                api_key: ${AZURE_API_KEY_GPT4}
                ...
          - id: azure-text-embeddings-3-small
            models:
              - provider: azure
                deployment: ${AZURE_DEPLOYMENT_EMBEDDINGS_3_SMALL}
                api_base: ${AZURE_API_BASE_EMBEDDINGS_3_SMALL}
                api_key: ${AZURE_API_EMBEDDINGS_3_SMALL}
                ...
  - Supporting Multiple Deployments for Load Balancing
    - Enabled targeting of multiple LLM deployments for a single Rasa component.
    - Implemented the routing feature that supports load balancing to handle rate limits and improve scalability. When multiple models are defined within a model group, you can specify the router key with a routing_strategy to control how requests are distributed among the models.
    Example configuration in endpoints.yml for Azure deployments with load balancing:
        model_groups:
          - id: azure-gpt-models
            models:
              - provider: azure
                deployment: azure-eu-deployment
                api_base: https://api.azure-europe.example.com
                api_version: 2024-08-01-preview
                api_key: ${AZURE_API_KEY_EU}
                timeout: 7
                temperature: 0.0
                ...
              - provider: azure
                deployment: azure-us-deployment
                api_base: https://api.azure-us.example.com
                api_version: 2024-08-01-preview
                api_key: ${AZURE_API_KEY_US}
                timeout: 7
                temperature: 0.0
                ...
            router:
              routing_strategy: least-busy
              ...
    Example of usage in config.yml:
        pipeline:
          ...
          - name: SingleStepLLMCommandGenerator
            llm:
              model_group: azure-gpt-models
          ...
  - Backward Compatibility
    - Existing configurations that couple LLMs to specific Rasa components remain unaffected by this change.
    - However, this configuration method is now deprecated and scheduled for removal in version 4.0.0.
- Added support for Azure Speech-To-Text in Voice Stream Channels.
- Added UserSilenceCommand and pattern_user_silence which is triggered by Voice Stream channels when the user is silent for more than a silence timeout. These values are configurable with the newly added slots silence_timeout and consecutive_silence_timeouts. Silence Monitoring is disabled by default and can be enabled using the configuration monitor_silence: true in the relevant Voice Stream Channel configuration.
- The inspector is not its own input / output channel anymore. Rather, it can be attached to other channels. This way, it isn't limited to conversations going through the socketio channel anymore, but can be used with other text channels or voice channels.
  You can attach it to any channel(s) configured in your credentials.yml by adding a flag to rasa run: rasa run --inspect.
  In addition to that, the convenience CLI command rasa inspect is retained, which starts the inspector with the socketio channel as usual.
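The ${...} placeholders used in the model_groups examples of the Multi-LLM Routing feature above are filled from environment variables at load time. The sketch below shows the general idea with Python's string.Template; the function name is hypothetical and this is not how Rasa itself resolves the values.

```python
from string import Template
from typing import Mapping


def resolve_placeholders(value: str, env: Mapping[str, str]) -> str:
    """Replace ${NAME} placeholders with values from the given environment mapping."""
    # substitute() raises KeyError when a referenced variable is missing,
    # which surfaces misconfiguration early instead of silently passing it on.
    return Template(value).substitute(env)


# In practice the mapping would typically be os.environ; a plain dict is used here.
example_env = {"AZURE_API_BASE_GPT4": "https://api.azure-europe.example.com"}
print(resolve_placeholders("${AZURE_API_BASE_GPT4}", example_env))
```

Keeping deployment, api_base and api_key as placeholders is what lets the same trained model run in dev, staging and prod with different values.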
Improvements
-
In Audiocodes channel,
/vaig_event_start
is replaced by/session_start
. This intent marks the beginning of conversation and it is sent when the phone call is connected. -
Introduced the environment variable
MAX_NUMBER_OF_PREDICTIONS_CALM
to configure the CALM-specific limit for the number of predictions. This variable defaults to 1000, providing a higher prediction limit compared to the default value of 10 for nlu-based assistants. -
In the Audiocodes and Twilio Voice channel connectors, the call metadata received from the providers can be accessed in the slot session_started_metadata. The call metadata parameter names have been standardised with the CallParameters dataclass. The Twilio Voice channel connector sends the /session_start intent at the beginning of the conversation, and the channel parameter initial_prompt has been removed.
-
Enable configurability of Vault secret manager's mount point property in the endpoints yaml file or as an environment variable.
-
In the Twilio Media Streams channel connector, call metadata is available in the session_start_metadata slot. It also supports the default action action_hangup.
-
Catch API connection errors, and validate the correctness of the values present in model configuration at model training time by making a test API request. This feature is enabled by default and can be disabled by setting the environment variable LLM_API_HEALTH_CHECK to False.
-
The Socketio channel connector now sends the websocket messages tracker_state and rasa_events with each bot response. tracker_state contains the tracker store state at that point in the conversation and includes slots, events, stack, latest message and latest action. rasa_events contains a list of new events that have happened since the last message.
-
Speech-To-Text and Text-To-Speech services can be configured for Voice Stream Channel Connectors. Added tests for voice components and redefined the code structure.
-
Add support for Python 3.11
-
Removed JSON response validation except when HTTP protocol and E2E Stub is used for Custom Action execution.
-
Optimized JSON response validation by initializing the Draft202012Validator once and caching it.
-
Add an optional property persisted_slots at the flow level. This property configures whether slots collected or set across any of the flow steps should be persisted after the flow ends. This property expects a list of slot names.
-
Added support for custom Automatic Speech Recognition (ASR) or Text To Speech (TTS) providers in a Rasa assistant. This allows developers to bring their own speech providers to Rasa by subclassing the classes ASREngine and TTSEngine.
-
If flow retrieval is disabled, a warning is raised only if the number of user flows exceeds 20.
-
Added validation to the TestCase class to issue a warning when duplicate user messages lack metadata or have incorrect metadata. This enhancement provides clear guidance to users on the issue and how to resolve it.
-
Fixed the global should-hangup variable in Voice Stream Channels by moving it to a context variable, CallState, that stores the session variables.
-
Run Rasa Pro data validation before uploading to Studio. This is to avoid uploading invalid assistant data that would raise errors during Rasa Pro model training in Studio.
-
Added vector_name to Qdrant's configuration to enable customization of the vector field name for storing embeddings.
-
Enhanced YamlValidationException error messages to include the line number and a relevant YAML snippet showing where the validation error occurred. Line numbers start from 1 (1-based indexing).
The error-handling behavior has been modified so that only one validation error is displayed. This exception is raised when the YAML content does not comply with the defined YAML schema.
-
Added a new assertion type bot_did_not_utter to allow testing that the bot does not utter specific messages or include certain buttons during conversations.
-
Ensure that the model service fails properly if the minimum disk space requirement is not met.
-
Do not expand environment variables when reading yaml files during rasa studio upload execution.
-
Stream model files to Studio rather than providing full files. Provide a HEAD endpoint for Studio to check if a model is available and what its size is. Add an environment variable to set the port of the model service. This makes the development with Studio easier, previously the port was hard coded making it harder to use a separately deployed model service now that Studio includes that in its development deployment.
-
Add the flag --skip-yaml-validation to skip domain YAML validation during rasa run. Do not instantiate multiple instances of the TrainingDataImporter class for validation and training.
-
Introduced a summarize_history flag for the contextual response rephraser, defaulting to True. When set to False, the conversation transcript instead of the summary is included in the prompt of the contextual response rephraser. This saves a separate summarization call to an LLM. The number of conversation turns to be used when summarize_history is set to False can be set via max_historical_turns. By default this value is set to 5.
Example:
nlg:
- type: rephrase
  summarize_history: False
  max_historical_turns: 5
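The "initialize the validator once and cache it" improvement listed above can be illustrated with a minimal sketch. SchemaValidator is a hypothetical stand-in for an expensive-to-build validator such as Draft202012Validator, and get_validator and validate_response are illustrative names, not Rasa Pro APIs.

```python
import json
from functools import lru_cache


class SchemaValidator:
    """Hypothetical stand-in for a validator that is costly to construct."""

    instances_built = 0  # counts constructions so the caching effect is visible

    def __init__(self, schema: dict) -> None:
        SchemaValidator.instances_built += 1
        self.required = tuple(schema.get("required", ()))

    def is_valid(self, payload: dict) -> bool:
        # Simplified check: every required key must be present.
        return all(key in payload for key in self.required)


@lru_cache(maxsize=None)
def get_validator(schema_json: str) -> SchemaValidator:
    # Dicts are not hashable, so the serialized schema is used as the cache key.
    return SchemaValidator(json.loads(schema_json))


def validate_response(payload: dict, schema: dict) -> bool:
    return get_validator(json.dumps(schema, sort_keys=True)).is_valid(payload)


schema = {"required": ["events", "responses"]}
first = validate_response({"events": [], "responses": []}, schema)
second = validate_response({"events": []}, schema)
```

Both calls reuse the same validator instance, so the construction cost is paid once per distinct schema rather than once per response.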
Bugfixes
-
Fix OpenAI LLM client ignoring API base and API version arguments if set.
-
Fix AttributeError with the instrumentation of the run method of the CustomActionExecutor class.
-
Throw DuplicatedFlowIdException during rasa data validate and rasa train if there are duplicate flows defined.
-
Replace pickle and joblib with safer alternatives, e.g. json, safetensors, and skops, for serializing components.
Note: This is a model breaking change. Please retrain your model.
If you have a custom component that inherits from one of the components listed below and modified the persist or load method, make sure to update your code. Please contact us in case you encounter any problems.
Affected components:
CountVectorFeaturizer
LexicalSyntacticFeaturizer
LogisticRegressionClassifier
SklearnIntentClassifier
DIETClassifier
CRFEntityExtractor
TrackerFeaturizer
TEDPolicy
UnexpectedIntentTEDPolicy
-
Avoid filling slots that have ask_before_filling = True and use a from_text slot mapping during other steps in the flow. Ensure that the NLUCommandAdapter only fills these types of slots when the flow reaches the designated collection step.
-
Check for the metadata's step_id and active_flow keys when adding the ActionExecuted event to the flows paths stack.
-
Fixed a bug on Windows where flow files with names starting with 'u' would fail to load due to improper path escaping in YAML content processing
-
Fixes OpenAIException - AsyncClient.__init__() got an unexpected keyword argument 'proxies'
-
Fix retrieval of the model file stored in cloud storage by the model service. This change consisted of uploading only the model file instead of the full model path during training when the --remote-storage CLI flag is used.
-
Fix issue in e2e testing when customising action_session_start would lead to AttributeError, because the output_channel was not set. This is now fixed by setting the output_channel to CollectingOutputChannel().
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.19] - 2025-04-15
Rasa Pro 3.10.19 (2025-04-15)
Bugfixes
- Fix ChitChatAnswerCommand command replaced with CannotHandleCommand if there are no e2e stories defined. Improve validation that IntentlessPolicy has applicable responses: either responses in the domain that are not part of any flow, or if there are e2e stories. This validation is performed during training time and during cleanup of the ChitChatAnswerCommand command.
- Security patch for Audiocodes channel connector. Audiocodes channel was only checking for the existence of the authentication token. It now does a constant-time string comparison of the authentication token in the connection request with that provided in the channel configuration.
- Make sure training always fails when the domain is invalid.
- Add channel name to UserMessage created by the Audiocodes channel.
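The constant-time token comparison mentioned in the security patch above can be sketched with Python's standard library. This is a minimal illustration of the technique, not the connector's actual code; is_authorized and its parameters are hypothetical names.

```python
import hmac
from typing import Optional


def is_authorized(presented_token: Optional[str], configured_token: str) -> bool:
    """Return True only if the presented token equals the configured one."""
    # Checking only for existence (the old behaviour) would accept any token.
    if not presented_token:
        return False
    # compare_digest takes time independent of where the inputs differ,
    # which avoids leaking information through response timing.
    return hmac.compare_digest(
        presented_token.encode("utf-8"), configured_token.encode("utf-8")
    )
```

A plain `==` comparison can return early at the first differing character, which is why a constant-time comparison is preferred for secrets.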
[3.10.18] - 2025-04-02
Rasa Pro 3.10.18 (2025-04-02)
Bugfixes
- Updated Inspector dependent packages (cross-spawn, mermaid, dom-purify, vite, braces, ws, axios and rollup) to address security vulnerabilities.
- Modify Enterprise Search Citation Prompt Template to use doc.text
- Fixes ClarifyCommand syntax in the fine-tuning recipe.
- Send error event to the Kafka broker, when the original message size is too large, above the configured broker limit. This error handling mechanism was added to prevent Rasa-Pro server crashes.
- Update the following dependencies to versions that contain the latest patches for security vulnerabilities:
jinja2
werkzeug
requests
cryptography
pyarrow
langchain
langchain-community
[3.10.17] - 2025-01-30
Rasa Pro 3.10.17 (2025-01-30)
Improvements
- Remove unnecessary deepcopy to improve performance in undo_fallback_prediction method of FallbackClassifier
Bugfixes
- Fixed an issue where the pattern_continue_interrupted was not correctly triggered when the flow digressed to a step containing a link.
- Add the flow ID as a prefix to step ID to ensure uniqueness. This resolves a rare bug where steps in a child flow with a structure similar to those in a parent flow (using a "call" step) could result in duplicate step IDs. In this case duplicates previously caused incorrect next step selection.
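To see why prefixing step IDs with the flow ID removes the collision described above, consider this small sketch. The step IDs, the flow IDs and the "_" separator are assumptions chosen for illustration; the source does not state the exact ID format.

```python
def qualified_step_id(flow_id: str, step_id: str, separator: str = "_") -> str:
    """Build a step ID that is unique across flows (hypothetical format)."""
    return f"{flow_id}{separator}{step_id}"


# A parent flow that calls a child flow with a similarly structured step.
parent_steps = ["0_collect_amount", "1_call_child"]
child_steps = ["0_collect_amount"]

# Without a prefix the two flows share a step ID.
colliding = set(parent_steps) & set(child_steps)

# With the flow ID as a prefix every step ID is distinct.
parent_ids = {qualified_step_id("transfer_money", s) for s in parent_steps}
child_ids = {qualified_step_id("confirm_amount", s) for s in child_steps}
still_colliding = parent_ids & child_ids
```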
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.16] - 2025-01-15
Rasa Pro 3.10.16 (2025-01-15)
Bugfixes
- Fixed an error in rasa llm finetune prepare-data when using a subclass of SingleStepLLMCommandGenerator.
- Make pattern_session_start work with rasa inspector to allow the assistant to proactively start the conversation with a user.
- Fix writing the test cases obtained via the e2e test case conversion command to file, where the test_cases key was written as a list item instead of a dict key. This caused running the test cases to fail because it didn't comply with the e2e test schema. The test cases are now written as a dict key.
- Add pre-training custom validation to the domain responses that would raise a Rasa Pro validation error when a domain response is an empty sequence.
- Fixes a critical security vulnerability with jsonpickle dependency by upgrading to the patched version.
- Updated pymilvus and minio to address security vulnerability.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.15] - 2024-12-18
Rasa Pro 3.10.15 (2024-12-18)
Bugfixes
- Validate that api_type key is only used for supported providers (Azure and OpenAI).
- Fix issue in e2e testing when customising action_session_start would lead to AttributeError, because the output_channel was not set. This is now fixed by setting the output_channel to CollectingOutputChannel().
- Fixed the accuracy calculation to prevent 100% assertion reporting when a test case fails before any assertions are reached.
- Pass flow human-readable name instead of flow id when the cancel pattern stack frame is pushed during flow policy validation checks of collect steps.
- Try to instantiate LLM/embeddings client when loading component to validate environment variables.
- Enable asserting events returned by action_session_start when running end-to-end testing with assertions format. The following assertions can be used:
slot_was_set
slot_was_not_set
bot_uttered
action_executed
[3.10.14] - 2024-12-04
Rasa Pro 3.10.14 (2024-12-04)
Bugfixes
- Avoid filling slots that have ask_before_filling = True and use a from_text slot mapping during other steps in the flow. Ensure that the NLUCommandAdapter only fills these types of slots when the flow reaches the designated collection step.
- Fixes OpenAIException - AsyncClient.__init__() got an unexpected keyword argument 'proxies'
- Fix validation for LLM/Embedding clients when the api_base is configured in the config itself but not as an environment variable.
[3.10.13] - 2024-11-29
Rasa Pro 3.10.13 (2024-11-29)
Bugfixes
- Implement __eq__ and __hash__ functions for ChangeFlowCommand to fix error=unhashable type: 'ChangeFlowCommand' error in MultiStepCommandGenerator.
- Fixed an issue on Windows where flow files with names starting with 'u' would fail to load due to improper path escaping in YAML content processing
- Store the value of the --disable-verify CLI flag in the disable_verify attribute of the StudioConfig object, so it can be reused across other studio commands.
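The "unhashable type" error fixed above is the usual symptom of a class that defines equality without a matching hash. The sketch below shows the general pattern on a hypothetical ExampleCommand class with an assumed flow field; it is not the real ChangeFlowCommand implementation.

```python
class ExampleCommand:
    """Hypothetical command class used only to illustrate __eq__/__hash__."""

    def __init__(self, flow: str) -> None:
        self.flow = flow

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ExampleCommand):
            return NotImplemented
        return self.flow == other.flow

    # Defining __eq__ alone sets __hash__ to None, making instances unhashable.
    # Hashing the same fields that __eq__ compares keeps the two consistent.
    def __hash__(self) -> int:
        return hash((type(self).__name__, self.flow))


# Instances can now be de-duplicated in a set or used as dict keys.
unique_commands = {ExampleCommand("transfer_money"), ExampleCommand("transfer_money")}
```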
[3.10.12] - 2024-11-25
Rasa Pro 3.10.12 (2024-11-25)
Bugfixes
-
Replace pickle and joblib with safer alternatives, e.g. json, safetensors, and skops, for serializing components.
Note: This is a model breaking change. Please retrain your model.
If you have a custom component that inherits from one of the components listed below and modified the persist or load method, make sure to update your code. Please contact us in case you encounter any problems.
Affected components:
CountVectorFeaturizer
LexicalSyntacticFeaturizer
LogisticRegressionClassifier
SklearnIntentClassifier
DIETClassifier
CRFEntityExtractor
TrackerFeaturizer
TEDPolicy
UnexpectedIntentTEDPolicy
[3.10.11] - 2024-11-20
Rasa Pro 3.10.11 (2024-11-20)
Bugfixes
- Fix parsing of commands in case the LLM response surrounds flow names, slot names, or slot values with single or double quotes.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.10] - 2024-11-14
Rasa Pro 3.10.10 (2024-11-14)
Bugfixes
- Check for the metadata's step_id and active_flow keys when adding the ActionExecuted event to the flows paths stack.
[3.10.9] - 2024-11-13
Rasa Pro 3.10.9 (2024-11-13)
Bugfixes
- Introduced the environment variable MAX_NUMBER_OF_PREDICTIONS_CALM to configure the CALM-specific limit for the number of predictions. This variable defaults to 1000, providing a higher prediction limit compared to the default value of 10 for nlu-based assistants.
- Filter out comments from e2e test input files when writing e2e results to file.
- Specified UTF-8 encoding to correctly read test cases on Windows.
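As a rough sketch of how an environment-variable override with a default behaves, the snippet below reads MAX_NUMBER_OF_PREDICTIONS_CALM and falls back to the documented defaults (1000 for CALM, 10 for nlu-based assistants). The function name and the is_calm parameter are hypothetical; Rasa Pro's internal lookup may differ.

```python
import os
from typing import Mapping, Optional

DEFAULT_MAX_PREDICTIONS_CALM = 1000  # default stated in the change log
DEFAULT_MAX_PREDICTIONS_NLU = 10     # default for nlu-based assistants


def max_number_of_predictions(
    is_calm: bool, environ: Optional[Mapping[str, str]] = None
) -> int:
    """Return the prediction limit, honouring the CALM-specific override."""
    env = os.environ if environ is None else environ
    if is_calm:
        raw = env.get("MAX_NUMBER_OF_PREDICTIONS_CALM")
        return int(raw) if raw is not None else DEFAULT_MAX_PREDICTIONS_CALM
    return DEFAULT_MAX_PREDICTIONS_NLU
```

In a shell, the override would be set with `export MAX_NUMBER_OF_PREDICTIONS_CALM=2000` before starting the assistant.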
[3.10.8] - 2024-10-24
Rasa Pro 3.10.8 (2024-10-24)
Bugfixes
- The user message "/restart" is now restarting the session again after adding a proper implementation (stack frame and command) for pattern_restart.
- Only infer and set the provider to azure for our LLM clients in case NO provider is specified, but the deployment key is set.
- Fix OPENAI_API_KEY authentication error when using self-hosted provider.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.7] - 2024-10-17
Rasa Pro 3.10.7 (2024-10-17)
Improvements
- Change default response of utter_free_chitchat_response from "placeholder_this_utterance_needs_the_rephraser" to "Sorry, I'm not able to answer that right now.".
Bugfixes
- Disallow using the command payload syntax to set slots not filled by any of the active or startable flow(s) collect steps.
- Add flow name to error message validator.verify_flows_steps_against_domain.collect_step.
- Update e2e test results output files on each test run so that, for example, when all tests pass on subsequent runs after failing previously, the failed results output file is emptied.
- Disable strict SSL verification to the Rasa Studio authentication server via the --disable-verify or -x CLI argument added to the rasa studio config command.
- Upgrade zipp dependency version to fix a security vulnerability: CVE-2024-5569.
[3.10.6] - 2024-10-04
Rasa Pro 3.10.6 (2024-10-04)
Bugfixes
- Fix cleanup of SetSlot commands issued by the LLM-based command generator for slots that define a slot mapping other than the from_llm slot mapping. The command processor now correctly removes the SetSlot command in these scenarios and instead adds a CannotHandleCommand.
- Fix UnicodeDecodeError while reading Windows path from yaml files.
- Fix model loading from remote storage by correcting the handling of remote storage enum during the creation of the persistor object.
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.5] - 2024-10-01
Rasa Pro 3.10.5 (2024-10-01)
Bugfixes
-
Fix the case where IntentlessPolicy is triggered while no e2e stories were written to guide it. In this situation a CannotHandleCommand will be issued.
-
Update litellm to version 1.45.0 to fix security vulnerability (CVE-2024-6587). Update gitpython to version 3.1.41 to fix security vulnerability (CVE-2024-22190). Update certifi to version 2024.07.04 to fix security vulnerability (CVE-2024-39689).
-
Prevent invalid domain with incorrectly defined intent from throwing stack trace. Throw InvalidDomain exception and send message to the user instead. The message looks like this:
Detected invalid intent definition: {'intent': 'ask_help'}.
Please make sure all intent definitions are valid. -
Support text completions endpoint when using self hosted models.
The use_chat_completions_endpoint parameter is now supported when using self-hosted models. This parameter enables the use of the chat completions endpoint with a self-hosted model and is set to True by default. To use the text completions endpoint, set use_chat_completions_endpoint to False in the llm section of the component.
Usage:
llm:
  provider: self-hosted
  model: meta-llama/Meta-Llama-3-8B
  api_base: "https://my-endpoint/v1"
  use_chat_completions_endpoint: false
-
Fixes an issue where the CountVectorsFeaturizer and LogisticRegressionClassifier would throw an error during inference when no NLU training data is provided.
-
Added tracing explicitly to GRPCCustomActionExecutor.run in order to pass the tracing context to the action server.
[3.10.4] - 2024-09-25
Rasa Pro 3.10.4 (2024-09-25)
Bugfixes
- Fix failing validation of categorical slots when slot values contain an apostrophe.
[3.10.3] - 2024-09-20
Rasa Pro 3.10.3 (2024-09-20)
No significant changes.
[3.10.2] - 2024-09-19
Rasa Pro 3.10.2 (2024-09-19)
Deprecations and Removals
- Dropped support for Python 3.8 ahead of Python 3.8 End of Life in October 2024. In Rasa Pro versions 3.10.0, 3.9.11 and 3.8.13, we needed to pin the TensorFlow library version to 2.13.0rc1 in order to remove critical vulnerabilities; this resulted in poor user experience when installing these versions of Rasa Pro with uv pip. Removing support for Python 3.8 will make it possible to upgrade to a more stable version of TensorFlow.
Improvements
- Update Keras and Tensorflow to version 2.14. This will eliminate the need to use the --prerelease allow flag when installing Rasa Pro using the uv pip tool.
Bugfixes
-
Restore the old behavior when loading a trained model from remote storage: when the REMOTE_STORAGE_PATH environment variable is not set, the model path (-m) argument is used as the path to the model on the remote storage. The resulting path on the remote storage will be the same as the model path (-m) argument.
Additionally, the entire model path (-m) argument will be used when a trained model is uploaded to the remote storage with the REMOTE_STORAGE_PATH environment variable not set. The resulting path on the remote storage will be the same as the model path (-m) argument.
If the REMOTE_STORAGE_PATH environment variable is set, only the file name part of the model path (-m) argument is used in both loading from and storing to the remote storage. The resulting path on the remote storage will be: REMOTE_STORAGE_PATH + file name part of the model path (-m) argument.
-
Fixed UnexpecTEDIntentlessPolicy training errors that resulted from a change to batching behavior. Changed the batching behavior back to the original for all components. Made the changed batching behavior accessible in DietClassifier using drop_small_last_batch: True.
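The remote storage path rules described in the bugfix above can be summarised in a short sketch. The function name is hypothetical, and joining REMOTE_STORAGE_PATH and the file name with a "/" is an assumption; the change log only states that the two are combined.

```python
import posixpath
from typing import Optional


def remote_model_path(model_path: str, remote_storage_path: Optional[str]) -> str:
    """Resolve where a model lives on remote storage (illustrative only)."""
    if remote_storage_path:
        # REMOTE_STORAGE_PATH set: keep only the file name of the -m argument.
        return posixpath.join(remote_storage_path, posixpath.basename(model_path))
    # REMOTE_STORAGE_PATH not set: the -m argument is used as-is.
    return model_path


without_env = remote_model_path("models/prod/model.tar.gz", None)
with_env = remote_model_path("models/prod/model.tar.gz", "assistants/v1")
```

The same resolution applies to both upload and download, so a model stored with one setting should be loaded with the same setting.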
Miscellaneous internal changes
Miscellaneous internal changes.
[3.10.1] - 2024-09-11
Rasa Pro 3.10.1 (2024-09-11)
Bugfixes
- Fix OpenAI LLM client ignoring API base and API version arguments if set.
- Fix FileNotFound error when running rasa studio commands and no pre-existing local assistant project exists.
- Fixed telemetry collection for the components Rephraser, LLM Intent Classifier, Intentless Policy and Enterprise Search Policy to ensure that the telemetry data is only collected when it is enabled
- Update the default config for E2E test conversion to use the provider key instead of api_type.
- Fix inconsistent recording of telemetry events for llm-based command generators.
- Throw deprecation warning when REQUESTS_CA_BUNDLE env var is used.
[3.10.0] - 2024-09-04
Rasa Pro 3.10.0 (2024-09-04)
Deprecations and Removals
- Remove experimental LLMIntentClassifier. Use Rasa CALM instead.
Features
-
Implement the shell output of accuracy rate by assertion type as a table when running end-to-end testing with assertions.
-
Implement E2E testing assertions that measure metrics such as grounded-ness and answer relevance of generative responses issued by either Enterprise Search or the Contextual Response Rephraser.
You must specify a threshold which must be reached for the generative evaluation assertion to pass. In addition, you can also specify ground_truth if you prefer providing this in the E2E test rather than relying on the retrieved context from the vector store (in the case of Enterprise Search) or from the domain (in the case of Contextual Response Rephraser) that is stored in the bot utterance event metadata. For rephrased answers, you must specify utter_name to run the assertion.
These assertions can be specified for user steps only and cannot be used alongside the former E2E test format. You can learn more about this new feature in the documentation sections for grounded and relevant assertion types.
To enable this feature, please set the environment variable RASA_PRO_BETA_E2E_ASSERTIONS to true.
export RASA_PRO_BETA_E2E_ASSERTIONS=true
-
You can now produce a coverage report of your e2e tests via the following command:
rasa test e2e <e2e-test-folder> --coverage-report [--coverage-output-path <output-folder>]
The coverage report contains the number of steps and the number of tested steps per flow. Untested steps are referenced by line numbers.
Flow Name   Coverage   Num Steps   Missing Steps   Line Numbers
flow_1      0.00%      1           1               [10-10]
flow_2      100.00%    4           0               []
Total       80.00%     5           1
Additionally, we also create a histogram of command coverage showing how many and what commands are produced in your e2e tests.
To enable this feature, please set the environment variable RASA_PRO_BETA_FINETUNING_RECIPE to true.
export RASA_PRO_BETA_FINETUNING_RECIPE=true
More information can be found in the documentation of the feature.
-
Create a self-hosted LLM client compatible with OpenAI format. Users can connect to their own self-hosted LLM server that is compatible with OpenAI format.
Sample basic usage:
llm:
  provider: self-hosted
  model: <deployment_name>
  api_base: <deployment_url>
  api_type: openai [Optional]
-
Add a new CLI command rasa llm finetune prepare-data to create a dataset from e2e tests that can be used to fine-tune a base model for the task of command generation.
To enable this feature, please set the environment variable RASA_PRO_BETA_FINETUNING_RECIPE to true.
export RASA_PRO_BETA_FINETUNING_RECIPE=true
-
It is now allowed to link to pattern_human_handoff from any pattern and user flow.
-
Allow links from all patterns to user flows except for pattern_internal_error.
-
- LiteLLM Integration & Reduced LangChain Reliance:
  - Introduced LLMClient and EmbeddingClient protocols for standardized client interfaces.
  - Created lightweight client wrappers for LiteLLM to streamline model instantiation, management, and inference.
  - Updated llm_factory and embedder_factory to utilize these LiteLLM client wrappers.
  - Added dedicated clients for Azure OpenAI and OpenAI to support both LLMs and embedding models.
  - Added a HuggingFace client to compute embeddings using locally stored transformer models via the sentence-transformers package.
- LangChain Update: Upgraded to the latest version (0.2.x) for improved compatibility and features. To understand the implications on your assistant, please refer to the feature documentation and the migration guide.
-
Implement as part of E2E testing a new type of evaluation specifically designed to increase confidence in CALM. This evaluation runs assertions on the assistant's actual events and generative responses. New assertions include the ability to check for the presence of specific events, such as:
- flow started, flow completed or flow cancelled events
- whether pattern_clarification was triggered for specific flows
- whether buttons rendered well as part of the bot uttered event
- whether slots were set correctly or not
- whether the bot text response matches a provided regex pattern
- whether the bot response matches a provided domain response name
These assertions can be specified for user steps only and cannot be used alongside the former E2E test format. You can learn more about this new feature in the documentation.
To enable this feature, please set the environment variable RASA_PRO_BETA_E2E_ASSERTIONS to true.
export RASA_PRO_BETA_E2E_ASSERTIONS=true
-
Configure LLM-as-Judge settings in the llm_as_judge section of the conftest.yml file. These settings will be used to evaluate the groundedness and relevance of generated bot responses. The conftest.yml is discoverable as long as it is in the root directory of the assistant project, at the same level as the config.yml file.
If the conftest.yml file is not present in the root directory, the default LLM judge settings will be used.
-
Implement automatic E2E test case conversion from sample conversation data.
This feature includes:
- A CLI command to convert sample conversation data (CSV, XLSX) into executable E2E test cases.
- Conversion of sample data using an LLM to generate YAML formatted test cases.
- Export of generated test cases into a specified YAML file.
Usage:
rasa data convert e2e <path>
To enable this feature, please set the environment variable RASA_PRO_BETA_E2E_CONVERSION to true.
export RASA_PRO_BETA_E2E_CONVERSION=true
For more details, please refer to this documentation page.
Improvements
-
Implemented custom action stubbing for E2E test cases. To define custom action stubs, add stub_custom_actions to the test case file.
Stubs can be defined in two ways:
- Test file level: Define each action by its name (action_name).
- Test case level: Define the stub using the test case ID as a prefix (test_case_id::action_name).
To learn more about this feature, please refer to the documentation.
To enable this feature, set the environment variable RASA_PRO_BETA_STUB_CUSTOM_ACTION to true:
export RASA_PRO_BETA_STUB_CUSTOM_ACTION=true
-
Add the max_messages_in_query parameter to Enterprise Search Policy. It controls the number of past messages that are used in the search query for retrieval.
-
Configure LLM E2E test converter settings in the llm_e2e_test_conversion section of the conftest.yml file.
These settings will be used to configure the LLM used to convert sample conversation data into E2E test cases.
The conftest.yml is discoverable as long as it is in the root directory of the tests output path.
If the conftest.yml file is not present in the root directory, the default LLM settings will be used.
-
Add the datetime of Rasa Pro license expiry to the rasa --version command. Add a /license API endpoint that also returns the same information.
-
Suppress LiteLLM info and debug log messages in the console.
-
Cache llm_factory and embedder_factory methods to avoid client instantiation and validation for every user utterance.
-
Added E2E Test Conversion Completed telemetry event with file type and test case count properties.
-
Separate writing of failed and passed e2e test results to distinct file paths.
-
Implement support for evaluating IntentlessPolicy responses with generative response assertions.
-
Use direct custom action execution in tutorial and CALM templates. Skip action server health check in e2e testing if direct custom action execution is configured.
-
Modified the type of flows which are included into the import CLI (previously only user flows were enabled, now patterns are included). Use case: This is needed for Studio 1.7, since that release is enabling modification and management of patterns inside Studio, and needs the ability to import patterns from yaml files.
-
Improve events and responses sub-schemas used by the stub_custom_actions sub-schema of end-to-end testing. The events sub-schema only allows the usage of events which are supported by the rasa-sdk. These are documented in the action server API documentation.
-
Change default model of conversation rephraser to 'gpt-4o-mini'.
-
Add file_path to Flow so that we can show the full name, e.g. path/to/flow.py::flow name in the e2e test coverage report.
-
Introduced remote storage to upload trained models to persistors (AWS, GCP, Azure)
-
Add ability to download training data from remote storage (GCS, AWS, Azure)
-
Allow saving models to and retrieving from sub folders in cloud storage.
-
Introduced DirectCustomActionExecutor for executing custom actions directly through the assistant.
Introduced actions_module variable under action_endpoint in endpoints.yml to explicitly specify the path to custom actions module.
If actions_module is set, custom actions will be executed directly through the assistant.
-
Add validation for the values against which categorical and boolean slots are checked in the if conditional steps. An error will be thrown when a slot is compared to an invalid/non-existent value for boolean and categorical slots.
-
Add user query and retrieved document results to the metadata of action_send_text predicted by EnterpriseSearchPolicy. In addition, add domain ground truth responses to the BotUttered event metadata when rephrasing is enabled. These changes were required to allow evaluations of generative responses against the ground truth stored in the metadata of BotUttered events.
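The two stub naming levels described in the custom action stubbing improvement above (action_name and test_case_id::action_name) suggest a lookup along the following lines. This is a sketch under the assumption that a test-case-level stub takes precedence over a file-level one; resolve_stub and the stub contents are hypothetical.

```python
from typing import Any, Dict, Optional

SEPARATOR = "::"  # separator shown in the test_case_id::action_name format


def resolve_stub(
    stubs: Dict[str, Any], test_case_id: str, action_name: str
) -> Optional[Any]:
    """Pick the stub for an action, preferring a test-case-specific entry."""
    specific_key = f"{test_case_id}{SEPARATOR}{action_name}"
    if specific_key in stubs:
        return stubs[specific_key]
    # Fall back to the file-level stub keyed by the bare action name.
    return stubs.get(action_name)


stubs = {
    "action_check_balance": {"balance": 100},
    "low_funds_case::action_check_balance": {"balance": 0},
}
default_stub = resolve_stub(stubs, "happy_path_case", "action_check_balance")
specific_stub = resolve_stub(stubs, "low_funds_case", "action_check_balance")
```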
Bugfixes
-
Fix problem with custom action invocation when model is loaded from remote storage.
-
Ensure certificates for openai based clients.
-
Mark the first slot event as seen when the user turn in an E2E test case contains multiple slot events for the same slot. This fixes an issue that occurred when assertion_order_enabled is set to true and the user step in a test case contained multiple slot_was_set assertions for the same slot: the last slot event was marked as seen when the first assertion was running. This caused the test to fail for subsequent slot_was_set assertions for the same slot with the error Slot <slot_name> was not set.
-
Validate the LLM configuration during training for the following components:
Contextual Response Rephraser
Enterprise Search Policy
Intentless Policy
LLM Based Command Generator
LLM Based Router
Additionally, update the get_provider_from_config method to retrieve the provider using both the model and model_name configuration parameters.
-
Fixes throwing the deprecation warning if the setting for Azure OpenAI Embedding Client was not set through the deprecated environment variable.
-
Fix execution of stub custom actions when they contain test case name and the separator in its provided stub name. Test runner will now correctly execute the correct stub implementation for the same custom action dependent on the test name.
-
Add validation to conversation rephraser.
-
Ensure YAML files with datetime-formatted strings are read as plain strings instead of being converted to datetime objects.
-
Deprecate 'request_timeout' for OpenAI and Azure OpenAI clients in favor of 'timeout'
-
Forbid stream and n parameters for clients. Having these parameters within llm and embeddings configuration will result in an error.
-
Raise deprecation warning if api_type is set to huggingface instead of huggingface_local for HuggingFace local embeddings.
-
Fix resolving aliases for deprecated keys when instantiating LLM and embedding clients.
-
Fix detection of conftest file which contained custom LLM judge configuration.
-
Fix issue with Rasa Pro Studio download command exporting default flows which had not been customized by the Studio user. Rasa Pro Studio download command only exports user defined flows, customized patterns and user defined domain locally from the Studio instance.
Similarly, fix issue with Rasa Pro Studio upload command importing default flows which had not been customized to Studio. Rasa Pro Studio upload command only imports user defined flows, customized patterns and user defined domain to the Studio instance.
-
Disable auto-inferring provider from the config. Ensure the provider is explicitly read from the
provider
key. -
Fix writing e2e test cases to disk.
slot_was_set
andslot_was_not_set
are now written down correctly. -
The rephraser of the `rasa llm finetune prepare-data` command now compares the original user message and the user message returned in the LLM output case-insensitively.
-
[rasa llm finetune prepare-data] Do not rephrase user messages that come from a button payload.
-
Separate commands in the expected LLM output by newlines.
-
Fix TypeError in PatternClarificationContainsAssertion hash function by converting sets to lists for successful JSON serialization.
-
Fix validation in case a link to `pattern_human_handoff` is used.
-
[`rasa llm finetune prepare-data`] Skip paraphrasing module in case `num-rephrases` is set to 0.
-
Update the handling of incorrect use of slash syntax. Messages with undefined intents do not automatically trigger `pattern_cannot_handle`; instead, they are sanitized (prepended slash(es) are removed) and passed through the graph.
-
Allow suitable patterns to be properly started using nlu triggers
-
Fix API connection error for bedrock embedding endpoint.
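The `PatternClarificationContainsAssertion` entry in the list above rests on a general Python behavior: `json.dumps` raises `TypeError` when it meets a `set`. The following is a minimal sketch of that kind of fix, assuming the hash is derived from a JSON dump of the object's fields. The class `ClarificationAssertionSketch` and its `flow_names` field are hypothetical stand-ins, not Rasa's actual implementation.

```python
import json
from typing import Iterable


class ClarificationAssertionSketch:
    """Hypothetical stand-in for an assertion that stores a set of flow names."""

    def __init__(self, flow_names: Iterable[str]) -> None:
        self.flow_names = set(flow_names)

    def as_dict(self) -> dict:
        # A set is not JSON serializable, so emit a list instead.
        # Sorting makes the output independent of set iteration order.
        return {"flow_names": sorted(self.flow_names)}

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ClarificationAssertionSketch):
            return NotImplemented
        return self.flow_names == other.flow_names

    def __hash__(self) -> int:
        # Hash the JSON dump of the list-based representation.
        return hash(json.dumps(self.as_dict(), sort_keys=True))
```

Sorting before dumping is a design choice of this sketch: two assertions holding the same names in a different insertion order then produce the same hash.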
Miscellaneous internal changes
Miscellaneous internal changes.
[3.9.20] - 2025-04-14
Rasa Pro 3.9.20 (2025-04-14)
Bugfixes
- Updated `Inspector` dependent packages (cross-spawn, mermaid, dom-purify, vite, braces, ws, axios and rollup) to address security vulnerabilities.
- Modify Enterprise Search Citation Prompt Template to use `doc.text`
- Security patch for Audiocodes channel connector. Audiocodes channel was only checking for the existence of authentication token. It now does a constant-time string comparison of the authentication token in connection request with that provided in channel configuration.
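For readers unfamiliar with constant-time comparison, here is a minimal sketch using the Python standard library's `hmac.compare_digest`. The function name `token_is_valid` and its parameters are hypothetical; this illustrates the technique and is not the Audiocodes connector's actual code.

```python
import hmac
from typing import Optional


def token_is_valid(received: Optional[str], configured: str) -> bool:
    """Return True only if the received token equals the configured one."""
    # A missing or empty token is rejected outright.
    if not received:
        return False
    # compare_digest avoids content-based short circuiting, so the time taken
    # does not reveal at which position the two tokens first differ.
    # Encoding to bytes allows tokens with non-ASCII characters.
    return hmac.compare_digest(
        received.encode("utf-8"), configured.encode("utf-8")
    )
```

A plain `==` comparison can return as soon as the first mismatching character is found, which in principle lets an attacker infer a token prefix from response timing. Checking only that a token is present, as before the patch, accepts any value.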
[3.9.19] - 2025-01-30
Rasa Pro 3.9.19 (2025-01-30)
Improvements
- Remove unnecessary deepcopy to improve performance in `undo_fallback_prediction` method of `FallbackClassifier`
Bugfixes
- Fixed an issue where the `pattern_continue_interrupted` was not correctly triggered when the flow digressed to a step containing a link.
- Add the flow ID as a prefix to step ID to ensure uniqueness. This resolves a rare bug where steps in a child flow with a structure similar to those in a parent flow (using a "call" step) could result in duplicate step IDs. In this case duplicates previously caused incorrect next step selection.
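To illustrate why prefixing helps, here is a small sketch in which a parent flow and a child flow contain a step with the same ID. The flow names, the step IDs and the `<flow_id>_<step_id>` format are assumptions made for this example and may differ from Rasa's internal representation.

```python
from typing import Dict, List


def qualify_step_ids(flows: Dict[str, List[str]]) -> List[str]:
    """Prefix every step ID with the ID of the flow that owns it."""
    return [
        f"{flow_id}_{step_id}"  # hypothetical separator and format
        for flow_id, step_ids in flows.items()
        for step_id in step_ids
    ]


# Hypothetical flows: the child flow mirrors a step of the parent flow.
flows = {
    "parent_flow": ["0_collect_amount", "1_call_child"],
    "child_flow": ["0_collect_amount"],
}

raw_ids = [step_id for step_ids in flows.values() for step_id in step_ids]
qualified_ids = qualify_step_ids(flows)

# Without the prefix, "0_collect_amount" appears twice and is ambiguous.
print(len(raw_ids) != len(set(raw_ids)))
# With the prefix, every step ID is unique.
print(len(qualified_ids) == len(set(qualified_ids)))
```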
Miscellaneous internal changes
Miscellaneous internal changes.
[3.9.18] - 2025-01-15
Rasa Pro 3.9.18 (2025-01-15)
Bugfixes
- Fix an issue in e2e testing where customising `action_session_start` would lead to an AttributeError, because the `output_channel` was not set. This is now fixed by setting the `output_channel` to `CollectingOutputChannel()`.
- Pass flow human-readable name instead of flow id when the cancel pattern stack frame is pushed during flow policy validation checks of collect steps.
- Make `pattern_session_start` work with `rasa inspector` to allow the assistant to proactively start the conversation with a user.
- Fixes a critical security vulnerability with `jsonpickle` dependency by upgrading to the patched version.
- Updated `minio` to address security vulnerability.
[3.9.17] - 2024-12-05
Rasa Pro 3.9.17 (2024-12-05)
Bugfixes
- Implement `__eq__` and `__hash__` functions for `ChangeFlowCommand` to fix `error=unhashable type: 'ChangeFlowCommand'` error in `MultiStepCommandGenerator`.
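As background, a Python class that defines equality without also defining hashing becomes unhashable, which produces exactly this kind of `unhashable type` error when instances are put into a set or used as dict keys. The sketch below shows the general pattern of the fix. `ChangeFlowCommandSketch` and its `flow` field are hypothetical and do not reproduce Rasa's actual class.

```python
class OnlyEquality:
    """Defines __eq__ only, so Python sets its __hash__ to None."""

    def __eq__(self, other: object) -> bool:
        return isinstance(other, OnlyEquality)


class ChangeFlowCommandSketch:
    """Hypothetical command that defines both equality and hashing."""

    def __init__(self, flow: str = "") -> None:
        self.flow = flow  # hypothetical field, for illustration only

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ChangeFlowCommandSketch):
            return NotImplemented
        return self.flow == other.flow

    def __hash__(self) -> int:
        # Equal objects must hash equally, so hash the same data __eq__ uses.
        return hash((type(self).__name__, self.flow))


# Equal commands collapse to one entry when deduplicated through a set.
unique_commands = {ChangeFlowCommandSketch("a"), ChangeFlowCommandSketch("a")}
```

Calling `hash(OnlyEquality())` raises `TypeError: unhashable type: 'OnlyEquality'`, while the second class can be deduplicated with a set.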
[3.9.16] - 2024-11-26
Rasa Pro 3.9.16 (2024-11-26)
Bugfixes
-
Replace `pickle` and `joblib` with safer alternatives, e.g. `json`, `safetensors`, and `skops`, for serializing components.
Note: This is a model breaking change. Please retrain your model.
If you have a custom component that inherits from one of the components listed below and modified the `persist` or `load` method, make sure to update your code. Please contact us in case you encounter any problems.
Affected components:
CountVectorFeaturizer
LexicalSyntacticFeaturizer
LogisticRegressionClassifier
SklearnIntentClassifier
DIETClassifier
CRFEntityExtractor
TrackerFeaturizer
TEDPolicy
UnexpectedIntentTEDPolicy
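If your custom component overrides `persist` or `load`, the general idea is to store plain data instead of pickled Python objects. The sketch below uses only the standard library's `json` module. The class `VocabularyComponent`, its `vocabulary` attribute, the file name and the method signatures are all hypothetical; real Rasa components receive different arguments, so treat this as an illustration of the approach rather than a drop-in replacement.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


class VocabularyComponent:
    """Hypothetical component whose learned state is a plain dictionary."""

    def __init__(self, vocabulary: Dict[str, int]) -> None:
        self.vocabulary = vocabulary

    def persist(self, directory: Path) -> None:
        # JSON holds data only; reading it back cannot execute code,
        # unlike unpickling a file from an untrusted source.
        path = directory / "vocabulary.json"
        path.write_text(json.dumps(self.vocabulary), encoding="utf-8")

    @classmethod
    def load(cls, directory: Path) -> "VocabularyComponent":
        path = directory / "vocabulary.json"
        return cls(json.loads(path.read_text(encoding="utf-8")))


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    VocabularyComponent({"hello": 0, "world": 1}).persist(Path(tmp))
    restored = VocabularyComponent.load(Path(tmp))
```

State that is not JSON-friendly, such as model weights or fitted scikit-learn estimators, is where the `safetensors` and `skops` alternatives named in the entry come in.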
[3.9.15] - 2024-10-18
Rasa Pro 3.9.15 (2024-10-18)
Improvements
- Change default response of `utter_free_chitchat_response` from `"placeholder_this_utterance_needs_the_rephraser"` to `"Sorry, I'm not able to answer that right now."`.
Bugfixes
- Fix cleanup of `SetSlot` commands issued by the LLM-based command generator for slots that define a slot mapping other than the `from_llm` slot mapping. The command processor now correctly removes the SetSlot command in these scenarios and instead adds a `CannotHandleCommand`.
- Disallow using the command payload syntax to set slots not filled by any of the active or startable flow(s) `collect` steps.
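The `SetSlot` cleanup described in the first bugfix above can be pictured with the following simplified sketch. Everything here is hypothetical: the `SetSlot` and `CannotHandle` dataclasses, the `slot_mappings` dictionary of slot name to mapping type, and the choice to add at most one `CannotHandle` per turn are assumptions for illustration, not Rasa's command processor.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class SetSlot:
    """Hypothetical command asking to set a slot."""

    name: str
    value: str


@dataclass(frozen=True)
class CannotHandle:
    """Hypothetical command signalling that a request cannot be handled."""


Command = Union[SetSlot, CannotHandle]


def clean_up_commands(
    commands: List[Command], slot_mappings: Dict[str, str]
) -> List[Command]:
    """Drop LLM-issued SetSlot commands for slots not mapped with from_llm."""
    cleaned: List[Command] = []
    for command in commands:
        is_set_slot = isinstance(command, SetSlot)
        if is_set_slot and slot_mappings[command.name] != "from_llm":
            # Remove the SetSlot and add a CannotHandle in its place
            # (added only once per turn, an assumption of this sketch).
            if CannotHandle() not in cleaned:
                cleaned.append(CannotHandle())
        else:
            cleaned.append(command)
    return cleaned


# Hypothetical mapping of slot name to slot mapping type.
mappings = {"amount": "from_llm", "city": "from_entity"}
result = clean_up_commands(
    [SetSlot("amount", "50"), SetSlot("city", "Berlin")], mappings
)
```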