rasa.core.featurizers
SingleStateFeaturizer Objects
Base class for mechanisms that transform the conversation state into ML formats.
Subclasses of SingleStateFeaturizer decide how the bot transforms the conversation state into a format that a classifier can read: a feature vector.
prepare_from_domain
Helper method to init based on domain.
encode
Encode user input.
action_as_one_hot
Encode system action as one-hot vector.
create_encoded_all_actions
Create matrix with all actions from domain encoded in rows.
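The one-hot encoding of actions described above can be illustrated with a small standalone sketch. It is not the Rasa implementation: the function names `one_hot_action` and `encode_all_actions` and the action list are hypothetical, chosen only to show the shape of the result.

```python
import numpy as np

# Hypothetical action names; a real domain would supply its own list.
ACTION_NAMES = ["action_listen", "utter_greet", "utter_goodbye"]


def one_hot_action(action, action_names):
    """Return a vector with a single 1 at the index of `action`."""
    vec = np.zeros(len(action_names), dtype=int)
    vec[action_names.index(action)] = 1
    return vec


def encode_all_actions(action_names):
    """Stack the one-hot vector of every action, one action per row."""
    return np.array([one_hot_action(a, action_names) for a in action_names])


encoded = encode_all_actions(ACTION_NAMES)
print(encoded)
```

With one-hot rows the resulting matrix is simply the identity matrix, with one row and one column per action.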
BinarySingleStateFeaturizer Objects
Assumes all features are binary.
All features should be either on or off, denoted by 1 or 0.
__init__
Declares instance variables.
prepare_from_domain
Use Domain to prepare featurizer.
encode
Returns a binary vector indicating which features are active.
Given a dictionary of states (e.g. 'intent_greet', 'prev_action_listen', ...), return a binary vector indicating which features of self.input_features are in the bag. NB: it is a regular double-precision float array.
For example, with two active features out of five possible features this would return a vector like [0 0 1 0 1].
If intent features are given with a probability, for example with two active features and two uncertain intents out of five possible features, this would return a vector like [0.3, 0.7, 1.0, 0, 1.0].
If this is just a padding vector, all values are set to -1. Padding vectors are specified by a None or [None] value for states.
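The encoding rules above can be sketched as a standalone function. This is an illustration only, not the library code: the function name `encode_binary`, the feature names, and the assumption that a state maps feature names to float values are all hypothetical.

```python
import numpy as np

# Hypothetical feature list standing in for self.input_features.
FEATURES = [
    "intent_greet",
    "intent_bye",
    "entity_name",
    "slot_location",
    "prev_action_listen",
]


def encode_binary(state, input_features):
    """Encode a state (assumed: dict of feature name -> value) as floats."""
    # A None or [None] state marks a padding vector: every value is -1.
    if state is None or state == [None]:
        return np.full(len(input_features), -1.0)

    index = {name: i for i, name in enumerate(input_features)}
    vec = np.zeros(len(input_features))  # float64 by default
    for name, value in state.items():
        if name in index:  # features unknown to the featurizer are skipped
            vec[index[name]] = value
    return vec


# Two uncertain intents plus two fully active features.
state = {
    "intent_greet": 0.3,
    "intent_bye": 0.7,
    "entity_name": 1.0,
    "prev_action_listen": 1.0,
}
print(encode_binary(state, FEATURES))
print(encode_binary(None, FEATURES))
```

Skipping unknown feature names is a design choice of this sketch; the source does not say how such features are handled.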
create_encoded_all_actions
Create matrix with all actions from domain encoded in rows as bag of words.
LabelTokenizerSingleStateFeaturizer Objects
Creates bag-of-words feature vectors.
User intents and bot action names are split into tokens and used to create bag-of-words feature vectors.
Arguments:
split_symbol - The symbol that separates words in intents and action names.
use_shared_vocab - The flag that specifies whether to create the same vocabulary for user intents and bot actions.
__init__
Initializes the vocabulary for the label bag-of-words representation.
prepare_from_domain
Creates internal vocabularies for user intents and bot actions.
encode
Returns a binary vector indicating which tokens are present.
create_encoded_all_actions
Create matrix with all actions from domain encoded in rows as bag of words.
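As a rough illustration of the label-tokenizing idea, the sketch below splits action names on a split symbol, builds a vocabulary from the tokens, and encodes each name as a bag-of-words vector. The helper names `build_vocab` and `bag_of_words`, the "_" split symbol, and the action list are assumptions made for the example; the library's internal vocabulary handling may differ.

```python
import numpy as np


def build_vocab(labels, split_symbol="_"):
    """Map every distinct token found in `labels` to a column index."""
    tokens = sorted({tok for label in labels for tok in label.split(split_symbol)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(label, vocab, split_symbol="_"):
    """Return a binary vector marking which vocabulary tokens occur in `label`."""
    vec = np.zeros(len(vocab), dtype=int)
    for tok in label.split(split_symbol):
        if tok in vocab:
            vec[vocab[tok]] = 1
    return vec


# Hypothetical action names.
ACTIONS = ["utter_greet", "utter_goodbye", "action_listen"]
vocab = build_vocab(ACTIONS)

# One row per action, one column per token.
encoded_actions = np.array([bag_of_words(a, vocab) for a in ACTIONS])
print(vocab)
print(encoded_actions)
```

Under this sketch, a shared vocabulary would amount to calling `build_vocab` once on the intents and action names together, while separate vocabularies would mean one call for each list.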
TrackerFeaturizer Objects
Base class for actual tracker featurizers.
training_states_and_actions
Transforms list of trackers to lists of states and actions.
featurize_trackers
Create training data.
prediction_states
Transforms list of trackers to lists of states for prediction.
create_X
Create X for prediction.
load
Loads the featurizer from file.
FullDialogueTrackerFeaturizer Objects
Creates full dialogue training data for time distributed architectures.
Creates training data that uses each time output for prediction. Training data is padded up to the length of the longest dialogue with -1.
training_states_and_actions
Transforms list of trackers to lists of states and actions.
Training data is padded up to the length of the longest dialogue with -1.
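Padding to the longest dialogue can be pictured with the following sketch. It assumes each dialogue is already a 2-D array of shape (turns, features) and that padding is appended at the end; both the helper name `pad_dialogues` and the padding position are assumptions, not details confirmed by the text above.

```python
import numpy as np


def pad_dialogues(dialogues, pad_value=-1.0):
    """Pad 2-D arrays (turns x features) to a common length.

    Returns a 3-D array of shape (n_dialogues, max_turns, n_features).
    """
    max_len = max(len(d) for d in dialogues)
    n_features = dialogues[0].shape[1]
    padded = np.full((len(dialogues), max_len, n_features), pad_value)
    for i, d in enumerate(dialogues):
        padded[i, : len(d)] = d  # real turns first, padding rows after (assumed)
    return padded


short = np.ones((2, 3))  # dialogue with 2 turns
long_ = np.ones((4, 3))  # dialogue with 4 turns
padded = pad_dialogues([short, long_])
print(padded.shape)
```

Every dialogue then has the same number of time steps, which is what a time-distributed architecture needs in order to batch them.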
prediction_states
Transforms list of trackers to lists of states for prediction.
MaxHistoryTrackerFeaturizer Objects
Slices the tracker history into max_history batches.
Creates training data that uses last output for prediction. Training data is padded up to the max_history with -1.
slice_state_history
Slices states from the tracker's history.
If the slice is at the array borders, padding will be added to ensure the slice length.
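The slicing behaviour described above might look like the sketch below. It assumes padding entries are represented by None and are placed before the real states; the function name `slice_history` and the example states are hypothetical.

```python
def slice_history(states, slice_length):
    """Return the last `slice_length` states, padded with None on the left."""
    slice_end = len(states)
    slice_start = max(0, slice_end - slice_length)
    # Padding is needed only when fewer states exist than the slice length.
    padding = [None] * max(0, slice_length - slice_end)
    return padding + states[slice_start:]


# Hypothetical states; real states would be feature dictionaries.
states = ["s1", "s2", "s3"]

# One window per turn, each ending at that turn.
windows = [slice_history(states[: i + 1], 2) for i in range(len(states))]
print(windows)
```

Each window has exactly `slice_length` entries, so early turns with too little history are filled out with padding instead of producing shorter inputs.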
training_states_and_actions
Transforms list of trackers to lists of states and actions.
Training data is padded up to the max_history with -1.
prediction_states
Transforms list of trackers to lists of states for prediction.