rasa.utils.tensorflow.model_data_utils
featurize_training_examples
Converts training data into a list of attribute-to-features mappings.
Possible attributes are, for example, INTENT, RESPONSE, TEXT, ACTION_TEXT, ACTION_NAME or ENTITIES. The function also returns the sparse feature sizes for each attribute, which could look like this: {TEXT: {FEATURE_TYPE_SEQUENCE: [16, 32], FEATURE_TYPE_SENTENCE: [16, 32]}}.
Arguments:
training_examples - the list of training examples
attributes - the attributes to consider
entity_tag_specs - the entity specs
featurizers - the featurizers to consider
bilou_tagging - indicates whether BILOU tagging should be used or not
Returns:
A list of attribute-to-features mappings, and a dictionary that maps each attribute to its sparse feature sizes.
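The two return values can be pictured with plain Python containers. This is only a sketch of the shapes described above and does not call the Rasa API: the string constants are hypothetical stand-ins for Rasa's attribute and feature-type constants, the placeholder strings stand in for real feature objects, and it assumes that each number in a size list is the dimensionality contributed by one sparse featurizer.

```python
from typing import Dict, List

# Hypothetical stand-ins for Rasa's attribute and feature-type constants.
TEXT = "text"
INTENT = "intent"
FEATURE_TYPE_SEQUENCE = "sequence"
FEATURE_TYPE_SENTENCE = "sentence"

# First return value: one dictionary per training example, mapping each
# attribute to its list of features. Strings are placeholders for real
# feature objects.
featurized_examples: List[Dict[str, List[str]]] = [
    {
        TEXT: ["<sequence features>", "<sentence features>"],
        INTENT: ["<sentence features>"],
    },
    {
        TEXT: ["<sequence features>", "<sentence features>"],
        INTENT: ["<sentence features>"],
    },
]

# Second return value: attribute -> feature type -> list of sizes.
sparse_feature_sizes: Dict[str, Dict[str, List[int]]] = {
    TEXT: {
        FEATURE_TYPE_SEQUENCE: [16, 32],
        FEATURE_TYPE_SENTENCE: [16, 32],
    }
}

# Under the assumption stated above, the combined sparse sequence
# dimension for TEXT is the sum of the individual sizes.
total_sequence_size = sum(sparse_feature_sizes[TEXT][FEATURE_TYPE_SEQUENCE])
print(total_sequence_size)
```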
get_tag_ids
Creates a feature array containing the entity tag ids of the given example.
Arguments:
example - the message
tag_spec - entity tag spec
bilou_tagging - indicates whether BILOU tagging should be used or not
Returns:
A list of features.
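To make the effect of bilou_tagging concrete, here is a minimal sketch of the BILOU scheme itself (Begin, Inside, Last, Outside, Unit). It is not Rasa's implementation: to_bilou and the tag_to_id mapping are hypothetical helpers, and the sketch simplifies by treating adjacent tokens with the same label as one entity.

```python
from typing import Dict, List, Optional


def to_bilou(labels: List[Optional[str]]) -> List[str]:
    """Convert per-token entity labels (None = no entity) into BILOU tags."""
    tags = []
    for i, label in enumerate(labels):
        if label is None:
            tags.append("O")
            continue
        # A token starts/ends an entity when its neighbour has a different label.
        starts = i == 0 or labels[i - 1] != label
        ends = i == len(labels) - 1 or labels[i + 1] != label
        if starts and ends:
            tags.append(f"U-{label}")
        elif starts:
            tags.append(f"B-{label}")
        elif ends:
            tags.append(f"L-{label}")
        else:
            tags.append(f"I-{label}")
    return tags


# Tokens: "fly", "to", "New", "York", "City", "tomorrow"
labels = [None, None, "city", "city", "city", None]
bilou_tags = to_bilou(labels)

# Hypothetical tag-to-id mapping; in Rasa this information would come
# from the entity tag spec.
tag_to_id: Dict[str, int] = {
    "O": 0, "B-city": 1, "I-city": 2, "L-city": 3, "U-city": 4,
}
tag_ids = [tag_to_id[tag] for tag in bilou_tags]
print(bilou_tags)
print(tag_ids)
```

Without BILOU tagging, each token would carry only the plain entity label (or a "no entity" tag), so the id vocabulary is smaller but entity boundaries are not encoded.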
convert_to_data_format
Converts the input into "Data" format.
"features" can, for example, be a dictionary of attributes (INTENT, TEXT, ACTION_NAME, ACTION_TEXT, ENTITIES, SLOTS, FORM) to a list of features for all dialogue turns in all training trackers. For NLU training it would just be a dictionary of attributes (either INTENT or RESPONSE, TEXT, and potentially ENTITIES) to a list of features for all training examples.
The "Data" format corresponds to Dict[Text, Dict[Text, List[FeatureArray]]]. It's a dictionary of attributes (e.g. TEXT) to a dictionary of secondary attributes (e.g. SEQUENCE or SENTENCE) to the list of actual features.
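The nesting of the "Data" format can be sketched with ordinary dictionaries. The snippet below is illustrative only: the FeatureArray alias is a hypothetical stand-in built from nested lists rather than Rasa's own class, the lowercase string keys stand in for the attribute constants, and add_features is a helper invented for this example.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List

# Hypothetical stand-in for Rasa's FeatureArray, kept as nested lists so
# the sketch has no dependencies.
FeatureArray = List[List[float]]

# attribute -> secondary attribute -> list of feature arrays
Data = Dict[str, Dict[str, List[FeatureArray]]]


def add_features(
    data: DefaultDict[str, DefaultDict[str, List[FeatureArray]]],
    attribute: str,
    sub_attribute: str,
    features: FeatureArray,
) -> None:
    """Append one feature array under data[attribute][sub_attribute]."""
    data[attribute][sub_attribute].append(features)


data: DefaultDict[str, DefaultDict[str, List[FeatureArray]]] = defaultdict(
    lambda: defaultdict(list)
)
add_features(data, "text", "sequence", [[0.1, 0.2], [0.3, 0.4]])
add_features(data, "text", "sentence", [[0.5, 0.6]])
add_features(data, "intent", "sentence", [[1.0, 0.0]])

for attribute, sub_attributes in data.items():
    for sub_attribute, arrays in sub_attributes.items():
        print(attribute, sub_attribute, len(arrays))
```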
Arguments:
features - a dictionary of attributes to a list of features for all examples in the training data
fake_features - Contains default feature values for attributes
consider_dialogue_dimension - If set to false the dialogue dimension will be removed from the resulting sequence features.
featurizers - the featurizers to consider
Returns:
The input in "Data" format, and the fake features.