rasa.nlu.test
CVEvaluationResult Objects
Stores NLU cross-validation results.
log_evaluation_table
Log the sklearn evaluation metrics.
remove_empty_intent_examples
Remove those examples without an intent.
Arguments:
intent_results
- intent evaluation results
Returns:
intent evaluation results
remove_empty_response_examples
Remove those examples without a response.
Arguments:
response_results
- response selection evaluation results
Returns:
Response selection evaluation results
drop_intents_below_freq
Remove intent groups with fewer than cutoff instances.
Arguments:
training_data
- training data
cutoff
- threshold
Returns:
updated training data
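For example, a minimal usage sketch (the data path is hypothetical; load_data is Rasa's training-data loader):

```python
from rasa.shared.nlu.training_data.loading import load_data
from rasa.nlu.test import drop_intents_below_freq

training_data = load_data("data/nlu.yml")  # hypothetical path
# Keep only intents with at least 10 training examples.
filtered_data = drop_intents_below_freq(training_data, cutoff=10)
```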
write_intent_successes
Write successful intent predictions to a file.
Arguments:
intent_results
- intent evaluation results
successes_filename
- filename of file to save successful predictions to
write_response_successes
Write successful response selection predictions to a file.
Arguments:
response_results
- response selection evaluation results
successes_filename
- filename of file to save successful predictions to
plot_attribute_confidences
Create histogram of confidence distribution.
Arguments:
results
- evaluation results
hist_filename
- filename to save plot to
target_key
- key of target in results
prediction_key
- key of predictions in results
title
- title of plot
plot_entity_confidences
Creates histogram of confidence distribution.
Arguments:
merged_targets
- Entity labels.
merged_predictions
- Predicted entities.
merged_confidences
- Confidence scores of predictions.
hist_filename
- filename to save plot to
title
- title of plot
evaluate_response_selections
Creates summary statistics for response selection.
Only considers those examples with a set response. Others are filtered out. Returns a dictionary containing the evaluation result.
Arguments:
response_selection_results
- response selection evaluation results
output_directory
- directory to store files to
successes
- if True successful predictions are written to disk
errors
- if True errors are written to disk
disable_plotting
- if True no plots are created
report_as_dict
- True if the evaluation report should be returned as a dict. If False, the report is returned in a human-readable text format. If None, report_as_dict is considered True in case an output_directory is given.
Returns:
dictionary with evaluation results
evaluate_intents
Creates summary statistics for intents.
Only considers those examples with a set intent. Others are filtered out. Returns a dictionary containing the evaluation result.
Arguments:
intent_results
- intent evaluation results
output_directory
- directory to store files to
successes
- if True correct predictions are written to disk
errors
- if True incorrect predictions are written to disk
disable_plotting
- if True no plots are created
report_as_dict
- True if the evaluation report should be returned as a dict. If False, the report is returned in a human-readable text format. If None, report_as_dict is considered True in case an output_directory is given.
Returns:
dictionary with evaluation results
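For orientation, a sketch of the shape of the returned dictionary; the keys follow Rasa's scikit-learn-style report, but the exact layout may vary by version and the numbers below are made up:

```python
# Illustrative only - not the authoritative output format.
evaluation = {
    "predictions": [
        # one entry per test message: target intent, prediction, confidence
        {"text": "hi there", "intent": "greet", "predicted": "greet", "confidence": 0.97},
    ],
    "report": {},       # per-intent precision/recall/f1, dict or text (see report_as_dict)
    "precision": 0.93,  # made-up numbers
    "f1_score": 0.92,
    "accuracy": 0.94,
}
```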
merge_labels
Concatenates all labels of the aligned predictions.
Takes the aligned prediction labels which are grouped for each message and concatenates them.
Arguments:
aligned_predictions
- aligned predictions
extractor
- entity extractor name
Returns:
Concatenated predictions
merge_confidences
Concatenates all confidences of the aligned predictions.
Takes the aligned prediction confidences which are grouped for each message and concatenates them.
Arguments:
aligned_predictions
- aligned predictions
extractor
- entity extractor name
Returns:
Concatenated confidences
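Conceptually, both merge functions flatten the per-message groups into one long sequence so metrics can be computed over all tokens at once. A sketch with an illustrative (not authoritative) aligned-predictions layout:

```python
import itertools

# Hypothetical layout: one dict per message, token labels grouped per extractor.
aligned_predictions = [
    {"target_labels": ["O", "city"], "extractor_labels": {"DIETClassifier": ["O", "city"]}},
    {"target_labels": ["name"], "extractor_labels": {"DIETClassifier": ["O"]}},
]

# Concatenate the per-message label groups into a single flat list.
merged_targets = list(
    itertools.chain.from_iterable(p["target_labels"] for p in aligned_predictions)
)
# -> ["O", "city", "name"]
```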
substitute_labels
Replaces label names in a list of labels.
Arguments:
labels
- list of labels
old
- old label name that should be replaced
new
- new label name
Returns:
updated labels
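A minimal sketch; Rasa's evaluation uses this to swap the generic "O" (outside) tag for a no-entity label before computing metrics, though the replacement name shown here is only an example:

```python
from rasa.nlu.test import substitute_labels

labels = ["O", "city", "O", "name"]
updated = substitute_labels(labels, "O", "no_entity")  # "no_entity" is an example label
# -> ["no_entity", "city", "no_entity", "name"]
```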
collect_incorrect_entity_predictions
Get incorrect entity predictions.
Arguments:
entity_results
- entity evaluation results
merged_predictions
- list of predicted entity labels
merged_targets
- list of true entity labels
Returns:
list of incorrect predictions
write_successful_entity_predictions
Write correct entity predictions to a file.
Arguments:
entity_results
- entity evaluation results
merged_predictions
- list of predicted entity labels
merged_targets
- list of true entity labels
successes_filename
- filename of file to save correct predictions to
collect_successful_entity_predictions
Get correct entity predictions.
Arguments:
entity_results
- entity evaluation results
merged_predictions
- list of predicted entity labels
merged_targets
- list of true entity labels
Returns:
list of correct predictions
evaluate_entities
Creates summary statistics for each entity extractor.
Logs precision, recall, and F1 per entity type for each extractor.
Arguments:
entity_results
- entity evaluation results
extractors
- entity extractors to consider
output_directory
- directory to store files to
successes
- if True correct predictions are written to disk
errors
- if True incorrect predictions are written to disk
disable_plotting
- if True no plots are created
report_as_dict
- True if the evaluation report should be returned as a dict. If False, the report is returned in a human-readable text format. If None, report_as_dict is considered True in case an output_directory is given.
Returns:
dictionary with evaluation results
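The result is keyed by extractor name; an illustrative (not authoritative) shape with made-up numbers:

```python
entity_evaluation = {
    "DIETClassifier": {
        "report": {},       # precision/recall/f1 per entity type
        "precision": 0.88,  # made-up numbers
        "recall": 0.85,
        "f1_score": 0.86,
        "accuracy": 0.90,
    }
}
```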
is_token_within_entity
Checks if a token is within the boundaries of an entity.
does_token_cross_borders
Checks if a token crosses the boundaries of an entity.
determine_intersection
Calculates how many characters a given token and entity share.
do_entities_overlap
Checks if entities overlap.
I.e., they cross each other's start and end boundaries.
Arguments:
entities
- list of entities
Returns:
true if entities overlap, false otherwise
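A sketch using Rasa-style entity dicts, where start and end are character offsets into the message text (the spans below are contrived to cross):

```python
from rasa.nlu.test import do_entities_overlap

# For the text "New York City": the two spans cross each other's boundaries.
entities = [
    {"start": 0, "end": 8, "entity": "city", "value": "New York"},
    {"start": 4, "end": 13, "entity": "airport", "value": "York City"},
]
print(do_entities_overlap(entities))  # True
```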
find_intersecting_entities
Finds the entities that intersect with a token.
Arguments:
token
- a single token
entities
- entities found by a single extractor
Returns:
list of entities
pick_best_entity_fit
Determines the best fitting entity given intersecting entities.
Arguments:
token
- a single token
candidates
- entities found by a single extractor
attribute_key
- the attribute key of interest
Returns:
the value of the attribute key of the best fitting entity
determine_token_labels
Determines the token label for the provided attribute key given entities that do not overlap.
Arguments:
token
- a single token
entities
- entities found by a single extractor
extractors
- list of extractors
attribute_key
- the attribute key for which the entity type should be returned
Returns:
entity type
determine_entity_for_token
Determines the best fitting entity for the given token, given entities that do not overlap.
Arguments:
token
- a single token
entities
- entities found by a single extractor
extractors
- list of extractors
Returns:
entity type
do_any_extractors_not_support_overlap
Checks if any extractor does not support overlapping entities.
Arguments:
extractors
- names of the entity extractors
Returns:
True if and only if CRFEntityExtractor or DIETClassifier is in extractors
align_entity_predictions
Aligns entity predictions to the message tokens.
Determines for every token the true label based on the prediction targets and the label assigned by each single extractor.
Arguments:
result
- entity evaluation result
extractors
- the entity extractors that should be considered
Returns:
dictionary containing the true token labels and token labels from the extractors
align_all_entity_predictions
Aligns entity predictions to the message tokens for the whole dataset using align_entity_predictions.
Arguments:
entity_results
- list of entity prediction results
extractors
- the entity extractors that should be considered
Returns:
list of dictionaries containing the true token labels and token labels from the extractors
get_eval_data
Runs the model for the test set and extracts targets and predictions.
Returns intent results (intent targets and predictions, the original messages, and the confidences of the predictions), response results (response targets and predictions), as well as entity results (entity_targets, entity_predictions, and tokens).
Arguments:
processor
- the processor
test_data
- test data
Returns:
intent, response, and entity evaluation results
run_evaluation
Evaluate intent classification, response selection and entity extraction.
Arguments:
data_path
- path to the test data
processor
- the processor used to process and predict
output_directory
- path to folder where all output will be stored
successes
- if true successful predictions are written to a file
errors
- if true incorrect predictions are written to a file
disable_plotting
- if true confusion matrix and histogram will not be rendered
report_as_dict
- True if the evaluation report should be returned as a dict. If False, the report is returned in a human-readable text format. If None, report_as_dict is considered True in case an output_directory is given.
domain_path
- path to the domain file(s)
Returns:
dictionary containing evaluation results
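A hedged sketch of driving this programmatically, assuming a Rasa 3.x setup where run_evaluation is a coroutine and a trained model archive exists at the hypothetical path below; the rasa test nlu CLI covers the same flow:

```python
import asyncio
from rasa.core.agent import Agent
from rasa.nlu.test import run_evaluation

async def evaluate() -> dict:
    # Hypothetical model path; Agent.load unpacks the trained model.
    agent = Agent.load("models/nlu-model.tar.gz")
    return await run_evaluation(
        "data/test_nlu.yml",         # data_path: the test data
        agent.processor,             # the processor used to process and predict
        output_directory="results",  # reports, plots, success/error files land here
        errors=True,                 # write incorrect predictions to a file
    )

results = asyncio.run(evaluate())
```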
generate_folds
Generates n cross validation folds for given training data.
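A consumption sketch, assuming the generator yields (train, test) pairs of TrainingData with the fold count as the first argument, as cross_validate uses it internally:

```python
from rasa.nlu.test import generate_folds

# training_data is a rasa.shared.nlu.training_data.training_data.TrainingData
for train_split, test_split in generate_folds(5, training_data):
    # train a model on train_split, then evaluate it on test_split
    ...
```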
combine_result
Collects intent, response selection and entity metrics for cross validation folds.
If intent_results, response_selection_results or entity_results is provided as a list, prediction results are also collected.
Arguments:
intent_metrics
- intent metrics
entity_metrics
- entity metrics
response_selection_metrics
- response selection metrics
processor
- the processor
data
- training data
intent_results
- intent evaluation results
entity_results
- entity evaluation results
response_selection_results
- response selection evaluation results
Returns:
intent, entity, and response selection metrics
cross_validate
Stratified cross validation on data.
Arguments:
data
- Training Data
n_folds
- integer, number of cv folds
nlu_config
- nlu config file
output
- path to folder where reports are stored
successes
- if true successful predictions are written to a file
errors
- if true incorrect predictions are written to a file
disable_plotting
- if true no confusion matrix and histogram plots are created
report_as_dict
- True if the evaluation report should be returned as a dict. If False, the report is returned in a human-readable text format. If None, report_as_dict is considered True in case an output directory is given.
Returns:
dictionary with a key-to-list structure, where each entry in a list corresponds to the relevant result for one fold
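A usage sketch assuming the signature documented above (paths are hypothetical; in some Rasa versions this call may need to be awaited):

```python
from rasa.shared.nlu.training_data.loading import load_data
from rasa.nlu.test import cross_validate

data = load_data("data/nlu.yml")  # hypothetical training data path
results = cross_validate(
    data,
    n_folds=5,
    nlu_config="config.yml",      # hypothetical NLU config file
    output="cv_results",
    errors=True,
)
```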
compute_metrics
Computes metrics for intent classification, response selection and entity extraction.
Arguments:
processor
- the processor
training_data
- training data
Returns:
intent, response selection and entity metrics, and prediction results
compare_nlu
Trains and compares multiple NLU models. For each run and exclusion percentage, one model per config file is trained on only the current percentage of the training data. Afterwards, each model is tested on the complete test data of that run. All results are stored in the provided output directory.
Arguments:
configs
- config files needed for training
data
- training data
exclusion_percentages
- percentages of training data to exclude during comparison
f_score_results
- dictionary of model name to f-score results per run
model_names
- names of the models to train
output
- the output directory
runs
- number of comparison runs
Returns:
training examples per run
log_results
Logs results of cross validation.
Arguments:
results
- dictionary of results returned from cross validation
dataset_name
- string of which dataset the results are from, e.g. test/train
log_entity_results
Logs entity results of cross validation.
Arguments:
results
- dictionary of dictionaries of results returned from cross validation
dataset_name
- string of which dataset the results are from, e.g. test/train