Version: 3.x

rasa.nlu.utils._spacy_utils

SpacyNLP Objects

class SpacyNLP(Component)

The core component that links spaCy to related components in the pipeline.

load_model

| @staticmethod
| load_model(spacy_model_name: Text) -> "Language"

Tries to load the model, catching the OSError raised if the model is missing.
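A minimal sketch of this load-and-catch pattern, not Rasa's actual implementation. `spacy.load` raises `OSError` when the named model package cannot be found. The `ModelNotFound` exception, the `try_load_model` name, and the `loader` parameter are hypothetical, introduced here so the example can be exercised with a stand-in loader.

```python
from typing import Any, Callable, Optional


class ModelNotFound(Exception):
    """Hypothetical error type for this sketch (not part of Rasa or spaCy)."""


def try_load_model(
    spacy_model_name: str, loader: Optional[Callable[[str], Any]] = None
) -> Any:
    """Load a model by name, turning a missing-model OSError into a clearer error."""
    if loader is None:
        # Imported lazily so the sketch also runs with a stand-in loader.
        import spacy

        loader = spacy.load
    try:
        return loader(spacy_model_name)
    except OSError as error:
        raise ModelNotFound(
            f"Model '{spacy_model_name}' could not be loaded. "
            "Check that it is installed."
        ) from error
```

Chaining with `from error` keeps the original `OSError` in the traceback, which helps when the failure is a broken install rather than a missing one.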

provide_context

| provide_context() -> Dict[Text, Any]

Creates a context dictionary from the spaCy nlp object.

doc_for_text

| doc_for_text(text: Text) -> "Doc"

Creates a spaCy Doc object from a string of text.

preprocess_text

| preprocess_text(text: Optional[Text]) -> Text

Processes the text before it is handled by spaCy.
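Because the signature accepts `Optional[Text]` but returns `Text`, a `None` input has to be mapped to some string. The sketch below assumes it becomes the empty string. The optional lowercasing and the `case_sensitive` flag are also assumptions made for illustration, not behaviour confirmed by this page.

```python
from typing import Optional


def preprocess_text_sketch(text: Optional[str], case_sensitive: bool = False) -> str:
    """Normalise text before tokenisation (illustrative only)."""
    if text is None:
        # Assumption: a missing text is treated as an empty string.
        text = ""
    # Assumption: case-insensitive setups lowercase the input.
    return text if case_sensitive else text.lower()
```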

merge_content_lists

| @staticmethod
| merge_content_lists(indexed_training_samples: List[Tuple[int, Text]], doc_lists: List[Tuple[int, "Doc"]]) -> List[Tuple[int, "Doc"]]

Merge lists with processed Docs back into their original order.
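One way to restore the original order is to key everything by sample index and sort. The sketch below is an assumption about how such a merge could work, with plain strings standing in for spaCy `Doc` objects. `merge_by_index` is a hypothetical name.

```python
from typing import Any, List, Tuple


def merge_by_index(
    indexed_training_samples: List[Tuple[int, str]],
    doc_lists: List[Tuple[int, Any]],
) -> List[Tuple[int, Any]]:
    """Replace each indexed text with its processed doc, ordered by index."""
    merged = dict(indexed_training_samples)
    # Processed docs overwrite the raw text stored under the same index.
    merged.update(dict(doc_lists))
    # Keys are unique, so sorting the items orders them by index alone.
    return sorted(merged.items())


samples = [(0, "hi"), (1, ""), (2, "bye")]
processed = [(2, "DOC(bye)"), (0, "DOC(hi)"), (1, "DOC()")]
merged_docs = merge_by_index(samples, processed)
```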

filter_training_samples_by_content

| @staticmethod
| filter_training_samples_by_content(indexed_training_samples: List[Tuple[int, Text]]) -> Tuple[List[Tuple[int, Text]], List[Tuple[int, Text]]]

Separates empty training samples from content-bearing ones.
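A minimal sketch of such a split. It assumes that "empty" means exactly the empty string and that the content-bearing list comes first in the returned tuple. Neither detail is stated on this page, and `split_by_content` is a hypothetical name.

```python
from typing import List, Tuple

IndexedSamples = List[Tuple[int, str]]


def split_by_content(
    indexed_training_samples: IndexedSamples,
) -> Tuple[IndexedSamples, IndexedSamples]:
    """Return (content-bearing samples, empty samples), keeping indices."""
    with_content = [s for s in indexed_training_samples if s[1] != ""]
    empty = [s for s in indexed_training_samples if s[1] == ""]
    return with_content, empty


to_pipe, empty_samples = split_by_content([(0, "hi"), (1, ""), (2, "bye")])
```

Keeping the index attached to each sample is what allows the two lists to be merged back into the original order later.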

process_content_bearing_samples

| process_content_bearing_samples(samples_to_pipe: List[Tuple[int, Text]]) -> List[Tuple[int, "Doc"]]

Sends content-bearing training samples to spaCy's pipe.
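The sketch below shows one way to batch indexed texts through a pipe while keeping each index paired with its result. The `pipe` argument is a hypothetical stand-in for any callable that takes an iterable of texts and yields one result per text in the same order. `pipe_indexed_samples` is also a hypothetical name.

```python
from typing import Any, Callable, Iterable, List, Tuple


def pipe_indexed_samples(
    samples_to_pipe: List[Tuple[int, str]],
    pipe: Callable[[Iterable[str]], Iterable[Any]],
) -> List[Tuple[int, Any]]:
    """Run all texts through `pipe` in one call and re-attach their indices."""
    texts = [text for _, text in samples_to_pipe]
    # zip relies on `pipe` yielding results in input order.
    return [
        (index, doc) for (index, _), doc in zip(samples_to_pipe, pipe(texts))
    ]


# Stand-in pipe that "tokenises" by splitting on whitespace.
fake_pipe = lambda texts: (text.split() for text in texts)
piped = pipe_indexed_samples([(0, "hello there"), (2, "bye")], fake_pipe)
```

With spaCy installed, `nlp.pipe` of a loaded `Language` object could be passed as `pipe`, since it yields `Doc` objects in input order.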

process_non_content_bearing_samples

| process_non_content_bearing_samples(empty_samples: List[Tuple[int, Text]]) -> List[Tuple[int, "Doc"]]

Creates empty Doc objects from zero-length training sample strings.

load

| @classmethod
| load(cls, meta: Dict[Text, Any], model_dir: Text, model_metadata: "Metadata" = None, cached_component: Optional["SpacyNLP"] = None, **kwargs: Any) -> "SpacyNLP"

Loads trained component (see parent class for full docstring).

ensure_proper_language_model

| @staticmethod
| ensure_proper_language_model(nlp: Optional["Language"]) -> None

Checks if the spaCy language model is properly loaded.

Raises an exception if the model is invalid.
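A sketch of what such a validity check might look like. The two conditions used here (the object is `None`, or it has no `path`) are assumptions about what "invalid" means, and `ensure_model_loaded` is a hypothetical name. spaCy's `Language` does expose a `path` attribute, but whether Rasa inspects it is not stated on this page.

```python
from typing import Any, Optional


def ensure_model_loaded(nlp: Optional[Any]) -> None:
    """Raise if the language model object looks unusable (illustrative only)."""
    if nlp is None:
        raise ValueError("Loading the spaCy language model returned None.")
    # Assumption: a model without a path on disk was not loaded properly.
    if getattr(nlp, "path", None) is None:
        raise ValueError("The spaCy language model has no path set.")
```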