Version: 3.x
rasa.nlu.featurizers.sparse_featurizer._count_vectors_featurizer
CountVectorsFeaturizer Objects
class CountVectorsFeaturizer(SparseFeaturizer)
Creates a sequence of token-count features based on sklearn's CountVectorizer.
All tokens which consist only of digits (e.g. 123 and 99 but not ab12d) will be represented by a single feature.
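The digit rule above can be illustrated with a minimal standard-library sketch. It is not the featurizer's actual code: the placeholder string `__NUMBER__` and the helper `collapse_digit_tokens` are hypothetical names chosen for illustration, and the assumption is that digit-only tokens are mapped to one shared vocabulary entry before counting.

```python
import re
from collections import Counter

# Hypothetical placeholder name; the value used internally is an assumption.
NUMBER_PLACEHOLDER = "__NUMBER__"


def collapse_digit_tokens(tokens):
    """Replace every token made up only of digits with one shared placeholder."""
    return [
        NUMBER_PLACEHOLDER if re.fullmatch(r"[0-9]+", token) else token
        for token in tokens
    ]


tokens = ["order", "123", "and", "99", "ab12d"]
collapsed = collapse_digit_tokens(tokens)

# "123" and "99" now share one entry; "ab12d" is kept because it mixes
# letters and digits.
counts = Counter(collapsed)
print(collapsed)
print(counts[NUMBER_PLACEHOLDER])
```

Because both numbers map to the same entry, the vocabulary does not grow with every distinct number seen in the training data.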
Set analyzer to 'char_wb' to use the idea of Subword Semantic Hashing from https://arxiv.org/abs/1810.07150.
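To show what the 'char_wb' analyzer produces, here is a small sketch that calls sklearn's CountVectorizer directly rather than going through the Rasa component. The n-gram range of (3, 3) is an arbitrary choice for the example, not a documented default of this featurizer.

```python
from sklearn.feature_extraction.text import CountVectorizer

# 'char_wb' builds character n-grams inside word boundaries only;
# each word is padded with one space on either side.
vectorizer = CountVectorizer(analyzer="char_wb", ngram_range=(3, 3))
analyzer = vectorizer.build_analyzer()

ngrams = analyzer("hello")
print(ngrams)  # [' he', 'hel', 'ell', 'llo', 'lo ']
```

Subword features of this kind let words that share character sequences, such as misspellings or inflected forms, share some features instead of being treated as unrelated tokens.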
__init__
| __init__(component_config: Optional[Dict[Text, Any]] = None, vectorizers: Optional[Dict[Text, "CountVectorizer"]] = None, finetune_mode: bool = False) -> None
Construct a new count vectorizer using the sklearn framework.
train
| train(training_data: TrainingData, cfg: Optional[RasaNLUModelConfig] = None, **kwargs: Any) -> None
Train the featurizer.
Take parameters from config and construct a new count vectorizer using the sklearn framework.
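As a rough illustration of building a vectorizer from configuration, the sketch below maps a plain dictionary onto CountVectorizer arguments. The helper `build_vectorizer`, the dictionary keys, and the fallback values are assumptions made for this example; the keys the component actually reads are not listed in this section.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical configuration; key names and values are assumptions.
config = {"analyzer": "char_wb", "min_ngram": 1, "max_ngram": 4, "lowercase": True}


def build_vectorizer(config):
    """Construct a CountVectorizer from a plain config dictionary."""
    return CountVectorizer(
        analyzer=config.get("analyzer", "word"),
        ngram_range=(config.get("min_ngram", 1), config.get("max_ngram", 1)),
        lowercase=config.get("lowercase", True),
    )


vectorizer = build_vectorizer(config)

# Fitting learns the vocabulary; the result is a sparse count matrix
# with one row per input text.
features = vectorizer.fit_transform(["book a table", "book a flight"])
print(features.shape[0])
```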
process
| process(message: Message, **kwargs: Any) -> None
Process an incoming message: compute its features and set them on the message.
persist
| persist(file_name: Text, model_dir: Text) -> Optional[Dict[Text, Any]]
Persist this model into the passed directory.
Returns the metadata necessary to load the model again.
load
| @classmethod
| load(cls, meta: Dict[Text, Any], model_dir: Text, model_metadata: Optional[Metadata] = None, cached_component: Optional["CountVectorsFeaturizer"] = None, should_finetune: bool = False, **kwargs: Any) -> "CountVectorsFeaturizer"
Loads the trained component (see parent class for full docstring).