
NLU Training

Using the playground, we can experiment with multiple NLU configurations to determine the best one.

Dataset Generation

The first step is to generate one or more datasets. To generate a dataset, create a .yml file in the model with the following header:

max_samples_per_intent: 50
test_split_percentage: 10

The file can be placed anywhere. A good practice is to place it into the nlu/datasets folder.

max_samples_per_intent sets the maximum number of samples per intent that will be included in the dataset. test_split_percentage sets the percentage of the generated samples that is set aside as test data; the rest is used as training data.
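To make the effect of these two settings concrete, here is a minimal Python sketch of how such a cap and split could be applied. It is an illustration only, not the platform's actual generator: the function name build_dataset, the (text, intent) input format, and the choice to split each intent separately are all assumptions.

```python
import random
from collections import defaultdict


def build_dataset(samples, max_samples_per_intent=50, test_split_percentage=10, seed=0):
    """Cap samples per intent, then split each intent into train and test.

    `samples` is a list of (text, intent) pairs. This helper is hypothetical
    and only illustrates the meaning of the two header settings.
    """
    rng = random.Random(seed)

    # Group the sample texts by intent.
    by_intent = defaultdict(list)
    for text, intent in samples:
        by_intent[intent].append(text)

    train, test = [], []
    for intent, texts in sorted(by_intent.items()):
        texts = list(texts)
        rng.shuffle(texts)
        # Keep at most max_samples_per_intent samples for this intent.
        kept = texts[:max_samples_per_intent]
        # Reserve test_split_percentage percent of the kept samples for testing.
        n_test = round(len(kept) * test_split_percentage / 100)
        test.extend((t, intent) for t in kept[:n_test])
        train.extend((t, intent) for t in kept[n_test:])
    return train, test
```

Under these assumptions, an intent with 80 candidate samples would be capped at 50, of which 5 go to the test set and 45 to the training set.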

To launch the generation, click the Generate Dataset option in the Train submenu. When the generation finishes, a confirmation appears in the chat and the dataset file is updated with the generated data.


You can generate a dataset multiple times. The data will be overwritten every time.

Training Using a Dataset

To train a model using an already generated dataset, rather than generating new data, make sure the model config file includes a reference to the dataset file:

- name: default
  threshold: 0.8
  dataset_file: nlu/datasets/d10.yml
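
Before training, it can be useful to confirm that every dataset_file reference points to a file that actually exists. The sketch below assumes the config entries have already been loaded into a list of Python dictionaries and that paths are relative to the model root; both are assumptions, and missing_dataset_files is a hypothetical helper, not part of the platform.

```python
from pathlib import Path


def missing_dataset_files(entries, model_root):
    """Return the dataset_file values that do not exist under model_root.

    `entries` is a list of dicts such as
    {"name": "default", "threshold": 0.8, "dataset_file": "nlu/datasets/d10.yml"}.
    Entries without a dataset_file key are skipped.
    """
    root = Path(model_root)
    missing = []
    for entry in entries:
        dataset_file = entry.get("dataset_file")
        # Paths are assumed to be relative to the model root.
        if dataset_file and not (root / dataset_file).is_file():
            missing.append(dataset_file)
    return missing
```

An empty result means every referenced dataset file was found.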

Testing a Model

To test the current model against a dataset, you have to create a .yml test file with the following header:

dataset_file: nlu/datasets/d10.yml

To launch the test, click the Test Model option from the Train submenu.


The test file can be placed anywhere. A good practice is to place it in the nlu/tests folder.