Train NLU Models using AutoNLP

You don't need any machine learning knowledge to train NLU models on the NeuralSpace Platform. This article shows how to train, monitor, and manage your models.

Prerequisites

Getting Started: Make sure to follow Getting Started to log in and install the Language Understanding Service. If you are using APIs, save your authorization token in a variable called AUTHORIZATION_TOKEN before moving ahead.
Create a Project:
- Make sure to create a project and have the project ID in a variable called PROJECT_ID.
- Make sure to have the language for which you added training examples in a variable called LANGUAGE.
Add training data: Make sure to have at least two intents with 10 examples each.

Train Model

Multiple Train Jobs

When you have little training data, the same model trained separately several times can show slight variation in performance (2-4%). To work around this, you can run multiple train jobs in parallel on the same data and then select the model that performs best. By default, we run 5 training jobs for you, but you can set any number by changing the noOfTrainingJob parameter in the train API.

The train API launches a training job on our Platform and returns a unique model ID, which you can use to monitor the training status of the job. Name the model with the modelName parameter. As mentioned above, you can also set the number of training jobs to run with the noOfTrainingJob parameter; if it is not set, 5 jobs are run. projectId and language are mandatory parameters.

curl --location --request POST 'https://platform.neuralspace.ai/api/nlu/v1/model/train/queue' \
--header 'Accept: application/json, text/plain, */*' \
--header 'Content-Type: application/json;charset=UTF-8' \
--header "Authorization: ${AUTHORIZATION_TOKEN}" \
--data-raw "{
    \"projectId\": \"${PROJECT_ID}\",
    \"language\":\"${LANGUAGE}\",
    \"modelName\": \"My First Model\",
    \"noOfTrainingJob\": 3
}"

Store this returned model ID in a variable.

MODEL_ID="YOUR-MODEL-ID"
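For readers calling the API from a script rather than the shell, the request above can be sketched in Python. This is a minimal sketch using only the standard library; the function name `build_train_request` and the example language value `"en"` are assumptions made for illustration, while the URL, headers, and body fields are copied from the curl example. The request is built but not sent.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
TRAIN_URL = "https://platform.neuralspace.ai/api/nlu/v1/model/train/queue"


def build_train_request(token, project_id, language, model_name, no_of_training_job=5):
    """Build (but do not send) the POST request that queues a training job.

    The default of 5 mirrors the documented default for noOfTrainingJob.
    """
    payload = {
        "projectId": project_id,
        "language": language,
        "modelName": model_name,
        "noOfTrainingJob": no_of_training_job,
    }
    return urllib.request.Request(
        TRAIN_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/json, text/plain, */*",
            "Content-Type": "application/json;charset=UTF-8",
            "Authorization": token,
        },
        method="POST",
    )


# Placeholder values; "en" is an assumed language value for illustration.
request = build_train_request("YOUR-TOKEN", "YOUR-PROJECT-ID", "en", "My First Model", 3)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually queue the job, pass the request to `urllib.request.urlopen(request)` and read the model ID from the JSON body that comes back.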

note

Every time you launch a train job for a given project and language, a new model ID is generated. The total number of training jobs you can queue at a time is equal to the number of trained models left in your subscription.

Get Model Status

After calling the train job, a new model ID is generated. You can use this ID to track your training progress as well as fetch model related attributes.

Request
curl --location --request GET "https://platform.neuralspace.ai/api/nlu/v1/model?modelId=${MODEL_ID}" \
--header 'Accept: application/json, text/plain, */*' \
--header 'Content-Type: application/json;charset=UTF-8' \
--header "Authorization: ${AUTHORIZATION_TOKEN}"

Status	Description
Initiated	A training job has been created but not yet queued. Only jobs which are valid get queued in the training pipeline.
Queued	Queued jobs are ready for training. If you are a free plan user then your jobs have the lowest priority. Jobs by basic plan users get a higher priority than free plan users. Advanced plan users always get the highest priority.
Pipeline Building	Our AutoNLP pipeline is getting built based on the data in your project.
Pipeline Built	AutoNLP pipeline built and is ready to execute.
Preparing Data	Our secret sauce gets poured on your data here.
Data Prepared	Your data is ready for training.
Training	AutoNLP has started training.
Trained	AutoNLP trained successfully.
Saved	Model artifacts saved in our secure cloud storage.
Completed	Model is ready to be deployed.
Failed	Training failed. The reason for the failure can be found in the `message` attribute of the model object.
Timed Out	We have a hard time-out of 6 hours set for all NLU models. If a model takes longer than that, we cancel the job. This occurs rarely and only when our platform is overloaded.
Dead	Training jobs which have not updated their status for more than 10 hours are declared dead. This means they are not responding. This is also a rare event and happens only when our platform is overloaded.

A model is ready for deployment only once its trainingStatus becomes Completed.
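Tracking a training job usually means polling the status until it stops changing. The following is a minimal sketch of such a loop, assuming that Completed, Failed, Timed Out and Dead are the final statuses in the table above. `wait_for_training`, `fetch_status`, the poll interval, and the poll limit are all hypothetical names and values chosen for illustration; `fetch_status` stands in for whatever code calls the status endpoint and returns the trainingStatus string.

```python
import time

# Assumed to be the final statuses, based on the status table above.
TERMINAL_STATUSES = {"Completed", "Failed", "Timed Out", "Dead"}


def wait_for_training(fetch_status, poll_seconds=30.0, max_polls=720, sleep=time.sleep):
    """Call fetch_status() repeatedly until it returns a terminal status.

    fetch_status is a caller-supplied function (hypothetical) that returns
    the model's current trainingStatus as a string.
    """
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Do not sleep after the last allowed poll.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"model still not finished after {max_polls} polls")


# Demo with a canned sequence of statuses instead of real API calls.
canned = iter(["Queued", "Training", "Saved", "Completed"])
final_status = wait_for_training(lambda: next(canned), poll_seconds=0)
print(final_status)
```

Passing the status fetcher in as a function keeps the loop independent of any HTTP library and makes it easy to try out without touching the platform.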

note

Only models with status Completed, Failed, Timed Out, or Dead can be deleted.

List Models

You can also list all the models within a single project.
Read about how to list all your projects in this article.

Request
curl --location --request POST 'https://platform.neuralspace.ai/api/nlu/v1/list/model' \
--header 'Accept: application/json, text/plain, */*' \
--header 'Content-Type: application/json;charset=UTF-8' \
--header "Authorization: ${AUTHORIZATION_TOKEN}" \
--data-raw "{
    \"filter\": {
        \"projectId\": \"${PROJECT_ID}\",
        \"language\": \"${LANGUAGE}\"
    },
    \"pageNumber\": 1,
    \"pageSize\": 10
}"

This is a paginated API: pageSize determines how many models to retrieve per page, and pageNumber determines which page to fetch. This API returns a list of all the models in the language you have specified for the given project ID. Model attributes like trainingStatus, trainingTime, etc. are described in the next section.
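To collect every model rather than a single page, a client can keep requesting pages until one comes back with fewer than pageSize entries. The sketch below shows that loop under two assumptions: page numbers start at 1, as in the request above, and a short page marks the end of the results. `iter_all_models` and `fetch_page` are hypothetical names; `fetch_page` stands in for code that calls the list endpoint and returns the list of model objects for one page.

```python
def iter_all_models(fetch_page, page_size=10):
    """Yield every model by requesting pages 1, 2, 3, ... until a short page arrives.

    fetch_page(page_number, page_size) is a caller-supplied function
    (hypothetical) that returns a list of model objects for that page.
    """
    page_number = 1
    while True:
        models = fetch_page(page_number, page_size)
        yield from models
        # A page shorter than page_size is assumed to be the last one.
        if len(models) < page_size:
            break
        page_number += 1


# Demo: 23 fake models served in pages, instead of real API calls.
fake_models = [{"modelId": f"model-{i}"} for i in range(23)]


def fake_fetch_page(page_number, page_size):
    start = (page_number - 1) * page_size
    return fake_models[start:start + page_size]


all_models = list(iter_all_models(fake_fetch_page, page_size=10))
print(len(all_models))
```

When the total is an exact multiple of the page size, this loop makes one extra request that returns an empty page before stopping.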

Get Single Model

Use this API to fetch a single model and its attributes.

MODEL_ID="YOUR-MODEL-ID"

curl --location --request GET "https://platform.neuralspace.ai/api/nlu/v1/model?modelId=${MODEL_ID}" \
--header "Authorization: ${AUTHORIZATION_TOKEN}"

This will return a JSON object in the following format:

Model Details
{
    "success": true,
    "message": "Model status fetched",
    "data": {
        "name": "...", 
        "appType": "nlu",
        "projectId": "...",
        "apikey": "...",
        "createdBy": {
            "email": "...",
            "role": "provider",
            "apikey": "...",
            "referenceKey": "..."
        },
        "active": true,
        "status": "active",
        "createdAt": 1620059929722,
        "updatedAt": 1620059929722,
        "modelId": "...",
        "replicas": 0,
        "trainingStatus": "Completed",
        "lastStatusUpdateAt": "2021-05-04T09:28:45.672Z",
        "trainingProgress": [
            "Initiated",
            "Pipeline Building",
            "Queued",
            "Pipeline Built",
            "Preparing Data",
            "Data Prepared",
            "Training",
            "Trained",
            "Saved",
            "Completed"
        ],
        "examplesPerIntent": {
            "SOME-INTENT": NUMBER-OF-EXAMPLES-FOR-THIS-INTENT,
            ...
        },
        "metrics": {
            "intentClassifierPerformance": {
                "i_acc": 0.9894935488700867,
                "i_f1": 0.9894935488700867
            },
            "nerPerformance": {
                "e_f1_strict": 0.971139669418335,
                "e_f1_partial": 0.971139669418335
            }
        },
        "language": ".."
    },
    "timestamp": 1620120623094
}

Description of Fields

Fields	Description
`name`	Name of the model.
`appType`	This will always be `nlu`.
`projectId`	The ID of the project this model belongs to.
`apikey`	Your API key.
`createdAt`	Timestamp of when this model was created.
`updatedAt`	Timestamp of when this model was updated.
`modelId`	A unique ID for your model.
`replicas`	This indicates how many replicas of this model are deployed on our platform. Multiple replicas ensure higher throughput and higher availability.
`trainingStatus`	The current training status.
`lastStatusUpdateAt`	During training, whenever the status changes this field is updated with a timestamp.
`trainingProgress`	A list of all training statuses this model has gone through.
`examplesPerIntent`	This is the distribution of your training dataset. Keys in this dictionary are intents and values are the number of examples you have in the training set for that intent.
`metrics`	When you have test examples in a project, the model is evaluated on them. Here you will find some metrics that we calculate to gauge the performance of the model. These numbers are all zeros if you don't upload any test examples.
`metrics.intentClassifierPerformance`	These are the metrics for the intent classifier.
`metrics.intentClassifierPerformance.i_acc`	The fraction of test examples for which AutoNLP predicted the right intent.
`metrics.intentClassifierPerformance.i_f1`	[For advanced users only] The macro-averaged F1 score.
`metrics.nerPerformance`	These are the metrics for entity recognition.
`metrics.nerPerformance.e_f1_strict`	F1 score that requires an exact boundary match over the surface string and a matching entity type.
`metrics.nerPerformance.e_f1_partial`	F1 score that considers partial boundary matches over the surface string, regardless of the entity type.
`language`	The language this model was trained for.
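These fields are enough to compare several models trained on the same data and keep the best one. The sketch below picks the Completed model with the highest `i_f1`. The function name `best_completed_model` and the three model objects are made up for illustration; only the field names come from the format above, and ranking by `i_f1` is one possible choice rather than a platform rule.

```python
def best_completed_model(models):
    """Return the Completed model with the highest intent F1, or None if there is none."""
    completed = [m for m in models if m.get("trainingStatus") == "Completed"]
    if not completed:
        return None
    return max(
        completed,
        key=lambda m: m["metrics"]["intentClassifierPerformance"]["i_f1"],
    )


# Made-up model objects that follow the documented field names.
models = [
    {
        "modelId": "model-a",
        "trainingStatus": "Completed",
        "metrics": {"intentClassifierPerformance": {"i_acc": 0.95, "i_f1": 0.94}},
    },
    {
        "modelId": "model-b",
        "trainingStatus": "Completed",
        "metrics": {"intentClassifierPerformance": {"i_acc": 0.97, "i_f1": 0.96}},
    },
    {
        "modelId": "model-c",
        "trainingStatus": "Failed",
        "metrics": {"intentClassifierPerformance": {"i_acc": 0.0, "i_f1": 0.0}},
    },
]

best = best_completed_model(models)
print(best["modelId"])
```

Because the metrics are all zeros when a project has no test examples, this comparison is only meaningful once test examples have been uploaded.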

When evaluating machine learning models for entity recognition, it is common to report metrics (precision, recall and F1 score) at the individual token level. This may not be the best approach, because a named entity can be made up of multiple tokens. At the same time, regular NER evaluation schemes tend to ignore partial matches, for example cases where the entity recognition system gets the named-entity surface string right but the type wrong. This is why both a strict and a partial F1 score are reported.
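The difference between strict and partial matching is easier to see on a concrete entity. The sketch below is an illustration only, not the platform's scoring code: it represents an entity as character offsets plus a type and checks the two matching rules described in the field table. The `Entity` class, both function names, the example sentence, and the entity types are assumptions made for this example.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    start: int  # character offset where the entity begins
    end: int    # character offset just past the entity's last character
    type: str   # entity type label


def strict_match(gold: Entity, predicted: Entity) -> bool:
    """Exact boundaries over the surface string and the same entity type."""
    return gold == predicted


def partial_match(gold: Entity, predicted: Entity) -> bool:
    """Overlapping boundaries over the surface string, regardless of type."""
    return gold.start < predicted.end and predicted.start < gold.end


text = "Book a flight to New York City"
gold = Entity(17, 30, "city")            # "New York City"
too_short = Entity(17, 25, "city")       # "New York"
wrong_type = Entity(17, 30, "country")   # right string, wrong type
elsewhere = Entity(0, 4, "city")         # "Book"

for name, predicted in [("too_short", too_short), ("wrong_type", wrong_type), ("elsewhere", elsewhere)]:
    print(name, strict_match(gold, predicted), partial_match(gold, predicted))
```

Both the truncated span and the mistyped span fail the strict rule but pass the partial one, which is the kind of near miss a token-level or strict-only metric hides.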

Update Model Name

This API lets you rename a model.

Request
MODEL_ID="YOUR-MODEL-ID"

curl --location --request PUT 'https://platform.neuralspace.ai/api/nlu/v1/model' \
--header 'Accept: application/json, text/plain, */*' \
--header 'Content-Type: application/json;charset=UTF-8' \
--header "Authorization: ${AUTHORIZATION_TOKEN}" \
--data-raw "{
    \"modelId\": \"${MODEL_ID}\",
    \"modelName\": \"New Name\"
}"

Delete Model

Delete your models using this API.

curl --location --request DELETE 'https://platform.neuralspace.ai/api/nlu/v1/model' \
--header 'Accept: application/json, text/plain, */*' \
--header 'Content-Type: application/json;charset=UTF-8' \
--header "Authorization: ${AUTHORIZATION_TOKEN}" \
--data-raw "{
    \"modelId\": \"${MODEL_ID}\"
}"
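Since only models with status Completed, Failed, Timed Out or Dead can be deleted, a script can check the status before issuing the request. The following is a minimal sketch of that guard using the standard library. `build_delete_request` is a hypothetical helper name; the URL, headers, and body are copied from the curl example, and the request is built but not sent.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
MODEL_URL = "https://platform.neuralspace.ai/api/nlu/v1/model"

# Statuses in which a model can be deleted, per the note under Get Model Status.
DELETABLE_STATUSES = {"Completed", "Failed", "Timed Out", "Dead"}


def build_delete_request(token, model_id, training_status):
    """Build (but do not send) a DELETE request, refusing models still in training."""
    if training_status not in DELETABLE_STATUSES:
        raise ValueError(
            f"model {model_id} has status {training_status!r} and cannot be deleted yet"
        )
    return urllib.request.Request(
        MODEL_URL,
        data=json.dumps({"modelId": model_id}).encode("utf-8"),
        headers={
            "Accept": "application/json, text/plain, */*",
            "Content-Type": "application/json;charset=UTF-8",
            "Authorization": token,
        },
        method="DELETE",
    )


delete_request = build_delete_request("YOUR-TOKEN", "YOUR-MODEL-ID", "Completed")
print(delete_request.get_method(), delete_request.data.decode("utf-8"))
```

Checking locally avoids a round trip for a request that is expected to be rejected, although the platform remains the authority on whether a model can be deleted.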
