Deploy NLU Models

Once your model has finished training, that is, its trainingStatus is Completed (see model status), you can deploy it on our platform. You don't have to manage MLOps or DevOps yourself: to scale out a model, specify how many replicas you want and the platform manages the deployment for you.


  • Getting Started: Follow Getting Started to log in and install the NLU app. If you are using the APIs, save your authorization token in a variable called AUTHORIZATION_TOKEN before moving ahead.
  • Create a Project:
    • Create a project and store the project ID in a variable called PROJECT_ID.
    • Store the language for which you added training examples in a variable called LANGUAGE.
  • Add training data: Make sure you have at least two intents with 10 examples each.
  • Train a model: Train a model and store the model ID in a variable called MODEL_ID.
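
Before continuing, it can help to confirm that the variables named in the prerequisites above are set. The following is a minimal POSIX shell sketch; the function name missing_vars is hypothetical and not part of the NeuralSpace CLI.

```shell
# Print the name of each prerequisite variable that is unset or empty.
# missing_vars is a hypothetical helper, not a NeuralSpace command.
missing_vars() {
  for name in AUTHORIZATION_TOKEN PROJECT_ID LANGUAGE MODEL_ID; do
    # Read the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

Calling missing_vars prints nothing when all four variables are set, so an empty result means the prerequisites are in place.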

Deploy a Model

Use this command to deploy a model using the CLI. You can specify how many replicas of the model you want, and the platform manages the scaling.

neuralspace nlu deploy -m $MODEL_ID -n 3

Number of Replicas

Make sure nReplicas is within the allowed range; otherwise your deployment request might fail. The limit depends on your current subscription. Note that nReplicas is the total number of replicas you want, not an increment: if you request 3 replicas when 2 are already deployed, the platform adds 1 replica for a total of 3, NOT 5.
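
The arithmetic behind this can be sketched in shell. This is only an illustration of the "total, not increment" rule described above; replica_delta is a hypothetical helper, and treating a negative result as a scale-down is an assumption based on nReplicas 0 unloading a model.

```shell
# Number of replicas that would be added (positive) or removed (negative)
# to reach the requested total. Hypothetical helper for illustration only.
# Arguments: current replica count, requested nReplicas.
replica_delta() {
  current=$1
  requested=$2
  echo $(( requested - current ))
}

replica_delta 2 3   # prints 1: the total becomes 3, not 5
```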

Unload a Model

To unload your model completely, set nReplicas to 0 in the request payload.

curl --location --request POST '' \
--header 'Accept: application/json, text/plain, */*' \
--header 'Content-Type: application/json;charset=UTF-8' \
--header "Authorization: ${AUTHORIZATION_TOKEN}" \
--data-raw "{
\"modelId\": \"${MODEL_ID}\",
\"nReplicas\": 0
}"

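The request body above has only two fields, modelId and nReplicas, so it can be generated rather than hand-escaped. The sketch below assumes the model ID contains no characters that need JSON escaping; deploy_payload is a hypothetical helper and the model ID shown is a placeholder.

```shell
# Build the JSON request body for a given model and replica count.
# Hypothetical helper; assumes the model ID needs no JSON escaping.
# Arguments: model ID, desired nReplicas (0 unloads the model).
deploy_payload() {
  printf '{"modelId": "%s", "nReplicas": %d}\n' "$1" "$2"
}

deploy_payload "example-model-id" 0
```

The output can then be passed to curl with --data-raw "$(deploy_payload "$MODEL_ID" 0)".
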
Deployment Limits

The number of models you can deploy depends on the limits of your subscription.

Deployment Failures

Deployment can fail when the model has not finished training, when the infrastructure is overloaded, when the request parameters are invalid, or when the license is invalid or expired.
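
Two of these causes, an untrained model and invalid request parameters, can be caught locally before sending a request. The following is a rough sketch under stated assumptions: can_deploy is a hypothetical helper, and the maximum replica count is a value you supply from your own subscription limits, not something the platform reports here.

```shell
# Run basic local checks before a deploy request and print the result.
# Hypothetical helper. Arguments: trainingStatus, nReplicas,
# maximum replicas allowed (assumed known from your subscription).
can_deploy() {
  status=$1
  n=$2
  max=$3
  # The model must have finished training.
  [ "$status" = "Completed" ] || { echo "model not trained"; return 1; }
  # nReplicas must consist of digits only.
  case $n in
    ''|*[!0-9]*) echo "nReplicas must be a non-negative integer"; return 1 ;;
  esac
  # nReplicas must not exceed the supplied limit.
  [ "$n" -le "$max" ] || { echo "nReplicas exceeds limit"; return 1; }
  echo "ok"
}
```

Infrastructure overload and license problems can only be detected by the platform, so these checks do not replace handling an error response.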