What are Entities?
Entities play a major role in language understanding. To act on a user query, you need to understand not only the intent behind it but also the entities it contains.
E.g., if someone says "flights from Berlin to London", the intent here is flight-search and the entities are Berlin and London, which are of type city.
In a given piece of text, entities can be anything from names, addresses, and account numbers to very domain-specific terms such as names of chemicals or medicines. Essentially, an entity is any valuable piece of information that can be extracted from text.
Entities can also be seen at a more granular level. In the above example, Berlin can be from-city and London can be to-city.
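To make the example concrete, here is a minimal sketch of how a parsed query could be represented in plain Python. The Entity class, the mark helper, and the field names are assumptions made for illustration only; they are not the NeuralSpace response format.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    value: str   # the text span that was extracted
    type: str    # the entity type, e.g. from-city
    start: int   # character offset where the span begins
    end: int     # character offset just past the span


def mark(text: str, value: str, entity_type: str) -> Entity:
    """Locate the first occurrence of value in text and label it (hypothetical helper)."""
    start = text.index(value)
    return Entity(value, entity_type, start, start + len(value))


text = "flights from Berlin to London"
parsed = {
    "text": text,
    "intent": "flight-search",
    "entities": [
        mark(text, "Berlin", "from-city"),
        mark(text, "London", "to-city"),
    ],
}

for entity in parsed["entities"]:
    print(entity.type, "->", entity.value)
```

Keeping the character offsets alongside the value makes it possible to tell apart two entities of the same base type (both are cities) by their position and role in the sentence.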
A very domain-specific use-case could be, e.g., "I need 8 paracetamol tablets", where 8 is number, paracetamol is medicine-component, and tablets is medicine-form.
Supported Entities on NeuralSpace
To extract a wide variety of entities, we provide five different types of entities.
Pre-trained Entities
We have a catalog of over 30 entity extractors, such as person, date, number, geo-location, etc., that you can use off-the-shelf.
Regular Expressions
To extract entities that follow a certain pattern, you can use our built-in regular expression entity extractor.
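As an illustration of pattern-based extraction in general, the sketch below uses Python's standard re module. The order-id entity type and its ORD-followed-by-six-digits pattern are made up for this example, and the function is not the NeuralSpace extractor itself.

```python
import re

# Hypothetical pattern: "ORD-" followed by exactly six digits.
ORDER_ID = re.compile(r"\bORD-\d{6}\b")


def extract_regex_entities(text: str, pattern: re.Pattern, entity_type: str) -> list:
    """Return one entity dict per non-overlapping match of pattern in text."""
    return [
        {
            "type": entity_type,
            "value": match.group(0),
            "start": match.start(),
            "end": match.end(),
        }
        for match in pattern.finditer(text)
    ]


query = "Where is ORD-482913? I also ordered ORD-100245."
found = extract_regex_entities(query, ORDER_ID, "order-id")
for entity in found:
    print(entity)
```

Because the pattern either matches or it does not, the same input always yields the same entities.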
Lookup Entities
If you have a finite list of things you want to extract from a given piece of text, e.g., a list of languages, location landmarks, company departments, etc., you can use lookup entities. These are essentially lists of words or phrases that you want to extract from a given text.
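The idea behind a lookup list can be sketched in a few lines of Python. This is only an assumed, simplified matching strategy (case-insensitive, whole words, longest phrase first); the language list and the build_lookup helper are hypothetical and do not describe how NeuralSpace implements lookups.

```python
import re


def build_lookup(phrases: list) -> re.Pattern:
    """Compile a case-insensitive, whole-word pattern from a list of phrases."""
    # Longest phrases first, so "Brazilian Portuguese" wins over "Portuguese".
    ordered = sorted(phrases, key=len, reverse=True)
    alternation = "|".join(re.escape(phrase) for phrase in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


# Hypothetical lookup list for a "language" entity.
LANGUAGES = ["Hindi", "English", "Portuguese", "Brazilian Portuguese"]
language_pattern = build_lookup(LANGUAGES)

query = "Do you support hindi and Brazilian Portuguese?"
matches = [m.group(0) for m in language_pattern.finditer(query)]
print(matches)
```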
Synonyms
The same entity can sometimes appear in different forms, e.g., New Delhi and Delhi, or New York City and NYC.
When you perform an action based on the entity value, you need to make these forms homogeneous.
That means you would want your NLU model to return a single value (Delhi) for both New Delhi and Delhi.
This can be done using synonyms.
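Conceptually, synonym handling maps every surface form to one canonical value. The sketch below shows that mapping in plain Python; the SYNONYMS table, the choice of New York City as the canonical value for NYC, and the normalize function are assumptions for illustration, not the NeuralSpace configuration format.

```python
# Canonical value -> surface forms that should map to it (illustrative data).
SYNONYMS = {
    "Delhi": ["New Delhi", "Delhi"],
    "New York City": ["NYC", "New York City"],
}

# Invert the table once so each lookup is a single dictionary access.
CANONICAL = {
    variant.lower(): canonical
    for canonical, variants in SYNONYMS.items()
    for variant in variants
}


def normalize(value: str) -> str:
    """Return the canonical value for a known variant, or the value unchanged."""
    return CANONICAL.get(value.strip().lower(), value)


print(normalize("New Delhi"))
print(normalize("NYC"))
print(normalize("Berlin"))
```

Downstream code can then branch on a single value per entity instead of enumerating every spelling.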
Trainable Entities
If you want AutoNLP to learn to extract your custom entities, you can do that too.
Tip
Make sure to try all the methods above first, as they are more deterministic. AutoNLP trains an AI model to predict entities, and its predictions can be nondeterministic.