Overview
Service Code: sentence-splitter
Sentence Splitter (coming soon)
- Sentence splitting (also called sentence tokenization, segmentation, or boundary disambiguation) is the process of detecting sentence boundaries, i.e., where a sentence begins and ends. It is considered a difficult task in NLP because punctuation marks are ambiguous. For example, a period does not always mark the end of a sentence: it may be a decimal point, or part of an abbreviation or an email address. Moreover, in many languages (especially Chinese, Japanese and Urdu) sentence endings are themselves ambiguous, i.e., a sentence sometimes has no definite boundary. Sentence splitting plays a vital role in text classification, chatbots, language translation, sentiment analysis and many other applications.
To address this problem, we have built the NeuralSpace Sentence Splitter, which can be used to tokenize words and sentences, similar to the popular Python library NLTK but for many more languages.
- 👉 APIs (coming soon)
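The period ambiguity described above can be illustrated with a minimal Python sketch. This is not the NeuralSpace Sentence Splitter or its API; `naive_split`, `split_with_abbreviations`, and the tiny `ABBREVIATIONS` set are hypothetical names introduced only for this illustration.

```python
import re

# Hypothetical, deliberately tiny abbreviation list (an assumption for this sketch).
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "e.g.", "i.e."}


def naive_split(text):
    """Split wherever '.', '!' or '?' is followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_with_abbreviations(text):
    """End a sentence only on a token that ends in '.', '!' or '?'
    and is not a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token[-1] in ".!?" and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep any trailing text that has no end punctuation
        sentences.append(" ".join(current))
    return sentences


text = "Dr. Smith paid 3.50 dollars. He emailed a.b@example.com today."

# The naive splitter breaks after "Dr." and returns three pieces.
print(naive_split(text))

# The abbreviation-aware splitter returns the two intended sentences.
print(split_with_abbreviations(text))
```

Even the second function relies on whitespace between words and on a hand-written abbreviation list, so it would not carry over to languages such as Chinese or Japanese, which is the kind of gap a trained multilingual model is meant to close.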
Features
State-of-the-art Models: Use our pre-trained state-of-the-art models through APIs and integrate them in any application.
Multi-language Support: Go global with 80+ supported languages.