Named Entity Recognizer (NER)

Definition

• Detects and classifies named entities

• Categories: persons, locations and organizations

Features

Arabic Named Entity Detection and Classification

The Arabic Named Entity Recognizer (NER) extracts named entities from Modern Standard Arabic text and classifies them into three main types: human names, locations, and organizations. It recognizes both Arabic and foreign names, as well as entities such as cities, countries, streets, squares, political parties, companies, and ministries.

Arabic Text Preprocessing

During preprocessing, Arabic NER corrects spelling mistakes using either the Auto Corrector or the Speller. The user selects the component according to the error rate of the input text: the Auto Corrector suits text with a low error rate, and the Speller suits text with a high error rate.
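The choice between the two components can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the lexicon-based estimate of the error rate, and the 10% threshold are hypothetical and are not part of the product, where the selection is left to the user.

```python
from typing import Iterable, Set


def estimate_error_rate(tokens: Iterable[str], lexicon: Set[str]) -> float:
    """Rough estimate: the share of tokens that are not found in a known-word lexicon."""
    token_list = list(tokens)
    if not token_list:
        return 0.0
    unknown = sum(1 for token in token_list if token not in lexicon)
    return unknown / len(token_list)


def choose_corrector(error_rate: float, threshold: float = 0.1) -> str:
    """Return a (hypothetical) component name for the given error rate.

    Low error rate  -> "auto_corrector"
    High error rate -> "speller"
    The threshold is an arbitrary illustrative value.
    """
    return "speller" if error_rate > threshold else "auto_corrector"
```

A caller could estimate the error rate on a sample of the input and pass the result to choose_corrector; in practice the threshold would need tuning on real data.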

Hybrid Approaches

Arabic NER combines deterministic and probabilistic approaches to detect and classify named entities. First, a set of deterministic rules and gazetteers detects named entities with high confidence. Next, a conditional random field (CRF) model trained on Sarf features detects further named entities. Finally, a substring-matching component processes the entire article again to extract missing human names that are part of previously detected human names.
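The three stages can be sketched as a simple pipeline. This is a minimal illustration under stated assumptions, not the product's implementation: the Entity type, the function names, and the labels "PER", "LOC", and "ORG" are hypothetical, the statistical stage is represented by any callable that returns candidate entities (standing in for the CRF model), and Latin placeholder tokens would normally be replaced by Arabic tokens.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Entity:
    start: int  # index of the first token
    end: int    # index one past the last token
    label: str  # "PER", "LOC", or "ORG" (hypothetical label set)


def gazetteer_pass(tokens: Sequence[str],
                   gazetteer: Dict[Tuple[str, ...], str]) -> List[Entity]:
    """Stage 1: longest-match lookup of token n-grams in a gazetteer."""
    entities: List[Entity] = []
    max_len = max((len(key) for key in gazetteer), default=0)
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = gazetteer.get(tuple(tokens[i:i + n]))
            if label:
                entities.append(Entity(i, i + n, label))
                i += n
                break
        else:
            i += 1  # no gazetteer entry starts at this token
    return entities


def overlaps(a: Entity, b: Entity) -> bool:
    return a.start < b.end and b.start < a.end


def substring_pass(tokens: Sequence[str], entities: List[Entity]) -> List[Entity]:
    """Stage 3: tag uncovered tokens that are part of an already detected human name."""
    name_parts = {tokens[k] for e in entities if e.label == "PER"
                  for k in range(e.start, e.end)}
    covered = {k for e in entities for k in range(e.start, e.end)}
    return [Entity(i, i + 1, "PER") for i, token in enumerate(tokens)
            if i not in covered and token in name_parts]


def recognize(tokens: Sequence[str],
              gazetteer: Dict[Tuple[str, ...], str],
              statistical_tagger: Callable[[Sequence[str]], List[Entity]]) -> List[Entity]:
    found = gazetteer_pass(tokens, gazetteer)
    # Stage 2: keep statistical candidates that do not conflict with stage 1.
    for candidate in statistical_tagger(tokens):
        if not any(overlaps(candidate, e) for e in found):
            found.append(candidate)
    found.extend(substring_pass(tokens, found))
    return sorted(found, key=lambda e: e.start)
```

In this sketch the deterministic stage takes priority: a statistical candidate is discarded when it overlaps an entity already found by the gazetteer. That ordering is a design assumption that follows the "high confidence" wording above; a real system might resolve conflicts differently.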

APIs

Get Named Entities: detects and classifies named entities

Example