SARF (Morphological Analyzer)

Established: January 29, 2014

Definition

Features

Morphological analysis

Sarf provides all possible morphological analyses for an input Arabic word. Each analysis consists of the diacritized word and its morphological breakdown into prefixes, stem, and suffixes. The stem is further decomposed into its root and morphological pattern. Each analysis also carries the part of speech and a set of morphosyntactic features such as gender and number. The analyses are ranked by how common each one is in actual language usage.
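The shape of an analysis described above can be sketched as a small record type. This is only an illustration: the class name `Analysis`, its field names, the `score` field, and the `rank` helper are assumptions and not Sarf's actual data model, and the scores are placeholder values rather than real usage statistics.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Analysis:
    """Hypothetical record for one morphological analysis (not Sarf's real type)."""
    diacritized: str                # fully diacritized word
    prefixes: Tuple[str, ...]       # prefixes attached before the stem
    stem: str
    suffixes: Tuple[str, ...]       # suffixes attached after the stem
    root: str                       # root of the stem
    pattern: str                    # morphological pattern of the stem
    pos: str                        # part of speech
    features: Dict[str, str] = field(default_factory=dict)  # e.g. gender, number
    score: float = 0.0              # assumed usage-based ranking score


def rank(analyses):
    """Order analyses from highest to lowest score."""
    return sorted(analyses, key=lambda a: a.score, reverse=True)


# Two readings of the undiacritized word "كتب"; scores are placeholders.
candidates = [
    Analysis("كُتُب", (), "كُتُب", (), "كتب", "فُعُل", "noun",
             {"number": "plural"}, score=0.4),
    Analysis("كَتَبَ", (), "كَتَبَ", (), "كتب", "فَعَلَ", "verb",
             {"gender": "masculine", "number": "singular"}, score=0.6),
]

for analysis in rank(candidates):
    print(analysis.pos, analysis.diacritized, analysis.features)
```

Keeping the score on the record itself means ranking is a plain sort; how Sarf actually computes its ranking is not described here.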

Word synthesis

Sarf constructs the final form of a word from its morphological analysis. The analysis must contain the root, pattern, stem, part of speech, prefixes, and suffixes.

Generation of derivatives

Given a specific analysis, Sarf can derive all valid stems having the same root. The derivation may be limited by a specified target part of speech.

Generation of inflections

Given a specific analysis, Sarf can derive all valid inflected forms having the same stem. The inflected forms are all the valid prefix-stem-suffix combinations.
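One way to picture the enumeration of prefix-stem-suffix combinations is a filtered Cartesian product. The sketch below assumes a hypothetical `compatible` set of allowed (prefix, suffix) pairs and joins the parts by plain concatenation; the function name and the compatibility table are illustrative only, and a real analyzer would also handle orthographic adjustments at the boundaries.

```python
from itertools import product

STEM = "كتاب"      # "book"
ARTICLE = "ال"     # definite article prefix
MY = "ي"           # first-person possessive suffix
HIS = "ه"          # third-person masculine possessive suffix


def inflections(stem, prefixes, suffixes, compatible):
    """Return prefix + stem + suffix for every affix pair listed in `compatible`."""
    forms = []
    for prefix, suffix in product(prefixes, suffixes):
        if (prefix, suffix) in compatible:
            forms.append(prefix + stem + suffix)
    return forms


# Hypothetical compatibility table: the definite article does not
# combine with a possessive suffix, so those pairs are left out.
compatible = {("", ""), ("", MY), ("", HIS), (ARTICLE, "")}

forms = inflections(STEM, ["", ARTICLE], ["", MY, HIS], compatible)
for form in forms:
    print(form)
```

Of the six possible pairs, only the four listed in the table produce a form, which is the sense in which only valid combinations are generated.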

Correctness of Arabic words

Sarf can determine whether an input word is a valid Arabic word. If it is valid, Sarf provides its possible morphological analyses; otherwise, the word is reported as not valid in Arabic.

Awareness of input diacritics

Diacritics in the input text are taken into account during analysis. They are used as a filter on the generated analyses, but if they are determined to be wrong, they are ignored.
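The filtering behavior can be illustrated with a minimal sketch. It assumes that a candidate matches when every diacritic present in the input also appears on the same letter of the candidate, and that input diacritics count as "wrong" when no candidate matches at all. Both rules, and all function names, are assumptions made for illustration; how Sarf actually decides that diacritics are wrong is not specified.

```python
# Arabic short-vowel and related marks, U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}

FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
KAF, TA, BA = "\u0643", "\u062A", "\u0628"


def split_marks(word):
    """Split a word into (letter, marks) pairs; marks are the diacritics after that letter."""
    units = []
    for ch in word:
        if ch not in DIACRITICS:
            units.append((ch, set()))
        elif units:                 # a mark with no preceding letter is dropped
            units[-1][1].add(ch)
    return units


def consistent(input_word, candidate):
    """True if the letters agree and every input mark also appears in the candidate."""
    a, b = split_marks(input_word), split_marks(candidate)
    if len(a) != len(b):
        return False
    return all(la == lb and ma <= mb for (la, ma), (lb, mb) in zip(a, b))


def filter_by_diacritics(input_word, candidates):
    """Keep matching candidates; if none match, ignore the input diacritics (assumed rule)."""
    kept = [c for c in candidates if consistent(input_word, c)]
    return kept if kept else list(candidates)


kataba = KAF + FATHA + TA + FATHA + BA + FATHA   # كَتَبَ
kutub = KAF + DAMMA + TA + DAMMA + BA            # كُتُب
kutiba = KAF + DAMMA + TA + KASRA + BA + FATHA   # كُتِبَ
candidates = [kataba, kutub, kutiba]

# Damma on the first letter only: the fatha-initial reading is filtered out.
print(filter_by_diacritics(KAF + DAMMA + TA + BA, candidates))
# Kasra on the first letter matches nothing, so the marks are ignored.
print(filter_by_diacritics(KAF + KASRA + TA + BA, candidates))
```

The subset test is what lets a partially diacritized input narrow the results without requiring full diacritization.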

APIs

  • Analyze Word: analyzes a word and returns all of its possible analyses. If the word is fully or partially diacritized, analyses that do not match the diacritics are filtered out.
  • Synthesize Word: synthesizes the final form of a word from its constituents (prefix/stem/suffix) as given in the analysis structure.
  • Get Derivatives: given a word analysis and a specific part of speech, retrieves the derivatives that match this part of speech.
  • Get Inflections: retrieves all possible inflections of a given word.
  • Get Plural: returns the plural form of a word using its analysis.
  • Get Singular: returns the singular form of a word using its analysis.

Examples

 


People

Eslam Kamal

Principal Research SDE Manager