IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 15, No. 2, April 2026, pp. 1876-1890
ISSN: 2252-8938, DOI: 10.11591/ijai.v15.i2.pp1876-1890

A comparative study of Arabic morphological analyzers

Omar Saadiyeh 1, Alaaeddine Ramadan 2, Chamseddine Zaki 3, Mohamad Hajjar 4, Gilles Bernard 1
1 Paragraphe Research Lab, University of Paris VIII, Paris, France
2 College of Engineering and Computing, American University of Bahrain, Riffa, Bahrain
3 College of Engineering and Technology, American University of the Middle East, Egaila, Kuwait
4 Faculty of Technology, Lebanese University, Saida, Lebanon

Article Info
Article history:
Received Jun 8, 2025
Revised Jan 9, 2026
Accepted Jan 25, 2026

Keywords:
Arabic dialects processing
Arabic linguistics
Arabic natural language processing
Language learning
Morphological analyzer

ABSTRACT
The field of Arabic natural language processing (NLP) has witnessed significant advancements, driven by the development of various morphological analyzers. This paper compares several major Arabic morphological analyzers and examines their ability to handle word ambiguities, process dialects, operate efficiently, and support downstream NLP tasks. By reviewing previous studies, we identify key gaps, including the limited resources for dialects, the shortage of annotated corpora, and challenges related to system scalability. The study also highlights future directions, such as building larger and more diverse corpora, adapting neural models for dialects, and developing analyzers that are more interpretable and trustworthy. Overall, this comparative overview aims to provide a clearer understanding of the current state of Arabic morphological analyzers, synthesize existing research, and offer practical recommendations for future work in this area.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Alaaeddine Ramadan
College of Engineering and Computing, American University of Bahrain
Riffa, Bahrain
Email: alaaeddine.ramadan@aubh.edu.bh

1 INTRODUCTION
With more than 400 million speakers worldwide, Arabic is the official language in 22 countries. It is ranked as the fourth most commonly used language on the internet [1]. Research conducted by several researchers [2], [3] has identified three variations within Arabic: i) classical Arabic (CA), known for its use in literary works and the Quran; ii) modern standard Arabic (MSA), commonly used in formal contexts; and iii) dialectal Arabic (DA), used in informal conversations and everyday interactions [4]. DA further branches out into six groups, including Egyptian, Levantine, Gulf, Iraqi, Maghrebi, and other regional dialects [2], [5], [6]. Like other Semitic languages, Arabic features a morphological structure characterized by root letters, prefixes, suffixes, and diverse grammatical patterns. Morphology studies how words are structured from units known as morphemes, the minimal units of meaning in a language. Understanding how morphemes are arranged within words is crucial for language processing tasks like part-of-speech tagging, parsing, and machine translation [7]. In Arabic, core words have many inflected forms. For instance, an Arabic verb can have up to 5,400 forms, compared to 6 in English, as shown in Table 1.

Table 1. English verb paradigm
VB    VBD    VBG     VBN    VBP    VBZ
go    went   going   gone   go     goes

Journal homepage: http://ijai.iaescore.com
In grammar, verbs change form to convey tense and grammatical aspect. The base form is VB, past tense is denoted by VBD, the gerund or present participle form is VBG, the past participle form is VBN, the non-3rd-person singular form is VBP, and the 3rd-person singular form is VBZ. Arabic verbs can take on forms based on combinations of gender (2), number (3), person (3), aspect (3), particle (2), mood (3), voice (2), pronominal clitic (12), and conjunction clitic (3), as illustrated in Figure 1.

Figure 1. Arabic morphology example

Arabic follows a system based on roots, where words are typically created from a three-letter root. This root system enables the formation of words with related meanings through different patterns and affixes, resulting in a diverse range of lexical forms. In Arabic, inflection involves altering prefixes, suffixes, and infixes to express functions like tense, mood, voice, number, gender, and case. For example, the root "k-t-b" can give rise to words such as "kataba" (he wrote), "yaktubu" (he writes), "kitab" (book), and "maktab" (office). This morphological complexity allows for versatility and richness in expression, but it also introduces challenges in processing Arabic for natural language applications.

The rest of the document is structured as follows: section 2 provides an overview of advancements in syntactic and semantic analysis tailored for Arabic. Section 3 explores approaches to Arabic morphological analysis, highlighting its key aspects and the available analyzers. Section 4 offers an examination of existing techniques used in morphological analysis within Arabic linguistics. In section 5, the challenges and future prospects of natural language processing (NLP) are explored. The article concludes with a summary of the findings.
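The root-and-pattern derivation described above can be sketched with a toy interdigitation function. The slot notation C1/C2/C3 and the pattern strings below are illustrative simplifications in Latin transliteration, not a real Arabic generator:

```python
# Toy illustration of Arabic root-and-pattern morphology: the C1, C2, C3
# slots of a pattern are filled with the consonants of a triliteral root.

def interdigitate(root: str, pattern: str) -> str:
    """Fill the C1, C2, C3 slots of `pattern` with the root consonants."""
    c1, c2, c3 = root.split("-")
    return pattern.replace("C1", c1).replace("C2", c2).replace("C3", c3)

# A few hypothetical patterns for the root k-t-b, matching the examples above.
patterns = {
    "C1aC2aC3a": "perfective verb",    # kataba  (he wrote)
    "yaC1C2uC3u": "imperfective verb", # yaktubu (he writes)
    "C1iC2aC3": "noun",                # kitab   (book)
    "maC1C2aC3": "noun of place",      # maktab  (office)
}

for pattern in patterns:
    print(interdigitate("k-t-b", pattern))
```

The same pattern table applied to a different root (e.g., d-r-s) yields the parallel forms darasa/yadrusu, which is exactly the regularity root-pattern analyzers exploit.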
2 SYNTACTIC AND SEMANTIC ANALYSIS
The two basic methods for comprehending natural language are syntactic and semantic analysis. Syntactic analysis (parsing) examines sentence structure according to grammatical rules. In Arabic, this is challenging due to rich morphology, flexible word order, and diacritics. Words often consist of roots, prefixes, suffixes, and infixes, making morphological analysis a prerequisite. Morphological ambiguity and variable word order notably affect parsing performance [8]. For example, the root k-t-b produces katib (writer), kitab (book), and maktub (written). Although Arabic typically follows a verb-subject-object (VSO) order, as in akal al-rajul al-tuffahah (the man ate the apple), it can also use subject-verb-object (SVO) and other orders, adding complexity. Diacritics, which mark short vowels, are often omitted, leading to ambiguity: ktb can mean kataba (he wrote), kutiba (it was written), or kutub (books). Accurate analysis relies on rules governing agreement, conjugation, and particle use.

Semantic analysis focuses on meaning at the word, phrase, and sentence levels. In Arabic, it is complicated by polysemy, synonymy, and context dependence. Word sense disambiguation (WSD) is vital; for example, 'ayn may mean "eye," "spring," or "spy." Named entity recognition (NER) identifies entities such as Muhammad or al-Qahirah (Cairo). Semantic role labeling defines relationships, as in a'ta Muhammad al-kitab ila Ali (Muhammad gave the book to Ali), where Muhammad is the giver, al-kitab the object, and Ali the recipient. Lexical semantics explores relations like synonyms (sa'id and masrur), antonyms, and hierarchies. Contextual analysis resolves ambiguities, as in dhahaba ila al-madrasah ("He went to school"), where the subject is implied.
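The diacritic-driven ambiguity just described can be made concrete with a toy out-of-context lookup. The lexicon below is a hand-made stand-in in Latin transliteration, not a real analyzer:

```python
# Minimal sketch of why missing diacritics create ambiguity: one
# undiacritized surface form maps to several candidate analyses.

TOY_LEXICON = {
    "ktb": [
        {"diacritized": "kataba", "pos": "verb", "gloss": "he wrote"},
        {"diacritized": "kutiba", "pos": "verb", "gloss": "it was written"},
        {"diacritized": "kutub",  "pos": "noun", "gloss": "books"},
    ],
}

def analyze(surface: str):
    """Return every candidate analysis for an undiacritized form."""
    return TOY_LEXICON.get(surface, [])

analyses = analyze("ktb")
print(len(analyses))  # three readings for a single written form
```

Any real analyzer faces the same fan-out; downstream components (or context models) must then choose among the candidates.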
Ambiguity remains the main challenge in Arabic syntactic and semantic analysis, stemming from omitted diacritics and flexible word order. For instance, kataba al-kitab can mean "he wrote the book" or "the book was
written." Dialectal variation further complicates processing; "house" is bayt in MSA but dar or al-hawsh in dialects. The scarcity of annotated corpora and linguistic resources also limits progress. Despite these challenges, syntactic and semantic analysis are essential for advancing Arabic NLP tasks such as translation, information retrieval, and sentiment analysis.

3 ARABIC MORPHOLOGICAL ANALYSIS APPROACHES AND AVAILABLE ANALYZERS
3.1 Arabic morphological analysis approaches
This section explores various approaches to linguistic analysis based on lexicons, which systematically store linguistic rules. The lexicon comprises two main sections: the first contains the word roots, patterns, and stems, and the other displays related information in the analysis outcomes. The key approaches discussed are:
- Root-pattern morphology: focuses on the relationship between meaning and form, using nonconcatenative methods to derive stems from root-pattern combinations, as described by McCarthy. Prominent systems include the Buckwalter Arabic morphological analyzer (BAMA) and standard Arabic morphological analysis (SAMA) (Table 2).
- Stem-based morphology: expands beyond surface forms to provide linguistic and semantic data for each lexical item. It integrates root-pattern structures with syntactic information, offering a more intuitive framework for lexicon expansion.
- Lexeme-based morphology: recognizes that a single lexeme can produce multiple word forms, focusing on stem-level representations rather than individual root or pattern constituents.
- Syllable-based morphology: although effective in some European languages, syllable-based approaches remain largely unexplored in Semitic languages like Arabic.

Table 2.
Examples of root-pattern morphology
Root     Pattern     Example    Meaning
d-r-s    CaCaCa      darasa     study (he studied)
d-r-s    CACiC       dAris      student
d-r-s    CaCCaCa     darrasa    he teaches
d-r-s    CACiCwn     dAriswn    group of students

3.2 Available morphological analyzers
Standard Arabic language morphological analysis (SALMA) [9] was evaluated using the SALMA Gold Standard corpus, with a focus on the prediction accuracy of 22 morphological features at the morpheme level. The evaluation included two distinct Arabic text samples: the Qur'an [10] and the CCA [11]. Exact match accuracy reached 71.21% for the CCA corpus and 53.50% for the Qur'an, with many of the discrepancies being minor (e.g., symbol substitutions). The system showed particularly strong performance in 15 morphological categories, including part-of-speech (POS), verb and particle subcategories, definiteness, voice, and root-related features, achieving accuracies of 98.53% for CCA and 90.11% for the Qur'an. The remaining 7 categories, such as gender, number, and case, showed slightly lower accuracy, ranging from 81.35%-97.51% for CCA and 74.25%-89.03% for the Qur'an. These results demonstrate the SALMA tagger's effectiveness in delivering fine-grained morphological analysis across various Arabic text genres, leveraging traditional Arabic grammar rules within a knowledge-based framework. In terms of methodology, the SALMA tagger is a rule-based, knowledge-driven analyzer, built on traditional Arabic grammar and the SALMA-ABCLexicon, a massive lexical resource compiled from 23 classical dictionaries (14M tokens; 2.7M vowelized pairs). Its modular design integrates tokenization, lemmatization, root extraction, vowelization, and pattern generation, allowing for highly detailed morpheme-level tagging across 22 features.
Its main strength is the high accuracy in features like POS, verb type, and root-related categories, making it a strong choice for detailed corpus annotation. Its weaknesses appear in categories like gender, case, and number, particularly in classical Arabic, where performance drops compared to MSA. Error analysis shows that many failures are minor (e.g., symbol substitution or misassigned diacritics), though some errors reflect the complexity of handling ambiguous morphosyntactic features. While per-feature accuracy is reported, statistical significance testing and confidence intervals are absent, leaving robustness across corpora less certain.
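The gap between strong per-feature numbers and lower exact-match scores in evaluations like SALMA's follows directly from how the two metrics are computed. A minimal sketch, using invented tokens with only two features instead of SALMA's 22:

```python
# Per-feature accuracy scores each morphological feature independently;
# exact match requires every feature of a token to agree with the gold.

def per_feature_accuracy(gold, pred, feature):
    """Fraction of tokens whose value for one feature matches the gold."""
    hits = sum(g[feature] == p[feature] for g, p in zip(gold, pred))
    return hits / len(gold)

def exact_match_accuracy(gold, pred):
    """Fraction of tokens whose full feature bundle matches the gold."""
    hits = sum(g == p for g, p in zip(gold, pred))
    return hits / len(gold)

gold = [{"pos": "noun", "gender": "m"}, {"pos": "verb", "gender": "f"}]
pred = [{"pos": "noun", "gender": "f"}, {"pos": "verb", "gender": "f"}]

print(per_feature_accuracy(gold, pred, "pos"))  # 1.0: POS always right
print(exact_match_accuracy(gold, pred))         # 0.5: one gender error sinks a token
```

A single wrong feature per token is enough to fail exact match, which is why exact-match scores sit well below the per-category figures.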
SAMA [12] follows a rule-based lexical approach rather than statistical or neural methods. It builds on the Buckwalter analyzer by expanding root and pattern coverage through an enriched lexicon and refined affixation rules. The system outputs all possible morphological parses for a given surface form, which provides wide coverage but leaves the task of contextual disambiguation to external modules. This design reflects both a strength (comprehensiveness of analysis) and a weakness, since in practice the raw outputs are often too ambiguous to use without further processing. The analyzer was primarily developed and distributed by the Linguistic Data Consortium (LDC), and while it does not train on a specific corpus in the way statistical models do, its lexicons are informed by extensive lexical resources curated over years of Arabic linguistic research. In terms of evaluation, SAMA is documented as a linguistic resource rather than a benchmarked system, so no formal evaluation numbers (e.g., accuracy for POS tagging, stemming, or lemmatization) are typically reported, and no confidence intervals or statistical significance testing are provided.

BAMA [13] is a rule-based, lexicon-driven tool for Arabic morphological analysis, designed for MSA by Tim Buckwalter. It uses an ASCII-based representation and includes modules for tokenization, transliteration, lexicon lookup, and morphological analysis, producing detailed output with features like person, number, gender, aspect, and voice. Initially implemented in Perl and later in Java, BAMA supports only Arabic and offers multiple analyses per token. It is widely used in linguistic research, NLP applications, and Arabic language technologies. Resources are curated internally, with no reported training corpus or evaluation against gold standards in the original release.
Performance metrics appear only in later comparative studies, and no statistical significance testing is available for BAMA alone.

The Farasa analyzer [14] is an advanced Arabic NLP tool developed by the Qatar Computing Research Institute (QCRI). It is grounded in a statistical learning approach, specifically a support vector machine (SVM)-rank classifier with linear kernels, which leverages a wide set of linguistic and probabilistic features such as prefix/suffix likelihoods, stem templates, and lexicon lookups. Unlike purely rule-based analyzers, it combines statistical ranking with curated lexicons, striking a balance between efficiency and accuracy. It provides comprehensive NLP capabilities via a RESTful Web API and is available as standalone Java jars. Farasa supports the Arabic language and includes components such as segmentation, spell checking, POS tagging, lemmatization, diacritization, dependency parsing, constituency parsing, and NER. The accuracy of Farasa (up to 98.94%) matches or slightly surpasses state-of-the-art systems. Error analysis reveals weaknesses in handling foreign named entities and overly long words with multiple valid segmentations. In these cases, the model often generates the correct segmentation but misranks it, suggesting room for improvement through richer gazetteers or feature expansion. The analyzer was trained on parts of the Penn Arabic Treebank (ATB) [15] and a large Aljazeera corpus (94M words, 2000-2011), and tested both on ATB subsets and an independent WikiNews set of 18,271 words. For downstream evaluation, Farasa was benchmarked on machine translation using IWSLT TED talks (183K sentences) and the NEWS corpus (202K sentences), and on information retrieval (IR) using the TREC 2001/2002 Arabic newswire collection (59.6M words, 75 topics).
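The generate-candidates-then-rank idea behind Farasa's segmenter can be sketched in miniature. The affix lists and scores below are invented, and the product of affix likelihoods stands in for Farasa's SVM-rank model and its far richer feature set:

```python
# Toy segmentation ranking: enumerate prefix+stem+suffix splits licensed
# by small affix lists, then keep the candidate with the best joint score.

PREFIX_SCORE = {"w": 0.9, "al": 0.8, "": 0.5}  # hypothetical likelihoods
SUFFIX_SCORE = {"h": 0.7, "": 0.5}

def candidates(word):
    """Enumerate prefix+stem+suffix splits allowed by the affix lists."""
    for pre in PREFIX_SCORE:
        for suf in SUFFIX_SCORE:
            if word.startswith(pre) and word.endswith(suf):
                stem = word[len(pre):len(word) - len(suf)]
                if stem:
                    yield (pre, stem, suf)

def best_segmentation(word):
    """Pick the candidate whose affixes have the highest joint score."""
    return max(candidates(word),
               key=lambda c: PREFIX_SCORE[c[0]] * SUFFIX_SCORE[c[2]])

# wktbh, roughly "and he wrote it": expect prefix w-, stem ktb, suffix -h.
print(best_segmentation("wktbh"))  # ('w', 'ktb', 'h')
```

The misranking failures reported for Farasa correspond here to the correct split being among the candidates but not the argmax, which is why richer features or gazetteers help.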
The AlKhalil analyzer has two versions: i) the first version, developed in 2010 [16], provides all possible vowelized forms for a given word, each accompanied by detailed morphological information, including clitics, stem, root, and POS tag; and ii) the second version, developed in 2017 [17], adopts a rule-based morpho-syntactic approach implemented in Java. It relies on an extensive, carefully structured lexicon of derived and non-derived words, clitic lists, and root-pattern files, enriched with lemmas and patterns. Its workflow includes normalization, segmentation into proclitics/stems/enclitics, and parallel analysis of stems as exceptional, non-derived, derived nouns, or verbs. Validation steps check compatibility between clitics, stems, and diacritics before producing the set of possible analyses. A major strength of this system is its broad lexical coverage (over 4.1M vowelized stems), high accuracy, and speed, which together make it robust and efficient for downstream tasks. However, like many out-of-context analyzers, it produces multiple candidate analyses for ambiguous words, which can overwhelm applications without a disambiguation module. For example, the non-vowelized form 'lm can yield outputs like 'ilm (science), 'alam (flag), or 'ulima (was known), underscoring its reliance on external disambiguation for context-sensitive interpretation. The system was evaluated on more than 72 million diacritized words from the Tashkeela corpus (63M) [18], Nemlar (0.5M), and RDI (8.5M). Results showed coverage of 99.31%, with an average of 4.71 lemmas, 5.08 stems, and 8.05 vowelized forms per word, reflecting its rich lexical resources. On Nemlar, it achieved 97.16% lemma match, 96.76% stem match, and 97.21% diacritization accuracy, with full-feature match at 96.56%. Its throughput reached 632 words/second, balancing speed with coverage.
The authors do
not report statistical significance testing or confidence intervals.

Arabic Stanford Segmenter: the Arabic Stanford Segmenter [19] is a widely recognized tool for morphological segmentation and tokenization of Arabic text. Developed as part of the Stanford NLP Group's toolkit, it is based on a conditional random fields (CRF) model trained on annotated Arabic corpora. The tool is particularly effective in addressing the challenges of Arabic morphology, which include affixation, clitics, and the absence of clear word boundaries in written form. The Stanford Segmenter attempts to segment clitics correctly using a statistical model that learns from linguistic patterns in annotated data, primarily drawing on the Penn Arabic Treebank (PATB) [15]. Unlike rule-based systems that may require extensive linguistic input and manual tuning, the Arabic Stanford Segmenter leverages machine learning techniques, which allow it to generalize well across different domains. It outputs both segmented tokens and their corresponding morphological analyses, making it a comprehensive preprocessing solution for modern Arabic NLP pipelines. Reported results show strong performance, with an F1 of 92.09% on Egyptian Arabic and statistically significant gains (p < 0.001) over prior baselines, plus a 7x decoding speedup compared to MADA and MADA-ARZ. Error analysis highlights three issues: i) inconsistencies in gold data, ii) overly local segmentation features, and iii) context-sensitive ambiguities (e.g., wla meaning "and not" or "or", and -na as pronoun vs. verb suffix). Strengths include dialect-agnostic design, tested improvements, and efficiency; weaknesses lie in handling context-sensitive segmentation and data inconsistencies.

MADAMIRA [20] is a morphological analyzer that assigns morphological tags to each word in a sentence by considering the word's context.
It integrates two morphological analysis systems: MADA [21] and AMIRA [22]. Initially, the system analyzes the words of a sentence out of context using the SAMA analyzer [12]. To choose a single solution from the multiple options generated in this first phase, a disambiguation step based on SVMs and language models is performed. It adopts a machine learning approach that relies on linear SVM classifiers and n-gram language models for morphological feature prediction, combined with ranking modules for disambiguation. Unlike its Perl-based predecessors, it is implemented in Java, which contributes to its robustness, portability, and remarkable efficiency, achieving speed improvements of up to 20x. The analyzer supports both MSA and Egyptian Arabic (EGY), using the PATB (parts 1-3) and the Egyptian Arabic Treebanks (parts 1-6) as training data, respectively. The test sets included around 25K words for MSA and 20K for EGY. Evaluation shows high accuracy: for MSA, 95.9% POS accuracy, 96.0% lemma accuracy, and 86.3% diacritization; for EGY, 92.4% POS, 87.8% lemma, and 83.2% diacritization. Tokenization reached 98.9% perfect accuracy in MSA and 96.6% in EGY. MADAMIRA's strengths lie in its broad functionality (morphological disambiguation, diacritization, POS tagging, tokenization, glossing, and stemming), speed, and extensibility. It also allows flexible tokenization schemes and provides both XML and HTTP interfaces, making it user-friendly. Weaknesses include a slight drop in accuracy compared to MADA for some metrics (up to 0.6% lower in EGY full morphological accuracy) and heavy memory requirements (up to 2.5 GB of heap space). Overall, evaluation results are reported with clear accuracy percentages but without statistical significance testing or confidence intervals, leaving robustness comparisons open for further analysis.
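The two-phase analyze-then-disambiguate design described for MADAMIRA can be sketched with a toy ranker. Here the "model" is just invented lemma frequencies standing in for MADAMIRA's SVM and n-gram language-model components, and the candidate table stands in for SAMA's out-of-context output:

```python
# Phase 1: an out-of-context analyzer proposes candidates per word.
# Phase 2: a scoring model picks one candidate for the sentence context.

CANDIDATES = {  # hypothetical out-of-context analyses (transliterated)
    "ktb": [("kataba", "verb"), ("kutub", "noun"), ("kutiba", "verb")],
}
LEMMA_FREQ = {"kataba": 120, "kutub": 45, "kutiba": 12}  # invented counts

def disambiguate(word):
    """Rank the analyzer's candidates and return the highest-scoring one."""
    return max(CANDIDATES[word], key=lambda c: LEMMA_FREQ[c[0]])

print(disambiguate("ktb"))  # ('kataba', 'verb')
```

A frequency prior like this picks the majority reading regardless of context; MADAMIRA's contribution is precisely replacing it with context-aware SVM and language-model scores.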
CAMEL MORPH MSA [23] is a comprehensive and publicly available morphological analyzer and generator for MSA. Featuring over 100,000 lemmas and support for rare morphological features inherited from classical Arabic, it significantly expands the analytical capabilities of Arabic NLP tools. The system generates approximately 1.45 billion analyses and 535 million distinct diacritizations. CAMEL MORPH MSA integrates seamlessly with the CAMeL Tools Python suite [24], ensuring ease of use. Evaluation across large datasets, including MSA-CB, CA-CB, and PATB-Train, shows robust accuracy and significantly improved coverage. In terms of strengths, CAMEL MORPH MSA dramatically improves lexical coverage and reduces out-of-vocabulary (OOV) rates by 36% compared to SAMA/CALIMA across massive corpora like MSA-CB (9.9B tokens, 11.4M types) and CA-CB (0.7B tokens, 2.4M types). Evaluation on PATB-Train showed 95.9% recall, with manual inspection attributing about 90% of mismatches to annotation errors rather than the system itself, highlighting its reliability. Error analyses revealed challenges in handling spelling inconsistencies, lemma-stem mismatches, and ambiguous paradigms. Its main weakness lies in speed, running 2.4-2.9 times slower than SAMA, though it offers richer analyses per word. Importantly, the results were reported with dataset-scale evaluations and manual error breakdowns, but without explicit statistical significance testing or confidence intervals.

Alma [25] is an open-source tool for Arabic language processing that integrates lemmatization, POS tagging, and root extraction. Its approach is primarily frequency-based and lexicon-driven, leveraging a large pre-computed memory built from the Qabas lexicographic database [26], the Shamela corpus, and digitized
lexicons. This design shifts computational complexity from runtime analysis to memory construction, enabling Alma to achieve very high processing speeds, lemmatizing around 34,000 tokens per second. For OOV cases, Alma integrates a fine-tuned bidirectional encoder representations from transformers (BERT) model to improve POS tagging, which achieved F1-scores above 98% on the Arabic Treebank (ATB) for POS classification. Its coverage extends across 40 POS tags and includes the first fully functional root tagger grounded in Qabas. Evaluation results highlight Alma's competitive performance: on the LDC Arabic Treebank (339K tokens) it reached 87.8% in true lemmatization and 92.7% in POS tagging, while on the SALMA corpus (34K tokens) it achieved 90.5% and 93.8%, respectively. These scores were further improved when combined with BERT for OOV handling. Speed comparisons showed Alma vastly outperformed MADAMIRA (1,710 seconds vs. 10 seconds on the ATB). Error analysis revealed that most failures were due to ambiguous lemmatization (61% of errors), where Alma favored the most frequent lemma even when it was contextually less accurate, and to general POS confusions, such as mistaking adjectives for nouns.

Ibn-Ginni is a hybrid Arabic morphological analyzer that combines the speed and precision of the Buckwalter Arabic morphological analyzer (BAMA) with the broader classical Arabic coverage of the AlKhalil analyzer. To improve coverage, morphological data for 3 million unique Arabic words was generated using AlKhalil, refined, and added to BAMA's database. The resulting system analyzed 600,000 more words than BAMA alone, with an average analysis time of 0.3 milliseconds per word. In benchmark testing, Ibn-Ginni provided full morphological solutions for 72.72% of words and partial solutions for 24.24%, demonstrating improved performance and efficiency [27].
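The precomputed-lookup design attributed to Alma above can be sketched as follows. The table entries are invented transliterated examples, and the OOV branch merely stubs out what in Alma is a fine-tuned BERT model:

```python
# Almost all work happens offline when the lemma table is built, so
# runtime lemmatization reduces to a single dictionary access.

LEMMA_TABLE = {  # surface form -> (most frequent lemma, pos, root)
    "yktb":  ("katab", "verb", "k-t-b"),
    "ktab":  ("kitab", "noun", "k-t-b"),
    "mktbh": ("maktaba", "noun", "k-t-b"),
}

def lemmatize(token):
    """O(1) lookup; unknown tokens fall through to a (stubbed) OOV model."""
    entry = LEMMA_TABLE.get(token)
    if entry is None:
        return (token, "unk", None)  # in Alma, a BERT model handles OOV
    return entry

print(lemmatize("yktb"))  # ('katab', 'verb', 'k-t-b')
print(lemmatize("xyz"))   # ('xyz', 'unk', None)
```

The design choice also explains the reported error profile: always returning the most frequent lemma is exactly what produces the 61% of errors on contextually ambiguous forms.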
SinaTools [28] is an open-source toolkit developed at Birzeit University. It adopts a hybrid methodology that integrates rule-based resources with modern machine learning, particularly fine-tuned BERT models. Its morphological analysis module, Alma, relies on a frequency-based lexicon where lemmatization, POS tagging, and root tagging are handled through dictionary lookups, while a BERT-based model supports OOV handling. Other modules, including NER and word sense disambiguation (WSD), are also powered by transformer models such as AraBERTv2 [29]. This design not only ensures speed and accuracy but also provides flexibility through various integration interfaces, including CLI, API, and SDK. Its modularity and extensibility allow developers to plug in additional NLP tasks with minimal effort, which highlights its strength as a research and applied tool. However, the reliance on pre-computed lexicons limits its adaptability in unseen or domain-shifted contexts, as illustrated by consistent verb-tagging of ambiguous words regardless of context. The toolkit is trained and evaluated on several corpora. Morphological evaluation was conducted on the Arabic TreeBank (ATB, 339K tokens) and the SALMA dataset (34K tokens), while NER was tested on the Wojood datasets [30], including WojoodGaza (50K tokens from news texts) and a Politics dataset (12K tokens). WSD was benchmarked using the SALMA sense-annotated corpus (34K tokens), and semantic relatedness was assessed through SemEval-2024 with 595 sentence pairs. In terms of performance, SinaTools achieved lemmatization accuracy of 90.5% and POS tagging at 97.5%. Its NER module reached an F1-score of 87.3%, the WSD module recorded 82.6% overall accuracy, and semantic relatedness scored 0.49 Spearman correlation.
These evaluations, though impressive, underline that SinaTools's strength lies in high-speed lexicon-backed morphology with hybrid neural extensions.

Camelira [31] is a multi-DA morphological disambiguator that integrates statistical and neural approaches for analysis. Its backbone relies on CAMeL Tools' morphological disambiguation system. The tool covers four Arabic varieties: MSA, Egyptian, Gulf, and Levantine, and is accessible through a user-friendly web interface. Distinguishing itself from prior analyzers, Camelira not only outputs disambiguated readings in context but also presents alternative out-of-context analyses along with probability scores. A key strength is its integration of dialect identification, which automatically selects the appropriate disambiguator, making it valuable for learners or researchers who may not know the input dialect. However, its coverage is limited to specific dialects, and the system struggles with unseen genres or underrepresented varieties, producing occasional errors when processing texts outside its training distribution. Sample outputs in the interface demonstrate diacritized text, tokenized forms, lemmas, and full morphological features, but Gulf Arabic lacks diacritization due to unavailable annotated resources. In terms of resources, Camelira relies on the datasets used in the CAMeL Tools pipeline and the multi-Arabic dialect applications and resources (MADAR) shared task for dialect identification. For morphological disambiguation, the model achieves the following accuracy across dialects: MSA (95.9% for all tags, 98.7% POS), Egyptian (90.5%, 94.0%), Gulf (93.8%, 96.6%), and Levantine (85.5%, 92.7%).
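Camelira's routing idea, where a dialect identifier selects which variety-specific disambiguator to run, can be sketched in miniature. Both the keyword-based identifier and the per-dialect "disambiguators" below are crude stand-ins, not Camelira's actual models:

```python
# Toy dialect-ID routing: cue words (transliterated, hand-picked) select
# a variety; the matching disambiguator is then dispatched.

DIALECT_CUES = {
    "egy": {"ezayyak"},   # Egyptian greeting cue (illustrative)
    "glf": {"shlonak"},   # Gulf greeting cue (illustrative)
    "msa": set(),         # fallback variety, no cues needed
}

def identify_dialect(tokens):
    """Return the first variety whose cue set intersects the input."""
    for dialect, cues in DIALECT_CUES.items():
        if cues & set(tokens):
            return dialect
    return "msa"  # default to MSA when no dialectal cue fires

def disambiguate(tokens):
    """Route the sentence to the disambiguator for its detected variety."""
    dialect = identify_dialect(tokens)
    return dialect, f"run {dialect} disambiguator on {len(tokens)} tokens"

print(disambiguate(["ezayyak"]))  # routed to the Egyptian model
```

The fallback branch also mirrors the reported weakness: inputs from varieties outside the cue inventory silently get the default model, which is where out-of-distribution errors arise.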
Zalmout and Habash's [32] bidirectional long short-term memory (Bi-LSTM) morphological disambiguation system is a neural disambiguation model for Arabic that combines Bi-LSTM architectures with morphological analyzers. Unlike earlier rule-based or statistical approaches, the system leverages word- and character-level embeddings enriched with subword and morphological features (such as affixes or dictionary-based tags). Its strength lies in using the outputs of a traditional morphological analyzer not as a replacement but as a guide, ranking possible analyses with learned probabilities. This hybrid design captures long-distance dependencies better than fixed-window methods and significantly boosts disambiguation for morphologically rich features like case and mood. Weaknesses remain in areas such as case assignment and rare categories (e.g., second-person verbs, passive voice), where ambiguity and data sparsity still limit performance. The authors provide a detailed error analysis, showing, for instance, that while their system doubles the cases where it outperforms MADAMIRA, some errors persist, especially for morphosyntactic cues heavily reliant on syntax. For evaluation, the authors use PATB parts 1-3 as the main dataset (503K training words, 63K words each for development and test), complemented with pre-trained embeddings from the 2.15-billion-word Arabic Gigaword corpus [33]. Results demonstrate full morphological analysis accuracy of 90.0%, and 76.9% for OOV words. Across specific features, POS tagging reached 97.9%, case tagging improved by 3.7 points, and diacritization accuracy was 91.7%. These results are statistically significant across metrics, supported by comparative error analyses and confidence-based scoring.
Overall, the system illustrates the enduring value of combining deep neural architectures with traditional analyzers, showing measurable improvements while highlighting remaining gaps in modeling fine-grained Arabic morphology.

The neural-based Arabic morphological analyzer [34] employs a neural approach, specifically a recurrent neural network (RNN), to perform Arabic morphological analysis. Unlike earlier rule-based systems, this model leverages sub-word information (prefixes, infixes, roots, and suffixes) and converts it into vectors for sequence modeling. The analyzer aims to overcome two main gaps in prior work, particularly in the Jabalin system: the inability to identify nouns and the heavy reliance on dictionaries for verb form classification. By combining pattern extraction, sub-word vectorization, and RNN-based classification, the system is able to automatically identify morphosyntactic descriptions (MSDs) for both verbs and nouns. This design highlights a strength in its ability to handle dictionary dependency problems and generalize to nouns derived from verbal roots, something previous analyzers struggled with. However, one noted limitation is reduced accuracy for certain rare verb forms ("Iii"), where performance dropped to 73%, indicating challenges in modeling less frequent patterns. For its dataset, the system relies on the Qur'anic Arabic Corpus [10], which already includes morphological labels; with preprocessing, the initial 1,778 unique words were expanded into a dataset of over 30,936 labeled words using linguistic pattern tables. After splitting, 24,748 words were used for training and 6,188 for testing. The evaluation reported 99% overall accuracy, 99% precision, 96% recall, and 97% F1-score, with results broken down by POS, aspect, gender, number, and verb form. Statistical comparisons with the Jabalin system showed a marked improvement (99% vs.
39% overall accuracy), especially in noun recognition (99% vs. 0%). While the study did not report formal significance testing or confidence intervals, the detailed per-feature results (POS, tense, gender, number, and verb form) demonstrate robust evaluation across morphological categories. Morphosyntactic tagging with pre-trained transformer models (CAMeLBERT) [35] adopts a neural approach, fine-tuning pre-trained transformer models (CAMeLBERT-MSA for MSA and CAMeLBERT-Mix for dialects). Each morphosyntactic feature is modeled with an independent classifier, and in some setups, predictions are refined using external morphological analyzers (SAMA for MSA, CALIMA for Egyptian, and automatically induced analyzers for Gulf and Levantine). This hybrid design shows clear strengths: it achieves state-of-the-art results across all varieties studied, with absolute improvements. Its weaknesses, however, stem from reliance on analyzer quality and dialectal orthographic inconsistency; while manually crafted analyzers improve tagging accuracy, automatically generated ones can sometimes hurt performance as data grows. Error analysis highlights difficulties with enclitics and nominal distinctions, particularly in dialects, with POS misclassifications and annotation inconsistencies contributing to common failures. The model was trained and tested on four corpora: PATB (629K tokens, MSA), Gumar (202K, Gulf), ARZTB (175K, Egyptian), and Curras (57K, Levantine). Evaluation includes POS tagging and full morphosyntactic feature prediction. For POS tagging, accuracy reached 98.9% (MSA), 96.9% (Egyptian), 97.9% (Gulf), and 94.6% (Levantine). For full morphosyntactic tagging (ALL TAGS), accuracy was 96.3% (MSA), 91.0% (Egyptian),
95.7% (Gulf), and 87.6% (Levantine). Results are statistically significant (McNemar's test, p < 0.05). The system leverages pre-trained transformers, external analyzers, and cross-dialectal transfer, while highlighting resource and annotation limitations. Zalmout et al.'s [36] neural disambiguator for Egyptian Arabic adopts a neural, Bi-LSTM-based approach for morphological tagging and disambiguation of Egyptian Arabic. It integrates word and character embeddings (tested with both convolutional neural network (CNN) and long short-term memory (LSTM) variants), alongside embedding-space mapping to handle noisy, user-generated dialectal text. Noise normalization is applied at the vector level, avoiding raw text alterations. The system leverages morphological analyzers derived from the SAMA, CALIMA, and ADAM resources to generate candidate analyses, which are then resolved using neural models. A large in-house Egyptian Arabic corpus (410M words) was used for pre-training embeddings, while the annotated ARZ corpus (160K tokens; split into 134K train, 20K dev, and 21K blind test) was used for supervised training and evaluation. In terms of performance, the best configuration achieved POS accuracy of 93.6%, lemma accuracy of 88.1%, diacritization accuracy of 83.8%, and full morphological analysis accuracy of 78.4%, yielding significant error reductions over the MADAMIRA baseline (e.g., 21.9% relative improvement in POS). Error analysis revealed strengths in handling noisy orthography and clitic segmentation, but also weaknesses such as frequent confusion among nominal categories (74% of POS errors) and issues with Hamza spelling, diacritization propagation, and MSA-EGY cognate mismatches. Interestingly, when trained on CODA-normalized orthography, results nearly matched the best noise-robust setup, suggesting the model closely approaches the performance ceiling for such data.
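Significance claims such as the McNemar's test result cited for CAMeLBERT above can be checked with a short computation over paired system outputs on the same test items. The sketch below uses the standard continuity-corrected chi-squared form with one degree of freedom; the data are toy prediction vectors, not drawn from any of the cited evaluations.

```python
from math import erf, sqrt

def mcnemar(gold, sys_a, sys_b):
    # b: items system A got right and B got wrong; c: the reverse.
    b = sum(1 for g, a, s in zip(gold, sys_a, sys_b) if a == g and s != g)
    c = sum(1 for g, a, s in zip(gold, sys_a, sys_b) if a != g and s == g)
    # Chi-squared statistic with continuity correction (df = 1).
    chi2 = (abs(b - c) - 1) ** 2 / (b + c) if (b + c) > 0 else 0.0
    # Two-sided p-value via the chi-squared(1) survival function,
    # expressed through the normal CDF: P(X > x) = 2 * (1 - Phi(sqrt(x))).
    p = 2 * (1 - 0.5 * (1 + erf(sqrt(chi2) / sqrt(2))))
    return chi2, p

# Toy example: system A is correct on all 10 items, system B on only one.
gold = [1] * 10
chi2, p = mcnemar(gold, [1] * 10, [0] * 9 + [1])
```

Because the test only looks at items where the two systems disagree in correctness, it is well suited to tagger comparisons where both systems are right on the vast majority of tokens.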
Statistical reporting includes accuracy metrics with relative error reduction; however, no explicit confidence intervals or significance tests were provided. Stanza (StanfordNLP) for Arabic UD [37] is a fully neural, language-agnostic NLP toolkit developed by Stanford, designed to process raw text through a complete pipeline including tokenization, multi-word token expansion, lemmatization, POS and morphological tagging, dependency parsing, and NER. For Arabic, it relies on the PADT treebank within the Universal Dependencies (UD v2.5) framework. Its models use Bi-LSTM architectures with biaffine scoring for syntactic analysis, and seq2seq ensembles for lemmatization and token expansion, allowing the system to generalize effectively across diverse languages. A key strength lies in its broad multilingual coverage (66 languages) and its ability to handle text end-to-end from raw input, producing competitive or state-of-the-art performance. Its weaknesses, however, include slower runtime compared to lightweight systems such as spaCy, and occasional errors in sentence segmentation and multi-word token expansion in morphologically rich languages. The authors also acknowledge computational cost as a limiting factor for scalability and efficiency. For datasets, Stanza was trained on 112 corpora, with Arabic specifically using the PADT UD treebank (non-copyrighted portion), plus additional NER data such as AQMAR. The Arabic PADT evaluation shows very high tokenization accuracy (99.98) and strong performance in POS tagging (UPOS 94.89, XPOS 91.75, UFeats 91.86), lemmatization (93.27), and dependency parsing (UAS 83.27, LAS 79.33). For Arabic NER, Stanza achieved an F1 score of 74.3 on AQMAR, comparable to FLAIR but outperforming spaCy where available.
Results were benchmarked against UDPipe and spaCy using the official UD evaluation script, but no statistical significance tests or confidence intervals were reported. Stanza demonstrates robustness and breadth, though efficiency and handling of genre/domain variability remain areas for improvement. For Arabic (PADT treebank), UDPipe 2.0 [38], a neural UD pipeline, showed strong but not flawless performance. It achieved very high segmentation scores (tokens: 99.98, words: 93.71, sentences: 80.89 F1), indicating reliable basic preprocessing. In tagging, it reached UPOS 90.64, XPOS 87.81, and UFeats 88.05, while lemmatization stood at 87.38. Parsing results were competitive but lower: UAS 88.94, LAS 72.34, MLAS 63.77, and BLEX 65.66. These numbers highlight that while UDPipe is robust at segmentation and POS tagging, parsing complex Arabic syntax remains challenging. Strengths lie in its end-to-end neural joint model that handles multiple tasks consistently without language-specific parameter tuning. However, weaknesses emerge with Arabic morphology and syntax, where error analysis indicates struggles with rich inflection, clitic segmentation, and long-distance dependencies. For example, the model often produces incorrect lemma forms when diacritics or clitics alter the base word, and dependency arcs occasionally mislabel subordinate clauses or prepositional phrases, reducing LAS. While these shortcomings are typical in morphologically rich languages, the consistency of UDPipe's results across treebanks suggests its architecture generalizes well, even if Arabic parsing lags behind segmentation accuracy.
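The UAS and LAS parsing scores quoted for Stanza and UDPipe count, for each token, whether the predicted head matches the gold tree (UAS) and whether both head and dependency label match (LAS). A minimal sketch of the computation, using a toy four-token sentence with hypothetical labels:

```python
def uas_las(gold, pred):
    # gold/pred: one (head_index, dep_label) pair per token.
    assert len(gold) == len(pred)
    uas = sum(1 for (gh, _), (ph, _) in zip(gold, pred) if gh == ph)
    las = sum(1 for g, p in zip(gold, pred) if g == p)
    n = len(gold)
    return uas / n, las / n

# Toy 4-token sentence: heads are 1-based token indices, 0 = root.
gold = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "det"), (2, "nmod")]
print(uas_las(gold, pred))  # (0.75, 0.5)
```

LAS is always at most UAS, which is why the large UAS-LAS gap reported for Arabic UDPipe (88.9 vs. 72.3) signals frequent label errors on correctly attached tokens.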
UDify (mBERT multi-task morphology for UD Arabic) [39] is a multilingual, multi-task neural analyzer built on pretrained mBERT embeddings. It jointly predicts POS tags, morphological features, lemmas, and dependency parses using a self-attention architecture. Trained on 124 UD treebanks (75 languages), including Arabic PADT (around 6.1K sentences), UDify achieves strong syntactic accuracy (UPOS 96.58%, UAS 87.72%, LAS 82.88%) but performs poorly in lemmatization (73.55%) due to the lack of character-level embeddings, a key limitation for morphologically rich languages. While multilingual training boosts parsing for Arabic, weaknesses remain in morphology-sensitive tasks like lemmas and UFeats. No statistical significance testing or confidence intervals were reported. Gulf Arabic neural morphology [40] combines rule-based morphological analyzers with a neural disambiguation model. Specifically, the Gulf Arabic analyzer was created automatically through paradigm completion based on annotated training data, while high-quality manual analyzers were used for MSA (SAMA) and Egyptian Arabic (CALIMA). For disambiguation, the authors employed a neural joint model (sequence-to-sequence with shared encoders for lexical and morphological features) alongside a baseline maximum likelihood estimation (MLE) system. They tested different setups: no analyzer, Gulf-only analyzer, and combinations with the MSA and EGY analyzers, embedding or ranking the candidates. The strengths lie in the system's ability to handle Gulf Arabic morphology for the first time and its adaptability to data size. However, weaknesses appear when analyzers constrain the neural model, especially in lemmatization: ranking candidates often reduces accuracy, showing that the analyzer's limited coverage can restrict performance rather than improve it.
The system was trained and evaluated on the annotated Gumar corpus [41], Emirati Arabic novels totaling about 202K tokens across train/dev/test splits (with 162K tokens for training). Additional embeddings were drawn from the larger 100M-token Gumar corpus. On the test set, the best configuration reached full analysis 89.2%, TAGS 92.9%, LEX 93.1%, POS 96.7%, and SEG 97.3%. Results were reported with detailed breakdowns but without statistical significance testing or confidence intervals. Error analysis highlighted that lemmatization remained the weakest link, often suffering when the analyzers' lexicon failed to match the diversity of Gulf lemmas.

4 CLASSIFICATION OF ARABIC MORPHOLOGICAL ANALYSIS TECHNIQUES
The results of all twenty analyzers, along with their performance metrics, are presented in Table 3. These metrics (accuracy, precision, recall, and F1-score) are compiled from reported sources to enable direct comparison. When grouped by methodological approach, the results reflect both strengths and trade-offs. Rule-based analyzers such as AlKhalil (2017) and SALMA (2013) show high reliability on large, curated resources, often exceeding 95% in tasks like lemma accuracy or diacritization. However, their performance drops considerably when applied to dialects or fine-grained categories (e.g., patterns, stems), suggesting limited adaptability. Hybrid systems like Madamira (2014), Ibn Ghini (2024), and SinaTools (2024) extend coverage by combining rules with statistical or neural components, producing robust segmentation and morphological disambiguation. Nonetheless, their performance is not uniform: Madamira achieves over 98% in segmentation but lower scores (77-86%) in diacritization and full solutions, while SinaTools performs well in lemma and POS tagging yet shows moderate outcomes in WSD and semantic tasks. Corpus size and diversity have a clear impact across systems.
Models trained on large, balanced resources such as the PATB (>1.3M words) or Gumar (202K annotated) consistently produce higher accuracy and better generalization. For instance, Stanford's segmenter (2020) and Farasa (2016) achieve segmentation accuracy near or above 98%, reflecting the advantage of large-scale training. In contrast, analyzers built on smaller or domain-specific corpora, such as the Qur'anic Arabic Corpus (31K words), show very strong performance within that domain (99% accuracy) but with limited applicability beyond it. Dialectal extensions, as in Camelira (2022) or Zalmout-Habash (2018), demonstrate competitive results (POS 90-94%), though accuracy remains below that of MSA systems, highlighting the challenges of resource scarcity and linguistic variability. Neural architectures dominate recent benchmarks in Arabic morphosyntactic tagging and parsing. Models like CAMeLBERT (2022) and UDify (2019) exceed 95% in UPOS and full-tagging tasks, confirming the strength of contextualized embeddings and multitask learning. Hybrid systems remain relevant: Ibn Ghini (2024) offers notable efficiency (0.3 ms/word), and Madamira provides broad functional coverage, making them practical where speed and explainability matter more than state-of-the-art accuracy. The field is shifting: rule-based systems excel in controlled settings, hybrid approaches offer balance, and neural architectures deliver top accuracy when large, diverse corpora are available.
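The accuracy, precision, recall, and F1 figures compiled in Table 3 all derive from the same confusion counts. A minimal sketch of the standard computation (the counts below are toy values, not drawn from any of the cited evaluations):

```python
def prf1(tp, fp, fn):
    # Precision: fraction of predicted analyses that are correct.
    precision = tp / (tp + fp)
    # Recall: fraction of gold analyses that were recovered.
    recall = tp / (tp + fn)
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy counts: 81 correct analyses, 19 spurious, 9 missed.
p, r, f = prf1(tp=81, fp=19, fn=9)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.81 0.9 0.85
```

Keeping these definitions in mind helps when comparing rows of Table 3, since some systems report only accuracy while others report the full precision/recall/F1 triple.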
Table 3. Arabic morphological analyzers performance

| Analyzer | Approach | Corpus/Size | Morpheme | Evaluation |
|---|---|---|---|---|
| CAMEL MORPH 2024 | Rule-based | PATB / 1.5M words | Lemma + analysis + diacritization | Recall: 95.9% |
| Alma 2024 | Frequency-based + lexicon-driven + BERT (OOV) | LDC ATB (1.5M), SALMA (500K) | Morphological analysis | F1: 88% (ATB), 90% (SALMA) |
| Ibn Ghini 2024 | Hybrid | 3M words / 600K analyzed | Full morphological solutions | Accuracy: 72.72% |
| | | AlKhalil + BAMA extended / 3M words | Partial solutions | Accuracy: 24.24% |
| | | | Analysis speed | Time: 0.3 ms/word |
| AlKhalil 2017 | Rule-based | Nemlar (500K) & Teshkeela (75M) | Rate-lemma | Accuracy: 97.16% |
| | | | Rate-stem | Accuracy: 96.76% |
| | | | Rate-diac | Accuracy: 97.21% |
| | | | Rate-full | Accuracy: 96.56% |
| | | Gold standard (MSA, 546 words) | Root | Acc: 74.96%, Prec: 78.94%, Rec: 74.96%, F1: 76.90% |
| | | | Stem | Acc: 54.43%, Prec: 57.33%, Rec: 54.43%, F1: 55.84% |
| | | | Pattern | Acc: 36.35%, Prec: 38.28%, Rec: 36.35%, F1: 37.29% |
| | | Multi-dialect (EGY, TUN, 10 sentences each) | N/A | Acc: 68%, Prec: 68%, Rec: 66%, F1: 67% |
| Stanford 2020 | Statistical ML (CRF-based) | Penn ATB (>1.3M words) | Segmentation | F1: 98.24% |
| Farasa 2016 | Statistical (SVM-rank + lexicon) | Noor-Ghateh (223,690 words) | Word segmentation | Acc: 81%, Prec: 81%, F1: 89% |
| | | Penn ATB (>1.3M words) | Segmentation (base) | Acc: 98.76% |
| | | Penn ATB (>1.3M words) | Segmentation (lookup) | Acc: 98.94% |
| Madamira 2014 | Hybrid (rule-based + ML disambiguation with SVM + LMs) | MSA (25K words) | EVALDiac | Acc: 86.3% |
| | | | EvalLex | Acc: 96% |
| | | | EvalFull | Acc: 84.1% |
| | | | Perfect Tok | Acc: 98.9% |
| | | | Correct segmentation | Acc: 99.2% |
| | | EGY (20K words) | EVALDiac | Acc: 83.2% |
| | | | EvalLex | Acc: 87.8% |
| | | | EvalFull | Acc: 77.3% |
| | | | Perfect Tok | Acc: 96.6% |
| | | | Correct segmentation | Acc: 97.6% |
| | | Penn ATB (>1.3M words) | Segmentation | Acc: 98.76% |
| | | Noor-Ghateh (223,690 words) | Word segmentation | Acc: 80%, Prec: 80%, Rec: 99%, F1: 88% |
| | | Multi-dialect (EGY, TUN, MSA, 10 sentences each) | N/A | Acc: 85%, Prec: 86%, Rec: 88%, F1: 87% |
| SALMA 2013 | Rule-based, knowledge-driven (grammar + lexicon) | CCA (500K), Qur'an (77K) | Morphological features | Acc: 98.53% (CCA), 90.11% (Qur'an) |
| | | CCA (500K), Qur'an (77K) | Remaining categories | Acc: 81.35-97.51% (CCA), 74.25-89.03% (Qur'an) |
| SinaTools 2024 | Hybrid (rule-based + BERT/transformers) | ATB (339K), SALMA (34K) | Morphology (lemma, POS) | Lemma: 90.5%, POS: 97.5% |
| | | Wojood (50K), Politics (12K) | NER | F1: 87.3% |
| | | SALMA Sense (34K) | WSD | Acc: 82.6% |
| | | SemEval-2024 (595 pairs) | Semantic relatedness | Spearman: 0.49 |
| Camelira 2022 | Statistical + neural (CAMeL Tools backbone) | MSA | Morphological disambiguation | All tags: 95.9%, POS: 98.7% |
| | | Egyptian | | All: 90.5%, POS: 94.0% |
| | | Gulf | | All: 93.8%, POS: 96.6% |
| | | Levantine | | All: 85.5%, POS: 92.7% |
| Zalmout and Habash 2017 | Neural (Bi-LSTM + analyzer guidance) | PATB (503K train, 63K dev/test) | Morphological disambiguation | Full: 90.0%, OOV: 76.9% |
| | | Gigaword (2.15B) | | POS: 97.9%, Diac: 91.7% |
| Neural analyzer (RNN) | Neural (RNN + subword vectors) | Qur'anic Arabic Corpus (31K) | Morphological analysis | Acc: 99%, Prec: 99%, Rec: 96%, F1: 97% |
| CAMeLBERT 2022 | Neural (transformer-based + analyzer support) | ATB (629K) | Morphosyntactic tagging | POS: 98.9%, All tags: 96.3% |
| | | ARZTB (175K), Gumar (202K), Curras (57K) | | Dialects: 91-95% |
| Zalmout-Habash 2018 (EGY) | Neural (Bi-LSTM, noise-robust) | ARZ (160K), Gumar (410M pretrain) | Morphological disambiguation | POS: 93.6%, Lemma: 88.1%, Diac: 83.8%, Full: 78.4% |
| Stanza 2020 | Neural (Bi-LSTM + seq2seq) | PADT UD (Arabic) | Full UD morphology | UPOS: 94.9, XPOS: 91.8, UFeats: 91.9, Lemma: 93.3 |
| | | | Dependency parsing | UAS: 83.3, LAS: 79.3 |
| | | AQMAR | NER | F1: 74.3 |
| UDPipe 2.0 (2018) | Neural (joint model) | PADT UD (Arabic) | Morphology + parsing | UPOS: 90.6, XPOS: 87.8, UFeats: 88.1, Lemma: 87.4, UAS: 88.9, LAS: 72.3 |
| UDify 2019 | Neural (mBERT multitask) | UD Arabic PADT (6.1K sents) | POS + features + lemma + parsing | UPOS: 96.6, UAS: 87.7, LAS: 82.9, Lemma: 73.6 |
| Gulf Morph 2020 | Hybrid (rule-based analyzers + neural disambiguation) | Gumar annotated corpus (202K), embeddings 100M | Gulf morphology (POS, SEG, LEX) | POS: 96.7%, SEG: 97.3%, LEX: 93.1%, Full: 89.2% |