Rayed's Real life

Arabic Projects Ideas 3: Open Arabic Stemmer

By Rayed

May 23, 2009

Nowadays there are many great options for building your own search engine, whether for your website or for your own custom applications. To mention a few:

  • Lucene
  • Solr (uses Lucene)
  • Sphinx

All of these search engines work well for English, and they have decent support for Arabic thanks to Unicode and UTF-8. Unfortunately, they still lack for Arabic the stemming power they offer for English.

Stemming is the process for reducing inflected (or sometimes derived) words to their stem, base or root form – generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root.

http://en.wikipedia.org/wiki/Stemming
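To make the definition concrete, here is a minimal sketch of a "light" stemmer in Python that only strips a few common Arabic prefixes and suffixes. The affix lists, the minimum stem length, and the function name `light_stem` are all assumptions chosen for illustration; this is not a vetted linguistic resource or the design of any existing engine. A real stemmer would also need to handle diacritics, letter normalization, and broken plurals.

```python
# Illustrative affix lists (assumptions, not a complete linguistic resource).
# Longer prefixes come first so that "وال" is tried before "ال" or "و".
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي")

# Hypothetical threshold: never shrink a word below this many letters.
MIN_STEM_LEN = 3


def light_stem(word: str) -> str:
    """Strip at most one known prefix and one known suffix from a word."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break
    return word


if __name__ == "__main__":
    # Different surface forms of "book" and "teacher".
    for w in ["كتاب", "الكتاب", "والكتاب", "كتابها", "المعلمون", "معلمين"]:
        print(w, "->", light_stem(w))
```

With these lists, "الكتاب", "والكتاب" and "كتابها" all reduce to "كتاب", so a search for one form can match the others. The length check keeps a short word such as "ولد" intact even though it starts with "و".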

The idea is to build an open stemming engine for the Arabic language and port it to different programming languages (C, C++, Java, C#, etc.) so it can be used easily in all Arabic information retrieval systems.

If you have any ideas to improve on this initial idea, please comment.

  • Arabic
  • Engine
  • Lucene
  • Search
  • Solr
  • Sphinx
