Arabic Projects Ideas 3: Open Arabic Stemmer
By Rayed
Nowadays we have many great options for building your own search engine, either for your website or for your own custom applications. To mention a few:
All of these search engine options work great for the English language, and they have decent support for the Arabic language thanks to Unicode and UTF-8, but unfortunately they still lack the power of stemming that you get with English.
Stemming is the process for reducing inflected (or sometimes derived) words to their stem, base or root form – generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root.
http://en.wikipedia.org/wiki/Stemming
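To make the definition above more concrete, here is a minimal sketch of a "light" stemmer in Python that simply strips one common prefix and one common suffix. The affix lists, the `MIN_STEM_LENGTH` cutoff, and the `light_stem` name are my own illustrative assumptions, not a standard or a finished algorithm; a real Arabic stemmer would need a much fuller, linguistically reviewed rule set.

```python
# Illustrative affix lists only (assumptions for this sketch, not a standard).
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ات", "ون", "ين", "ان", "ها", "ية", "ة", "ي", "ه"]

# Hypothetical cutoff: never strip an affix if fewer than 3 letters would remain.
MIN_STEM_LENGTH = 3


def light_stem(word: str) -> str:
    """Strip at most one prefix and at most one suffix from an Arabic word."""
    # Try longer prefixes first so that "وال" is preferred over "و".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LENGTH:
            word = word[len(prefix):]
            break

    # Same idea for suffixes: longest match first, one removal only.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            word = word[:-len(suffix)]
            break

    return word


if __name__ == "__main__":
    for w in ["الكتاب", "والكتاب", "كتابات", "معلمون", "معلمين", "ولد"]:
        print(w, "->", light_stem(w))
```

With these lists, "الكتاب", "والكتاب" and "كتابات" all map to the same stem "كتاب", so a search for one form can match the others, which is the whole point of stemming. Note that the stem is not the morphological root (ك ت ب); as the definition says, it only needs to be the same for related words. The length check is there so that a short word such as "ولد" does not lose its first letter just because it happens to start with "و".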
The idea is to build an open stemming engine for the Arabic language, and port it to different programming languages (C, C++, Java, C#, etc.) so it can be easily used in all Arabic information retrieval systems.
If you have any ideas to improve the initial idea, please comment.