Finding Variants of Out-of-Vocabulary Words in Arabic
Transliteration of a word into another language often leads to multiple spellings. Unless an information retrieval system recognises different forms of transliterated words, a significant number of documents will be missed when users specify only one spelling variant. Using two different datasets, we evaluate several approaches to finding variants of foreign words in Arabic, and show that the longest common subsequence (LCS) technique is the best overall.
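As a rough illustration of the LCS idea mentioned in this abstract, the sketch below scores candidate spellings by the length of their longest common subsequence with the query. The normalisation by the longer word and the function names (`lcs_length`, `lcs_similarity`, `rank_variants`) are assumptions made for this example, not details taken from the paper. Latin-script strings are used for readability; Python strings are Unicode, so Arabic-script words work the same way.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """LCS length divided by the longer word's length (assumed normalisation)."""
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


def rank_variants(query: str, candidates: list[str]) -> list[str]:
    """Order candidate spellings from most to least similar to the query."""
    return sorted(candidates, key=lambda w: lcs_similarity(query, w), reverse=True)
```

A retrieval system could, for instance, expand a query with every vocabulary word whose similarity exceeds a threshold; the threshold value would have to be tuned on data.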
Soundex-based Translation Correction in Urdu–English Cross-Language Information Retrieval
Cross-language information retrieval is difficult for languages with few processing tools or resources, such as Urdu. An easy way of translating content words is provided by Google Translate, but due to lexicon limitations named entities (NEs) are transliterated letter by letter. The resulting NE errors (zynydyny zdn for Zinedine Zidane) hurt retrieval. We propose to replace English non-words in the translation output. First, we determine phonetically similar English words with the Soundex algorithm. Then, we choose among them by a modified Levenshtein distance that models correct transliteration patterns. This strategy yields an improvement of 4% MAP (from 41.2 to 45.1; monolingual 51.4) on the FIRE-2010 dataset.
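The two-step correction described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: it uses standard American Soundex and plain (unmodified) Levenshtein distance as a stand-in for the paper's modified distance, whose transliteration-pattern weights are not given here. The function names and the toy lexicon in the usage note are hypothetical.

```python
# Standard American Soundex digit classes.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Four-character Soundex code; returns "" if word has no ASCII letters."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels and y have no code
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance with unit costs (not the paper's modified variant)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def correct_nonword(nonword: str, lexicon: list[str]):
    """Pick the lexicon word with the same Soundex code and the smallest
    edit distance to the non-word; return None if no code matches."""
    target = soundex(nonword)
    candidates = [w for w in lexicon if soundex(w) == target]
    if not candidates:
        return None
    return min(candidates, key=lambda w: levenshtein(nonword.lower(), w.lower()))
```

With a toy lexicon such as `["zinedine", "zidane", "london"]`, `correct_nonword("zynydyny", ...)` selects `"zinedine"` and `correct_nonword("zdn", ...)` selects `"zidane"`, because each non-word shares its Soundex code with exactly one entry. When several entries share a code, the edit distance breaks the tie.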
Khalid S. R. Aloufi Diacritic Oriented Arabic Information Retrieval System
Arabic language support in search engines and operating systems has improved in recent years. Searching the Internet is reliable and comparable to the excellent support for several other languages, including English. However, there are some limitations for text with diacritics. For this reason, most information retrieval (IR) systems remove diacritics from text and ignore them because of their complexity. Searching text with diacritics is important for some kinds of documents, such as religious books, some newspapers and children's stories. This research presents the design and development of a system that overcomes this problem. The proposed system takes diacritics into account. It places the design complexity in the retrieval algorithm rather than in the information repository, which is a database in this study. The study also analyses the results and the performance. Results are promising, and the performance analysis points to methods for enhancing the design and increasing performance. The proposed system can be integrated into search engines, text editors and any information retrieval system that includes Arabic text. Performance analysis shows that the system is reliable. The proposed system is applied to a database of Hadeeth, a religious book that records the Prophet's actions and statements. The system can be applied to any kind of data repository.
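One way to keep diacritic handling in the retrieval step rather than in the stored data is sketched below. This is only an assumed matching rule for illustration, not the algorithm from the paper: base letters must agree, and wherever the query carries diacritics, the stored word must carry the same ones. Diacritics are detected as Unicode nonspacing marks (category `Mn`), which covers the Arabic harakat; all function names are hypothetical.

```python
import unicodedata


def split_diacritics(word: str):
    """Split a word into (base_letter, set_of_marks) pairs."""
    units = []
    for ch in word:
        if unicodedata.category(ch) == "Mn" and units:
            base, marks = units[-1]
            units[-1] = (base, marks | {ch})  # attach mark to preceding letter
        else:
            units.append((ch, frozenset()))
    return units


def strip_diacritics(word: str) -> str:
    """Return the word with all attached nonspacing marks removed."""
    return "".join(base for base, _ in split_diacritics(word))


def matches(query: str, stored: str) -> bool:
    """True if the base letters agree and every diacritic the query
    specifies is matched exactly on the stored word (assumed rule)."""
    q_units = split_diacritics(query)
    s_units = split_diacritics(stored)
    if len(q_units) != len(s_units):
        return False
    for (q_base, q_marks), (s_base, s_marks) in zip(q_units, s_units):
        if q_base != s_base:
            return False
        if q_marks and q_marks != s_marks:
            return False
    return True


# Example words written with escapes: kaf, ta, ba with fatha / damma marks.
BARE = "\u0643\u062a\u0628"                      # k-t-b, no diacritics
KATABA = "\u0643\u064e\u062a\u064e\u0628\u064e"  # fully vocalised with fatha
KUTUB = "\u0643\u064f\u062a\u064f\u0628"         # vocalised with damma
```

Under this rule an undiacritised query such as `BARE` matches both stored forms, while a diacritised query such as `KATABA` matches only the stored word with the same marks, so the stored text never needs to be altered.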
A New Intelligent Methodology for Computer based Assessment of Short Answer Question based on a new Enhanced Soundex phonetic Algorithm for Arabic Language
Today, most e-tests created using commercial e-test generation tools or Learning Management Systems (LMSs) such as Moodle do not provide a methodology for perfect assessment of short answer questions. Unfortunately, all of them provide a binary assessment that can be 1 (completely true) or 0 (completely false), even if the answer is partially true or partially false. In this paper, the author presents a new intelligent methodology, and its implementation, for computer-based assessment of students' short answers in e-tests in English or Arabic. The methodology applies the Soundex phonetic algorithm to the words of the answer, in English or Arabic, to enable a computer-based intelligent marking method. A student who responds with the correctly spelled word receives the full points for the question, while a student who responds with a word that sounds correct but is misspelled may receive points less than or equal to the total, depending on the subject and the instructor's judgement. This intelligent marking method can be used for subjects that do not require correctly spelled answers, such as Science, Humanities and other subjects, as opposed to "languages" subjects. The paper also presents a new enhanced Soundex algorithm for Arabic that achieved lower error rates than existing algorithms, as shown in the experimental results.

