Publication year: 1386 (Solar Hijri)

Published in: 15th Iranian Conference on Electrical Engineering

Number of pages: 8

Author(s):

Sadra Abedinzadeh – Faculty of Electrical and Computer Engineering, School of Engineering, University of Tehran, Tehran, Iran
Fattaneh Taghiyareh – Faculty of Electrical and Computer Engineering, School of Engineering, University of Tehran, Tehran, Iran
Farhad Oroumchian – Faculty of Information Technology, University of Wollongong (Dubai campus), Dubai, UAE; Control and Intelligent Processing Center of Excellence, University of Tehran, Iran

Abstract:

The development of Language Engineering (LE) and Information Retrieval (IR) applications requires a sizeable, reliable, and representative collection of documents. Moreover, Cross-Language Information Retrieval (CLIR) systems have recently come into wide use owing to the explosion of non-English documents. However, the lack of such a collection for CLIR involving Persian retrieval is a major drawback for research in this field. This paper describes a 90 MB Persian-English collection of 7073 documents generated from the Wikipedia website, an open encyclopedia, and represented in XML format. We also use the RSLP collection description schema to describe our collection.