4th Workshop on Cross Lingual Information Access; Filtering News for Epidemic Surveillance: Towards Processing More Languages with Fewer Resources; The Noisier the Better: Identifying Multilingual Word Translations Using a Single Monolingual Corpus; Multi-word Expression-sensitive Word Alignment; Co-occ
Min Zhang,Sudeshna Sarkar,Raghavendra Udupa,Adam Lopez,Eneko Agirre,Nicola Ferro,Ting Liu,Paul Mcnamee,Yao Meng,Carol Peters,Ralf Steinberger,Vasudeva Varma,Iiit Hyderabad,Haifeng Wang,Pushpak Bhattacharya,Tsetsuya Sakai,Microsoft Research,Asia Vi,Gael Lejeune,Antoine Doucet,Roman Yangarber,Nadine Lucas,Reinhard Rapp,Michael Zock,Andrew Trotman,Yue,Tsuyoshi Okita,Alfredo Maldonado Guerra,Yvette Graham,Ling-Xiang Tang,Shlomo Geva,Yue Xu,Achille Falaise,David Rouquet,Didier Schwab,Hervé Blanchon,Christian Boitet,Marina Litvak,Mark Last,Slava Kisilevich,Daniel Keim,Hagay Lipman,Assaf Ben,Gur,Vishal Vachhani,Manoj Chinnakotla,Mitesh Khapra,Andy Way,Diptesh Chatterjee,Arpit Mishra,Assaf Ben Gur,Tetsuya Sakai,Gaël Lejeune,Oren Etzioni,Michele Banko,Stephen Soder,Daniel S,Ralph Grishman,Silja Huttunen,Roman,Linge,R Jp,T P Steinberger,R Weber,E Yan-Garber,D H Van Der Goot,Al,Khudhairy,Poibeau,Horacio Thierry,Roman Saggion,Yangarber,Lawrence Erlbaum Associates,Hillsdale N J,T Pattabhi,R K Rao,Sobha Lalitha,Devi
2010-01-01
Abstract: Introduction. Welcome to the Coling Workshop on Cross Lingual Information Access. Cross-lingual information access (CLIA) is concerned with technologies and applications that enable people to freely access information expressed in any language, which may differ from the query language. As the web has grown to include rich content in many different languages, and with rapid globalization, there is a growing demand for CLIA. Ordinary netizens who surf the Internet for special information and communicate in social networks, global companies that provide multilingual services to their multinational customers, and governments that aim to lower the barriers to international commerce, collaboration, and homeland security are all in need of CLIA. This has triggered vigorous research and development activity in CLIA. This workshop is the fourth in a series of workshops and aims to address the need for CLIA. The previous three workshops were held […]. In this workshop, in addition to Cross-lingual Information Retrieval (CLIR), the focus is on multilingual information extraction, information integration, summarization, and other key technologies that are useful for CLIA. The workshop aims to bring together researchers from a variety of fields, such as information retrieval, computational linguistics, and machine translation, and practitioners from government and industry to address the information needs of multilingual societies. This workshop also aims to highlight the contributions of Natural Language Processing (NLP) and Computational Linguistics to CLIA, in addition to the previously better represented viewpoint from Information Retrieval. The workshop received a total of fourteen submissions, of which the proceedings include ten papers covering various aspects of this field. There are two papers on corpus acquisition. The papers by Pattabhi et al. and Lejeune et al. focus on acquiring multilingual documents on various topics.
There are three papers on bilingual lexicon acquisition. The papers by Okita et al. and Chatterjee et al. propose methods for word alignment and lexicon extraction from parallel and comparable corpora, while the paper by Rapp et al. proposes to learn dictionaries from monolingual corpora containing foreign words. Tang et al. perform named entity translation for cross-language question answering applications by combining a number of different sources, namely machine translation, online encyclopedia, and web documents. Falaise et al. use a light ontology to extract content from multilingual texts and user requests associated with images. Litvak et al. explore the performance of summarization methods across two languages. The paper by Vachhani et al. presents studies …