By Jian-Yun Nie, Graeme Hirst
Search for information is no longer limited to the user's native language, but is increasingly extended to other languages. This gives rise to the problem of cross-language information retrieval (CLIR), whose goal is to find relevant information written in a language different from that of the query. In addition to the problems of monolingual information retrieval (IR), translation is the key problem in CLIR: one must translate either the query or the documents from one language to another. However, this translation problem is not identical to full-text machine translation (MT): the goal is not to produce a human-readable translation, but a translation suitable for finding relevant documents. Specific translation methods are thus required. The goal of this book is to provide a comprehensive description of the specific problems arising in CLIR, the solutions proposed in this area, and the remaining problems. The book starts with a general description of the monolingual IR and CLIR problems. Different classes of approaches to translation are then presented: approaches using an MT system, dictionary-based translation, and approaches based on parallel and comparable corpora. In addition, the typical retrieval effectiveness of the different approaches is compared. It is shown that translation approaches specifically designed for CLIR can rival and outperform high-quality MT systems. Finally, the book offers a look into the future that draws a strong parallel between query expansion in monolingual IR and query translation in CLIR, suggesting that many approaches developed in monolingual IR can be adapted to CLIR. The book can be used as an introduction to CLIR. Advanced readers can also find more technical details and discussions of the remaining research challenges. It is suitable for new researchers who intend to carry out research on CLIR.
Similar AI & machine learning books
This volume provides comprehensive, self-consistent coverage of one approach to computer vision, with many direct or implied links to human vision. The book is the result of many years of research into the limits of human visual performance and the interactions between the observer and his environment.
This book focuses on the practical issues and approaches to handling longitudinal and multilevel data. All data sets and the corresponding command files are available via the Web. The working examples are available in the four major SEM packages (LISREL, EQS, MX, and AMOS) and in the multilevel packages (HLM and MLn).
It is becoming crucial to accurately estimate and monitor speech quality in various ambient environments in order to guarantee high-quality speech communication. This practical, hands-on book presents speech intelligibility measurement methods so that readers can start measuring or estimating the speech intelligibility of their own systems.
Research in Natural Language Processing (NLP) has advanced rapidly in recent years, resulting in exciting algorithms for sophisticated processing of text and speech in various languages. Much of this work focuses on English; in this book we address another group of interesting and challenging languages for NLP research: the Semitic languages.
Additional info for Cross-language Information Retrieval (Synthesis Lectures on Human Language Technologies)
Although there are often hybrid systems, we can generally classify MT systems into two categories: traditional rule-based MT and statistical MT (SMT). Systran is a typical rule-based MT system. The MT systems of Google and Language Weaver are statistical systems. A rule-based MT system translates using rules and resources constructed manually. Rules and resources can be of different types: lexical, phrasal, syntactic, semantic, and so on. For example, translations stored in a bilingual dictionary provide basic resources for lexical translation.
This is the case of multimedia information retrieval, in which the multimedia information can also be described or annotated in a different language. When a user is searching for an image (e.g., a picture of a moon eclipse), it is not important whether the image is described or annotated in the user’s native language. What is important is the image itself. Multimedia IR may or may not be related to MLIR and CLIR, depending on the technique used to identify the appropriate images. The case that is related to MLIR and CLIR is when we use the textual description of the image to determine whether the image is relevant, and the textual description is in a language different from that of the query (also a textual description).
Although translation in CLIR shares many of the problems of general translation, it also has its own problems, and can be dealt with in a different way. In the CLIR literature, in addition to full-text machine translation, the following two approaches are also widely proposed and tested:
• Dictionary-based translation: this approach tries to identify and select the possible translations of each source word from a bilingual dictionary. The translation words together form a representation of the query in the target language.
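The dictionary-based idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the book: the toy English-to-French dictionary `BILINGUAL_DICT`, the function name `translate_query`, the choice to keep every translation of a word instead of selecting among them, the uniform weighting of translations, and the fallback of keeping unknown words unchanged.

```python
from collections import Counter

# Toy English-to-French dictionary (hypothetical entries, for illustration only).
BILINGUAL_DICT = {
    "bank": ["banque", "rive"],
    "account": ["compte"],
    "river": ["fleuve"],
}


def translate_query(query, dictionary):
    """Build a weighted bag of target-language words for a source query.

    Each source word contributes a total weight of 1.0, split evenly among
    its dictionary translations (an assumed, deliberately naive weighting).
    """
    weights = Counter()
    for word in query.lower().split():
        translations = dictionary.get(word)
        if not translations:
            # Assumption: a word missing from the dictionary is kept as-is,
            # which can help with proper nouns spelled alike in both languages.
            weights[word] += 1.0
            continue
        share = 1.0 / len(translations)
        for target_word in translations:
            weights[target_word] += share
    return dict(weights)


if __name__ == "__main__":
    print(translate_query("bank account", BILINGUAL_DICT))
```

The result is a bag of weighted target-language words, not a readable sentence, which matches the point that the translation only needs to be suitable for retrieval. An ambiguous word such as "bank" spreads its weight over all of its translations in this sketch; a real system would need some way to select among them.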