Information Extraction as a core language technology: What is IE?
- of Lecture Notes in Computer Science, chapter In M-T. Pazienza (ed.), Information Extraction, 1997
Abstract
Cited by 17 (0 self)
this paper, between traditional IR and the newer IE, is not totally clear everywhere but can itself become a question of degree. Suppose parsing systems that produce syntactic and logical representations were so good, as some now believe, that they could process huge corpora in an acceptably short time. One can then think of the traditional task of computer question answering in two quite different ways. The old way was to translate a question into a formalised language like SQL and use it to retrieve information from a database, as in "Tell me all the IBM executives over 40 earning under $50K a year". But with a full parser of large corpora one could now imagine transforming the query to form an IE template and searching the WHOLE TEXT (not a database) for all examples of such employees. Both methods should produce exactly the same result starting from different information sources: a text versus a formalised database. What we have called an IE template can now be seen as a kind of frozen query that one can reuse many times on a corpus, and it is therefore only important when one wants stereotypical, repetitive information back rather than the answer to one-off questions. "Tell me the height of Everest?", as a question addressed to a formalised text corpus, is then neither IR nor IE but a perfectly reasonable single request for an answer. "Tell me about fungi", addressed to a text corpus with an IR system, will produce a set of relevant documents but no particular answer. "Tell me what films my favourite movie critic likes", addressed to the right text corpus, is undoubtedly IE as we saw, and will produce an answer also. The needs and the resources available determine the techniques that are relevant, and those in turn determine what it is to answer a questio...
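The contrast the abstract draws between a formal database query and an IE template acting as a "frozen query" over text can be illustrated with a toy sketch. Everything here is an assumption made for illustration and does not come from the paper: the names, ages, and salaries are invented, the `executives` table and the `TEMPLATE` pattern are hypothetical, and a regular expression stands in very crudely for the full parser the abstract imagines.

```python
import re
import sqlite3

# Hypothetical toy facts, stored two ways: as database rows and as running text.
ROWS = [
    ("Ann Lee", "IBM", 45, 48000),
    ("Bob Ray", "IBM", 38, 45000),
    ("Cy Dorn", "IBM", 52, 90000),
    ("Di Fox", "Acme", 47, 40000),
]
TEXT = " ".join(
    f"{name}, {age}, is an executive at {company} earning ${salary} a year."
    for name, company, age, salary in ROWS
)


def query_database():
    """The 'old way': a formalised query (SQL) run against a database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE executives "
        "(name TEXT, company TEXT, age INTEGER, salary INTEGER)"
    )
    conn.executemany("INSERT INTO executives VALUES (?, ?, ?, ?)", ROWS)
    cursor = conn.execute(
        "SELECT name FROM executives "
        "WHERE company = 'IBM' AND age > 40 AND salary < 50000 "
        "ORDER BY name"
    )
    names = [row[0] for row in cursor]
    conn.close()
    return names


# A crude stand-in for an IE template: named slots filled from the text itself.
TEMPLATE = re.compile(
    r"(?P<name>[A-Z][a-z]+ [A-Z][a-z]+), (?P<age>\d+), is an executive at "
    r"(?P<company>[A-Za-z]+) earning \$(?P<salary>\d+) a year\."
)


def query_text(text):
    """The 'frozen query': fill the template from text, then apply the same test."""
    names = []
    for match in TEMPLATE.finditer(text):
        if (
            match["company"] == "IBM"
            and int(match["age"]) > 40
            and int(match["salary"]) < 50000
        ):
            names.append(match["name"])
    return sorted(names)


if __name__ == "__main__":
    print(query_database())
    print(query_text(TEXT))
```

On this toy data the two routes return the same list, which is the point of the comparison: the template can be reused unchanged on any further text written in the same stereotyped form, whereas a one-off question would not justify building it.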

