NLP for IR - Natural Language Processing for Information Retrieval (2004)
Abstract:
- ...s are reduced forms of text
  -- NLP may not be useful in extracting good features since it has already been done
- 1990s saw arrival of TREC workshops
  -- Large collections (gigabytes)
  -- Full-text documents
  -- Queries and relevance judgments
- Now it should be possible for NLP to help, right?

Copyright 2000 by James Allan, CIIR

Sampled NLP at TREC...
- UMass
  -- Processing of query to remove "stop phrases", find phrases
     . Phrases handled by proximity operators in query
  -- Analyzing documents to find companies, countries, etc.
- CMU/Claritech
  -- Identify noun phrases, verb phrases, prepositional phrases, and sub-phrases
  -- Find related thesaurus terms
- NYU/GE
  -- Document analysis for head/modifier normalization
     . "info retrieval" and "retrieval of info" and "info that is retrieved" all converted to "info+retrieval"
  -- Part of speech tagging, noun phrases, name identification

TREC's NLP track...
- Run primarily in TREC-5 (1996)...

