Link Contexts in Classifier-Guided Topical Crawlers
Cited by 6 (0 self)
Abstract: The context of a hyperlink, or link context, is defined as the terms that appear in the text around a hyperlink within a Web page. Link contexts have been applied to a variety of Web information retrieval and categorization tasks. Topical or focused Web crawlers have a special reliance on link contexts. These crawlers automatically navigate the hyperlinked structure of the Web while using link contexts to predict the benefit of following the corresponding hyperlinks with respect to some initiating topic or theme. Using topical crawlers that are guided by a Support Vector Machine, we investigate the effects of various definitions of link contexts on the crawling performance. We find that a crawler that exploits words both in the immediate vicinity of a hyperlink and in the entire parent page performs significantly better than a crawler that depends on just one of those cues. Also, we find that a crawler that uses the tag tree hierarchy within Web pages provides effective coverage. We analyze our results along various dimensions such as link context quality, topic difficulty, length of crawl, training data, and topic domain. The study was done using multiple crawls over 100 topics covering millions of pages, allowing us to derive statistically strong results. Index Terms: Web search, Web mining, performance evaluation.
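To make the two cues named in this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a fixed window of words on each side of the anchor text as the "immediate vicinity" and the full token list as the "parent page". The names `LinkContextParser`, `link_contexts`, and the `window` parameter are hypothetical, and the tag-tree variant mentioned in the abstract is not covered.

```python
from html.parser import HTMLParser


class LinkContextParser(HTMLParser):
    """Collects page tokens and records where each hyperlink's anchor text sits."""

    def __init__(self):
        super().__init__()
        self.tokens = []   # all words of the page, in document order
        self.links = []    # (href, start_index, end_index) per hyperlink
        self._open = None  # (href, start_index) of the <a> currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._open = (href, len(self.tokens))

    def handle_endtag(self, tag):
        if tag == "a" and self._open is not None:
            href, start = self._open
            self.links.append((href, start, len(self.tokens)))
            self._open = None

    def handle_data(self, data):
        # Naive whitespace tokenization; a real crawler would also strip
        # punctuation and stop words (assumption, not from the abstract).
        self.tokens.extend(data.lower().split())


def link_contexts(html, window=3):
    """Return, per href, the words near the link and the whole parent page."""
    parser = LinkContextParser()
    parser.feed(html)
    parser.close()
    contexts = {}
    for href, start, end in parser.links:
        lo = max(0, start - window)
        hi = min(len(parser.tokens), end + window)
        contexts[href] = {
            "window": parser.tokens[lo:hi],   # anchor text plus neighbours
            "parent_page": parser.tokens,     # entire page as a second cue
        }
    return contexts
```

Both token lists could then be turned into feature vectors for a classifier that scores each link; how the two cues are combined and weighted is left open here.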
Visual Foraging of Highlighted Text: An Eye-Tracking Study
- Proc. HCI International Conference, 2007
Cited by 5 (2 self)
The wide availability of digital reading material online is causing a major shift in everyday reading activities. Readers are increasingly skimming instead of reading in depth. Highlights are increasingly used in digital interfaces to direct attention toward relevant passages within texts. In this paper, we study the eye-tracking behavior of subjects using both keyword highlighting and a new highlighting technique called ScentHighlights, introduced recently [7]. In this first eye-tracking study of highlighting interfaces, we show that there is direct evidence of the von Restorff isolation effect [21] in the eye-tracking data, in that subjects performed better when a fact was isolated (highlighted) against a homogeneous background. Users in the ScentHighlights condition paid more attention to highlighted areas and were more accurate than with other interfaces. In addition to confirming the von Restorff effect, we found great variation in reading strategies among subjects, even in the presence of strong cues such as highlights. Some readers scan for highly profitable regions first, while others read sequentially despite the presence of strong highlight cues. The results point to future design possibilities in highlighting interfaces. Author Keywords: Automatic text highlighting, dynamic summarization, contextualization, personalized information access, eBooks, Information Scent.
Taking the Initiative with Extempore: Exploring Out-of-Turn Interactions with Websites
- Computing Research Repository (CoRR), 2003; Proceedings of the 2004 Joint ACM/IEEE Conference on Digital Libraries (JCDL’04)
Cited by 5 (4 self)
We present the first study to explore the use of out-of-turn interaction in websites. Out-of-turn interaction is a technique which empowers the user to supply unsolicited information while browsing. This approach helps flexibly bridge any mental mismatch between the user and the website, in a manner fundamentally different from faceted browsing and site-specific search tools. We built a user interface (Extempore) which accepts out-of-turn input via voice or text, and employed it in a US congressional website to determine whether users utilize out-of-turn interaction for information-finding tasks, and their rationale for doing so. The results indicate that users are adept at discerning when out-of-turn interaction is necessary in a particular task, and actively interleave it with browsing. However, users found cascading information across information-finding subtasks challenging. Therefore, this work not only improves our understanding of out-of-turn interaction, but also suggests further opportunities to enrich browsing experiences for users.
Supporting Intelligent Web Search
- In ACM Transactions on Internet Technology Special Issue on Intelligent Techniques for Web Personalization, 2007
Cited by 5 (2 self)
Search engines continue to struggle to provide everyday users with a service capable of delivering focussed results that are relevant to their information needs. Moreover, traditional search engines really only provide users with a starting point for their information search. That is, upon selecting a page from a search result list, the interaction between user and search engine is effectively over and the user must continue their search alone. In this paper, we argue that a comprehensive search service needs to provide the user with more help, both at the result list level and beyond, and we outline some recommendations for intelligent Web search support. We introduce the SearchGuide Web search support system and we describe how it fulfils the requirements for a search support system, providing evaluation results where applicable.
Users’ perspectives on the usefulness of structure for XML information retrieval
- In Proceedings of the 1st International Conference on the Theory of Information Retrieval, 2007
Cited by 4 (1 self)
Abstract: The widespread use of the eXtensible Markup Language (XML) on the Web and in digital libraries has led to a drastic increase in the number of XML Information Retrieval (IR) systems being developed. XML IR approaches exploit the logical structure of documents for their querying, retrieval and presentation to the user. Despite their abundance, there remains uncertainty regarding the advantages that structural information may bring to IR. In this paper we report on a user study exploring questions around the potential benefits of structure to users, such as: Is structural information useful when searching for relevant information? Can the structure of a document help to locate relevant information when browsing inside a document? Does the role of structural information depend on the length of a document? Our investigation was conducted as part of the INEX 2006 interactive track experiment, which we supplemented with questionnaires. Our qualitative analysis of the data collected from seven participants aims to identify how users will interact with XML IR systems. We do this by drawing parallels with paper-based information searching, Web searching, and digital library searching. What we find is that XML IR users are unlike Web users: they use advanced search facilities, they prefer a list of results supplemented with branch points into the document, and they need better methods of navigation within long documents.
Designing Adaptive Feedback for Improving Data Entry Accuracy
Cited by 4 (3 self)
Data quality is critical for many information-intensive applications. One of the best opportunities to improve data quality is during entry. USHER provides a theoretical, data-driven foundation for improving data quality during entry. Based on prior data, USHER learns a probabilistic model of the dependencies between form questions and values. Using this information, USHER maximizes information gain. By asking the most unpredictable questions first, USHER is better able to predict answers for the remaining questions. In this paper, we use USHER’s predictive ability to design a number of intelligent user interface adaptations that improve data entry accuracy and efficiency. Based on an underlying cognitive model of data entry, we apply these modifications before, during and after committing an answer. We evaluated these mechanisms with professional data entry clerks working with real patient data from six clinics in rural Uganda. The results show that our adaptations have the potential to reduce error (by up to 78%), with limited effect on entry time (varying between -14% and +6%). We believe this approach has wide applicability for improving the quality and availability of data, which is increasingly important for decision-making and resource allocation. ACM Classification: H5.2 [Information interfaces and presentation]:
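The idea of "asking the most unpredictable questions first" can be illustrated with a deliberately simplified sketch. It is an assumption-laden stand-in, not USHER itself: it ranks questions by the Shannon entropy of their answers in prior data and ignores the dependencies between questions that the abstract says USHER models. The function names and the form fields used below are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def order_questions(prior_records):
    """Order form questions from least to most predictable.

    prior_records is a non-empty list of dicts sharing the same keys,
    one dict per previously entered form (assumed data layout).
    """
    questions = prior_records[0].keys()
    scores = {q: entropy([r[q] for r in prior_records]) for q in questions}
    # Highest entropy first: these answers are hardest to guess in advance.
    return sorted(scores, key=scores.get, reverse=True)
```

For example, with four prior records in which `village` takes four distinct values, `sex` is split evenly, and `country` is constant, the ordering is `village`, `sex`, `country`. A closer approximation to the abstract would re-rank the remaining questions by conditional entropy after each answer is committed.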
ZEUS – Zoomable Explorative User Interface for Searching and Object Presentation
Cited by 3 (2 self)
Abstract. In this paper we describe a first version of ZEUS, a web application that combines browsing, searching and object presentation. With the zooming- and panning-based navigation concept of ZEUS and a hierarchical organization of the information space, we try to solve the problem of information overload. It remains to be evaluated whether categorization, zooming and full-text search can reduce the risk of the user getting lost in hyperspace. The concept of ZEUS is based on several theses about human cognition, navigation and exploration, which we hope to confirm through evaluation and user testing of our application in the future.
Extracting named entities and relating them over time based on wikipedia
- Informatica
Cited by 3 (0 self)
This paper presents an approach to mining information relating people, places, organizations and events extracted from Wikipedia and linking them on a time scale. The approach consists of two phases: (1) identifying relevant pages: categorizing the articles as containing people, places or organizations; (2) generating a timeline: linking named entities and extracting events and their time frames. We illustrate the proposed approach on 1.7 million Wikipedia articles. Summary (translated from Slovenian): Methods are presented for mining information from Wikipedia and arranging it into a temporal structure.
Using Information Scent to Model the Dynamic Foraging Behavior of Programmers in Maintenance Tasks
Cited by 3 (0 self)
In recent years, the software engineering community has begun to study program navigation and tools to support it. Some of these navigation tools are very useful, but they lack a theoretical basis that could reduce the need for ad hoc tool building approaches by explaining what is fundamentally necessary in such tools. In this paper, we present PFIS (Programmer Flow by Information Scent), a model and algorithm of programmer navigation during software maintenance. We also describe an experimental study of expert programmers debugging real bugs described in real bug reports for a real Java application. We found that PFIS’ performance was close to aggregated human decisions as to where to navigate, and was significantly better than individual programmers’ decisions. Author Keywords: Information foraging, debugging, software maintenance
ScentIndex: Conceptually Reorganizing Subject Indexes for eBooks (submitted for publication), 2004
Cited by 3 (2 self)
A great deal of analytical work is done in the context of reading: in digesting the semantics of the material, identifying important entities, and capturing the relationships between entities. Visual analytic environments, therefore, must encompass reading tools that enable the rapid digestion of large amounts of reading material. Beyond plain text search, subject indexes, and basic highlighting, tools are needed for rapid foraging of text. In this paper, we describe a technique that presents an enhanced subject index for a book by conceptually reorganizing it to suit particular expressed user information needs. Users first enter information needs via keywords describing the concepts they are trying to retrieve and comprehend. Then our system, called ScentIndex, computes which index entries are conceptually related, and reorganizes and displays these index entries on a single page. We also provide a number of navigational cues to help users peruse this list of index entries and find relevant passages quickly. Compared to regular reading of a paper book, our study showed that users are more efficient and more accurate in finding, comparing, and comprehending material in our system.

