A Fast Similarity Join Algorithm Using Graphics Processing Units
Cited by 11 (0 self)
A similarity join operation A ⋈_ε B takes two sets of points A, B and a value ε ∈ ℝ, and outputs pairs of points p ∈ A, q ∈ B such that the distance D(p, q) ≤ ε. Similarity joins find use in a variety of fields, such as clustering, text mining, and multimedia databases. A novel similarity join algorithm called LSS is presented that executes on a Graphics Processing Unit (GPU), exploiting its parallelism and high data throughput. As GPUs only allow simple data operations such as the sorting and searching of arrays, LSS uses these two operations to cast a similarity join operation as a GPU sort-and-search problem. It first creates, on the fly, a set of space-filling curves on one of its input datasets, using a parallel GPU sort routine. Next, LSS processes each point p of the other dataset in parallel. For each p, it searches an interval of one of the space-filling curves guaranteed to contain all the pairs in which p participates. Using extensive theoretical and experimental analysis, LSS is shown to offer a good balance between time and work efficiency. Experimental results demonstrate that LSS is suitable for similarity joins in large high-dimensional datasets, and that it performs well when compared against two existing prominent similarity join methods.
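The sort-and-search pattern described in this abstract can be illustrated in one dimension, where sorting alone guarantees that all join partners of a point lie in one contiguous interval. The Python sketch below is not LSS itself: it uses no space-filling curves and no GPU, and the function name `similarity_join_1d` and the sample data are assumptions made for illustration only.

```python
from bisect import bisect_left, bisect_right


def similarity_join_1d(a, b, eps):
    """Return all pairs (p, q) with p in a, q in b and |p - q| <= eps.

    Illustrative 1-D stand-in for the sort-and-search pattern; this is
    not the LSS algorithm from the paper.
    """
    sorted_b = sorted(b)  # "sort" phase: done once for one input set
    pairs = []
    for p in a:  # each p is handled independently of the others
        lo = bisect_left(sorted_b, p - eps)   # index of first q >= p - eps
        hi = bisect_right(sorted_b, p + eps)  # one past the last q <= p + eps
        pairs.extend((p, q) for q in sorted_b[lo:hi])
    return pairs


if __name__ == "__main__":
    # Made-up sample data.
    print(similarity_join_1d([1, 5], [0, 2, 3, 6, 9], eps=1))
```

Because each point of the first set is searched independently, the loop body is the part that a parallel implementation would distribute across threads.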
Learning while holding a conversation with a computer
- In L. PytlikZillig, M. Bodvarsson, & R. Bruning (Eds.), Technology-based, 2005
Cited by 9 (1 self)
Some recent electronic learning environments have moved beyond the conventional delivery of text, multimedia, and objective tests. There are systems with animated conversational agents, intelligent adaptive tutoring, interactive simulations, and other features designed to engage learners and promote deeper comprehension. One such system is AutoTutor, a learning environment that tutors students by holding a conversation in natural language. AutoTutor’s design was inspired by explanation-based constructivist theories of learning, intelligent tutoring systems that adaptively respond to student knowledge, and empirical research on dialogue patterns in tutorial discourse. AutoTutor presents challenging questions and then engages in mixed-initiative dialogue that guides the student in building an answer. It gives positive, neutral, or negative feedback on what the student types in, pumps the student for more information, prompts the student to fill in missing words, gives hints, fills in missing information, identifies and
Frequency of basic English grammatical structures: A corpus analysis
- Journal of Memory and Language, 2007
Cited by 9 (1 self)
Many recent models of language comprehension have stressed the role of distributional frequencies in determining the relative accessibility or ease of processing associated with a particular lexical item or sentence structure. However, there exist relatively few comprehensive analyses of structural frequencies, and little consideration has been given to the appropriateness of using any particular set of corpus frequencies in modeling human language. We provide a comprehensive set of structural frequencies for a variety of written and spoken corpora, focusing on structures that have played a critical role in debates on normal psycholinguistics, aphasia, and child language acquisition, and compare our results with those from several recent papers to illustrate the implications and limitations of using corpus data in psycholinguistic research.
Multimodal Dialogue Systems for Interactive TV Applications
- In Proceedings of the 4th IEEE International Conference on Multimodal Interfaces, 2002
Cited by 7 (2 self)
Many studies have shown the advantages of building multimodal systems, but not in the context of interactive TV applications. This paper reports on a qualitative study of a multimodal program guide for interactive TV. The system was designed by adding speech interaction to an existing TV program guide. Study results indicate that spoken natural language input combined with visual output is preferable for TV applications. Furthermore, user feedback calls for a clear distinction in the visual output between the dialogue system's domain results and its system status. Consequently, we propose an interaction model that consists of three entities: user, domain results, and system feedback.
Bigram Analysis of Java Bytecode Sequences
- In Second Workshop on Intermediate Representation Engineering for the Java Virtual Machine, 2002
Cited by 7 (2 self)
Much research has been conducted in the analysis of Java bytecodes in order to gain a better understanding of how Java programs behave. One branch of this research has focused on analysing bytecode usage within the Java Virtual Machine (JVM), with particular emphasis on analysing bytecodes associated with various benchmark programs. Previous research has focused on the frequencies of the individual bytecodes at the static class-file level [2]. Another branch examines dynamic bytecodes, as executed by the JVM itself at run-time [4, 6]. This project follows on from previous dynamic bytecode analysis, analysing streams of Java bytecodes produced at the platform-independent level. It differs from previous projects in that it concentrates not on the occurrences of individual bytecodes but on the occurrences of bigrams, or bytecode pairs. We report on a project that performed a bigram analysis of dynamic bytecode sequences. The objective was to identify the most comm
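As a rough illustration of what counting bigrams over an executed bytecode stream involves, the Python sketch below tallies adjacent opcode pairs. It is an assumption-laden sketch rather than the project's tooling: the function name `bigram_counts` is hypothetical, and the trace is a made-up sequence of real JVM mnemonics, not measured data.

```python
from collections import Counter


def bigram_counts(opcodes):
    """Count adjacent opcode pairs in one executed bytecode stream."""
    # zip pairs each opcode with its successor, giving len(opcodes) - 1 bigrams.
    return Counter(zip(opcodes, opcodes[1:]))


if __name__ == "__main__":
    # Hypothetical trace: the mnemonics are real JVM opcodes,
    # but the sequence itself is invented for this example.
    trace = ["aload_0", "getfield", "aload_0", "getfield", "iadd", "ireturn"]
    for pair, count in bigram_counts(trace).most_common(3):
        print(pair, count)
```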
The information conveyed by words in sentences
- Journal of Psycholinguistic Research, 2003
Cited by 7 (1 self)
A method is presented for calculating the amount of information conveyed to a hearer by a speaker emitting a sentence generated by a probabilistic grammar known to both parties. The method applies the work of Grenander (1967) to the intermediate states of a top-down parser. This allows the uncertainty about structural ambiguity to be calculated at each point in a sentence. Subtracting these values at successive points gives the information conveyed by a word in a sentence. Word-by-word information conveyed is calculated for several small probabilistic grammars, and it is suggested that the number of bits conveyed per word is a determinant of reading times and other measures of cognitive load. KEY WORDS: computational psycholinguistics; entropy reduction.
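The subtraction step described in this abstract can be sketched on a toy case. The Python example below assumes a finite language whose derivations and probabilities are listed explicitly, so the entropy at each prefix can be computed by direct enumeration; it does not implement Grenander's method or a top-down parser. All names and probabilities are hypothetical.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits of a normalised distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


def prefix_entropy(derivations, prefix):
    """Entropy over the derivations whose sentence starts with `prefix`.

    `prefix` must be a tuple that begins at least one listed sentence.
    """
    consistent = [p for words, p in derivations if words[:len(prefix)] == prefix]
    total = sum(consistent)
    return entropy([p / total for p in consistent])


def information_per_word(derivations, sentence):
    """Entropy before each word minus entropy after it, in bits."""
    h = [prefix_entropy(derivations, sentence[:i]) for i in range(len(sentence) + 1)]
    # Raw differences; in general a difference can be negative.
    return [h[i] - h[i + 1] for i in range(len(sentence))]


if __name__ == "__main__":
    # Hypothetical toy language: four derivations with made-up probabilities.
    toy = [
        (("the", "dog", "barks"), 0.25),
        (("the", "dog", "sleeps"), 0.25),
        (("the", "cat", "sleeps"), 0.25),
        (("a", "cat", "sleeps"), 0.25),
    ]
    print(information_per_word(toy, ("the", "cat", "sleeps")))
```

In this toy case the per-word values sum to the initial uncertainty of 2 bits, because the full sentence leaves exactly one derivation.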
Using Citations to Generate Surveys of Scientific Paradigms
Cited by 7 (2 self)
The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of technical material. In this paper we present the first steps in producing an automatically generated, readily consumable, technical survey. Specifically, we explore the combination of citation information and summarization techniques. Even though prior work (Teufel et al., 2006) argues that citation text is unsuitable for summarization, we show that in the framework of multi-document survey creation, citation texts can play a crucial role.
Variation in language and cohesion across written and spoken registers
- 2004
Cited by 6 (5 self)
This paper investigates the variation in cohesion across written and spoken registers. The same method and corpora were used as in Biber’s (1988) study on linguistic variation across speech and writing; however, instead of focusing on 67 linguistic features that primarily operate at the word level, we compared 236 language and cohesion features at the text level. Variations in frequencies across these features provided evidence for six dimensions: (1) speech versus writing, (2) informational versus declarative, (3) factual versus situational, (4) topic consistency versus topic variation, (5) elaborative versus constrained, (6) narrative versus non-narrative. Our cohesion and linguistic analysis showed most variation in speech and writing, whereas the linguistic feature analysis operating at the word level did not yield any difference.
Where Should Complexity Go? Cooperation in Complex Agents with Minimal Communication
- Innovative Concepts for Agent-Based Systems, 2002
Cited by 6 (3 self)
The 'Radical Agent Concept' in this chapter is that communication between agents in a MAS should be the simplest part of the system. When extensive real-time coordination between modules is required, then those modules should probably be considered elements of a single modular agent rather than as agents themselves. The advantage of this distinction is that system developers can then leverage standard software-engineering practices and more centralized coordination mechanisms to reduce the overall complexity of the system. In this chapter I provide arguments for this point and also examples, both from nature and from my own research in building modular agents.
Co-constructed narratives in online, collaborative mathematics problem solving
- Paper presented at the international conference on AI in Education (AI-Ed 2005), 2005
Cited by 5 (3 self)
Our approach to the study of the narrative aspects of learning mathematical problem-solving extends the conception of narrative as the central artefact of interest, to include the process of collaborative dialog and emergent narratives. This perspective favours the conception of the dialogical aspects of interaction as shared achievements of co-participants and central meaning-making procedures. On the other hand, our qualitative analysis of transcripts of collaborative problem-solving interactions online revealed striking resemblances with the narrative form. Based on these observations we attempt to establish a link between the narrative and dialogical perspectives and explore relevant implications for the design of the Virtual Math Teams collaborative learning environment.
Truth is not to be found inside the head of an individual person, it is born between people collectively searching for truth, in the process of their dialogic interaction. (Bakhtin, [1], p.110)

