Results 1–10 of 71
Using Semantic Role to Improve Question Answering
In Proceedings of EMNLP 2007
Cited by 67 (3 self)
Abstract:
Shallow semantic parsing, the automatic identification and labeling of sentential constituents, has recently received much attention. Our work examines whether semantic role information is beneficial to question answering. We introduce a general framework for answer extraction which exploits semantic role annotations in the FrameNet paradigm. We view semantic role assignment as an optimization problem in a bipartite graph and answer extraction as an instance of graph matching. Experimental results on the TREC datasets demonstrate improvements over state-of-the-art models.
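The graph-matching view in the abstract above can be illustrated with a toy maximum-weight bipartite matching between question roles and candidate answer constituents. The score matrix below is entirely made up, and the brute-force search is only a sketch of the kind of optimization phrased over a bipartite graph, not the paper's actual model:

```python
from itertools import permutations

def best_assignment(score):
    """Maximum-weight bipartite matching by brute force.

    score[i][j] is a (hypothetical) compatibility score between the i-th
    question role and the j-th candidate constituent; assumes a square
    score matrix and a small n, so O(n!) enumeration is acceptable.
    """
    n = len(score)
    best, best_perm = float("-inf"), None
    for perm in permutations(range(n)):
        total = sum(score[i][perm[i]] for i in range(n))
        if total > best:
            best, best_perm = total, perm
    return best, best_perm

# Toy 3x3 score matrix (made-up numbers, not from the paper).
scores = [
    [0.9, 0.1, 0.2],
    [0.3, 0.8, 0.1],
    [0.2, 0.4, 0.7],
]
total, assignment = best_assignment(scores)
# The diagonal assignment wins with total score 2.4.
```

For realistic problem sizes one would replace the factorial enumeration with a polynomial-time assignment algorithm such as the Hungarian method.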
A Polynomial Time Computable Metric Between Point Sets
2000
Cited by 54 (5 self)
Abstract:
Measuring the similarity or distance between two sets of points in a metric space is an important problem in machine learning, and it also has applications in other disciplines, e.g. computational geometry, philosophy of science, and methods for updating or changing theories. Recently, Eiter and Mannila have proposed a new measure which is computable in polynomial time. However, it is not a distance function in the mathematical sense because it does not satisfy the triangle inequality.
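A minimal sketch of why such polynomial-time set measures can fail the triangle inequality; the averaged sum-of-minimum-distances used here is an illustrative variant, not necessarily the exact measure of Eiter and Mannila:

```python
def avg_min_dist(A, B):
    """Average, over points of A, of the distance to the nearest point of B."""
    return sum(min(abs(a - b) for b in B) for a in A) / len(A)

def set_distance(A, B):
    """A symmetric 'averaged sum of minimum distances' between 1-D point
    sets. Computable in polynomial time, but not a metric: the triangle
    inequality can fail, as the example below shows."""
    return avg_min_dist(A, B) + avg_min_dist(B, A)

A, B, C = [0.0], [0.0, 10.0], [10.0]
# d(A,C) = 20, yet d(A,B) + d(B,C) = 5 + 5 = 10: the triangle
# inequality fails because B "averages away" its far point.
```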
Cross-lingual annotation projection for semantic roles
Journal of Artificial Intelligence Research, 2009
Cited by 36 (3 self)
Abstract:
This article considers the task of automatically inducing role-semantic annotations in the FrameNet paradigm for new languages. We propose a general framework that is based on annotation projection, phrased as a graph optimization problem. It is relatively inexpensive and has the potential to reduce the human effort involved in creating role-semantic resources. Within this framework, we present projection models that exploit lexical and syntactic information. We provide an experimental evaluation on an English–German parallel corpus which demonstrates the feasibility of inducing high-precision German semantic role annotation for both manually and automatically annotated English data.
A Framework for Defining Distances Between First-Order Logic Objects
1998
Cited by 34 (3 self)
Abstract:
In this paper we develop a framework for distances between clauses and distances between models. The framework can be parametrised by a measure for the distance between atoms. It takes into account subterms common to distinct atoms of a set of atoms in the measurement of the distance between sets. Moreover, for a constant number of variables, the complexity of the distance computation is polynomially bounded by the size of the objects. Initial experiments show that the framework can be the basis of good clustering algorithms. The framework consists of three levels: at the first level one chooses a distance between atoms; the second level upgrades this distance to a distance between sets of atoms. We propose a framework that is a generalisation of three polynomial-time computable similarity measures proposed by Eiter and Mannila, and an instance which is a real distance function, computable in polynomial time. We also develop a binary prototype function for sets of points.
Using sets of feature vectors for similarity search on voxelized CAD objects
In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, 2003
Cited by 33 (13 self)
Abstract:
In modern application domains such as multimedia, molecular biology and medical imaging, similarity search in database systems is becoming an increasingly important task. Especially for CAD applications, suitable similarity models can help to reduce the cost of developing and producing new parts by maximizing the reuse of existing parts. Most existing similarity models are based on feature vectors. In this paper, we briefly review three models which pursue this paradigm. Based on the most promising of these three models, we explain how sets of feature vectors can be used for more effective and still efficient similarity search. We first introduce an intuitive distance measure on sets of feature vectors, together with an algorithm for its efficient computation. Furthermore, we present a method for accelerating the processing of similarity queries on vector set data. The experimental evaluation is based on two real-world test data sets and shows that our new similarity approach yields more meaningful results in comparatively short time.
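One intuitive distance on sets of feature vectors is a minimal-weight one-to-one matching between the two sets. The brute-force sketch below is a hypothetical stand-in for such a measure, not the paper's actual definition or its efficient algorithm:

```python
from itertools import permutations
from math import dist  # Euclidean distance, Python 3.8+

def minimal_matching_distance(X, Y):
    """Minimum, over all one-to-one matchings, of the summed Euclidean
    distances between matched feature vectors. Brute force (O(n!)), so
    only usable for tiny sets; assumes |X| == |Y|."""
    n = len(X)
    return min(sum(dist(X[i], Y[p[i]]) for i in range(n))
               for p in permutations(range(n)))

# Two sets containing the same vectors in different order:
X = [(0.0, 0.0), (1.0, 0.0)]
Y = [(1.0, 0.0), (0.0, 0.0)]
# The optimal matching pairs identical vectors, so the distance is 0.
```

In practice a minimal-weight matching would be computed in polynomial time with an assignment algorithm rather than by enumerating permutations.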
Counting Pedestrians in Video Sequences Using Trajectory Clustering
IEEE Transactions on Circuits and Systems for Video Technology, Vol. 16, 2006
Cited by 32 (1 self)
Abstract:
In this paper, we propose the use of clustering methods for automatic counting of pedestrians in video sequences. As input, we consider the output of those detection/tracking systems that overestimate the number of targets. Clustering techniques are applied to the resulting trajectories in order to reduce the bias between the number of tracks and the real number of targets. The main hypothesis is that trajectories belonging to the same human body are more similar than trajectories belonging to different individuals. Several data representations and different distance/similarity measures are proposed and compared, under a common hierarchical clustering framework, and both quantitative and qualitative results are presented.
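The counting-by-clustering idea can be sketched minimally, assuming a deliberately simple trajectory distance (mean pointwise Euclidean distance over equal-length tracks) and naive single-linkage agglomeration; the paper compares several more refined representations and measures:

```python
from math import dist

def traj_distance(t1, t2):
    """Mean pointwise Euclidean distance between two trajectories of equal
    length (a deliberately simple stand-in for the paper's measures)."""
    return sum(dist(p, q) for p, q in zip(t1, t2)) / len(t1)

def cluster(trajs, threshold):
    """Naive single-linkage agglomerative clustering: repeatedly merge the
    two closest clusters until no pair is within `threshold`. The number
    of resulting clusters estimates the number of pedestrians."""
    clusters = [[t] for t in trajs]
    while len(clusters) > 1:
        d, i, j = min(
            (traj_distance(a, b), i, j)
            for i in range(len(clusters)) for j in range(i + 1, len(clusters))
            for a in clusters[i] for b in clusters[j]
        )
        if d > threshold:
            break
        clusters[i].extend(clusters.pop(j))
    return clusters

# Two fragmented tracks of the same walker plus one distant walker
# (made-up coordinates) collapse into two clusters.
tracks = [
    [(0, 0), (1, 0), (2, 0)],
    [(0, 0.1), (1, 0.1), (2, 0.1)],
    [(5, 5), (6, 5), (7, 5)],
]
groups = cluster(tracks, threshold=1.0)
```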
Context-based similarity measures for categorical databases
In Lecture Notes in Computer Science, 2000
Multi-Step Density-Based Clustering
Cited by 25 (9 self)
Abstract:
Data mining in large databases of complex objects from scientific, engineering or multimedia applications is getting more and more important. In many areas, complex distance measures are the first choice, but simpler distance functions are also available which can be computed much more efficiently. In this paper, we demonstrate how the paradigm of multi-step query processing, which relies on exact as well as on lower-bounding approximated distance functions, can be integrated into the two density-based clustering algorithms DBSCAN and OPTICS, resulting in a considerable efficiency boost. Our approach tries to confine itself to ε-range queries on the simple distance functions and carries out complex distance computations only at the stage of the clustering algorithm where they are required to compute the correct clustering result. Furthermore, we show how our approach can be used for approximated clustering, allowing the user to find an individual trade-off between quality and efficiency. In order to assess the quality of the resulting clusterings, we introduce suitable quality measures which can be used generally for evaluating the quality of approximated partitioning and hierarchical clusterings. In a broad experimental evaluation based on real-world test data sets, we demonstrate that our approach accelerates the generation of exact density-based clusterings by more than one order of magnitude. Furthermore, we show that our approximated clustering approach results in high-quality clusterings where the desired quality is scalable w.r.t. the overall number of exact distance computations.
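The filter/refinement idea behind multi-step query processing can be sketched as an ε-range query, assuming only that the cheap distance lower-bounds the exact one; the toy L∞/Euclidean pair below stands in for the complex distance functions the paper targets:

```python
def range_query(q, db, eps, lower_bound, exact):
    """Multi-step ε-range query: evaluate the cheap lower-bounding distance
    first and compute the expensive exact distance only for candidates that
    survive the filter. Correct whenever lower_bound(q, x) <= exact(q, x)
    for all x, because a pruned candidate can never be within eps."""
    result = []
    for x in db:
        if lower_bound(q, x) <= eps:      # cheap filter step
            if exact(q, x) <= eps:        # expensive refinement step
                result.append(x)
    return result

# Toy example on 2-D points: L_inf lower-bounds the Euclidean distance.
def l_inf(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

def euclid(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

points = [(0.0, 0.0), (0.5, 0.5), (3.0, 0.0)]
hits = range_query((0.0, 0.0), points, 1.0, l_inf, euclid)
# (3.0, 0.0) is pruned by the filter; the exact distance is never computed.
```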
Thesus: Organizing Web Document Collections Based on Link Semantics
VLDB Journal, 2003
Cited by 24 (3 self)
Abstract:
The requirements for effective search and management of the WWW are stronger than ever. Currently, web documents are classified based on their content, not taking into account the fact that these documents are connected to each other by links. We claim that a page's classification is enriched by the detection of its incoming links' semantics. This would enable effective browsing and enhance the validity of search results in the WWW context. Another aspect that is under-addressed, and is strictly related to the tasks of browsing and searching, is the similarity of documents at the semantic level. The above observations lead us to the adoption of a hierarchy of concepts (an ontology) and a thesaurus to exploit links and provide a better characterization of web documents. This enhanced characterization makes operations such as clustering and labeling very interesting. To this end, we devised a system called THESUS. The system deals with an initial set of web documents, extracts keywords from all pages' incoming links, and converts them to semantics by mapping them to a domain's ontology. Subsequently, a clustering algorithm is applied to discover groups of web documents. The effectiveness of the clustering process is based on the use of a novel similarity measure between documents characterized by sets of terms. Web documents are organized into thematic subsets based on their semantics. The subsets are then labeled, thus enabling easier management (browsing, searching, querying) of the Web. In this article, we detail the process of this system and give an experimental analysis of its results.
Distances and (indefinite) kernels for sets of objects
In ICDM, 2006
Cited by 15 (1 self)
Abstract:
For various classification problems involving complex data, it is most natural to represent each training example as a set of vectors. While several distance measures for sets have been proposed, only a few kernels over these structures exist, since it is difficult in general to design a positive semi-definite (PSD) similarity function. The main disadvantage of most existing set kernels is that they are based on averaging, which might be inappropriate for problems where only specific elements of the two sets should determine the overall similarity. In this paper we propose a class of kernels for sets of vectors that directly exploit set distance measures, hence incorporating various semantics into set kernels and lending the power of regularization to learning in structural domains where natural distance functions exist. These kernels belong to two groups: (i) kernels in the proximity space induced by set distances, and (ii) set distance substitution kernels (non-PSD in general). We report experimental results which show that our kernels compare favorably with kernels based on averaging and achieve results similar to other state-of-the-art methods. At the same time, our kernels bring systematic improvement over the naive way of exploiting distances.
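A distance substitution kernel of the kind mentioned in group (ii) can be sketched by plugging a set distance into an RBF-style form; the Hausdorff distance and the γ parameter here are illustrative choices, not necessarily the paper's, and with a non-Euclidean distance such a kernel is not PSD in general:

```python
from math import exp, dist

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets,
    used here as the substituted set distance."""
    h = lambda X, Y: max(min(dist(x, y) for y in Y) for x in X)
    return max(h(A, B), h(B, A))

def substitution_kernel(A, B, gamma=1.0):
    """Distance substitution kernel k(A, B) = exp(-gamma * d(A, B)).
    Substituting a non-Euclidean set distance d generally breaks positive
    semi-definiteness, matching the caveat in the abstract above."""
    return exp(-gamma * hausdorff(A, B))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0)]
C = [(5.0, 0.0)]
# Identical sets give kernel value 1; distant sets give values near 0.
```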