An open source grammar development environment and broad-coverage English grammar using HPSG
- In Proceedings of LREC 2000
, 2000
Cited by 81 (5 self)
The LinGO (Linguistic Grammars Online) project's English Resource Grammar and the LKB grammar development environment are language resources which are freely available for download for any purpose, including commercial use (see http://lingo.stanford.edu). Executable programs and source code are both included. In this paper, we give an outline of the LinGO English grammar and LKB system, and discuss the ways in which they are currently being used. The grammar and processing system can be used independently or combined to give a central component which can be exploited in a variety of ways. Our intention in writing this paper is to encourage more people to use the technology, which supports collaborative development on many levels.
The Grammar Matrix: An Open-Source Starter-Kit for the Rapid Development of Cross-Linguistically Consistent Broad-Coverage Precision Grammars
- Proceedings of the Workshop on Grammar Engineering and Evaluation at the 19th International Conference on Computational Linguistics
, 2002
Cited by 32 (9 self)
The grammar matrix is an open-source starter-kit for the development of broad-coverage HPSGs. By using a type hierarchy to represent cross-linguistic generalizations and providing compatibility with other open-source tools for grammar engineering, evaluation, parsing and generation, it facilitates not only quick start-up but also rapid growth towards the wide coverage necessary for robust natural language processing and the precision parses and semantic representations necessary for natural language understanding.
The DeepThought core architecture framework
- In Proceedings of LREC-2004, volume IV
, 2004
Cited by 27 (7 self)
The research performed in the DeepThought project aims at demonstrating the potential of deep linguistic processing if combined with shallow methods for robustness. Classical information retrieval is extended by high precision concept indexing and relation detection. On the basis of this approach, the feasibility of three ambitious applications will be demonstrated, namely: precise information extraction for business intelligence; email response management for customer relationship management; creativity support for document production and collective brainstorming. Common to these applications, and the basis for their development, is the XML-based, RMRS-enabled core architecture framework that will be described in detail in this paper. The framework is not limited to the applications envisaged in the DeepThought project, but can also be employed, e.g., to generate and make use of XML standoff annotation of documents and linguistic corpora, and in general for a wide range of NLP-based applications and research purposes.
LinGO Redwoods - A Rich and Dynamic Treebank for HPSG
- In Beyond PARSEVAL. Workshop of the Third LREC Conference
, 2002
Cited by 24 (6 self)
The LinGO Redwoods initiative is a seed activity in the design and development of a new type of treebank. A treebank is a (typically hand-built) collection of natural language utterances and associated linguistic analyses; typical treebanks, such as the widely recognized Penn Treebank (Marcus, Santorini, & Marcinkiewicz, 1993), the Prague Dependency Treebank (Hajic, 1998), or the German TiGer Corpus (Skut, Krenn, Brants, & Uszkoreit, 1997), assign syntactic phrase structure or tectogrammatical dependency trees over sentences taken from a naturally occurring source, often newspaper text. Applications of existing treebanks fall into two broad categories: (i) use of an annotated corpus in empirical linguistics as a source of structured language data and distributional patterns and (ii) use of the treebank for the acquisition (e.g. using stochastic or machine learning approaches) and evaluation of parsing systems.
Efficient Deep Processing of Japanese
, 2002
Cited by 20 (6 self)
We present a broad coverage Japanese grammar written in the HPSG formalism with MRS semantics. The grammar is created for use in real world applications, such that robustness and performance issues play an important role. It is connected to a POS tagging and word segmentation tool.
Continuous or discontinuous constituents? A comparison between syntactic analyses for constituent order and their processing systems
- Research on Language and Computation
, 2004
Cited by 16 (1 self)
In this paper I discuss several possible analyses for constituent order in German. Approaches that assume continuous constituents are compared with an approach that assumes discontinuous constituents. I will show that certain proposals that have been made to analyze constituent order are either not adequate or cannot be implemented with currently available systems. For the proposals that can be implemented I will discuss the amount of work a parser has to do. I then compare two implementations of larger fragments of German: the Verbmobil grammar and the Babel grammar. It is shown that the amount of work to be done to parse the Verbmobil grammar is significantly higher than the work that has to be done parsing with the Babel grammar. Key words: German, HPSG, implementation, linearization, parsing
Relative clause extraposition in German: An efficient and portable implementation
- Research on Language and Computation
, 2005
Cited by 11 (3 self)
In this paper, I propose an implementation of relative clause extraposition in German. The proposal builds on Kiss (in press), who treats relative clause extraposition as an anaphoric process by means of percolation of anchors to which the relative clause is bound. I discuss several sources of spurious ambiguity in Kiss’s original formulation and suggest a two-step percolation of anchors that crucially distinguishes right-peripheral from central or left-peripheral percolation. Since extraposition is fairly productive, and phrase structure alternates between head-initial (prepositional phrases, V-initial) and head-final structures (postpositional phrases, V-final), German provides a good testing ground for techniques controlling spurious ambiguity that may easily be ported to languages where phrase structure is more canonical and/or extraposition more restricted. Finally, the performance of the Kiss-style approach is compared to an alternative implementation in terms of rightward movement, similar to Keller (1995).
Measure For Measure: Parser Cross-Fertilization - Towards Increased Component Comparability and Exchange
, 2000
Cited by 10 (5 self)
Over the past few years significant progress was accomplished in efficient processing with wide-coverage HPSG grammars. HPSG-based parsing systems are now available that can process medium-complexity sentences (of ten to twenty words, say) in average parse times equivalent to real (i.e. human reading) time. A large number of engineering improvements in current HPSG systems were achieved through collaboration of multiple research centers and mutual exchange of experience, encoding techniques, algorithms, and even pieces of software. This article presents an approach to grammar and system engineering, termed competence & performance profiling, that makes systematic experimentation and the precise empirical study of system properties a focal point in development. Adapting the profiling metaphor familiar from software engineering to constraint-based grammars and parsers enables developers to maintain an accurate record of system evolution, identify grammar and system deficiencies quickl...
The automatic acquisition of verb subcategorisations and their impact on the performance of an HPSG parser
- In Proc. of the First International Joint Conference on Natural Language Processing (IJCNLP-04)
, 2004
Cited by 8 (0 self)
We describe the automatic acquisition of a lexicon of verb subcategorisations from a domain-specific corpus, and an evaluation of the impact this lexicon has on the performance of a “deep” HPSG parser of English. We conducted two experiments to determine whether the empirically extracted verb stems would enhance the lexical coverage of the grammar and to see whether the automatically extracted verb subcategorisations would result in enhanced parser coverage. In our experiments, the empirically extracted verbs enhance lexical coverage by 8.5%. The automatically extracted verb subcategorisations enhance the parse success rate by 15% in theoretical terms and by 4.5% in practice. This is a promising approach for improving the robustness of deep parsing.
A Lexicon Module for a Grammar Development Environment
- Proceedings of the 4th International Conference of Language Resources and
, 2004
Cited by 6 (3 self)
Past approaches to developing an effective lexicon component in a grammar development environment have suffered from a number of usability and efficiency issues. We present a lexical database module currently in use by a number of grammar development projects. The database module presented addresses issues which have caused problems in the past, and the power of a database architecture provides a number of practical advantages as well as a solid framework for future extension.

