Evaluating Natural Language Processing Systems, 1993
Abstract - Cited by 104 (0 self)
This report presents a detailed analysis and review of NLP evaluation, in principle and in practice. Part 1 examines evaluation concepts and establishes a framework for NLP system evaluation. This makes use of experience in the related area of information retrieval and the analysis also refers to evaluation in speech processing. Part 2 surveys significant evaluation work done so far, for instance in machine translation, and discusses the particular problems of generic system evaluation. The conclusion is that evaluation strategies and techniques for NLP need much more development, in particular to take proper account of the influence of system tasks and settings. Part 3 develops a general approach to NLP evaluation, aimed at methodologically-sound strategies for test and evaluation motivated by comprehensive performance factor identification. The analysis throughout the report is supported by extensive illustrative examples. This work was carried out under the UK Science and Engineeri...
E*PLORE-ING THE SIMULATION DESIGN SPACE
Abstract
One of the major puzzles in performing multi-agent-based simulations is the validity of their results. Optimisation of simulation parameters can lead to results that are deceitful, optimistic, or plainly wrong. When the issue at stake is inherently complex, which is frequently the case with social phenomena, the search for emergent outcomes is closely related to macro effects deriving from micro behaviours, and valid conclusions should be drawn from the observed results with extra care. Multi-agent-based social simulation is increasingly used not only to understand and explain phenomena, but also to predict outcomes and even to prescribe measures to be adopted by collective (public or private) entities. The notion that conclusions of simulation studies will be applied to real social settings brings an added responsibility to the researcher. Principled methodologies are needed that can minimise the ad hoc nature of experimentation. In this paper, we present a set of methodological principles to explore the space of possible designs involved in simulation experiments. Principles are needed not only for the design of agents and the societies they are immersed in, but also for the design of models of simulations themselves. Several techniques are shown that can provide increasingly broad coverage of the space of possible experiment designs. We also explore some alternatives for progressively complexifying particular mechanisms.

