Issues in Performance Evaluation: A Case Study of Math Recognition
 10TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION
, 2009
Performance evaluation of document recognition systems is a difficult and practically important problem. Issues arise in defining requirements, in characterizing the system’s range of inputs and outputs, in interpreting published performance evaluation results, in reproducing performance evaluation experiments, in choosing training and test data, and in selecting performance metrics. We discuss these issues in the context of evaluating systems for recognition of mathematical expressions. Excellent progress has been made in the theory and practice of performance evaluation, but many open problems remain.
P.: Web Interface and Collection for Mathematical Retrieval
 Masaryk University, Bertinoro, Italy
, 2011
A More Canonical Form of Content MathML to Facilitate Math Search
, 2007
Normalization of Digital Mathematics Library Content MathML Canonicalization
Abstract. The paper discusses the need for data normalization in a Digital Mathematics Library (DML). Specifically, emphasis is given to canonicalizing formulae encoded in Presentation MathML notation, which is becoming available in several DMLs and is used by DML applications. This is a prerequisite for advanced processing, namely math-enabled full-text searching or semantic filtering and automated classification. Different sources of MathML and their specifics are described. Several use cases of possible formula canonicalization transformations are listed and discussed in detail. Finally, the findings are summarized and a design of a to-be-developed canonicalization tool is outlined.
Non visual access to mathematical contents: State of the art and prospective
 In Proceedings of the WEIMS Conference
, 2009
Context-Aware Adaption. A Case Study on Mathematical Notations
, 2008
In the last two decades, the World Wide Web has become the universal and, for many users, main information source. Search engines can efficiently serve daily life information needs due to the enormous redundancy of relevant resources on the web. For educational and, even more so, for scientific information needs, the web functions much less efficiently: scientific publishing is built on a culture of unique reference publications, and moreover abounds with specialized structures, such as technical nomenclature, notational conventions, references, tables, or graphs. Many of these structures are peculiar to specialized communities determined by nationality, research group membership, or adherence to a special school of thought. To keep the much-lamented “digital divide” from becoming a “cultural divide”, we have to make online material more accessible and adaptable to individual users. In this paper we attack this goal for the field of mathematics, where knowledge is abstract, highly structured, and extraordinarily interlinked. Modern, content-based representation formats like OpenMath or content MathML allow us to capture, model, relate, and represent mathematical knowledge objects and thus make them context-aware and machine-adaptable to the respective user contexts. Building on previous work that makes mathematical notations adaptable, we employ user modeling techniques to make them adaptive, relieving the reader of configuration tasks. We present a comprehensive framework for adaptive notation management and evaluate it on an implementation integrated in the e-learning platform panta rhei.
Editorial Team
[@Science European Thematic Network]
Math Expression Retrieval Using Symbol Pairs in Layout Trees
, 2013
We have developed a layout-based math retrieval system by indexing on pairs of symbols in mathematical expressions. Existing approaches to layout-based retrieval include tree edit distance-based matching on MathML trees (Kamali and Tompa, 2013) and longest common subsequence matching in LaTeX strings (Kumar et al., 2012). In our work, we compare our new layout-based retrieval method with a math retrieval system built using the conventional text-based retrieval system Lucene (Zanibbi and Yuan, 2011), as such systems are commonly used for math search. We show that the search results returned by our system are scored by participants in a study as significantly more similar than those of the comparison system and that our system is fast enough to be used in real time.