Results 1 - 7 of 7
A static load-balancing scheme for parallel xml parsing on multicore cpus
- In CCGrid’07 (IEEE International Symposium on Cluster Computing and the Grid), Rio de Janeiro, 2007
Cited by 5 (2 self)
A number of techniques to improve the parsing performance of XML have been developed. Generally, however, these techniques have limited impact on the construction of a DOM tree, which can be a significant bottleneck. Meanwhile, the trend in hardware technology is toward an increasing number of cores per CPU. As we have shown in previous work, these cores can be used to parse XML in parallel, resulting in significant speedups. In this paper, we introduce a new static partitioning and load-balancing mechanism. By using a static, global approach, we reduce synchronization and load-balancing overhead, thus improving performance over dynamic schemes for a large class of XML documents. Our approach leverages libxml2 without modification, which reduces development effort and shows that our approach is applicable to real-world, production parsers. Our scheme works well with Sun’s Niagara class of CMT architectures, and shows that multiple hardware threads can be effectively used for XML parsing.
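The abstract does not say how the static partitioning is computed. As a rough illustration of the general idea only (work is assigned once, up front, so threads need no run-time coordination), the following sketch uses a generic greedy longest-processing-time heuristic. The function name `static_partition` and the use of chunk byte sizes as a cost estimate are assumptions made for this example, not details from the paper.

```python
import heapq

def static_partition(chunk_sizes, n_workers):
    """Assign chunks to workers once, before any parsing starts.

    Hypothetical sketch: chunk_sizes are estimated costs (e.g. bytes per
    subtree). The largest chunks are placed first, each on the worker
    that currently has the least work.
    """
    heap = [(0, w) for w in range(n_workers)]      # (current load, worker id)
    assignment = [[] for _ in range(n_workers)]
    order = sorted(range(len(chunk_sizes)),
                   key=lambda i: chunk_sizes[i], reverse=True)
    for idx in order:
        load, w = heapq.heappop(heap)              # least-loaded worker
        assignment[w].append(idx)
        heapq.heappush(heap, (load + chunk_sizes[idx], w))
    loads = [sum(chunk_sizes[i] for i in chunks) for chunks in assignment]
    return assignment, loads
```

Because the whole assignment is fixed before the workers start, no queue or lock is shared while parsing. The trade-off is that a poor cost estimate cannot be corrected later, which is where dynamic schemes have the advantage.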
ParaXML: A Parallel XML Processing Model on the Multicore CPUs
Cited by 3 (1 self)
XML has emerged as the de facto standard interoperable data format for web services, databases and document processing systems. The processing of XML documents, however, has been recognized as the performance bottleneck in those systems; as a result, the demand for high-performance XML processing is growing rapidly. On the hardware front, multicore processors are increasingly available on desktop machines, with quad-core systems shipping now and 16-core systems expected within two or three years. Unfortunately, almost all present XML processing algorithms still use a serial processing model and so cannot take advantage of multicore resources. We believe a parallel XML processing model should be a cost-effective solution to the XML performance issue in the multicore era. In this paper, we present a general-purpose parallel XML processing model, ParaXML, designed for multicore CPUs. Generally speaking, ParaXML treats the XML document as a general tree structure and the XML processing task as an extension of the parallel tree traversal algorithms for classic discrete optimization problems. XML processing, however, has characteristics quite distinct from those of the classic discrete optimization problems, and thus demands special treatment and fine-grained tuning. ParaXML internally adopts a fine-grained work-stealing scheme to dynamically balance the load among the parallel threads, and a novel approach is introduced to trace the stealing actions and the running results to facilitate the reduction of those parallel results. In addition, ParaXML provides tuning options, particularly for large XML documents, to control the trade-off between the parallelism gain and the task-partitioning overhead.
To show the feasibility and effectiveness of the ParaXML model, we demonstrate our parallel implementations of three fundamental XML processing tasks based on ParaXML: traversal, serializing and parsing. The empirical study in this paper shows that those parallel implementations substantially improved the performance and scale well on a multicore machine.
Performance Enhancement with Speculative Execution Based Parallelism for Processing Large-scale XML-based Application Data
Cited by 1 (1 self)
We present the design and implementation of a toolkit for processing large-scale XML datasets that utilizes the capabilities for parallelism that are available in the emerging multi-core architectures. Multi-core processors are expected to be widely available in research clusters and scientific desktops, and it is critical to harness the opportunities for parallelism in the middleware, instead of passing on the task to application programmers. An emerging trend is the use of XML as the data format for many distributed/grid applications, with the size of these documents ranging from tens of megabytes to hundreds of megabytes. Our earlier benchmarking results revealed that most of the widely available XML processing toolkits do not scale well for large sized XML data. A significant transformation is necessary in the design of XML processing for distributed applications so that the overall application turn-around time is not negatively affected by XML processing. We discuss XML processing using PiXiMaL, a parallel processing library for large-scale XML datasets. The parallelization approach is to build a DFA-based parser that recognizes a useful subset of the XML specification, and convert the DFA into an NFA that can be applied to an arbitrary subset of the input. Speculative NFAs are scheduled on available cores in a node to effectively utilize the processing capabilities and achieve overall performance gains. We evaluate the efficacy of this approach in terms of potential speedup that can be achieved for representative XML datasets. We also evaluate the effect of two different memory allocation libraries to quantify the memory-bottleneck as different cores access shared data structures.
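The speculative idea described above (a worker that starts in the middle of the input does not know the automaton's state, so it tries every state) can be sketched as follows. This is a toy example, not PiXiMaL's implementation: the three-state tokenizer, the names `step`, `speculate` and `parallel_final_state`, and the use of Python threads are all assumptions chosen for brevity.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy DFA: far smaller than a real XML tokenizer.
TEXT, TAG, QUOTE = 0, 1, 2
STATES = (TEXT, TAG, QUOTE)

def step(state, ch):
    """One DFA transition."""
    if state == TEXT:
        return TAG if ch == "<" else TEXT
    if state == TAG:
        if ch == ">":
            return TEXT
        return QUOTE if ch == '"' else TAG
    return TAG if ch == '"' else QUOTE   # inside a quoted attribute value

def run(state, text):
    """Ordinary sequential DFA run from a known start state."""
    for ch in text:
        state = step(state, ch)
    return state

def speculate(chunk):
    """Run the chunk from every possible start state (the NFA-like view)."""
    return {s: run(s, chunk) for s in STATES}

def parallel_final_state(text, n_chunks=4):
    size = max(1, -(-len(text) // n_chunks))          # ceiling division
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    with ThreadPoolExecutor() as pool:
        maps = list(pool.map(speculate, chunks))      # independent per chunk
    state = TEXT                                      # true start state
    for m in maps:                                    # cheap sequential stitch
        state = m[state]
    return state
```

Each chunk does up to three times the work of a sequential run, which is the price of speculation; the stitch step then selects the one run per chunk that was actually valid. In CPython the threads here only illustrate the structure, since the interpreter lock prevents a real speedup for this kind of loop.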
Approaching a Parallelized XML Parser Optimized for Multi-Core Processors
Cited by 1 (0 self)
Very large scientific datasets are increasingly becoming available in XML formats. At the same time, multi-core processing is increasingly becoming available on desktop- and laptop-class computing machines. Unfortunately, most XML parsers are still using algorithms that are inherently serial, which show little improvement on newer computing hardware. The current XML implementation landscape does not adequately meet the performance requirements of large scale applications. Thus far, applications using Web services (in the grid community, for example) have largely focused on XML protocol standardization and tool building efforts, and not on addressing the performance bottlenecks when dealing with large volumes of XML data. Generic parallel parsing has been studied in depth over the past thirty years. However, as yet, these results have not been applied to the problem of XML parsing. XML documents have some structural properties that make it more amenable to parallelized parsing than general context-free languages. As has been previously shown, XML parsers spend a large percentage of time tokenizing the input in an inherently serial process, typically running a deterministic finite automaton on the input. Our initial approach, described here, separates the process of parsing the XML from the process of reading the input. We take a well-known high performance parser, Piccolo, and apply two different strategies, Runahead and Piped, and examine the timing of the file read time and hence the overall time to parse large scientific XML files. Under the conditions tested here, performance decreases.
Boosting XML Filtering with a Scalable FPGA-based Architecture
The growing amount of XML encoded data exchanged over the Internet increases the importance of XML based publish-subscribe (pub-sub) and content based routing systems. The input in such systems typically consists of a stream of XML documents and a set of user subscriptions expressed as XML queries. The pub-sub system then filters the published documents and passes them to the subscribers. Pub-sub systems are characterized by very high input ratios, therefore the processing time is critical. In this paper we propose a “pure hardware” based solution, which utilizes XPath query blocks on FPGA to solve the filtering problem. By utilizing the high throughput that an FPGA provides for parallel processing, our approach achieves drastically better throughput than the existing software or mixed (hardware/software) architectures. The XPath queries (subscriptions) are translated to regular expressions which are then mapped to FPGA devices. By introducing stacks within the FPGA we are able to express and process a wide range of path queries very efficiently, on a scalable environment. Moreover, the fact that the parser and the filter processing are performed on the same FPGA chip, eliminates expensive communication costs (that a multi-core system would need) thus enabling very fast and efficient pipelining. Our experimental evaluation reveals more than one order of magnitude improvement compared to traditional pub/sub systems.
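The translation of XPath subscriptions into regular expressions can be illustrated in software. The sketch below is not the paper's hardware mapping: it assumes a small XPath subset (child steps `/`, descendant steps `//`, and the `*` wildcard) and matches against the root-to-element path written as a string such as `/a/b/c`. The function name `xpath_to_regex` is hypothetical.

```python
import re

def xpath_to_regex(xpath):
    """Translate a path using only /, // and * into a compiled regex.

    Hypothetical sketch: the regex is meant to be applied with .match()
    to a root-to-element path string such as '/a/b/c'.
    """
    out = []
    i = 0
    while i < len(xpath):
        if xpath.startswith("//", i):
            out.append(r"(?:/[^/]+)*/")    # zero or more intermediate elements
            i += 2
        elif xpath[i] == "/":
            out.append("/")                # direct child step
            i += 1
        else:
            j = i
            while j < len(xpath) and xpath[j] != "/":
                j += 1
            name = xpath[i:j]
            out.append(r"[^/]+" if name == "*" else re.escape(name))
            i = j
    return re.compile("".join(out) + r"\Z")
```

A filter would evaluate every subscription's regex against the path of each element as it is opened. In software that is a loop over patterns, whereas the abstract describes laying the expressions out on the FPGA so that they are evaluated side by side.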
XPEDIA: XML Processing for Data Integration
Data Integration engines increasingly need to provide sophisticated processing options for XML data. In the past, it was adequate for these engines to support basic shredding and XML generation capabilities. However, with the steady growth of XML in applications and databases, integration platforms need to provide more direct operations on XML as well as improve the scalability and efficiency of these operations. In this paper, we describe a robust and comprehensive framework for performing Extract-Transform-Load (ETL) of XML. This includes (i) full computational model and engine capabilities to perform these operations in an ETL flow, (ii) an approach to pushing down XML operations into a database engine capable of supporting XML processing, and (iii) methods to apply partitioning techniques to provide scalable, parallel processing for large XML documents. We describe experimental results showing the effectiveness of these techniques.
Parallel and Distributed Approach for Processing Large-Scale XML Datasets
An emerging trend is the use of XML as the data format for many distributed scientific applications, with the size of these documents ranging from tens of megabytes to hundreds of megabytes. Our earlier benchmarking results revealed that most of the widely available XML processing toolkits do not scale well for large sized XML data. A significant transformation is necessary in the design of XML processing for scientific applications so that the overall application turn-around time is not negatively affected. We present both a parallel and distributed approach to analyze how the scalability and performance requirements of large-scale XML-based data processing can be achieved. We have adapted the Hadoop implementation to determine the threshold data sizes and computation work required per node, for a distributed solution to be effective. We also present an analysis of parallelism using our PIXIMAL toolkit for processing large-scale XML datasets that utilizes the capabilities for parallelism that are available in the emerging multi-core architectures. Multi-core processors are expected to be widely available in research clusters and scientific desktops, and it is critical to harness the opportunities for parallelism in the middleware, instead of passing on the task to application programmers. Our parallelization approach for a multi-core node is to employ a DFA-based parser that recognizes a useful subset of the XML specification, and convert the DFA into an NFA that can be applied to an arbitrary subset of the input. Speculative NFAs are scheduled on available cores in a node to effectively utilize the processing capabilities and achieve overall performance gains. We evaluate the efficacy of this approach in terms of potential speedup that can be achieved for representative XML data sets.

