Project ID:
RFP-2007-018
Title:
Parallel XML Document Parsing with Multi-core Processors
Summary:
Parsing has been recognized as a performance bottleneck in XML processing. One cost-effective way to improve parsing performance is to use parallel algorithms that take advantage of multi-core processors. Parallel parsing for the XML Document Object Model (DOM) has been proposed, but the existing schemes do not scale well with the number of processors. Further, there is little discussion of parallel parsing methods for other parsing models. The question is: how can we improve parallel parsing for DOM and other XML parsing models when multi-core processors are available?
Full Description:
XML is a text-based, human-readable document format for structured information. It is the de facto standard interoperable data format for communication between heterogeneous systems. As XML documents become more prevalent, the demand for high-performance XML parsing grows rapidly. We believe that parallel algorithms, together with multi-core processors, can be a cost-effective way to improve XML parsing performance. The speedup of a well-designed parallel algorithm should scale with the number of processors. However, existing schemes for parallel DOM parsing do not demonstrate this behavior. What is the bottleneck in the existing parallel DOM parsing designs? Is the bottleneck a fundamental characteristic, or a design defect? Can we remove such a bottleneck? What are the tradeoffs? In addition to DOM, we are also interested in exploring parallel algorithms for other parsing models, such as SAX, StAX, and VTD. How do parallel algorithm designs for these models differ from those for parallel DOM? Is there any fundamental technique that can boost parsing performance regardless of the parsing model? How does the performance of these designs vary across processor architectures? We encourage research on algorithms, and development of a software library, for parallel XML parsing on multi-core systems.
Constraints and other information:
IPR will stay with the University. Cisco expects customary scholarly dissemination of results, and hopes that promising results will be made available to the community without limiting licenses, royalties, or other encumbrances. Contribution of a working prototype to the open-source community is encouraged.
Proposal submission:
Please use the link below to submit a brief (max. 1-page) statement of interest. Where appropriate, propose a possible approach to the topic. We will proceed toward any awards in two phases. After reviewing the statements of interest received (phase 1), we will contact the most promising teams to explore developing more complete proposals (phase 2). Phase 2 proposals will be developed, submitted, and evaluated for possible funding. Typically, evaluation will occur in the subsequent calendar quarter.
Create/submit a statement of interest for this RFP
Statements of interest will be evaluated continuously. For consideration in the first round, statements of interest should be submitted by June 30, 2007.
Cisco will initiate the first round of Phase 2 contacts by July 31, 2007.
Questions? Contact: research@cisco.com