Syncsort’s Contribution to Apache Sqoop Moves Big Data from the IBM Mainframe to Hadoop
Powerful New Technology Contributed by Syncsort to the Apache Sqoop Open Source Project Will Allow Hadoop Users to Easily Import and Transform Mainframe Data
WOODCLIFF LAKE, N.J. – October 09, 2014
Syncsort, a global leader in Big Data software, today announced another milestone contribution to the Apache Hadoop ecosystem, incorporating powerful technology into the Apache Sqoop open source project that will allow Hadoop users to easily import and transform data coming from the IBM System z mainframe environment.
“Many organizations are looking to increase efficiency and save money by moving targeted mainframe data and workload processing to Hadoop,” said Charles Zedlewski, vice president, products, Cloudera. “Taken together, Apache Sqoop and Syncsort’s open source contributions will facilitate the importation and transformation of all types of mainframe data, allowing customers to take full advantage of Hadoop’s advanced analytical capabilities.”
As Hadoop has emerged as the dominant data processing platform for the enterprise, there is a growing need to rapidly move and transform mainframe data into an understandable next-generation Big Data format. Syncsort’s contributions to Apache Sqoop will make it much more cost-effective to store mainframe historical data in HDFS and will also help free up mainframe CPU cycles by allowing customers to move expensive data processing workloads from the mainframe to Hadoop.
The new technology is now committed as SQOOP-1272 and supports loading multiple mainframe data sets to each of the nodes in a Hadoop cluster in parallel and transforming them into any Apache Sqoop-supported file format. This makes it simple for organizations to integrate data from mainframe databases, such as DB2/z, IMS, Adabas, IDMS, and Datacom, with the rest of the data in a typical next-generation Big Data environment.
The contribution also features an open application programming interface (API) to allow anyone to extend support for more complex mainframe data files. Syncsort’s own award-winning DMX-h technology uses this open API, serving as a feature-rich add-on that can handle binary sequential data with COBOL copybook metadata and VSAM datasets. Syncsort’s DMX-h plug-in also allows seamless archiving of mainframe data to Hadoop, preserving its original mainframe record format.
“We will continue to be one of the most prolific contributors to the Apache Hadoop family of projects, adding open source technology that helps simplify and accelerate the process of offloading legacy workloads and data into Hadoop,” said Tendu Yogurtcu, vice president, engineering, Syncsort. “This new open source contribution extends Apache Sqoop with the ability to move Partitioned Data Sets, such as IBM DB2 dump files, from z/OS on the mainframe to Hadoop and to store the data in any Apache Sqoop-supported format.”
For more information on Syncsort DMX-h, click here. For more information on Sqoop, click here.
Syncsort provides fast, secure, enterprise-grade software spanning Big Data solutions in Hadoop to Big Iron on mainframes. We help customers around the world to collect, process and distribute more data in less time, with fewer resources and lower costs. 87 of the Fortune 100 companies are Syncsort customers, and Syncsort's products are used in more than 85 countries to offload expensive and inefficient legacy data workloads, speed data warehouse and mainframe processing, and optimize cloud data integration. Experience Syncsort at http://www.syncsort.com/en/TestDrive