Introducing DMX-h Release 8

The best end-to-end approach for offloading legacy workloads into Hadoop

Break Free from Hadoop Complexity! Collect, Prepare, Blend, Transform & Distribute Data Seamlessly with DMX-h.

DMX-h is specifically designed to remove barriers to mainstream Hadoop adoption and to deliver the best approach for shifting heavy workloads from expensive data warehouses and mainframes into Hadoop.

DMX-h Sort Edition: A Smarter Sort for A Big Data Platform

Maximize the return on your Hadoop investment by increasing the scalability and efficiency of every node in the cluster.

For over 40 years, Syncsort has been the undisputed leader in high-performance sort for mainframe and open systems. Now you can benefit from the same technology in Hadoop. There is no need to change existing code: simply plug in DMX-h Hadoop Sort to seamlessly accelerate MapReduce operations in Hadoop deployments.


Smarter Hadoop Sort Means Faster MapReduce

As much as 80% of all ETL processing time is spent sorting data. Joins, aggregations, rankings, database loads, and more all depend on sorting data. Hadoop is no exception. In fact, every MapReduce job with a Reduce phase sorts data in both the Map and the Reduce steps. Unfortunately, the native Hadoop sort has limited performance capabilities, which can force organizations to spend precious IT resources tuning jobs or to add more nodes to achieve the desired performance.
Thanks to Syncsort’s recently committed contribution to the open source community (MAPREDUCE-2454), sort is now a pluggable component of Hadoop. This means you can now run DMX-h, the fastest, most efficient sort tool, natively within the MapReduce framework to seamlessly optimize Map-Sort and Reduce-Merge operations.
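
To picture where the Map-Sort and Reduce-Merge steps sit in a job, here is a minimal toy model in Python. It is only an illustration of the general pattern (sort each map task's output per partition, then merge the sorted runs on the reduce side). It is not Hadoop's or DMX-h's implementation, and the function names `map_sort`, `reduce_merge`, and `run_job` are hypothetical.

```python
import heapq
import zlib
from itertools import groupby
from operator import itemgetter


def map_sort(records, num_partitions):
    """Map side: assign each (key, value) pair to a partition, then sort each partition by key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        # Deterministic partitioner (an assumption for this sketch).
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, value))
    return [sorted(p, key=itemgetter(0)) for p in partitions]


def reduce_merge(sorted_runs, reducer):
    """Reduce side: merge already-sorted runs, then apply the reducer to each key group."""
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    return [
        (key, reducer(value for _, value in group))
        for key, group in groupby(merged, key=itemgetter(0))
    ]


def run_job(splits, num_reducers, reducer):
    """Run one map task per input split, then one reduce task per partition."""
    map_outputs = [map_sort(split, num_reducers) for split in splits]
    results = []
    for r in range(num_reducers):
        runs = [output[r] for output in map_outputs]
        results.extend(reduce_merge(runs, reducer))
    return dict(results)


if __name__ == "__main__":
    splits = [
        [("a", 1), ("b", 1), ("a", 1)],
        [("b", 1), ("c", 1)],
    ]
    print(run_job(splits, num_reducers=2, reducer=sum))
```

Every record passes through a sort on the map side and a merge on the reduce side, which is why the efficiency of the sort component affects the whole job.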


New and Expanded Use Cases

DMX-h Sort Edition enables more sophisticated manipulation of data, making Hadoop a more robust environment for the enterprise.
With DMX-h Sort Edition, organizations can more easily implement and optimize new use cases typical of enterprise ETL implementations, including:

  • Hash aggregations. Optimized hash-based aggregations can provide a significant performance benefit for applications such as log analysis and queries on large data volumes.

  • Optimized full joins for change data capture (CDC). Critical data warehouse processes such as CDC require a full join.

  • Run jobs with a subset of data. Many applications, including data sampling, require processing only a subset of the data, for example first-N matches or LIMIT N queries.

  • No-sort option. Skip the sort altogether when it is unnecessary or redundant, minimizing wasted resources.
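
The first three use cases above can be sketched in a few lines of plain Python. This is a conceptual illustration under simple assumptions (data fits in memory, snapshots are dictionaries keyed by primary key). It does not show DMX-h's API or internals, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import islice


def hash_aggregate(records):
    """Sum values per key in a single pass using a hash table; no sort is needed."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return dict(totals)


def cdc_full_join(old, new):
    """Full outer join of two snapshots on their keys, emitting change records.

    A full join is required because inserts exist only in `new`,
    deletes exist only in `old`, and updates exist in both.
    """
    changes = []
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            changes.append(("insert", key, new[key]))
        elif key not in new:
            changes.append(("delete", key, old[key]))
        elif old[key] != new[key]:
            changes.append(("update", key, new[key]))
    return changes


def first_n(records, predicate, n):
    """Return the first n matching records, stopping as soon as n are found."""
    return list(islice(filter(predicate, records), n))


if __name__ == "__main__":
    print(hash_aggregate([("GET", 1), ("POST", 1), ("GET", 1)]))
    print(cdc_full_join({1: "a", 2: "b", 3: "c"}, {2: "b", 3: "x", 4: "d"}))
    print(first_n(range(100), lambda x: x % 2 == 0, 3))
```

The hash aggregation never orders its input, and `first_n` stops reading once it has enough matches, which is the kind of avoided work these use cases are about.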


Smarter Scalability

A smarter Hadoop sort also means smarter scalability. DMX-h Hadoop Sort Edition can help organizations process more data in less time without the need to constantly add more nodes to the cluster.

DMX-h Sort Edition optimizes the performance, CPU utilization, and memory utilization of all sort-intensive computations by dynamically adapting to hardware architectures, operating systems, and data characteristics. The result is maximum vertical scalability that fully exploits the processing power of each node.