According to a recent survey from TDWI, overall Hadoop adoption by enterprises is on the rise, with 60 percent of respondents planning on having Hadoop clusters in production by the first quarter of 2016. But what exactly is Hadoop and what can it do for us?
Forrester analyst Mike Gualtieri offered a “Hadoop for Dummies” (otherwise known as regular business people) tutorial on the firm’s blog last summer. Here’s how he explained the technology:
Hadoop is an open source project that offers a platform to store and manage big data. There are two important things to understand about it. The first is how Hadoop stores files, and the second is how it processes data.
Hadoop’s storage capabilities are extremely powerful. Its storage layer, the Hadoop Distributed File System (HDFS), spreads files across the machines in a cluster, so an organization can store both very large files and a great many of them. Companies are no longer encumbered by the storage limits of a particular node or server.
Hadoop also has a cool framework for processing data called MapReduce. Moving data over a network can be painfully slow when files are this large, so rather than shipping the data to the computation, MapReduce splits the data sets into smaller, independent chunks that are processed in parallel on the nodes where they are stored, which speeds up processing time.
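To make the split-and-process-in-parallel idea concrete, here is a minimal sketch of the pattern in plain Python. It is not Hadoop's API: the function names (`split_into_chunks`, `map_chunk`, `merge_counts`, `word_count`) and the sample text are made up for illustration, and threads on one machine stand in for the nodes of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(lines, chunk_size):
    """Split the input into smaller, independent chunks."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def map_chunk(chunk):
    """Map step: count the words in one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def word_count(lines, chunk_size=2, workers=4):
    """Hypothetical driver: split, map each chunk concurrently, then reduce."""
    chunks = split_into_chunks(lines, chunk_size)
    # Threads stand in for cluster nodes; this shows the structure of the
    # pattern, not the performance of a real distributed job.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(map_chunk, chunks))
    return reduce(merge_counts, partial_counts, Counter())


if __name__ == "__main__":
    sample = [
        "big data needs big storage",
        "hadoop stores big files",
        "mapreduce processes data in chunks",
    ]
    print(word_count(sample).most_common(3))
```

Because each chunk is counted without looking at any other chunk, the map step can run anywhere and in any order. Only the small partial results need to be brought together at the end, which is what lets the approach scale across many machines.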
How Hadoop is Conquering the Enterprise
As an open source technology that got its start in digital organizations, Hadoop’s challenge now is to scale to a variety of industries and types of companies, and to successfully integrate with more traditional IT platforms. As it expands across the enterprise and its ownership moves back and forth between citizen developers and central IT, run-of-the-mill IT professionals have to become data architects, analysts, and scientists.
If the TDWI survey is any indication, Hadoop is not only meeting this challenge, but proving essential. “Hadoop for the enterprise is driven by several rising needs,” said Philip Russom in a 2015 TDWI white paper. “On a technology level, organizations need data platforms to handle exploding data volumes. They also need a scalable extension for existing IT systems in warehousing, archiving, and content management. On a business level, everyone wants to get business value and other organizational advantages out of big data instead of merely managing it as a cost center.”
And Hadoop has another trick up its sleeve – analytics. “Hadoop is not just a storage platform for big data: it’s also a computational platform for business analytics,” said Russom. “This makes Hadoop ideal for firms that wish to compete on analytics, as well as retain customers, grow accounts, and improve operational excellence.”
Then there are companies like Cloudera that have built a data management and analytics platform on Apache Hadoop and the latest open source technologies. Operations departments are using it to create an enterprise data “hub”: a single analytic data management platform that handles a variety of data to ensure optimal service and product delivery.
For more on Hadoop and what it does, head over to the QuickBase Fast Track blog.