What changes in the cloud computing and big data landscape should we be
expecting in 2013?
In this article we offer a round-up of industry experts' opinions as they
were asked by Cloud Expo / BigDataExpo Conference Chair Jeremy Geelan to
preview the fast-approaching year ahead.
2013 Will Be The Year of Big Data | The Internet of Things | Cloud To The
Rescue (DR) | SSD
John Engates | @jengates
CTO of Rackspace Hosting
Now its CTO, John joined Rackspace in August 2000, just a year after the
company was founded, as VP of Operations, managing the datacenter operations
and customer-service teams. Two years later, when Rackspace decided to add
new services for larger enterprise customers, he created and helped develop
the Intensive Hosting business unit. Most recently, he has played an active
role in the evolution and evangelism of Rackspace's cloud computing strategy.
When we last left Oracle’s big data plans, there was definitely a missing
piece. Oracle’s Big Data Appliance as initially disclosed at last fall’s
OpenWorld was a vague plan that appeared to be positioned primarily as an
appliance that would accompany and feed data to Exadata. Oracle did specify
some utilities, such as an enterprise version of the open source R
statistical processing program that was designed for multithreaded execution,
plus a distribution of a NoSQL database based on Oracle’s BerkeleyDB as an
alternative to Apache Hive. But the emphasis appeared to be on the extraction
and transformation of data for Exadata via Oracle’s own utilities, which were
optimized for its platform.
With Oracle’s announcement of general availability of the big data
appliance, it is filling in the blanks.
As such, Oracle’s plan for Hadoop was competition, not for Cloudera (or... (more)
If you look at the predictions made for 2012, you will find a new entry that
was not there last year. Be it Gartner, Forrester or McKinsey – “Big
Data” finds a place in the predictions.
So, what is big data? Is it the next path-breaking technology that will
change everything, or is it just hype that will die down after some time?
Let us take a realistic look at what the term big data means and what
problems it can solve.
What is "Big Data"?
(The Wikipedia page on Big Data is not that good. The clearest explanation I
have found is from O’Reilly Radar – here is the link)
Here is a short explanation.
Big Data is the name given to the class of technologies that need to be
used when your data volume grows so large that RDBMS technologies can no
longer handle it.
Big data spans three dimensions (taken from this IBM article):
Variety – Big data extends beyond st... (more)
My prediction for 2013 is that competitive advantage will translate into
enterprises using sophisticated Big Data analytics to create a new breed of
applications - Intelligent Applications.
"It's more than just insights from MapReduce", a CIO from a Fortune 100 company told
me, "It's about using data to make our customer touch points more engaging,
more interactive, more intelligent."
So when you hear about "Big Data solutions," you need to translate that into
a new category of "Intelligent Applications." At the end of the day, it's not
about people poring over petabytes of data. It's actually about how one
turns the data into revenue (or profits).
This means that you MUST:
1. Start with the business problem first (preferably one with revenue upside versus cost savings)
2. Determine which data elements you can leverage AFTER #1
3. Define an analytical three-tier architecture (a... (more)
BigData (and Hadoop) are buzzwords and growth areas of computing; this article
will distill the concepts into easy-to-understand terms.
As the name implies, BigData is literally "big data" or "lots of data" that
needs to be processed. Let's take a simple example: the city council of San
Francisco is required to take a census of its population - literally, how many
people live at each address. City employees are employed to count the
residents. The city of Los Angeles has a similar requirement.
Consider two methods to accomplish this task:
1. Request all San Francisco residents to line up at City Hall and be
processed by the city employees. Of course, this is very cumbersome and
time-consuming because the people are brought to City Hall and processed one
by one - in scientific terms, the data are transferred to the processing node.
The people have ... (more)
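The census analogy maps directly onto Hadoop's map/reduce model: rather than shipping every record to one central node, the counting function is shipped to where the data already lives, and only the small partial results are merged. A minimal sketch in plain Python (the neighborhoods and records are made up for illustration):

```python
# Contrast the two census approaches: central counting vs. counting
# locally per neighborhood and merging partial results (map/reduce).
from collections import Counter

# Hypothetical census records: (address, resident_name), grouped by
# the "node" (neighborhood) where the data already lives.
neighborhoods = {
    "Mission": [("12 Valencia St", "Ana"), ("12 Valencia St", "Ben")],
    "Sunset":  [("7 Ocean Ave", "Chu"), ("9 Ocean Ave", "Dee")],
}

# Approach 1: bring every record to a central node, then count.
central = [rec for records in neighborhoods.values() for rec in records]
counts_central = Counter(addr for addr, _ in central)

# Approach 2 (map/reduce): count locally in each neighborhood ("map"),
# then merge the small partial counts ("reduce").
def local_count(records):
    return Counter(addr for addr, _ in records)

partials = [local_count(records) for records in neighborhoods.values()]
counts_distributed = Counter()
for partial in partials:
    counts_distributed.update(partial)

# Both approaches give the same answer; only the partial counters,
# not the raw records, had to move in the second approach.
assert counts_central == counts_distributed
```

The payoff is in what moves over the network: whole records in the first approach, tiny per-node summaries in the second.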
This article was authored by Matt McIlwain and was originally published on
Medium. For more of Matt's writing, you can follow him here!
Guest Post: Beyond Big Data and Benioff’s “AI Spring” to the Dawn of
Big Data, AI, Machine Learning, Hadoop, Predictive Analytics — we hear
these terms every day from companies such as Cloudera, Trifacta and Dato
(formerly GraphLab) that are securing many millions in financing. I believe
that 2015 will be the year when the conversation moves from Big Data to the
Dataware stack. Over the past twelve months we have seen many companies
emerge across the big data spectrum, and while the language can sound the
same, clear product categories have emerged that describe the market
opportunity and future growth.
This is the Dataware stack. Dataware is the combination of infrastructure,
data intelligence systems tha... (more)
Red Hat on Wednesday announced its Big Data direction and solutions to
satisfy enterprise requirements for highly reliable, scalable, and manageable
solutions to effectively run their Big Data analytics workloads. In addition,
Red Hat announced that the company will contribute its Red Hat Storage Hadoop
plug-in to the Apache Hadoop open community to transform Red Hat Storage into
a fully supported, Hadoop-compatible file system for Big Data environments,
and that Red Hat is building a robust network of ecosystem and enterprise
integration partners to deliver comprehensive Big Data solutions.
Red Hat Big Data infrastructure and application platforms are suited for
enterprises leveraging the open hybrid cloud environment. Red Hat is working
with the open cloud community to support Big Data customers. Many enterprises
worldwide use public cloud... (more)
Master Data Management (MDM) is an important aspect of data governance in
enterprises. MDM enables the development of a "Single Version of Truth" by
providing common descriptions for enterprise-wide entities.
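As a toy illustration of what a "Single Version of Truth" means in practice (all names and records below are hypothetical), MDM boils down to resolving variant records held by different systems to one master record through a common matching key:

```python
# Variant records for the same customer from two source systems
# (hypothetical data for illustration).
crm_record     = {"name": "Robert Smith", "email": "bob@example.com"}
billing_record = {"name": "Bob Smith",    "email": "BOB@EXAMPLE.COM"}

def master_key(record):
    """Derive a common matching key -- here, the normalized email."""
    return record["email"].strip().lower()

# Both variants resolve to the same master entity...
assert master_key(crm_record) == master_key(billing_record)

# ...so the master index holds a single golden record carrying the
# agreed-upon common description for this customer.
master = {
    master_key(crm_record): {"name": "Robert Smith",
                             "email": "bob@example.com"}
}
```

Real MDM systems use far richer matching (fuzzy name comparison, survivorship rules), but the shape is the same: many variant records in, one governed record out.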
Need for MDM in Big Data Processing
Before Big Data, enterprises generally managed their transaction data in
traditional relational databases. One of the biggest strengths of relational
databases is their ability to enforce constraints like check constraints,
primary key, foreign key, etc., which ensure that the data captured is of the expected quality.
In spite of such support for data integrity, enterprises had duplicates in
their master data that resulted in inaccurate results in analytics on that
data. For example, an enterprise may target an expensive advertisement
campaign for a new product to its existing c... (more)
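The constraint enforcement that relational databases provide, as described above, can be sketched with Python's standard-library sqlite3 module (table and column names here are illustrative): a primary key rejects duplicate identifiers, and a CHECK constraint rejects invalid values, yet neither catches the same customer entered twice under different identifiers, which is the gap MDM fills.

```python
# Demonstrate relational integrity constraints with an in-memory SQLite
# database: the engine itself rejects rows that violate them.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        email TEXT UNIQUE,
        age   INTEGER CHECK (age >= 0)
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'ana@example.com', 34)")

bad_rows = [
    "INSERT INTO customer VALUES (1, 'ben@example.com', 28)",  # duplicate primary key
    "INSERT INTO customer VALUES (2, 'chu@example.com', -5)",  # negative age fails CHECK
]
rejected = []
for stmt in bad_rows:
    try:
        conn.execute(stmt)
    except sqlite3.IntegrityError:
        rejected.append(stmt)  # the database refused the row

# Both invalid rows were rejected by the constraints.
assert len(rejected) == 2
```

Note what the constraints cannot do: a second row `(3, 'ana.smith@example.com', 34)` for the same person would be accepted without complaint, which is exactly how duplicate master data accumulates.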
What do you get when you combine Big Data technologies like Pig and
Hive? A flying pig?
No, you get a “Logical Data Warehouse”.
My general prediction is that Cloudera and Hortonworks are both aggressively
moving to fulfill a vision that looks a lot like Gartner’s “Logical
Data Warehouse” – namely, “the next-generation data warehouse that
improves agility, enables innovation and responds more efficiently to
changing business requirements.”
In 2012, Infochimps (now CSC) leveraged its early use of stream processing,
NoSQLs, and Hadoop to create a design pattern which combined real-time,
ad-hoc, and batch analytics. This concept of combining best-in-breed Big
Data technologies will continue to advance across the industry until the
entire legacy (and proprietary) data infrastructure stack is replaced
with a new (and open) one.
As this is happening, I predi... (more)
Compuware Corporation on Wednesday announced that Compuware APM for Big Data
now offers enhanced support and out-of-the-box dashboards that enable
organizations to optimize big data projects through unmatched visibility into
Hadoop, NoSQL and Cassandra deployments. Now organizations have deeper
insight into big data workloads and transactions to quickly find the root
cause of slow jobs and failures in minutes, instead of hours or days.
Enhancements for Hadoop enable operations teams to gain deep insight into the
most active users in a cluster with automatic profiling of intensive jobs.
Problem patterns, including data shuffle across the network, can be quickly
identified, as can resource utilization and tracking to enable charge-back.
New enhancements in Compuware APM for Big Data include:
Enhanced out-of-the-box, zero configuration dashboards for Hadoop, ... (more)
DALLAS, Aug. 21, 2014 /PRNewswire-iReach/ -- Amid the proliferation of real
time data from sources such as mobile devices, web, social media, sensors,
log files and transactional applications, Big Data has found a host of
vertical market applications, ranging from fraud detection to R&D.
Photo - http://photos.prnewswire.com/prnh/20140821/138541
"Big Data Market: 2014 – 2020 – Opportunities, Challenges, Strategies,
Industry Verticals & Forecasts"
- In 2014 Big Data vendors will pocket nearly $30 Billion from hardware, software and professional services revenues
- Big Data investments are further expected to grow at a CAGR of nearly 17% over the next 6 years, eventually accounting for $76 Billion by the end of 2020
- The market is ripe for acquisitions of pure-play Big Data startups, as competition heats up between IT incumbents
- Nearly every large scale IT ven... (more)