Recently the focus on increased insights and data-driven decisions has become paramount from the boardroom to marketing, and even to the production floor. Those insights and decisions were promised by the world of “Big Data”. The hype around big data is unmistakable. You can’t miss it. I think Big Data will perform at the Super Bowl halftime show. Well, maybe not, but you can bet that the stadium, advertisers, teams and vendors are all using big data technologies to be their best for that big game.
I speak with customers that are driven to implement a big data solution, and yet I’ve never seen more people working to learn and implement a technology with so little definition on how they plan to leverage it for their business. They know a few things…
These sound good in theory, so let’s take them one by one.
1. Big Data might allow them to increase the value and variety of decisions they make
The foundation of Big Data is a distributed processing platform and file system, such as MapReduce/YARN/Tez and HDFS. These allow you to work with data that is less traditional in form and volume. Through that, you may be able to analyze a wider variety of things, and possibly do so faster than if all the data had to go through the same normalization and ETL process. That means we need a properly sized Hadoop platform with all the right tools, and a methodology for how we are going to approach these types of data challenges. Without guidance from a partner who has been there before, this takes time and trial and error to develop. This is why many of our clients are reaching out for help in accelerating their big data insights process.
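To make the idea of working with data "as it lands" a little more concrete, here is a minimal sketch of the map and reduce steps in plain Python. It is not Hadoop code. The log lines, field layout and function names are made up for illustration, and a real cluster would spread the same two steps across many machines.

```python
from collections import Counter
from functools import reduce

# Hypothetical raw web log lines: no fixed schema, no upfront ETL.
RAW_LINES = [
    "2015-01-12 10:01:07 GET /products/42 200",
    "2015-01-12 10:01:09 GET /cart 500",
    "malformed line",
    "2015-01-12 10:02:11 POST /checkout 200",
    "2015-01-12 10:02:15 GET /products/42 404",
]


def map_line(line):
    """Map step: emit (status_code, 1) for a parseable line, nothing otherwise."""
    fields = line.split()
    if len(fields) == 5 and fields[4].isdigit():
        return [(fields[4], 1)]
    return []  # skip lines we cannot interpret instead of failing the job


def reduce_counts(totals, pair):
    """Reduce step: fold one (key, count) pair into the running totals."""
    key, count = pair
    totals[key] += count
    return totals


def count_status_codes(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    mapped = (pair for line in lines for pair in map_line(line))
    return dict(reduce(reduce_counts, mapped, Counter()))


if __name__ == "__main__":
    print(count_status_codes(RAW_LINES))
```

Notice that the structure is applied when the data is read, not before it is stored. That is what lets you explore a new data source before deciding whether it deserves a place in the warehouse.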
2. These technologies might provide a way to get around the limitations of traditional BI systems
I’m not sure about this one. I do believe that in many cases we can deconstruct and prepare data for analysis in a more agile way using some of the tools the Hadoop ecosystem provides. That said, at that point we have only prepared the data for analysis; we have not actually done anything useful with it. That will require a client tool like Power BI, or another analytics platform. Our new data will likely need to be aligned with corporate enterprise BI data to be really useful, so once we’ve found the valuable pieces of that data, we often end up integrating them into our enterprise BI environment. So often we’re not getting around limitations. Instead, we’re becoming more agile in how we identify data that’s valuable in the context of our business, and more efficient in the size and scope of the data that we eventually integrate into our enterprise BI solutions. This helps us control data warehouse sprawl and keeps the EDW valuable for the business.
3. They believe it to be a cost-effective and rapid time-to-value option
Some companies think they will deploy an amazing big data solution on hardware they have lying around. This is fine for a prototype, but the only real way to deploy big data in a cost-efficient, rapid and scalable way is to look to the cloud. The cloud provides a scalable set of services, infrastructure and storage that allows companies to explore big data’s capabilities and value without building out more datacenter floor space. Because cloud storage and compute are separate, you can take a multi-tenant approach to big data that is virtually impossible to deliver on-premises. This allows for scale-on-demand approaches, saving significant dollars and administrative overhead. Automation capabilities in cloud platforms provide this functionality and allow it to be controlled either by central IT or by the business analysts as needed. Pragmatic Works deploys 90% of our customers’ big data solutions in Microsoft’s Azure cloud for rapid time-to-value and a much better 12-month ROI.
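The scale-on-demand argument comes down to simple arithmetic, sketched below. The function names and the 730-hour month are assumptions for illustration, not real Azure pricing, and the sketch assumes compute is billed only while the cluster is running. Storage is billed separately and is left out.

```python
def monthly_compute_cost(nodes, hourly_rate_per_node, hours_running):
    """Compute cost for a cluster that is billed only while it is running."""
    return nodes * hourly_rate_per_node * hours_running


def on_demand_savings(nodes, hourly_rate_per_node, hours_used, hours_in_month=730):
    """Difference between leaving a cluster on all month and running it only when needed.

    All inputs are hypothetical; plug in your own node count, rate and usage.
    """
    always_on = monthly_compute_cost(nodes, hourly_rate_per_node, hours_in_month)
    on_demand = monthly_compute_cost(nodes, hourly_rate_per_node, hours_used)
    return always_on - on_demand


if __name__ == "__main__":
    # Made-up example: 4 nodes at a rate of 0.5 per node-hour, used 100 hours a month.
    print(on_demand_savings(nodes=4, hourly_rate_per_node=0.5, hours_used=100))
```

The fewer hours a month you actually need the cluster, the bigger the gap gets, which is why separating storage from compute matters so much.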
4. They believe their team can adapt and learn the tools quickly
The most significant point of confusion when investing in big data is around all the different tools, services and components that make up the ecosystem. Check out here for more on that. There is Hadoop, Pig, Hive, Oozie, Flume, Storm, HDInsight, Cloudera, Impala, and so many others. Companies don’t know which way to go and are nervous about getting locked into a vendor or a direction. Sometimes they even go so far as to try the completely open source route, thinking that will be better, with no vendor to get locked into. The problem is that without an enterprise behind your enterprise solution, you are left without support, training and a system for upgrades and improvements to the product. That puts a lot of pressure on your team. Many teams are already working full time in their current roles and don’t have time to integrate 20 new technologies into their daily life. The cloud comes to the rescue again with products like Microsoft Azure’s HDInsight, a managed big data service in the cloud. This service can be scaled as needed and combines with Azure’s other secure cloud services (they have added over 200 new services in the last year. Wow.). It provides all the core Hadoop ecosystem components, NoSQL options, virtual infrastructure and tons of integration options for your applications and enterprise systems. We are seeing customers excited about realizing their big data vision by leveraging a managed platform like this. Best of all, since it’s the cloud, you only pay for what you use. You’re still depreciating those old servers in the data center, aren’t you? I thought so.
5. Their business is counting on it to provide insight into business growth and efficiency opportunities
Now that you’ve got this amazing, distributed, scale-on-demand big data platform in the cloud, what are we doing with the data? Herein lies the big secret of big data: it doesn’t solve all your problems. You still need to do that. What a good big data implementation will allow you to do is iterate through data organization, curation and publication faster, and provide the result as a scalable source for more in-depth analysis. Enter machine learning and predictive analytics. These technologies work with the big data ecosystem (and other data) to help you make decisions about what you expect your business, customers or market to do, allowing your stakeholders to plan more appropriately and effectively. These are the technologies that consume this data and provide real insight. Many companies are looking for a “Data Scientist” to drive these initiatives, but Microsoft has done a nice job of working with their internal data scientists and their Azure teams to create a managed and scalable Azure Machine Learning service. It goes a long way toward democratizing this technology and making it accessible to companies willing to explore it. Pragmatic Works is doing a lot in this space, helping clients adapt from traditional BI to a more forward-looking approach using predictive analytics. Our customers are seeing significant value and ROI from their investments here. Remember, cloud = scalable, pay for what you use, and so on, so this fits that model without having to hire a team of PhDs to run it for you. In many cases we’re able to train the existing team to leverage this platform. That is very exciting for our clients.
As you can see, Big Data is an exciting technology development, but as we watch companies adopt these technologies, we are paying careful attention to how they can be successful and not just initiate another overloaded data project. Our techniques and cloud adoption have resulted in our initiatives having a significantly higher adoption rate than the industry average.
If you want to find out how to take your business forward into a more insight-driven world, contact us here at Pragmatic Works.
See you out there…
President, Pragmatic Works Consulting