Speaking to CBR, Farnell had plenty to say about the importance of getting data landing right and about how providers are failing to fully understand it.
"If you don’t do a great job of data ingesting a lot of the data landing principles around metadata management, data quality and capture. If you don’t do that correct up front, you have no shot of being successful downstream."
Unfortunately, for such an important part of the data pipeline, it appears this isn’t being done well, either by the business or by those providing the tools.
"There are a lot of providers that just don’t get it. They understand the relational world, they understand the structured world but they have no idea how to be wildly successful with landing log files, unstructured data…all kinds of these other elements."
While he believes that clients have worked with providers to do a ‘decent job’ of reliably landing data, the amount of data that’s being created and used both from inside and outside the organisation today means that systems and processes can’t keep up.
"A lot of the systems, a lot of the processes and a lot of the capabilities that exist that clients use today just aren’t built and can’t keep up with the pace of innovation that we’re seeing."
This is why he believes that operating as an open source company is so important, and why the company’s acquisition by Teradata was so strategic.
The company aims to keep on top of everything that is happening in open source and to blend those developments into its methodology and the patterns for how it works with clients.
Taking a product approach requires vast amounts of money and being really good at picking a sweet spot to operate in.
"You have to be really, really good at picking a sweet spot, move at the speed of light and raise either hundreds of millions, billions of dollars like you see Cloudera or Hortonworks doing.
"Or if you’re an existing company like IBM or Teradata, you have to spend massive amounts on R&D to keep your products relevant to the pace that is happening in open source.
Farnell highlights the challenges facing companies in the open source Big Data world, and you can see from recent product releases from the likes of IBM how such vendors are trying to stay relevant.
One of the trends to stay on top of is streaming analytics: the idea of having real-time access to your data so it can be analysed as it arrives.
"Technologies like Spark, integrated with R and Python in conjunction with your whole Hadoop strategy is something that is becoming more and more prevalent as companies become successful with standing up Hadoop environments for what it’s good for."
If the data landing and integration have been done correctly, Farnell sees predictive analytics as the ‘new world’ that business is stepping into.
However, it is not enough to predict once and simply roll out the model, he says: "You have to have an incredible listening capability to all the data that’s constantly moving and that you’re ingesting."
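That "listening capability" can be thought of as continuously comparing the data a model sees in production with the data it was trained on, as well as scoring it. The sketch below assumes a hypothetical model object with a predict method and a simple drift check; it illustrates the idea rather than Farnell’s own approach.

```python
# A minimal sketch of "listening" to streaming data: score each incoming batch
# and flag features whose live average drifts away from the training baseline.
# The names, interfaces and threshold are assumptions for illustration.
from statistics import mean


def listen_and_score(batches, model, training_means, drift_tolerance=0.25):
    """Score each batch and flag features drifting from the training data.

    `batches` yields lists of feature dicts, `model` exposes predict(record),
    and `training_means` maps feature name -> mean seen at training time.
    """
    for batch in batches:
        predictions = [model.predict(record) for record in batch]

        # Compare the live feature means against the training baseline.
        drifted = []
        for feature, baseline in training_means.items():
            live = mean(record[feature] for record in batch)
            if baseline and abs(live - baseline) / abs(baseline) > drift_tolerance:
                drifted.append(feature)

        if drifted:
            # In a real pipeline this would trigger an alert or retraining.
            print(f"Drift detected on {drifted}; model may need retraining")

        yield predictions
```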
Although the Big Data market is growing, deployments of the technology in businesses are often characterised as small – something that Farnell disputes.
"Just looking at our customers and knowing infinitely the Hortonworks, Cloudera, MapR and Amazon businesses – it is a massive, massive footprint."
Although the footprint is big, he draws a distinction between production systems and applications that run on or touch Big Data technologies, and production systems that are utilising social data.
While companies may have production systems or applications utilising Big Data, production systems utilising social data are far less common.
Farnell’s advice to companies deploying analytics is to start smart and scale fast; the company doesn’t want to look for tiny improvements.
"We don’t ever want to start our relationship with a client to re-platform something that already exists and do it just a little bit cheaper or a little bit fast – that’s never a good going-in position.
"Let’s think big, develop something or come up with a concept or idea that you just cannot do in business today."