As ‘big data’ evolves from buzzword to cutting edge to commonplace, it’s changing the world around it. Being ‘data-driven’ will no longer be a differentiator but a basic necessity, and without tapping into proprietary and open data, businesses will fail to keep up in a world of savvy start-ups.
Good software won’t cut it. A wide-angled view of customers and their contexts will be crucial. Social media, calendars and connected devices will make up a picture that algorithms will use for predictive analytics and to create personalised services.
For some, that stirs old fears about machines taking human jobs. However, the best companies will combine human talent and tech: if you’re a data scientist, big data shouldn’t make you obsolete anytime soon.
The price of a data scientist
Data scientists are accomplished technical specialists capable of using an array of tools to interrogate data. They answer the questions businesses ask of their data, plus the ones businesses didn’t even know they should be asking.
Yet, the shortage of data talent is evident. CrowdFlower’s 2016 data science report found that 83 per cent of respondents said there weren’t enough data scientists to meet demand.
Why? Some of the greatest data storage and processing technologies of recent years have come from a small set of the best engineering brains. For example, Hadoop grew out of ideas published by a handful of engineers at Google, and was then developed and open-sourced largely at Yahoo. Spark was developed at UC Berkeley’s AMPLab.
Although these technologies are huge in the open source tech community, there is a shortage of talent with the analytical experience to understand and deploy these complex technologies effectively.
This makes data scientists expensive: it’s the law of supply and demand. The median salary currently stands at USD 119,000, according to Glassdoor. Good talent comes at a premium, and offering a great place to work that also stretches scientists professionally will be crucial to hiring and keeping people.
Diffuse data
With all these dollars sunk into technical talent and infrastructure, it’s important for companies to get the most out of those investments. In practice, this means asking: how do we embed data into the DNA of an organisation?
Asking a small group of brains to re-orient a company is unlikely to end in success. Instead, data has to be diffused throughout the organisation. Data analysis is the combination of humans asking the right questions along with having the right tools to answer them. Allowing more people to ask questions should mean more ‘right questions’ and more ‘right answers’.
As such, data scientists should work closely with different types of employees to answer business questions. A data-driven business should cater for the needs of both data scientists and managers who want a higher-level overview. Data analytics transcends particular departments and seniority levels, and blurs the lines between technical and business functions.
To do that, data talent has to be matched with visionary leadership. Hiring a couple of PhDs will be beneficial, but without direction and support from the top, these highly paid data scientists may end up as expensive analysts who run a few SQL queries followed by the odd Tableau visualisation. Management needs to clearly define the key business questions that need to be answered and create roadmaps for the medium to long term, showing what software needs to be built or bought, and who needs to be hired to use it.
Will data scientists become obsolete?
Will big data eventually make data scientists redundant? Maybe. More software companies are offering all-in-one platforms to collect, preprocess and store data, and to write and deploy algorithms. Many applications even allow users to train and run a complex machine learning model without writing a line of code.
However, human intuition is still important in designing machine learning algorithms, particularly in ‘feature engineering’: deriving the key variables from a set of data for an algorithm to use. Let’s say we are predicting customer churn. The important feature may not be the dates when a customer joins and leaves but the time span in between. Tasks like this require creativity and can’t be left to machines just yet.
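To make the churn example concrete, here is a minimal sketch in Python of turning two raw dates into a single derived feature. The customer records, field names and the `tenure_days` helper are all invented for illustration; a real project would work from its own data and tooling.

```python
from datetime import date

# Hypothetical customer records, invented purely for illustration.
customers = [
    {"id": "a", "joined": date(2015, 1, 10), "left": date(2015, 4, 10)},
    {"id": "b", "joined": date(2014, 6, 1), "left": date(2016, 6, 1)},
    {"id": "c", "joined": date(2016, 2, 1), "left": None},  # still a customer
]


def tenure_days(record, as_of):
    """Derived feature: days between joining and leaving.

    For customers who have not left, measure up to the `as_of` date instead.
    """
    end = record["left"] if record["left"] is not None else as_of
    return (end - record["joined"]).days


# Assumed snapshot date for customers who are still active.
as_of = date(2016, 12, 31)

# One row per customer: the engineered feature plus the churn label.
features = [
    {
        "id": c["id"],
        "tenure_days": tenure_days(c, as_of),
        "churned": c["left"] is not None,
    }
    for c in customers
]

for row in features:
    print(row)
```

The code itself is trivial; the judgement lies in deciding that tenure, rather than either raw date, is the variable worth feeding to a model.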
Data scientists: you won’t be replaced any time soon.
Who knows how technologies like AI and machine learning will develop in the next few decades? For the time being, though, being a data scientist is just about the best place to be to take advantage of big data’s progress.