Want to extract real value from your data? Better hire a data scientist or two.
In the past several months, large enterprises, staffing firms and universities have observed increasing interest in a new class of data professional: the data scientist. Requiring a curious blend of business, analytics and computer skills, this hot new role is on the march in verticals as diverse as energy, e-commerce, healthcare and financial services. And if experts are correct, this is just the beginning.
"Companies are becoming so data- and application-centric. They need individuals who can come to the table to model and mine in big data environments," says Laura Kelley, Houston vice president at IT consulting and staffing firm Modis.
What sets data scientists apart from other data workers, including data analysts, is their ability to create logic behind the data that leads to business decisions. "Data scientists extract data, formulate models and apply quantitative analysis in a proactive manner," Kelley says.
These hefty responsibilities command a high salary: $110,000 to $140,000 across the country, Kelley has found. "There are data scientist jobs available today - you just have to have the right combination of skills," she says.
Enter Michael Rappa, director of the Institute for Advanced Analytics at North Carolina State University in Raleigh. For the past six years, Rappa and his fellow professors have been refining their graduate program to develop ready-made data scientists.
"Data scientists have to draw structured and unstructured data from different sources, including real-time streams, and try to understand it to add value to the business," Rappa says. "It's not just about the volume of the data, but the variety and velocity of it."
Companies that attempt to handle big data with siloed statisticians, computer scientists or MBAs will fail, Rappa believes. Instead, they need professionals with a convergence of these skills to fully grasp the business and technological challenges.
MBAs understand business concepts such as product development and management, but aren't able to analyze and interpret data. Mathematicians and statisticians lack intimate knowledge of the business. "Data scientists must have an openness to solving business problems, not just be able to perform some nifty modeling. We educate students in a way that cuts across disciplines," Rappa says.
The approach appears to be working: 100% of the program's participants are placed before graduation. "They are highly sought after and highly paid," he says. In fact, the program recently expanded its annual enrollment from 40 to 80. "We doubled the size to meet the demand coming from the private and public sectors."
Rappa admits that the term "data scientist" is far more appealing than its component parts, such as statistician and computer scientist. "Data science captures the imagination," he says.
Eric Horn, education director at the Data Sciences Summer Institute at the University of Illinois at Urbana-Champaign, agrees that there is a certain mystique to data science, even though it has a heavy computer science influence.
For instance, his students as well as those at the university's Illinois Informatics Institute are trained in various machine learning algorithms, natural language processing and intelligent search algorithms. They also learn how to apply those algorithms in myriad domains such as healthcare.
Like Rappa, Horn has witnessed heightened interest in his program, but can't expand enrollment at this time due to funding.
Modis' Kelley feels educational opportunities will open up as more companies focus on data scientist skill sets. She encourages candidates with only part of the skill set - such as an MBA or a background in analytics or computer science - to fill out their résumés with degrees or certificates from tailored programs like Rappa's and Horn's.
The data scientist draft
At eBay's transaction arm PayPal, Chief Scientist Mok Oh is creating a fantasy data scientist team and he's hoping to unearth candidates like those being churned out by the programs at Horn's and Rappa's institutions.
PayPal plans to study the tens of petabytes of data its customers and partners generate to predict buying patterns. Oh wants to carefully blend spending and behavioral data to develop profiles and uncover trends that will help attract new customers to PayPal and its partner ecosystem.
Though Oh's ideal candidate would have all three skill sets - business, analytics and computer science - he has not found enough of them. "It's almost impossible to find those three heads in one body," he says. So instead, he's developing a data science team comprising all three disciplines:
*A majority - 80% - will be PhDs focused on machine learning, natural language processing and data mining.
*10% will be statisticians highly skilled in data modeling and analytics to develop key performance metrics.
*Another 10% will be MBAs who know the right questions to ask, such as "Why do people stop using PayPal?"
Oh is convinced this concentrated team - vs. dispersed silos of data analysts - will propel PayPal into the next generation and better serve its customers.
Donald Farmer, vice president of product management at business intelligence software maker QlikView, says most enterprises can make use of data scientists to improve processes and identify new business opportunities. For instance, in financial services, data scientists can develop algorithms for trading and risk management, and in pharmaceuticals they can study drug test results.
Farmer warns, though, that companies that bring on data scientists have to be able to tolerate failure. "Data science is all about experiments. Companies have to create structures at the edge of their organization that are not only entitled but are expected to fail. Otherwise, the data scientists aren't trying hard enough," he says.
It's a tough pill to swallow for cash-strapped organizations. "Sometimes you have to fail at a model to be able to rethink it properly - but that can be risky and expensive," says Ryan Swanstrom, author of "Data Science 101," a blog about his journey to become a data scientist.
Given the right environment, data scientists have the potential to strike gold for companies - especially those that, like PayPal, are focused on finding new customers and improving service to existing ones. Interestingly, Swanstrom feels that shaking up the trifecta of computer science, business and analytics with peripheral fields such as physics and psychology would improve results.
Modis' Kelley labels the data scientist role "a work in progress." "What companies called a data scientist a year ago is different than their requirements today," she says.
The only sure thing, according to Horn, is demand: "With the availability of data, there is certainly going to be opportunity for as many people as want to pursue data scientist training."
Gittlen is a freelance business and technology journalist in the greater Boston area.