Data science is an incredibly important field which is gaining popularity now that technology has made it cheaper and easier to store giga/tera/petabytes of data. At its core, data science isn’t necessarily a technical field, but more of an academic one. Wikipedia says, “Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.” Nate Silver once stated, “I think data-scientist is a sexed up term for a statistician.”
Nowadays (2018, at the time of this post), data scientists are often much more than statisticians who hang around the office whiteboards. The tools to ingest and process data have evolved greatly in the last decade, forcing the people who use them to do the same. Depending on the team and the goal, a data scientist may be working on spreadsheet formulas or programming directly in the AWS cloud.
In fact, the goals of a data science team can be incredibly diverse from company to company. For example, data scientists can…
help optimize call centers or sales floors
help optimize warehouses and logistics
help optimize user experience of a website or app
help protect customers and companies
help us understand the natural world
They can do these things by…
predicting volume spikes in locations or times
developing tools that use machine learning
building or overseeing the infrastructure that collects data
calculating the dollar value of events or actions
building models for mitigating risk
building models to detect fraudulent activity
creating recommendation engines
building and running simulations
building metrics dashboards
communicating, communicating, communicating
Because of this spread of specialties, it can be difficult to find the right data scientist for your team. It can be even harder to build teams around the data scientists that you already have.
You’ll want to start with the output: what are you trying to optimize or predict? Next, decide on the timeline. Are you going to build a system to process incoming data in real time, or are you just trying to make some one-time decisions?
Let’s talk about Airbnb’s data science team, which has been trying to improve its dynamic pricing by poring over all of its booking data to predict surges in booking for specific dates and locations. The team’s goals involved not only aggregating data, but also creating mathematical formulae which had to be fed into machine learning models. Those models, once trained, needed to be integrated into the Airbnb product itself. Clearly, there is more than just statistics at work. This team can only function with a combination of statistics, infrastructure management, machine learning, and application development.
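To make the "aggregate, then predict surges" idea a little more concrete, here is a minimal sketch in Python. It is not the method described in the example above, just a toy illustration: it counts bookings per location and date, then flags dates whose count sits well above that location's average. The function names, the sample data, and the threshold of 2 standard deviations are all assumptions made up for this example.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def daily_counts(bookings):
    """Aggregate raw (location, date) records into counts per location per date."""
    counts = defaultdict(Counter)
    for location, date in bookings:
        counts[location][date] += 1
    return counts


def find_surges(bookings, threshold=2.0):
    """Return (location, date, count) for dates whose booking count is more than
    `threshold` standard deviations above that location's mean.

    Simplification: dates with zero bookings never appear in the input,
    so they are not part of the baseline.
    """
    surges = []
    for location, per_date in daily_counts(bookings).items():
        values = list(per_date.values())
        avg, spread = mean(values), pstdev(values)
        if spread == 0:
            continue  # every date looks the same, so nothing stands out
        for date, n in sorted(per_date.items()):
            if (n - avg) / spread > threshold:
                surges.append((location, date, n))
    return surges


# Hypothetical sample data: one booking a day, then a spike on the 10th.
bookings = (
    [("Austin", f"2018-03-{day:02d}") for day in range(1, 10)]
    + [("Austin", "2018-03-10")] * 12
)
print(find_surges(bookings))
```

A real pricing system would go much further (seasonality, local events, a trained model instead of a fixed threshold), which is exactly why the team needs more than one skill set.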