Amid Continuous Change, the Data Science Revolution Adds Value to Your Organization: Part One in a Three-Part Series
Data scientists help decision-makers shift from ad hoc analysis to an ongoing conversation with data. — Harvard Business Review
The history of data science is short, but its roots are deep. Enabled over the last 15 years or so by rapid increases in computing power and in the volume of data that can be stored, the practice of data science has become critical to operating in the Digital Age.
As far back as the late 19th century, machines helped accelerate the process of sifting through and organizing data. For example, in 1890, the U.S. Census Bureau used a newly invented “tabulating machine” to process census data recorded on punch cards, cutting a government activity that previously took seven years to just 18 months.
Today, at a time when the flow of data and technological innovation is swift and continuous, organizations find that the time needed to store, manage, mine, and analyze unimaginable amounts of data can be slashed to days, hours, or seconds. Recently, CVP technologists working with a government organization cut the processing time for an immense data set from four weeks to three days by reconfiguring the system to use a distributed Big Data cluster and by automating manual steps.
CVP’s Data Science and Engineering Practice embodies the company’s Continuous Everything approach to modernization, aligning its competencies with the constant change in the technology environment. The Practice’s four competencies (data systems, data engineering, predictive analytics, and descriptive analytics) reflect the components integral to the data lifecycle:
- Procuring, managing, and optimizing tools and systems (data systems)
- Identifying and transforming source data so it can be properly analyzed (data engineering)
- Using data to project outcomes and challenges (predictive analytics)
- Analyzing and describing data in the context of the organization’s mission (descriptive analytics)
The data systems competency is the foundation upon which the other competencies are built. It provides the technology crucial to collecting and managing data. CVP’s data systems specialists have in-depth knowledge of the unique requirements of this work, such as determining the best way to set up Hadoop, Spark, and Presto clusters to solve problems relating to Big Data and computation. This team also makes extensive use of automation to improve system reliability and reduce costs.
Descriptive and predictive analytics represent the front end of the data lifecycle. Descriptive analytics concentrates on an organization’s mission objectives. Analysts evaluate the data and create summary views in the form of dashboards, reports, and visualizations that organization leaders and a broad spectrum of stakeholders can easily understand.
In contrast, CVP data technologists use predictive analytics to develop programs that forecast or infer outcomes and potential challenges, helping organization leaders make more astute business decisions. This competency brings machine-learning models into the process and taps artificial intelligence techniques such as supervised learning, unsupervised learning, and natural language processing. Predictive analytics lets companies move away from reactive strategies and focus their business decisions on positive outcomes, while reducing the potential risks of relying on traditional forecasting models.
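To make the idea of supervised learning concrete, here is a minimal sketch that fits a straight line to past observations and uses it to forecast the next period. It is an illustration only, not CVP's method: the function names `fit_line` and `predict` and the monthly figures are invented for this example, and real predictive work would use richer models and real data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: month number and workload observed in that month.
months = [1, 2, 3, 4, 5]
workload = [110, 120, 130, 140, 150]

model = fit_line(months, workload)
forecast = predict(model, 6)  # projected workload for month 6
print(f"Forecast for month 6: {forecast:.1f}")
```

The model learns from labeled history (inputs paired with known outcomes) and is then applied to an input it has not seen, which is the defining pattern of supervised learning.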
Data engineering ensures that the right data is sourced, prepared, and made available to meet CVP analysts’ requirement for pristine data. Data engineering also creates new metrics, dimensions, and features in support of analytics programs. It’s the complex water filtration system that turns raw, messy data into clean data that is critical for good predictive and descriptive analytics.
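As a small illustration of that filtration step, the sketch below removes duplicate and unparseable records and derives a new feature from the fields that remain. Everything in it is hypothetical: the record layout, the field names, and the `clean` and `parse_date` helpers are assumptions made for this example rather than a description of any CVP pipeline.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical raw records; field names and values are invented for illustration.
RAW_RECORDS = [
    {"id": "1", "amount": " 120.50 ", "opened": "2021-03-01", "closed": "2021-03-11"},
    {"id": "2", "amount": "n/a", "opened": "2021-03-02", "closed": "2021-03-05"},
    {"id": "1", "amount": " 120.50 ", "opened": "2021-03-01", "closed": "2021-03-11"},
    {"id": "3", "amount": "75", "opened": "2021-03-04", "closed": ""},
]


def parse_date(text: str) -> Optional[date]:
    """Return a date for a YYYY-MM-DD string, or None if the field is blank."""
    text = text.strip()
    return datetime.strptime(text, "%Y-%m-%d").date() if text else None


def clean(records):
    """Drop repeated ids and non-numeric amounts, then add a derived feature."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec["id"] in seen:
            continue  # an id we have already processed
        seen.add(rec["id"])
        try:
            amount = float(rec["amount"].strip())
        except ValueError:
            continue  # amount is not numeric, so the record is set aside
        opened = parse_date(rec["opened"])
        closed = parse_date(rec["closed"])
        # Derived feature: days between opening and closing, when both are known.
        days_open = (closed - opened).days if opened and closed else None
        cleaned.append({"id": int(rec["id"]), "amount": amount, "days_open": days_open})
    return cleaned


result = clean(RAW_RECORDS)
for row in result:
    print(row)
```

Here the derived `days_open` value plays the role of a new feature that analysts could use downstream. Whether to discard a bad record, as this sketch does, or to repair or flag it is a design choice that depends on the analysis.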
While each of the four competencies is a separate discipline, they aren’t stove-piped. They are interdependent and must work in concert to provide CVP customers with the complete data picture they need to make intelligent business decisions. For example, descriptive and predictive analytics depend on data engineering to furnish technologists with clean data to work with. In turn, the data systems competency is the foundation for the other three competencies, supplying the tools and technologies needed to process the data.
Given this interdependency, CVP separates itself from many competitors in the data space by retaining teams of technologists who are experts in all four competencies, not just one or two. With over a decade of experience, we’ve found that these cross-functional teams lead to better outcomes. Rather than a group of statisticians who can’t put their theoretical model into production, we provide teams that can take a preliminary idea all the way through testing with real data to production use on a massive scale in the cloud.
Next in the three-part series: Enterprises face a daunting number of choices when it comes to moving data to the cloud. It takes skilled and versatile technologists to guide customers through a cloud migration, and it is crucial for organizations to get the process right from the start.