Five Steps To Using Machine Learning Without the Hype
Today, we can see the same behavior—or hype—surrounding Artificial Intelligence (AI) that we saw 20 years ago when everyone wanted to have a website even though they had no idea what it was. Now, everyone wants AI in their application, but not everyone knows what it is, its feasibility, or what problems it will solve.
Artificial Intelligence is a general term for machines that can learn to work and react like a human. Because AI is too broad a term, let us dive into a more specific subfield: Machine Learning (ML), and in particular Supervised Machine Learning. ML is generally the process of giving the computer data for a specific situation and then letting the computer learn the patterns in that data. When the patterns are learned from historical data that has been labeled with outcomes, the result can predict those outcomes; this is termed a Supervised Learning model. Once the Supervised Learning model is available, the computer can be given new data and use it to provide a sensible prediction of what might happen (based on what it “learned” from historical data).
CVP has created a simplified framework with five steps on how to fully utilize ML to solve a problem. Two steps are questions for the business owner to ask; the rest are for the developers to act upon. This framework identifies how to: obtain data, build the right questions, select the proper ML tool, get the prediction needed, and communicate results.
STEP 1 Ask: Obtain Data: What Data Do We Have?
- Figure out what data you have to solve the problem. This data needs to be organized in a spreadsheet/table structure, with a large number of sample rows related to the problem. Assuming your outcomes are equally likely, a rough rule of thumb is to have at least 10 examples for each possible value in each column. If you have fewer than one hundred examples overall, you may be better off analyzing the data manually.
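The rule of thumb above can be checked mechanically before any modeling starts. The following is a minimal sketch, assuming the table is loaded as a list of dictionaries (one per row) and that every column is categorical. The function names, the thresholds as constants, and the sample data are hypothetical illustrations, not part of CVP's framework.

```python
from collections import Counter

MIN_PER_VALUE = 10   # rule of thumb: examples per possible value per column
MIN_ROWS = 100       # below this, manual analysis may be the better option


def enough_rows(rows, min_rows=MIN_ROWS):
    """Return True if the table has at least min_rows examples overall."""
    return len(rows) >= min_rows


def underrepresented_values(rows, min_per_value=MIN_PER_VALUE):
    """Return {column: {value: count}} for values seen fewer than min_per_value times."""
    gaps = {}
    if not rows:
        return gaps
    for column in rows[0]:
        counts = Counter(row[column] for row in rows)
        rare = {value: n for value, n in counts.items() if n < min_per_value}
        if rare:
            gaps[column] = rare
    return gaps


# Hypothetical sample data: 27 rows with two categorical columns.
rows = (
    [{"plan": "basic", "renewed": "yes"}] * 12
    + [{"plan": "basic", "renewed": "no"}] * 11
    + [{"plan": "premium", "renewed": "yes"}] * 4
)

print(enough_rows(rows))
print(underrepresented_values(rows))
```

In this made-up table, the check reports that there are too few rows overall and that the "premium" plan has too few examples, which suggests gathering more data before going further.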
STEP 2 Ask: Build the Right Question: What Do We Want?
- From your gathered data, build the question that you are trying to answer. It is a best practice to write a question like: Given <the columns/rows of data I have>, how do I predict <one of the columns I have>?
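Once the question is written in this form, it maps directly onto how the data is arranged for a Supervised Learning model: the column to predict becomes the target, and the remaining columns become the inputs. A minimal sketch, assuming rows are stored as dictionaries; the function name `split_features_target` and the column names are hypothetical.

```python
def split_features_target(rows, target):
    """Separate the column to predict (target) from the input columns."""
    for row in rows:
        if target not in row:
            raise KeyError(f"target column {target!r} missing from a row")
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    labels = [row[target] for row in rows]
    return features, labels


# Hypothetical question: "Given age and plan, how do I predict renewed?"
rows = [
    {"age": 34, "plan": "basic", "renewed": "yes"},
    {"age": 51, "plan": "premium", "renewed": "no"},
]

features, labels = split_features_target(rows, target="renewed")
print(features)
print(labels)
```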
STEP 3 Act: Select the Proper ML Tool: How to Pick the Right Tool.
- The developer is a key player when it comes to picking the correct tool. First, examine the volume of data, specifically the number of columns and rows. If there are only two columns, one column predicts the other; the data can be drawn as a typical X/Y coordinate chart, and a simple linear or logistic regression might suffice. If there are more than two columns, more complex approaches should be considered, such as Decision Trees, Deep Neural Networks, or Support Vector Machines. In general, the more data available to “train” the tool, the better.
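To make the two-column case concrete, here is a minimal sketch of a simple linear regression fitted by ordinary least squares, written in plain Python for illustration. The function names and the sample numbers are hypothetical; in practice a developer would more likely rely on an established library than on hand-written code.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length columns with at least 2 rows")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; no line can be fitted")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical two-column data set: x predicts y.
hours = [1, 2, 3, 4]
score = [3, 5, 7, 9]

slope, intercept = fit_line(hours, score)
print(slope, intercept)
print(predict(5, slope, intercept))
```

This covers only a numeric outcome; a yes/no outcome would call for logistic regression instead, which is not shown here.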
STEP 4 Act: Get the Prediction Needed: Use the Tool, Review Results; Rinse, Repeat.
- To run the tool, take 70-80% of the data and use it to train the ML model, which will then be used to calculate answers. The developer then runs the remaining 20-30% of the data through the trained model for testing purposes, to see if the results make sense and appear valid for data the model hasn’t seen. If they do not, refine the tool and repeat.
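The split described above can be sketched in a few lines. This is an illustrative version using only the Python standard library, assuming an 80/20 split and a fixed random seed so that the split is repeatable. The names `split_train_test` and `accuracy` are hypothetical helpers, and accuracy (the share of correct predictions) is only one possible way to judge whether the test results appear valid.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out test_fraction of them for testing."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                      # copy, so the input is untouched
    random.Random(seed).shuffle(shuffled)      # seeded for a repeatable split
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predictions, actual):
    """Fraction of predictions that match the known outcomes."""
    if len(predictions) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(1 for p, a in zip(predictions, actual) if p == a)
    return correct / len(actual)


# Hypothetical data: ten row identifiers standing in for real records.
rows = list(range(10))
train, test = split_train_test(rows, test_fraction=0.2)
print(len(train), len(test))
```

The model is trained on `train` only; its predictions for `test` are then compared with the known outcomes, for example with `accuracy`, before deciding whether another round of refinement is needed.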
STEP 5 Act: Communicate: Share the Results.
- Once the developer finishes refining the tool, gathering results, and testing them, the results need to be shared with the team, especially the Subject Matter Experts (SMEs) for the business process in question. While reviewing the results, the team needs to make sure that they make sense, are reasonable, and are of value.
Following this simple framework, CVP is able to train and test an ML model and get valuable results that answer the desired questions formed from available data sets. It is critical that the business team familiar with the underlying problem is included when evaluating results and doing additional testing; if the results don’t match expectations, rework may be needed. Through this process, CVP is able to fully benefit from the capabilities of ML and avoid simply getting caught up in the “hype” of AI.