What the Amazon Blunder Teaches Us About Big Data



In this era of Big Data, simply producing or collecting nearly unfathomable amounts of data isn’t enough. The best companies are able to sift through that data to find meaningful trends and, ultimately, specific information that sparks a plan of action.

In the rush to harness that data for hiring, many companies are turning to experimental AI and machine learning to discover new forms of data collection and new types of analysis that human beings might not be capable of on their own. But not all of these new methods are created equal. If set up incorrectly, AI-driven analysis can go horribly wrong – just ask Amazon.

The Internet giant decided to harness its computing power and expertise to create a job screening program that would scan an applicant’s resume and determine whether that person was suitable. A person familiar with the effort told Reuters the goal was for the program to receive 100 resumes and spit out the top five.

To teach the program how to screen candidates, Amazon fed it resumes submitted to the company over the previous decade. In theory, the program would learn which resume terms led to successful candidates and which led to rejection. In reality, the program learned to reject female candidates.

That show-stopping side effect was the result of Amazon’s own hiring patterns: most of Amazon’s employees are male. Based on that set of data, the program taught itself that male candidates were preferable. Resumes that included the word “women’s” or the names of all-women’s colleges were downgraded. Since there was no guarantee the program wouldn’t find other discriminatory ways to reject candidates, executives pulled the plug.
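To make that failure mode concrete, here is a minimal Python sketch (using scikit-learn) of how a text classifier trained on biased historical outcomes absorbs the bias. The resumes and labels below are invented for illustration; this is not Amazon’s actual data or system.

```python
# Minimal illustration: a classifier trained on biased hiring outcomes
# learns to penalize the token "women". Toy data, invented for this sketch.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical "historical" resumes in which mentions of "women's"
# disproportionately coincide with rejections (0) rather than hires (1),
# mirroring a male-dominated hiring history.
resumes = [
    "software engineer java distributed systems",          # hired
    "captain women's chess club software engineer java",   # rejected
    "machine learning python distributed systems",         # hired
    "women's college graduate machine learning python",    # rejected
    "java python software engineer",                       # hired
    "women's soccer team captain java python",             # rejected
]
labels = [1, 0, 1, 0, 1, 0]

vectorizer = CountVectorizer()  # default tokenizer turns "women's" into "women"
X = vectorizer.fit_transform(resumes)

model = LogisticRegression().fit(X, labels)

# The learned weight for "women" comes out strongly negative: the model has
# "taught itself" that the term predicts rejection.
idx = vectorizer.vocabulary_["women"]
print("weight for 'women':", model.coef_[0][idx])
```

Notice that nothing in the code mentions gender as a feature; the negative weight emerges entirely from the skewed training labels, which is why this kind of bias is so easy to build in and so hard to notice.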

In short, Amazon’s mistake in this experiment was using biased criteria to judge the resumes and then giving the program free rein. Ryne Sherman, chief science officer at Hogan, summed up Amazon’s problem:

“If the criteria are deficient, contaminated, or otherwise systematically biased, big data algorithms will pick up the bias and run with it.”

Today’s supercomputers are immensely powerful and capable of amazing feats. They’re also unfailingly literal. No matter how much power you’re working with, if you set up bad parameters, you’ll get a bad result. Amazon’s mistake was easy to find, but even subtle mistakes made by emerging job screening technology can lead to catastrophic results.

The key takeaway from Amazon’s failure is that big data still needs a human touch. Any analysis requires clearly defined parameters, set before the supercomputers are even turned on, to eliminate bias and other noise. Start gathering data without that structure, and your efforts will be wasted.
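One concrete form that human touch can take is auditing a screening model’s output before it goes live. The short sketch below (with invented pass/fail numbers) computes the adverse impact ratio used in employment-selection review; under the EEOC’s four-fifths rule of thumb, a ratio below 0.8 is a red flag worth investigating.

```python
# Adverse impact check: compare selection rates between groups.
# The decision lists are hypothetical, invented for illustration.

def selection_rate(decisions):
    """Fraction of candidates the model advanced (1 = advanced, 0 = screened out)."""
    return sum(decisions) / len(decisions)

male_decisions   = [1, 1, 0, 1, 1, 0, 1, 1]  # 6 of 8 advanced
female_decisions = [1, 0, 0, 0, 1, 0, 0, 0]  # 2 of 8 advanced

ratio = selection_rate(female_decisions) / selection_rate(male_decisions)
print(f"adverse impact ratio: {ratio:.2f}")  # 0.33, far below the 0.8 threshold
```

A check this simple can surface the kind of skew Amazon’s program produced before any candidate is affected.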

At Hogan, our assessments were built from the ground up to be free of bias. Even though the database we use to determine scoring and job fit has grown to millions of assessments and has become ever more complex, our results remain valid because of the assessments’ focused structure. In fact, founders Drs. Joyce and Robert Hogan were inspired to start the company in part by a desire to eliminate bias from assessments.