SIOP 2014 Symposium: A Critical Review of Mechanical Turk as a Research Tool


SIOP Hawaii
As the pace of innovation increases, so does the need to test innovations to determine their worth.  Items enhancing quality of life are widely adopted.  For example, software such as SAS and SPSS allows us to instantly run analyses that would previously have taken days or weeks.  More recently, online data collection has replaced paper-and-pencil data collection and manual entry (Horton, Rand, & Zeckhauser, 2010).  Similarly, websites like Amazon’s Mechanical Turk (MTurk) may allow quick and inexpensive access to hundreds of thousands of participants, but a critical review is needed to determine MTurk’s worth as an innovative data collection resource.

MTurk is an example of a crowdsourcing website where researchers outsource data collection to online participants rather than using laboratory and other samples (Chandler, Mueller, & Paolacci, in press).  Websites such as Crowd Cloud and Crowd Flower also facilitate crowdsourcing (Gaggioli & Riva, 2008), but we focus on MTurk because it is currently the dominant crowdsourcing application for social scientists.  In fact, research conducted using MTurk has already appeared in peer-reviewed journals (Holden, Dennie, & Hicks, 2013; Jonason, Luevano, & Adams, 2012; Jones & Paulhus, 2011).

Using MTurk, participants called “workers” browse Human Intelligence Tasks (“HITs”) posted by “requesters” conducting research.  After selecting and completing HITs, workers are paid a pre-determined fee.  Because MTurk offers access to a large and diverse pool of over 500,000 participants from over 190 countries, researchers’ interest in MTurk as a potential new data collection resource is understandable (Bohannon, 2011; Mason & Suri, 2011).

The goal of this symposium is to bring professionals together to conduct a critical review of MTurk as an avenue for conducting psychological research.  Before turning our session over to our discussant, presenters will share data to evaluate MTurk against other samples.

The Gaddis and Foster paper uses MTurk to test items for developing and maintaining assessments.  The authors compare MTurk data to samples of students as well as applicants and incumbents from organizations.  This paper also includes lessons learned and recommendations for professionals interested in using MTurk.

The Harms and DeSimone paper explores a data cleaning approach to assessing the quality of MTurk data.  Using seven statistical data screens, the authors investigate the prevalence of low-quality data in a large sample of MTurk data.  Results from this paper differ from those in the existing research literature.

The Woolsey and Jones paper recounts a first-time user’s experience using MTurk to conduct international research.  The authors detail practical, methodological, and ethical issues they encountered using MTurk to collect data in the U.S. and Japan.  The paper concludes with questions about the future of crowdsourcing as a means of collecting data.

The Cavanaugh, Callan, and Landers paper reviews a research study comparing MTurk workers to undergraduates on individual difference variables and an online training task.  This paper fills a gap in the existing literature by examining the feasibility of MTurk as an avenue for conducting research on training processes and outcomes.

This symposium will be held Thursday, May 15.

References available