*This post was authored by Ryne Sherman and Brandon Ferrell.*
A recently published study suggests that some of the most common personality assessments (i.e., ones based on the Big 5) don’t work in other countries. The study was published in a prestigious journal (Science Advances, impact factor > 12), and it has already gained prominent media attention. One outlet said that these personality tests don’t hold up around the world. NPR said that personality tests don’t reveal the real you. Reading these articles might lead you to conclude that personality assessments just can’t be used in other countries. Fortunately, despite what the economists who contributed to the article and the journalists who are covering it might have you believe, such a conclusion is just wrong. In what follows, we show you why.
Comparing Assessments Across Borders and Languages
If you had a rod that measured 1 meter in Australia but 2 meters in Kenya, you would have a big problem. Clearly, the term “1 meter” wouldn’t mean the same thing in different locations or different languages. As a principle of measurement, you want to be sure that whatever you are measuring in one location (or one language) is the same thing that you are measuring in another. In terms of personality assessment, comparing countries (or languages) absolutely requires that the assessments function in the same fashion across countries and languages. Psychologists use a metric called the congruence coefficient to determine the degree to which instruments are measuring the same thing. Scores on the metric can range from -1.00 to +1.00, with higher scores indicating greater similarity. The accepted standard for declaring the instruments similar is a congruence coefficient > .84. The recently published study found average congruence coefficients of .73 and .71 in survey data gathered in so-called non-WEIRD countries (e.g., Kenya, Philippines, Colombia).
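To make the metric concrete, here is a minimal sketch of how a congruence coefficient can be computed. It assumes the metric is Tucker's congruence coefficient, the usual choice for comparing factor loadings, which divides the sum of the products of two loading vectors by the square root of the product of their sums of squares. The function name and the loading values are made up for illustration and do not come from any of the studies discussed here.

```python
import math

def congruence_coefficient(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two factor-loading vectors."""
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must have the same length")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b))
    if norm == 0:
        raise ValueError("loading vectors must not be all zeros")
    return cross / norm

# Hypothetical loadings of five items on one factor (illustrative numbers only).
reference = [0.70, 0.65, 0.60, 0.10, 0.05]   # e.g., the reference sample
similar = [0.68, 0.60, 0.66, 0.15, 0.00]     # a sample with much the same pattern
scrambled = [0.10, 0.05, 0.20, 0.70, 0.65]   # a sample with a different pattern

THRESHOLD = 0.84  # the similarity standard cited in the text

for label, loadings in [("similar", similar), ("scrambled", scrambled)]:
    phi = congruence_coefficient(reference, loadings)
    print(label, round(phi, 2), "similar" if phi > THRESHOLD else "not similar")
```

Because the coefficient depends only on the pattern of loadings, two samples whose items load on a factor in the same proportions score close to +1.00 even if the overall size of the loadings differs.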
At Hogan, we conduct more than 1 million personality assessments per year, with data coming from more than 100 countries, in 47 languages, all around the world. Our assessments are completed by working adults who are either applying for jobs or taking part in their current job’s developmental curriculum. Our flagship measure of personality, the Hogan Personality Inventory (HPI), measures 7 personality characteristics that are closely related to the Big 5. In examining our archive, we identified 52 countries with sufficient HPI data to conduct the exact same analysis conducted by the economists. Here is what we find across the 52 countries:
Table 1. Average Congruence Coefficient for 52 Countries on the HPI.
| Country | Congruence Coefficient | Country | Congruence Coefficient |
|---|---|---|---|
| Spain | .97 | United Arab Emirates | .92 |
Every single country exceeds the .84 threshold for similarity. The lowest congruence coefficient we found was for Peru (.90). As a direct comparison with the recently published work, we find much higher congruence coefficients for Kenya (.96 vs. .71), Colombia (.91 vs. .72), Philippines (.96 vs. .72), and Serbia (.97 vs. .79). These results are in stark contrast to the conclusions drawn by popular media: high-quality personality assessments work, and measure exactly what we think they are measuring, in other countries and languages all around the globe.
We Aren’t the Only Ones
In 2002, a book chapter by Rolland reported average congruence coefficients of .92 with French and .93 with U.S. English across 15 different countries (including non-WEIRD countries: Malaysia, Korea, Philippines, & China). The most comprehensive published research on this question to date found an average congruence coefficient of .93 across 50 different countries (including Turkey, Serbia, Japan, South Korea, Hong Kong, Thailand, Indonesia, Burkina Faso, Kuwait, Philippines, Russia, China, India, Malaysia, Botswana, Nigeria, Ethiopia, Lebanon, Uganda, & Morocco; which collectively averaged .91). Another paper by De Fruyt and colleagues extended this analysis to adolescents, reporting an average congruence coefficient of .92 across 24 different countries (including Malaysia, Serbia, South Korea, Japan, Iran, Thailand, Hong Kong, Turkey, China, and Uganda; which collectively averaged .90).
The point here is this: the largest, most comprehensive studies and databases speaking to the universality of personality factor structures have all come to the conclusion that these personality dimensions are universal. So why did this recent study come to the opposite conclusion?
So…What’s Wrong with that Study?
Research published in academic journals typically must go through a rigorous (and at times, somewhat arbitrary) review process. This involves subjecting the research to review by external experts in the field who scrutinize the work for potential errors and mistakes. Despite this process, it is sometimes the case that flawed work, or flawed conclusions, slip through the cracks. Such is the case with the article in question here. There are two critical problems.
First, the analyses and conclusions of this paper rest on data gathered using a 15-item measure of personality. The 15 items are a subset of items from a medium-length (but well-validated) 44-item measure of personality, known as the Big Five Inventory. It is not clear how these 15 items were chosen (as part of a larger survey), or what their psychometric properties are. However, it is clear that short measures of personality frequently show poor results. By comparison, studies demonstrating the universality of personality structures (including our own data) used longer, and undeniably far superior, measures of personality. Thus, the results of this study could be adequately summarized as garbage in, garbage out.
Second, the study in question regularly notes that many of the people surveyed had trouble understanding the questions they were being asked; in some cases, it was not clear that the participants were even literate. It should come as no surprise that if people cannot read, or understand, the items on a personality assessment, their responses to the questions are necessarily nonsense. If responses to a personality assessment are effectively random, it is certain that there will be no congruence. Further, if even a sizable proportion of the respondents cannot read the questionnaire but respond anyway, this will necessarily drive congruence coefficients down, perhaps even below the threshold for similarity. By comparison, the participants in all of the studies demonstrating the universality of personality structures were educated well enough to read and understand the questionnaires. Put another way, the results of this study demonstrate that if people cannot read your test, they will not respond in logically coherent ways.
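The mechanism behind this second problem can be illustrated with a small simulation. This is only a sketch under invented assumptions, not an analysis of the study's data: two hypothetical items tap the same trait on a 1-to-5 scale, and some share of respondents answer at random. It does not compute congruence coefficients directly. It shows how random responding weakens the correlation between related items, and those correlations are what factor loadings are estimated from.

```python
import math
import random

def to_likert(value):
    """Map a continuous score onto a 1-5 response scale."""
    return max(1, min(5, round(3 + value)))

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate_item_correlation(n_respondents, share_random, seed=0):
    """Correlate two items when a share of respondents answer at random."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    item_1, item_2 = [], []
    for _ in range(n_respondents):
        if rng.random() < share_random:
            # Respondent cannot read the items: both answers are arbitrary.
            a, b = rng.randint(1, 5), rng.randint(1, 5)
        else:
            # Respondent understands the items: both answers reflect one trait plus noise.
            trait = rng.gauss(0, 1)
            a = to_likert(trait + rng.gauss(0, 0.7))
            b = to_likert(trait + rng.gauss(0, 0.7))
        item_1.append(a)
        item_2.append(b)
    return pearson(item_1, item_2)

for share in (0.0, 0.25, 0.5):
    print(share, round(simulate_item_correlation(5000, share), 2))
```

The noise level (0.7), the sample size, and the shares of random responders are arbitrary choices. The qualitative pattern is the point: the larger the share of random responders, the weaker the observed relationship between items that should go together.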
In summary, the sky is not falling for personality assessment. The evidence, to date, overwhelmingly demonstrates that the Big 5 personality structure is universal. When high-quality measures of personality are used, and the respondents can read and understand the questions, the structure of personality looks incredibly similar across culture and language. It is far more likely that this study’s failure to replicate this structural similarity is due to poor data quality rather than the outlandish notion that personality structures are different in non-WEIRD cultures.