Instruction
The instructions are available in English, German, and Spanish. The English version begins:
“Dear participant, this test is about finding rules in abstract patterns and to complete them in a logical way. Each task shows an incomplete jigsaw puzzle. The patterns you will see follow rules which may apply to a row, a column or to a diagonal. They may apply to the figure as a whole or to parts of it only. They may involve addition, subtraction, the alignment of figures or single components. Only one of the eight pieces given is the correct one required to complete the design. It is your task to select the piece which completes the jigsaw puzzle. Each task needs to be completed within 2:00 minutes.”
To ensure that all participants understand the task in the same way, two sample matrices are presented, and their solutions and underlying rules are explained.
Items
The six items increase in difficulty. Each item consists of an incomplete 3 × 3 matrix: one of the nine cells is missing and has to be identified by recognizing the rules underlying the given pattern.
Response specifications
Eight potential solutions to complete the matrix are presented below it. Only one solution fits the pattern of the incomplete matrix and is thus the correct one.
Scoring
Correct solutions are coded 1; incorrect solutions and missing responses are coded 0. The total scale score is the sum across all six items and therefore ranges from 0 to 6.
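The scoring rule can be sketched in a few lines of Python. This is a minimal illustration, not the test's actual implementation: the function name `score_hmt_s`, the use of `None` for a missing response, and especially the answer key are assumptions made for the example (the real key is not given here).

```python
from typing import Optional, Sequence

N_ITEMS = 6  # the HMT-S consists of six items


def score_hmt_s(responses: Sequence[Optional[int]],
                answer_key: Sequence[int]) -> int:
    """Sum score: 1 per correct item, 0 for incorrect or missing (None)."""
    if len(responses) != N_ITEMS or len(answer_key) != N_ITEMS:
        raise ValueError(f"expected exactly {N_ITEMS} responses and key entries")
    return sum(
        1
        for given, correct in zip(responses, answer_key)
        if given is not None and given == correct
    )


# Hypothetical answer key (options numbered 1-8); NOT the real HMT-S key.
HYPOTHETICAL_KEY = [3, 4, 1, 7, 2, 6]

# Item 2 was skipped (None), items 4 and 6 were answered incorrectly.
total = score_hmt_s([3, None, 1, 8, 2, 5], HYPOTHETICAL_KEY)
print(total)
```

Treating a skipped item exactly like a wrong answer keeps every participant's score on the same 0 to 6 scale.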
Application field
The HMT-S is a web-based intelligence test. Researchers (e.g., psychologists, economists, sociologists) or lecturers may use it to survey non-clinical adults aged 18 to 65 with intelligence in the normal range. (In other samples, ceiling or floor effects as well as boredom or frustration may arise.) The test was validated in samples with an age range of 17 to 57 years. It should be used for group comparisons and correlational studies (sample sizes N > 100), not for individual diagnostics. The average duration of the test is nine minutes: five minutes for the matrices plus four minutes for the instructions.
According to Deary (2012, p. 348), “intelligence predicts important things in life.” In particular, its relevance to occupational fields and education was demonstrated decades ago. In an early paper, Harrell and Harrell (1945) demonstrated the association between intelligence and specific occupations. In addition, intelligence is highly correlated with job performance and job training success, as shown in a meta-analysis by Salgado and colleagues (2003). In another meta-analysis, Poropat (2009) found significant associations between intelligence and academic success.
Intelligence can be understood as the mental ability to learn and solve problems. In the words of Gottfredson (1997, p. 13):
Intelligence is a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience. It is not merely book-learning, a narrow academic skill, or test-taking smarts. Rather, it reflects a broader and deeper capability for comprehending our surroundings - “catching on”, “making sense” of things, or “figuring out” what to do.
The Cattell-Horn-Carroll model (Schneider & McGrew, 2012) offers a more detailed view of the theoretical structure of intelligence. It distinguishes general intelligence (see also Carroll, 1993) from abilities at the broad level (e.g., Reading & Writing Ability, Comprehension-Knowledge, and Fluid Reasoning) and at the narrow level (e.g., within Fluid Reasoning: Induction, General Sequential Reasoning, and Quantitative Reasoning).
According to Carroll (1993), figural matrices primarily measure induction, a narrow ability within Fluid Reasoning: The test taker’s task is “to inspect a set of materials and from this inspection induce a rule governing the materials, or a particular or common characteristic of one or more stimulus materials, such as relation or a trend” (p. 211). Schneider and McGrew’s (2012) definition of induction is quite similar: It is “the ability to observe a phenomenon and discover the underlying principles or rules that determine its behavior” (p. 112). Accordingly, we conclude that the HMT-S is a test of induction. Induction, in turn, is defined by Schneider and McGrew (2012) as a narrow ability and as the core of the broad ability of fluid reasoning. Thus, the HMT-S is a test of induction as well as of fluid reasoning. One may argue that there are already plenty of well-designed, validated, and established intelligence tests and that the HMT-S is therefore redundant. However, this is precisely where its benefit lies: The aim of the HMT-S was to provide a web-based, economical, and valid alternative that differs little from other matrix tests but is free of charge.
All analyses were computed with SPSS (IBM).
Item generation and selection
We selected six items from the 20 items of the Hagen Matrices Test (HMT; Heydasch, 2014). The matrices were designed using two operations: addition and movement. Addition means that graphic elements are added along the rows or columns of a matrix. Movement means that graphic elements rotate or shift. Items were selected for adequate item difficulty (p > .20), item discrimination (rit > .30), and item validity, defined as the correlation with the reasoning score of the Intelligence-Structure-Test 2000 R (Liepmann et al., 2007; r > .30).
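The three thresholds can be read as a simple filter over item statistics. The sketch below is only an illustration: the class `ItemStats`, the function `passes_criteria`, and all numbers in `candidates` are made up for the example, and the text does not describe how items passing all three criteria were narrowed down to the final six.

```python
from dataclasses import dataclass


@dataclass
class ItemStats:
    item_id: int
    p: float           # item difficulty (proportion correct)
    rit: float         # item discrimination (item-total correlation)
    r_validity: float  # correlation with the external reasoning score


def passes_criteria(s: ItemStats,
                    min_p: float = 0.20,
                    min_rit: float = 0.30,
                    min_validity: float = 0.30) -> bool:
    """Apply the three thresholds from the text (strictly greater than)."""
    return s.p > min_p and s.rit > min_rit and s.r_validity > min_validity


# Made-up statistics for three hypothetical candidate items.
candidates = [
    ItemStats(item_id=1, p=0.88, rit=0.38, r_validity=0.35),  # passes all three
    ItemStats(item_id=2, p=0.15, rit=0.41, r_validity=0.33),  # too difficult
    ItemStats(item_id=3, p=0.60, rit=0.28, r_validity=0.40),  # discriminates too little
]

selected = [s.item_id for s in candidates if passes_criteria(s)]
print(selected)
```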
Samples
To develop the HMT-S, we used the validation sample of the HMT (Heydasch, 2014). The sample was randomly divided into two groups. We used the first group (Study 1; N = 681) for item selection and an initial validation. The results were cross-validated with the second group (Study 2; N = 658). To further validate the HMT-S, we administered the test in October 2012 to an ad hoc sample of psychology students from the University of Hagen (Study 3; N = 233), who received course credit for participation. In total, N = 1,572 students (75% women) with an average age of M = 31.6 years (SD = 8.97) participated (see Table 1).
Table 1
Sample Characteristics
| Study | Sample size (N) | Women (%) | Age, M (SD) | Age, range |
|---|---|---|---|---|
| 1 | 681 | 74% | 31.5 (8.89) | 18-63 |
| 2 | 658 | 74% | 31.8 (8.95) | 16-70 |
| 3 | 233 | 80% | 31.6 (9.28) | 15-57 |
| Total | 1,572 | 75% | 31.6 (8.97) | 15-70 |
Item analyses
Item parameters
Mean duration, difficulty (p), and selectivity (i.e., item discrimination, rit) of each item were calculated in each of the three studies (see Table 2).
Table 2
Mean Duration, Difficulty, and Selectivity of the Short Form of Hagen Matrices Test (HMT-S)
| Item | Mt, Study 1 (a) | Mt, Study 2 (b) | Mt, Study 3 (c) | p, Study 1 (d) | p, Study 2 (e) | p, Study 3 (f) | rit, Study 1 (d) | rit, Study 2 (e) | rit, Study 3 (f) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 43 | 44 | 39 | .88 | .88 | .88 | .38 | .32 | .34 |
| 2 | 30 | 32 | 28 | .85 | .83 | .87 | .42 | .42 | .35 |
| 3 | 53 | 53 | 52 | .66 | .65 | .67 | .39 | .38 | .25 |
| 4 | 51 | 53 | 51 | .65 | .64 | .70 | .47 | .46 | .39 |
| 5 | 53 | 54 | 53 | .58 | .53 | .57 | .40 | .31 | .29 |
| 6 | 76 | 76 | 77 | .25 | .25 | .35 | .23 | .20 | .21 |

Notes. Mt = mean duration in seconds. p = difficulty. rit = selectivity. (a) N1 = 636. (b) N2 = 633. (c) N3 = 229. (d) N1 = 681. (e) N2 = 658. (f) N3 = 233.
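For readers who want to reproduce statistics like those in Table 2 on their own data, the following sketch computes item difficulty as the proportion of correct responses and selectivity as the correlation between an item and the sum of the remaining items. The part-whole correction is an assumption (the text does not state whether rit was corrected), and the function names and the tiny data set are invented for illustration. An item that everyone solves, or that no one solves, has zero variance and would cause a division by zero here.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation of two equally long numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_statistics(scores):
    """scores: one row per person, one 0/1 entry per item."""
    n_items = len(scores[0])
    stats = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        # Rest score: total without the item itself (part-whole correction).
        rest = [sum(row) - row[j] for row in scores]
        stats.append({"item": j + 1, "p": mean(item), "rit": pearson(item, rest)})
    return stats


# Invented toy data: 4 persons x 3 items.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

for row in item_statistics(toy_scores):
    print(row["item"], round(row["p"], 2), round(row["rit"], 2))
```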
Objectivity
The HMT-S is a web-based test. Instruction, administration, scoring, and interpretation are computer-based and therefore highly standardized. Accordingly, the application, evaluation, and interpretation of the HMT-S can be deemed highly objective.
Reliability
To estimate reliability, we used the Kuder-Richardson Formula 20 (KR20; Kuder & Richardson, 1937), because this measure of internal consistency is appropriate for a unidimensional scale with dichotomous items. The KR20 coefficients were .64 (Study 1), .61 (Study 2), and .57 (Study 3). The weighted mean was .62 (N = 1,572). Independent studies found similar or even higher coefficients (e.g., Rammstedt et al., 2018, α = .63; Spurk et al., 2020, α = .71). These results indicate sufficient reliability for research that investigates associations between intelligence and other characteristics or tests group differences, provided the sample size is appropriate. However, the relatively large standard error of measurement and the correspondingly wide confidence interval make clear that a single individual score is not useful.
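The quantities in this paragraph can be illustrated with a short sketch. It shows the textbook KR20 formula, the sample-size-weighted mean of the three reported coefficients, and the classical-test-theory standard error of measurement, SEM = SD × sqrt(1 − reliability). The function names and the toy data are assumptions for illustration, and the SEM formula is not necessarily the calculation the authors used. With the Study 1 values (SD = 1.56 from Table 5, KR20 = .64), it gives an SEM of about 0.94 points and a 95% interval of roughly ±1.8 points on a scale that only ranges from 0 to 6.

```python
from math import sqrt
from statistics import mean, pvariance


def kr20(scores):
    """KR20 for dichotomous items; scores has one row per person."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    var_total = pvariance(totals)  # population variance, matching p * (1 - p)
    sum_pq = 0.0
    for j in range(k):
        p = mean([row[j] for row in scores])
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / var_total)


def weighted_mean(pairs):
    """pairs: (coefficient, sample size) tuples."""
    return sum(r * n for r, n in pairs) / sum(n for _, n in pairs)


def sem(sd, reliability):
    """Standard error of measurement under classical test theory."""
    return sd * sqrt(1 - reliability)


# Invented toy data (4 persons x 3 items), only to show the KR20 formula.
toy_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_kr20 = kr20(toy_scores)

# Coefficients and sample sizes as reported in the text.
reported = [(0.64, 681), (0.61, 658), (0.57, 233)]
mean_kr20 = weighted_mean(reported)

# Study 1: SD = 1.56, KR20 = .64.
sem_study1 = sem(1.56, 0.64)
ci_half_width = 1.96 * sem_study1

print(round(toy_kr20, 2), round(mean_kr20, 2),
      round(sem_study1, 2), round(ci_half_width, 2))
```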
Validity
The HMT-S is the short form of the HMT. Thus, it should be highly associated with the original HMT (see Silverstein, 1990). The correlations between the HMT-S and the HMT were r = .79 (p < .001) in Study 1 and r = .78 (p < .001) in Study 2. Taking the imperfect reliability into account, these correlations point to a great similarity between the two tests. In addition, the HMT-S was compared with the intelligence measures of the Intelligence-Structure-Test 2000 R (Liepmann et al., 2007)[1]. Domains (e.g., reasoning versus knowledge) and contents (e.g., figural versus verbal) of the Intelligence-Structure-Test 2000 R that are similar to the domain (fluid reasoning) and content (figural matrices) of the HMT-S should correlate more strongly with the HMT-S than other domains and contents. The results met these expectations (see Table 3) and demonstrate appropriate convergent and content validity[2].
Table 3
Convergent Validity: Correlations with the Intelligence-Structure-Test 2000 R (Liepmann et al., 2007)
| Domain | Content | Study 1 (a) | Study 2 (b) | Studies 1+2 (c) |
|---|---|---|---|---|
| Reasoning | Total | .53*** | .49** | .52*** |
| | Gf | .50*** | .47** | .49*** |
| | Verbal | .46*** | .03 | .30** |
| | Numeric | .42** | .53** | .46*** |
| | Figural | .47*** | .46** | .47*** |
| Knowledge | Total | .24 | .42* | .31** |
| | Gc | .15 | .36* | .23* |
| | Verbal | .06 | .24 | .13 |
| | Numeric | .26* | .36* | .30** |
| | Figural | .29* | .48** | .35*** |
| Memory | Total | .36** | .07 | .24* |
| | Verbal | .24 | -.04 | .12 |
| | Figural | .34** | -.16 | .27* |

Notes. Gf = fluid intelligence factor; Gc = crystallized intelligence factor. (a) N1 = 56. (b) N2 = 35. (c) N12 = N1 + N2 = 91 (for easier interpretation the small samples were aggregated). * p < .05. ** p < .01. *** p < .001.
Furthermore, we investigated associations of the HMT-S with real-life outcomes. To this end, we measured indicators of academic success (see Poropat, 2009), namely self-reported school and university grades. Grades reflect educational success and are core criteria in job selection processes. Although the grades were self-reported by students, we assume them to be valid (see Dickhäuser & Plenter, 2005), but less reliable than usual because of the time lag. Correlations between these indicators and the HMT-S are presented in Table 4. The average grade (GPA) and the grades in mathematics and statistics are associated more strongly with the HMT-S than grades in other subjects. This finding makes sense because the HMT-S requires abstract and logical reasoning and is thus related to mathematics and statistics. It is an additional indication of construct validity and further evidence of criterion-related validity.
Table 4
Criterion-Related Validity
| Grade | Study 1 | Study 2 | Study 3 |
|---|---|---|---|
| School Grades (a) | | | |
| GPA | .14* | .16** | .16* |
| Mathematics | .19*** | .20*** | .19** |
| English | .03 | .03 | .09 |
| German | -.01 | -.02 | .01 |
| Biology | .07 | .07 | .16* |
| Art | .01 | .04 | .09 |
| Sports | .03 | .09 | .03 |
| University Grades | | | |
| GPA (b) | .28** | .18* | .06 |
| Grade in Statistics (c) | .22 | .38** | .05 |

Note. All values are Pearson correlations. Grades were inverted: Higher values indicate better grades. (a) N1 = 308, N2 = 287, N3 = 191. (b) N1 = 131, N2 = 124, N3 = 78. (c) N1 = 70, N2 = 70, N3 = 67. * p < .05. ** p < .01. *** p < .001.
Descriptive statistics
Descriptive statistics are presented in Table 5.
Table 5
Descriptive Statistics of the HMT-S Total Score
| Study | N | M (SD) | Skewness | Kurtosis |
|---|---|---|---|---|
| 1 | 681 | 3.88 (1.56) | -0.65 | -0.31 |
| 2 | 658 | 3.78 (1.53) | -0.55 | -0.41 |
| 3 | 233 | 4.03 (1.47) | -0.76 | 0.05 |
Further quality criteria
Men scored slightly higher than women on the HMT-S. The difference ranged from d = 0.15 to d = 0.25, which is a small effect according to Cohen (1992). Compared with other findings (e.g., Irwing & Lynn, 2005), these results are not surprising, and given the small effect size, the HMT-S seems to be quite fair across gender. Age was likewise only weakly associated with the HMT-S score (r = -.13 in Study 1; r = -.12 in Study 2; r = -.02 in Study 3; 1 = female; 0 = male).
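As a reminder of what the effect size d expresses, the sketch below computes Cohen's d as the mean difference divided by the pooled standard deviation. The pooled-SD variant is an assumption (the text does not say which formula was used), and the function name and the two small score lists are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented HMT-S sum scores for two small groups (not real data).
scores_group_a = [4, 5, 6]
scores_group_b = [3, 4, 5]

d = cohens_d(scores_group_a, scores_group_b)
print(round(d, 2))
```

By Cohen's (1992) conventions, values around 0.2 count as small, 0.5 as medium, and 0.8 as large, so the toy value here would be a large effect, unlike the small differences reported above.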
Further literature
Heydasch, T., Haubrich, J. & Renner, K.-H. (2013). Die Kurzform des Hagener Matrizen-Tests (HMT-S). Ein 6-Item Intelligenztest zum schlussfolgernden Denken. Methoden, Daten, Analysen, 7, 183–208. https://doi.org/10.12758/mda.2013.011
Heydasch, T., Haubrich, J. & Renner, K.-H. (2020). The Short Version of the Hagen Matrices Test (HMT-S). A 6-Item Induction Intelligence Test (T. Heydasch, Trans.). Methods, Data, Analyses, 7, 183e–205e. https://doi.org/10.12758/mda.2013.021
− Dr. Timo Heydasch, University of Hagen, Department of Work and Organisational Psychology, Universitätsstr. 33, 58097 Hagen, Germany, E-Mail: Timo.Heydasch@gmx.de