5 Types of IQ Tests Explained

Wondering which cognitive assessment is right for you? Explore the 5 types of IQ tests, from clinical batteries to online options. Read the full guide!

Dr. Russell T. WarneChief Scientist

One of the most common misconceptions about IQ testing is that all tests are essentially the same or that the differences between them are trivial. That is not true. Every IQ test varies considerably in their format, administration method, content, purpose, and the depth of information they provide. Understanding these differences matters for both past and future examinees.

Before getting into the types, one foundational point is worth establishing. Despite their differences, all legitimate IQ tests converge on the same underlying construct: general intelligence, or g. Charles Spearman named this principle the "indifference of the indicator" — his finding that as long as a task requires thinking, reasoning, or judgment, it will measure intelligence to some meaningful degree. Modern research has consistently confirmed this. That is why tests with very different-looking content can still produce scores that are meaningfully comparable. That does not mean all IQ tests are equally good or useful, though.

With that in mind, here are the five types.

Type 1: Individually Administered IQ Tests

Individually administered tests are given by a trained examiner to a single person at a time, either face-to-face or via a video call. The most widely used examples are the Wechsler Adult Intelligence Scale (WAIS), currently in its fifth edition and the most widely used individual intelligence test for adults in the world, along with the Wechsler Intelligence Scale for Children (WISC), the Stanford-Binet Intelligence Scales, and the Woodcock-Johnson Tests of Cognitive Abilities. All require an examiner with graduate-level training in standardized test administration.

The core advantage of individual administration is that a trained examiner is present, they can observe behaviors invisible on a self-administered test, such as how a person approaches a difficult problem, whether effort seems genuine, or if the examinee is experiencing a high amount of anxiety. Examiners can adapt pacing and verify task comprehension within the bounds of standardized procedures. These tests also produce rich score reports with a full scale IQ alongside index scores and observations from the examiner. The profile of scores and the examiner’s insights make individually administered tests the standard for clinical diagnostic decisions about learning disabilities, giftedness, cognitive decline, or forensic assessments.

The tradeoffs are real. Administration takes about 90 minutes and requires a licensed psychologist or specially trained professional. Clinical testing fees are often several hundred dollars or more. This makes individual testing impractical for any situation where many people need to be assessed at once.

Type 2: Group Administered IQ Tests

Group testing emerged from practical necessity. When the United States entered World War I, the military needed to assess the cognitive abilities of newly drafted recruits. With over 1 million men joining the army per year, individual testing was not feasible. A committee of psychologists chaired by Robert Yerkes developed the Army Alpha (for literate recruits) and Army Beta (for those with low literacy). These were the first intelligence tests designed for simultaneous group administration. More than 1.75 million tests were administered before the war ended, and the approach directly influenced testing in schools and civilian employment afterward.

Today, group administered tests are used primarily in educational settings for screening gifted programs, supporting placement decisions, and gathering population-level cognitive data. The U.S. military continues to use the Armed Services Vocational Aptitude Battery (ASVAB), which functions as an intelligence test and is used to determine enlistment eligibility and occupational placement. Civilian employers also use group cognitive ability tests for hiring and promotion.

Modern group-administered tests are often administered on computers, delivering test questions to all examinees simultaneously, and scoring is automated. The efficiency advantage means one session can generate data for hundreds of examinees. The limitation is precision. Without individual monitoring, there is no way to detect when low scores reflect disengagement, distraction, or poor instruction comprehension rather than low ability. Group tests carry a higher risk of scores being distorted by factors unrelated to intelligence, which is why consequential individual decisions based on group test scores are sometimes followed up with individual testing.

Type 3: Online IQ Tests

Online testing solves one of the oldest access problems in intelligence assessment: how to provide a professionally rigorous test to someone without access to a psychologist. Online tests are self-proctored, allowing examinees to take them on their own schedule from any location with an internet connection.

The quality of online tests varies enormously. The mode of delivery does not determine whether a test is valid. Rigorous development does. Most online IQ tests found through a search engine lack representative norm samples, documented reliability, expert review, and alignment with professional standards. Some exist primarily to collect payment for a flattering score report. Pointing this out is not a criticism of online testing as a format; it is a description of what is currently available. A professionally developed online test is fully capable of producing meaningful data. The problem is scarcity, not impossibility.

Type 4: IQ Test Batteries

The distinction between a battery and a single-format test cuts across all three administration types. A battery can be individually administered, group administered, or online. It is an important enough distinction to address separately.

An IQ test battery is a collection of different subtests, each measuring a different aspect of cognitive ability. The battery approach has a specific theoretical grounding in the Cattell-Horn-Carroll (CHC) theory of intelligence, which is the current mainstream framework in psychometrics. CHC theory holds that intelligence is hierarchical: general intelligence (g) sits at the top, broad abilities such as fluid reasoning, crystallized knowledge, visual-spatial ability, working memory, and processing speed occupy the middle tier, and dozens of narrow abilities sit below those. A battery is designed to sample across this hierarchy rather than measure only one slice of it.

The practical advantage is coverage. By sampling a broad range of cognitive tasks, a battery reduces the risk of a misleading result because one subtest happened to favor or disadvantage the examinee's particular strengths. Someone who is strong in verbal reasoning but relatively weaker in processing speed will have both captured in the report. There is also a statistical argument: a composite of many subtests produces a more stable and precise estimate of intelligence g than any single subtest can because the uniqueness of each subtest gets averaged away.

Type 5: Single-Format Tests

A single-format test has exactly one task type from start to finish. The most well-known example is the Raven's family of tests, first published in 1936 by L.S. Penrose and John C. Raven. All Raven's items share the same format: a 3×3 grid of visual patterns with the final cell missing, and the examinee selects the answer that completes the pattern.

Many psychologists believed that a nonverbal, pattern-based test would be free of cultural bias. Subsequent research has shown that culture-fair tests do not eliminate group differences and still require familiarity with cultural concepts like abstract patterns and logic. The format has real value, but it is definitely not a “culture-free” test.

Beyond Raven's, single-format tests include vocabulary tests, verbal analogy tests, various logical reasoning assessments, and more. Their primary appeal is brevity: a well-constructed matrix reasoning test can be completed in 20 to 45 minutes, making it practical for research screening contexts where time is limited.

The limitation follows directly from the design: one task type cannot sample the full breadth of cognitive abilities. A person with strong visual-spatial fluid reasoning but weaker verbal ability will appear more capable on a Raven's test than on a full battery. The reverse is equally true. Single-format tests remain solid measures of general intelligence (the indifference of the indicator applies) but they provide less diagnostic information and are more vulnerable to a single cognitive strength or weakness skewing the result.

Discussion on the Five Types of IQ Tests

These types are not mutually exclusive. Most individually administered tests are also batteries. Many group tests use a single format. Online tests can be either. Despite the differences, research on score comparability consistently shows that full-scale IQs from different professional batteries are substantially correlated at the group level, but individual-level scores can differ enough to warrant caution when comparing results across tests.

Whatever the test type, a few standards apply universally. The creator should be an identifiable professional with documented psychometrics training willing to attach their name to the work. The norm sample should be clearly described and representative of the intended population. Reliability data should be reported. And the test should reference the professional standards it ...was built to meet.

The type of test is a separate question from its quality. A well-constructed single-format online test can produce better data than a sloppily developed multi-subtest battery. The type informs what kind of information the test can provide. Quality determines whether that information is trustworthy.

Professional online IQ testing

The RIOT occupies an unusual position in this landscape. It is an online battery that combines the accessibility of self-proctored administration with the depth of a multi-subtest format. I developed it after more than 15 years of research on human intelligence, including peer-reviewed publications and a Cambridge University Press book on the subject.

The RIOT was built to meet the same APA/AERA/NCME standards that govern traditional clinical tests. Its content was reviewed by a panel of external experts from cognitive, educational, and developmental psychology. It was normed on a representative U.S. sample (not a self-selected group of people who sought out the test) and its technical properties are documented in a published manual. It produces an overall IQ alongside six index scores: Verbal Reasoning, Fluid Reasoning, Spatial Ability, Working Memory, Processing Speed, and Reaction Time.

For most adults who want a professionally developed measure of their cognitive abilities without scheduling a clinical evaluation, the RIOT was designed to fill that gap.

Sources

Spearman, C. (1904). "General intelligence" objectively determined and measured. American Journal of Psychology, 15(2), 201–292.
Johnson, W., te Nijenhuis, J., & Bouchard, T. J. (2008). Still just 1 g: Consistent results from five test batteries. Intelligence.
Wechsler, D. (2008). Wechsler Adult Intelligence Scale — Fourth Edition (WAIS-IV). Pearson.
Wechsler, D. (2024). Wechsler Adult Intelligence Scale — Fifth Edition (WAIS-V). Pearson.
Carroll, J. B. (1993). Human Cognitive Abilities: A Survey of Factor-Analytic Studies. Cambridge University Press.
American Educational Research Association, APA, & NCME. (2014). Standards for Educational and Psychological Testing.
Yerkes, R. M. (1921). Psychological examining in the United States Army. Memoirs of the National Academy of Sciences, 15.
Penrose, L. S., & Raven, J. C. (1936). A new series of perceptual tests. British Journal of Medical Psychology, 16(2), 97–104.
Warne, R. T. (2021). In the Know: Debunking 35 Myths About Human Intelligence. Cambridge University Press.
Warne, R. T. (2025). Technical Manual for the Reasoning and Intelligence Online Test, Version 1.0. RIOT IQ.
Frey, M. C., & Detterman, D. K. (2004). Scholastic assessment or g? Psychological Science, 15(6), 373–378.
Shuttleworth-Edwards, A. B. (2016). Generally representative is not good enough. Gifted Child Quarterly.
Dragt, J. (2010). Causes of group differences studied with the method of correlated vectors. Gifted Child Quarterly.
Millsap, R. E., & Kwok, O. M. (2004). Evaluating the impact of partial factor invariance on selection. Journal of School Psychology.
Benson, N., Hulac, D. M., & Kranzler, J. H. (2010). Independent examination of the WAIS-IV. Psychological Assessment, 22(1), 121–130.
Beaujean, A. A. (2015). John Carroll's views on intelligence. Intelligence.
Schneider, W. J., & McGrew, K. S. (2018). The Cattell-Horn-Carroll theory of cognitive abilities. In D. P. Flanagan & E. M. McDonough (Eds.), Contemporary Intellectual Assessment (4th ed.). Guilford.
Dutton, E., & Lynn, R. (2015). A negative Flynn Effect in France. Intelligence, 51, 67–70.
Salgado, J. F., et al. (2003). A meta-analytic study of general mental ability validity for different occupations. Journal of Applied Psychology, 88(6), 1068–1081.
Castejon, J. L., et al. (2021). The comparability of intelligence test results. Journal of School Psychology, 88, 1–18.