Creating an IQ Test

Learn why making an IQ test requires psychometric expertise, how professional tests are developed, and why homemade or anonymous IQ tests lack reliability and validity.

Dr. Russell T. Warne, Chief Scientist

Because intelligence is such a fascinating concept and IQ tests are interesting for the test taker, some people want to create their own IQ tests. These would-be test creators range from well-meaning amateurs to scientists whose expertise lies outside psychological testing. So, what should someone know before creating an IQ test?

First, creating a psychological test is much more difficult than it looks. An entire science, called “psychometrics,” is dedicated to the creation and study of psychological tests. People go to graduate school for four years or more to become experts in test creation. It takes a lot of training to obtain the expertise needed to be a competent test creator. Entire books have been written about test development and even more about specific aspects of the test development process, such as writing items (i.e., questions or tasks), screening items for bias, setting score interpretation standards, setting time limits, and adapting tests to other countries. Anyone who attempts to create a psychological test without in-depth training and knowledge is not going to make a good test.

Second, psychological tests are sophisticated scientific instruments. They should be treated with the same care and exactness as laboratory equipment or medical tests. The Standards for Educational and Psychological Testing is a book describing the professional standards that psychological tests should meet during creation and use. Non-professionals are unlikely to be aware of all of the guidelines in the Standards, let alone how to meet them. Non-professionals can create a test, but it won’t provide accurate data, and its scores will mislead people or lead to harm if used to make important decisions.

A good analogy is a Geiger counter. There are many YouTube videos providing instructions for creating homemade Geiger counters. But these devices do not accurately measure the strength of a radiation source, and they cannot be used to determine whether radiation levels are safe. If you shouldn’t trust a homemade Geiger counter, you shouldn’t trust a homemade IQ test.

The same is true for tests created anonymously. Legitimate test authors are proud of their work and want their name associated with the test. Anonymity prevents accountability and allows inaccurate, shoddy, or even fraudulent tests to thrive.

How are IQ tests made?

Even though non-professionals should not create IQ tests (or any psychological test), it can still be useful for laymen to know the general principles of test creation. This is not an instruction guide, but rather an explanation for the curious test taker who wants to know how their answers to test questions get turned into an IQ score.

First, test creators do a lot of background research on the trait they are trying to measure. Before I started creating the Reasoning and Intelligence Online Test (RIOT), I studied human intelligence for 15 years and published dozens of articles and a book on the topic and on psychological testing. Test creators use this research to learn about and evaluate theories and to choose a preferred theory that serves as the basis of their test. That theory serves as a blueprint for all later decisions about a test, including its length, format, intended population of test takers, and scoring system.

Armed with a theory, test creators then select tasks that will appear on the test. Often there are tried-and-true tasks that have appeared on other intelligence tests, though sometimes new ones are created. Task selection requires knowledge and expertise because every task has its strengths and weaknesses, and the best tests will have those strengths and weaknesses balancing each other across the entire test. Creators will then write test questions for these tasks.

After they have a pool of test questions for each task, test creators try out the questions on a pilot sample. The data from the pilot sample is used to evaluate the items through statistical analysis. (There is some qualitative, informal evaluation also, but this is much less effective and can even be counterproductive if a creator relies on it too much.) The analyses are used to drop or revise items, and the revised items are piloted again. Sometimes this cycle happens several times before the test items are ready to exit the piloting stage.
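To give a flavor of what the statistical analysis of pilot data can look like, here is a minimal sketch in Python. It assumes two classical statistics that are commonly used for this purpose: item difficulty (the proportion of pilot test takers who answered correctly) and item discrimination (the correlation between an item and the total score on the remaining items). The function names, the cutoff values, and the tiny data set are hypothetical illustrations, not the procedure used for any particular test, and real item analyses are considerably more involved.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


def item_analysis(responses):
    """responses: one row per test taker, one 0/1 entry per item (1 = correct)."""
    results = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        # Total score on all OTHER items, so the item is not correlated with itself.
        rest = [sum(row) - row[i] for row in responses]
        results.append({
            "item": i,
            "difficulty": mean(item),               # proportion correct
            "discrimination": pearson(item, rest),  # corrected item-total correlation
        })
    return results


def flag_items(results, min_difficulty=0.2, max_difficulty=0.9,
               min_discrimination=0.2):
    """Return indices of items to drop or revise (cutoffs are illustrative assumptions)."""
    return [
        r["item"] for r in results
        if not (min_difficulty <= r["difficulty"] <= max_difficulty)
        or r["discrimination"] < min_discrimination
    ]


# Hypothetical pilot data: 6 test takers, 4 items.
pilot = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
print(flag_items(item_analysis(pilot)))
```

In this made-up example, the first item is flagged because everyone answers it correctly, so it cannot distinguish between test takers, and the last two are flagged because doing well on them is not associated with doing well on the rest of the test. A real pilot sample would be far larger than six people.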

The next step is to assemble the final test and administer it to a norm sample that is representative of the intended population of test takers. The norm sample’s performance will be used to define average performance on the test and inform the scoring system. The data from the norm sample can also be used to make any final adjustments to the test before it is released to the public.
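For readers curious about how a norm sample can turn a raw score into an IQ, the sketch below shows one simple possibility. It assumes the conventional IQ scale, with a mean of 100 and a standard deviation of 15, and a plain linear conversion based on the norm sample's mean and standard deviation. The function name and the three-person norm sample are hypothetical. Professional tests typically use much larger samples and more refined methods (for example, separate norms for different age groups), so this illustrates the idea only.

```python
from statistics import mean, stdev


def make_iq_scorer(norm_raw_scores, iq_mean=100.0, iq_sd=15.0):
    """Build a function that converts a raw score into an IQ-style score.

    Assumes a simple linear (z-score) transformation anchored on the norm sample.
    """
    norm_mean = mean(norm_raw_scores)
    norm_sd = stdev(norm_raw_scores)
    if norm_sd == 0:
        raise ValueError("norm sample has no variability; cannot define a scale")

    def to_iq(raw_score):
        z = (raw_score - norm_mean) / norm_sd  # distance from average, in SD units
        return iq_mean + iq_sd * z

    return to_iq


# Hypothetical norm sample of raw scores (far too small for real use).
to_iq = make_iq_scorer([24, 30, 36])
print(to_iq(30))  # a raw score equal to the norm sample's average
print(to_iq(36))  # a raw score one standard deviation above that average
```

Under these assumptions, a test taker who matches the norm sample's average receives an IQ of 100, and each standard deviation above or below that average adds or subtracts 15 points. This is why the norm sample must be representative: its performance defines what "average" means for everyone who takes the test later.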

Hopefully, this outline helps readers appreciate all of the careful planning and work that it takes to create a well-designed IQ test.
