



 abstraction
 The act of considering something as a general
quality or characteristic, apart from concrete realities, specific
objects, or actual instances.
 achievement test
 A measure of knowledge and skills in a content
area.
 acquiescence set
 The tendency to agree with statements on
a test or affective measure.
 affective
 Having to do with attitudes, beliefs, and
values.
 affective domain
 The area of human action which emphasizes
the internalized processes such as emotion, feeling, interest,
attitude, value, character development, and motivation.
 affective taxonomy
 A system for classifying different levels
of internalization of an attitude or value.
 algorithm
 A method of computation, usually set up
so that the calculation can be carried out routinely by following the
computational scheme, without mathematical understanding.
 analyze
 To separate into constituent parts or elements;
to examine critically, so as to bring out the essential elements
or give the essence of.
 anecdotal record
 A written description of an observed event.
 aptitude
 A natural talent or ability.
 assessment
 Collecting data in the context of conducting
measurement.
 association form
 A short-answer item format in which the
student is given a set of words or phrases and must supply corresponding
words or phrases according to a defined basis.
 attitude test
 A measure of one's feelings.
 balance
 The selection and provision of test items
such that subject matter topics and behaviors are sampled in accordance
with established relative weights.
 biserial correlation
 Shows the degree of relationship between
a continuous variable and a normally distributed variable that has been
dichotomized.
 blind guessing
 The selection of an alternative
for a selected-response item without using any knowledge or rational
approach to the choice. The probability of choosing the correct
response is at chance level. If there are two choices, a blind
guess should result in the correct selection about 50 percent
of the time; for four choices, 25 percent of the time; and so
on.
 bluffing
 A strategy for responding to essay
questions; providing an answer that may not directly address the
question.
 Buckley Amendment
 Legislation that gives students and their
parents access to information about themselves, including test
scores.
 centile point
 The point on a scoring scale below which
a certain percentage of the cases fall.
 central tendency
 An average or middle value for a distribution
of scores.
 checklist
 A measure of the presence or absence of
listed attributes.
 chi square
 Shows the degree of divergence between observed
and expected frequencies.
 coefficient of determination
 The square of the correlation coefficient;
the percentage of the variance in one variable that is predictable
from another variable.
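
As a minimal sketch of the definition above, the coefficient can be computed by squaring a Pearson correlation. The helper names `pearson_r` and `coefficient_of_determination` and the score lists are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def coefficient_of_determination(x, y):
    """Square of the correlation: share of variance in y predictable from x."""
    return pearson_r(x, y) ** 2

# Hypothetical scores for five students on two tests.
test_a = [52, 61, 70, 74, 88]
test_b = [48, 65, 66, 80, 85]
print(f"r^2 = {coefficient_of_determination(test_a, test_b):.3f}")
```

For instance, a correlation of .50 corresponds to a coefficient of determination of .25, that is, 25 percent of the variance.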
 cognitive
 Having to do with knowing or understanding.
 cognitive domain
 The area of human action which pertains
to mental processes such as intellectual activity, learning, and problem
solving.
 cognitive taxonomy
 A system for classifying different levels
of understanding.
 completion form
 A short-answer item format in which the
student is to supply the missing word or words in a given item.
 comprehensive
 Covering all material taught to date in
a course.
 computerized adaptive testing
 Computer-assisted testing in which the items
that are presented are determined by the responses to previous
items.
 concurrent validity
 A form of criterion validity based on the
correlation of test scores with those on a criterion measure obtained
at about the same time.
 construct
 An idea or concept invented to explain an
aspect of human behavior or some other nonphysical characteristic.
Example: hostility.
 construct validity
 The extent to which a test measures certain
psychological traits.
 content bias
 Disproportionate representation of topics
and terms within a test.
 content sampling
 The extent to which the items on a test
represent the entire domain of possible items in a content area.
 content validity
 The extent to which a test or measure
is representative of a defined body of knowledge.
 correction for guessing
 A mathematical adjustment that brings the
score to zero for someone who guessed on each item.
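
The glossary does not state the adjustment itself. One widely used form, assumed here, subtracts the number of wrong answers divided by one less than the number of choices per item; the function name `corrected_score` is hypothetical. The sketch below shows that a pure guesser's expected score comes out at zero.

```python
def corrected_score(num_right, num_wrong, num_choices):
    """Right answers minus wrong / (choices - 1); omitted items are ignored."""
    if num_choices < 2:
        raise ValueError("an item needs at least two choices")
    return num_right - num_wrong / (num_choices - 1)

# A blind guesser on 100 four-choice items expects 25 right and 75 wrong.
print(corrected_score(25, 75, 4))  # 0.0
# A student with 40 right and 12 wrong on four-choice items.
print(corrected_score(40, 12, 4))  # 36.0
```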
 correlation
 A measure of the strength and direction
of the association between two sets of scores.
 covariation
 Variance that two or more tests have in
common.
 criterion-referenced
 A way of interpreting a test score which
compares an individual's performance to an established standard
of performance.
 criterion validity
 Validity based on the correlation between
test scores and scores on some measure representing an identified
criterion.
 Cronbach alpha procedure
 A procedure for estimating internal consistency
reliability, based on parts of a test.
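
A minimal sketch of the usual alpha computation, assuming the standard formula (number of parts, part variances, and total-score variance) rather than anything stated in this glossary. The function name `cronbach_alpha` and the data are hypothetical; each row holds one student's scores on the parts of a test.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Alpha = k/(k-1) * (1 - sum of part variances / total-score variance)."""
    k = len(scores[0])  # number of parts (items) per student
    part_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(part_vars) / total_var)

# Hypothetical scores of four students on three parts of a test.
scores = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
]
print(f"alpha = {cronbach_alpha(scores):.2f}")
```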
 cross-validation
 Related to predictive validity; using results
for one sample of individuals to determine if validity coefficients
will remain stable for another sample.
 decile
 Any one of nine centile points (scores)
which divide a distribution into ten parts.
 demographics
 Vital and social statistics.
 descriptive statistics
 Summary characteristics of distributions,
such as shape, average, and dispersion.
 diagnostic test
 A test used to measure a student's strengths
and weaknesses in a given area.
 difficulty index
 A measure of the percentage of incorrect
responses determined by dividing the number getting the item wrong
by the number who tried the item. Used to establish how difficult
an item was for the group who took the test.
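
The arithmetic in this definition can be sketched as follows. The function name `difficulty_index` is hypothetical; the calculation follows the definition above (percentage of incorrect responses among those who tried the item).

```python
def difficulty_index(num_wrong, num_attempted):
    """Percentage of students who attempted the item and got it wrong."""
    if num_attempted <= 0:
        raise ValueError("at least one student must attempt the item")
    return 100.0 * num_wrong / num_attempted

# 12 of the 40 students who tried the item answered it incorrectly.
print(difficulty_index(12, 40))  # 30.0
```

Under this definition a higher value indicates a harder item.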
 direct observation
 Noticing of phenomena without any intervening
factor between the observer and that which is being observed.
A record of the situation is made.
 discrimination
 The ability of a test item to separate high
and low scores on a total test.
 discrimination index
 A value which indicates the ability of an
item to separate high-achieving students from low-achieving students.
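
The glossary gives no formula for this index. One common form, assumed here, compares the proportion answering correctly in equal-sized upper and lower scoring groups; the function name `discrimination_index` and the counts are hypothetical.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Difference in proportion correct between upper and lower groups.

    Assumes both groups contain the same number of students (group_size).
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size

# 18 of 20 high scorers and 6 of 20 low scorers answered the item correctly.
print(discrimination_index(18, 6, 20))  # 0.6
```

Values near zero suggest the item does not separate the two groups; negative values mean low scorers did better on the item than high scorers.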
 dispersion
 The spread among scores in a distribution.
 distractor
 A response for a multiple-choice item that
is classed as an incorrect alternative. It is a plausible wrong
answer designed to be attractive to students who do not know the
correct response.
 distractor analysis
 Item analysis technique concerned with the
options on a multiple-choice item.
 domain
 A sphere of human activity. The
three major categories are cognitive, affective, and psychomotor.
 domain specification
 A precise delineation of a body of content
or a set of behaviors.
 empirical
 Verifiable by experience or experiment;
objective collection of data to test a subjective concept.
 equivalence reliability
 The extent to which measurement on two or
more forms of a test is consistent.
 equivalent (parallel) forms
 Two or more forms of a test covering the
same content whose item difficulty levels are similar.
 error
 Variation produced by the inaccuracies of
measurement. The source of the variation may be within the test
instrument, within the subjects of measurement, or in the way
the test was administered.
 essay item
 An item format that requires the student
to structure a rather long written response, up to several paragraphs.
 evaluation
 The process of making a value judgment based
on information from one or more sources.
 experiment
 The modification of the conditions of a
group or groups that have been chosen for study, and the analysis
of the resulting outcomes.
 extended response
 An answer to an essay item which asks or
implies a question which has no definite limits to restrict the
student response. The response set is open ended. (See limited
response.)
 f test
 A test to determine the significance of the difference
between the variances (σ²) of two groups.
 factor analysis
 An analytical procedure that can be used
for identifying the number and nature of constructs underlying
a set of measures.
 factor loading
 From factor analysis; a correlation between
a factor and a test score.
 formative
 Done to monitor progress over a period of
time.
 frequency distribution
 A listing of scores and the number of persons
receiving each score.
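
A short sketch of how such a listing might be built; the function name `frequency_distribution` and the scores are hypothetical.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return (score, count) pairs ordered from highest to lowest score."""
    counts = Counter(scores)
    return sorted(counts.items(), reverse=True)

# Hypothetical scores for six students.
scores = [85, 90, 85, 70, 90, 85]
for score, count in frequency_distribution(scores):
    print(score, count)
```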
 general factor
 From factor analysis; a factor that has
substantial loading with all measures or tests.
 global-quality scaling
 A method of scoring an essay item; also
called holistic scoring, scoring based on the general impression
of overall adequacy and quality of the response.
 grade equivalent scores
 Norm-referenced scores that report performance
in terms of grade and month (such as 4.6: fourth grade, sixth
month).
 grading
 The process of evaluating performance
and assigning a mark of performance level; commonly associated
with assigning letters A, B, C, D, and F, with A indicating better or
higher performance than B, and so on.
 grammatical clue
 A flaw in objective items in which the wording
or punctuation directs the examinee to the correct answer.
 group factor
 From factor analysis; a factor that has
high loadings with two or more but not all measures or tests.
 grouped frequency distribution
 A frequency distribution that categorizes
scores by intervals.
 halo effect
 The tendency to give high scores to students
known to be good students and vice versa, independent of the quality
of the response.
 high-stakes test
 A test for which the consequences of doing
well or poorly are costly.
 histogram
 A bar graph that describes a distribution
of scores.
 informed consent
 Giving approval for certain procedures after
indicating an understanding of those procedures.
 intelligence
 The capacity for reasoning and understanding.
 intelligence quotient (IQ)
 The ratio of mental age to chronological
age multiplied by 100 (100 x (MA/CA)); one whose mental age is
average for his or her chronological age group has an IQ of 100.
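
The ratio in this definition can be worked through directly. The function name `ratio_iq` is hypothetical, and ages are assumed to be expressed in months.

```python
def ratio_iq(mental_age_months, chronological_age_months):
    """Ratio IQ as defined above: 100 x (MA / CA)."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / chronological_age_months

# Mental age of 120 months at a chronological age of 100 months.
print(ratio_iq(120, 100))  # 120.0
# Mental age equal to chronological age gives an IQ of 100.
print(ratio_iq(96, 96))    # 100.0
```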
 internal consistency reliability
 The extent to which parts of a test are
consistent in measurement.
 interval
 A defined distance on a scale of measurement.
 interval measurement
 Measurements that classify, order, and have
equal distances between points on the scales.
 isomorphic
 Something similar or identical in structure
or appearance to something else.
 item analysis
 An examination of student performance for
each item on a test. It consists of reexamination of the responses
to items of a test by applying mathematical techniques to assess
two characteristics, difficulty and discrimination, of each objective
item on the test.
 item sampling
 A technique used in schoolwide, state, or
national testing that administers only a part of a test to each
student. This allows a longer test to be administered but does
not require a long test session for each student involved. If
each student is administered only one-fourth of the test, a four-hour
test could be administered with no student giving more than one
hour of time.
 item specifications
 Item writing procedures for criterion-referenced
tests that include sample items and descriptions of the stimulus
and the response.
 item statistics
 Summary descriptions of a group's
performance on a particular test item.
 item-total correlation
 The coefficient that describes the association
between the scores on a particular item and the scores on the
entire test.
 Kelly's range
 The distance between the 10th and 90th centile
ranks.
 Kuder-Richardson Formula 21 procedure (KR-21)
 A split-half approach to estimating reliability
that may be substituted for the KR-20 procedure if item difficulty
levels are similar.
 Kuder-Richardson Formula 20 procedure (KR-20)
 A split-half approach to estimating reliability
that provides the mean of all possible split-half reliability
coefficients for a test.
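
The two Kuder-Richardson entries above can be compared with a small sketch. The formulas used are the commonly cited ones and are an assumption here, since the glossary does not state them; the function names `kr20` and `kr21` and the response matrix are hypothetical. Each row is one student, each column one item scored 1 (right) or 0 (wrong).

```python
from statistics import mean, pvariance

def kr20(item_matrix):
    """KR-20: k/(k-1) * (1 - sum(p*q) / variance of total scores)."""
    k = len(item_matrix[0])
    n = len(item_matrix)
    total_var = pvariance([sum(row) for row in item_matrix])
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in item_matrix) / n  # proportion correct
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / total_var)

def kr21(item_matrix):
    """KR-21: uses only the mean and variance of total scores."""
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    m = mean(totals)
    total_var = pvariance(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * total_var))

# Hypothetical responses of four students to three items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(responses), 2))  # 0.75
print(round(kr21(responses), 2))  # 0.6
```

In this example the items differ in difficulty, so KR-21 comes out lower than KR-20; when all items are equally difficult the two values agree.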
 kurtosis
 Refers to the peakedness or flatness of
a frequency distribution as compared with a normal distribution.
 leptokurtic
 A frequency distribution more peaked than
normal.
 limited response
 Essay item which asks a question or gives
instructions for restricting the area to be covered in responding
to the stated tasks. The coverage expected is well fenced in for
the student. (See extended response.)
 local norm
 The average test performance in some city
or region.
 mastery-nonmastery discrimination
 Item analysis technique concerned with decisions
regarding a cutoff score.
 matching item
 An item consisting of a two-column format, premises
and responses, that requires the student to make a correspondence
between the two.
 mean
 The arithmetic average of a set of scores.
 mean deviation
 A measure of variability or dispersion of
a distribution of scores.
 measurement
 A process that assigns by rule a numerical
description to observation of some attribute of an object, person,
or event.
 measurement scales
 Classifications of measures based on the
amount of information contained in each score.
 median
 The middle score of a distribution.
 mental age
 The average intellectual functioning of
normal persons at a given age, usually expressed in months.
 minimum competency testing
 Testing designed to measure the acquisition
of competence or skills to or beyond a defined standard.
 mode
 The most frequent score of a distribution.
 multifactored assessment
 Assessment that usually includes
the physical, cognitive, psychological, and social factors that
are believed to affect learning.
 multiple-choice item
 A test format in which the examinee selects
the correct answer from a list of possible options.
 national norm
 The average performance of a sample selected
to be representative of the entire country.
 needs assessment
 A process whereby the educational requirements
of students collectively or individually are determined. Usually
thought of as a formal structured approach, but may be done informally
by the teacher.
 negative skewness
 Asymmetry in which most of the scores in
a distribution are at the high end.
 nominal measurement
 Measurement that classifies elements into
mutually exclusive and exhaustive categories.
 norm group
 The set of subjects used to establish the
averages to be used to interpret student scores on a standardized
test.
 norm-referenced measurement
 Measurement in which an individual's score
is interpreted by comparing it to the scores of a defined group.
 normal distribution (curve)
 A theoretical distribution of scores which
forms a curve that is bell shaped and symmetrical.
 norms
 The test scores (also possibly statistics
generated from scores) of one or more defined groups considered
to be representative.
 null hypothesis
 A statement that there is no difference
in measures of the criterion variable except what would be expected
from sampling; requires that a significance level be stated (.05,
.01, . . .).
 objective
 Dealing with things external to the mind
rather than with thoughts or feelings; pertaining to that which
can be known, or that which is an object or a part of an object.
 objective items
 Items that can be objectively scored; items
on which persons select a response from a list of options.
 objectivity
 The degree to which the task to be performed
is clear and the correct response is definite.
 objectivity (in scoring)
 The extent to which equally competent scorers
obtain the same result.
 observation
 Any fact which is used as a basis for evaluation
procedures. The output of the process of observing.
 oral tests
 Examinations in which both the questioning
and answering are done aloud.
 ordinal measurement
 Measurement that classifies and orders along
a continuum.
 parallel forms
 Two or more forms of a test covering the
same content whose item difficulty levels are similar.
 partial correlation
 Shows the relationship between two variables
with the effects of one or more other variables held constant.
 penalty for guessing
 A mathematical procedure for lowering scores
as a function of the number of incorrect answers.
 percentiles
 Norm-referenced scores that indicate the
percentage of a norm group that a particular score exceeded.
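
A minimal sketch of this definition, counting the share of a norm group scoring below a given score. The function name `percentile_rank` and the norm-group scores are hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group that the given score exceeded."""
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)

# Hypothetical norm group of ten scores.
norm_group = [40, 50, 50, 60, 70, 80, 90, 90, 95, 100]
print(percentile_rank(80, norm_group))  # 50.0
```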
 performance bias
 Bias introduced when individuals are not
able to perform on a test because they have not had the opportunity
to learn the test content.
 performance test
 Non-paper-and-pencil tests that require the
student to engage in some type of process, produce a product,
or both.
 pilot study
 A miniature study conducted with a group
of students that is not used as part of the major study. It is
used to try out procedures or instruments (adapted from Hopkins
& Antes, 1990, p. 461).
 phi coefficient
 Shows the degree of relationship between
two dichotomous variables.
 platykurtic
 A frequency distribution that is flatter
than normal.
 point biserial correlation
 Shows the degree of relationship between
a continuous and a truly dichotomous variable.
 population
 Any defined aggregate of persons, objects,
or events.
 positional preference
 The regular placement of the correct response
in a particular position; for instance, always in choice C.
 positive skewness
 Asymmetry in which most of the scores in
a distribution are low.
 power test
 A test in which time does not affect quality
of performance, that is, students would not perform better if
given additional time.
 practice effect
 The consequences of taking similar tests
or test-like exercises.
 pre-post discrimination
 Item analysis technique concerned with assessing
performance before and after instruction.
 predictive validity
 A form of criterion validity based on the
correlation of test scores with scores on a criterion measure
obtained at some later time.
 premises
 In a matching item, the column of words
consisting of item stems.
 prescriptive test
 A test designed to identify student deficiencies,
weaknesses or problems, and to suggest corrective learning activities.
 problem solving
 Settlement of a perplexing question
or situation.
 product moment correlation
 Shows the degree of relationship between
two continuous variables.
 project
 Any thrust area activity which is funded
by the National Science Foundation or uses resources designated
as matching funds.
 psychomotor
 Having to do with movement or motor skills.
 psychomotor domain
 The area of human action which emphasizes
all types of body movements which are involuntary or voluntary.
 psychomotor taxonomy
 A system for classifying psychomotor behaviors
in terms of the amount of concentration required.
 qualitative
 Information in the form of statements or
narrative.
 quantitative
 Information that has been expressed in terms
of mathematically manipulable numbers.
 quartile deviation
 A measure of variability or dispersion of
a distribution of scores.
 quartile one
 The point (score) in a distribution that
sets off the lower fourth of the group.
 quartile three
 The point (score) in a distribution that
sets off the upper fourth of the group.
 quartile two
 The point (score) in a distribution which
divides the distribution into two equal parts.
 random sample
 A sample in which every member of the parent
population has an equal chance of being chosen.
 range
 The difference between the highest and lowest
scores in a distribution.
 rank correlation
 Shows the degree of relationship between
two continuous variables by comparing ranks.
 rating scale
 A measure that contains one's estimate of
the value of a person or thing.
 ratio measurement
 Measurement that classifies, orders, has
equal units, and a true zero point.
 raw score
 The original score, as of a test, before
it is statistically adjusted. It may include weighting and a correction
for guessing but no other transformation.
 reading difficulty
 The level of reading ability required to
understand test questions.
 reliability coefficient
 A numerical index of reliability based on
a correlation coefficient; theoretically, the index can range
from 0 to +1.0.
 reliability
 The consistency with which a data collection
device measures whatever it is that the device measures.
 representative sample
 Any subset of persons or items selected
to represent a larger group or population which has the same inclinations
as the total group or population with reference to some characteristic
or characteristics. In testing, the test instrument is composed
of tasks which are intended to reflect the characteristics of
the larger population of possible test tasks which could be asked.
 role-playing
 The act of assuming a pose or role when
responding to affective questions.
 sample
 Any subaggregate of a larger population.
 scatterplot
 A two-dimensional graph of the relationship
between two sets of scores.
 scorer reliability
 The consistency with which two or more individuals
would score the same response to a test item.
 secure test
 A test (often commercially published) that
is not circulated so it can be used repeatedly.
 separate answer sheets
 Forms provided for item response that are
not attached to nor contained in the test copy; many can be electronically
scored.
 short-answer item
 A test item for which the student supplies
a brief response, usually consisting of a word or phrase.
 skewness
 The tendency of a distribution to depart
from symmetry or balance.
 socially acceptable response
 An answer to a question that may be inaccurate
but conforms to desired social norms.
 Spearman-Brown formula
 A formula for estimating reliability if
test length is changed.
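
The glossary names the formula without stating it. The sketch below uses the standard form, which is an assumption here: the new reliability equals n times the old reliability divided by 1 plus (n - 1) times the old reliability, where n is the factor by which test length changes. The function name `spearman_brown` is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Estimated reliability after changing test length by length_factor."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Doubling a test whose current reliability is .60.
print(round(spearman_brown(0.60, 2), 2))  # 0.75
```

A length factor of 2 is also the usual step for moving from a half-test correlation to a full-test estimate.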
 specific determiners
 Terms such as always, never, every, and
all that provide clues to correct answers.
 specific factor
 From factor analysis; a factor that has
a high loading with only one measure or test.
 speeded test
 A test administered so that students are
required to complete the exam within a specified amount of time.
 split-half method
 A procedure for estimating test reliability
by which a test is divided into two comparable halves and the
scores on the halves are then correlated.
 stability reliability
 The extent to which measurement on the same
test is consistent over time.
 standard deviation
 A measure of dispersion in a distribution
that is the positive square root of the variance.
 standard error of estimate
 Gives the amount of error involved in predicting
a score from the regression equation.
 standard error of measurement
 The standard deviation of the distribution
of error scores.
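
In practice this quantity is often estimated from the standard deviation of observed scores and a reliability coefficient. That estimate, the score standard deviation times the square root of one minus the reliability, is an assumption here and is not given in the glossary; the function name `standard_error_of_measurement` is hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(score_sd, reliability):
    """Estimate SEM as SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * sqrt(1 - reliability)

# A test with a standard deviation of 10 and a reliability of .84.
print(round(standard_error_of_measurement(10, 0.84), 2))  # 4.0
```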
 standard error of the mean
 The standard deviation of a
distribution of sample means.
 standard score
 A norm-referenced measurement that indicates
how many standard deviations a score is above or below the mean.
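
A brief sketch of the calculation implied by this definition; the function name `standard_score` and the values are hypothetical.

```python
def standard_score(raw_score, group_mean, group_sd):
    """Number of standard deviations a score lies above or below the mean."""
    if group_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - group_mean) / group_sd

# A raw score of 65 in a group with mean 50 and standard deviation 10.
print(standard_score(65, 50, 10))  # 1.5
# A raw score below the mean gives a negative standard score.
print(standard_score(45, 50, 10))  # -0.5
```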
 standardized
 A process of preparing a test instrument
for use in widely separated locations. The test is standardized
so that administration and scoring procedures are the same for
all test takers. Score interpretation is made to averages of performances
of groups of test takers whose scores are then used for making
comparison to interpret scores obtained from other students.
 stanines
 Norm-referenced scores that range from
1 to 9; they have a mean of 5 and a standard deviation of 2.
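
Because stanines have a mean of 5 and a standard deviation of 2, one rough way to obtain them, assumed here for illustration, is to rescale a standard (z) score, round to the nearest whole number, and clip the result to the range 1 to 9. Published tests may instead assign stanines from percentile bands. The function name `stanine_from_z` is hypothetical.

```python
import math

def stanine_from_z(z):
    """Approximate stanine: round(2z + 5), limited to the range 1..9."""
    rounded = math.floor(2 * z + 5 + 0.5)  # round half up
    return max(1, min(9, rounded))

for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, stanine_from_z(z))
```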
 statistics
 Descriptive characteristics of a distribution
of scores; also, that area of mathematics dealing with the collection,
organization, and interpretation of numerical data.
 stem
 The introductory part of an objective test
item.
 subjective
 Existing in the mind; belonging to the thinking
subject rather than to the object of thought; relating to the
nature of an object as it is known in the mind as distinct from
a thing in itself.
 summative testing
 Done at the conclusion of a course or some
larger instructional period.
 t test for a correlation
 A test to discover if a correlation shows
a real (significant) relationship, or a relationship due merely
to chance.
 t test between means or proportions
 A test to discover if the difference between
two means or two proportions is significant, or merely due to
chance.
 table of specifications
 A two-dimensional grid, content by cognitive
process, used in planning a test.
 take-home test
 A test that a student completes outside
of class, usually in an uncontrolled setting.
 taxonomy
 A system of classification and the concepts
of identification, naming, and categorization underlying the coordination.
 teacher competency test
 A test for (prospective) teachers on knowledge
and skills essential for effective teaching.
 technical adequacy
 The level of test reliability and validity
necessary before the test can be recommended for use.
 technical problem
 A complex situation from a specialized field
of study which is presented to a student for solution within the
structure of that field. Usually used for assessment of general
understandings of a wide set of principles and ideas rather than
for special skills and talents.
 test anxiety
 A psychological state of stress caused by
a testing situation.
 test bias
 A systematic error in the measurement process.
 test item file
 A collection of individual items on cards
which are arranged by content areas for future use in test assembly.
 test-retest method
 A procedure of estimating test reliability
by which the same test is administered twice to the same individuals
and the scores from the two administrations are then correlated.
 test
 The set of items or questions presented
to one or more individuals under specified conditions for purposes
of measurement.
 testing arrangement
 The setting in which a test is administered.
 testing
 The process of administering or taking a
test.
 tetrachoric correlation
 Shows the degree of relationship between
two normally distributed variables which are categorized into
dichotomies.
 transformed standard scores
 Z-scores that have been converted to a distribution
with a prespecified mean and standard deviation.
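
The conversion described here is a linear rescaling. In the sketch below, the function name `transform_z` is hypothetical, and the target mean of 50 with standard deviation of 10 (the familiar T-score scale) is chosen only as an example.

```python
def transform_z(z, new_mean, new_sd):
    """Convert a z-score to a scale with the given mean and standard deviation."""
    return new_mean + new_sd * z

# A z-score of 1.5 on a scale with mean 50 and standard deviation 10.
print(transform_z(1.5, 50, 10))   # 65.0
# A z-score of -1.0 on the same scale.
print(transform_z(-1.0, 50, 10))  # 40.0
```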
 true component
 The part of an individual's score that is
nonerror; the score if the test were perfectly reliable.
 true-false item
 A test format in which examinees indicate
whether given statements are correct (true) or incorrect (false).
 unobtrusive observation
 Instances of noticing made in such a way
that persons being observed do not know that they are being observed.
 usability
 The practical factors that must be considered
in test selection: cost, testing time, examiner training, and
so on.
 validity
 The extent to which a test measures what
it is intended to measure.
 validity coefficient
 The correlation between a test of known
validity and a test of unknown validity.
 variance
 A measure of dispersion.
 weighted scores
 The composite scores that are weighted combinations
of two or more separate scores.
 work sample
 A nontest measurement of student
learning.




References
Koenker, R. H. (1971). Simplified statistics. Totowa, NJ: Littlefield, Adams & Co.
Wiersma, W., & Jurs, S. G. (1990). Educational measurement and testing (2nd ed.). Boston: Allyn & Bacon.
Hopkins, C. D., & Antes, R. L. (1990). Classroom testing: Construction. Itasca, IL: F. E. Peacock Publishers.
Hopkins, C. D., & Antes, R. L. (1990). Educational research: A structure for inquiry (3rd ed.). Itasca, IL: F. E. Peacock Publishers.
Webster's Encyclopedic Unabridged Dictionary of the English Language. (1989). New York: Gramercy Books.



© 2001 Foundation Coalition. All rights reserved.



