Definitions-Statistics and Research

By Ted Nissen M.A. M.T.

Copyright © February 2007 Ted Nissen


Abscissa. The horizontal or X axis of a graph.

Absolute Value. The value of a number without consideration of its algebraic sign.

Additive. Can legitimately be summed.

Alternative Hypothesis. The hypothesis that the mean of the population treated in a certain way is not equal to the mean of the population not treated in that way; symbolized H1.

Analysis of Variance (ANOVA). A statistical method for determining the significance of the differences among a set of means.

Asymptotic. A line that continually approaches but never reaches a specified level.

Bar Graph. A frequency graph for nominal or qualitative data. Bars are raised from each designation of a nominal variable on the X axis to the level of its frequency on the Y axis. Space is left between the bars.

Biased Sample. A sample that does not provide all members of the population an equal prob­ability of selection.

Bimodal Distribution. A distribution with two modes.

Binomial Distribution. A distribution of events that have only two possible outcomes.

Bivariate Distribution. A joint distribution of two variables, the individual scores of which are paired in some logical way.

Cell. The portion of an ANOVA table containing the scores of subjects treated alike.

Central Limit Theorem. The theorem in mathematical statistics that the sampling distribution of the mean approaches a normal curve as N gets larger, and that the standard deviation of this sampling distribution is equal to σ/√N.
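The σ/√N relationship can be checked with a small simulation; the sketch below is illustrative only, using a die roll (uniform integers 1 to 6) as the population and sample sizes and counts chosen arbitrarily.

```python
import random
import statistics

# Minimal Central Limit Theorem sketch: draw many samples of size N
# from a uniform "die roll" population and compare the standard
# deviation of the sample means with sigma / sqrt(N).
random.seed(0)
N = 30
means = [statistics.mean(random.randint(1, 6) for _ in range(N))
         for _ in range(5000)]

observed_se = statistics.pstdev(means)
predicted_se = statistics.pstdev(range(1, 7)) / N ** 0.5  # sigma / sqrt(N)

# The two values should agree closely (about 0.31 for this population).
print(round(observed_se, 3), round(predicted_se, 3))
```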

Central Value. The mean, median, or mode; a statistic that describes the typical score in a dis­tribution.

Chi Square Distribution. A theoretical sampling distribution of chi square values. There is a chi square distribution for each number of degrees of freedom.

Class Interval. A range of scores grouped together in a grouped frequency distribution.

Coefficient of Determination. A squared correlation coefficient; an estimate of common variance.

Common Variance. Variance held in common by two variables. It is assumed to be determined or caused by the same factors.

Confidence Interval. An interval of scores within which, with specified confidence, a parameter is expected to lie.

Confidence Limits. Two numbers that define the boundaries of a confidence interval.

Constant. A mathematical value that remains the same within a series of operations; for ex­ample, regression coefficients a and b have the same value for all predictions from the same regression line.

Control Group. A group in an experiment against which other groups are compared.

Correlated-Samples Design. An experimental design in which measures from different groups are not independent of each other. Some writers call this a dependent-samples design.

Correlation. A relationship between variables such that increases or decreases in the value of one variable tend to be accompanied by increases or decreases in the other.

Critical Region. The area of the sampling distribution covering values of the test statistic that are improbable if the null hypothesis is true, and that therefore lead to its rejection.

Critical Value. The value from a sampling distribution against which a computed statistic is compared to determine whether the null hypothesis may be rejected.

Degrees of Freedom. The number of observations minus the number of necessary relations obtaining among these observations.

Dependent Variable. The variable that is measured and analyzed in an experiment. Its values are tested to determine whether they are dependent upon values of the independent variable.

Descriptive Statistic. Index number that summarizes or describes a set of data.

Deviation Score. A raw score minus the mean of the distribution from which the raw score was drawn.

Dichotomous Variable. A variable taking two, and only two, values.

Distribution-Free Statistics. Statistical methods that do not assume any particular population distribution.

Empirical Distribution. An arrangement from highest to lowest of actual scores from real observations.

Error Variance. Variance due to factors not controlled in the experiment; within-group variance.

Expected Value. The mean value of a random variable over an infinite number of samplings. The expected value of a statistic is the mean of the sampling distribution of the statistic.

Experimental Group. A group that receives a treatment in an experiment and whose dependent­ variable scores are compared with those of a control group.

Extraneous Variable. A variable, other than the independent variable, that may affect the dependent variable.

F Distribution. A theoretical sampling distribution of F values. There is a different F distribu­tion for each combination of degrees of freedom.

F Test. A method of determining the significance of the differences among two or more means.

Factor. Independent variable.

Factorial Design. An experimental design using two or more levels of two or more factors and permitting an analysis of interaction effects between independent variables.

Frequency. The number of times a score occurs in a distribution.

Frequency Polygon. A graph with quantitative scores on the X axis and frequencies on the Y axis. Each point on the graph represents a score and the frequency of occurrence of that score. Points are connected by a line.

Goodness of Fit. Degree to which observed data coincide with theoretical expectations.

Grand Mean. The mean of all the scores in an experiment.

Grouped Frequency Distribution. An arrangement of scores from highest to lowest in which scores are grouped together into equal-sized ranges called class intervals. The number of scores occurring in each class interval is placed in a column beside the appropriate class interval.

Histogram. A graph with quantitative scores on the X axis and frequencies on the Y axis. A bar covering the range from the lower to upper limit of each score or class interval is raised to the level of that score's frequency. There is no space between the bars.

Hypothesis. A statement about the relationship between two or more phenomena.

Hypothesis Testing. The process of hypothesizing a parameter and comparing (or testing) the parameter with an empirical statistic in order to decide whether the parameter is reasonable.

Independent. Events that have nothing to do with each other. Occurrence or variation of one does not affect the occurrence or variation of the other. Two sets of uncorrelated scores are independent of each other.

Independent-Samples Design. An experimental design using samples whose dependent-variable scores cannot logically be paired.

Independent Variable. The treatment variable; it is selected by the experimenter.

Inferential Statistics. A method of deciding between two or more alternative conclusions.

Interaction. A relationship between two factors such that the effect of one treatment on the dependent variable depends upon the level of the other treatment.

Interpolation. A method for determining a value known to lie between two other values.

Interval Scale. A measurement scale in which equal differences between numbers stand for equal differences in the thing measured. The zero point is arbitrarily defined.

Least-Squares Solution. Method of fitting a regression line such that the sums of the squared deviations from the straight regression line will be a minimum.

Level. A treatment chosen from an independent variable.

Level of Confidence. The confidence (1 − α) that a parameter lies within a given interval.

Level of Significance. The probability level at which the null hypothesis is rejected.

Line Graph. A graph presenting the relationship between two variables.

Linearity. The condition wherein the "line of best fit" through a scatterplot is a straight line.

Lower Limit. The bottom of the range of possible values that a score on a quantitative variable can take; for example, a score of 5 has 4.5 as its lower limit.

Main Effect. The deviation of one or more treatment means from the grand mean.

Mann-Whitney U Test. A nonparametric method used to determine whether two sets of ranked data based on two independent samples came from the same population.

Matched Pairs. A correlated-samples design in which pairs of scores are matched.

Mean. The arithmetic average; the sum of the scores divided by the number of scores.

Mean Square. An ANOVA term for the variance; a sum of squares divided by its degrees of freedom.

Median. The point that divides a distribution of scores into two equal halves, so that half the scores are above the median and half are below it.

Mode. The score that occurs most frequently in a distribution.
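The three central values just defined (mean, median, mode) can be illustrated on one small distribution; the scores below are made up for the example.

```python
import statistics

# One small distribution illustrating the three central values.
scores = [2, 3, 3, 5, 7]

mean = statistics.mean(scores)      # sum / N = 20 / 5 = 4.0
median = statistics.median(scores)  # middle score of the sorted list = 3
mode = statistics.mode(scores)      # most frequent score = 3

print(mean, median, mode)  # → 4.0 3 3
```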

Multiple Comparisons. Tests of differences between treatment means or combinations of means following an ANOVA.

Multiple Correlation. A correlation method that combines intercorrelations among more than two variables into a single statistic.

Natural Pairs. A correlated-samples design, in which pairing occurs prior to the experiment.

Nominal Scale. A scale of measurement in which numbers are used simply as names and have no real quantitative value.

Nonparametric Methods. Statistical methods that do not require the estimation of parameters.

Normal Distribution. A theoretical distribution based on frequency of occurrence of chance events.

Normality. The condition of being distributed in the form of the normal curve.

Null Hypothesis. The assumption that the difference between an observed statistic and a pro­posed parameter is the result of chance.

Observed Frequency. Number of observations actually occurring in a category.

One-Tailed Test. A statistical test in which the critical region lies in one tail of the distribution.

Operational Definition. A definition that specifies a concrete meaning for a variable. The vari­able is defined in terms of the operations of the experiment; for example, hunger may be defined as "24 hours of food deprivation."

Ordinal Scale. A rank-ordered scale of measurement in which equal differences between num­bers do not represent equal differences between the things measured.

Ordinate. The vertical or Y axis of a graph.

Orthogonal. Independent; uncorrelated.

Parameter. Some numerical characteristic of a population.

Parameter Estimation. Estimating one particular point to be the parameter of a population.

Partial Correlation. Technique that allows the separation or partialing out of the effects of one variable from the correlation of two other variables.

Population. All members of a specified group.

Proportion. A part of a whole.

P-Value. The probability that the observed differences between groups are due to chance alone.

Qualitative Variable. A variable that exists in different kinds; measured on a nominal scale.

Quantitative Variable. A variable that exists in different amounts.

Random Sample. A subset of a population chosen in such a way that all samples of the specified size have an equal probability of being selected.

Range. The difference between the highest and lowest scores plus 1.

Ratio Scale. A scale that has all the characteristics of an interval scale, plus a true zero point.

Raw Score. A score as it is obtained in an experiment.

Rectangular Distribution. A distribution in which all scores have the same frequency.

Regression Coefficients. The values a (point where the regression line intersects the Y axis) and b (slope of the regression line).

Regression Equation. An equation used to predict particular values of Y for specific values of X.

Regression Line. The "line of best fit" that runs through a scatterplot.
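The regression coefficients a and b can be computed directly from their least-squares definitions; the data and variable names below are illustrative, not from the text.

```python
# Least-squares regression sketch: b is the slope, a the Y-intercept,
# and the fitted line passes through the point (mean_x, mean_y).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = mean_y - b * mean_x

# Regression equation: predict Y for a specific X, here X = 6.
predicted_y = a + b * 6
print(round(a, 2), round(b, 2), round(predicted_y, 2))  # → 2.2 0.6 5.8
```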

Repeated Measures. An experimental design in which more than one dependent-variable measure is taken on each subject.

Sample. A subset of a population.

Sampling Distribution. A theoretical distribution of a statistic based on all possible random samples drawn from the same population; used to determine probabilities.

Sampling Error. The tendency of sample statistics from the same population to vary from one sample to another.

Scatterplot. The plot of points that results when a distribution of paired X and Y values is plotted on a graph.

Scheffe Test. A method of making all possible comparisons after an ANOVA.

Simple Effect. The difference between cell means in a factorial ANOVA.

Simple Frequency Distribution. Scores arranged from highest to lowest, with the frequency of each score placed in a column beside the score.

Skewed Distribution. An asymmetrical distribution. The skew may be positive (more low scores than high, so that the frequency polygon is pointed toward the right) or negative (more high scores than low, so that the frequency polygon is pointed toward the left).

Spearman's Rho. A correlation statistic for two sets of ranked data.

Standard Deviation. The square root of the mean of the squared deviations.
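The standard deviation's definition chains together several terms from this glossary (deviation scores, sum of squares, variance); the short sketch below builds it up piece by piece on made-up scores.

```python
import math

# Standard deviation assembled from its parts: deviation scores,
# sum of squares, variance, then the square root.
scores = [1, 2, 3, 4, 5]
mean = sum(scores) / len(scores)

deviations = [x - mean for x in scores]   # deviation scores
ss = sum(d ** 2 for d in deviations)      # sum of squares = 10.0
variance = ss / len(scores)               # mean squared deviation = 2.0
sd = math.sqrt(variance)                  # standard deviation ≈ 1.414

print(ss, variance, round(sd, 3))  # → 10.0 2.0 1.414
```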

Standard Error. The standard deviation of a sampling distribution.

Standard Error of Estimate. The standard deviation of the differences between predicted outcomes and actual outcomes.

Standard Error of the Difference. The standard deviation of a sampling distribution of differences between means.

Standard Score. A score expressed in standard-deviation units.

Statistic. Some numerical characteristic of a sample.

Stratified Sample. A sample drawn in such a way that it reflects exactly a known characteristic of the population.

Subsample. A subset of a sample.

Sum of Squares. The sum of the squared deviations from the mean; the numerator of the formula for the standard deviation.

t Distribution. Theoretical distribution used to determine significance of experimental results based on small samples.

t Test. Significance test that uses the t distribution.

Theoretical Distribution. Arrangement of hypothesized scores based on mathematical formulas and logic.

Theoretical Frequency. Number of observations expected in a category if the null hypothesis is true; expected frequency.

Treatment. A level of an independent variable.

Two-Tailed Test of Significance. Any statistical test in which the critical region is divided into the two tails of the distribution.

Type I Error. Rejection of the null hypothesis when it is true.

Type II Error. Retention of the null hypothesis when it is false.

Upper Limit. The top of the range of values a score from a quantitative variable can take; for example, the number 5 has 5.5 as its upper limit.

U Value. Statistic used in the Mann-Whitney U test.

Variability. Differences among scores in a distribution.

Variable. Something that exists in more than one amount or in more than one form.

Variance. The square of the standard deviation.

Wilcoxon and Wilcox Multiple Comparisons. A nonparametric method for independent samples in which all possible pairs of treatments are compared.

Wilcoxon Rank-Sum Test. A nonparametric test for testing the difference between two independent samples.

Wilcoxon Matched-Pairs Signed-Ranks Test. A nonparametric test for testing the difference between two correlated samples.

Yates' Correction. A correction for a 2 x 2 chi square when expected frequencies are few.

z Score. A score expressed in standard-deviation units; used to compare the relative standing of scores in two different distributions.
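The comparison across distributions that a z score permits can be sketched in a few lines; the exam figures below are made up for illustration.

```python
# z scores express a raw score in standard-deviation units, so scores
# from different distributions become directly comparable.
def z_score(x, mean, sd):
    return (x - mean) / sd

# A 75 on an exam with mean 70, SD 5 versus an 80 on an exam with
# mean 78, SD 4: which is the better relative performance?
z_first = z_score(75, 70, 5)   # (75 - 70) / 5 = 1.0
z_second = z_score(80, 78, 4)  # (80 - 78) / 4 = 0.5

# The 75 ranks higher relative to its own group, despite being the
# lower raw score.
print(z_first, z_second)  # → 1.0 0.5
```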