Statistics Calculator
Statistics Calculator Menu
Statistics Calculator is an easy-to-use program
designed to perform a series of basic statistical procedures related to distributions
and probabilities. Most of the procedures are called inferential
because data from a sample is used to draw inferences about a population.
The menu bar of Statistics Calculator contains six
types of operations that can be performed by the software.
The Distributions menu item is the
electronic equivalent of probability tables. Algorithms are included
for the z, t, F, and chi-square distributions. This selection may be used
to find probabilities and critical values for the four statistics.
The Counts menu item contains routines to
analyze a contingency table of counts, compute Fisher's exact probability for
two-by-two tables, use the binomial distribution to predict the probability
of a specified outcome, and use the Poisson distribution to test the likelihood
of observing a specified number of events.
The Percents menu item is used to compare
two percents. Algorithms are included to compare proportions drawn from
one or two samples. There is also a menu option to calculate confidence
intervals around a percent.
The Means menu item is used to calculate the mean
and standard deviation of a sample, compare two means to each other,
calculate a confidence interval around a mean, compare a sample mean to a
population mean, compare two standard deviations to each other, and compare
three or more standard deviations.
The Correlation menu item is used to
calculate correlation and simple linear regression statistics for paired
data. Algorithms are included for ordinal and interval
data.
The Sampling menu item is used to determine
the required sample size for a study. The software can be used for
problems involving percents and means.
The Distributions menu selection is used to
calculate critical values and probabilities for various distributions. The
most common distributions are the z (normal) distribution, t
distribution, F distribution, and the chi-square distribution. Within
the last 20 years, computers have made it easy to calculate exact
probabilities for the various statistics. Prior to that, researchers made
extensive use of books containing probability tables.
Normal distribution
The normal distribution is the best-known
distribution and is often referred to as the z distribution or the
bell-shaped curve. As a rule of thumb, it is used when the sample size is greater than 30.
When the sample size is less than 30, the t distribution is used
instead of the normal distribution.
The menu offers three choices: 1) probability of a z
value, 2) critical z for a given probability, and 3) probability of a defined
range.
Probability of a z Value
When you have a z (standardized) value for a variable,
you can determine the probability of that value. The software is the
electronic equivalent of a normal distribution probability table. When you
enter a z value, the area under the normal curve between -z and +z will be calculated.
The remaining area, in the two tails beyond -z and +z, is referred to as the rejection region.
It is also called a two-tailed probability because
both tails of the distribution are excluded. The Statistics Calculator reports
the two-tailed probability for the z value. A one-tailed probability is used
when your research question is concerned with only half of the distribution.
Its value is exactly half the two-tailed probability.
Example
z-value: 1.96
-----------------------------------------
Two-tailed probability = .0500
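As a rough illustration of the lookup described above, the following sketch computes the two-tailed and one-tailed probabilities of a z value with Python's standard library. The function names are illustrative only, and this is not necessarily the algorithm Statistics Calculator uses.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Area in both tails of the standard normal curve beyond -|z| and +|z|."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def one_tailed_p(z):
    """Area in a single tail, which is half the two-tailed probability."""
    return two_tailed_p(z) / 2.0

print(f"Two-tailed probability = {two_tailed_p(1.96):.4f}")
print(f"One-tailed probability = {one_tailed_p(1.96):.4f}")
```

For z = 1.96 the two-tailed value comes out very close to .05, in line with the example.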
Critical z for a Given Probability
This menu selection is used to determine the critical z
value for a given probability.
Example
A large company designed a
pre-employment survey to be administered to prospective employees.
Baseline data was established by administering the survey to all current
employees. They now want to use the instrument to identify job
applicants who have very high or very low scores. Management has
decided they want to identify people who score in the upper and lower 3% when
compared to the norm. How many standard deviations away from the mean are required to define the upper and lower 3% of the
scores?
The total area of rejection
is 6%. This includes 3% who scored very high and 3% who scored very
low. Thus, the two-tailed probability is .06. The z value
required to reject 6% of the area under the curve is 1.881. Thus, new
applicants who score higher or lower than 1.881 standard deviations away from
the mean are the people to be identified.
Two-tailed probability: .06
---------------------------------
z-value = 1.881
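The reverse lookup can be sketched the same way: split the two-tailed probability evenly between the tails and ask for the z value that leaves that much area in the upper tail. The function name is an assumption made for illustration.

```python
from statistics import NormalDist

def critical_z(two_tailed_probability):
    """z value that leaves half of the given probability in each tail."""
    tail = two_tailed_probability / 2.0
    return NormalDist().inv_cdf(1.0 - tail)

print(f"z-value = {critical_z(0.06):.3f}")
```

With a two-tailed probability of .06, each tail holds 3% of the area, and the result is about 1.881 standard deviations.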
Probability of a Defined Range
Knowing the mean and standard deviation of a sample allows you to establish the area
under the curve for any given range. This menu selection will calculate the
probability that the mean of a new sample would fall between two specified
values (i.e., between the limits of a defined range).
Example
A manufacturer may find that
the mean emission level from a device is 25.9 units with a standard deviation of
2.7. The law limits the maximum emission level to 28.0 units. The
manufacturer may want to know what percent of the new devices coming off the
assembly line will need to be rejected because they exceed the legal limit.
Sample mean = 25.9
Unbiased standard deviation = 2.7
Lower limit of the range = 0
Upper limit of the range = 28.0
----------------------------------------------------------------
Probability of a value falling within the range = .7817
Probability of a value falling outside the range = .2183
The area outside the range is
the sum of the area below the lower limit and the area above the
upper limit.
The area under the normal
curve is the probability that additional samples would fall between the lower
and upper limits. In this case, the area above the upper limit is the
rejection area (21.83% of the product would be rejected).
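A minimal sketch of the defined-range calculation for the emission example, assuming a normal curve with the stated mean and standard deviation. The function name is hypothetical.

```python
from statistics import NormalDist

def range_probability(mean, sd, lower, upper):
    """Return (inside, outside): the area between the limits and the rest."""
    dist = NormalDist(mu=mean, sigma=sd)
    inside = dist.cdf(upper) - dist.cdf(lower)
    return inside, 1.0 - inside

inside, outside = range_probability(25.9, 2.7, 0, 28.0)
print(f"Within the range  = {inside:.3f}")
print(f"Outside the range = {outside:.3f}")
```

The area below a lower limit of 0 is negligible here, so nearly all of the outside area lies above the legal limit of 28.0.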
T distribution
Mathematicians used to think that all distributions
followed the bell-shaped curve. In the early 1900s, William Gosset, an English chemist working at the Guinness brewery in Dublin, discovered that distributions were much flatter
than the bell-shaped curve when working with small sample sizes. In
fact, the smaller the sample, the flatter the distribution. The t
distribution is used instead of the normal distribution when the sample size
is small. As the sample size approaches thirty, the t distribution
approximates the normal distribution. Thus, the t distribution is
generally used instead of the z distribution, because it is correct for both
large and small sample sizes, where the z distribution is only correct for
large samples.
The menu offers three choices: 1) probability of a t
value, 2) critical t value for a given probability, and 3) probability of a
defined range.
Probability of a t Value
If you have a t value and the degrees of freedom associated with the value, you can use this
program to calculate the two-tailed probability of t. It is the computerized
equivalent of a table of t values.
Example
t-value: 2.228
df: 10
------------------------------------
Two-tailed probability = .050
Critical t Value for a Given Probability
This program is the opposite of the previous program. It
is used if you want to know what critical t value is required to
achieve a given probability.
Example
Two-tailed probability: .050
Degrees of freedom: 10
-----------------------------------
t-value = 2.228
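Python's standard library has no t distribution, so the sketch below integrates the t density numerically (Simpson's rule) to get the two-tailed probability, and then inverts it by bisection to get the critical t. The approach, the function names, and the search bracket of 0 to 100 are all assumptions made for illustration; the software's own algorithm is not described in the text.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed probability of t; steps must be even for Simpson's rule."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1.0 - 2.0 * area_zero_to_t

def critical_t(probability, df):
    """t value with the given two-tailed probability (assumed to be below 100)."""
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if two_tailed_p(mid, df) > probability:
            low = mid      # tails still too large, so t must be bigger
        else:
            high = mid
    return (low + high) / 2

print(f"Two-tailed probability = {two_tailed_p(2.228, 10):.3f}")
print(f"t-value = {critical_t(0.05, 10):.3f}")
```

With 10 degrees of freedom the two functions reproduce the pair of values shown in the examples (2.228 and .050).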
Probability of a Defined Range
Knowing the mean and standard deviation of a
sample allows you to establish the area under the curve for any given
range. You can use this program to calculate the probability that the
mean of a new sample would fall between two values.
Example
A company did a survey of 20
people who used its product. The mean age of the sample was
22.4 years and the unbiased standard deviation was 3.1 years. The company now
wants to advertise in a magazine that has a primary readership of people who
are between 18 and 24, so it needs to know what percent of its potential
customers are between 18 and 24 years of age.
Sample mean: 22.4
Unbiased standard deviation: 3.1
Sample size = 20
Lower limit of the range = 18
Upper limit of the range = 24
----------------------------------------------------------------
Probability of a value falling within the range = .608
Probability of a value falling outside the range = .392
Because of the small sample
size, the t distribution is used instead of the z distribution. The
area under the curve represents the proportion of customers in the population
expected to be between 18 and 24 years of age. In this example, we would
predict that 60.8% of its customers would be
between 18 and 24 years of age, and 39.2% would be outside of
the range. The company decided not to advertise.
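The defined-range calculation for small samples can be sketched by standardizing each limit and looking up the area under a t curve with n - 1 degrees of freedom. The numerical integration, the function names, and the choice of n - 1 degrees of freedom are assumptions; they happen to agree with the figures in the example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """Area under the t curve to the left of x (Simpson's rule, even steps)."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def range_probability_t(mean, sd, n, lower, upper):
    """Return (inside, outside) for a range, using a t curve with n - 1 df."""
    df = n - 1
    inside = t_cdf((upper - mean) / sd, df) - t_cdf((lower - mean) / sd, df)
    return inside, 1.0 - inside

inside, outside = range_probability_t(22.4, 3.1, 20, 18, 24)
print(f"Within the range  = {inside:.3f}")
print(f"Outside the range = {outside:.3f}")
```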
F distribution
The F-ratio is used to compare variances of two
or more samples or populations. Since it is a ratio (i.e., a fraction), there
are degrees of freedom for the numerator and denominator. This menu selection
may be used to calculate the probability of an F-ratio or to determine
the critical value of F for a given probability. These menu selections
are the computer equivalent of an F table.
Probability of an F-Ratio
If you have an F-ratio
and the degrees of freedom associated with the numerator and denominator, you
can use this program to calculate the probability.
Example
F-ratio: 2.774
Numerator degrees of freedom: 20
Denominator degrees of freedom: 10
----------------------------------------------
Two-tailed probability = .0500
Critical F for a Given Probability
If you know the critical alpha level and the degrees of
freedom associated with the numerator and denominator, you can use this
program to calculate the F-ratio.
Example
Two-tailed probability = .0500
Numerator degrees of freedom: 20
Denominator degrees of freedom: 10
-----------------------------------------------
F-ratio: 2.774
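The two F lookups can be sketched with the same numerical approach: integrate the F density from 0 up to the F-ratio and subtract from 1 to get the area above it, then invert by bisection. The area above the F-ratio is what reproduces the .0500 in the examples. The function names, the bracket of 0 to 100, and the restriction to more than 2 numerator degrees of freedom (so the density is finite at 0) are assumptions of this sketch.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 degrees of freedom."""
    if x <= 0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = (0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                      - (d1 + d2) * math.log(d1 * x + d2))
               - math.log(x) - log_beta)
    return math.exp(log_pdf)

def f_upper_tail(f_ratio, d1, d2, steps=2000):
    """Area above f_ratio (Simpson's rule; steps must be even)."""
    if d1 <= 2:
        raise ValueError("this sketch assumes more than 2 numerator df")
    h = f_ratio / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f_ratio, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3

def critical_f(probability, d1, d2):
    """F-ratio with the given area above it (assumed to be below 100)."""
    low, high = 0.0, 100.0
    for _ in range(60):
        mid = (low + high) / 2
        if f_upper_tail(mid, d1, d2) > probability:
            low = mid
        else:
            high = mid
    return (low + high) / 2

print(f"Probability = {f_upper_tail(2.774, 20, 10):.4f}")
print(f"F-ratio = {critical_f(0.05, 20, 10):.3f}")
```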
Chi-square distribution
The chi-square statistic is used to compare the observed
frequencies in a table to the expected frequencies. This menu selection may
be used to calculate the probability of a chi-square statistic or to determine
the critical value of chi-square for a given probability. This menu selection
is the computer equivalent of a chi-square table.
Probability of a Chi-Square Statistic
If you have a chi-square value and the degrees of
freedom associated with the value, you can use this program to calculate the
probability of the chi-square statistic. It is the computerized equivalent of a
table of chi-square values.
Example
Chi-square value: 18.307
Degrees of freedom: 10
------------------------------------
Probability = .050
Critical Chi-Square for a Given Probability
If you have the critical alpha level and the degrees of
freedom, you can use this program to calculate the critical value of the
chi-square statistic. It is the computerized equivalent of a table of
chi-square values.
Example
Probability = .0500
Degrees of freedom: 10
------------------------------------
Chi-square value: 18.307
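For an even number of degrees of freedom, the upper-tail probability of chi-square has a simple closed form, which is enough to reproduce the example with 10 degrees of freedom. The sketch below uses that closed form and inverts it by bisection. Odd degrees of freedom would need a different method, and the function names and the bracket of 0 to 1000 are assumptions.

```python
import math

def chi_square_p(chi_square, df):
    """Upper-tail probability of chi-square for positive even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = chi_square / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total

def critical_chi_square(probability, df):
    """Chi-square value with the given upper-tail probability (bisection)."""
    low, high = 0.0, 1000.0
    for _ in range(80):
        mid = (low + high) / 2
        if chi_square_p(mid, df) > probability:
            low = mid
        else:
            high = mid
    return (low + high) / 2

print(f"Probability = {chi_square_p(18.307, 10):.3f}")
print(f"Chi-square value = {critical_chi_square(0.05, 10):.3f}")
```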
The Counts menu selection has four tests that can
be performed for simple frequency data. The chi-square test is used to
analyze a contingency table consisting of
rows and columns to determine if the observed cell frequencies differ
significantly from the expected frequencies. Fisher's exact test is similar
to the chi-square test except it is used only for tables with exactly two
rows and two columns. The binomial test is
used to calculate the probability of two mutually exclusive outcomes. The
Poisson distribution events test is used to describe the number of events
that will occur in a specific period of time.
Chi-square test
The chi-square is one of the most popular statistics
because it is easy to calculate and interpret. There are two kinds of
chi-square tests. The first is called a one-way analysis, and the second is
called a two-way analysis. The purpose of both is to determine whether the
observed frequencies (counts) markedly differ from the frequencies that we
would expect by chance.
The observed cell frequencies are organized in rows and
columns like a spreadsheet. This table of observed cell frequencies is
called a contingency table, and the chi-square test is part of a contingency
table analysis.
The chi-square statistic is the sum of the contributions
from each of the individual cells. Every cell in a table contributes
something to the overall chi-square statistic. If a given cell differs
markedly from the expected frequency, then the contribution of that cell to
the overall chi-square is large. If a cell is close to the expected frequency
for that cell, then the contribution of that cell to the overall chi-square
is low. A large chi-square statistic indicates that somewhere in the table,
the observed frequencies differ markedly from the expected frequencies. It
does not tell which cell (or cells) are causing the high chi-square...only
that they are there. When a chi-square is high, you must visually examine the
table to determine which cell(s) are responsible. When there are
exactly two rows and two columns, the chi-square statistic becomes
inaccurate, and Yates' correction for continuity is often applied.
If there is only one column or one row (a one-way chi-square
test), the degrees of freedom is the number of cells minus one. For a
two-way chi-square, the degrees of freedom is the
number of rows minus one times the number of columns minus one.
Using the chi-square statistic and its associated
degrees of freedom, the software reports the probability that the differences
between the observed and expected frequencies occurred by chance.
Generally, a probability of .05 or less is considered to be a significant
difference.
A standard spreadsheet interface is used to enter the
counts for each cell. After you've finished entering the data, the
program will print the chi-square, degrees of freedom and probability of
chance.
Use caution when interpreting the chi-square statistic
if any of the cell frequencies are less than five. Also, use caution
when the total for all cells is less than 50.
Example
A drug manufacturing company
conducted a survey of customers. The research question is: Is there a
significant relationship between packaging preference (size of the bottle
purchased) and economic status? There were four packaging sizes: small,
medium, large, and jumbo. Economic status was: lower, middle, and upper. The
following data was collected.
          lower   middle   upper
small       24      22       18
medium      23      28       19
large       18      27       29
jumbo       16      21       33
------------------------------------------------
Chi-square statistic = 9.743
Degrees of freedom = 6
Probability of chance = .1359
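The calculation described in this section can be sketched as follows: each cell's expected count is its row total times its column total divided by the grand total, and each cell contributes (observed - expected) squared over expected. The probability step uses a closed form that is valid only for even degrees of freedom, which covers this example (6 degrees of freedom). Function names are illustrative, and no continuity correction is applied.

```python
import math

def chi_square_statistic(table):
    """Return (chi-square, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

def chi_square_p_even_df(chi_square, df):
    """Upper-tail probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = chi_square / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Rows: small, medium, large, jumbo. Columns: lower, middle, upper.
counts = [[24, 22, 18],
          [23, 28, 19],
          [18, 27, 29],
          [16, 21, 33]]
statistic, df = chi_square_statistic(counts)
print(f"Chi-square statistic = {statistic:.3f}")
print(f"Degrees of freedom = {df}")
print(f"Probability of chance = {chi_square_p_even_df(statistic, df):.4f}")
```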
Fisher's Exact Test
The chi-square statistic becomes inaccurate when used to
analyze contingency tables that contain exactly two rows and two columns, and
that contain fewer than 50 cases. Fisher's exact probability is not
plagued by inaccuracies due to small N's. Therefore, it should be used
for two-by-two contingency tables that contain fewer than 50 cases.
Example
Here are the results of a
recent public opinion poll broken down by gender. What is the exact
probability that the difference between the observed and expected frequencies
occurred by chance?
           Male   Female
Favor       30      42
Opposed     70      58
-------------------------------------------
Fisher's exact probability = .0249
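The value in this example is reproduced by the hypergeometric probability of the observed table alone (with the row and column totals held fixed), so that is what the sketch below computes. Many programs instead report a tail sum over all tables at least as extreme as the observed one. Which definition Statistics Calculator uses internally is an assumption here, as is the function name.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of the 2x2 table [[a, b], [c, d]] given fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)

# Favor: 30 male, 42 female. Opposed: 70 male, 58 female.
print(f"Probability of the observed table = {table_probability(30, 42, 70, 58):.4f}")
```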
Binomial Test
The binomial distribution is used for calculating the
probability of dichotomous outcomes in which the two choices are mutually
exclusive. The program requires that you enter the number of trials,
probability of the desired outcome on each trial, and the number of times the
desired outcome was observed.
Example
If we were to flip a coin one
hundred times, and it came up heads seventy times, what is the probability of
this happening?
Number of trials: 100
Probability of success on each trial (0-1): .5
Number of successes: 70
---------------------------------------------------------
Probability of 70 or more successes < .0001
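A minimal sketch of the binomial tail probability for the coin example: sum the probability of every outcome from the observed number of successes up to the number of trials. The function name is hypothetical.

```python
from math import comb

def prob_at_least(successes, trials, p):
    """Probability of observing `successes` or more in `trials` attempts."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(successes, trials + 1))

print(f"Probability of 70 or more successes = {prob_at_least(70, 100, 0.5):.6f}")
```

For 70 or more heads in 100 fair flips the sum is well below .0001, in line with the example.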
Poisson Distribution Events Test
The Poisson distribution, like the binomial
distribution, is used to determine the probability of an observed
frequency. It is used to describe the number of events that will occur
in a specific period of time or in a specific area or volume. You need
to enter the observed and expected frequencies.
Example
Previous research on a
particular assembly line has shown that they have an average daily defect
rate of 39 products. Thus, the number of defective products
expected on any day is 39. The day after implementing a new quality
control program, they found only 25 defects. What is the probability of
seeing 25 or fewer defects on any day?
Observed frequency: 25
Expected frequency: 39
---------------------------------------------------
Probability of 25 or fewer events = .0226
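The cumulative Poisson probability can be sketched by adding up the probabilities of 0, 1, 2, ... events up to the observed count. For this example the one-tailed sum comes out near .0113, about half the figure printed above, which suggests the printed value is a two-tailed (doubled) probability. That reading is an inference from the numbers, not something the text states, and the function name is hypothetical.

```python
import math

def poisson_at_most(observed, expected):
    """Probability of `observed` or fewer events when `expected` is the mean."""
    term = math.exp(-expected)        # probability of exactly 0 events
    total = term
    for k in range(1, observed + 1):
        term *= expected / k          # probability of exactly k events
        total += term
    return total

one_tailed = poisson_at_most(25, 39)
print(f"One-tailed probability of 25 or fewer events = {one_tailed:.4f}")
print(f"Doubled (two-tailed) value = {2 * one_tailed:.4f}")
```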
Percents are understood by nearly everyone, and
therefore, they are the most popular statistics cited in research.
Researchers are often interested in comparing two percentages to determine
whether there is a significant difference between them.
Choosing the Proper Test
There are two kinds of t-tests between percents.
Which test you use depends upon whether you're comparing percentages from one
or two samples.
Every percentage can be expressed as a fraction.
By looking at the denominator of the fraction we can determine whether to use
a one-sample or two-sample t-test between percents.
If the denominators used to calculate the two percentages represent the same
people, we use a one-sample t-test between percents to compare the two
percents. If the denominators represent different people, we use the
two-sample t-test between percents.
For example, suppose you did a survey of 200 people.
Your survey asked,
Were you satisfied with the program?
___ Yes ___ No ___ Don't know
Of the 200 people, 80 said yes,
100 said no, and 20 didn't know. You could summarize the responses as:
Yes           80/200 = .4 = 40%
No           100/200 = .5 = 50%
Don't know    20/200 = .1 = 10%
Is there a significant difference between the percent saying
yes (40%) and the percent saying no (50%)? Obviously, there is a
difference; but how sure are we that the difference didn't just happen by
chance? In other words, how reliable is the difference?
Notice that the denominator used to calculate the percent
of yes responses (200) represents the same people as the denominator used to
calculate the percent of no responses (200). Therefore, we use a
one-sample t-test between proportions. The key is that the denominators
represent the same people (not that they are the same number).
After you completed your survey, another group of
researchers tried to replicate your study. They also used a sample size
of 200, and asked the identical question. Of the 200 people in
their survey, 60 said yes, 100 said no, and 40
didn't know. They summarized their results as:
Yes           60/200 = .3 = 30%
No           100/200 = .5 = 50%
Don't know    40/200 = .2 = 20%
Is there a significant difference between the percent
who said yes in your survey (40%) and the percent that said yes in their
survey (30%)? For your survey the percent that said yes was calculated
as 80/200, and in their survey it was 60/200. To compare the yes
responses between the two surveys, we would use a two-sample t-test between
percents. Even though both denominators were 200, they do not represent
the same 200 people.
Examples that would use a one-sample t-test
Which proposal would you vote for?
___ Proposal A ___ Proposal B
Which product do you like better?
___ Name Brand ___ Brand X
Which candidate would you vote for?
___ Johnson ___ Smith ___ Anderson
When there are more than two
choices, you can do the t-test between any two of them. In this
example, there are three possible combinations: Johnson/Smith,
Johnson/Anderson, and Smith/Anderson.
Thus, you could actually perform three separate t-tests...one for each pair
of candidates. If this were your analysis plan, you would also use
Bonferroni's theorem to adjust the critical alpha level because the plan
involved multiple tests of the same type and family.
Examples that would use a two-sample t-test
A previous study found that
39% of the public believed in gun control. Your study found that 34%
believed in gun control. Are the beliefs of your sample different from
those of the previous study?
The results of a magazine
readership study showed that 17% of the women and 11% of the men recalled
seeing your ad in the last issue. Is there a significant difference
between men and women?
In a brand awareness study,
25% of the respondents from the Western region had heard of your
product. However, only 18% of the respondents from the Eastern region
had heard of your product. Is there a significant difference in product
awareness between the Eastern and Western regions?
One Sample t-Test between Percents
This test can be performed to determine whether
respondents are more likely to prefer one alternative or another.
Example
The research question
is: Is there a significant difference between the percent of people who
say they would vote for candidate A and the percent of people who say they
will vote for candidate B? The null hypothesis is: There
is no significant difference between the percent of people who say they will
vote for candidate A or candidate B. The results of the survey were:
Plan to vote for candidate A = 35.5%
Plan to vote for candidate B = 22.4%
Sample size = 107
The sum of the two percents
does not have to be equal to 100 (there may be candidates C and D, and people
that have no opinion). Use a one-sample t-test because both percentages
came from a single sample.
Use a two-tailed probability because the null hypothesis does not state
the direction of the difference. If the hypothesis is that one
particular choice has a greater percentage, use a one-tailed test (divide the
two-tailed probability by two).
Enter the first percent: 35.5
Enter the second percent: 22.4
Enter the sample size: 107
-----------------------------------------
t-value = 1.808
Degrees of freedom = 106
Two-tailed probability = .074
You might make a statement in
a report like this: A one-sample t-test between proportions was
performed to determine whether there was a significant difference between the
percent choosing candidate A and candidate B. The t-statistic was not
significant at the .05 critical alpha level, t(106)=1.808,
p=.074. Therefore, we fail to reject the null hypothesis and conclude
that the difference was not significant.
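The text does not give the formula behind this test. One formula for comparing two percents from the same sample, which reproduces the t-value and degrees of freedom in the example, divides the difference by the square root of (p1 + p2 - (p1 - p2) squared) / n. The sketch below uses that formula as an assumption; the function name is hypothetical.

```python
import math

def one_sample_t_between_percents(percent1, percent2, n):
    """Return (t, degrees of freedom) for two percents from one sample of n."""
    p1, p2 = percent1 / 100.0, percent2 / 100.0
    diff = p1 - p2
    standard_error = math.sqrt((p1 + p2 - diff ** 2) / n)
    return abs(diff) / standard_error, n - 1

t, df = one_sample_t_between_percents(35.5, 22.4, 107)
print(f"t-value = {t:.3f}")
print(f"Degrees of freedom = {df}")
```

The two-tailed probability then follows from a t table or a t-distribution routine with the reported degrees of freedom.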
Two Sample t-Test between Percents
This test can be used to compare percentages drawn from
two independent samples. It can also be used to compare two subgroups from a
single sample.
Example
After conducting a survey of
customers, you want to compare the attributes of men and women. Even
though all respondents were part of the same survey, the men and women are
treated as two samples. The percent of men with a particular attribute
is calculated using the total number of men as the denominator for the
fraction. And the percent of women with the attribute is calculated using the total number of women as the
denominator. Since the denominators for the two fractions represent
different people, a two-sample t-test between percents is appropriate.
The research question is: Is
there a significant difference between the proportion of men having the
attribute and the proportion of women having the attribute? The null
hypothesis is: There is no significant difference between the proportion of
men having the attribute and the proportion of women having the attribute.
The results of the survey were:
86 men were surveyed and 22
of them (25.6%) had the attribute.
49 women were surveyed and 19
of them (38.8%) had the attribute.
Enter the first percent: 25.6
Enter the sample size for the first percent: 86
Enter the second percent: 38.8
Enter the sample size for the second percent: 49
-------------------------------------------------------------
t-value = 1.603
Degrees of freedom = 133
Two-tailed probability = .111
You might make a statement in
a report like this: A two-sample t-test between proportions was
performed to determine whether there was a significant difference between men
and women with respect to the percent who had the attribute. The t-statistic
was not significant at the .05 critical alpha level, t(133)=1.603,
p=.111. Therefore, we fail to reject the null hypothesis and conclude
that the difference between men and women was not significant.
Another example
Suppose interviews were
conducted at two different shopping centers. This two-sample t-test
between percents could be used to determine if the responses from the two
shopping centers were different.
The research question is: Is
there a significant difference between shopping centers A and B with respect
to the percent that say they would buy product X? The null hypothesis
is: There is no significant difference between shopping centers A and B with
respect to the percent of people that say they would buy product X. A two-tailed
probability will be used because the hypothesis does not state the direction
of the difference. The results of the survey were:
89 people were interviewed at
shopping center A and 57 of them (64.0%) said they would buy product X.
92 people were interviewed at
shopping center B and 51 of them (55.4%) said they would buy product X.
Enter the first percent: 64.0
Enter the sample size for the first percent: 89
Enter the second percent: 55.4
Enter the sample size for the second percent: 92
-------------------------------------------------------------
t-value = 1.179
Degrees of freedom = 179
Two-tailed probability = .240
You might write a paragraph
in a report like this: A two-sample t-test between proportions was performed
to determine whether there was a significant difference between the two
shopping centers with respect to the percent who said they would buy product
X. The t-statistic was not significant at the .05 critical alpha level,
t(179)=1.179, p=.240. Therefore, we fail to
reject the null hypothesis and conclude that the difference in responses
between the two shopping centers was not significant.
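Both two-sample examples above are reproduced by the usual pooled-proportion formula: pool the two percents (weighted by sample size), compute the standard error from the pooled value, and use n1 + n2 - 2 degrees of freedom. Treating this as the software's formula is an assumption based on the matching numbers; the function name is hypothetical.

```python
import math

def two_sample_t_between_percents(percent1, n1, percent2, n2):
    """Return (t, degrees of freedom) for percents from two separate samples."""
    p1, p2 = percent1 / 100.0, percent2 / 100.0
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return abs(p1 - p2) / standard_error, n1 + n2 - 2

for label, args in [("Men vs. women", (25.6, 86, 38.8, 49)),
                    ("Center A vs. center B", (64.0, 89, 55.4, 92))]:
    t, df = two_sample_t_between_percents(*args)
    print(f"{label}: t-value = {t:.3f}, degrees of freedom = {df}")
```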
Confidence Intervals around a Percent
Confidence intervals are used to determine how much
latitude there is in the range of a percent if we were to take repeated
samples from the population.
Example
In a study of 150 customers,
you find that 60 percent have a college degree. Your best estimate of the
percent who have a college degree in the population
of customers is also 60 percent. However, since it is just an estimate,
we establish confidence intervals around the estimate as a way of showing how
reliable the estimate is.
Confidence intervals can be
established for any error rate you are willing to accept. If, for example,
you choose the 95% confidence interval, you would expect that in five percent
of the samples drawn from the population, the percent who
had a college degree would fall outside of the interval.
What are the 95% confidence
intervals around this percent? In the following example, note that no
value is entered for the population size. When the population is very
large compared to the sample size (as in most research), it is not necessary
to enter a population size. If, however, the sample represents more
than ten percent of the population, the formulas incorporate a finite
population correction adjustment. Thus, you only need to enter the
population size when the sample size exceeds ten percent of the population
size.
Enter the percent: 60
Enter the sample size: 150
Enter the population size: (left blank)
Enter the desired confidence interval (%): 95
----------------------------------------------------------
Standard error of the proportion = .040
Degrees of freedom = 149
95% confidence interval = 60.0% ± 7.9%
Confidence interval range = 52.1% to 67.9%
Therefore, our best estimate
of the population proportion with 5% error is 60% ± 7.9%. Stated
differently, if we predict that the proportion in the population who have a
college degree is between 52.1% and 67.9%, our prediction would be wrong for
5% of the samples that we draw from the population.
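A sketch of the interval calculation, under several assumptions: the standard error is taken as the square root of p(1 - p)/n, the critical value is passed in by the caller (about 1.976 for 149 degrees of freedom at 95%, read from a t table), and the finite population correction uses the common form sqrt((N - n)/(N - 1)). None of these details are spelled out in the text, and the function name is hypothetical.

```python
import math

def percent_confidence_interval(percent, n, t_critical, population=None):
    """Return (standard error, lower %, upper %) for a sample percent."""
    p = percent / 100.0
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction (assumed form).
        standard_error *= math.sqrt((population - n) / (population - 1))
    margin = t_critical * standard_error * 100.0
    return standard_error, percent - margin, percent + margin

se, low, high = percent_confidence_interval(60, 150, 1.976)
print(f"Standard error of the proportion = {se:.3f}")
print(f"Confidence interval range = {low:.1f}% to {high:.1f}%")
```

Supplying a population size narrows the interval, which is why it matters only when the sample is a sizable share of the population.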
Researchers usually use the results from a sample to
make inferential statements about the population. When the data is
interval or ratio scaled, it is usually described in terms of central tendency
and variability. Means and standard deviations are reported in
most research.
Mean and Standard Deviation of a Sample
This menu selection will let you enter data for a
variable and calculate the mean, unbiased standard deviation, standard error
of the mean, and median. Data is entered using a standard spreadsheet
interface. Finite population correction is incorporated into the calculation
of the standard error of the mean, so the population size should be specified
whenever the sample size is greater than ten percent of the population size.
Example
A sample of ten was randomly
chosen from a large population. The ten scores were:
20 22 54 32 41 43 47 51 45 35
----------------------------------------------------
Mean = 39.0
Unbiased standard deviation = 11.6
Standard error of the mean = 3.7
Median = 42.0
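These descriptive statistics can be reproduced with Python's standard library. In the sketch below, `statistics.stdev` gives the unbiased (n - 1) standard deviation, and the optional finite population correction uses the common form sqrt((N - n)/(N - 1)), which is an assumption about how the software applies it. The function name is hypothetical.

```python
import math
import statistics

def describe(values, population=None):
    """Return (mean, unbiased SD, standard error of the mean, median)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)           # divides by n - 1
    standard_error = sd / math.sqrt(n)
    if population is not None:
        # Finite population correction (assumed form).
        standard_error *= math.sqrt((population - n) / (population - 1))
    return mean, sd, standard_error, statistics.median(values)

scores = [20, 22, 54, 32, 41, 43, 47, 51, 45, 35]
mean, sd, se, median = describe(scores)
print(f"Mean = {mean:.1f}")
print(f"Unbiased standard deviation = {sd:.1f}")
print(f"Standard error of the mean = {se:.1f}")
print(f"Median = {median:.1f}")
```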
Matched Pairs t-Test between Means
The matched pairs t-test is
used in situations where two measurements are taken for each
respondent. It is often used in experiments where there are
before-treatment and after-treatment measurements. The t-test is used
to determine if there is a reliable difference between the mean of the
before-treatment and the mean of the after-treatment measurements.
Pretreatment                Posttreatment
Johnny -------------------- Johnny
Martha -------------------- Martha
Jenny ---------------------- Jenny
Sometimes, in very sophisticated (i.e., expensive)
experiments, two groups of subjects are individually matched on one or more
demographic characteristics. One group is exposed to a treatment
(experimental group) and the other is not (control group).
Experimental                      Control
Johnny -------------------------- Fred
Martha -------------------------- Sharon
Jenny ---------------------------- Linda
The t-test works with small or large N's because it
automatically takes into account the number of cases in calculating the
probability level. The magnitude of the t-statistic depends on the
number of cases (subjects). The t-statistic in conjunction with the degrees
of freedom is used to calculate the probability
that the difference between the means happened by chance. If the
probability is less than the critical alpha level, then we say that a
significant difference exists between the two means.
Example
An example of a matched-pairs
t-test might look like this:
Pretest   Posttest
    8        31
   13        37
   22        45
   25        28
   29        50
   31        37
   35        49
   38        25
   42        36
   52        69
-----------------------------------------------------------
Var. 1: Mean = 29.5   Unbiased SD = 13.2
Var. 2: Mean = 40.7   Unbiased SD = 13.0
t-statistic = 2.69
Degrees of freedom = 9
Two-tailed probability = .025
You might make a statement in
a report like this: The mean pretest score was 29.5 and the mean
posttest score was 40.7. A matched-pairs t-test was performed to determine if
the difference was significant. The t-statistic was significant at the .05
critical alpha level, t(9)=2.69, p=.025.
Therefore, we reject the null hypothesis and
conclude that posttest scores were significantly higher than pretest scores.
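The matched-pairs statistic can be sketched as a one-sample test on the differences: subtract each pretest score from its posttest score, then divide the mean difference by its standard error. The function name is hypothetical.

```python
import math
import statistics

def matched_pairs_t(before, after):
    """Return (t, degrees of freedom) for paired measurements."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1

pretest = [8, 13, 22, 25, 29, 31, 35, 38, 42, 52]
posttest = [31, 37, 45, 28, 50, 37, 49, 25, 36, 69]
t, df = matched_pairs_t(pretest, posttest)
print(f"t-statistic = {t:.2f}")
print(f"Degrees of freedom = {df}")
```

Because the test works on the differences, the degrees of freedom are the number of pairs minus one, not the number of scores.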
Independent Groups t-Test between Means
This menu selection is used to determine if there is a
difference between two means taken from different samples. If you know
the mean, standard deviation and size
of both samples, this program may be used to determine if there is a reliable
difference between the means.
One measurement is taken for each respondent. Two
groups are formed by splitting the data based on some other variable. The
groups may contain a different number of cases. There is not a
one-to-one correspondence between the groups.
Score   Sex                          Males   Females
  25     M                             25       27
  27     F     -----becomes---->       19       17
  17     F                                      21
  19     M
  21     F
Sometimes the two groups are formed because the data was
collected from two different sources.
School A Scores   School B Scores
      525               427
      492               535
      582               600
      554               520
There are actually two different formulas to calculate
the t-statistic for independent groups. The t-statistics
calculated by both formulas will be similar but not identical. Which
formula you choose depends on whether the variances of the two groups are
equal or unequal. In actual practice, most researchers assume that the
variances are unequal because it is the most conservative approach and is
least likely to produce a Type I error. Thus, the formula used in Statistics
Calculator assumes unequal variances.
Example
Two new product formulas were
developed and tested. A twenty-point scale was used to measure the
level of product approval. Six subjects tested the first formula. They
gave it a mean rating of 12.3 with a standard deviation of 1.4. Nine subjects
tested the second formula, and they gave it a mean rating of 14.0 with a
standard deviation of 1.7. The question we might ask is whether the
observed difference between the two formulas is reliable.
Mean of the first group: 12.3
Unbiased standard deviation of the first group: 1.4
Sample size of the first group: 6
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Mean of the second group: 14.0
Unbiased standard deviation of the second group: 1.7
Sample size of the second group: 9
-------------------------------------------------------------------
t value = 2.03
Degrees of freedom = 13
Two-tailed probability = .064
You might make a statement in
a report like this: An independent groups
t-test was performed to compare the mean ratings between the two formulas.
The t-statistic was not significant at the .05 critical alpha level, t(13)=2.03, p=.064. Therefore, we fail to reject the
null hypothesis and conclude that there was no significant difference between
the ratings for the two formulas.
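The two formulas mentioned above can be compared with a short Python sketch that works from the summary statistics alone. The function names `pooled_t` and `welch_t` are hypothetical, and the formulas are the usual textbook forms rather than anything taken from the program. For these inputs the equal-variance (pooled) form gives about 2.03 with 13 degrees of freedom, which matches the printed output, while the unequal-variance (Welch) form gives about 2.11 with roughly 12.3 degrees of freedom.

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Equal-variance (pooled) t-statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return abs(m1 - m2) / se, df

def welch_t(m1, s1, n1, m2, s2, n2):
    """Unequal-variance (Welch) t-statistic with Welch-Satterthwaite df."""
    v1, v2 = s1**2 / n1, s2**2 / n2      # squared standard error of each mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return abs(m1 - m2) / se, df

t_pooled, df_pooled = pooled_t(12.3, 1.4, 6, 14.0, 1.7, 9)
t_welch, df_welch = welch_t(12.3, 1.4, 6, 14.0, 1.7, 9)
print(f"pooled: t({df_pooled}) = {t_pooled:.2f}")
print(f"Welch:  t({df_welch:.1f}) = {t_welch:.2f}")
```

As the text notes, the two t-statistics are similar but not identical; the Welch version also has fractional degrees of freedom.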
Confidence Interval around a Mean
You can calculate confidence intervals around a mean if
you know the sample size and standard deviation.
The standard error of the mean
is estimated from the standard deviation and the sample size. It is used to
establish the confidence interval (the range within which we would expect the
mean to fall in repeated samples taken from the population). The standard error of the mean is an estimate
of the standard deviation of those repeated samples.
The formula for the standard error of the mean provides
an accurate estimate when the sample size is very small compared to the size
of the population. In marketing research, this is usually the case
since the populations are quite large. Thus, in most situations the
population size may be left blank because the population is very large
compared to the sample. However, when the sample is more than ten
percent of the population, the population size should be specified so that
the finite population correction factor can be used to adjust the estimate of
the standard error of the mean.
Example
Suppose that an organization
has 5,000 members. Prior to their membership renewal drive, 75 members
were randomly selected and surveyed to find out their priorities for the
coming year. The mean age of the sample was 53.1 years and the
unbiased standard deviation was 4.2 years. What is the 90% confidence
interval around the mean? Note that the population size can be left
blank because the sample size of 75 is less than ten percent of the
population size.
Mean: 53.1
Unbiased standard deviation: 4.2
Sample size: 75
Population size: (left blank -or- 5000)
Desired confidence interval (%): 90
-------------------------------------------------
Standard error of the mean = .485
Degrees of freedom = 74
90% confidence interval = 53.1 ± .8
Confidence interval range = 52.3 - 53.9
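The calculation can be sketched in Python as follows. The function name `standard_error` is hypothetical, the finite population correction is written in its usual textbook form (an assumption about what the program applies), and the critical t value of 1.666 for 74 degrees of freedom is read from a t table rather than computed.

```python
import math

def standard_error(sd, n, population=None):
    """Standard error of the mean, with an optional finite population correction."""
    se = sd / math.sqrt(n)
    if population is not None:
        # Usual textbook correction factor: sqrt((N - n) / (N - 1))
        se *= math.sqrt((population - n) / (population - 1))
    return se

T_CRIT_90_DF74 = 1.666   # assumption: two-sided 90% critical t for 74 df, from a table

mean = 53.1
se = standard_error(4.2, 75)
margin = T_CRIT_90_DF74 * se
low, high = mean - margin, mean + margin
print(f"SE = {se:.3f}, interval = {low:.1f} - {high:.1f}")

# Supplying the population size shrinks the standard error only slightly here,
# because the sample is well under ten percent of the population.
se_fpc = standard_error(4.2, 75, population=5000)
```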
Compare a Sample Mean to a Population Mean
Occasionally, the mean of the population is known
(perhaps from a previous census). After drawing a sample from the
population, it might be helpful to compare the mean of your sample to the
mean of the population. If the means are not significantly different
from each other, you could make a strong argument that your sample provides
an adequate representation of the population. If, however, the mean of
your sample is significantly different from the population mean, something may
have gone wrong during the sampling process.
Example
After selecting a random
sample of 18 people from a very large population, you want to determine if
the average age of the sample is representative of the average age of the
population. From previous research, you know that the mean age of the
population is 32.0. For your sample, the mean age was 28.0 and the
unbiased standard deviation was 3.2. Is the mean age of your sample
significantly different from the mean age in the population?
Sample mean = 28
Unbiased standard deviation = 3.2
Sample size = 18
Population size = (left blank)
Mean of the population = 32
---------------------------------------
Standard error of the mean = .754
t value = 5.303
Degrees of freedom = 17
Two-tailed probability = .0001
The two-tailed probability of
the t-statistic is very small. Thus, we would conclude that the mean age of
our sample is significantly less than the mean age of the population. This
could be a serious problem because it suggests that some kind of age bias was
inadvertently introduced into the sampling process. It would be prudent
for the researcher to investigate the problem further.
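A minimal Python sketch of this comparison is shown below. The function name `one_sample_t` is hypothetical. The sign of t is kept so that a negative value shows the sample mean is below the population mean; the printed output reports the absolute value.

```python
import math

def one_sample_t(sample_mean, sd, n, population_mean):
    """Return (standard error, t, df) for a sample mean versus a known population mean."""
    se = sd / math.sqrt(n)
    t = (sample_mean - population_mean) / se
    return se, t, n - 1

se, t_stat, df = one_sample_t(28.0, 3.2, 18, 32.0)
print(f"SE = {se:.3f}, t({df}) = {t_stat:.3f}")
```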
Compare Two Standard Deviations
The F-ratio is used to compare variances. In its
simplest form, it is the variance of one group
divided by the variance of another group. When used in this way, the
larger variance (by convention) is the numerator and the smaller is the
denominator. Since the groups might have different sample sizes, the
numerator and the denominator have their own degrees of freedom.
Example
Two samples were taken from
the population. One sample had 25 subjects and a standard deviation of 4.5
on some key variable. The other sample had 12 subjects and a
standard deviation of 6.4 on the same key variable. Is there a
significant difference between the variances of the two samples?
First standard deviation: 4.5
First sample size: 25
Second standard deviation: 6.4
Second sample size: 12
-----------------------------------------
F-ratio = 2.023
Degrees of freedom = 11 and 24
Probability that the difference was due to chance = .072
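The F-ratio and its two degrees of freedom can be sketched in Python like this. The function name `f_ratio` is hypothetical. The larger variance goes in the numerator, following the convention described above, so the degrees of freedom are listed in that order as well.

```python
def f_ratio(sd_a, n_a, sd_b, n_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    var_a, var_b = sd_a**2, sd_b**2
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

f, df_num, df_den = f_ratio(4.5, 25, 6.4, 12)
print(f"F({df_num}, {df_den}) = {f:.3f}")
```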
Compare Three or more Means
Analysis of variance (ANOVA) is used when testing for
differences between three or more means.
In an ANOVA, the F-ratio is used to compare the variance
between the groups to the variance within the groups. For example,
suppose we have two groups of data. In the best of all possible worlds,
all the people in group one would have very similar scores. That is,
the group is cohesive, and there would be very little variability in scores
within the group. All the people in group two would also have similar
scores (although different than group one). Again, there is very little
variability within the group. Both groups have very little variability within
their group; however, there might be substantial
variability between the groups. The ratio of the between groups
variability (numerator) to the within groups variability (denominator) is the
F-ratio. The larger the F-ratio, the more certain we are that there is
a difference between the groups.
If the probability of the F-ratio is less than or equal
to your critical alpha level, it means that there is a significant difference
between at least two of the groups. The F-ratio does not tell which group(s) are
different from the others, only that there is a difference.
After finding a significant F-ratio, we do "post-hoc"
(after the fact) tests on the factor to examine the differences between
levels. There are a wide variety of post-hoc tests, but one of the most
common is to do a series of special t-tests between all the combinations of
levels for that factor. For the post-hoc tests, use the same critical
alpha level that you used to test for the significance of the F-ratio.
Example
A company has offices in four
cities with sales representatives in each office. At each location, the
average number of sales per salesperson was calculated. The company
wants to know if there are significant differences between the four offices
with respect to the average number of sales per sales representative.
Group    Mean     SD      N
  1      3.29    1.38     7
  2      4.90    1.45    10
  3      7.50    1.38     6
  4      6.00    1.60     8
-----------------------------------------------------------------------------------
Source     df      SS       MS       F        p
-----------------------------------------------------------------------------------
Factor      3     62.8     20.9     9.78    .0002
Error      27     57.8     2.13
Total      30    120.6
Post-hoc t-tests
Group   Group   t-value    df      p
  1       2      2.23      15    .0412
  1       3      5.17      11    .0003
  1       4      3.58      13    .0034
  2       3      3.44      14    .0040
  2       4      1.59      16    .1325
  3       4      1.09      12    .0019
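The ANOVA table can be rebuilt from the group means, standard deviations, and sizes alone. The Python sketch below uses hypothetical function names. For the post-hoc t-tests it assumes the error mean square from the ANOVA serves as the variance estimate, with the two group sizes minus two as degrees of freedom; that choice is inferred from the printed numbers, not documented. Under that assumption the sketch reproduces five of the six printed t-values. For groups 3 and 4 it gives about 1.90, so the printed 1.09 may be a transposition.

```python
import math
from itertools import combinations

# group number -> (mean, unbiased SD, n)
groups = {1: (3.29, 1.38, 7), 2: (4.90, 1.45, 10), 3: (7.50, 1.38, 6), 4: (6.00, 1.60, 8)}

def anova_from_summary(groups):
    """One-way ANOVA computed from per-group summary statistics."""
    total_n = sum(n for _, _, n in groups.values())
    grand_mean = sum(m * n for m, _, n in groups.values()) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups.values())
    ss_within = sum((n - 1) * sd**2 for _, sd, n in groups.values())
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within,
            "ms_within": ms_within, "f": ms_between / ms_within}

def posthoc_t(groups, ms_within):
    """Pairwise t-values using the ANOVA error mean square (an assumption)."""
    results = {}
    for a, b in combinations(sorted(groups), 2):
        (m_a, _, n_a), (m_b, _, n_b) = groups[a], groups[b]
        se = math.sqrt(ms_within * (1 / n_a + 1 / n_b))
        results[(a, b)] = (abs(m_a - m_b) / se, n_a + n_b - 2)
    return results

table = anova_from_summary(groups)
pairs = posthoc_t(groups, table["ms_within"])
print(f"F({table['df_between']}, {table['df_within']}) = {table['f']:.2f}")
for (a, b), (t, df) in pairs.items():
    print(f"groups {a} vs {b}: t({df}) = {t:.2f}")
```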
Correlation is a measure of association between two
variables. The variables are not designated as dependent or independent. The
two most popular correlation coefficients are: Spearman's correlation
coefficient rho and Pearson's product-moment correlation coefficient.
When calculating a correlation coefficient for ordinal
data, select Spearman's technique. For interval or ratio-type data, use
Pearson's technique.
The value of a correlation coefficient can vary from
minus one to plus one. A minus one indicates a perfect negative correlation,
while a plus one indicates a perfect positive correlation. A correlation of
zero means there is no relationship between the two variables. When there is
a negative correlation between two variables, as the value of one variable
increases, the value of the other variable decreases, and vice versa. In
other words, for a negative correlation, the variables work opposite each
other. When there is a positive correlation between two variables, as the
value of one variable increases, the value of the other variable also
increases. The variables move together.
The standard error of a correlation coefficient is used to
determine the confidence intervals around a true correlation of zero.
If your correlation coefficient falls outside of this range, then it is
significantly different than zero. The standard error can be calculated
for interval or ratio-type data (i.e., only for Pearson's product-moment
correlation).
The significance (probability) of the correlation
coefficient is determined from the t-statistic. The probability of the
t-statistic indicates whether the observed correlation coefficient occurred
by chance if the true correlation is zero. In other words, it asks if
the correlation is significantly different than zero. When the t-statistic is
calculated for Spearman's rank-difference correlation coefficient, there must
be at least 30 cases before the t-distribution can be used to determine the
probability. If there are fewer than 30 cases, you must refer to a
special table to find the probability of the correlation coefficient.
Example
A company wanted to know if
there is a significant relationship between the total number of salespeople
and the total number of sales. They collect data for five months.
Var. 1    Var. 2
  207      6907
  180      5991
  220      6810
  205      6553
  190      6190
--------------------------------
Correlation coefficient = .921
Standard error of the coefficient = .068
t-test for the significance of the coefficient = 4.100
Degrees of freedom = 3
Two-tailed probability = .0263
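The coefficient and its t-statistic can be reproduced with a short Python sketch. The function names are hypothetical. The standard error line uses (1 - r squared) divided by the square root of n; this is an assumption, chosen because it reproduces the printed .068, and the text does not state the formula.

```python
import math

salespeople = [207, 180, 220, 205, 190]
sales = [6907, 5991, 6810, 6553, 6190]

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t-statistic and df for testing whether a correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r**2), n - 2

n = len(salespeople)
r = pearson_r(salespeople, sales)
t_stat, df = r_to_t(r, n)
se_r = (1 - r**2) / math.sqrt(n)   # assumption: formula inferred from the printed .068
print(f"r = {r:.3f}, SE = {se_r:.3f}, t({df}) = {t_stat:.3f}")
```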
Another Example
Respondents to a survey were
asked to judge the quality of a product on a four-point Likert scale
(excellent, good, fair, poor). They were also asked to judge the
reputation of the company that made the product on a three-point scale (good,
fair, poor). Is there a significant
relationship between respondents' perceptions of the
company and their perceptions of the quality of the product?
Since both variables are
ordinal, Spearman's method is chosen. The first variable is the rating
for the quality of the product. Responses are coded as 4=excellent,
3=good, 2=fair, and 1=poor. The second variable is the perceived
reputation of the company and is coded 3=good, 2=fair, and 1=poor.
Var. 1    Var. 2
   4         3
   2         2
   1         2
   3         3
   4         3
   1         1
   2         1
-------------------------------------------
Correlation coefficient rho = .830
t-test for the significance of the coefficient = 3.332
Number of data pairs = 7
Probability must be determined from a table because of the small sample size.
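A Python sketch of the rank-difference calculation follows. The function names are hypothetical, and giving tied values the average of their ranks is an assumption about how the program handles ties; it does reproduce the printed rho and t-value for this data. With this many ties the rank-difference formula is only an approximation.

```python
import math

quality = [4, 2, 1, 3, 4, 1, 2]
reputation = [3, 2, 2, 3, 3, 1, 1]

def average_ranks(values):
    """Rank from smallest to largest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1        # rank of the first occurrence
        count = ordered.count(v)            # number of tied values
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def spearman_rho(x, y):
    """Spearman's rho from the sum of squared rank differences."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2
                    for rx, ry in zip(average_ranks(x), average_ranks(y)))
    return 1 - 6 * d_squared / (n * (n**2 - 1))

n = len(quality)
rho = spearman_rho(quality, reputation)
t_stat = rho * math.sqrt(n - 2) / math.sqrt(1 - rho**2)
print(f"rho = {rho:.3f}, t = {t_stat:.3f}, pairs = {n}")
```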
Regression
Simple regression is used to examine the relationship
between one dependent and one independent variable. After performing an
analysis, the regression statistics can be used to predict the dependent
variable when the independent variable is known. Regression goes beyond
correlation by adding prediction capabilities.
People use regression on an intuitive level every
day. In business, a well-dressed man is thought to be financially
successful. A mother knows that more sugar in her children's diet
results in higher energy levels. The ease of waking up in the morning
often depends on how late you went to bed the night before.
Quantitative regression adds precision by developing a mathematical formula
that can be used for predictive purposes.
For example, a medical researcher might want to use body
weight (independent variable) to predict the most appropriate dose for a new
drug (dependent variable). The purpose of running the regression is to
find a formula that fits the relationship between the two variables.
Then you can use that formula to predict values for the dependent variable when
only the independent variable is known. A doctor could prescribe the
proper dose based on a person's body weight.
The regression line (known as the least squares line)
is a plot of the expected value of the dependent variable for all values of
the independent variable. Technically, it is the line that
"minimizes the squared residuals". The regression line is the one
that best fits the data on a scatterplot.
Using the regression equation, the dependent variable
may be predicted from the independent variable. The slope of the
regression line (b) is defined as the rise divided by the run. The y
intercept (a) is the point where the regression line crosses the y axis
(the predicted value of y when x is zero). The slope and y intercept are incorporated into
the regression equation. The intercept is usually called the constant, and
the slope is referred to as the coefficient. Since the regression model
is usually not a perfect predictor, there is also an error term in the
equation.
In the regression equation, y is always the dependent
variable and x is always the independent variable. Here are three equivalent
ways to mathematically describe a linear regression model.
y = intercept + (slope x) + error
y = constant + (coefficient x) + error
y = a + bx + e
The significance of the slope of the regression line is
determined from the t-statistic. It is the probability that the
observed correlation coefficient occurred by chance if the true correlation
is zero. Some researchers prefer to report the F-ratio instead of the
t-statistic. The F-ratio is equal to the t-statistic squared.
The t-statistic for the significance of the slope is
essentially a test to determine if the regression model (equation) is
usable. If the slope is significantly different than zero, then we can
use the regression model to predict the dependent variable for any value of
the independent variable.
On the other hand, take an example where the slope is
zero. It has no prediction ability because for every value of the
independent variable, the prediction for the dependent variable would be the
same. Knowing the value of the independent variable would not improve our
ability to predict the dependent variable. Thus, if the slope is not
significantly different than zero, don't use the model to make predictions.
The coefficient of determination (r-squared) is the
square of the correlation coefficient. Its value may vary from zero to
one. It has the advantage over the correlation coefficient in that it
may be interpreted directly as the proportion of variance in the dependent
variable that can be accounted for by the regression equation. For
example, an r-squared value of .49 means that 49% of the variance in the
dependent variable can be explained by the regression equation. The
other 51% is unexplained.
The standard error of
the estimate for regression measures the amount of variability in the points around the regression line.
It is the standard deviation of the data points as they are distributed
around the regression line. The standard error of the estimate can be
used to develop confidence intervals around a prediction.
Example
A company wants to know if
there is a significant relationship between its advertising expenditures and
its sales volume. The independent variable is advertising budget and
the dependent variable is sales volume. A lag time of one month will be
used because sales are expected to lag behind actual advertising
expenditures. Data was collected for a six month period. All figures
are in thousands of dollars. Is there a significant relationship between
advertising budget and sales volume?
 IV      DV
4.2     27.1
6.1     30.4
3.9     25.0
5.7     29.7
7.3     40.1
5.9     28.8
--------------------------------------------------
Model: y = 10.079 + (3.700 x) + error
Standard error of the estimate = 2.568
t-test for the significance of the slope = 4.095
Degrees of freedom = 4
Two-tailed probability = .0149
r-squared = .807
You might make a statement in
a report like this: A simple linear regression was performed on six
months of data to determine if there was a significant relationship between
advertising expenditures and sales volume. The t-statistic for the slope was significant
at the .05 critical alpha level, t(4)=4.10,
p=.015. Thus, we reject the null hypothesis and conclude
that there was a positive significant relationship between advertising
expenditures and sales volume. Furthermore, 80.7% of the variability in sales
volume could be explained by advertising expenditures.
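The least squares calculations can be sketched in Python as follows. The function name `simple_regression` is hypothetical and the formulas are the usual textbook ones. Run on the six data pairs as printed, the sketch gives values close to but not identical to the printed output (a slope of about 3.68 and an intercept of about 9.87), so it is best read as an illustration of the formulas, not a reproduction of the program's figures.

```python
import math

advertising = [4.2, 6.1, 3.9, 5.7, 7.3, 5.9]     # independent variable (IV)
sales = [27.1, 30.4, 25.0, 29.7, 40.1, 28.8]     # dependent variable (DV)

def simple_regression(x, y):
    """Least squares fit of y = a + b*x with the usual summary statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = syy - slope * sxy                      # sum of squared residuals
    se_estimate = math.sqrt(sse / (n - 2))       # standard error of the estimate
    t_slope = slope / (se_estimate / math.sqrt(sxx))
    return {"slope": slope, "intercept": intercept, "se_estimate": se_estimate,
            "t": t_slope, "df": n - 2, "r_squared": 1 - sse / syy}

fit = simple_regression(advertising, sales)
print(f"y = {fit['intercept']:.3f} + ({fit['slope']:.3f} x) + error")
print(f"t({fit['df']}) = {fit['t']:.3f}, r-squared = {fit['r_squared']:.3f}")

# Predicted sales for a hypothetical advertising budget of 5.0 (thousand dollars)
predicted = fit["intercept"] + fit["slope"] * 5.0
```

The squared t-statistic for the slope equals r-squared times the degrees of freedom divided by (1 - r-squared), which is the F-ratio mentioned earlier.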
The formula to determine sample size depends upon
whether the intended comparisons involve means or percents.
Sample Size for Percents
This menu selection is used to determine the required
size of a sample for research questions involving percents.
Four questions must be answered to determine the sample
size:
1. Best estimate of the population size: You do not need to know the exact size of
the population. Simply make your best estimate. An inaccurate population size
will not seriously affect the formula computations. If the population
is very large, this item may be left blank.
2. Best estimate of the rate in the population
(%): Make your best estimate of what the actual percent of the survey
characteristic is. This is based on the null hypothesis. For example,
if the null hypothesis is "blondes don't have more fun", then what
is your best estimate of the percent of blondes that do have more fun?
If you simply do not know, then enter 50 (for fifty percent).
3. Maximum acceptable difference (%): This is the
maximum percent difference that you are willing to accept between the true
population rate and the sample rate. Typically, in social science research,
you would be willing to accept a difference of 5 percent. That is, if your
survey finds that 25 percent of the sample has a certain characteristic, the
actual rate in the population may be between 20 and 30 percent.
4. Desired confidence level (%): How confident
must you be that the true population rate falls within the acceptable
difference (specified in the previous question)? This is the same as
the confidence that you want to have in your findings. If you want 95 percent
confidence (typical for social science research), you should enter 95.
This means that if you took a hundred samples from the population, about five of
those samples would be expected to have a rate that differs from the true
population rate by more than the difference you specified in the previous question.
Example
A publishing company wants to know
what percent of the population might be interested in a new magazine on
making the most of your retirement. Secondary data (that is several
years old) indicates that 22% of the population is retired. They are
willing to accept an error rate of 5% and they want to be 95% certain that
their finding does not differ from the true rate by more than 5%. What
is the required sample size?
Best estimate of the population size: (left blank)
Best estimate of the rate in the population (%): 22
Maximum acceptable difference (%): 5
Desired confidence level (%): 95
-------------------------------------------------------------
Required sample size = 263
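A common textbook formula for this calculation can be sketched in Python. The function name `sample_size_for_percent` is hypothetical, and both the formula and the optional finite population adjustment are assumptions about what the program does. The raw result for this example is about 263.7. The printed 263 corresponds to dropping the fraction, although many texts round up instead.

```python
from statistics import NormalDist

def sample_size_for_percent(rate_pct, max_diff_pct, confidence_pct, population=None):
    """Unrounded sample size needed to estimate a percent within max_diff_pct."""
    p = rate_pct / 100
    d = max_diff_pct / 100
    z = NormalDist().inv_cdf(0.5 + confidence_pct / 200)   # two-sided critical z
    n = z**2 * p * (1 - p) / d**2
    if population is not None:
        # Usual finite population adjustment (assumption)
        n = n / (1 + (n - 1) / population)
    return n

n = sample_size_for_percent(22, 5, 95)
print(f"raw = {n:.2f}, truncated = {int(n)}")

# Entering 50 percent when the rate is unknown gives the largest sample size.
n_worst_case = sample_size_for_percent(50, 5, 95)
```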
Sample Size for Means
This menu selection is used to determine the required
size of a sample for research questions involving means.
Three questions must be answered to determine the sample
size:
1. Standard deviation of the population: It is
rare that a researcher knows the exact standard deviation of the population.
Typically, the standard deviation of the population is estimated a) from the
results of a previous survey, b) from a pilot study, c) from secondary data,
or d) from the judgment of the researcher.
2. Maximum acceptable difference: This is the
maximum amount of error that you are willing to accept. That is, it is the
maximum difference that the sample mean can deviate from the true population
mean before you call the difference significant.
3. Desired confidence level (%): The confidence
level is your level of certainty that the sample mean does not differ from
the true population mean by more than the maximum acceptable difference.
Typically, social science research uses a 95% confidence level.
Example
A fast food company wants to determine
the average number of times that fast food users visit fast food restaurants
per week. They have decided that their estimate needs to be accurate
within plus or minus one-tenth of a visit, and they want to be 95% sure that
their estimate does not differ from the true number of visits by more than one-tenth
of a visit. Previous research has shown that the standard deviation is
.7 visits. What is the required sample size?
Population standard deviation: .7
Maximum acceptable difference: .1
Desired confidence level (%): 95
--------------------------------------------
Required sample size = 188
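The corresponding calculation for means can be sketched in Python using the usual textbook formula, which is assumed here and not taken from the program. The function name `sample_size_for_mean` is hypothetical. The raw result for this example is about 188.2, which matches the printed 188 once the fraction is dropped.

```python
from statistics import NormalDist

def sample_size_for_mean(sd, max_diff, confidence_pct):
    """Unrounded sample size needed to estimate a mean within max_diff."""
    z = NormalDist().inv_cdf(0.5 + confidence_pct / 200)   # two-sided critical z
    return (z * sd / max_diff) ** 2

n = sample_size_for_mean(0.7, 0.1, 95)
print(f"raw = {n:.2f}, truncated = {int(n)}")
```

Halving the maximum acceptable difference roughly quadruples the required sample size, because the difference enters the formula squared.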