The Statistics Calculator

Means Menu

Researchers usually use the results from a sample to make inferential statements about the population. When the data is interval or ratio scaled, it is usually described in terms of central tendency and variability. Means and standard deviations are reported in most research.

The Means menu has seven selections:

Mean and standard deviation of a sample

This menu selection will let you enter data for a variable and calculate the mean, unbiased standard deviation, standard error of the mean, and median. Data is entered using a standard spreadsheet interface. Finite population correction is incorporated into the calculation of the standard error of the mean, so the population size should be specified whenever the sample size is greater than ten percent of the population size.


A sample of ten was randomly chosen from a large population. The ten scores were:

20 22 54 32 41 43 47 51 45 35


Mean = 39.0
Unbiased standard deviation = 11.6
Standard error of the mean = 3.7
Median = 42.0
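
The figures above can be checked with a short Python sketch. This is not StatPac code: the function name `describe` and the form of the finite population correction, sqrt((N - n) / (N - 1)), are assumptions made for illustration.

```python
import math
import statistics

def describe(scores, population_size=None):
    """Return mean, unbiased SD, standard error of the mean, and median."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)      # unbiased: divides by n - 1
    se = sd / math.sqrt(n)
    if population_size is not None:
        # Assumed form of the finite population correction factor.
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return mean, sd, se, statistics.median(scores)

scores = [20, 22, 54, 32, 41, 43, 47, 51, 45, 35]
mean, sd, se, median = describe(scores)
print(f"Mean = {mean:.1f}")
print(f"Unbiased standard deviation = {sd:.1f}")
print(f"Standard error of the mean = {se:.1f}")
print(f"Median = {median:.1f}")
```

The population size is left out here because the sample was drawn from a large population, so no correction is applied.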

Matched pairs t-test between means

The matched pairs t-test is used in situations where two measurements are taken for each respondent. It is often used in experiments where there are before-treatment and after-treatment measurements. The t-test is used to determine if there is a reliable difference between the mean of the before-treatment measurements and the mean of the after-treatment measurements.

Pretreatment         Post-treatment

Johnny -------------------- Johnny

Martha -------------------- Martha

Jenny ---------------------- Jenny

Sometimes, in very sophisticated (i.e., expensive) experiments, two groups of subjects are individually matched on one or more demographic characteristics. One group is exposed to a treatment (experimental group) and the other is not (control group).

Experimental                 Control

Johnny -------------------------- Fred

Martha -------------------------- Sharon

Jenny ---------------------------- Linda

The t-test works with small or large N's because it automatically takes into account the number of cases in calculating the probability level. The magnitude of the t-statistic depends on the number of cases (subjects). The t-statistic, in conjunction with the degrees of freedom, is used to calculate the probability that the difference between the means happened by chance. If the probability is less than the critical alpha level, then we say that a significant difference exists between the two means.


An example of a data set for a matched-pairs t-test might look like this:

Pretest Post-test
8 31
13 37
22 45
25 28
29 50
31 37
35 49
38 25
42 36
52 69


Var. 1: Mean = 29.5 Unbiased SD = 13.2
Var. 2: Mean = 40.7 Unbiased SD = 13.0
t-statistic = 2.69
Degrees of freedom = 9
Two-tailed probability = .025

You might make a statement in a report like this: The mean pretest score was 29.5 and the mean post-test score was 40.7. A matched-pairs t-test was performed to determine if the difference was significant. The t-statistic was significant at the .05 critical alpha level, t(9)=2.69, p=.025. Therefore, we reject the null hypothesis and conclude that post-test scores were significantly higher than pretest scores.
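
As a rough illustration of the calculation, the sketch below works on the differences between each pair of scores. It is not the program's own code. The helper `t_two_tailed_p` is a hypothetical name; it integrates the t density numerically with Simpson's rule so that only the Python standard library is needed.

```python
import math
import statistics

def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the Student t density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    pdf = lambda x: math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))
    upper = abs(t)
    h = upper / steps                  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # P(0 < T < |t|)
    return 2 * (0.5 - area)

pretest = [8, 13, 22, 25, 29, 31, 35, 38, 42, 52]
posttest = [31, 37, 45, 28, 50, 37, 49, 25, 36, 69]

# The test is a one-sample t-test on the paired differences.
diffs = [post - pre for pre, post in zip(pretest, posttest)]
n = len(diffs)
t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
df = n - 1
p = t_two_tailed_p(t_stat, df)
print(f"t({df}) = {t_stat:.2f}, p = {p:.3f}")
```

Because each respondent supplies both scores, the degrees of freedom are the number of pairs minus one, not the total number of scores minus two.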

Independent groups t-test between means

This menu selection is used to determine if there is a difference between two means taken from different samples. If you know the mean, standard deviation and size of both samples, this program may be used to determine if there is a reliable difference between the means.

One measurement is taken for each respondent. Two groups are formed by splitting the data based on some other variable. The groups may contain a different number of cases. There is not a one-to-one correspondence between the groups.

Original Data Set      After Splitting Data into 2 Groups
Score   Sex            Males   Females
25      M              25      27
27      F              19      17
17      F                      21
19      M
21      F

Sometimes the two groups are formed because the data was collected from two different sources.

School A Scores School B Scores
525 427
492 535
582 600

There are actually two different formulas to calculate the t-statistic for independent groups. The t-statistics calculated by both formulas will be similar but not identical. Which formula you choose depends on whether the variances of the two groups are equal or unequal. In actual practice, most researchers assume that the variances are unequal because it is the most conservative approach and is least likely to produce a Type I error. Thus, the formula used in Statistics Calculator assumes unequal variances.


Two new product formulas were developed and tested. A twenty-point scale was used to measure the level of product approval. Six subjects tested the first formula. They gave it a mean rating of 12.3 with a standard deviation of 1.4. Nine subjects tested the second formula, and they gave it a mean rating of 14.0 with a standard deviation of 1.7. The question we might ask is whether the observed difference between the two formulas is reliable.

Mean of the first group: 12.3
Unbiased standard deviation of the first group: 1.4
Sample size of the first group: 6

Mean of the second group: 14.0
Unbiased standard deviation of the second group: 1.7
Sample size of the second group: 9


t value = 2.03
Degrees of freedom = 13
Two-tailed probability = .064

You might make a statement in a report like this: An independent groups t-test was performed to compare the mean ratings between the two formulas. The t-statistic was not significant at the .05 critical alpha level, t(13)=2.03, p=.064. Therefore, we fail to reject the null hypothesis and conclude that there was no significant difference between the ratings for the two formulas.
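
The two formulas mentioned earlier can be compared with a small sketch that works from the summary statistics alone. The function names `pooled_t` and `welch_t` are hypothetical, and nothing here is taken from the program itself. For this example the pooled (equal variances) formula gives a t of about 2.03 with 13 degrees of freedom, which matches the output shown above, while the unequal variances (Welch) formula gives a t of about 2.11 with roughly 12.3 degrees of freedom.

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Equal-variance (pooled) t-statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def welch_t(m1, s1, n1, m2, s2, n2):
    """Unequal-variance (Welch) t-statistic and Welch-Satterthwaite df."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (m1 - m2) / math.sqrt(v1 + v2), df

# Second formula (mean 14.0, SD 1.7, n 9) versus first (mean 12.3, SD 1.4, n 6).
t_pooled, df_pooled = pooled_t(14.0, 1.7, 9, 12.3, 1.4, 6)
t_welch, df_welch = welch_t(14.0, 1.7, 9, 12.3, 1.4, 6)
print(f"Pooled: t = {t_pooled:.2f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.2f}, df = {df_welch:.1f}")
```

The two t-statistics are similar but not identical, and the Welch degrees of freedom are generally not a whole number.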

Confidence interval around a mean

You can calculate confidence intervals around a mean if you know the sample size and standard deviation.

The standard error of the mean is estimated from the standard deviation and the sample size. It is used to establish the confidence interval (the range within which we would expect the mean to fall in repeated samples taken from the population). The standard error of the mean is an estimate of the standard deviation of those repeated samples.

The formula for the standard error of the mean provides an accurate estimate when the sample size is very small compared to the size of the population. In marketing research, this is usually the case since the populations are quite large. Thus, in most situations the population size may be left blank because the population is very large compared to the sample. However, when the sample is more than ten percent of the population, the population size should be specified so that the finite population correction factor can be used to adjust the estimate of the standard error of the mean.


Suppose that an organization has 5,000 members. Prior to their membership renewal drive, 75 members were randomly selected and surveyed to find out their priorities for the coming year. The mean age of the sample was 53.1 years and the unbiased standard deviation was 4.2 years. What is the 90% confidence interval around the mean? Note that the population size can be left blank because the sample size of 75 is less than ten percent of the population size.

Mean: 53.1
Unbiased standard deviation: 4.2
Sample size: 75
Population size: (left blank -or- 5000)
Desired confidence interval (%): 90


Standard error of the mean = .485
Degrees of freedom = 74
90% confidence interval = 53.1 ± .8
Confidence interval range = 52.3 - 53.9
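
A minimal sketch of this calculation follows. It assumes the interval is the mean plus or minus a t critical value (n - 1 degrees of freedom) times the standard error, and that the finite population correction has the form sqrt((N - n) / (N - 1)). The function names are hypothetical, and the critical value is found numerically so that only the Python standard library is needed.

```python
import math

def t_central_half(t, df, steps=2000):
    """P(0 < T < t) for Student's t, by Simpson's rule."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    pdf = lambda x: math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value, found by bisection on P(|T| < t)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 2 * t_central_half(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mean_confidence_interval(mean, sd, n, confidence, population_size=None):
    se = sd / math.sqrt(n)
    if population_size is not None:
        # Assumed form of the finite population correction factor.
        se *= math.sqrt((population_size - n) / (population_size - 1))
    margin = t_critical(confidence, n - 1) * se
    return se, mean - margin, mean + margin

se, low, high = mean_confidence_interval(53.1, 4.2, 75, 0.90)
print(f"Standard error of the mean = {se:.3f}")
print(f"90% confidence interval range = {low:.1f} - {high:.1f}")
```

Passing `population_size=5000` shrinks the standard error only slightly, which is why the population size can be left blank in this example.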

Compare a sample mean to a population mean

Occasionally, the mean of the population is known (perhaps from a previous census). After drawing a sample from the population, it might be helpful to compare the mean of your sample to the mean of the population. If the means are not significantly different from each other, you could make a strong argument that your sample provides an adequate representation of the population. If, however, the mean of your sample is significantly different from the mean of the population, something may have gone wrong during the sampling process.


After selecting a random sample of 18 people from a very large population, you want to determine if the average age of the sample is representative of the average age of the population. From previous research, you know that the mean age of the population is 32.0. For your sample, the mean age was 28.0 and the unbiased standard deviation was 3.2. Is the mean age of your sample significantly different from the mean age in the population?

Sample mean = 28
Unbiased standard deviation = 3.2
Sample size = 18
Population size = (left blank)
Mean of the population = 32


Standard error of the mean = .754
t value = 5.303
Degrees of freedom = 17
Two-tailed probability = .0001

The two-tailed probability of the t-statistic is very small. Thus, we would conclude that the mean age of our sample is significantly less than the mean age of the population. This could be a serious problem because it suggests that some kind of age bias was inadvertently introduced into the sampling process. It would be prudent for the researcher to investigate the problem further.
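
The standard error and t value in this example can be reproduced with the sketch below. The function name `one_sample_t` is hypothetical, and the finite population correction is again assumed to be sqrt((N - n) / (N - 1)).

```python
import math

def one_sample_t(sample_mean, sd, n, population_mean, population_size=None):
    """Return the standard error, t-statistic, and degrees of freedom."""
    se = sd / math.sqrt(n)
    if population_size is not None:
        # Assumed form of the finite population correction factor.
        se *= math.sqrt((population_size - n) / (population_size - 1))
    t = (sample_mean - population_mean) / se
    return se, t, n - 1

se, t_stat, df = one_sample_t(28.0, 3.2, 18, 32.0)
print(f"Standard error of the mean = {se:.3f}")
print(f"t value = {abs(t_stat):.3f}")
print(f"Degrees of freedom = {df}")
```

The computed t is negative because the sample mean is below the population mean; the output above reports its absolute value.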

Compare two standard deviations

The F-ratio is used to compare variances. In its simplest form, it is the variance of one group divided by the variance of another group. When used in this way, the larger variance (by convention) is the numerator and the smaller is the denominator. Since the groups might have different sample sizes, the numerator and the denominator have their own degrees of freedom.


Two samples were taken from the population. One sample had 25 subjects and a standard deviation of 4.5 on some key variable. The other sample had 12 subjects and a standard deviation of 6.4 on the same key variable. Is there a significant difference between the variances of the two samples?

First standard deviation: 4.5
First sample size: 25
Second standard deviation: 6.4
Second sample size: 12


F-ratio = 2.023
Degrees of freedom = 11 and 24
Probability that the difference was due to chance = .072
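
The F-ratio and its degrees of freedom follow directly from the description above, as this small sketch shows. The function name `f_ratio` is hypothetical; the probability is not computed here.

```python
def f_ratio(sd1, n1, sd2, n2):
    """Return F with the larger variance in the numerator, plus both df."""
    var1, var2 = sd1**2, sd2**2
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1

f, df_num, df_den = f_ratio(4.5, 25, 6.4, 12)
print(f"F-ratio = {f:.3f}")
print(f"Degrees of freedom = {df_num} and {df_den}")
```

Because the second sample has the larger variance, its degrees of freedom (12 - 1 = 11) are listed first.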

Compare three or more means

Analysis of variance (ANOVA) is used when testing for differences between three or more means.

In an ANOVA, the F-ratio is used to compare the variance between the groups to the variance within the groups. For example, suppose we have two groups of data. In the best of all possible worlds, all the people in group one would have very similar scores. That is, the group is cohesive, and there would be very little variability in scores within the group. All the people in group two would also have similar scores (although different from group one). Again, there is very little variability within the group. Both groups have very little variability within their group; however, there might be substantial variability between the groups. The ratio of the between groups variability (numerator) to the within groups variability (denominator) is the F-ratio. The larger the F-ratio, the more certain we are that there is a difference between the groups.

If the probability of the F-ratio is less than or equal to your critical alpha level, it means that there is a significant difference between at least two of the groups. The F-ratio does not tell which group(s) are different from the others, just that there is a difference.

After finding a significant F-ratio, we do "post-hoc" (after the fact) tests on the factor to examine the differences between levels. There are a wide variety of post-hoc tests, but one of the most common is to do a series of special t-tests between all the combinations of levels for that factor. For the post-hoc "lsd" (least significant difference) t-tests, use the same critical alpha level that you used to test for the significance of the F-ratio.


A company has offices in four cities with sales representatives in each office. At each location, the average number of sales per salesperson was calculated. The company wants to know if there are significant differences between the four offices with respect to the average number of sales per sales representative.

Group Mean SD N
1 3.29 1.38 7
2 4.90 1.45 10
3 7.50 1.38 6
4 6.00 1.60 8


Source   df   SS      MS     F      p
Factor   3    62.8    20.9   9.78   .0002
Error    27   57.8    2.13
Total    30   120.6
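
The sums of squares and the F-ratio in this table can be rebuilt from the group means, standard deviations, and sizes. The sketch below is an illustration only; the function name `anova_from_summary` is hypothetical, and the probability of F is not computed.

```python
def anova_from_summary(groups):
    """One-way ANOVA from (mean, SD, n) triples, one per group."""
    total_n = sum(n for _, _, n in groups)
    grand_mean = sum(m * n for m, _, n in groups) / total_n
    # Between groups: spread of the group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups)
    # Within groups: each group's variance times its degrees of freedom.
    ss_within = sum((n - 1) * sd**2 for _, sd, n in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df": (df_between, df_within),
        "ss": (ss_between, ss_within),
        "ms": (ms_between, ms_within),
        "f": ms_between / ms_within,
    }

offices = [(3.29, 1.38, 7), (4.90, 1.45, 10), (7.50, 1.38, 6), (6.00, 1.60, 8)]
result = anova_from_summary(offices)
print(f"Factor: df = {result['df'][0]}, SS = {result['ss'][0]:.1f}")
print(f"Error:  df = {result['df'][1]}, SS = {result['ss'][1]:.1f}")
print(f"F = {result['f']:.2f}")
```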

Post-hoc t-tests

Groups Compared   t-value   df   p
1 & 2             2.23      15   .0412
1 & 3             5.17      11   .0003
1 & 4             3.58      13   .0034
2 & 3             3.44      14   .0040
2 & 4             1.59      16   .1325
3 & 4             1.90      12   .0819
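
One way to compute post-hoc t-values of this kind is sketched below. It rests on two assumptions, not on the program's documented method: the standard error of each comparison uses the error mean square from the ANOVA, and the degrees of freedom are n1 + n2 - 2, as the df column suggests. The function name `lsd_t` is hypothetical.

```python
import math
from itertools import combinations

# (mean, SD, n) for each office, keyed by group number.
groups = {
    1: (3.29, 1.38, 7),
    2: (4.90, 1.45, 10),
    3: (7.50, 1.38, 6),
    4: (6.00, 1.60, 8),
}

# Error mean square from the one-way ANOVA.
df_error = sum(n for _, _, n in groups.values()) - len(groups)
ms_error = sum((n - 1) * sd**2 for _, sd, n in groups.values()) / df_error

def lsd_t(a, b):
    """t-value and df for one pairwise comparison (assumed method)."""
    (m1, _, n1), (m2, _, n2) = groups[a], groups[b]
    se = math.sqrt(ms_error * (1 / n1 + 1 / n2))
    return abs(m1 - m2) / se, n1 + n2 - 2

results = {pair: lsd_t(*pair) for pair in combinations(groups, 2)}
for (a, b), (t, df) in results.items():
    print(f"{a} & {b}: t = {t:.2f}, df = {df}")
```

Under these assumptions the t-values come out at approximately 2.23, 5.17, 3.58, 3.44, 1.59, and 1.90 for the six comparisons.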

Copyright 2014 StatPac Inc., All Rights Reserved