STATISTICAL INFERENCE

 

1.0 HYPOTHESIS TESTING

A hypothesis is a statement, inference or tentative explanation about a population that can be tested by further investigation. A hypothesis test is a statistical test that assists in the decision to reject or not reject the statement. In most cases, it is easier to disprove a hypothesis than to verify it. Sample measurements are taken and inferences are made about the total population based on the sample statistics. The inferences and the resulting decisions could be correct or incorrect. In hypothesis testing, as in all other statistical analyses, the risk of making a wrong decision is set in advance.

1.1 Null and Alternate Hypotheses

The symbol μ represents the population or universe mean and x̄ represents the sample mean. The symbol σ represents the population standard deviation and s represents the standard deviation calculated from the sample. The population values μ and σ are parameters and the sample values x̄ and s are statistics.

The following hypothesis is to be tested: The number 5 wheel bearings made by the Canape Bearing Company have an average diameter of 0.65 inches.

Based on the sample statistics, is this statement true or false? The objective of hypothesis testing is to verify the validity of the statement.

The null hypothesis is the hypothesis to be proven or disproven. It is represented by H0.

The alternate hypothesis is the hypothesis that is accepted if the null hypothesis is rejected by the test. It is represented by H1.

For the test, the hypotheses are stated as follows:

Null Hypothesis, H0: μ = 0.65
Alternate Hypothesis, H1: μ ≠ 0.65

 

1.2 Types of Hypothesis Tests

Hypotheses concerning either measurements (variables or continuous data) or counts (attribute or discrete data) may be tested. The hypothesis H0: μ = 0.65 is an example using continuous data.

A hypothesis concerning discrete data may be stated as follows:

The population from which a sample of one hundred parts was taken has a process average of 2% defective.

H0 : p = .02 or H0 : np = 100 x .02 = 2

The symbol p represents the population mean (the process average fraction defective) and p̂ represents the sample mean (the sample fraction defective). The symbol np represents the average number of defective parts in the sample for both the hypergeometric and the binomial distributions. For the Poisson distribution, np is the average number of defects in the sample.

In hypothesis testing, a general rule is to use the normal curve areas (Z statistics) when the sample size n is thirty or more. When n is less than thirty, the t distribution areas (t statistics) and corresponding tables are used. When n is very large, the Z table and t table are virtually identical. The method of testing a hypothesis is exactly the same; the only difference is the table that is used.

In the previous chapter n - 1 was used instead of n in the denominator of the variance formula. Division by n - 1 is called division by degrees of freedom. The term degrees of freedom is so named because only n - 1 linear comparisons can be made among n choices. If there are three job candidates and three jobs to fill, there are three choices for the first job. Once it is filled, there are two choices for the second job. After the second job is filled, there is only one choice for the third job. The third job can only be filled by the remaining person, so there are n - 1 or two degrees of freedom. To use the t table, the number of degrees of freedom is required.

 

2.0 DECISION ERRORS

Anytime sample data are used to make decisions or inferences about the entire population, decision errors can be made. In hypothesis testing and acceptance sampling, two types of errors can occur.

Type I error (α): rejection of a true null hypothesis

Type II error (β): acceptance of a false null hypothesis

 

 

                     Null hypothesis (H0) is true    Null hypothesis (H0) is false
Do not reject H0     No error (1 - α)                Type II error (β)
Reject H0            Type I error (α)                No error (1 - β)

The level of significance is denoted by α. This is the probability of rejecting a null hypothesis when it is true, that is, the probability of making a wrong decision when H0 is true. The confidence level is denoted by 1 - α. It is the probability of not rejecting a null hypothesis when it is true, or the probability of making a right decision. The probability of not rejecting a null hypothesis when it is false is β. This is also the probability of making a wrong decision. The term 1 - β is referred to as the power of the test and is the probability of rejecting a null hypothesis when it is false. This is the probability of making a right decision when the null hypothesis is truly false. In acceptance sampling, α and β are referred to as the producer's risk and consumer's risk respectively.
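As a rough numerical illustration of how these probabilities relate, the sketch below evaluates α, β and the power of a two tailed Z test using Python's standard library. The null mean, standard error and "true" mean passed in at the bottom are hypothetical values chosen only for illustration, and the function name is an assumption, not part of the text.

```python
from statistics import NormalDist


def two_tailed_error_rates(mu0, true_mu, std_error, alpha=0.05):
    """Return (alpha, beta, power) for a two tailed Z test of H0: mu = mu0.

    true_mu is a hypothetical true population mean used to evaluate beta.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # upper critical value
    # Distance between the true mean and the null mean, in standard errors.
    shift = (true_mu - mu0) / std_error
    # Power: probability that the test statistic lands in either rejection tail.
    power = (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)
    beta = 1 - power  # probability of not rejecting a false H0
    return alpha, beta, power


# Hypothetical inputs, for illustration only.
alpha, beta, power = two_tailed_error_rates(mu0=0.65, true_mu=0.67, std_error=0.008)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

When the true mean equals the null mean, the computed "power" reduces to α, which matches the table: rejecting a true H0 is exactly the Type I error.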

 

3.0 HOW A HYPOTHESIS IS TESTED

3.1 One or two tailed test?

The alternate hypothesis determines whether the test will be a one tailed test or a two tailed test. If H1 is stated as H1: μ ≠ .65, then the test is a two tailed test. If H1 is stated as H1: μ < .65 or H1: μ > .65, the test is a one tailed test.

Sketch for a two tailed test (α = .05)

Alternate Hypothesis, H1: μ ≠ .65

 

 

Sketches for a one tailed test (α = .05)

 

3.2 Examples

Example 1

Example 1 is a step by step procedure on how to perform a hypothesis test. A sample of 100 bearings is taken. The diameter is measured and recorded for each part in the sample. The mean and standard deviation are calculated. The following results are obtained:

x̄ = 0.67 inches, s = 0.08 inches, n = 100

The distribution of individual data values has a mean of 0.67" and a standard deviation of 0.08". Since a sample mean is to be tested against a population mean, the test distribution is a distribution of averages. The distribution of averages has a mean of 0.67" and a standard deviation or standard error of s/√n = 0.08/√100 = 0.008".

Could the population from which this sample was selected have an average diameter of 0.65 inches? Test this hypothesis against a level of significance of .05 (α = .05).

Step 1) State the null and alternate hypotheses.

H0: μ = 0.65

H1: μ ≠ 0.65

Step 2) Draw a sketch showing μ = .65 and x̄ = .67 on the scale.

 

Step 3) Determine the critical values of Z and the region of rejection.

Since H1: μ ≠ .65, this will be a two tailed test. Half of α will define the rejection region on the left tail of the curve and half of α will define the rejection region on the right tail of the curve. Half of α is .05/2 = .025. The sample size is greater than thirty, therefore the correct table to use is the standard normal curve or Z table.

Either draw another sketch or superimpose a Z scale on the first sketch. Look up .50 - .025 = .475 in the table. The corresponding Z values are ± 1.96. These are the critical values. The critical values are the Z values that separate the rejection region from the acceptance region. On the sketch, shade in the areas to the left of -1.96 and to the right of +1.96. The shaded areas indicate the regions of rejection. The area of the acceptance region is .95, therefore this test will have a confidence level of .95.

Step 4) Calculate the value of Z that corresponds with x̄ using the x̄ to Z conversion formula.

$$ Z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{0.67 - 0.65}{0.08/\sqrt{100}} = \frac{0.02}{0.008} = 2.5 $$

If the calculated value of Z is between the critical values, it falls in the acceptance region and the hypothesis is not rejected. If the calculated value of Z is to the left of -1.96 or to the right of +1.96, the hypothesis that μ = .65 is rejected in favor of the alternate hypothesis that μ ≠ .65.

 

Step 5) Make a decision to reject or not reject the null hypothesis.

The calculated Z value of +2.5 lies in the rejection region, therefore H0 is rejected in favor of H1. It is concluded that μ ≠ .65.
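The five steps of Example 1 can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `z_test_mean` is hypothetical, and the critical value is computed from the standard normal distribution rather than read from a printed Z table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu0, s, n, alpha=0.05):
    """Two tailed Z test of H0: mu = mu0 using the sample mean x_bar."""
    std_error = s / sqrt(n)  # standard deviation of the distribution of averages
    z_calc = (x_bar - mu0) / std_error  # x-bar to Z conversion
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # half of alpha in each tail
    reject = abs(z_calc) > z_crit  # True when Z falls in a rejection region
    return z_calc, z_crit, reject


# Values from Example 1: x-bar = 0.67, mu = 0.65, s = 0.08, n = 100.
z_calc, z_crit, reject = z_test_mean(0.67, 0.65, 0.08, 100)
print(f"Z = {z_calc:.2f}, critical values = +/-{z_crit:.2f}, reject H0: {reject}")
```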

 

Example 2

Given x̄ = 30.5, s = 1.4, and n = 20. Test the hypothesis μ = 30 against the alternative μ > 30 at a level of significance of .05. This will be a one tailed test because the alternate hypothesis is μ > 30. Since n = 20, the t table with n - 1 = 19 degrees of freedom is used to determine the critical value and rejection region.

The calculated value of t is

$$ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{30.5 - 30}{1.4/\sqrt{20}} = \frac{0.5}{0.31} = 1.61 $$

The critical value is t = +1.729. The area to the right of the critical value is the region of rejection. The calculated value t = 1.61 falls in the acceptance region, therefore H0 is not rejected. It can be concluded that the sample could have reasonably come from a population whose mean is thirty (μ = 30).
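The calculation in Example 2 can be checked with a short sketch. The Python standard library has no t distribution, so the critical value is entered by hand as a constant taken from a t table (1.729 for 19 degrees of freedom, one tail, α = .05); the function and constant names are hypothetical. Without rounding the standard error to 0.31, the statistic comes out at about 1.60 rather than 1.61, which does not change the decision.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """Convert a sample mean to the t scale for a test of H0: mu = mu0."""
    return (x_bar - mu0) / (s / sqrt(n))


# Table value entered by hand: one tailed, alpha = .05, 19 degrees of freedom.
T_CRIT_19_DF = 1.729

# Values from Example 2: x-bar = 30.5, mu = 30, s = 1.4, n = 20.
t_calc = t_statistic(30.5, 30, 1.4, 20)
reject = t_calc > T_CRIT_19_DF  # rejection region is the right tail only
print(f"t = {t_calc:.3f}, critical value = {T_CRIT_19_DF}, reject H0: {reject}")
```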

    Example 3

    A sample of fifty parts is taken from a large shipment of a purchased product. The parts are inspected and categorized as either defective or non-defective. The inspector finds two defective parts (p̂ = 2/50 = .04 or 4% defective). The supplier states that his process yields 3% defective product. Could the supplier be telling the truth? Test at a level of significance of .01. This problem can be worked out by stating the null hypothesis in terms of p or in terms of np.

    The standard deviation is modified depending on which null hypothesis is used. The Z value and the resulting decision will be the same. The formulas below are from chapter 4, page 60.

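    The standard deviation in terms of np for the binomial distribution

    $$ \sigma_{np} = \sqrt{np(1 - p)} = \sqrt{50(.03)(.97)} = 1.206 $$

    The standard deviation in terms of p for the binomial distribution

    $$ \sigma_{p} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(.03)(.97)}{50}} = .0241 $$

    The hypothesis test in terms of p is H0: p = .03, H1: p > .03, where p̂ = 2/50 = .04 is the sample fraction defective.

    $$ Z = \frac{\hat{p} - p}{\sigma_{p}} = \frac{.04 - .03}{.0241} = .41 $$

    The hypothesis test in terms of np is H0: np = 1.5, H1: np > 1.5.

    $$ Z = \frac{n\hat{p} - np}{\sigma_{np}} = \frac{2 - 1.5}{1.206} = .41 $$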

    The critical value of Z is +2.33. The calculated value of Z is +.41. This value lies in the acceptance region, therefore H0 is not rejected. The test supports the supplier’s statement that the process yields 3% defective product.
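The claim that both forms of the null hypothesis give the same Z value can be verified with a short sketch. The function name `proportion_z` is hypothetical, and the critical value is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(defectives, n, p0):
    """Return the Z value computed in terms of p and in terms of np."""
    p_hat = defectives / n  # sample fraction defective
    z_p = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # test in terms of p
    z_np = (defectives - n * p0) / sqrt(n * p0 * (1 - p0))  # test in terms of np
    return z_p, z_np


# Values from Example 3: 2 defectives in 50 parts, claimed p = .03, alpha = .01.
z_p, z_np = proportion_z(2, 50, 0.03)
z_crit = NormalDist().inv_cdf(1 - 0.01)  # one tailed critical value
reject = z_p > z_crit
print(f"Z (p) = {z_p:.2f}, Z (np) = {z_np:.2f}, critical = {z_crit:.2f}, reject H0: {reject}")
```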

     

    4.0 TEST FOR GOODNESS OF FIT

    To test a sample distribution to determine if it fits the pattern of a specific distribution such as the normal, a Goodness of Fit test is performed. Goodness of Fit refers to the comparison of an observed sample distribution with a known theoretical distribution. In the following example the chi square distribution is used to compare a sample distribution with the normal distribution. This is a test to determine if the universe from where the sample data were obtained could be normally distributed.

    Chi Square:

    $$ \chi^2 = \sum_{i=1}^{k} \frac{(f_i - F_i)^2}{F_i} $$

    fi = observed frequency

    Fi = theoretical frequency

    k = number of intervals

    Based on the sample data, a hypothesis that the universe may be normally distributed against an alternative that it is not normally distributed is to be tested.

    H0: universe may be normally distributed

    H1: universe is not normally distributed

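    For a chi square goodness of fit test against the normal distribution, where the mean and standard deviation are estimated from the sample, the degrees of freedom are k - 3, where k is the number of intervals. One degree of freedom is lost because the frequencies must total n, and one more is lost for each estimated parameter. For tests using row and column matrices called contingency tables, the degrees of freedom are (# rows - 1) x (# columns - 1) or (r - 1)(c - 1).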

    Example 4

    The measurements from a sample of one hundred number 10 bearings from the Canape Bearing Company are grouped into seven intervals or cells. The data table is shown below. The number of intervals is arbitrary. More cells will give a higher degree of accuracy. Six to twelve cells will yield satisfactory results. The number of occurrences in each cell is compared to the number of occurrences that should be in the cell based on the known theoretical distribution. In this example, the data are compared to the normal distribution. The chi square test sums the differences in each cell and the total value of chi square is used to test the hypothesis that the data may be normally distributed. The mean and standard deviation are calculated using the midpoint (mi) of the interval and the observed frequencies (fi).

    Could the universe from where these data were obtained be normally distributed? Test at a level of significance of .05. The null and alternate hypotheses are

    H0: universe may be normally distributed

    H1: universe is not normally distributed

    Interval        Observed Frequency (fi)    Theoretical Frequency (Fi)    (fi - Fi)²/Fi
    140 or above              8                          5.9                      .75
    130 - < 140              11                         10.4                      .03
    120 - < 130              15                         18.1                      .53
    110 - < 120              20                         22.7                      .32
    100 - < 110              21                         20.5                      .01
    90 - < 100               16                         13.4                      .50
    below 90                  9                          9.0                      0
    Total                   100                        100.0                     2.14

    The degrees of freedom are k - 3. There are seven intervals, therefore d.f. are 7 - 3 = 4

    Chi square = Σ (fi - Fi)²/Fi = 2.14. From the chi square table, the critical value for α = .05 and 4 degrees of freedom is 9.49. The calculated value of 2.14 is less than the critical value, therefore the null hypothesis is not rejected. It is concluded that the universe could be normally distributed.
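The chi square total for Example 4 can be reproduced with the sketch below. The frequencies are copied from the data table, the critical value 9.49 is entered by hand from the chi square table, and the function name is hypothetical. Summing the unrounded cell values gives about 2.15; the 2.14 in the text comes from adding the cell values after rounding each to two decimals.

```python
def chi_square(observed, theoretical):
    """Sum (f - F)^2 / F over all cells."""
    return sum((f - F) ** 2 / F for f, F in zip(observed, theoretical))


# Frequencies from the Example 4 table, highest interval first.
observed = [8, 11, 15, 20, 21, 16, 9]
theoretical = [5.9, 10.4, 18.1, 22.7, 20.5, 13.4, 9.0]

# Table value entered by hand: alpha = .05, 7 - 3 = 4 degrees of freedom.
CHI2_CRIT = 9.49

chi2 = chi_square(observed, theoretical)
reject = chi2 > CHI2_CRIT  # rejection region is to the right of the critical value
print(f"chi square = {chi2:.2f}, critical value = {CHI2_CRIT}, reject H0: {reject}")
```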

    4.1 Sketch of the Chi Square Distribution

    The chi square distribution is not a symmetrical distribution. Its shape depends on the degrees of freedom. If it is known that a universe is normal, the distribution of sample variances has the form of the chi square distribution. The chi square statistic computed in the example is a sum of squared deviations, each scaled by its theoretical frequency, so it behaves like a scaled sample variance. This is why the chi square distribution is used to perform the test.

    The critical value of chi square is defined as the value on the abscissa that indicates where the region of rejection begins. The numbers on the abscissa are the various values of chi square. The region of rejection is the area to the right of the critical value and is indicated by the shading.

     

     

    If goodness of fit tests are made to compare sample data to other distributions, such as the uniform or binomial, the theoretical frequencies would be obtained using the specific distribution.

    4.2 Theoretical Frequencies for the Example

    The theoretical frequencies are obtained from the standard normal curve table.

     

    x̄ = 113.1 and s = 17.21
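One way to see where the theoretical frequencies come from is to compute the normal curve area in each interval and multiply by the sample size. The sketch below does this with the standard library instead of a printed table; the function name and the list of interval boundaries are assumptions made for illustration. The results agree with the tabled Fi values to within rounding.

```python
from statistics import NormalDist


def theoretical_frequencies(boundaries, mean, sd, n):
    """Expected counts for the cells defined by sorted interval boundaries.

    The first cell is everything below boundaries[0] and the last cell is
    everything at or above boundaries[-1].
    """
    dist = NormalDist(mean, sd)
    # Cumulative areas at each boundary, padded with 0 and 1 for the open ends.
    cdf_values = [0.0] + [dist.cdf(b) for b in boundaries] + [1.0]
    return [n * (upper - lower) for lower, upper in zip(cdf_values, cdf_values[1:])]


# Interval boundaries from Example 4, with x-bar = 113.1, s = 17.21, n = 100.
boundaries = [90, 100, 110, 120, 130, 140]
freqs = theoretical_frequencies(boundaries, 113.1, 17.21, 100)

labels = ["below 90", "90 - < 100", "100 - < 110", "110 - < 120",
          "120 - < 130", "130 - < 140", "140 or above"]
for label, freq in zip(labels, freqs):
    print(f"{label:<14}{freq:6.1f}")
```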

     

    5.0 INFERENCES ABOUT TWO MEANS

    When a control group is compared with an experimental group, or when two experimental groups are compared, the concern is primarily with differences in their means. This test assumes that the samples are independent. Two samples are independent when the outcome of one does not affect the other. The pooled variance is the combined variance of two or more samples and is denoted by s_p².

    The pooled variance for two samples is

    $$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} $$

    where n_1 and s_1² are the sample size and variance for the first sample and n_2 and s_2² are the sample size and variance for the second. The pooled standard deviation is s_p, the square root of s_p². The denominator (n_1 + n_2 - 2) indicates the total degrees of freedom.

    Two processes are being evaluated. A study shows that 10 employees working on process A have an average assembly time of 250 seconds with variance of 100. In addition, 20 employees on process B have an average assembly time of 225 seconds with variance of 80. Are the average assembly times for the two processes significantly different?

    For this type of test, the variances are always tested first for significant differences. The F ratio and corresponding F table values are used to test the variances. If the variances are significantly different, the test for differences in means is not valid. Before testing for differences in means, the assumption is that the variances are homogeneous.

    The t table is used for the test of differences between means since both sample sizes are less than thirty.

    The formula for conversion from the x̄ scale to the t scale:

    $$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$

    The variances are tested for significant differences (α = .05).

    1) H0: σ_1² = σ_2²

       H1: σ_1² > σ_2²

    2) $$ F = \frac{s_L^2}{s_s^2} = \frac{100}{80} = 1.25 $$

    The F ratio is a value on the abscissa of the F distribution curve. The symbol s_L² represents the larger of the two variances and the symbol s_s² represents the smaller of the two variances. The sample size for the larger variance is denoted by n_L and the sample size for the smaller variance is denoted by n_s. There are n_L - 1 degrees of freedom for the numerator and n_s - 1 degrees of freedom for the denominator.

    3) Region of rejection: F ≥ 2.42. This is the critical value of F from the F.05 table with 9 degrees of freedom for the numerator and 19 degrees of freedom for the denominator.

    4) The calculated value of F, 1.25, is less than the critical value of 2.42. The null hypothesis that σ_1² = σ_2² is not rejected. It is concluded that there is no significant difference between the two variances.

     

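    The means are now tested for significant differences (α = .05).

    1) H0: μ_1 = μ_2

       H1: μ_1 ≠ μ_2

    2) Region of rejection: t < -2.048 or t > +2.048. These are the critical values of t for n_1 + n_2 - 2 = 28 degrees of freedom.

    3) $$ s_p^2 = \frac{(10 - 1)(100) + (20 - 1)(80)}{10 + 20 - 2} = \frac{2420}{28} = 86.43, \qquad s_p = \sqrt{86.43} = 9.30 $$

       $$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{250 - 225}{9.30\sqrt{\frac{1}{10} + \frac{1}{20}}} = \frac{25}{3.60} = 6.94 $$

    4) The calculated value of t, t = +6.94, falls beyond the critical value of t = +2.048, therefore the null hypothesis that μ_1 = μ_2 is rejected. The conclusion is that there is a statistically significant difference between the two means.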
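The two stage procedure for the process A and process B data (F ratio first, then the pooled t test) can be sketched as follows. The function names are hypothetical, and the critical values 2.42 and 2.048 are entered by hand from the F and t tables because the Python standard library does not provide those distributions.

```python
from math import sqrt


def pooled_variance(n1, var1, n2, var2):
    """Combine two sample variances, weighting each by its degrees of freedom."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """t statistic for the difference between two independent sample means."""
    sp = sqrt(pooled_variance(n1, var1, n2, var2))  # pooled standard deviation
    return (mean1 - mean2) / (sp * sqrt(1 / n1 + 1 / n2))


# Process A: n = 10, mean = 250, variance = 100.
# Process B: n = 20, mean = 225, variance = 80.
F_CRIT = 2.42   # table value: F.05 with 9 and 19 degrees of freedom
T_CRIT = 2.048  # table value: two tailed, alpha = .05, 28 degrees of freedom

f_ratio = max(100, 80) / min(100, 80)  # larger variance over smaller variance
variances_differ = f_ratio >= F_CRIT

sp2 = pooled_variance(10, 100, 20, 80)
t_calc = two_sample_t(250, 100, 10, 225, 80, 20)
means_differ = abs(t_calc) > T_CRIT

print(f"F = {f_ratio:.2f}, variances significantly different: {variances_differ}")
print(f"pooled variance = {sp2:.2f}, t = {t_calc:.2f}, means significantly different: {means_differ}")
```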