Friday, December 26, 2008

8. Analyse Experimental Data

Experimental/Quasi-experimental Questions

Descriptive data analysis answers the question, "What is it like?"

Experimental or quasi-experimental data analysis usually answers the question, "Is A different to B?"

To answer these questions, hypotheses are tested using the data collected and statistical tests. The statistical tests tell you how confident you can be that the results from the study are what will happen in other similar situations.

Before discussing how to choose the right test, we will look at the things that are common to all statistical tests (hypothesis and probability) and the basic features of the study which will determine which is the most suitable test.

Hypothesis

A hypothesis is an explicit, operational form of the research question. Hypotheses are usually written in the method section. There are 2 forms of hypothesis for each question: the null hypothesis (H0) and the alternate hypothesis (HA). The null hypothesis states that there is no difference between A and B. The alternate hypothesis states the outcome the researcher expects, based on an understanding of the principles and theories applicable to the situation. These principles and theories usually result in the alternate hypothesis stating not only that there will be a difference between A and B, but also which of A and B will be greater. The alternate hypothesis is often called the research hypothesis and should be logically defensible, usually on the basis of information available in the literature.

Hypothesis Testing

Unless you measure every A and every B, you can never be sure that A = B or that A does not equal B. You use probability to indicate how confident you can be that what your study showed is the real situation.

Probability

Probability (p) is reported as a number between 0 and 1. If the probability is 1, the event always happens; if the probability is 0, it never happens.

As a researcher, you have to choose the level of probability that you will call statistically significant in your study. Remember that there are 2 types of significance: statistical, and clinical or practical.

Statistical significance concerns the probability of getting results like those in your study by chance alone.

Clinical significance concerns whether any statistically significant differences are of practical or clinical importance.

Statistical significance levels commonly used in health sciences are 0.05 (95% confidence) and 0.01. Remember that the level is a fixed cut-off: p = 0.06 is not significant at the 0.05 level, even though it is numerically close to p = 0.05. The level you choose is the chance you accept of making an error in the interpretation of your results (a type 1 error).

This chance that you have said there IS a difference when there IS NOT a difference in the real world is represented by the symbol alpha (a). This is called a type 1 error. This is the probability usually reported in data analysis, for example, p < 0.05.

The chance that you have said there IS NOT a difference when there IS a difference in the real world is represented by the symbol beta (b). This is called a type 2 error.


The chance that you are correct when your results say there IS a difference and there IS a difference in the real world is 1 - b, which is called power.
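The two error rates can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: scores are drawn from normal distributions with a known standard deviation of 1, a simple z test is used, and the function name `rejection_rate` and all its parameter values are made up for this example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejection_rate(true_diff, n_per_group=20, alpha=0.05, n_studies=2000, seed=1):
    """Fraction of simulated studies that report a significant difference."""
    rng = random.Random(seed)
    # Critical value for a two-sided test at the chosen significance level.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in means (sd assumed known and equal to 1).
    se = sqrt(2 / n_per_group)
    rejections = 0
    for _ in range(n_studies):
        a = [rng.gauss(true_diff, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if abs((mean(a) - mean(b)) / se) > z_crit:
            rejections += 1
    return rejections / n_studies


# No real difference: every "significant" study is a type 1 error.
type1_rate = rejection_rate(true_diff=0.0)
# A real difference of 1 sd: significant studies are correct, the rest are type 2 errors.
power = rejection_rate(true_diff=1.0)

print(f"estimated type 1 error rate: {type1_rate:.3f}")
print(f"estimated power: {power:.3f}, type 2 error rate: {1 - power:.3f}")
```

With no real difference, the proportion of significant studies should sit close to the chosen level of 0.05; with a real difference, the proportion of significant studies is an estimate of power.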

This probability is determined by calculating a statistic representing the difference between A and B and comparing the statistic to a distribution table to see how likely it is to get a statistic at least that large if the null hypothesis is true. The statistics used include z, t and F.
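As a rough sketch of this calculation, the example below computes a z statistic for the difference between two group means and looks up the two-sided probability from the standard normal distribution instead of a printed table. It assumes independent groups that are large enough for the normal approximation (with small samples a t statistic would normally be used instead). The function name `two_sample_z` and the scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sample_z(a, b):
    """Return (z, two-sided p) for the difference between the means of a and b."""
    # Standard error of the difference between two independent means.
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se
    # Chance of a statistic at least this far from zero if the null hypothesis is true.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical scores, made up for illustration only.
group_a = [23, 27, 25, 30, 28, 26, 24, 29, 27, 25]
group_b = [21, 24, 22, 26, 23, 25, 20, 24, 22, 23]

z, p = two_sample_z(group_a, group_b)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The resulting p is then compared with the significance level chosen before the study.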

Obviously it is desirable to have the smallest a and b possible, i.e. the least chance that what you have said is wrong. The chances of making errors are related to the number of subjects you sample and the size of the effect of the treatment. To reduce the chance of a type 2 error (b) and increase the power, increase the number of subjects, reduce the variance of your data or look for a larger effect. Power should be greater than 0.7 and preferably 0.8 or 0.9. You will note that a higher chance of making a type 2 error is accepted than of making a type 1 error, i.e. interpretation errs on the conservative "no difference" side.
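The effect of sample size, variance and effect size on power can be sketched with a normal approximation. The example below assumes two independent groups of equal size, a common standard deviation and a two-sided test. The function name `power_two_sample` and the numbers passed to it are illustrative assumptions, not values from any study.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power (1 - b) to detect a true mean difference of delta."""
    std_normal = NormalDist()
    # Critical value of z for a two-sided test at level alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # True difference expressed in standard errors of the difference in means.
    shift = delta / (sigma * sqrt(2 / n_per_group))
    # Chance that the observed z falls beyond either critical value.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# More subjects, less variance or a larger effect each raise the power.
for n in (10, 20, 40, 80):
    print(f"n per group = {n:3d}: power = {power_two_sample(5, 10, n):.2f}")
print(f"smaller sd (5):     power = {power_two_sample(5, 5, 20):.2f}")
print(f"larger effect (10): power = {power_two_sample(10, 10, 20):.2f}")
```

Statistical packages usually base power on the t distribution, so their figures will differ slightly from this approximation.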

Type of data

The categories of data previously discussed (nominal, ordinal and metric) are important when choosing an appropriate statistical test. There are 2 major classes of tests: one suitable for parametric data and one suitable for non-parametric data.

Parametric data is metric, normally distributed, and of equal variance across the different data sets. Remember that metric means the intervals between numbers are equal. Normally distributed means the scores form a bell-shaped curve when a histogram is plotted.

Nominal and ordinal data are non-parametric. Metric data which is not close to normally distributed, or which has very different variances in the different groups, will probably have to be treated as non-parametric.
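Two quick descriptive checks can help with this judgement alongside a histogram: the ratio of the largest to the smallest group variance, and the skewness of each group (0 for a symmetric distribution). The sketch below only computes these numbers; it does not decide for you. The helper names and the scores are hypothetical.

```python
from statistics import mean, variance


def skewness(values):
    """Sample skewness: 0 for symmetric scores, positive for a long right tail."""
    m = mean(values)
    n = len(values)
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


def variance_ratio(groups):
    """Largest group variance divided by the smallest (1 means equal variances)."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical scores, made up for illustration only.
group_a = [12, 15, 14, 10, 13, 16, 14, 12]
group_b = [11, 30, 12, 10, 13, 12, 55, 11]

print(f"variance ratio: {variance_ratio([group_a, group_b]):.1f}")
for name, scores in (("A", group_a), ("B", group_b)):
    print(f"group {name} skewness: {skewness(scores):.2f}")
```

A variance ratio far from 1 or a skewness far from 0 suggests the data may need to be treated as non-parametric.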


Choose a Statistical Test

All of this preamble is to assist you in choosing a suitable statistical test. The decision pathway to choosing the correct test is often represented with a flow diagram. However, you need to know the following information before starting the decision process:

Question.........Are you asking about differences?
Design...........Are there one or two or more independent variables (IVs)?
.................Are the sets of data independent or are they related?
.................Are there two or more than two sets of data to compare?
Data.............Is it parametric or non-parametric?
Probability......What level of probability will you accept as significant?

Choosing a Statistical Test (Flow chart)

Question: Differences

Design...............................Parametric test..................Non-parametric test
One IV, 2 sets, related..............Related t-test...................Wilcoxon
One IV, 2 sets, unrelated............Unrelated t-test.................Mann-Whitney
One IV, >2 sets, related.............1-way ANOVA (related)............Friedman
One IV, >2 sets, unrelated...........1-way ANOVA (unrelated)..........Kruskal-Wallis
Two+ IVs, related....................2-way/>2-way ANOVA (related).....(none listed)
Two+ IVs, unrelated..................2-way/>2-way ANOVA (unrelated)...(none listed)
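The chart above can also be read as a small lookup. The sketch below encodes it as a function under a few assumptions: "Mann" is taken to mean the Mann-Whitney test, the chart is read as listing no non-parametric test for two or more IVs (so the function returns None there), and the function name `choose_test` and its parameters are made up for this example.

```python
def choose_test(n_ivs, n_sets, related, parametric):
    """Return the test the flow chart points to, or None if it lists none."""
    if n_ivs < 1 or n_sets < 2:
        raise ValueError("need at least one IV and two sets of data")
    design = "related" if related else "unrelated"
    if n_ivs >= 2:
        # The chart lists only parametric options for two or more IVs.
        return f"2-way/>2-way ANOVA ({design})" if parametric else None
    if n_sets == 2:
        if parametric:
            return f"{design} t-test"
        return "Wilcoxon" if related else "Mann-Whitney"
    # One IV and more than two sets of data.
    if parametric:
        return f"1-way ANOVA ({design})"
    return "Friedman" if related else "Kruskal-Wallis"


# Example: one IV, three unrelated groups, data treated as non-parametric.
print(choose_test(n_ivs=1, n_sets=3, related=False, parametric=False))
```

Answering the design and data questions first gives the four arguments, and the function then follows the same branches as the chart.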

Analysing the Data Using Computer Statistical Packages

Packages include Excel, StatView, SuperANOVA and SPSSx, and there are now many more, including multiple regression and correlation packages. A suggestion is to put the data into an Excel file first and copy it into the other packages later. The format used to enter data can be quite important.
