
Tuesday, May 23, 2017

LESSON 10

UNIT 10: STATISTICAL HYPOTHESES. HYPOTHESIS TESTS.


1. HYPOTHESIS TESTS (CONTRASTS)

To control random errors, in addition to the calculation of confidence intervals, we have a second tool in the process of statistical inference: hypothesis tests or contrasts.

With hypothesis tests, the strategy is as follows:
- We establish a priori a hypothesis about the value of the parameter.
- We collect the data.
- We analyze the consistency between the prior hypothesis and the data obtained.

Hypothesis tests are tools for answering research questions: they allow us to quantify the compatibility between an established hypothesis and the results obtained.
Whatever the researchers' wishes, the hypothesis test always tests the null hypothesis.
The type of statistical analysis to use depends on the type of variables involved in the study.



2. HYPOTHESIS ERRORS.

The hypothesis test measures the probability of error that we make if we reject the null hypothesis (H0).
With the same sample we can reject or fail to reject the null hypothesis. Everything depends on the level of error we are willing to tolerate, which we call α.

• The error α is the probability of mistakenly rejecting the null hypothesis when it is true.
• The smallest error at which we can reject H0 is the p-value (p is the minimum α at which H0 would still be rejected).
We usually reject H0 for a maximum α level of 5% (p < 0.05). Above 5% of error, we do not reject the null hypothesis (informally, we "accept" it). This is what we call "statistical significance".
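
The relationship between p and α can be sketched with a small exact test. The example below is only an illustration and is not from the lesson: it assumes a hypothetical experiment (16 heads in 20 coin tosses), a null hypothesis that the coin is fair, and a helper named `two_sided_p_value` invented for this sketch.

```python
from math import comb

def two_sided_p_value(heads: int, tosses: int) -> float:
    """Exact two-sided p-value for H0: the coin is fair (P(heads) = 0.5)."""
    observed_weight = comb(tosses, heads)
    # Under H0 the probability of k heads is comb(tosses, k) / 2**tosses.
    # Add up every outcome that is at most as likely as the observed one.
    extreme = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if comb(tosses, k) <= observed_weight
    )
    return extreme / 2 ** tosses

alpha = 0.05                    # maximum type I error we accept
p = two_sided_p_value(16, 20)   # hypothetical data: 16 heads in 20 tosses
reject_h0 = p < alpha           # reject H0 only when p is below alpha
print(f"p = {p:.4f}, reject H0: {reject_h0}")
```

Changing `alpha` changes the decision but not `p`: the same sample can lead to rejecting or not rejecting H0 depending on the error level chosen beforehand.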


3. TYPES OF ERRORS IN HYPOTHESIS TESTS.



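A type I error (α) is rejecting a null hypothesis that is true; a type II error (β) is failing to reject a null hypothesis that is false.
The most important error for us is the type I (α) error. We accept a probability of up to 5% of making it.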
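
What "being mistaken up to 5%" means can be checked with a simulation. This is a rough sketch under assumptions that are not in the lesson: normally distributed data with known standard deviation, a two-sided z-test, and invented names such as `rejects_h0`. Every simulated experiment has a true null hypothesis, so each rejection is a type I error.

```python
import random
from math import sqrt

random.seed(0)             # fixed seed so the run is repeatable
CRITICAL_Z = 1.96          # two-sided critical value for alpha = 0.05
N, SIGMA, EXPERIMENTS = 30, 1.0, 10_000

def rejects_h0() -> bool:
    """Run one experiment where H0 (mean = 0) is true; True if we reject it."""
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    z = (sum(sample) / N) / (SIGMA / sqrt(N))
    return abs(z) > CRITICAL_Z

# Fraction of experiments in which a true H0 was wrongly rejected.
type_i_rate = sum(rejects_h0() for _ in range(EXPERIMENTS)) / EXPERIMENTS
print(f"Share of true null hypotheses rejected: {type_i_rate:.3f}")
```

The printed share should land close to 0.05, which is the α level built into the critical value.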


4. CHI-SQUARE HYPOTHESIS TEST.

It is used to compare qualitative variables (both the dependent and the independent variable are qualitative).
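
As an illustration, the chi-square statistic can be computed by hand from a table of counts. The sketch below uses made-up counts and a helper named `chi_square_statistic`, both assumptions for this example. It compares each observed count with the count expected if the two variables were independent, and it only computes a p-value for the 2x2 case (1 degree of freedom), where a closed form exists.

```python
from math import erfc, sqrt

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Count expected in this cell if the variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed_count - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = exposed / not exposed, columns = ill / healthy.
observed = [[30, 20],
            [10, 40]]
chi2, df = chi_square_statistic(observed)
# With 1 degree of freedom the p-value is P(|Z| > sqrt(chi2)) for a normal Z.
p = erfc(sqrt(chi2 / 2)) if df == 1 else None
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p}")
```

For larger tables the statistic and degrees of freedom are still valid, but the p-value would have to come from a chi-square table or a statistics library.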





5. STUDENT'S T-TEST (comparison of means)

It is used when the independent variable is qualitative (dichotomous) and the dependent variable is continuous quantitative. It can only be used to compare two groups.
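
A minimal sketch of the comparison of two means follows. It assumes the classic version of the test with equal variances in both groups, and it uses invented measurements and an invented helper name `pooled_t`. Only the t statistic is computed; the decision is made against the tabulated two-sided 5% critical value for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t statistic (equal variances assumed) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Weighted average of the two sample variances.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df

# Hypothetical continuous measurements in two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.1, 4.5, 4.0, 4.8, 4.6]
t, df = pooled_t(group_a, group_b)
T_CRITICAL = 2.306   # tabulated two-sided 5% value for 8 degrees of freedom
significant = abs(t) > T_CRITICAL
print(f"t = {t:.2f}, df = {df}, significant at 5%: {significant}")
```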


6. JOINT STUDY OF TWO VARIABLES.

For this we collect the data in tables:
· Each row holds the data of one individual. Each column holds the values that a variable takes for those individuals. Individuals are not listed in any particular order.
· These observations can be represented in a scatter diagram, where each individual is a point whose coordinates are the values of the variables.


7. SCATTER DIAGRAM (POINT CLOUD).

If we have the heights and weights of x individuals, we can plot them in a scatter diagram to observe how they are distributed and to see the RELATIONSHIP BETWEEN BOTH VARIABLES.




8. PREDICTION OF ONE VARIABLE AS A FUNCTION OF ANOTHER.

Apparently, weight increases by X kg for each Y cm of height.
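
The idea of "so many kg per cm" is the slope of a fitted line. The sketch below uses invented heights and weights and an invented helper `predict_weight`, with the ordinary least-squares formulas, to estimate that slope and predict a weight from a height. It is an illustration only, not data from the lesson.

```python
from statistics import mean

# Hypothetical data: one height (cm) and one weight (kg) per individual.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 67, 74, 84]

mean_h, mean_w = mean(heights), mean(weights)
# Least-squares slope: co-variation of both variables over variation of height.
slope = (
    sum((h - mean_h) * (w - mean_w) for h, w in zip(heights, weights))
    / sum((h - mean_h) ** 2 for h in heights)
)
intercept = mean_w - slope * mean_h

def predict_weight(height_cm: float) -> float:
    """Weight predicted by the fitted line for a given height."""
    return intercept + slope * height_cm

print(f"Weight increases about {slope:.2f} kg per cm of height")
print(f"Predicted weight at 175 cm: {predict_weight(175):.1f} kg")
```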


9. SIMPLE LINEAR REGRESSION: CORRELATION AND DETERMINATION.

· The aim is to study the linear association between two quantitative variables.
· Example: influence of age on systolic blood pressure.
· Deterministic linear models: the independent variable determines the value of the dependent variable, so each value of the independent variable corresponds to exactly one value of the dependent variable.
· Probabilistic linear models: for each value of the independent variable there is a probability distribution of values of the dependent variable, each with a probability between 0 and 1.
· In practice there is no deterministic model: there is a cloud of points, and we look for the line that best explains the behavior of the dependent variable as a function of the independent variable.

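· Correlation coefficient (Pearson and Spearman): dimensionless number (between -1 and 1) that measures the strength and the direction of the linear relationship between variables:
r = β1 · Sx / Sy
where β1 is the slope of the regression line and Sx and Sy are the standard deviations of the independent and dependent variables (this expression applies to Pearson's r).

· Coefficient of determination: dimensionless number (between 0 and 1) that gives the proportion of the variability of the dependent variable explained by its linear relationship with the independent variable; it is r².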
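
The relation between the slope, r and r² can be checked numerically. The sketch below uses invented ages and systolic blood pressures, following the example named earlier in this section. It computes Pearson's r directly, recomputes it from the slope as β1 · Sx / Sy, and squares it to get the coefficient of determination. All variable names are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data: age (years) and systolic blood pressure (mmHg).
ages = [30, 40, 50, 60, 70]
sbp = [118, 125, 131, 140, 146]

mean_x, mean_y = mean(ages), mean(sbp)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, sbp))
sxx = sum((x - mean_x) ** 2 for x in ages)
syy = sum((y - mean_y) ** 2 for y in sbp)

slope = sxy / sxx                  # beta1 of the regression line
r = sxy / sqrt(sxx * syy)          # Pearson correlation coefficient
r_from_slope = slope * stdev(ages) / stdev(sbp)   # beta1 * Sx / Sy
r2 = r ** 2                        # coefficient of determination

print(f"slope = {slope:.2f}, r = {r:.4f}, r^2 = {r2:.4f}")
print(f"r from slope = {r_from_slope:.4f}")
```

Both ways of computing r agree, and the sign of r matches the sign of the slope.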




