Statistical Analysis in R: Tests with Examples
R is a statistical computing language and environment. It is versatile and offers a wide range of statistical techniques, including linear and nonlinear modeling, standard statistical tests, classification, time series analysis, clustering, and graphical tools. One of its primary strengths is how easily it produces simple summaries and graphics from complex, unstructured data.
Statistical tests are mathematical procedures that analysts use to draw conclusions from data. They allow us to make decisions based on observable patterns in the data. Numerous statistical tests are available. Which one to use depends on the data format, the distribution, the variable type, and the overall objective(s) of the analysis.
R can perform a wide variety of statistical analyses. Some of the basic ones include:
- Matched pairs tests
- Association tests
- Goodness of Fit tests
Paired Test
When your data are in matched pairs, you can use either the t-test or the U-test (for paired data, the Wilcoxon signed-rank test). Depending on the circumstances, this type of test is sometimes known as a repeated measures test. You run it by adding paired = TRUE to the t.test() or wilcox.test() command.
Consider a study of how effective rat traps are at catching rats. Each trap has two sections, one black and one white. Because both sections belong to the same trap, the catches form matched pairs, so a paired test can compare the black and white sections.
The t.test() command handles the paired t-test. A similar analysis can be run with the wilcox.test() command, which does not assume that the paired differences are normally distributed.
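As a minimal sketch of the paired tests described above, the following uses made-up catch counts for eight hypothetical traps. The vectors `black` and `white` and all of their values are illustrative assumptions, not data from the original example.

```r
# Hypothetical catch counts per trap (invented for illustration only).
# Element i of each vector refers to the same trap, so the data are paired.
black <- c(12, 15, 9, 14, 11, 16, 13, 10)
white <- c(8, 12, 10, 9, 7, 13, 8, 9)

# Paired t-test: tests whether the mean within-trap difference is zero.
t_res <- t.test(black, white, paired = TRUE)

# Paired Wilcoxon (signed-rank) test: a rank-based alternative.
# exact = FALSE requests the normal approximation, because tied
# differences rule out an exact p-value.
w_res <- wilcox.test(black, white, paired = TRUE, exact = FALSE)

print(t_res)
print(w_res)
```

Both calls return an object of class "htest", so the p-value can be read from `t_res$p.value` or `w_res$p.value` rather than from the printed report.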
Association Tests (Chi-Squared Test)
The chisq.test() function allows a statistician to do association tests. Your data should be organized as a contingency table. Below is an illustration:
Material | Location A | Location B | Location C |
Plastic | 14 | 6 | 20 |
Glass | 10 | 17 | 15 |
Metal | 12 | 9 | 8 |
Cardboard | 16 | 10 | 11 |
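> library(readxl)  # only needed when the table is imported from an Excel file
> View(Tables)     # Tables is assumed to already hold the contingency table above
> data = chisq.test(Tables)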
In this dataset, the columns represent one set of categories (locations), and the rows represent another (table materials). Each cell holds the count for one material at one location. In the original spreadsheet (CSV file), before it is imported into R, the columns likewise hold the locations and each row holds one material.
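For readers without the original spreadsheet, the sketch below enters the counts from the table above by hand and runs the same test. The object name `Tables` follows the snippet above; the dimension names `Material` and `Location` and the result name `assoc` are assumptions chosen for readability.

```r
# Counts copied from the contingency table above:
# rows are table materials, columns are locations.
Tables <- matrix(
  c(14,  6, 20,
    10, 17, 15,
    12,  9,  8,
    16, 10, 11),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    Material = c("Plastic", "Glass", "Metal", "Cardboard"),
    Location = c("A", "B", "C")
  )
)

# Chi-squared test of association between material and location.
assoc <- chisq.test(Tables)

print(assoc$expected)   # counts expected if material and location were independent
print(assoc$statistic)  # the X-squared statistic
print(assoc$parameter)  # degrees of freedom: (4 - 1) * (3 - 1) = 6
print(assoc$p.value)
```

Inspecting `assoc$expected` is a useful habit, since the chi-squared approximation becomes unreliable when expected counts are very small.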
Goodness of Fit Test
A goodness of fit test is a special case of an association test. Use it when you have a set of categorized counts and want to compare them with a known or theoretical standard.
> maize
[1] 110 35 68 40
> ratio
[1] 8 2 2 3
The counts of maize plants (four phenotypes) shown above come from a cross-pollination experiment. According to genetic theory, the phenotypes should occur in the ratio 8:2:2:3. The goodness of fit test tells us whether the observed counts are consistent with that ratio.
> g.fit=chisq.test(maize,p=ratio,rescale.p=TRUE)
> g.fit
	Chi-squared test for given probabilities
data:  maize
X-squared = 41.684, df = 3, p-value = 4.682e-09
The rescale.p = TRUE argument rescales the supplied ratio so that the probabilities sum to one (8:2:2:3 sums to 15, not 1). The result is highly statistically significant: the p-value of 4.682e-09 is far below the usual 0.05 threshold, so the observed maize counts differ significantly from the expected 8:2:2:3 ratio.
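To make the arithmetic behind this test concrete, the sketch below recomputes the statistic by hand from the `maize` and `ratio` vectors shown above and compares it with the output of chisq.test(). The helper names `p_expected`, `expected`, `x2`, `df`, and `p_value` are assumptions introduced for illustration.

```r
maize <- c(110, 35, 68, 40)   # observed counts from the example above
ratio <- c(8, 2, 2, 3)        # theoretical ratio from the example above

# Rescale the ratio to probabilities that sum to one,
# which is the effect of rescale.p = TRUE.
p_expected <- ratio / sum(ratio)

# Expected counts under the theoretical ratio.
expected <- sum(maize) * p_expected

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
x2 <- sum((maize - expected)^2 / expected)

# Degrees of freedom: number of categories minus one.
df <- length(maize) - 1

# Upper-tail probability of the chi-squared distribution.
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

# The built-in test should agree with the hand calculation.
g.fit <- chisq.test(maize, p = ratio, rescale.p = TRUE)

print(expected)
print(c(by_hand = x2, built_in = unname(g.fit$statistic)))
print(c(by_hand = p_value, built_in = g.fit$p.value))
```

The third phenotype contributes most of the statistic: 68 plants were observed where roughly 34 were expected, which is why the test rejects the 8:2:2:3 ratio.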