Aabel Statistics & Multivariate Data Analysis Methods
Aabel Stats Analyzer
Inferential Statistics and Multivariate Data Analysis Methods
Testing for Normality
Many inferential tests make certain assumptions about the shape of the underlying population distribution. The most commonly encountered assumption is that the population from which each sample is drawn is normal. The shape of a normal distribution (also referred to as a Gaussian distribution) is such that the closer a score is to the mean, the more frequently it occurs; the more extreme the deviation of a score from the mean, the less frequently it occurs.
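The shape property described above can be checked numerically. The following is a minimal sketch using Python's standard library, not Aabel's own implementation; the population mean and standard deviation are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical population: mean 100, standard deviation 15 (illustrative values only).
dist = NormalDist(mu=100.0, sigma=15.0)

# Density of scores lying 0, 1, 2, and 3 standard deviations above the mean.
densities = [dist.pdf(dist.mean + k * dist.stdev) for k in range(4)]

for k, density in enumerate(densities):
    print(f"{k} SD from the mean: density = {density:.5f}")
```

The printed densities fall steadily as the distance from the mean grows, and by symmetry the same values apply to scores below the mean.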
Aabel provides:
Testing for Homogeneity of Variance
Homogeneity of variance refers to equal variances across groups/samples.
These tests evaluate whether or not the population variances represented by k >= 2 samples/groups are equal.
Analysis of Variance (ANOVA)
Aabel provides the following ANOVA options:
Multiple Comparisons/Post-Hoc Tests Accompanying ANOVA
Multiple comparison/post-hoc tests are methods for evaluating the differences between group means. In an analysis of variance, rejecting the null hypothesis means that at least two of the group means differ significantly, but it does not tell us which group means differ or how many of them differ from each other. Multiple comparison/post-hoc tests are designed to answer these questions.
Multiple comparisons/post-hoc tests accompanying the analysis of variance methods in Aabel include:
For more information regarding multiple comparisons accompanying ANCOVA, click here.
Single-Factor Between-Subjects Analysis of Covariance (ANCOVA)
This is an extension of single-factor between-subjects ANOVA. If some of the variation in the dependent variable scores is caused by another continuous variable (a covariate), ANCOVA removes this variation from the error (random) variance, which increases the sensitivity of the test for treatment effects. The test output includes:
For more information regarding single-factor between-subjects ANCOVA, click here.
Chi-Square Tests
Aabel provides the following chi-square tests:
t-Tests
Aabel provides the following t-Tests:
F-Test
This is a parametric test that evaluates whether or not the two populations from which the two samples are taken have equal variances (or equal standard deviations). The number of subjects/objects can be the same or different for the two samples.
z-Tests
Aabel provides the following z-Tests:
Correlations
Aabel provides the following correlation methods:
Internal Consistency Reliability
Certain quantities of interest in medicine, psychology, etc., cannot be measured directly. Instead, they are assessed by asking a series of questions and combining the answers into a single numerical value, or by a scale of pass/fail, yes/no, or other dichotomous items. Forming a scale in this manner requires internal consistency, i.e., the items should all measure the same construct. Aabel provides the following methods for internal consistency reliability estimates:
For more information, click here.
Non-Parametric Tests
Non-parametric tests are used when the assumptions required by their parametric counterparts are not met or are questionable. All tests involving ranked data are non-parametric. Aabel provides the following non-parametric tests:
Contingency Table Analysis and Mosaic Matrix Diagrams
The contingency table analysis in Aabel includes:
Kaplan-Meier Survival Analysis
Aabel provides methods for Kaplan-Meier survival analysis using either raw survival data or summarized survival data. If the Kaplan-Meier curve is plotted without taking into account the number of subjects remaining at risk beyond completion of the study, the shape of the curve will be unaffected, but the survival probability values will be affected.
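To make the role of the number of subjects at risk concrete, here is a minimal sketch of the standard Kaplan-Meier product-limit estimate computed from raw survival data. It is an illustration only, not Aabel's implementation; the function name and the sample data are hypothetical.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time of each subject
    events -- 1 if the event was observed at that time, 0 if the subject was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    survival = 1.0
    for t in sorted(deaths):
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical raw survival data (months); 0 marks a censored subject.
times = [2, 3, 3, 5, 6, 8, 8, 9]
events = [1, 1, 0, 1, 0, 1, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored subjects never trigger a step in the curve, but they do count in the at-risk denominator until they leave the study, which is why the probability values depend on tracking them.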
The Kaplan-Meier survival analysis output includes:
The logrank test is a non-parametric test that makes use of the full survival data without making any assumption about the shape of the survival curve or the distribution of survival times. The logrank test results include the hazard ratio (i.e., the risk factor for one group, treatment, etc. compared to another group, treatment, etc.), the logrank chi-square, and p values.
For more information about Kaplan-Meier survival analysis and the accompanying logrank test, click here.
Receiver Operating Characteristic Curves (ROC)
ROC curves were originally developed to analyze noisy radio signals. Today, however, ROC curves are widely used to display a plot of the true positive rate against the false positive rate for the different possible cut-points of a diagnostic test.
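The construction of a ROC curve from cut-points can be sketched as follows. This is an illustrative sketch, not Aabel's implementation: the function names, the convention that a score at or above the cut-point counts as a positive result, and the sample data are all assumptions. The area under the curve is included as a common summary, although the text above does not mention it.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per cut-point.

    A case is called positive when its score is >= the cut-point (an assumed convention).
    labels: 1 for truly positive cases, 0 for truly negative cases.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # cut-point above every score: nothing is called positive
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical diagnostic test results.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
print(points)
print(area_under_curve(points))
```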
Aabel's implementation of ROC allows plotting:
For more information, see examples.
Regression Analysis Concerning Two Continuous Variables
The regression methods grouped under this title deal either with finding the relationship between one outcome (dependent) variable and one predictor (independent) variable, or with finding the relationship between two variables where the designation of dependent and independent variables is irrelevant.
Pre-defined regression, including:
For more information and examples, click here.
Cubic Spline Interpolation:
User-Defined Non-Linear Regression
For more information regarding regression methods in Aabel, click here.
Logistic Regression
Logistic regression allows you to predict a discrete outcome from a set of independent variables that may be continuous, discrete, or binary. The dependent variable is binary/dichotomous/binomial. Logistic regression in Aabel includes probability and logit (probability). Aabel allows generating:
Logistic regression statistics (for probability plots with multiple dimension projections) include the logistic regression full model summary report and the logistic regression parameters summary report.
For more information and examples, click here.
Multiple Regression
The multiple regression model is a general linear model with two or more independent (predictor) variables. In Aabel, the analysis output includes:
For more information, click here.
Partial Least Squares Regression (PLS)
The PLS method implemented in Aabel (i) uses principal component analysis (PCA) to derive the prediction functions from factors calculated from cross-product matrices involving both Y and X variables, and (ii) allows predicting one or more dependent variables from a set of independent (predictor) variables.
For more information, click here.
Principal Component Analysis (PCA)
PCA is a data reduction method whose primary objective is to extract a few uncorrelated variables (the principal components) that capture most of the variability in a data set, while keeping these new reference axes orthogonal to one another. The 1st principal component captures the maximum variation in the data set, the 2nd principal component captures the next most variation, and so on.
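For the special case of two variables, the ordering of the principal components by captured variation can be sketched with the closed-form eigen-decomposition of the 2x2 sample covariance matrix. This is a minimal illustration under that two-variable assumption, not Aabel's implementation; the function name and data are hypothetical.

```python
import math
from statistics import mean

def pca_2d(xs, ys):
    """Return ((var1, var2), direction) for two variables.

    var1 >= var2 are the variances along the 1st and 2nd principal components
    (eigenvalues of the 2x2 sample covariance matrix); direction is the unit
    vector of the 1st principal component.
    """
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

    half_trace = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    variances = (half_trace + spread, half_trace - spread)

    # Orientation of the major axis of the covariance ellipse.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(angle), math.sin(angle))
    return variances, direction

# Hypothetical, strongly correlated measurements.
xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]

(var1, var2), direction = pca_2d(xs, ys)
explained = var1 / (var1 + var2)
print(f"1st component explains {explained:.1%} of the total variance")
```

The two component variances always add up to the total variance of the original variables, so the ratio printed above is the share captured by the 1st component. The 2nd component lies along (-direction[1], direction[0]), perpendicular to the first.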
For more information, click here.
Factor Analysis
The defining characteristic that distinguishes PCA from factor analysis is that in PCA we assume that all variability in an item should be used in the analysis, while in factor analysis we define a priori the number of factors that we want to extract, and the extracted axes are scaled to the variance along the new improved axes.
You can use Aabel graphing capabilities to display the loadings, the fraction of variance of the variables explained by the model (i.e., communality), and the fraction that is not (i.e., "unique").
For more information, click here.
Outlier Analysis
Outlier detection is important for two reasons:
Aabel provides the following outlier analysis methods:
Log-center transformation is necessary for proportion or percentage data, i.e., for data where the row sums of the selected variables are constant. The outlier analysis UI controls include the necessary pre-processing transformation methods (such as log-centering), which can be chosen if needed.
For more information, click here.
Polynomial Trend Surface Analysis (Map Analysis)
The trend surface analysis methods in Aabel can be used, e.g., to derive a continuous smooth surface from irregular data or to isolate regional trends from local variations. Aabel allows:
K-Means Cluster Analysis
This multivariate method is used for clustering/grouping data with similar characteristics.
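As an illustration of how k-means groups similar observations, the sketch below uses Lloyd's algorithm, the most common k-means procedure. The text does not say which variant Aabel uses, so this is an assumption. The function name, the Euclidean distance, the hand-picked starting centroids, and the data are likewise hypothetical.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its current members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Hypothetical 2-D observations forming two well-separated groups.
points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.8), (4.9, 5.0)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

In practice the result depends on the starting centroids, so implementations often repeat the procedure from several starting configurations.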
For graphical representation of k-means analysis, you can use a 2-D (binary or matrix) or 3-D plot of the scatter data points; members of each cluster, which have similar characteristics in multidimensional space, are displayed with identical marker properties. Aabel allows optional pre-processing of the data prior to the main k-means clustering analysis. The options include:
For more information, click here.
Hierarchical Cluster Analysis
The method used in Aabel is called the weighted pair-group with arithmetic averaging. When objects/observations are defined by a set of numeric variables, each object (worksheet row) is positioned in a multi-dimensional space whose dimension equals the number of variables (worksheet columns) used to define the object.
The dissimilarity/similarity measures in Aabel are based on one of the following options:
For an example, click here.
Statistical Quality Control Using Shewhart and Other Control Charts
Control charts are used to monitor a process for some quality characteristic, e.g., thickness, weight, or defective fraction.
Shewhart Control Charts for Variables: These charts are based on quality characteristics that can be measured and expressed numerically:
Individual Measurements and Moving Range: Control charts for individual measurements use the moving range of two successive observations to measure the process variability.
Shewhart Control Charts for Attributes: The QC charts for attributes are based on quality characteristics that are attributes and expressed categorically, for example "conforming" or "non-conforming", "defective" or "non-defective", etc.
Levey-Jennings chart: This chart plots the original process variable against time, date, run number, etc.
Special Cause Variation (Test for Special Causes):
For more information about QC charts and examples, click here.
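As an illustration of the individual measurements and moving range chart described in this section, the sketch below computes the center lines and control limits from the moving range of two successive observations. It uses the textbook constants for a moving range of span 2 (2.66 = 3/d2 with d2 = 1.128, and D4 = 3.267). Whether Aabel uses exactly these constants is an assumption, and the function name and data are hypothetical.

```python
from statistics import mean

def individuals_chart_limits(values):
    """Center lines and control limits for an individuals (X) and moving range (MR) chart."""
    # Moving range: absolute difference between each pair of successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "x_center": x_bar,
        "x_lcl": x_bar - 2.66 * mr_bar,  # 2.66 = 3 / d2, with d2 = 1.128 for span 2
        "x_ucl": x_bar + 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_lcl": 0.0,
        "mr_ucl": 3.267 * mr_bar,        # D4 = 3.267 for span 2
    }

# Hypothetical thickness measurements taken in run order.
thickness = [10.2, 9.9, 10.1, 10.4, 9.8, 10.0, 10.3, 9.7]

limits = individuals_chart_limits(thickness)
for name, value in limits.items():
    print(f"{name}: {value:.3f}")
```

Observations falling outside x_lcl to x_ucl would be flagged as possible special-cause variation.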
Dot plots are an alternative to histograms of continuous data; in a dot plot, each data point (individual observation) is plotted on a continuous scale using a symbol (on the X-axis). Aabel applies the method published by Leland Wilkinson (1999) to generate dot plots, but in addition, uses modifications necessary to allow:
For more information and examples, click here.
Parallel Dot Plot of Repeated Measures
This graph is designed to display dot plots of scores (response values) from k >= 2 repeated measures (dependent samples) on axes that are parallel to one another and equally spaced, with all axes having the same value range (see the right-hand side image).
For more information and an example, click here.
Box & Whisker and Box-Percentile Plots
The box & whisker graphs display rank statistics. They can have the form of a box or notched box that spans the distance between the two quartiles surrounding the median.
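The rank statistics behind such a box can be sketched with Python's standard library. Quartile conventions differ between packages, and the text does not state which one Aabel uses; the "inclusive" method below is simply one common choice, and the function name and data are hypothetical.

```python
from statistics import median, quantiles

def box_summary(values):
    """Rank statistics for a box plot: the box spans q1 to q3 and is split at the median."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}

# Hypothetical scores.
scores = [2, 4, 4, 5, 7, 8, 9, 11, 12]

summary = box_summary(scores)
print(summary)
```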
Bar and Line Plots of Mean, Median, Max., Min.
These plots are used for comparing the mean, median, maximum, or minimum values of multiple variables, or of subgroups/categories of variables.
Aabel provides the following graph types:
For more information regarding bar plots, click here; for more information regarding line plots, click here.
Interaction Plots
Interaction plots are stacks of mean lines, used to display the effect of one factor at each level of another factor. With no or insignificant interaction, the mean lines are approximately parallel. The more the lines deviate from being parallel, the more significant the interaction effect.
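The quantities plotted in an interaction plot are the cell means, and for a 2x2 design the departure from parallel lines reduces to a single difference of slopes. The sketch below illustrates this with hypothetical factor names and data. It measures only how far the lines are from parallel and does not test for significance.

```python
from statistics import mean

def cell_means(records):
    """Mean response for each (factor_a_level, factor_b_level) cell."""
    cells = {}
    for a, b, y in records:
        cells.setdefault((a, b), []).append(y)
    return {cell: mean(ys) for cell, ys in cells.items()}

def interaction_contrast(means, a_levels, b_levels):
    """Difference between the slopes of the two mean lines in a 2x2 design.

    A value of 0 means the two mean lines are exactly parallel.
    """
    (a1, a2), (b1, b2) = a_levels, b_levels
    slope_b1 = means[(a2, b1)] - means[(a1, b1)]
    slope_b2 = means[(a2, b2)] - means[(a1, b2)]
    return slope_b1 - slope_b2

# Hypothetical records: (dose level, treatment, response).
records = [
    ("low", "drug", 10), ("low", "drug", 12),
    ("high", "drug", 20), ("high", "drug", 22),
    ("low", "placebo", 9), ("low", "placebo", 11),
    ("high", "placebo", 11), ("high", "placebo", 13),
]

means = cell_means(records)
contrast = interaction_contrast(means, ("low", "high"), ("drug", "placebo"))
print(means)
print(contrast)
```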
To add error bars to interaction plots, you can choose one of the following options:
For more information and an example graph, click here.
3-Way Mean Plots
These plots compare the response values (scores, measurements) obtained from k >= 2 samples/groups, each of which represents data from pqs levels of experimental conditions. The plot options include:
For more information and example plots, click here.
Diamond Mean Comparison Plots
In these plots, the horizontal dashed line is the overall mean. The line through the center of each diamond is the group mean. The top and bottom diamond vertices are the respective upper and lower 95% confidence limits (CI) about the group mean. In groups with equal sample sizes, overlapping marks indicate that the two group means are not significantly different at the 95% confidence level.
Aabel provides the following diamond plot types:
For more information, click here.
Bland & Altman and Paired t-Test Difference Plots
The Bland & Altman method compares two methods of measurement or two paired variables, and provides a plot of difference vs. mean, in which the standard deviation of the differences between measurements made by the two methods provides an index of the comparability of the methods. In Aabel, the Bland & Altman method provides:
For example graphs, click here.
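The quantities behind a Bland & Altman difference vs. mean plot can be sketched as follows. The limits of agreement at the mean difference plus or minus 1.96 standard deviations are the conventional choice and are assumed here; the text does not list what Aabel reports. The function name and measurements are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Points and summary lines for a Bland & Altman difference vs. mean plot."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = mean(diffs)  # mean difference between the two methods
    sd = stdev(diffs)   # sample standard deviation of the differences
    return {
        "points": list(zip(means, diffs)),  # (x, y) pairs to plot
        "bias": bias,
        "sd": sd,
        "lower_loa": bias - 1.96 * sd,      # conventional 95% limits of agreement
        "upper_loa": bias + 1.96 * sd,
    }

# Hypothetical paired measurements of the same subjects by two methods.
method_a = [10.0, 12.0, 14.0, 16.0]
method_b = [9.0, 12.0, 13.0, 17.0]

result = bland_altman(method_a, method_b)
print(result)
```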












