Tests of numerical precision and accuracy of statistical algorithms of the main computational engines of STATISTICA (by StatSoft, Inc.)
The following selection of 52 datasets and analysis designs included in these validation benchmarks represent:
To the best of our knowledge, STATISTICA is the only statistics package available on the market which has successfully passed every test included in this set of benchmarks (and some tests reported here cannot be passed by any program other than STATISTICA).
* We are grateful to Dr. Lynn Brecht (UCLA), Dr. John Castellan (Indiana University), Dr. Elazar Pedhazur (New York University), Dr. Dallas Johnson (Kansas State University), Dr. Geoffrey Keppel (University of California, Berkeley), Dr. Michael Kutner (Emory University), Dr. George Milliken (Kansas State University), Dr. Paul Switzer (Stanford University), Dr. William Wasserman (Syracuse University), Dr. Thomas Wickens (UCLA), and Dr. Arthur Woodward (UCLA) for their advice, and for recommending to us some of the datasets used in these validation benchmarks, and to Drs. A. Woodward and L. Brecht for allowing us to use datasets from the technical documentation for Ganova. We are also grateful to all those researchers and practitioners who generously provided us with their raw datasets and allowed us to use them in the validation benchmarks. We would appreciate readers' suggestions concerning any additional benchmarks which could be included in this set.
Example 1: The "small relative variance test" of numerical precision
In the following sample dataset, variable var2 (the second column), which features a small relative variance, is a linear function of var3 (the third column). The correlation coefficient between any variable (e.g., var1) and var2 should therefore be identical to the correlation between that variable and var3.
| var1 | var2 | var3 |
|---|---|---|
| 1.0 | 100000.00000001 | 1.0 |
| 2.0 | 100000.00000002 | 2.0 |
| 3.0 | 100000.00000001 | 1.0 |
| 4.0 | 100000.00000002 | 2.0 |
| 5.0 | 100000.00000001 | 1.0 |
| 6.0 | 100000.00000002 | 2.0 |
| 7.0 | 100000.00000005 | 5.0 |
Here are the two correlation coefficients (var1*var2 and var1*var3) calculated by STATISTICA (using its extended precision optimization algorithm), and displayed with the highest precision available.
| variables | Pearson r | p-level |
|---|---|---|
| var1 * var2 | 0.65465367070798 | 0.111 |
| var1 * var3 | 0.65465367070798 | 0.111 |
To our knowledge, STATISTICA is the only program available on the market that will correctly compute these correlations (or correlations from other datasets featuring very small relative variances).
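The expected value of both correlations can be checked independently with exact rational arithmetic. The following is a minimal sketch, not the extended precision algorithm used by STATISTICA (which is not described here); the function names are illustrative. The values are read as decimal strings so that no digits of var2 are lost before the arithmetic starts, and the only floating-point step is the final square root.

```python
from fractions import Fraction
import math

# Values copied from the sample dataset above (kept as strings on purpose).
var1 = ["1.0", "2.0", "3.0", "4.0", "5.0", "6.0", "7.0"]
var2 = ["100000.00000001", "100000.00000002", "100000.00000001",
        "100000.00000002", "100000.00000001", "100000.00000002",
        "100000.00000005"]
var3 = ["1.0", "2.0", "1.0", "2.0", "1.0", "2.0", "5.0"]


def r_squared_exact(xs, ys):
    """Return (r**2, sign of r), computed exactly with rational numbers."""
    x = [Fraction(v) for v in xs]
    y = [Fraction(v) for v in ys]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sign = 1 if sxy >= 0 else -1
    return sxy * sxy / (sxx * syy), sign


def pearson_r(xs, ys):
    """Pearson r; the square root is the only floating-point operation."""
    r2, sign = r_squared_exact(xs, ys)
    return sign * math.sqrt(r2)


r12 = pearson_r(var1, var2)
r13 = pearson_r(var1, var3)
print(r12, r13)
```

Both squared correlations reduce to the same fraction, 3/7, so the two printed values are identical and agree with the table above to the digits shown there.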
Example 2: A medium size multi-factor unbalanced ANOVA design
The following design is a 5 x 5 x 5 x 3 (between-group) x 3 x 3 x 3 (repeated measures) design (with unequal N). Thus there are 375 groups and 27 dependent variables (data file ANOVA4 is available from StatSoft). The between-group design matrix for the highest order interaction has 128 degrees of freedom. Shown below are the univariate and multivariate results for the highest order (7-way) interaction.
general manova. INTERACTION: 1 x 2 x 3 x 4 x 5 x 6 x 7 (1-IV1, 2-IV2, 3-IV3, 4-IV4, 5-RFACT1, 6-RFACT2, 7-RFACT3)
| Univar. Test | Sum of Squares | df | Mean Square | F | p-level |
|---|---|---|---|---|---|
| Effect | 8664.99 | 1024 | 8.461903 | 1.02411 | .31744 |
| Error | 24854.14 | 3008 | 8.262680 | | |

| Test | Value | p-level |
|---|---|---|
| Wilks' Lambda | .088651 | |
| Rao R (1024,2966) | 1.027036 | .29812 |
| Pillai-Bartlett Trace | 2.071145 | |
| V (1024,3008) | 1.026166 | .30355 |
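The group count and degrees of freedom quoted for this design follow from simple products over the factor levels. A small sketch of that arithmetic (the helper name `interaction_df` is ours, not part of STATISTICA):

```python
from math import prod


def interaction_df(levels):
    """Degrees of freedom of an interaction: product of (levels - 1)."""
    return prod(k - 1 for k in levels)


between = [5, 5, 5, 3]   # between-group factors
within = [3, 3, 3]       # repeated measures factors

print(prod(between))                     # number of groups: 375
print(prod(within))                      # number of dependent variables: 27
print(interaction_df(between))           # between-group part of the interaction: 128
print(interaction_df(between + within))  # df of the 7-way interaction: 1024
```

The last value matches the effect df reported in the univariate table.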
Example 3: A medium size multi-factor unbalanced ANOVA design (very large and very small values)
Example 3.1. In the first part of this test, the dataset used in the previous example (Example 2, original range of values 0.1 to 10.0) was transformed by multiplying each dependent variable in the original dataset by 100,000; then, the analysis of variance reported in the previous example was performed on the transformed data. Shown below are the univariate and multivariate results for the highest order (7-way) interaction (cf. Example 2).
| Univar. Test | Sum of Squares | df | Mean Square | F | p-level |
|---|---|---|---|---|---|
| Effect | 8664.99 | 1024 | 8.461903 | 1.02411 | .31744 |
| Error | 24854.14 | 3008 | 8.262680 | | |

| Test | Value | p-level |
|---|---|---|
| Wilks' Lambda | .088651 | |
| Rao R (1024,2966) | 1.027036 | .29812 |
| Pillai-Bartlett Trace | 2.071145 | |
| V (1024,3008) | 1.026166 | .30355 |
Example 3.2. In the second part of this test, the dataset used in Example 2 (original range of values 0.1 to 10.0) was transformed by dividing each dependent variable in the original set by 100,000; the analysis of variance reported in Example 2 was then performed on the transformed data. Shown below are the univariate and multivariate results for the highest order (7-way) interaction (cf. the first part of this example and Example 2).
| Univar. Test | Sum of Squares | df | Mean Square | F | p-level |
|---|---|---|---|---|---|
| Effect | 8664.99 | 1024 | 8.461903 | 1.02411 | .31744 |
| Error | 24854.14 | 3008 | 8.262680 | | |

| Test | Value | p-level |
|---|---|---|
| Wilks' Lambda | .088651 | |
| Rao R (1024,2966) | 1.027036 | .29812 |
| Pillai-Bartlett Trace | 2.071145 | |
| V (1024,3008) | 1.026166 | .30355 |
Example 4: A large multi-factor unbalanced ANOVA design
The following design is a 20 x 10 x 2 x 2 (between-group) x 3 (repeated measures) design with unequal N. Thus, there are 800 groups and 3 dependent variables (data file ANOVA44 is available from StatSoft). The between-group design matrix for the highest order interaction has 171 degrees of freedom. Shown below are the univariate and multivariate results for the highest order (5-way) interaction.
general manova. INTERACTION: 1 x 2 x 3 x 4 x 5 (1-COUNTRY, 2-RAINFALL, 3-REGION, 4-STATUS, 5-RFACTOR)
| Univar. Test | Sum of Squares | df | Mean Square | F | p-level |
|---|---|---|---|---|---|
| Effect | 17.9462 | 342 | .052474 | .92406 | .82876 |
| Error | 181.8289 | 3202 | .056786 | | |

| Test | Value | p-level |
|---|---|---|
| Wilks' Lambda | .826507 | |
| Rao R (342,inf) | .935296 | .78876 |
| Pillai-Bartlett Trace | .181690 | |
| V (342,3202) | .935531 | .78788 |
To our knowledge, STATISTICA is the only program available on the market that can process ANOVA designs of this size.
Example 5: Precision of ANOVA routines (small within-cell variances relative to the between-group variance)
Here is a test of the precision of computations in ANOVA: A data file was created with 10 cases and 5 groups (2 cases per group), and 12 dependent variables. The groups in the grouping variable IV were coded 1 through 5. The dependent variables DVi (i =1 to 12) were then computed as DVi = IV + casenumber/10**i (each successive dependent variable was computed as a constant plus the case number divided by 10 to the power of i). This results in small within-cell variances relative to the between-group variance.
general manova. MAIN EFFECT: IV (1-IV)
| depend. variable | Mean Sqr Effect | Mean Sqr Error | F (df = 4, 5) |
|---|---|---|---|
| DV1 | 5.202000 | .00005 | 104040. |
| DV2 | 5.020020 | .0000005 | 1004E4 |
| DV3 | 5.002000 | .5E-8 | 10004E5 |
| DV4 | 5.000200 | .5E-10 | 100004E6 |
| DV5 | 5.000020 | .5E-12 | 1E13 |
| DV6 | 5.000002 | .5E-14 | 1E15 |
| DV7 | 5.000000 | .5E-16 | 1E17 |
| DV8 | 5.000000 | .5E-18 | 1E19 |
| DV9 | 5.000000 | .5E-20 | 1E21 |
| DV10 | 5.000000 | .500E-22 | 99996E18 |
| DV11 | 5.000000 | .500E-24 | 99996E20 |
| DV12 | 5.000000 | .502E-26 | 99584E22 |
To our knowledge, STATISTICA is the only program available on the market that will correctly compute the within ms error component for all dependent variables in this design.
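The reference values for this kind of test can be derived with exact rational arithmetic, which avoids floating-point rounding altogether. The sketch below is an independent check, not STATISTICA's algorithm. It assumes a layout the text does not spell out: cases 1 to 10 are assigned to the five groups in consecutive pairs. The divisor exponent is left as a parameter; under this assumed layout, an exponent of 2 reproduces the first row of the table (5.202, .00005, 104040).

```python
from fractions import Fraction


def one_way_ms(groups):
    """Return (MS effect, MS error) for a one-way design, computed exactly."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_effect = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_effect = len(groups) - 1
    df_error = len(all_values) - len(groups)
    return ss_effect / df_effect, ss_error / df_error


def make_groups(exponent):
    """DV = IV + casenumber / 10**exponent, two consecutive cases per group.

    The consecutive-pair assignment of cases to groups is an assumption.
    """
    groups = []
    for iv in range(1, 6):
        cases = (2 * iv - 1, 2 * iv)
        groups.append([Fraction(iv) + Fraction(c, 10 ** exponent) for c in cases])
    return groups


for exponent in (2, 7, 13):
    ms_effect, ms_error = one_way_ms(make_groups(exponent))
    print(exponent, float(ms_effect), float(ms_error), float(ms_effect / ms_error))
```

Because every intermediate quantity is a fraction, the error mean square stays exact however small it becomes, which makes it a convenient yardstick for judging a floating-point implementation.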
Examples 6 and 7: Logistic regression, maximum likelihood
Example 6. Cox (1970, p. 86) reports data describing the failure (variable Failure) of objects as a function of time (Time). Cox fitted the data by the logistic model. Shown below are the maximum likelihood estimates and their standard errors produced via STATISTICA: Nonlinear Estimation (see also Brown et al., 1983, p. 317).
nonlin. estimat. Parameter Estimates (Std. Errs were computed after scaling ms-err. to 1.)
| Param. | Const. | TIME |
|---|---|---|
| Estimate | -5.415 | .0807 |
| Std. Err. | .728 | .0224 |
| t(5) | -7.438 | 3.6099 |
| p-level | .00069 | .0154 |
Example 7. A dataset reported in Neter, Wasserman, and Kutner (1985, p. 365) describes the results of a study of coupon redemption. The coupons differed in their value, that is, with regard to the price reduction offered. The dependent variable of interest is how many coupons of each type were redeemed. Shown below are the maximum likelihood parameter estimates for the logistic regression model computed by STATISTICA: Nonlinear Estimation (weighted least squares estimates are reported in Neter et al., p. 365).
nonlin. estimat. Parameter Estimates (Std. Errs were computed after scaling ms-err. to 1.)
| Param. | Const. | REDUCTN |
|---|---|---|
| Estimate | -2.185 | .1087 |
| Std. Err. | .165 | .0089 |
| t(8) | -13.267 | 12.2894 |
| p-level | .000 | .0000 |
Example 8: Exponential regression, ordinary least squares
This example is based on a dataset reported in Neter, Wasserman, and Kutner (1985, p. 469). The data contain information on the number of days that each of 15 severely injured patients were hospitalized (variable Days) and an index of the prognosis for long-term recovery for each patient (variable Prognos). Shown below are the parameter estimates produced by STATISTICA: Nonlinear Estimation for the exponential regression model: Prognos=g0 * exp(g1*Days) [g0 and g1 are parameters], the loss function is least squares (see also Neter et al., p. 478, Table 14.3).
nonlin. estimat. Parameter Estimates
| Param. | g0 | g1 |
|---|---|---|
| Estimate | 58.60662 | -.0396 |
| Std. Err. | 1.54984 | .0019 |
| t(13) | 37.81474 | -20.8667 |
| p-level | .00000 | .0000 |
Example 9: User-defined (exponential) regression, ordinary least squares
The dataset for this example is again based on Neter, Wasserman, & Kutner (1985, p. 484). To study the efficiency of two new manufacturing plants, a ratio was computed of the per-unit-production cost expected in a modern facility after learning has occurred, over the actual per-unit-production cost for selected weeks over a 90-week span. Neter et al. fit the following model to these data: y = b0 + b1 * xg + b3 * exp(b2*x), where xg is an indicator variable to denote the two plants, x denotes the number of weeks, y is the efficiency index, and b0, b1, b2, and b3 are parameters. This formula can be typed "as is" into the user-defined model specification editor. Shown below are the results computed by STATISTICA: Nonlinear Estimation (using the Rosenbrock pattern search method to find start values, followed by quasi-Newton iterations; Neter et al. report the results on pp. 484-485).
nonlin. estimat. Parameter Estimates; y = b0 + b1*xg + b3*exp(b2*x)
| Param. | B0 | B1 | B3 | B2 |
|---|---|---|---|---|
| Estimate | 1.0156 | -.0473 | -.5524 | -.1348 |
| Std. Err. | .0037 | .0041 | .0083 | .0046 |
| t(26) | 274.5491 | -11.5026 | -66.6689 | -29.5186 |
| p-level | 0.0000 | .0000 | .0000 | .0000 |
Example 10: Discontinuity (breakpoint) in regression function
This example is also based on a dataset reported in Neter, Wasserman, & Kutner (1985, p. 348). Specifically, the dataset pertains to a production process in which the per-unit cost is related to the lot size. Supposedly, for lots greater than 500, the relationship between the variables changes; Neter et al. (1985) fit a linear model that allowed for different slopes for lots of sizes less than or equal to 500, and lots greater than 500. Specifically, Neter et al. fit the following model: y = b0 + b1*x + b2*(x-500)*(x>500) (b0, b1, and b2 are parameters). In this model, the logical expression (x>500) serves as a multiplier: If the expression is true, it will evaluate to 1, if it is false, it will evaluate to 0. Therefore, this equation actually represents two models: y = b0 + b1*x for x<=500, and y = b0 + b1*x + b2*(x-500) for x>500. The model can again be typed in to the user-model specification editor "as is"; shown below are the parameter estimates computed by STATISTICA: Nonlinear estimation (see Neter et al., p. 348).
Parameter Estimates; y = b0 + b1*x + b2*(x-500)*(x>500)
| Param. | B0 | B1 | B2 |
|---|---|---|---|
| Estimate | 5.895447 | -.00395 | -.00389 |
| Std. Err. | .604213 | .00149 | .00231 |
| t(5) | 9.757232 | -2.64990 | -1.68515 |
| p-level | .000192 | .04543 | .15277 |
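The way the logical expression (x>500) switches the second slope on and off can be made explicit in a few lines. This is an illustrative sketch of the model function only, not of the estimation procedure; the function and argument names are ours, and the coefficients are the rounded estimates from the table above.

```python
def piecewise_linear(x, b0, b1, b2, knot=500):
    """y = b0 + b1*x + b2*(x - knot)*(x > knot).

    The comparison evaluates to 1 when x exceeds the knot and to 0
    otherwise, so the b2 term only contributes above the knot.
    """
    indicator = 1 if x > knot else 0
    return b0 + b1 * x + b2 * (x - knot) * indicator


# Rounded parameter estimates as reported in the table.
B0, B1, B2 = 5.895447, -0.00395, -0.00389

for lot_size in (400, 500, 600):
    print(lot_size, round(piecewise_linear(lot_size, B0, B1, B2), 6))
```

At the breakpoint itself the extra term is zero, so the two line segments join without a jump; only the slope changes, from b1 below 500 to b1 + b2 above it.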
Example 11: Weighted Least Squares
Weighted least squares or any other (user-specified) loss function can be specified in STATISTICA: Nonlinear Estimation. An example of weighted least squares is presented in Neter et al. (1985, p. 169). The example dataset contains information concerning the cost for preparing a bid and the size of the bid. Neter et al. fit a linear regression model (Bid cost = b0 + b1 * Bid size), using the residuals weighted by the inverse of the squared Bid size values in the loss function [Loss = ((Predicted-Observed) **2)*(1/Bid size**2)]. Here are the results computed by STATISTICA: Nonlinear Estimation (see Neter et al., p. 169-170).
Parameter Estimates; y = b0 + b1*x
| Param. | B0 | B1 |
|---|---|---|
| Estimate | 5.656852 | 4.19055 |
| Std. Err. | .965238 | .40366 |
| t(10) | 5.860577 | 10.38127 |
| p-level | .000159 | .00000 |
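For a straight-line model, the weighted loss described above has a closed-form minimizer, which gives a convenient cross-check for an iterative routine. The sketch below uses that closed form rather than the iterative search of a nonlinear estimation module. The data values are hypothetical placeholders, not the Neter et al. bid data, and the function names are ours.

```python
def weighted_line_fit(x, y, weights):
    """Minimize sum(w * (b0 + b1*x - y)**2) over b0 and b1 (closed form)."""
    sw = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / sw
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def weighted_loss(x, y, weights, b0, b1):
    """Loss = ((Predicted - Observed)**2) * weight, summed over cases."""
    return sum(w * ((b0 + b1 * xi) - yi) ** 2 for w, xi, yi in zip(weights, x, y))


# Hypothetical illustration data (NOT the dataset from Neter et al.).
bid_size = [1.0, 2.0, 4.0, 5.0, 8.0]
bid_cost = [9.5, 14.5, 21.0, 28.0, 37.0]
weights = [1 / s ** 2 for s in bid_size]  # inverse of squared bid size

b0, b1 = weighted_line_fit(bid_size, bid_cost, weights)
print(b0, b1, weighted_loss(bid_size, bid_cost, weights, b0, b1))
```

With these weights, cases with small bid sizes dominate the fit, which is the intended effect when the error variance grows with the square of the bid size.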
Example 12: Robustness against collinearity problems (a linear model test of accuracy of nonlinear estimation)
The so-called Longley data (Longley, 1967) is a well-known dataset for testing linear-least-squares regression programs for their ability to handle regression problems with redundant predictor variables (this dataset is also referenced below for STATISTICA: Multiple Regression, Example 27). In this example, it will be used to test the accuracy of the general nonlinear estimation module of STATISTICA. In the user-model specification editor of STATISTICA: Nonlinear Estimation, we can specify the linear regression model, and request least squares parameter estimates. The parameter estimates computed by STATISTICA: Nonlinear Estimation (via quasi-Newton iterations) and their (asymptotic) standard errors (computed via finite difference approximation) are shown below (for comparison, see also Elliott, Reisch, & Campbell, 1989, p. 296). Note that STATISTICA: Multiple Regression will reproduce the parameter estimates with all 12 digits of precision.
multiple regress. Parameter Estimates
| Param. | A | B1 | B2 | B3 |
|---|---|---|---|---|
| Estimate | -34822E2 | 15.06195 | -.03582 | -2.02023 |
| Std. Err. | 890420. | 84.91493 | .03349 | .48840 |

| Param. | B4 | B5 | B6 |
|---|---|---|---|
| Estimate | -1.03323 | -.051103 | 1829.155 |
| Std. Err. | .21427 | .226073 | 455.479 |
Example 13: Unbalanced ANOVA designs (Type I and III Sums of Squares)
Milliken and Johnson (1984, p. 129) discuss in some detail the analysis of a 2 x 3 unbalanced (due to unequal N) between-group design. Shown below are the summary ANOVA tables for that design; both the results for Type I Sequential Sums of Squares (see Milliken & Johnson, p. 142) and Type III Sums of Squares (see Milliken & Johnson, p. 132) are shown below.
general manova. Summary of all Effects (Type I SS); design: 1-T, 2-B
| Effect | df Effect | ms Effect | df Error | ms Error | F | p |
|---|---|---|---|---|---|---|
| *1 | 1 | 76.563 | 10 | 2.0000 | 38.281 | .0001 |
| *2 | 2 | 45.372 | 10 | 2.0000 | 22.686 | .0002 |
| *12 | 2 | 35.815 | 10 | 2.000 | 17.908 | .0005 |

general manova. Summary of all Effects (Type III SS); design: 1-T, 2-B
| Effect | df Effect | ms Effect | df Error | ms Error | F | p |
|---|---|---|---|---|---|---|
| *1 | 1 | 61.714 | 10 | 2.0000 | 30.857 | .0002 |
| *2 | 2 | 38.585 | 10 | 2.0000 | 19.292 | .0004 |
| *12 | 2 | 35.815 | 10 | 2.000 | 17.908 | .0005 |
Example 14: A 2-way nested design
Lindman (1974, p. 167) discusses a two-way nested design where factor A has three levels, and factor B has six levels, with two levels each nested in each level of factor A. Here is the results summary computed by STATISTICA: ANOVA/MANOVA (see Lindman, p. 172).
general manova. Summary of all Effects; design: 1-A, 2-B
| Effect | df Effect | ms Effect | df Error | ms Error | F | p |
|---|---|---|---|---|---|---|
| *1 | 2 | 114.67 | 12 | 4.8889 | 23.455 | .0001 |
| *2 | 3 | 46.83 | 12 | 4.8889 | 9.580 | .0017 |
Example 15: A 3-way nested design with customized error term
Milliken & Johnson (1984, p. 418) present an example of a 3-way nested design. In this experiment, male and female subjects were randomly assigned to one of 9 environmental chambers; the 9 environmental chambers, in turn, were assigned to 3 levels of a temperature factor. Thus, in this design Chamber is nested in Temperature, and subjects are nested in chambers. To produce the table of sums of squares as presented in Milliken & Johnson (1984, p. 419), the Gender by Chambers interaction was pooled into the error term before computing the table of all effects.
general manova. Summary of all Effects; design: 1-TEMPERAT, 2-GENDER, 3-CHAMBER; Customized Error Term
| Effect | df Effect | ms Effect |
|---|---|---|
| 1 | 2 | 79.194 |
| 2 | 1 | 3.361 |
| 3 | 6 | 11.083 |
| 12 | 2 | 7.861 |
| Error | 24 | 1.653 |
Example 16: A nested design with a random effect
STATISTICA: ANOVA/MANOVA will automatically handle random effects. Lindman (1974, p. 173) shows an example of a nested design, where the nested factor is random. Factor A has four levels, factor B has 3 levels, and factor C (subjects) is a random effect with 9 levels. Shown below is the summary table for this design (see Lindman, 1974, p. 178).
general manova. Summary of all Effects; design: 1-A, 2-B, 3-C
| Effect | df Effect | ms Effect | df Error | ms Error | F | p |
|---|---|---|---|---|---|---|
| *1 | 3 | 86.44 | 18 | 8.53 | 10.14 | .0004 |
| *2 | 2 | 365.08 | 6 | 44.31 | 8.24 | .0190 |
| 3 | 6 | 44.31 | | | | |
| 12 | 6 | 11.19 | 18 | 8.53 | 1.31 | .3016 |
| 13 | 18 | 8.53 | | | | |
Example 17: Weighted means analysis of a nested design with unequal N (and missing cells)
The next example was taken from Searle (1987, p. 62). The data presented there describe a two-way nested classification of student opinions concerning computers. There were two classes -- English and Geology (factor Course) -- with different numbers of sections (taught by different teachers): English had two sections, Geology had 3 sections. To test the main effect for Course, Searle constructs a weighted means comparison. Shown below is the result of that comparison as computed by STATISTICA: ANOVA/MANOVA (see also Searle, 1987, p. 71).
general manova. Planned Comparison (1-COURSE, 2-SECTION)
| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 24.0000 | 1 | 24.0000 | 6.462 | .0386 |
| Error | 26.0000 | 7 | 3.7143 | | |
Example 18: A split-plot design with customized error term
Milliken and Johnson (1984, p. 297) present an example of a split-plot design. The design pertains to the effectiveness of 4 different fertility regimes on two varieties of wheat. Each of the four fertilizer levels was randomly assigned to one whole plot within each of two blocks. Shown below are the results for the Fertility and the Variety factor with the appropriate error terms (see Milliken & Johnson, 1984, p. 299).
general manova. MAIN EFFECT: FERTILTY; ERROR: FERTILTY x BLOCK
| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 40.1900 | 3 | 13.3967 | 5.802 | .0914 |
| Error | 6.9275 | 3 | 2.3092 | | |

general manova. MAIN EFFECT: VARIETY; ERROR: VARIETY x BLOCK, FERTILTY x VARIETY x BLOCK
| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 2.25000 | 1 | 2.2500 | 1.068 | .3599 |
| Error | 8.43000 | 4 | 2.1075 | | |
Example 19: Strip-plot designs
Milliken and Johnson (1984, p. 320) discuss an experiment on the relationship between two irrigation methods and three levels of nitrogen on the yield of wheat. Again, the analysis requires the specification of custom error terms. All sums of squares are automatically computed by STATISTICA: ANOVA/MANOVA for the Table of all Effects. Note that there is a typographical error in the table presented in Milliken and Johnson (p. 320); specifically the sum of squares for factor Irrigation is 570.4 (and not 507.4).
manova. Summary of all Effects; design: 1-REPLICAT, 2-IRRIGAT, 3-NITROGEN
| Effect | df Effect | ms Effect |
|---|---|---|
| 1 | 3 | 41.154 |
| 2 | 1 | 570.375 |
| 3 | 2 | 169.542 |
| 12 | 3 | 10.931 |
| 13 | 6 | 2.819 |
| 23 | 2 | 47.375 |
| 123 | 6 | 1.431 |
Example 20: Split-plot designs with unequal numbers of subplots
Milliken and Johnson (1984, p. 385) discuss an example of such a design. Five patients suffering from depression were randomly assigned to one of two treatment conditions (Treatment: Placebo vs. Drug). They were then examined after one week and after five weeks; the dependent variable was the patients' depression score during those examinations. Two patients did not return for the second examination, creating an unequal number of subplots in the design. In STATISTICA: ANOVA/MANOVA the results were produced via analysis of covariance, with covariates coding the effect for subjects within treatment conditions. Here are the Type III sums of squares for the effects of interest (for a discussion of the choice of error terms see Milliken & Johnson, p. 394).
manova. Summary of all Effects; design: 1-TREATMNT, 2-WEEK
| Effect | df Effect | ms Effect |
|---|---|---|
| 1 | 1 | 15.5648 |
| 2 | 1 | 24.0833 |
| 12 | 1 | 4.0833 |
Example 21: Youden square designs
An example of a 4 x 4 Youden square with three factors A, B, and C is presented in Lindman (1974, p. 209). Factor A is "rotated" in its position with respect to factor B. Here is the table of all effects computed by STATISTICA: ANOVA/MANOVA, with the B x C interaction as the error term (see also the results table in Lindman, page 209).
general manova. Summary of all Effects; design: 1-B, 2-C; ERROR: B x C
| Effect | df Effect | ms Effect | df Error | ms Error | F | p |
|---|---|---|---|---|---|---|
| 1 | 3 | 47.000 | 3 | 7.000 | 6.7143 | .076 |
| 2 | 2 | 39.000 | 3 | 7.000 | 5.5714 | .098 |
Example 22: A 4 x 11 nested design with unequal numbers of levels (missing cells)
Milliken and Johnson (1984, p. 415) present an example dataset, comparing 11 insecticides produced by four different companies. One company makes three insecticides, another makes four, and the remainder make two each. The effect for Insecticide (nested within Company) was tested via planned comparisons. Shown below are the results computed by STATISTICA: ANOVA/MANOVA (note that these results are slightly different than those reported in Milliken and Johnson on page 422; the analysis reported there is not consistent, and a typographical error must have found its way into the presentation; e.g., compare the mean reported on page 417 for the last group with the data from page 415).
general manova. Planned Comparison (1-COMPANY, 2-PRODUCT)
| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 1500.58 | 7 | 214.369 | 3.743 | .0081 |
| Error | 1260.00 | 22 | 57.273 | | |
Example 23: A complex design with many missing cells (testing Type IV hypotheses)
Milliken and Johnson (1984, p. 202) discuss a complex example of a design with many missing cells. The design contains 3 factors: Group (2 levels; whether or not subject received food stamps), Age (classified into three groups), and Race (black, hispanic, white). For brevity, only the results for the main effect for Race, and for the Race by Group interaction are shown below (see also results reported for the so-called Type IV analysis in Milliken and Johnson, Table 17.2, p. 203).
manova. Planned Comparison (RACE); 1-AGE, 2-GROUP, 3-RACE
| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 11.68 | 2 | 5.8385 | .204 | .8155 |
| Error | 2627.47 | 92 | 28.5595 | | |

manova. Planned Comparison (RACE x GROUP); 1-AGE, 2-GROUP, 3-RACE
| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 113.70 | 2 | 56.8517 | 1.991 | .1424 |
| Error | 2627.47 | 92 | 28.5595 | | |
Example 24: A 2 (between) x 3 x 3 (repeated measures) design with missing cells
This example is based on a (fictitious) dataset reported in Winer (1962, p. 324). The design has two repeated measures factors, each with 3 levels. Shown below is the summary univariate ANOVA table as computed by STATISTICA: ANOVA/MANOVA (see also Winer, p. 328); the multivariate tests for the Noise x Time interaction are also shown.
manova. Summary of all Effects; design: 1-NOISE, 2-TIME, 3-DIALS
| Effect | df Effect | ms Effect | df Error | ms Error | F | p |
|---|---|---|---|---|---|---|
| 1 | 1 | 468.17 | 4 | 622.78 | .75 | .435 |
| *2 | 2 | 1861.17 | 8 | 29.36 | 63.39 | .000 |
| *3 | 2 | 1185.17 | 8 | 13.19 | 89.82 | .000 |
| *12 | 2 | 166.50 | 8 | 29.36 | 5.67 | .029 |
| 13 | 2 | 25.17 | 8 | 13.19 | 1.91 | .210 |
| 23 | 4 | 2.67 | 16 | 7.94 | .34 | .850 |
| 123 | 4 | 2.83 | 16 | 7.94 | .36 | .836 |

manova. INTERACTION: 1 x 2 (1-NOISE, 2-TIME, 3-DIALS)
| Test | Value | p-level |
|---|---|---|
| Wilks' Lambda | .15607 | |
| Rao R Form 2 (2,3) | 8.11102 | .06166 |
| Pillai-Bartlett Trace | .84393 | |
| V (2,3) | 8.11102 | .06166 |
Example 25: A multivariate repeated measures split-plot design
This example is based on data reported in the documentation for Ganova (Brecht & Woodward, 1985). The design is a multivariate repeated measures split-plot design with two between-group factors (2x3), two repeated measures factors (2x3), and two dependent variables. Shown below are the summary results computed by STATISTICA: ANOVA/MANOVA.
general manova. Summary of all Effects; design: 1-A, 2-B, 3-FACTOR3, 4-FACTOR4
| Effect | Wilks' Lambda | Rao's R | df 1 | df 2 | p |
|---|---|---|---|---|---|
| 1 | .75263 | .8217 | 2 | 5 | .4914 |
| 2 | .71467 | .4573 | 4 | 10 | .7656 |
| 3 | .62400 | 1.5064 | 2 | 5 | .3076 |
| 4 | .66764 | .3734 | 4 | 3 | .8175 |
| 12 | .28476 | 2.1849 | 4 | 10 | .1442 |
| 13 | .52986 | 2.2183 | 2 | 5 | .2044 |
| 23 | .40968 | 1.4059 | 4 | 10 | .3008 |
| 14 | .54561 | .6246 | 4 | 3 | .6777 |
| 24 | .20559 | .9041 | 8 | 6 | .5654 |
| 34 | .81136 | .1744 | 4 | 3 | .9376 |
| 123 | .23354 | 2.6732 | 4 | 10 | .0945 |
| 124 | .24201 | .7746 | 8 | 6 | .6410 |
| *134 | .06450 | 10.8778 | 4 | 3 | .0394 |
| 234 | .13046 | 1.3265 | 8 | 6 | .3756 |
| 1234 | .02388 | 4.1035 | 8 | 6 | .0512 |
Example 26: Multivariate analysis of covariance, multivariate tests of parallelism
In this example we will specify a multivariate analysis of variance design with multiple covariates and test the parallelism hypothesis. The example is based on a dataset reported by Finn (1974); the design has 4 groups, 2 dependent variables, and 3 covariates. Shown below are the results for the between-group factor (see Finn, 1974, p. C-54; see also Enslein, Ralston, & Wilf, 1977, p. 262), the summary for the covariates (see Finn, 1974, p. C49/50; Enslein et al., p. 258), and tests of the parallelism hypothesis (see Finn, 1974, p. C-45; Enslein et al., p. 255).
general manova. MAIN EFFECT: GROUP (1-GROUP)
| Test | Value | p-level |
|---|---|---|
| Wilks' Lambda | .69357 | |
| Rao R Form 1 (6,80) | 2.67672 | .02031 |
| Pillai-Bartlett Trace | .31790 | |
| V (6,82) | 2.58289 | .02422 |

general manova. MULTIVARIATE TESTS: Within Cells Regression, 3 Covariates
| Test | Value | p-level |
|---|---|---|
| Wilks' Lambda | .90843 | |
| Rao R Form 1 (6,80) | .65590 | .68527 |
| Pillai-Bartlett Trace | .09279 | |
| V (6,82) | .66493 | .67810 |

general manova. MULTIVARIATE TESTS OF PARALLELISM
| Test | Value | p-level |
|---|---|---|
| Wilks' Lambda | .74076 | |
| Rao R Form 1 (18,62) | .55759 | .91589 |
| Pillai-Bartlett Trace | .27393 | |
| V (18,64) | .56428 | .91194 |
Example 27: Longley dataset (linear regression)
The so-called Longley data (Longley, 1967) is a well-known dataset for testing multiple regression programs for their ability to handle regression problems with redundant predictor variables. Shown below are the (partial) results computed by STATISTICA: Multiple Regression (see Longley, 1967; Elliott, Reisch, & Campbell, 1989, p. 296).
Dependent Variable: TOTAL
Multiple R: .997736942
Multiple R-Square: .995479005
Adjusted R-Square: .992465008
Number of cases: 16
F ( 6, 9 ) = 330.2853 p < .000000
Standard Error of Estimate: 304.85407356
Intercept: -3482258.635 Std.Error: 890420.4
multiple regress. Parameters
| variable | B | St. Err. of B |
|---|---|---|
| DEFLATOR | 15.06187227143 | 84.914925774771 |
| GNP | -.03581917929 | .033491007772 |
| UNEMPLOY | -2.02022980382 | .488399681652 |
| ARMFORCE | -1.03322686717 | .214274163162 |
| POPULATN | -.05110410565 | .226073200069 |
| TIME | 1829.15146461400 | 455.478499142310 |
Note that there is a typographical error in the table presented in Elliott et al., 1989 (Table 4.3.1, p. 296); specifically, the B coefficient for TIME is 1829.151464614 (as reported in STATISTICA) and not 1829.15146416 (6 and 1 are reversed).
To our knowledge, STATISTICA is the only statistics program available on the market that will correctly compute and report regression coefficients for the Longley dataset with this level of precision (Excel will correctly report the first 8 significant digits, Lotus will correctly report all 12 digits).
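The summary statistics listed for this example are tied together by standard identities, so they can be checked against each other without refitting the model. A short sketch (function names are ours; the inputs are the rounded values printed above, so agreement is limited to their precision):

```python
def adjusted_r_square(r2, n, p):
    """Adjusted R-square for n cases and p predictors (plus an intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def overall_f(r2, n, p):
    """Overall F statistic with (p, n - p - 1) degrees of freedom."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


r2, n, p = 0.995479005, 16, 6  # Multiple R-Square, cases, predictors
print(round(adjusted_r_square(r2, n, p), 9))
print(round(overall_f(r2, n, p), 2))
```

The same two functions can be applied to the summary of any other multiple regression table in this document that reports R-square, the number of cases, and the number of predictors.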
Example 28: Polynomial regression
Elliott, Reisch, and Campbell (1989, p. 295) present a data file to test polynomial regression. Shown below are the (partial) results computed by STATISTICA: Multiple Regression for the sixth degree polynomial fit (see Elliott, Reisch, and Campbell, 1989, p. 297). Note that this test is even more "demanding" than the previous one and an extremely low setting of the minimum tolerance parameter is required to obtain the parameter estimates.
Dependent Variable: Y_HR
Multiple R: .996793635
Multiple R-Square: .993597550
Adjusted R-Square: .990396325
Number of cases: 19
F ( 6, 12) = 310.3804 p < .000000
Standard Error of Estimate: .308965061
Intercept: 157.88215543 Std.Error: 73.68338
multiple regress. Regression Weights
| variable | B | St. Err. of B |
|---|---|---|
| X_KG | -330.97580114610 | 192.284963071360 |
| P2 | 364.04271758509 | 201.286163909620 |
| P3 | -199.36108558038 | 108.400947552150 |
| P4 | 58.11303781881 | 31.758798390784 |
| P5 | -8.60698967739 | 4.813032615799 |
| P6 | .50963834084 | .295596359040 |
Example 29: Kaplan-Meier product limit estimates
Lee (1992, p. 25) discusses a dataset first presented by King et al. (1979). Shown below is part of the product-limit analysis for the low-fat group of rats as computed by STATISTICA: Survival Analysis (see also Lee, 1992, pp. 74-75).
survival. Kaplan-Meier (Product-limit) analysis. Note: Censored cases are marked with +
| Case Number | Time | Cumulatv Survival | Standard Error |
|---|---|---|---|
| 3 | 50.0000 | .966667 | .032773 |
| 12 | 56.0000 | .933333 | .045542 |
| 4 | 65.0000 | .900000 | .054772 |
| 13 | 66.0000 | .866667 | .062063 |
| 14 | 73.0000 | .833333 | .068041 |
| 9 | 77.0000 | .800000 | .073030 |
| 10 | 84.0000 | .766667 | .077220 |
| 5 | 86.0000 | .733333 | .080737 |
| 11 | 87.0000 | .700000 | .083666 |
| ... | ... | ... | ... |
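The listed survival values and standard errors are consistent with the product-limit estimator combined with Greenwood's variance formula. The sketch below is an independent illustration, not STATISTICA's code. It makes two assumptions that the excerpt does not state: the group starts with 30 animals at risk (inferred from the first survival value, 29/30), and the 21 cases not listed are represented by placeholders censored after day 87, which does not affect the rows shown.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival, standard_error)] at each distinct event time.

    times  : observed times
    events : 1 for an observed event, 0 for a censored case
    The standard error uses Greenwood's formula.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, _ in data if tt == t)
        if deaths > 0:
            survival *= (at_risk - deaths) / at_risk
            if at_risk > deaths:
                greenwood_sum += deaths / (at_risk * (at_risk - deaths))
            rows.append((t, survival, survival * math.sqrt(greenwood_sum)))
        at_risk -= removed
        i += removed
    return rows


# The nine event times listed in the table above.
event_times = [50, 56, 65, 66, 73, 77, 84, 86, 87]
# ASSUMPTION: 21 placeholder cases, censored at day 88, stand in for the
# part of the dataset that the table omits.
times = event_times + [88] * 21
events = [1] * len(event_times) + [0] * 21

for t, s, se in kaplan_meier(times, events):
    print(t, round(s, 6), round(se, 6))
```

Under these assumptions the printed rows reproduce the nine survival values and standard errors shown in the table.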
Example 30: Comparing multiple samples of censored survival times
Lee (1992, p. 127) presents a dataset of initial remission times for leukemia patients as a function of three treatments. Shown below is the summary of the comparison computed by STATISTICA: Survival Analysis (see also Lee, p. 127).
Variable: TIME
Variable with censoring indicator: CENSORED
Grouping variable: GROUP (3 Groups)
Total number of valid observations: 66
uncensored: 52 ( 78.79%)
censored: 14 ( 21.21%)
Chi-square = 3.61183 df = 2 p = .16434
Example 31: Proportional hazard regression for censored data
Crowley and Hu (1977) present an analysis of the well-known Stanford heart transplant data. Shown below are the (partial) results of the (Cox) proportional hazard regression analysis computed by STATISTICA: Survival Analysis (see also Brown, Engelman, Jennrich, 1990, p. 773).
Regression Results: Proportional hazard (Cox) regression
Total number of valid observations: 65
uncensored: 29 ( 44.62%)
censored: 36 ( 55.38%)
survival. Parameter Estimates; Log-Likelihood of final solution: -87.867
| Variable | Beta | Standard Error | t-level | exponent Beta |
|---|---|---|---|---|
| AGE | .10909 | .03329 | 3.27658 | 1.11526 |
| ANTIGEN | -.04878 | .47165 | -.10342 | .95239 |
| MISMATCH | 1.06372 | .39460 | 2.69570 | 2.89713 |
Example 32: Exponential regression model for censored data
Lawless (1982, p. 287) discusses an example censored dataset pertaining to lung cancer survival and fits to it an exponential regression model with six covariates (plus a constant). Shown below are the parameter estimates and their asymptotic standard errors computed by STATISTICA: Survival Analysis (see also Lawless, p. 288).
Parameter Estimates (survival)

| Variable | Beta | Std. Err | t-level |
|---|---|---|---|
| X1 | .05442 | .01082 | 5.0302 |
| X2 | .00887 | .01977 | .4484 |
| X3 | .00336 | .01166 | .2882 |
| X4 | .33865 | .44556 | .7601 |
| X5 | -.12069 | .48623 | -.2482 |
| X6 | -.86560 | .58663 | -1.4756 |
| X7 | -.28398 | .38902 | -.7300 |
| Constant | 4.74008 | .40562 | 11.6861 |
Example 33: Stepwise discriminant function analysis and canonical analysis
The "classic" Iris dataset (Fisher, 1936) is widely used to illustrate discriminant function analysis. Shown below is the summary of the stepwise discriminant function analysis for those data, and the summary of the canonical analysis with all variables in the model (see also Jennrich, 1977, pp. 92-94; Brown et al., 1990, pp. 341-342).
Number of variables in the model: 4
Wilks' Lambda: .023439
Approx. F (8,288) = 199.145 p <0.00000
Summary of Stepwise Analysis (discrim.)

| Variable Entered | No. of vars. in | Lambda | F-level | df 1 | df 2 |
|---|---|---|---|---|---|
| PETALLEN | 1 | .05863 | 1180.16 | 2 | 147 |
| SEPALWID | 2 | .03688 | 307.11 | 4 | 292 |
| PETALWID | 3 | .02498 | 257.50 | 6 | 290 |
| SEPALLEN | 4 | .02344 | 199.15 | 8 | 288 |
Standardized Coefficients for Canonical Variables (discrim.)

| Variable | Root 1 | Root 2 |
|---|---|---|
| PETALLEN | -.94726 | -.401038 |
| SEPALWID | .52124 | .735261 |
| PETALWID | -.57516 | .581040 |
| SEPALLEN | .42695 | .012408 |
| Eigenval. | 32.19193 | .285391 |
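The summary figures for the full four-variable model can be cross-checked against one another. Wilks' Lambda equals the product of 1/(1 + eigenvalue) over the canonical roots, and with three groups Rao's F approximation is exact. The sketch below assumes the usual Iris layout of 150 cases in 3 groups, which is consistent with the degrees of freedom reported above; it is an illustrative check, not the STATISTICA algorithm.

```python
import math

n_obs, n_vars, n_groups = 150, 4, 3      # assumed Iris layout: 150 cases, 3 species
wilks_lambda = 0.023439                  # as reported above
eigenvalues = [32.19193, 0.285391]       # canonical roots as reported above

# Wilks' Lambda as the product of 1 / (1 + eigenvalue)
lambda_from_roots = 1.0
for ev in eigenvalues:
    lambda_from_roots /= 1.0 + ev

# Exact F for three groups:
# F = (1 - sqrt(L)) / sqrt(L) * (N - p - 2) / p, with df = (2p, 2(N - p - 2))
root = math.sqrt(wilks_lambda)
df1 = 2 * n_vars
df2 = 2 * (n_obs - n_vars - 2)
f_stat = (1.0 - root) / root * (n_obs - n_vars - 2) / n_vars

print(f"Lambda from roots = {lambda_from_roots:.6f}")
print(f"F({df1},{df2}) = {f_stat:.3f}")
```

Because Lambda is entered with only six decimals, the recomputed F agrees with the reported 199.145 to about two decimals rather than exactly.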
Example 34: Log-linear model (a 5-way frequency table)
Bishop, Fienberg, & Holland (1975, p. 103) present a complex 5-way frequency table describing the three-year survival of cancer patients in different locations. Shown below are the tests of all models of full order (see also Brown et al., 1983, p. 180; note that delta=0.5 was added to each cell in the frequency table).
Results of Fitting all K-Factor Interactions (log-lin.)

| K-Factor | df | Max.Lik. Chi-squ. | p | Pearson Chi-sq. | p |
|---|---|---|---|---|---|
| 1 | 8 | 632.156 | .0000 | 881.251 | .0000 |
| 2 | 23 | 134.425 | .0000 | 141.228 | .0000 |
| 3 | 28 | 30.909 | .3212 | 31.233 | .3069 |
| 4 | 12 | 9.012 | .7019 | 8.928 | .7091 |
Example 35: Experimental Design: A 2**(7-4) fractional factorial design
Box, Hunter, and Hunter (1978, p. 391) present an example data set for a 2-level fractional factorial design; specifically the design is a 2**(7-4) fractional factorial. Shown below are the effect estimates as computed by STATISTICA: Experimental Design (see also Box, Hunter, & Hunter, p. 392).
2**(7-4) design of resolution R = III. TIME; m = 66.50000 s = 13.84609 (experim. design)

| Effect | Effect Estimate | Sums of Squares |
|---|---|---|
| 1:SEAT | 3.50000 | 24.50 |
| 2:DYNAMO | 12.00000 | 288.00 |
| 3:HANDLBRS | 1.00000 | 2.00 |
| 4:GEAR | 22.50000 | 1012.50 |
| 5:RAINCOAT | .50000 | .50 |
| 6:BREAKFST | 1.00000 | 2.00 |
| 7:TIRES | 2.50000 | 12.50 |
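For a two-level design with N runs, the sum of squares of an effect is N × (effect estimate)² / 4. A 2**(7-4) design has 8 runs and is saturated by its 7 effects, so the effect sums of squares add up to the total sum of squares about the mean. The sketch below re-derives the Sums of Squares column and the reported s from the effect estimates; the variable names are illustrative.

```python
import math

n_runs = 8  # a 2**(7-4) design has 2**3 = 8 runs
effects = {"SEAT": 3.5, "DYNAMO": 12.0, "HANDLBRS": 1.0, "GEAR": 22.5,
           "RAINCOAT": 0.5, "BREAKFST": 1.0, "TIRES": 2.5}

# Sum of squares of a two-level effect: N * effect**2 / 4
sums_of_squares = {name: n_runs * e ** 2 / 4 for name, e in effects.items()}

# Saturated design: the 7 effect sums of squares exhaust the 7 total df
total_ss = sum(sums_of_squares.values())
std_dev = math.sqrt(total_ss / (n_runs - 1))

for name, ss in sums_of_squares.items():
    print(f"{name:9s} {ss:8.2f}")
print(f"s = {std_dev:.5f}")
```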
Example 36: Experimental Design: A second-order central composite (response surface) design
Box, Hunter, and Hunter (1978, p. 519) present an example data set for a 2-factor second-order central composite (response surface) design with two blocks. Shown below are the parameter estimates computed by STATISTICA: Experimental Design (see Box et al., p. 520).
Parameter Estimates; Variable: YIELD. 2**(2-0) 2nd order central composite design; m=83.88333 s=4.39293 Intercept=87.3750 (experim. design)

| Effect | Paramet. | Std.Err. of Par. |
|---|---|---|
| C vs. S | -.85003 | .506913 |
| TIME | -1.38374 | .620843 |
| DEGREES | .36199 | .620843 |
| 1**2 | -2.14377 | .694129 |
| 2**2 | -3.09379 | .694129 |
| 1 by 2 | -4.87500 | .878000 |
Example 37: Experimental Design: A Taguchi robust design experiment (L18, S/N: Smaller-the-Better)
Phadke (1989, pp. 82-83) discusses in detail the analysis of a robust design experiment pertaining to the manufacture of silicon wafers. Shown below is the summary ANOVA table computed by STATISTICA: Experimental Design for the Surface Defect data (a smaller-the-better problem; see also Phadke, p. 88, Table 4.6). Note that, as described in Phadke (p. 88), factor Cleaning was pooled into the error term.
Analysis of Variance; m = -45.362 s = 24.4841; * = effect pooled into error (experim. design)

| Effect | SS | df | ms | F |
|---|---|---|---|---|
| {1}TEMPERAT | 4427.24 | 2 | 2213.62 | 27.26 |
| {2}PRESSURE | 3415.55 | 2 | 1707.77 | 21.03 |
| {3}NITROGEN | 1029.52 | 2 | 514.76 | 6.34 |
| {4}SILANE | 371.93 | 2 | 185.97 | 2.29 |
| {5}SETT_TIM | 378.28 | 2 | 189.14 | 2.33 |
| *CLEANING | 163.52 | 2 | | |
| Residual | 568.46 | 7 | 81.21 | |
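Two pieces of arithmetic sit behind this table. The response analyzed is a smaller-the-better signal-to-noise ratio, conventionally defined as -10·log10 of the mean squared response, and each F is the effect mean square divided by the residual mean square, where the residual already contains the pooled Cleaning effect. The sketch below illustrates both. The helper name `sn_smaller_the_better` and the demonstration values passed to it are hypothetical, and the ANOVA inputs are simply the figures printed above.

```python
import math

def sn_smaller_the_better(values):
    """Smaller-the-better S/N ratio in dB: -10 * log10(mean of y**2)."""
    return -10.0 * math.log10(sum(y * y for y in values) / len(values))

# (SS, df) per effect as printed above; CLEANING is pooled into the error term
effects = {
    "TEMPERAT": (4427.24, 2), "PRESSURE": (3415.55, 2), "NITROGEN": (1029.52, 2),
    "SILANE": (371.93, 2), "SETT_TIM": (378.28, 2),
}
residual_ss, residual_df = 568.46, 7   # already includes CLEANING (163.52 on 2 df)
ms_residual = residual_ss / residual_df

f_ratios = {name: (ss / df) / ms_residual for name, (ss, df) in effects.items()}
for name, f in f_ratios.items():
    print(f"{name:9s} F = {f:.2f}")

# Hypothetical responses, for illustration only
print(sn_smaller_the_better([10.0, 10.0]))
```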
Examples 38-52: Analysis of Benchmark datasets
A standard set of benchmark datasets for the most common analyses was originally proposed by Elliott, Reisch, & Campbell (1989) and has since then been used in published reviews of statistical packages. Shown below are the results for all proposed benchmark analyses (and extensions of some of those tests designed to make them more demanding) as computed by STATISTICA.
Example 38: Descriptive statistics with small relative variances
Here are the results computed for the example dataset proposed by Elliott et al. (p. 290). To demonstrate the precision of STATISTICA we have extended the test to extremely small relative variances (100000000001 to 100000000009).
Descriptive Statistics; N. of Cases = 9 (MD pairwise deleted) (basic stats)

| Variable | Mean | St. Err. | St. Dev. |
|---|---|---|---|
| V1 | 1005.0000 | .91287092917528 | 2.73861266370720 |
| V2 | 10005.0000 | .91287092917528 | 2.73861266370720 |
| V3 | 100005.0000 | .91287092917528 | 2.73861266370720 |
| V4 | 1000005.0000 | .91287092917528 | 2.73861266370720 |
| V5 | 10000005.0000 | .91287092917528 | 2.73861266370720 |
| V6 | 100000005.0000 | .91287092917528 | 2.73861266370720 |
| V7 | 1000000005.0000 | .91287092917528 | 2.73861266370720 |
| V8 | 10000000005.0000 | .91287092917528 | 2.73861266370720 |
| V9 | 100000000005.000 | .91287092917528 | 2.73861266370720 |
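The reason this test is demanding can be seen by comparing two ways of computing a variance in double precision. A two-pass computation works with deviations from the mean, which are small and exact here, whereas the one-pass "calculator" formula subtracts two numbers near 9·10²² and loses the result to cancellation. The sketch below uses the most extreme variable (100000000001 to 100000000009). It illustrates the numerical issue only and makes no claim about how STATISTICA computes these statistics.

```python
import math

def variance_two_pass(xs):
    """Sample variance from deviations about the mean (numerically stable)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def variance_naive(xs):
    """One-pass formula; prone to catastrophic cancellation."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

v9 = [100000000000.0 + k for k in range(1, 10)]   # 100000000001 ... 100000000009
stable = variance_two_pass(v9)                    # exactly 7.5 for these values
naive = variance_naive(v9)                        # not close to 7.5 in double precision
std_err = math.sqrt(stable / len(v9))

print(stable, naive, f"{std_err:.14f}")
```

The standard error obtained from the two-pass variance agrees with the St. Err. column to the digits shown.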
Example 39: Independent group t-test
Here are the results computed for the t-test benchmark dataset proposed by Elliott et al. (p. 290).
T-test; indep.var: FERTLZR [1 gr.= PRESENT] [2 gr.= NEWER]; N. of Cases = 18 (basic stats)

| Variable | t | 2-tailed p |
|---|---|---|
| Height | -2.988440 | .008686 |
Example 40: Paired t-test
Here are the results computed for the paired t-test benchmark dataset proposed by Elliott et al. (p. 290).
Single t-Tests (basic stats)

| Comparison | t | p | N | E(X-Y) | D(X-Y) |
|---|---|---|---|---|---|
| HINDLEG-FORELEG | 3.41379 | .00770 | 10 | 3.3000 | 3.0569 |
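The paired t statistic is the mean of the differences divided by its standard error. The sketch below reads E(X-Y) as the mean difference and D(X-Y) as the standard deviation of the differences. That reading is an assumption, though it is consistent with the reported t; the inputs are the rounded values printed above, so agreement is limited to about four significant digits.

```python
import math

n = 10
mean_diff = 3.3000     # E(X-Y) as printed above (assumed: mean difference)
sd_diff = 3.0569       # D(X-Y) as printed above (assumed: SD of differences)

# t = mean difference / (SD of differences / sqrt(n)), with n - 1 = 9 df
t_stat = mean_diff / (sd_diff / math.sqrt(n))
print(f"t({n - 1}) = {t_stat:.4f}")
```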
Example 41: One-way ANOVA (test 1)
Here are the results of the one-way ANOVA benchmark (Example 1) proposed by Elliott et al. (p. 291).
MAIN EFFECT: FEED; 1-FEED (manova)

| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 4226.348 | 3 | 1408.783 | 164.64 | .0000 |
| Error | 128.350 | 15 | 8.557 | | |
Example 42: One-way ANOVA (test 2)
Here are the results of the one-way ANOVA benchmark (Example 2) proposed by Elliott et al. (p. 291).
MAIN EFFECT: CONDITN; 1-CONDITN (manova)

| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 10.6622 | 4 | 2.66556 | 3.3476 | .0611 |
| Error | 7.1663 | 9 | .79626 | | |
Example 43: One-way repeated measures ANOVA
Here are the results for the one-way repeated measures ANOVA benchmark proposed by Elliott et al. (p. 292).
MAIN EFFECT: Drug; 1-Drug (manova)

| Univar. Test | Sum of Squares | df | Mean Square | F | p |
|---|---|---|---|---|---|
| Effect | 698.200 | 3 | 232.733 | 24.759 | .0000 |
| Error | 112.800 | 12 | 9.400 | | |
Example 44: Two-way ANOVA (balanced)
Here are the results for the two-way balanced ANOVA benchmark proposed by Elliott et al. (p. 292).
Summary of all Effects; design: 1-GENDER, 2-HORMONE (manova)

| Effect | df Effect | ms Effect | df Error | ms Error | F | p |
|---|---|---|---|---|---|---|
| 1 | 1 | 70.31 | 16 | 22.898 | 3.071 | .0989 |
| *2 | 1 | 1386.11 | 16 | 22.898 | 60.534 | .0000 |
| 12 | 1 | 4.90 | 16 | 22.898 | .214 | .6449 |
Example 45: Two-way ANOVA (unbalanced)
Here are the results for the unbalanced ANOVA benchmark data proposed by Elliott et al. (p. 293). We show here only the results for the Type III analysis (as "recommended" by Elliott et al., Table 3.7.2); note that Type I and II analyses can also be performed with STATISTICA: ANOVA/MANOVA.
Summary of all Effects (Type III SS); Design: 1-DRUG, 2-DISEASE (manova)

| Effect | df Effect | ms Effect | df Error | ms Error | F | p |
|---|---|---|---|---|---|---|
| *1 | 3 | 999.157 | 46 | 110.453 | 9.046 | .0001 |
| 2 | 2 | 207.937 | 46 | 110.453 | 1.883 | .1637 |
| 12 | 6 | 117.878 | 46 | 110.453 | 1.067 | .3958 |
Example 46: Simple linear regression
Here are the results (computed via STATISTICA: Multiple Regression) for the data proposed by Elliott et al. (p. 294; note that the result reported in Elliott as r-square is in fact the result for the simple Pearson correlation coefficient r).
R: .726305400
R-Square: .527519535
Intercept: 4.910512449 St.Er: 6.627462 t(11)=.7409 p<.47
| Variable | B | t(11) | p |
|---|---|---|---|
| HANDGUNS | .0376114423094 | 3.5044808186082 | .00493 |
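The distinction between r and r-square noted above is easy to verify from the reported figures, because in simple regression R-Square is the square of the Pearson r and the t statistic of the slope is r·sqrt(df / (1 - r²)). The sketch below uses only the values printed above, taking df = 11 from the t(11) column heading.

```python
import math

r = 0.726305400        # R as reported above
df = 11                # residual degrees of freedom, from t(11)

r_squared = r ** 2
t_slope = r * math.sqrt(df / (1.0 - r_squared))

print(f"R-Square = {r_squared:.9f}")
print(f"t({df}) = {t_slope:.7f}")
```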
Example 47: Multiple linear regression (Example 1)
Here are the results for the data proposed by Elliott et al. (p. 295; note that the result reported in Elliott as r-square is in fact the result for the multiple correlation coefficient r).
Multiple R: .922119692
Multiple R-Square: .850304726
Intercept: 2.085724401
Regression Weights (multiple regress.)

| Variable | B | St. Err. of B |
|---|---|---|
| X1 | .0569873379910 | 2.6131042380235 |
| X2 | 1.0500229564602 | .3262103147516 |
Example 48: Multiple linear regression (Example 2)
The next multiple regression benchmark proposed by Elliott et al., 1989 (p. 295, Example 2) is based on the well-known Longley dataset (with redundant predictor variables; Longley, 1967). The results of this test are reported in Example 27, above. As mentioned before (see our Example 27), there is a typographical error in the table presented in Elliott et al., 1989 (Table 4.3.1, p. 296). Specifically, the B coefficient for TIME is 1829.151464614 (as reported in STATISTICA) and not 1829.15146416 (6 and 1 are reversed).
To our knowledge, STATISTICA is the only statistics program available on the market that will correctly compute and report regression coefficients for the Longley dataset with this level of precision (Excel will correctly report the first 8 significant digits, Lotus will correctly report all 12 digits).
Example 49: Multiple linear regression (Example 3)
Here again are the (partial) results for the polynomial regression problem reported in Elliott et al. (1989, p. 297). Note that this test is even more "demanding" than the previous one and an extremely low setting of the minimum tolerance parameter is required to obtain the parameter estimates for the sixth order polynomial.
Dependent Variable: Y_HR
Multiple R: .996793635
Intercept: 157.88215543 Std.Error: 73.68338
Regression Weights (multiple regress.)

| Variable | B | St. Err. of B |
|---|---|---|
| X_KG | -330.97580114610 | 192.284963071360 |
| P2 | 364.04271758509 | 201.286163909620 |
| P3 | -199.36108558038 | 108.400947552150 |
| P4 | 58.11303781881 | 31.758798390784 |
| P5 | -8.60698967739 | 4.813032615799 |
| P6 | .50963834084 | .295596359040 |
Example 50: A 2 x 2 contingency table and Fisher exact test
Here are the results for the 2x2 contingency table presented in Elliott et al. (p. 295).
Chi-square (N = 29) = 4.89 p < .0271
Phi-Square = .168521
Fisher Exact Probability (one-tailed): .032884
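Two relationships underlie these figures. Phi-square is the Pearson chi-square divided by N, and the one-tailed Fisher exact probability is a hypergeometric tail sum with all margins held fixed. The sketch below checks the first against the reported values and illustrates the second with a generic helper. The cell counts of the Elliott et al. table are not reproduced here, so the counts passed to `fisher_one_tailed` (an illustrative name) are hypothetical.

```python
from math import comb

# Phi-square is the Pearson chi-square divided by N
n_cases = 29
phi_square = 0.168521
chi_square = phi_square * n_cases        # about 4.887, printed above as 4.89

def fisher_one_tailed(a, b, c, d):
    """P(first cell >= a) with all margins fixed, for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return tail / comb(n, row1)

# Hypothetical counts for illustration only (not the Elliott et al. table)
p_demo = fisher_one_tailed(3, 1, 1, 3)   # (16 + 1) / 70
print(f"chi-square = {chi_square:.3f}, demo p = {p_demo:.6f}")
```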
Example 51: An R x C contingency table (Example 1)
Here are the (partial) results for the 2x4 contingency table presented in Elliott et al. (p. 298).
| Statistic | Chi-square | df | p |
|---|---|---|---|
| Maximum Likelihood Chi-square | 9.51215 | 3 | .023216 |
| Pearson Chi-square | 8.98718 | 3 | .029477 |
Expected Freq.: GENDER by HAIR_COL (log-lin. analysis)

| GENDER | HAIR_COL BLACK | HAIR_COL BROWN | HAIR_COL BLONDE | HAIR_COL RED | TOTAL |
|---|---|---|---|---|---|
| MALE | 29.00000 | 36.0000 | 26.66667 | 8.33333 | 100.0000 |
| FEMALE | 58.00000 | 72.0000 | 53.33333 | 16.66667 | 200.0000 |
| Total | 87.00000 | 108.0000 | 80.00000 | 25.00000 | 300.0000 |
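Under the independence model, each expected frequency is the product of its row total and column total divided by the grand total. The sketch below re-derives the table above from its margins alone; the dictionary names are illustrative.

```python
row_totals = {"MALE": 100.0, "FEMALE": 200.0}
col_totals = {"BLACK": 87.0, "BROWN": 108.0, "BLONDE": 80.0, "RED": 25.0}
grand_total = sum(row_totals.values())

# Expected frequency under independence: row total * column total / grand total
expected = {
    (r, c): rt * ct / grand_total
    for r, rt in row_totals.items()
    for c, ct in col_totals.items()
}

for (r, c), value in expected.items():
    print(f"{r:6s} {c:6s} {value:9.5f}")
```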
Example 52: An R x C contingency table (Example 2)
Here are the (partial) results for the 2x4 contingency table presented in Elliott et al. (p. 298). Note that the expected frequency for group Negative/Days_0 is incorrectly reported in Elliott et al. as 14628 (and thus the expected frequencies in the first column do not add up to the marginal frequency); the correct expected frequency for this cell is 14628.5.
| Statistic | Chi-square | df | p |
|---|---|---|---|
| Maximum Likelihood Chi-square | 62.4336 | 3 | .000000 |
| Pearson Chi-square | 283.047 | 3 | .000000 |
Expected Freq.: DAYS by GROUP (log-lin. analysis)

| crossprd DAYS | GROUP DAYS_0 | GROUP DAYS 1_2 | GROUP DAYS 3_5 | GROUP DAYS 6_9 | Total |
|---|---|---|---|---|---|
| NEGATIVE | 14628.50 | 312.0932 | 147.5712 | 55.83777 | 15144.00 |
| POSITIVE | 42.50 | .9068 | .4288 | .16223 | 44.00 |
| Total | 14671.00 | 313.0000 | 148.0000 | 56.00000 | 15188.00 |
References

Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis. Cambridge, MA: MIT Press.
Box, G. E. P., Hunter, W. G., & Hunter, S. J. (1978). Statistics for experimenters: An introduction to design, data analysis, and model building. New York: Wiley.
Brecht, L., & Woodward, A. (1985). Ganova (Technical Documentation). Unpublished Manuscript.
Brown, M. B., Engelman, L., & Jennrich, R. I. (1990). BMDP, Statistical software manual. Los Angeles: University of California Press.
Brown, M. B., Engelman, L., Frane, J. W., Hill, M. A., Jennrich, R. I., & Toporek, J. D. (1983). BMDP, Statistical software manual. Los Angeles: University of California Press.
Cox, D. R. (1970). The analysis of binary data. New York: Halsted Press.
Crowley, J., & Hu, M. (1977). Covariance analysis of heart transplant survival data. Journal of the American Statistical Association, 72, 27-36.
Elliott, A. C., Reisch, J. S., & Campbell, N. P. (1989). Benchmark datasets for evaluating microcomputer statistical programs. Collegiate Microcomputer, 11, 289-299.
Enslein, K., Ralston, A., & Wilf, H. S. (1977). Statistical methods for digital computers. New York: Wiley.
Fienberg, S. E. (1977). The analysis of cross-classified categorical data. Cambridge, MA: MIT Press.
Finn, J. D. (1974). A general model for multivariate analysis. New York: Holt, Rinehart & Winston.
Finn, J. D. (1977). Multivariate analysis of variance and covariance. In K. Enslein, A. Ralston, and H. S. Wilf (Eds.). Statistical methods for digital computers. Vol. III, New York: Wiley.
King, M., Bailey, D. M., Gibson, D. G., Pitha, J. V., & McCay, P. B. (1979). Incidence and growth of mammary tumors induced by 7,12-dimethylbenz(alpha)anthracene as related to the dietary content of fat and antioxidant. Journal of the National Cancer Institute, 63, 656-664.
Lawless, J. F. (1982). Statistical models and methods for lifetime data. New York: Wiley.
Lee, E. T. (1992). Statistical methods for survival data analysis (2nd edition). New York: Wiley.
Lindman, H. R. (1974). Analysis of variance in complex experimental designs. San Francisco: W. H. Freeman & Co.
Longley, J. W. (1967). An appraisal of least squares programs for the electronic computer from the point of view of the user. Journal of the American Statistical Association, 62, 819-831.
Milliken, G. A., & Johnson, D. E. (1984). Analysis of messy data. Vol. I: Designed experiments. New York: Van Nostrand Reinhold Co.
Neter, J., Wasserman, W., & Kutner, M. H. (1985). Applied linear statistical models: Regression, analysis of variance, and experimental designs. Homewood, Ill.: Irwin.
Phadke, M. S. (1989). Quality engineering using robust design. Englewood Cliffs, NJ: Prentice Hall.
Searle, S. R. (1987). Linear models for unbalanced data. New York: Wiley.
Winer, B. J. (1962). Statistical principles in experimental design. New York: McGraw-Hill. (2nd edition, McGraw-Hill, 1971).
Woodward, J. A., Bonett, D. G., & Brecht, M. L. (1990). Introduction to linear models and experimental design. New York: Harcourt, Brace, Jovanovich.
This material was developed by StatSoft, Inc. StatSoft, Inc. does not copyright the selection of the benchmark materials used in this text and explicitly encourages the use of those tests by others to the benefit of all statistics software. The proper citation for this selection of validation benchmarks is: Validation Benchmarks for Statistical Algorithms (version 1B). (1992). Tulsa, OK: StatSoft. We would appreciate your comments or questions.
©Copyright StatSoft, Inc., 1984-2006.
StatSoft, StatSoft logo, STATISTICA, Enterprise/QC, Enterprise, Data Miner, SEPATH and GTrees are trademarks of StatSoft, Inc.