
Features of STATISTICA Advanced Linear/Non-Linear Models
STATISTICA Advanced Linear/Non-Linear Models offers a wide array of the most advanced linear and nonlinear modeling
tools on the market. It supports continuous and categorical predictors, interactions, and hierarchical models; provides
automatic model selection facilities; and includes variance components, time series, and many other methods. All analyses
come with extensive, interactive graphical support and complete built-in Visual Basic scripting.
STATISTICA Advanced Linear/Non-Linear Models is compatible with Windows 2000 and Windows XP.
Variance Components and Mixed Model ANOVA/ANCOVA
Survival/Failure Time Analysis
General Nonlinear Estimation (and Quick Logit/Probit Regression)
Log-Linear Analysis of Frequency Tables
Time Series Analysis/Forecasting
Structural Equation Modeling/Path Analysis (SEPATH)
General Linear Models (GLM)
General Regression Models (GRM)
Generalized Linear Models (GLZ)
General Partial Least Squares Models (PLS)
VARIANCE COMPONENTS AND MIXED MODEL ANOVA/ANCOVA.
Variance Components and Mixed Model ANOVA/ANCOVA is a specialized module for designs with random effects and/or factors
with many levels; options for handling random effects and for estimating variance components are also provided in the
General Linear Models module. Random effects (factors) occur frequently in industrial research, when the levels of a factor
represent values sampled from a random variable (as opposed to being deliberately chosen or arranged by the experimenter).
The Variance Components module allows you to analyze designs with any combination of fixed effects, random effects, and
covariates. Extremely large ANOVA/ANCOVA designs can be analyzed efficiently: factors can have several hundred levels.
The program will analyze standard factorial (crossed) designs as well as hierarchically nested designs, and compute the standard
Type I, II, and III analysis of variance sums of squares and mean squares for the effects in the model. In addition, you can
compute the table of expected mean squares for the effects in the design, the variance components for the random effects in the
model, the coefficients for the denominator synthesis, and the complete ANOVA table with tests based on synthesized error sums of
squares and degrees of freedom (using Satterthwaite's method).
Other methods for estimating variance components are also supported (e.g., MIVQUE0, Maximum Likelihood [ML], Restricted Maximum
Likelihood [REML]). For maximum likelihood estimation, both the Newton-Raphson and Fisher scoring algorithms are used, and the
model will not be arbitrarily changed (reduced) during estimation to handle situations where most components are at or near zero.
Several options for reviewing the weighted and unweighted marginal means, and their confidence intervals, are also available.
Extensive graphics options can be used to visualize the results.
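The link between expected mean squares and variance components can be illustrated with a minimal Python sketch (not STATISTICA code). It assumes the simplest case, a balanced one-way design with a single random factor, and uses the ANOVA (method-of-moments) estimator; the function name `one_way_variance_components` and the sample data are hypothetical. The module itself covers far more general designs and other estimators (MIVQUE0, ML, REML).

```python
from statistics import mean


def one_way_variance_components(groups):
    """ANOVA (method-of-moments) variance components for a balanced
    one-way random-effects design. `groups` is a list of equally sized
    lists, one per level of the random factor."""
    a = len(groups)            # number of levels of the random factor
    n = len(groups[0])         # observations per level
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")

    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    ss_between = n * sum((m - grand) ** 2 for m in group_means)
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (a - 1)
    ms_within = ss_within / (a * (n - 1))

    # Expected mean squares for this design:
    #   E[MS_between] = sigma2_error + n * sigma2_factor
    #   E[MS_within]  = sigma2_error
    sigma2_error = ms_within
    sigma2_factor = max(0.0, (ms_between - ms_within) / n)  # truncate at 0
    return {
        "ms_between": ms_between,
        "ms_within": ms_within,
        "sigma2_factor": sigma2_factor,
        "sigma2_error": sigma2_error,
    }


if __name__ == "__main__":
    batches = [[10, 12, 11], [14, 15, 16], [20, 19, 21]]  # made-up data
    print(one_way_variance_components(batches))
```

Solving the expected-mean-square equations for the unknown variances is what makes this a method-of-moments estimate; truncating a negative estimate at zero is one common convention and is an assumption here.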
SURVIVAL/FAILURE TIME ANALYSIS.
This module features a comprehensive implementation of a variety of techniques for analyzing censored data from social,
biological, and medical research, as well as procedures used in engineering and marketing (e.g., quality control,
reliability estimation, etc.). In addition to computing life tables with various descriptive statistics and Kaplan-Meier
product limit estimates, the user can compare the survivorship functions in different groups using a large selection of
methods (including the Gehan test, Cox F-test, Cox-Mantel test, Log-rank test, and Peto & Peto generalized Wilcoxon test).
Also, Kaplan-Meier plots can be computed for groups (uncensored observations are identified in graphs with different point markers).
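As a rough illustration of the Kaplan-Meier product-limit estimate mentioned above, here is a small self-contained Python sketch. It is not the module's implementation; the function name `kaplan_meier` and the toy data are hypothetical, and ties between failures and censored cases at the same time are handled by keeping both in the risk set for that time.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- observed survival or censoring times
    events -- True if the failure was observed, False if censored
    Returns a list of (time, S(time)) pairs at each distinct failure time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Collect everything that happens at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                failures += 1
            leaving += 1
            i += 1
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving   # failures and censored cases leave the risk set
    return curve


if __name__ == "__main__":
    times = [1, 2, 2, 3, 4, 5]                           # made-up data
    observed = [True, True, False, True, False, True]
    for t, s in kaplan_meier(times, observed):
        print(t, round(s, 4))
```

Censored cases never reduce the estimate directly; they only shrink the risk set used at later failure times.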
The program also features a selection of survival function fitting procedures (including the Exponential, Linear Hazard,
Gompertz, and Weibull functions) based on either unweighted or weighted least squares methods (maximum-likelihood parameter
estimates for various distributions, including Weibull, can also be computed via the STATISTICA Process Analysis module).
Finally, the program offers full implementations of four general explanatory models (Cox's proportional hazard model,
exponential regression model, log-normal and normal regression models) with extended diagnostics, including stratified
analysis and graphs of survival for user-specified values of predictors. For Cox proportional hazard regression, the user
can choose to stratify the sample to permit different baseline hazards in different strata (but a constant coefficient vector),
or the user can allow for different baseline hazards as well as coefficient vectors.
In addition, general facilities are provided to define one or more time-dependent covariates. Time-dependent covariates can be
specified via a flexible formula interpreter that allows the user to define the covariates via arithmetic expressions which may
include time, as well as the standard logical functions (e.g., timdep=age+age*log(t_)*(age>45), where t_ references survival time)
and a wide variety of distribution functions. As in all other modules of STATISTICA, the user can access and change the technical
parameters of all procedures (or accept dynamic defaults).
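To show what the example expression `timdep=age+age*log(t_)*(age>45)` evaluates to, the following Python sketch mirrors it outside of STATISTICA. It assumes that `log` denotes the natural logarithm and that the logical term `(age>45)` evaluates to 1 when true and 0 when false; the function name `timdep` simply reuses the name from the formula.

```python
import math


def timdep(age, t):
    """Time-dependent covariate: age + age * log(t) * (age > 45).

    t is the survival time (t_ in the formula) and must be positive.
    The logical term contributes 1 when age > 45 and 0 otherwise
    (assumed convention).
    """
    indicator = 1 if age > 45 else 0
    return age + age * math.log(t) * indicator


if __name__ == "__main__":
    for age in (40, 50):
        # For age <= 45 the covariate stays constant over time.
        print(age, [round(timdep(age, t), 2) for t in (1, 5, 10)])
```

For subjects aged 45 or younger the covariate reduces to plain age, while for older subjects its value grows with the logarithm of survival time.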
The module also offers an extensive selection of graphics and specialized diagrams to aid in the interpretation of results
(including plots of cumulative proportions surviving/failing, patterns of censored data, hazard and cumulative hazard functions,
probability density functions, group comparison plots, distribution fitting plots, various residual plots, and many others). For
engineering applications, see also Weibull Analysis.
GENERAL NONLINEAR ESTIMATION (and Quick Logit/Probit Regression).
The Nonlinear Estimation module allows the user to fit essentially any type of nonlinear model. One of the unique features of this
module is that (unlike traditional nonlinear estimation programs) it does not impose any limits on the size of data files that it
can process.
Estimation Methods. The models can be fit using least squares or maximum-likelihood estimation, or any user-specified loss function. When using the least-squares criterion, the very efficient Levenberg-Marquardt and Gauss-Newton algorithms can be used to estimate the parameters for arbitrary linear and nonlinear regression problems. For large datasets or for difficult nonlinear regression problems (such as those rated "higher difficulty" among the Statistical Reference Datasets provided by the National Institute of Standards and Technology; see http://www.nist.gov/itl/div898/strd/index.html), this least-squares approach is the recommended way to compute precise parameter estimates. When using arbitrary loss functions, the user can choose from among four very different, powerful estimation procedures (quasi-Newton, Simplex, Hooke-Jeeves pattern moves, and Rosenbrock pattern search method of rotating coordinates) so that stable parameter estimates can be obtained in practically all cases, even in extremely numerically demanding conditions (see the Validation Benchmarks).
Models. The user can specify any type of model by typing the respective equation into an equation editor. The equations may include logical operators; thus, discontinuous (piecewise) regression models and models including indicator variables can also be estimated. The equations may also include a wide selection of distribution functions and cumulative distribution functions (Beta, Binomial, Cauchy, Chi-square, Exponential, Extreme value, F, Gamma, Geometric, Laplace, Logistic, Normal, Log-Normal, Pareto, Poisson, Rayleigh, t (Student), or Weibull distribution). The user has full control over all aspects of the estimation procedure (e.g., starting values, step sizes, convergence criteria, etc.).
The most common nonlinear regression models are predefined in the Nonlinear Estimation module and can be chosen simply as menu options. These include stepwise Probit and Logit regression, the exponential regression model, and linear piecewise (break point) regression. Note that STATISTICA also includes implementations of powerful algorithms for fitting generalized linear models, including probit and multinomial logit models, and generalized additive models; see the respective descriptions for additional details.
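The least-squares estimation described above can be sketched in a few lines of Python. This is an illustrative damped Gauss-Newton (Levenberg-Marquardt style) iteration for the two-parameter exponential model y = a * exp(b * x), written from the textbook description rather than taken from STATISTICA; the function name `fit_exponential`, the starting values, the damping schedule, and the noise-free data are all assumptions made for the example.

```python
import math


def fit_exponential(xs, ys, a=1.0, b=0.1, max_iter=200, tol=1e-12):
    """Fit y = a * exp(b * x) by least squares with a damped
    Gauss-Newton (Levenberg-Marquardt style) iteration.
    Returns (a, b, sum_of_squared_errors)."""

    def sse(a_, b_):
        return sum((y - a_ * math.exp(b_ * x)) ** 2 for x, y in zip(xs, ys))

    lam = 1e-3                       # damping parameter
    current = sse(a, b)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r, with r = y - f(x), J = df/d(a, b).
        jaa = jab = jbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e
            da, db = e, a * x * e
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * r
            gb += db * r
        # Damped normal equations (Marquardt scaling of the diagonal).
        m11, m22, m12 = jaa * (1.0 + lam), jbb * (1.0 + lam), jab
        det = m11 * m22 - m12 * m12
        if det == 0.0:
            break
        step_a = (ga * m22 - gb * m12) / det
        step_b = (gb * m11 - ga * m12) / det
        trial = sse(a + step_a, b + step_b)
        if trial < current:          # accept the step, relax the damping
            improvement = current - trial
            a, b, current = a + step_a, b + step_b, trial
            lam = max(lam / 10.0, 1e-12)
            if improvement <= tol:
                break
        else:                        # reject the step, increase the damping
            lam *= 10.0
            if lam > 1e12:
                break
    return a, b, current


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4]
    ys = [2.0 * math.exp(0.5 * x) for x in xs]   # noise-free toy data
    print(fit_exponential(xs, ys, a=1.8, b=0.45))
```

With small damping the step is close to a plain Gauss-Newton step; with large damping it shortens toward a cautious gradient-like step, which is why the combination tends to be robust to poor starting values.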
Graphs. All output is integrated with extensive selections of graphs, including interactively adjustable 2D and 3D (surface) arbitrary function fitting graphs, which allow the user to visualize the quality of the fit and identify outliers or ranges of discrepancy between the model and the data. The user can interactively adjust the equation of the fitted function (as shown in the graph) without re-processing the data, and can visualize practically all aspects of the nonlinear fitting process. Many other specialized graphs are provided to evaluate the fitting process and visualize the results, such as histograms of all selected variables and residual values, scatterplots of observed versus predicted values and predicted versus residual values, normal and half-normal probability plots of residuals, and many others.
LOG-LINEAR ANALYSIS OF FREQUENCY TABLES.
This module offers a complete implementation of log-linear modeling procedures for multi-way frequency tables. Note that
STATISTICA also includes the Generalized Linear Models module, which provides options for analyzing binomial and multinomial
logit models with coded ANOVA/ANCOVA-like designs.
In the Log-Linear Analysis module, the user can analyze up to 7-way tables in a single run. Both complete and incomplete tables
(with structural zeros) can be analyzed. Frequency tables can be computed from raw data, or may be entered directly into the program.
The Log-Linear Analysis module provides a comprehensive selection of advanced modeling procedures in an interactive and flexible environment
that greatly facilitates exploratory and confirmatory analyses of complex tables. The user may at all times review the complete observed
table as well as marginal tables, and fitted (expected) values, and may evaluate the fit of all partial and marginal association models
or select specific models (marginal tables) to be fitted to the observed data.
The program also offers an intelligent automatic model selection procedure that first determines the necessary order of interaction
terms required for a model to fit the data, and then, through backwards elimination, determines the best sufficient model to
satisfactorily fit the data (using criteria determined by the user).
The standard output includes G-square (Maximum-Likelihood Chi-square), the standard Pearson Chi-square with the appropriate
degrees of freedom and significance levels, the observed and expected tables, marginal tables, and other statistics. Graphics
options available in the Log-linear module include a variety of 2D and 3D graphs designed to visualize 2-way and multi-way
frequency tables (including interactive, user-controlled cascades of categorized histograms and 3D histograms revealing "slices"
of multi-way tables), plots of observed and fitted frequencies, plots of various residuals (standardized, components of
Maximum-Likelihood Chi-square, Freeman-Tukey deviates, etc.), and many others.
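The two fit statistics named above are easy to compute by hand for the simplest log-linear model, the independence model for a two-way table. The Python sketch below does this for illustration only; the function name `independence_fit` and the example table are hypothetical, and multi-way tables and arbitrary models require iterative fitting that is not shown here.

```python
import math


def independence_fit(table):
    """Fit the independence model to a two-way frequency table.

    Returns (expected, pearson_chi2, g_square, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected frequencies under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    pearson = 0.0
    g_square = 0.0
    for obs_row, exp_row in zip(table, expected):
        for o, e in zip(obs_row, exp_row):
            pearson += (o - e) ** 2 / e
            if o > 0:                      # 0 * log(0) is taken as 0
                g_square += 2.0 * o * math.log(o / e)

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, pearson, g_square, df


if __name__ == "__main__":
    observed = [[10, 20], [20, 10]]        # made-up 2 x 2 table
    expected, chi2, g2, df = independence_fit(observed)
    print(expected, round(chi2, 4), round(g2, 4), df)
```

Both statistics are referred to the Chi-square distribution with the returned degrees of freedom; they usually agree closely unless some expected frequencies are small.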
TIME SERIES ANALYSIS/FORECASTING.
The Time Series module contains a wide range of descriptive, modeling, decomposition, and forecasting methods for both time and
frequency domain models. These procedures are integrated, that is, the results of one analysis (e.g., ARIMA residuals) can be used
directly in subsequent analysis (e.g., to compute the autocorrelation of the residuals).
Also, numerous flexible options are provided to review and plot single or multiple series. Analyses can be performed on even very
long series. Multiple series can be maintained in the active work area of the program (e.g., multiple raw input data series or
series resulting from different stages of the analysis); the series can be reviewed and compared. The program will automatically
keep track of successive analyses, and maintain a log of transformations and other results (e.g., ARIMA residuals, seasonal
components, etc.). Thus, the user can always return to prior transformations or compare (plot) the original series together with
its transformations. Information about the consecutive transformations is maintained in the form of long variable labels, so if you
save the newly created variables into a dataset, the "history" of each of the series will be permanently preserved. The specific
Time Series procedures are described in the following subsections.
Transformations, Modeling, Plots, Autocorrelations. The available time series transformations allow the user to fully explore patterns in the input series, and to perform all common time series transformations, including: de-trending, removal of autocorrelation, moving average smoothing (unweighted and weighted, with user-defined or Daniell, Tukey, Hamming, Parzen, or Bartlett weights), moving median smoothing, simple exponential smoothing (see also the description of all exponential smoothing options below), differencing, integrating, residualizing, shifting, 4253H smoothing, tapering, Fourier (and inverse) transformations, and others. Autocorrelation, partial autocorrelation, and crosscorrelation analyses can also be performed.
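A few of the transformations listed above can be written compactly in Python, which may help clarify what each one does. These are textbook versions under simple assumptions (odd window for the centered moving average, first observation as the starting value for exponential smoothing, autocorrelation normalized by the lag-0 sum of squares); the function names are hypothetical and the module's own defaults may differ.

```python
def moving_average(series, window):
    """Unweighted centered moving average; `window` must be odd.
    The result is shorter than the input by window - 1 values."""
    half = window // 2
    return [sum(series[i - half:i + half + 1]) / window
            for i in range(half, len(series) - half)]


def difference(series, lag=1):
    """Differencing: x[t] - x[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing, started at the first observation."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[i] - m) * (series[i + lag] - m) for i in range(n - lag))
    return num / denom


if __name__ == "__main__":
    s = [1, 2, 3, 4, 5, 6]                 # made-up series
    print(moving_average(s, 3))
    print(difference(s))
    print(exponential_smoothing(s, 0.5))
    print(autocorrelation(s, 1))
```

Chaining these functions, for example differencing a series and then inspecting the autocorrelation of the result, mirrors the integrated workflow described for the module.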
Classical Seasonal Decomposition (Census Method I). The user may specify the length of the seasonal period, and choose either the additive or multiplicative seasonal model. The program will compute the moving averages, ratios or differences, seasonal factors, the seasonally adjusted series, the smoothed trend-cycle component, and the irregular component. Those components are available for further analysis; for example, the user may compute histograms, normal probability plots, etc. for any or all of these components (e.g., to test model adequacy).
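The first steps of the additive model (moving averages, differences, seasonal factors, seasonally adjusted series) can be sketched as follows. This simplified Python version is an illustration, not the Census Method I procedure as implemented in the module: it averages the differences with a plain mean, omits the final trend-cycle smoothing and the irregular component, and uses hypothetical function names. The series should span at least two full seasonal periods.

```python
def centered_moving_average(series, period):
    """Moving average centered on each point; for an even period the two
    end points of the window get half weight. Undefined ends are None."""
    n = len(series)
    half = period // 2
    out = [None] * n
    for i in range(half, n - half):
        if period % 2:
            total = sum(series[i - half:i + half + 1])
        else:
            total = (0.5 * series[i - half]
                     + sum(series[i - half + 1:i + half])
                     + 0.5 * series[i + half])
        out[i] = total / period
    return out


def additive_decomposition(series, period):
    """Return (moving_average, seasonal_factors, seasonally_adjusted)."""
    trend = centered_moving_average(series, period)

    # Differences between the series and its moving average, by season.
    diffs = [[] for _ in range(period)]
    for i, (x, t) in enumerate(zip(series, trend)):
        if t is not None:
            diffs[i % period].append(x - t)

    seasonal = [sum(d) / len(d) for d in diffs]
    offset = sum(seasonal) / period
    seasonal = [s - offset for s in seasonal]   # factors sum to zero

    adjusted = [x - seasonal[i % period] for i, x in enumerate(series)]
    return trend, seasonal, adjusted


if __name__ == "__main__":
    pattern = [1, -2, 3, -2]                    # made-up quarterly pattern
    series = [10 + 2 * i + pattern[i % 4] for i in range(12)]
    trend, seasonal, adjusted = additive_decomposition(series, 4)
    print(seasonal)
    print(adjusted)
```

For the multiplicative model the differences would be replaced by ratios and the subtraction by a division, following the "ratios or differences" wording above.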
Polynomial Distributed Lag Models. The implementation of the polynomial distributed lag methods in the Time Series module will estimate models with unconstrained lags as well as (constrained) Almon distributed lag models. A selection of graphs is available to examine the distributions of the model variables.
Regression-Based Forecasting Techniques. Finally, STATISTICA offers regression-based time series techniques for lagged or non-lagged variables (including regression through the origin, nonlinear regression, and interactive what-if forecasting).

STRUCTURAL EQUATION MODELING AND PATH ANALYSIS (SEPATH).
STATISTICA includes a comprehensive implementation of structural equation modeling techniques with flexible Monte Carlo simulation
facilities (SEPATH).
©Copyright StatSoft, Inc., 1984-2006.
SEPATH Monte Carlo simulation. The STATISTICA Structural Equation Modeling (SEPATH) module (see above) includes powerful
GENERAL LINEAR MODELS (GLM).
STATISTICA General Linear Models (GLM) analyzes responses on one or more continuous dependent variables as a function of
one or more categorical or continuous independent variables.
GLM is not only the most computationally advanced GLM tool currently on the market, but it is also the most comprehensive and
complete application available, offering a larger selection of options, graphs, accompanying statistics, and extended diagnostics
than any other program. Designed with a "no compromise" approach, GLM offers the most extensive selection of options to
handle the so-called "controversial problems" of general linear modeling, which do not have any widely agreed-upon solutions.
GLM will compute all the standard results, including ANOVA tables with univariate and multivariate tests, descriptive statistics, etc.
GLM offers a large number of results and graphics options that are usually not available in other programs. GLM
also offers simple ways to test linear combinations of parameter estimates, specifications of custom error terms and effects,
and comprehensive post-hoc comparison methods for between-group effects, repeated measures effects, and the interactions
between repeated measures.

GENERAL REGRESSION MODELS (GRM).
STATISTICA General Regression Models (GRM) provides a highly flexible implementation of the general linear model, with both
the standard results options and a number of unique ones, together with a comprehensive set of stepwise regression and
best-subset model building techniques that support both continuous and categorical variables.
Stepwise and best-subset methods can be used in GRM to build models for highly complex designs, including designs with
effects for categorical predictor variables. Thus, the "general" in General Regression Models refers both to the use
of the general linear model and to the fact that, unlike most other stepwise regression programs, GRM is not limited to the
analysis of designs that contain only continuous predictor variables. In addition, unique regression-specific results options
include Pareto charts of parameter estimates, whole model summaries (tests) with various methods for evaluating no-intercept models,
partial and semi-partial correlations, etc.

GENERALIZED LINEAR MODELS (GLZ).
The Generalized Linear Models (GLZ) module allows the user to search for both linear and nonlinear relationships between a
response variable and categorical or continuous predictor variables. Special applications of generalized linear models include
a number of widely used types of analyses, such as binomial and multinomial logit and probit regression, Signal Detection
Theory (SDT) models, and many others. The GLZ module will compute all standard
results statistics, including likelihood ratio tests, and Wald and score tests for significant effects, parameter estimates and
their standard errors and confidence intervals, etc. The user interfaces, methods for specifying designs, and "touch-and-feel" of
the program are similar to those of GLM, GRM, and PLS. The user is able to easily specify ANOVA or ANCOVA-like
designs, response surface designs, mixture surface designs, etc.; thus, even novice users will have no difficulty applying
generalized linear models to analyze their data. In addition, GLZ includes a comprehensive selection of model checking
tools such as spreadsheets and graphs for various residuals and outlier detection statistics, including raw residuals, Pearson
residuals, deviance residuals, studentized Pearson residuals, studentized deviance residuals, likelihood residuals, differential
Chi-square statistics, differential deviance, and generalized Cook distances.
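As a small, self-contained illustration of a generalized linear model, the Python sketch below fits a binomial logit model with one predictor by Fisher scoring (which coincides with Newton-Raphson for the logit link) and then computes raw, Pearson, and deviance residuals. It is a textbook sketch, not the GLZ implementation; the function names `fit_logit` and `logit_residuals` and the data are hypothetical, and the iteration assumes the data are not perfectly separable.

```python
import math


def _prob(b0, b1, x):
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def fit_logit(xs, ys, iterations=25, tol=1e-10):
    """Maximum-likelihood fit of P(y=1) = logistic(b0 + b1*x) by Fisher
    scoring. Returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0      # score vector and information
        for x, y in zip(xs, ys):
            p = _prob(b0, b1, x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        d0 = (g0 * h11 - g1 * h01) / det      # solve information * d = score
        d1 = (g1 * h00 - g0 * h01) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def logit_residuals(xs, ys, b0, b1):
    """Return a list of (raw, Pearson, deviance) residuals for binary y."""
    out = []
    for x, y in zip(xs, ys):
        p = _prob(b0, b1, x)
        raw = y - p
        pearson = raw / math.sqrt(p * (1.0 - p))
        loglik = math.log(p) if y == 1 else math.log(1.0 - p)
        deviance = math.copysign(math.sqrt(-2.0 * loglik), raw)
        out.append((raw, pearson, deviance))
    return out


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6]                   # made-up data
    ys = [0, 0, 1, 0, 1, 1]
    b0, b1 = fit_logit(xs, ys)
    print(b0, b1)
    print(logit_residuals(xs, ys, b0, b1))
```

The squared deviance residuals sum to the model deviance, which is the quantity compared between nested models in a likelihood ratio test.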

GENERAL PARTIAL LEAST SQUARES MODELS (PLS).
Partial Least Squares (PLS) includes a comprehensive selection of algorithms for univariate and multivariate
partial least squares problems.
PLS will compute all the standard results for a partial least squares analysis; in addition, it offers a large number of
results options and in particular graphics options that are usually not available in other implementations; for example, graphs of
parameter values as a function of the number of components, two-dimensional plots for all output statistics (parameters, factor
loadings, etc.), two-dimensional plots for all residual statistics, etc. Because PLS offers the same selection of
flexible user interfaces as GLM, GRM, and GLZ, it is very easy to set up models in one module and
quickly analyze the data using the same model in PLS. This unique flexibility allows even novice users to apply these
powerful techniques to their analysis problems. The partial least squares method is a powerful data mining technique, particularly
well suited for determining a smaller number of dimensions in a large number of predictors and response variables. These methods
for analyzing linear systems have become popular only in the last few years; thus, many of the algorithms and statistics are still
the subject of ongoing research.
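To show how partial least squares extracts a small number of components from many predictors, here is a compact Python sketch of the classic NIPALS-style algorithm for a single response (often called PLS1). It is one textbook variant among several and is not claimed to match the algorithms in the PLS module; the function name `pls1` and the data are hypothetical.

```python
import math


def pls1(X, y, n_components):
    """PLS regression for one response by successive deflation.

    X is a list of rows, y a list of responses. Returns
    (scores, weights, loadings, y_loadings, y_residuals)."""
    n, m = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - col_means[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                   # centered y

    scores, weights, loadings, y_loadings = [], [], [], []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Component scores and their sum of squares.
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings of X and y on the component.
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component explains.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        scores.append(t)
        weights.append(w)
        loadings.append(p)
        y_loadings.append(q)
    return scores, weights, loadings, y_loadings, f


if __name__ == "__main__":
    X = [[1, 2, 1], [2, 1, 3], [3, 5, 2], [4, 3, 6], [5, 6, 4]]  # made-up
    y = [1, 2, 4, 5, 7]
    scores, weights, loadings, y_loadings, resid = pls1(X, y, 2)
    print(y_loadings)
    print(sum(v * v for v in resid))
```

Because each component is removed from the predictors before the next one is extracted, the score vectors are mutually orthogonal, which is what allows a handful of components to summarize a large set of correlated predictors.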
Pacific
Suite 1, 46-48 Howard Street
North Melbourne VIC 3051
Australia
Phone: +61 3 9348 9422
Fax: +61 3 9348 9420
e-mail: info@statsoft.com.au
StatSoft, StatSoft logo, STATISTICA, Enterprise/QC, Enterprise, Data Miner, SEPATH and GTrees are trademarks of StatSoft, Inc.