STATISTICA 7 - New Features and Enhancements
Most Unique Features in STATISTICA 7
View the STATISTICA 7 Movie
Upgrade Order Information
Variable Selection Dialogs
Modeless Probability Calculator
Support for New Import and Export Types
WebSTATISTICA Changes & Integration With Enterprise SPC and STATISTICA Document Management System
Enhancements to Existing Functionality
Extended Import/Export
Enhanced Graph Updating
Interactive Graphs, Brushing
"By Group" Analysis for Statistics and Graphs
Variable and Case Metadata
Case Metadata
Variable Metadata
Workbook Multi-Item Display
New Recording / Reporting Options for Case Selection Conditions
Export PDF Format
Web Browser Document Type
Enhanced Text Importing
Automatic Variable Classification
Licensing Changes
Sorting
Merge Data
Stacking/Unstacking
Enhanced Spreadsheet Formulas and Case Selection Conditions
Further Expanded STATISTICA Visual Basic Functionality
"All Values" Categorization Method
Basic Statistics
Quality Control
Process Analysis
SEWSS Enhancements
Aggregated Data
21 CFR PART 11 Compliance
New Products and Analysis Modules
STATISTICA NIPALS Algorithm (PCA/PLS)
STATISTICA Sequence, Association and Link Analysis
STATISTICA Multivariate Statistical Process Control (MSPC)
Random Forests
Variable Selection dialogs
Resizable
If you have a large data set with many variables, or the variable names in your data
set are long, you can enlarge the variable selection dialog by dragging its corners or
sides. Doing this will increase the size of the display boxes, allowing you to better
view long variable names or larger portions of the variable list. Once the dialog has
been resized, the new size is retained throughout the analysis.
Display stylistic variation
In the variable selection dialog, STATISTICA now displays formats applied to variable
names (such as bold, underline, italics, and font color; as specified in the
spreadsheet). For example, a variable called Measure1 in the spreadsheet will be
displayed as Measure1 in the variable selection dialog. For readability reasons,
changes to the font size are not shown in the variable selection dialog.
Tool tips over long names
Tool tips now appear when you hover over a long name that does not fit in the
dialog window, so that you can read the entire name.
Bundles
Bundles can be used to organize large sets of variables and to facilitate the repeated
selection of the same set of variables. By creating bundles, you can make it possible
to quickly and easily locate a subset of data in a large data file (e.g., if the same set
of target variables is used repeatedly, you could create a bundle called Targets).
Copying variable information to Spreadsheet
STATISTICA now allows exporting of variable specifications (All Specs, Text Labels,
and Value Specs) to a STATISTICA spreadsheet file.
Modeless Probability Calculator
The Probability Calculator dialog box is now modeless (and can be minimized); as a
result, other tasks can be performed while one or more probability calculators
are in use.
Graphics
1) "Repeat last Format"
By pressing F4, the last formatting changes can be applied to unformatted items.
2) New line-plot fit types
Moving Average and Exponential Smoothing are the two new fit types, and they can
be applied after a line-plot has been created. These fit types are commonly used
as forecasting methods for a wide variety of time series data.
3) Speed of rendering
New marker point reduction options (Standard, Fast, or Aggressive) are used to
specify an optimization for rendering graphs with large amounts of data. Usually, this
option does not affect the overall appearance of the graph; it simply allows for
quicker display of the graph when large (i.e., over 100,000 cases) data sets are
involved.
4) Zoom
When you click the magnifying glass button, the mouse pointer becomes a magnifier,
enabling you to proportionally enlarge the current graph and redraw it in the new
ranges of coordinates either in the same window, or in a separate window.
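The two line-plot fit types named in item 2 above can be illustrated with a short sketch. This is not STATISTICA's implementation: the function names, the trailing-window convention, and the choice of the first observation as the starting value for smoothing are all assumptions made for illustration.

```python
from typing import List, Optional


def moving_average(values: List[float], window: int) -> List[Optional[float]]:
    """Trailing moving average (assumed convention).

    Positions that do not yet have a full window are returned as None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    result: List[Optional[float]] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        result.append(running / window if i >= window - 1 else None)
    return result


def exponential_smoothing(values: List[float], alpha: float) -> List[float]:
    """Simple exponential smoothing, seeded with the first observation (assumption)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for value in values:
        if not smoothed:
            smoothed.append(float(value))
        else:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed
```

A larger window or a smaller alpha gives a smoother fitted line that reacts more slowly to recent changes in the series.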
Support for new import and export types
1) JMP 5.1 (All versions up to 5)
Support for JMP Data Files (binary).
2) SAS (Versions 6.08 and above)
Support for SAS Data Files (binary) and SAS Transport Files (All versions supported).
Improvements to import speed have been addressed in this release.
3) SPSS (All versions supported)
Support for SPSS Data Files (binary) and SPSS Portable Files (Replaces V6 SPSS POR
import functionality).
4) Minitab (All versions up to 11)
Support for Minitab Data Files (binary).
5) Automation interfaces
Spreadsheets can be imported/exported from/to SPSS (.sav, .por), SAS (.sas7bdat,
.xpt), Minitab (.mtw), and JMP (.jmp).
WebSTATISTICA changes
WebSTATISTICA progress bar for batch jobs. If you offload computations to
WebSTATISTICA for batch processing, you can monitor the progress of the batch
jobs via the Web browser. This is particularly useful when you want to check the
status of lengthy computations remotely, from a location either inside or outside the
enterprise.
Integration of SEWSS with SDMS
Integration of SEWSS and SDMS (versioning of profiles and monitors). If you
enable SDMS integration with SEWSS, you can automatically store profiles and
monitors in SDMS (STATISTICA Document Management System) and optionally
ensure that older versions of profiles and monitors are maintained. SDMS keeps all
versions of the documents, and every modification to a profile or
monitor is stored as a new version.
Enhancements to Existing Functionality:
Extended Import/Export
Added support for importing and exporting to/from:
- SAS Data Files (binary)
- SAS Transport Files
- SPSS Data Files (binary)
- SPSS Portable Files (Replaces V6 SPSS POR import functionality)
- Minitab Data Files (binary)
- JMP Data Files (binary)
- Enhanced speed for the Unstack operation for sparse data situations.
- Oracle write-back can be performed using optimized native Oracle operations, providing improved speed for scoring/write-back operations on a large IDP data mining source.
Enhanced Graph Updating
- Support for maintaining integrated "data-graphs" exploratory environments.
- STATISTICA Graphs will update when the source Spreadsheet data change even after the respective STATISTICA analyses are closed.
- Graphs can be re-linked to new Spreadsheets and Variables, allowing currently customized graphs (titles, scaling, embedded objects, bar shading, etc.) to be used as "Templates" for deployment to different data sets.
Interactive Graphs
- Tight integration between Graphs and their source Spreadsheets.
- Brush points on Scatterplots and the Cases will automatically become marked in the respective Spreadsheet, so the subsets can be used in subsequent analyses.
- Brushing states will propagate to the source Spreadsheet and then to all other open Graphs based on the same Spreadsheet; this feature enables the user to brush points on one graph and view the corresponding Cases highlighted on other open Graphs.
- Brushing events will update the Spreadsheet marking Cases as Labeled/Unlabeled, Excluded/Included, Marked/Unmarked. Related graphs tied to the same data can then be updated to reflect the brushing events performed on the first graph.
- Histograms can now be brushed. If cases are selected in the Spreadsheet, the bars are partially filled to represent the portion of the histogram accounted for by the selected cases; if a bar is selected in the histogram, all of its component cases are selected.
- "Raw Data" display for Brushing of Box & Whisker plots. When selected, this option includes the raw data points in the display.
- Box and Whiskers plots can now have Jitter. Jitter will offset the points slightly so you can distinguish among points. This makes it easier to get tooltip information about certain points, and to select points for brushing.
- Box Plots can now be brushed. If raw data points are displayed then the individual points are selected in the Spreadsheet. If raw data is not displayed then selecting the box will select all points associated with it.
- A text string can be used as a graph marker point.
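The Jitter option described in the list above can be pictured as adding a small random offset to each point's position along the category axis. The sketch below is only an illustration of that idea; the function name, the uniform distribution, and the default width are assumptions, not STATISTICA's actual method.

```python
import random
from typing import List, Optional


def jitter(positions: List[float], width: float = 0.1,
           seed: Optional[int] = None) -> List[float]:
    """Offset each position by a uniform random amount in [-width, +width].

    The data values themselves are untouched; only the plotting position
    along the category axis is shifted so overlapping points separate.
    """
    rng = random.Random(seed)  # a fixed seed makes the display reproducible
    return [p + rng.uniform(-width, width) for p in positions]
```

Keeping the width well below the spacing between boxes keeps each point visually attached to its own group.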
"By Group" Analysis for Statistics and Graphs
- All STATISTICA Analyses and Graphs now support the selection of one or more "By Variables." The specified analysis is repeated for each unique level (value) of the "By Variables." For example, a Multiple Linear Regression model can be specified and calculated independently for subsets of cases defined by each unique value of variable City (e.g., Dallas, Atlanta, Pittsburgh, Chicago...).
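As a rough sketch of the "By Group" idea, the code below splits cases by the value of a By variable and repeats the same analysis on each subset. To keep it short, a one-predictor least-squares fit stands in for the Multiple Linear Regression mentioned above; the function names and the dictionary layout of the cases are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, List, Tuple

Case = Dict[str, object]


def by_group(cases: List[Case], by: str,
             analysis: Callable[[List[Case]], object]) -> Dict[object, object]:
    """Run `analysis` once for each unique value of the By variable."""
    groups: Dict[object, List[Case]] = defaultdict(list)
    for case in cases:
        groups[case[by]].append(case)
    return {level: analysis(subset) for level, subset in groups.items()}


def simple_regression(cases: List[Case], x: str = "x",
                      y: str = "y") -> Tuple[float, float]:
    """Least-squares slope and intercept for one predictor (illustrative stand-in)."""
    xs = [float(c[x]) for c in cases]
    ys = [float(c[y]) for c in cases]
    mx, my = mean(xs), mean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx
```

Calling `by_group(cases, "City", simple_regression)` would return one (slope, intercept) pair per city, each computed only from that city's cases.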
Variable and Case Metadata
Metadata can now be defined for Cases and Variables to offer new analytic options and simplify and speed up specifying new analyses.
Case Metadata:
- Marker Type: Defines the point marker shape to be used for the respective Case(s); used in Graph Types such as Scatterplots (for example, one particular case can be assigned a "red star" marker, and it will appear as such in all scatterplots).
- Marker Color: Defines the point marker color to be used for the respective Case(s).
- Excluded: User can mark a case as Excluded. An Excluded case will be omitted from calculations, but will still be present in graphical displays.
- Hidden: User can turn off a point in graph, i.e., the point will still be used in computations, but will not be displayed in a graph.
- Label: User can select to label individual cases within graphs.
Variable Metadata:
- Measurement Type (Auto, Continuous, Categorical, Ordinal): Used for automatic variable classification in Analyses and, optionally, automatically populating variable selection list boxes only with variables of the appropriate types.
- Excluded: Prevents display in Variable selection dialogs.
- Label: The User can define a variable as a Label variable. The values of a Label variable will be used as point labels within appropriate graphs.
- Case state: User can save case states to a specified variable.
- Properties: User can create custom metadata fields (name-value pairs) to be stored and associated with a Variable. For example, a User can define an "Upper Control Limit" property for a variable assigned a value of "2.6". A STATISTICA Visual Basic (SVB) macro can query the Variable Properties, including the custom Upper Control Limit Property, to apply it to Quality Control Charts based on this Variable. With this approach, the same SVB macro can be applied to different data and dynamically use appropriate QC Chart limits and specifications.
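The custom Properties example above (an "Upper Control Limit" stored with a variable and read back by a macro) can be sketched as follows. This is Python rather than SVB, and the dictionary is a hypothetical stand-in for the variable properties; it does not reflect the actual STATISTICA object model. The "Weight" entry and its value are invented for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for per-variable custom properties (name-value pairs).
variable_properties: Dict[str, Dict[str, str]] = {
    "Diameter": {"Upper Control Limit": "2.6"},
    "Weight": {"Upper Control Limit": "51.5"},
}


def cases_above_ucl(variable: str, values: List[float],
                    properties: Dict[str, Dict[str, str]]) -> List[int]:
    """Return 1-based case numbers whose value exceeds the variable's stored limit."""
    # Property values are stored as text, so convert before comparing.
    ucl = float(properties[variable]["Upper Control Limit"])
    return [i for i, value in enumerate(values, start=1) if value > ucl]
```

Because the limit is looked up by variable name, the same function works unchanged on any variable that carries the property.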
Workbook Multi-Item Display
- The new default behavior in Workbooks is that selecting a folder displays its contents, i.e., the Spreadsheets and Graphs in that folder, in a pane as a grid of adjacent items.
- Workbooks now support the ability to View/Print the contents of a Workbook folder in a user-defined grid configuration.
- V7 Workbooks can now be saved in Version 6 format; each item in the workbook will be exported to V6 format.
- The Workbook multi-item display supports selecting and copying selected items from the display using the standard Shift-click or Control-click conventions.
New Recording / Reporting Options for Case Selection Conditions
- Currently specified case selection conditions can now be automatically displayed in title areas of all respective graphs (generated from the case selected subsets) and in the header areas of all result spreadsheets.
Export PDF Format
- Export all document types into (editable) PDF format.
Web Browser Document Type
- Support for Integrated Internet Explorer (IE) Windows in the STATISTICA Application.
- The integrated IE Window offers one more method supported in STATISTICA to easily build custom User Interfaces, in this case, using the standard HTML scripting.
- IE Windows support HTML applications that can include native STATISTICA Spreadsheet and Graph objects for interactive editing, brushing, etc.
- IE Windows support hosting of native STATISTICA Spreadsheet and Graph objects for interactive editing, brushing, etc.
- Seamless integration of desktop STATISTICA and WebSTATISTICA running on a remote server.
Enhanced Text Importing
- The import of text files has been enhanced through the "auto" text import method. Users can now have the system automatically determine which columns should be imported as variables of type Text (instead of variables of type Double with text labels), or users can manually specify which columns are to be imported as text.
Automatic Variable Classification
- To speed up and simplify the process of selecting variables for analyses, Variable selection in Analyses and Graphs will (optionally) limit the display of Variables to the types that are appropriate for their respective roles in the Analyses. For example, in "By Group" Analyses, by default only Categorical Variables will be displayed for selection as the By Variables.
Licensing Changes
- STATISTICA Concurrent Licensing has been enhanced to allow for more granular licensing of modules and offline usage while a STATISTICA User is disconnected from the network (as well as supporting "trial period" usage of individual modules).
Sorting
- Improved user interface to define complex sorting scenarios with very many keys.
- Support via automation for up to 14 sort keys.
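Sorting on several keys, each with its own direction, can be sketched with repeated stable sorts applied from the lowest-priority key to the highest. The function name and the (variable, ascending) key format are assumptions for illustration and are not the STATISTICA automation interface.

```python
from typing import Dict, List, Sequence, Tuple

Case = Dict[str, object]


def sort_cases(cases: List[Case],
               keys: Sequence[Tuple[str, bool]]) -> List[Case]:
    """Sort cases by several keys given as (variable name, ascending) pairs.

    Keys are listed in priority order. Python's sort is stable, so sorting
    by the last key first and the first key last gives the combined order.
    """
    result = list(cases)
    for name, ascending in reversed(keys):
        result.sort(key=lambda case: case[name], reverse=not ascending)
    return result
```

For example, `sort_cases(cases, [("city", True), ("sales", False)])` orders cases by city alphabetically and, within each city, by sales from highest to lowest.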
Merge Data
- Merging from both open and disk-based Spreadsheets.
- Addition of "Cartesian-join" merge.
- Enhanced user interface.
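A Cartesian join pairs every case of one spreadsheet with every case of the other. The sketch below shows the plain cross product only; whether the STATISTICA option applies it within matching key values is not stated above, so that part is left out. The function name is hypothetical.

```python
from itertools import product
from typing import Dict, List

Case = Dict[str, object]


def cartesian_join(left: List[Case], right: List[Case]) -> List[Case]:
    """Combine every left case with every right case (plain cross product).

    If both sides share a variable name, the right-hand value wins.
    """
    return [{**l, **r} for l, r in product(left, right)]
```

The result has len(left) * len(right) cases, so this kind of merge grows quickly with the size of the inputs.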
Stacking/Unstacking
- Added ability to Interleave output when stacking.
- Stacking - Unstacked variables can be included/excluded from results.
- Unstacking - Added options for handling multiple cross tab values.
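One possible reading of the Interleave option when stacking is sketched below: without interleaving, all values of the first variable are followed by all values of the second; with interleaving, the output cycles through the variables case by case. This interpretation, the function name, and the (code, value) output layout are assumptions, not a description of the actual STATISTICA behavior.

```python
from typing import Dict, List, Tuple


def stack(columns: Dict[str, List[float]],
          interleave: bool = False) -> List[Tuple[str, float]]:
    """Stack several variables into (source variable, value) pairs.

    Assumes all variables have the same number of cases.
    """
    names = list(columns)
    if not names:
        return []
    if interleave:
        n_cases = len(columns[names[0]])
        # Case 1 of every variable, then case 2 of every variable, and so on.
        return [(name, columns[name][i]) for i in range(n_cases) for name in names]
    # All cases of the first variable, then all cases of the next.
    return [(name, value) for name in names for value in columns[name]]
```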
Enhanced Spreadsheet Formulas and Case Selection Conditions
STATISTICA now provides an even broader selection of regular expression (including so-called fuzzy text searching) functions that can be used in spreadsheet and case selection formulas. For example:
- RE_SEARCH - search for text in a variable using regular expressions.
- RE_MATCH - compare text using regular expressions.
- RE_REPLACE - text replacement in a variable using regular expressions.
- LIKE - compare text using an operator similar to SQL's LIKE keyword.
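To give a feel for what the four functions above do, here are rough Python analogues built on the standard `re` module. The lowercase names, argument order, whole-string matching, and case sensitivity are assumptions for illustration; the actual STATISTICA function signatures may differ.

```python
import re


def re_search(pattern: str, text: str) -> bool:
    """True if the pattern occurs anywhere in the text."""
    return re.search(pattern, text) is not None


def re_match(pattern: str, text: str) -> bool:
    """True if the whole text matches the pattern (assumed semantics)."""
    return re.fullmatch(pattern, text) is not None


def re_replace(pattern: str, replacement: str, text: str) -> str:
    """Replace every occurrence of the pattern in the text."""
    return re.sub(pattern, replacement, text)


def like(text: str, pattern: str) -> bool:
    """SQL-style LIKE: % matches any run of characters, _ matches exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None
```

In a case selection condition, such a function would be evaluated once per case, keeping only the cases for which it returns true.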
Further Expanded STATISTICA Visual Basic Functionality
- Go even further using STATISTICA as an efficient programming platform for developing highly interactive custom graphics applications.
- Embed a wide variety of ActiveX controls within STATISTICA graphs.
- The Object Model help file, stv6om.chm, has been reorganized to make the structure easier to understand. Documentation has been added for the SEWSS object model.
"All Values" Categorization Method
- A new method of categorization in graphs allows for up to 255 distinct categories of integer or non-integer values.
Basic Statistics
- Enhanced breakdown tables: empty rows are now eliminated from the generated tables.
Quality Control
- STATISTICA QC Charts support aggregated data (means, ranges, standard deviations) as input. This capability is particularly useful when automated data collection equipment and instruments output only aggregated data for each sample.
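When only per-sample means and ranges are available, X-bar chart limits can still be computed with the standard textbook formula: center line plus or minus A2 times the average range. The sketch below shows that calculation; the function name is hypothetical, the A2 constants are the usual tabulated values for sample sizes 2 to 5, and nothing here is claimed to match STATISTICA's own computations.

```python
from typing import Dict, List, Tuple

# Standard tabulated A2 control chart constants, indexed by sample size n.
A2: Dict[int, float] = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}


def xbar_limits(sample_means: List[float], sample_ranges: List[float],
                n: int) -> Tuple[float, float, float]:
    """Return (LCL, center line, UCL) for an X-bar chart from aggregated data.

    Assumes every sample has the same size n (2 to 5 in this sketch).
    """
    center = sum(sample_means) / len(sample_means)     # grand mean
    mean_range = sum(sample_ranges) / len(sample_ranges)
    half_width = A2[n] * mean_range
    return center - half_width, center, center + half_width
```

The raw measurements are never needed: the grand mean sets the center line and the average range sets the width of the limits.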
Process Analysis
- Gage Linearity analysis.
- 5,000 cases limit has been removed.
SEWSS Enhancements:
Aggregated Data
SEWSS supports aggregated data (means, ranges, standard deviation) as input. This capability is important and useful when automated data collection equipment and instruments output only aggregated data for each sample.
21 CFR Part 11 Compliance
Extensions to SEWSS offer options to better keep track of SEWSS users' activities and to increase the administrator's control over the way in which SEWSS is being used. These features are also required for complete compliance in a 21 CFR Part 11 environment. This includes logging of system changes, implementing a Windows integrated logon environment, and locking Spreadsheets and Graphs from modifications.
- Integrated NT authentication allows administrators to import users from any NT domain and define them as part of the SEWSS domain. Once a SEWSS domain user is defined, no further log-on prompt is required to log in to SEWSS.
- The Jet4.0 OLEDB provider can now be used when connecting into an external data source offering enhanced options for accessing certain databases.
New Products and Analysis Modules:
STATISTICA NIPALS Algorithm (PCA/PLS) - an implementation of a number of techniques known as Principal Component Analysis (PCA) and Partial Least Squares (PLS). In STATISTICA, PCA and PLS are implemented using the state-of-the-art NIPALS algorithm (Nonlinear Iterative Partial Least Squares), a mathematical procedure designed to extract systematic variations, relationships, and information from datasets. STATISTICA NIPALS simplifies the analysis at hand while effectively combating the curse of high dimensionality (typically present when the number of variables is large). STATISTICA NIPALS is also particularly suited for use in data diagnostics, making it an ideal tool for use in Quality Control in many areas of science and technology, for example the pharmaceutical, biochemical, and semiconductor industries. Important features include:
- Scalability: The ability to handle datasets with a very large number of variables.
- Data diagnostics and inter-variable relations: Capable of applying PCA to data diagnostics, while also using Partial Least Squares for relating a number of predictors to a set of outcome variables (whether in a classification or a regression problem).
- Integrated Graphical Analysis: Wide selection of integrated graphical techniques including batches plotted in the component space, importance plot of components, and univariate and multivariate QC Charts.
- Cross-validation. Integrated options for cross-validation to evaluate the number of components to extract.
- Quality Control. Wide selection of univariate and multivariate QC Charts for offline analysis or automatically-updated as new data are collected.
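For readers unfamiliar with NIPALS, the following is a minimal textbook-style sketch of the iteration for PCA: alternate between estimating loadings from the current scores and scores from the current loadings until they stop changing, then deflate the data and repeat for the next component. It is a generic illustration in plain Python, not STATISTICA's implementation; the function names, tolerance, and starting-column rule are assumptions.

```python
from math import sqrt
from typing import List, Tuple

Matrix = List[List[float]]


def center_columns(data: Matrix) -> Matrix:
    """Subtract each column's mean from its values."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[v - m for v, m in zip(row, means)] for row in data]


def nipals_pca(data: Matrix, n_components: int, tol: float = 1e-12,
               max_iter: int = 500) -> Tuple[Matrix, Matrix]:
    """Return (scores, loadings), one list per extracted component."""
    x = center_columns(data)
    n_vars = len(x[0])
    scores: Matrix = []
    loadings: Matrix = []
    for _ in range(n_components):
        # Start from the column with the largest sum of squares.
        j = max(range(n_vars), key=lambda k: sum(row[k] ** 2 for row in x))
        t = [row[j] for row in x]
        p = [0.0] * n_vars
        for _iteration in range(max_iter):
            tt = sum(v * v for v in t)
            if tt == 0.0:
                break  # no variation left to extract
            # Loadings: regress each column of x on the scores, then normalise.
            p = [sum(row[k] * ti for row, ti in zip(x, t)) / tt
                 for k in range(n_vars)]
            norm = sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]
            # Scores: project each row of x onto the loadings.
            t_new = [sum(a * b for a, b in zip(row, p)) for row in x]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        scores.append(t)
        loadings.append(p)
        # Deflate: remove the part of x explained by this component.
        x = [[v - ti * pk for v, pk in zip(row, p)] for row, ti in zip(x, t)]
    return scores, loadings
```

Because each component is extracted one at a time from the deflated data, only as many components as needed are computed, which is part of why the approach scales to many variables.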
STATISTICA Sequence, Association and Link Analysis - this new, stand-alone product addresses the needs of clients in industries such as retailing, banking, and insurance by implementing the fastest known, highly scalable sequence analysis algorithm with the ability to derive Association and Sequence rules in one single analysis. Furthermore, the program represents a stand-alone application that can be used for model building and deployment.
STATISTICA Multivariate Statistical Process Control (MSPC) - this new, stand-alone product (available in enterprise, client-server versions) is designed for advanced process control applications in many industries, including pharmaceutical, chemical and bio-chemical, food production and others; it provides the widest selection of univariate and multivariate techniques for statistical process control applications. Analytic capabilities include, among many others:
- Partial Least Squares - comprehensive implementation of NIPALS algorithm for partial least squares regression including hierarchical PLS and multi-way PLS.
- Principal Components - comprehensive implementation of NIPALS algorithm for Principal Components Analysis including hierarchical PCA and multi-way PCA.
- Scalable to hundreds of thousands of parameters, including process parameters, in-process tests, and finished product tests.
- Integrated Graphical Analysis - wide selection of integrated graphical techniques including batches plotted in the component space, importance plot of components, and univariate and multivariate QC Charts.
- Cross-validation - integrated options for cross-validation to evaluate the number of components to extract.
- Quality Control - wide selection of univariate and multivariate QC Charts for offline analysis or automatically-updated as new data are collected.
Random Forests - this new module of STATISTICA Data Miner applications offers cutting-edge techniques for building flexible models for classification and regression; particularly well-suited for extremely large numbers of predictor variables.