The table below lists the data handling limits for each major Data Science feature. "Unlimited" means that the software imposes no limit, although practical limits may still apply depending on your computer's resources and your Excel version.
Analytic Solver Data Science Feature | Analytic Solver Data Mining | Subsets |
---|---|---|
Risk Analysis and Synthetic Data Generation | Unlimited | Not Supported |
Partitioning | ||
# of Records (original data) | Unlimited | 65,000 |
# of Records (training partition) | Unlimited | 10,000 |
# of Variables (output) | Unlimited | 100 |
Sampling | ||
# of Records (original data) | Unlimited | 65,000 |
# of Variables (output) | Unlimited | 50 |
# of Strata (Stratified Sampling) | Unlimited | 30 |
Database | ||
# of Records (table) | Unlimited | 1,000,000 |
# of Records (output) | Unlimited | 65,000 |
# of Variables (table) | Unlimited | Unlimited |
# of Variables (output) | Unlimited | 50 |
# of Strata (Stratified Sampling) | Unlimited | 30 |
File System | ||
# of Files | Unlimited | 100 |
Text Mining | ||
# of Documents | Unlimited | 100 |
# of Characters (per document) | Unlimited | 5,000 |
# of Terms in final vocabulary | Unlimited | 50 |
# of Text columns | Unlimited | 1 |
Transformation | ||
Common | ||
# of Records | Unlimited | 10,000 |
# of Variables | Unlimited | 50 |
Missing Data Handling | ||
# of Records | Unlimited | 65,000 |
# of Variables | Unlimited | 50 |
Binning Continuous Data | ||
# of Records | Unlimited | 65,000 |
Transforming Categorical Data | ||
# of Records | Unlimited | 65,000 |
# of Variables (data range) | Unlimited | 50 |
# of Distinct values | Unlimited | 30 |
Time Series Analysis | ||
# of Records | Unlimited | 1,000 |
Classification and Prediction | ||
# of Records (total) | Unlimited | 65,000 |
# of Records (training partition) | Unlimited | 10,000 |
# of Records (new data for scoring) | Unlimited | 65,000 |
# of Variables (output) | Unlimited | 50 |
# of Distinct classes (output variable) | Unlimited | 30 |
# of Distinct values (categorical input variables) | Unlimited | 15 |
k-Nearest Neighbor | ||
# of Nearest neighbors (k) | 50 | 10 |
Regression/Classification Trees | ||
# of Splits | Unlimited | 100 |
# of Nodes | Unlimited | 100 |
# of Levels | Unlimited | 100 |
# of Levels in Tree Drawing | Unlimited | 100 |
Ensemble Methods | ||
# of Weak learners | Unlimited | 10 |
Feature Selection | ||
# of Records | Unlimited | 10,000 |
# of Variables | Unlimited | 50 |
# of Distinct classes (output variable) | Unlimited | 30 |
# of Distinct values (input variables) | Unlimited | 100 |
Association Rules | ||
# of Transactions | Unlimited | 65,000 |
# of Distinct items | 5,000 | 100 |
Clustering | ||
k-Means | ||
# of Records | Unlimited | 10,000 |
# of Variables | Unlimited | 50 |
# of Clusters | Unlimited | 10 |
# of Iterations | Unlimited | 50 |
Hierarchical | ||
# of Records | Unlimited | 10,000 |
# of Variables | Unlimited | 50 |
# of Clusters in a Dendrogram | Unlimited | 10 |
Size of distance matrix | Unlimited | 1,000 x 1,000 |
Charts | ||
# of Records | Unlimited | 65,000 |
# of Variables (original data) | Unlimited | 100 |
General | ||
Model pane | Included | Included |
Big Data sampling/summarization | Included | Included |
Model storage and scoring | Included | Included |
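
To check ahead of time whether a dataset fits within these limits, the table can be encoded as a small lookup. The sketch below is illustrative only and is not part of Analytic Solver: the `SUBSET_LIMITS` dictionary, its key names, and the `check_limits` helper are hypothetical, and the numbers are copied from a few rows of the "Subsets" column above.

```python
# Hypothetical helper, not an Analytic Solver API.
# Values are copied from the "Subsets" column of the table (excerpt only).
SUBSET_LIMITS = {
    "partitioning": {"records": 65_000, "training_records": 10_000, "variables": 100},
    "kmeans": {"records": 10_000, "variables": 50, "clusters": 10, "iterations": 50},
    "time_series": {"records": 1_000},
}


def check_limits(feature, **sizes):
    """Return a list of violations; an empty list means the data fits."""
    limits = SUBSET_LIMITS[feature]
    problems = []
    for name, value in sizes.items():
        limit = limits.get(name)
        # Dimensions that are not listed for a feature are treated as unchecked.
        if limit is not None and value > limit:
            problems.append(f"{name}: {value:,} exceeds limit of {limit:,}")
    return problems


if __name__ == "__main__":
    # A 12,000-row dataset is too large for k-Means under the subset limits.
    for problem in check_limits("kmeans", records=12_000, variables=20):
        print(problem)
```

A value equal to the limit is treated as allowed here; that reading of the table is an assumption, so test at the boundary if it matters for your data.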