Duration: 40 hrs
Fee: 15,000 INR
IBM SPSS Statistics is a leading desktop statistical analytics and reporting tool. SPSS can read data from almost any type of file and use it to generate tabulated reports, charts, plots of distributions and trends, and descriptive statistics, and to conduct complex statistical analyses.
SPSS is among the most widely used statistical programs in almost every industry, including telecommunications, banking, finance, insurance, healthcare, manufacturing, retail, consumer packaged goods, higher education, government, and market research.
To discuss any of these capabilities further, please contact us.
Syllabus Content:
Business Understanding
- CRISP-DM process methodology
- Identifying business objectives
- Translating business objectives to data mining goals
Data Understanding
- Reading data from various sources – Source Nodes
  - Database
  - Excel
  - Text
- Using data visualization – Graph Nodes
  - Distribution node
  - Histogram node
  - Web node
  - Graphboard node
- Understanding distributions and summary statistics
  - Statistics node
  - Data Audit node
- Identifying data quality issues
  - Type node/tab
  - Data Audit node/output
- Identifying and understanding outliers
  - Data Audit node/output
- Identifying anomalies – Anomaly node
- Understanding relationships among variables
  - Matrix node
  - Means node
Data Preparation
- Using the Merge and Append nodes to combine datasets
- Deriving new fields – Field Ops Palette Nodes
  - Working with numeric fields
  - Working with string fields
  - Working with date and time fields
- Binning range fields
- Reclassifying categorical fields
- Aggregating and restructuring datasets
- Using the Select node
- Sampling and Balancing datasets
- Methods for reducing the dimensionality of the dataset
  - Factor/PCA node
  - Feature Selection node
- Understanding SQL pushback
- Understanding use of data caching
- Methods for Missing Value Replacement
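
As a rough illustration of one missing value replacement method from the list above, the sketch below fills gaps in a numeric field with the mean of the observed values. It is plain Python rather than SPSS itself, and the function name `fill_missing_with_mean` and the sample values are hypothetical; mean replacement is only one of several possible strategies.

```python
from statistics import mean

def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed (non-missing) values.

    Assumes at least one value is present; hypothetical helper for illustration.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical numeric field with two missing entries.
ages = [10, None, 20, 30, None]
print(fill_missing_with_mean(ages))
```

Mean replacement keeps the field average unchanged but shrinks its variance, which is one reason other replacement methods are often considered alongside it.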
Modeling
- Partitioning the dataset
- Understanding which models to use for categorical (set) or binary outcomes
- Understanding which models to use for numeric outcomes
- Understanding model types and basic operations
  - Association Models
  - Predictive Models
  - Clustering Models
- Combining models using the Ensemble node
- Auto modeling nodes
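
To make the idea of partitioning concrete, here is a minimal sketch in plain Python (not SPSS itself) that randomly assigns records to a training and a testing sample. The function name `partition`, the 70/30 split, and the seed value are assumptions chosen for illustration.

```python
import random

def partition(records, train_fraction=0.7, seed=42):
    """Randomly assign each record to a training or testing sample.

    A fixed seed makes the assignment repeatable between runs.
    """
    rng = random.Random(seed)
    train, test = [], []
    for rec in records:
        # Each record independently lands in training with probability train_fraction.
        (train if rng.random() < train_fraction else test).append(rec)
    return train, test

# Hypothetical record IDs standing in for rows of a dataset.
train, test = partition(list(range(100)))
print(len(train), len(test))
```

Because each record is assigned independently, the realized split is only approximately 70/30; models are then built on the training sample and checked against the testing sample.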
Evaluation of Results
- Using the Analysis node
- Producing and interpreting Evaluation charts
  - Lift charts
  - Gains charts
- Using data visualizations (charts) and classification tables to interpret model results
- Interpreting Generated Model Nuggets
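
As a rough guide to what the gains and lift charts listed above summarize, the sketch below computes cumulative gain and cumulative lift per quantile in plain Python (not SPSS itself). The function name `gains_and_lift`, the choice of four quantiles, and the sample data are hypothetical, and the code assumes at least one positive outcome is present.

```python
def gains_and_lift(actuals, scores, n_bins=4):
    """Return one (cumulative gain %, cumulative lift) pair per quantile.

    actuals: 1 for a hit (positive outcome), 0 otherwise.
    scores:  model score or propensity for each record.
    """
    # Rank records from highest to lowest model score.
    ranked = [a for _, a in sorted(zip(scores, actuals),
                                   key=lambda pair: pair[0], reverse=True)]
    total_hits = sum(ranked)
    n = len(ranked)
    results = []
    for b in range(1, n_bins + 1):
        cutoff = round(n * b / n_bins)   # records included up to this quantile
        hits = sum(ranked[:cutoff])      # hits captured so far
        gain = hits / total_hits         # share of all hits captured
        lift = gain / (cutoff / n)       # gain relative to random selection
        results.append((round(100 * gain, 1), round(lift, 2)))
    return results

# Hypothetical outcomes and scores for eight records.
actuals = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
print(gains_and_lift(actuals, scores))
# → [(50.0, 2.0), (75.0, 1.5), (75.0, 1.0), (100.0, 1.0)]
```

In this example the top quarter of records by score captures half of all hits, a lift of 2.0 over random selection; lift falls toward 1.0 as more of the dataset is included.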
Deployment of Results
- Using the Export Nodes
- Scoring new data using generated models
- Understanding monitoring of deployed models
Please contact us to discuss your requirements further.