IBM SPSS Modeler allows users to quickly and efficiently use predictive analytics and gain insights from their data. With almost 25 years of history, Modeler is the most established and comprehensive data mining workbench available. Since it is popular in corporate settings, widely available in university settings, and highly compatible with all the latest technologies, it is the perfect way to start your Data Science and Machine Learning journey.
This book takes a detailed, step-by-step approach to introducing data mining using the de facto standard process, CRISP-DM, and Modeler’s easy-to-learn “visual programming” style. You will learn how to read data into Modeler, assess data quality, prepare your data for modeling, find interesting patterns and relationships within your data, and export your predictions. Using a single case study throughout, this intentionally short and focused book sticks to the essentials. The authors have drawn upon their decades of teaching thousands of new users to choose those aspects of Modeler that you should learn first, so that you get off to a good start using proven best practices.
This book provides an overview of various popular data modeling techniques and presents a detailed case study of how to use CHAID, a decision tree model. Assessing a model’s performance is as important as building it, and this book will also show you how to do that. Finally, you will see how you can score new data and export your predictions. By the end of this book, you will have a firm understanding of the basics of data mining and how to effectively use Modeler to build predictive models.
The Data Preparation phase covers all activities to construct the final dataset (the data that will be fed into the modeling tool(s)) from the initial raw data. Data Preparation is often described as the most labor-intensive phase for the data analyst. It is terribly important that Data Preparation is done well, and a substantial amount of this book is dedicated to it. We cover cleaning, selecting, integrating, and constructing data in Chapter 5, Cleaning and Selecting Data; Chapter 6, Combining Data Files; and Chapter 7, Deriving New Fields, respectively. However, a book dedicated to the basics of data mining can really only start you on your journey when it comes to Data Preparation, since there are so many ways in which you can improve and prepare data. When you are ready for a more advanced treatment of this topic, there are two resources that go into Data Preparation in much more depth, and both have extensive Modeler software examples: The IBM SPSS Modeler Cookbook (Packt Publishing) and Effective Data Preparation (Cambridge University Press).
The Modeling phase is probably what you expect it to be: the phase where the modeling algorithms move to the forefront. In many ways, this is the easiest phase, as the algorithms do a lot of the work if you have done an excellent job on the prior phases and you've done a good job translating the business problem into a data mining problem. Despite the fact that the algorithms are doing the heavy lifting in this phase, it is generally considered the most intimidating, and it is understandable why. There are an overwhelming number of algorithms to choose from. Even in a well-curated workbench such as Modeler, there are dozens of choices. Open source options such as R have hundreds of choices. While this book is not an algorithms guide, and even though it is impossible to offer a chapter on each algorithm, Chapter 9, Introduction to Modeling Options in IBM SPSS Modeler, should be very helpful in understanding, at a high level, what options are available in Modeler. Also, in Chapter 10, Decision Tree Models, we go through a thorough demonstration of one modeling technique, decision trees, to orient you to modeling in Modeler.
Creation of the model is generally not the end of the project. Depending on the requirements, the Deployment phase can be as simple as generating a report or as complex as implementing a repeatable data mining process.