Articles

Bruce Ratner's book
Statistical Modeling and Analysis for Database Marketing:
Effective Techniques for Mining Big Data (2003)

NEW!
Statistical and Machine-Learning Data Mining: Techniques for Better Modeling and Analyzing Big Data (2012)
 



Top Articles: Solutions
 
 
Top Articles: Analytics
 

VOLUME 21 (2017)
 1.  Third Edition of My Book
 
VOLUME 20 (2016)
 12. Profile Analysis of Any Regression-based Model
 11. Opening the Dataset: A Twelve-Step Program for Dataholics 
 10. Opening the Dataset: Confession of a Dataholic 
 9. Market Segmentation: An Easy Way to Understand the Segments 
 8. The Statistical Golden Rule: Measuring the Art and Science of Statistical Practice 
 7. What is Your First Data Step? 
 6. Statisticians Have a Bad Habit 
 5. Power of Thought
 4. One Pound of Pennies: The Correlation Between the Mean Value of Pennies and the Skew of the Year of Mint  
 3. Stevens’ Four Scales of Measurement: The Addition of a New Scale 
 2. Apple and Orange Comparison: Statistically Fruitless or Fruitful? 
 1. Profile Analysis of Any Regression-based Model
 
VOLUME 19 (2015)
 10. Big Data, Schmea Data, It Still Boils Down to the Super Six Statistics 
 9. Book-Mash: Random Stacking of Statistics Books 
 8. A Glass of Water vs. A Can of Trash: What Say You, Half-Empty or Half-Full? 
 7. Wouldn’t It Be Nice to Have a Regression Technique that Builds the Best Model Possible Within an Allotted Time? 
 6. Life-Time Value Modeling of Big-ticket Items 
 5. My Statistics Floater: One-Sample Test for Two Mutually-Exclusive Proportions 
 4. Zero-Inflated Regression: Modeling a Distribution with a Mass at Zero 
 3. The Originative Regression Models: Are They too Old and Untenable? 
 2. Building a Multi-Level Classification Model to Simultaneously Maximize Decile Tables for Each Level, Not the Traditional Confusion Matrix
 1. Outperforming a Multi-Level Classification Model Whose Chance Performance is Large
 
VOLUME 18 (2014)
 14. Data Mining and the Golden Gut: Complementary, Supplementary or Mutually Exclusive? 
 13. Principal Component Analysis of Yesterday and Today
 12. The Uplift Model: Building a Database Model to Assess the True Impact of a Test Campaign 
 11. A Data Mining Method for Moderating Outliers, Instead of Discarding Them 
 10. The Originative Statistical Regression Models: Are They Too Old and Untenable? 
  9. The Predictive Model: Its Reliability and Validity 
 8. Accidental Statistician: Who Can Befitted of a Self-described Caption?  
 7. Life-Time Value Modeling of Big-ticket Items 
 6. Validating the Logistic Regression Model: Try Bootstrapping 
 5. Regression Modeling Involves Art, Science, and Poetry Too 
 4. Re-Data-Mining Your Constantly-updated Database: A Criterion for Doing So 
 3. What Criteria Do You Use to Build a Model that Maximizes the Cum Lift? 
 2. What Criteria Do You Use to Determine the Best Model? 
 1. Top Five Statistical Modeling Problems: Nonissues for the Machine-learning GenIQ Model 
 
VOLUME 17 (2013)
 10. Statistical vs. Machine-Learning Data Mining 
  9. CHAID-based Data Mining for Paired-Variable Assessment 
  8. The Missing Statistic in the Decile Table: The Confidence Interval 
  7. A Popular Statistical Term Coined with the Formula X's Y 
  6. "Few things are harder to put up with than the annoyance of a good (statistics) example" 
  5. The Importance of Straight Data: Simplicity and Desirability for Good Model Building Practice 
  4. Social Marketing Intelligence for Sweeping Improvement in Marketing Campaigns 
  3. The Paradox of Overfitting 
  2. Building a Database Model to Outperform a Test Campaign 
  1. Model Selection for Credit Card Profitable Approval 

VOLUME 16 (2012)
 10.  To Fit or Not to Fit Data to a Model
 9. Assessing the Predictiveness of a Classification Model: Traditional vs. Modern Methods 
 8. Two-by-Two Classification and Decile Tables - A Comparison 
 7. Survival of the Fittest: Who Coined It, and When?  
 6. Genetic vs. Statistic Regression - A Comparison 
 5. Your Customers are Talking: Are You Listening? 
 4. Is Not a Response-Model Tree a Response-Model Tree by Any Other Name? 
 3. Interpretation of Coefficient-free Models 
 2. Controlling Credit Risk: Building a Not-Yet Popular Forecasting Model 
 1. Social Network Analysis, Social Media Data, and Text Mining to Boost Business Intelligence 

VOLUME 15 (2011) 
 10. Predictive Modeling Using Real-time Data 
 9. Improve Marketing ROI: Predictive Analytics Using Real-time Data 
 8. Data Mining Quiz / Data Mining Quiz - II
 7. How Large a Sample is Required to Build a Database Response Model? 
 6. A Customer Intelligence Model: A New Approach to Gain Customer Insight 
 5. How Does Spearman's Coefficient Relate to Pearson's Coefficient? 
 4. CHAID: Nine Inventive, Utile Applications Beyond Its Original Intent 
 3. Marketing Optimization: Regression-tree Approach for Outbound Campaigns 
 2.  Calculating the Average Correlation Coefficient: Why?
 1.  Data Mining: Illustration of the Pythagorean Theorem
 
VOLUME 14 (2010)
 10. Stepwise is a Problematic Method for Variable Selection in Regression: Alternative Methods are Available 
 9. Identifying Your Best Customers: Descriptive, Predictive and Look-Alike Profiling 
 8. Subprime Lender Short Term Loan Models for Credit Default and Exposure 
 7.    Given the Irrational Number Pi, are the Digits after the Decimal Point Random?
 6. Variable Selection Methods in Regression: Ignorable Problem, Outing Notable Solution 
 5. What If There Were No Significance Testing? 
 4. If you can think …, then I guarantee … not to waste your time. 
 3. Predicting the Quality of Your Statistical Regression Models 
 2. Confusion Matrix: Perhaps Confusing, but Definitely Biased 
 1. What is the GenIQ Model? 

VOLUME 13b (2009)
 10. Linear Probability, Logit, and Probit Models: How Do They Differ?  
 9. Given an Irrational Number, are the Digits after the Decimal Point Random?
 8. How To Bootstrap 
 7. HELP! I Need Somebody, Not Just Anybody ... 
 6. A Database Marketing Regression Model that Maximizes Cum Lift 
 5. A New Method of Modeling Missing Data: Deliverance of Discarded, Incomplete Cases 
 4. Predicting Share of Wallet without Survey Data 
 3. Do-It-Yourself Method for Finding the Square Root of 2
 2. Variable Selection Methods in Regression: Many Statisticians Know Them, But Few Know They Produce Poorly Performing Models
 1. Statistical Modelers and Data Miners: Variable Selection, Data Mining Paradigm, Optimal Decile Table, and more ... 

 VOLUME 13a (2009)
 10. Pythagoras: Everyone Knows His Famous Theorem, but Not Who Discovered It One Thousand Years before Him 
 9. A Trilogy of “Item” Biographies of Our Favorite Statisticians
 8. The GenIQ Model: Data-defined, Data Mining, Variable Selection, and Decile Optimization 
 7. GenIQ: A Visual Introduction  
 6. Genetic Data Mining: The Correlation Coefficient 
 5. Data Mining: An Ill-defined Concept 
 4. How to Make the Best Credit Score Even Better 
 3. Data Cleaning is Not Completed Until the “Noise” is Eliminated 
 2. Overfitting: Old Problem, New Solution 
 1. Statistical Modeling Problems: Nonissue for GenIQ 
 
VOLUME 12c (2008)
 10. The Correlation Coefficient: Its Values Range Between Plus/Minus 1, or Do They? 
 9. The Importance of Straight Data: For Simplicity, Desirable for Good Modeling 
 8. GenIQ-enhanced/Data-reused Regression 
 7. Different Data, Identical Regression Models: Which Model is Better? 
 6. Subprime Lender Short Term Loan Models for Credit Default and Exposure 
 5. Historical View of Three Regression Models 
 4. GenIQ-enhanced Regression Model 
 3. Statistical Terms: Who Coined Them, and When? 
 2. Credit Risk Modeling – A Machine Learning Approach 
 1. Finding Tax Cheaters Easily

 VOLUME 12b (2008)
 10. GenIQ: OLS Curve Fitter
 9. GenIQ: Nonlinear Curve Fitter 
 8. Fundraising Modeling: Competitive and Successful 
 7. Retail Revenue Optimization: Accounting for Profit-eating Markdowns 
 6. Extracting Nonlinear Dependencies: An Easy, Automatic Method 
 5. Radically Distinctive Without Equal Predictive Model 
 4. CRM Success with Data Mining 
 3. Gaining Insights from Your Data: A Neoteric Machine Learning Method 
 2. Data Mining Paradigm: Historical Perspective
 1. Data Mining for the Desktop 
 
VOLUME 12a (2008)
10.  Data Mining Using Genetic Programming 
 9. Analytical Model Development and Deployment 
 8. Nonprofit Modeling: Remaining Competitive and Successful
 7. Multiple Catalog Mail Campaigns: Who Gets Mailed Next, and Which Catalog Should It Be? 
 6.  Detecting Fraudulent Insurance Claims: A Machine Learning Approach
 5. Demand Forecasting for Retail: A Genetic Approach
 4. Optimizing Website Content via the Taguchi Method
 3. Risk Management for the Insurance Industry: A Machine Learning Approach
 2. The GenIQ Model: A Method that Lets the Data Specify the Model
 1. Quantile Regression: Model-free Approach

VOLUME 11c (2007)
10.  The Most Compelling Illustration of the GenIQ Model 
9.  The Genetic Programming Engine that Does: Data Specify the Model, Not Fit Data to a Model 
8.  Subprime Borrower Market: Building a Subprime Lender Scoring Model for a Homogeneous Segment 
7.  Interpreting Model Performance: Use the "Smart" Decile Analysis 
6.  Product Positioning: Predicting the Next Best Offer to Give Customers
5.  Marketing Optimization Model: A Genetic Approach 
4.  The GenIQ Model: FAQs
3.  Missing Value Analysis: A Machine-learning Approach 
2.  Retain Best Customers and Maximize their Potential: A CRM Machine-learning Approach 
1.  Gain of a Predictive Information Advantage: Data Mining via Evolution  

VOLUME 11b (2007)
10.  A 9-Step Computer Program for Analysts Who Want to Better Their Modeling 
9.  Retail Revenue Optimization: A Model-free Approach 
8.  Data Smoothing: An Application of CHAID 
7.  Tukey's Bulging Rule: Why Use It, and What to Do When It Fails  
6.  Logistic Regression: An Overview 
5.  Tukey's Bulging Rule for Straightening Data 
4.  “Dumb” Decile Analysis versus “Smart” Decile Analysis: Identifying Extreme Response Segments 
3.  Credit Scoring: A New Approach to Control Risk
2.  Market Segmentation: Defining Target Markets with CHAID 
1.  Predictive Analytics Now Accessible to Excel Spreadsheet Users: GenIQ Model Software with an Excel Toolbar
 
 
VOLUME 11a (2007)
10.  The "Primo" Data Mining Book
9.  Explaining Collaborative Filtering: An Openwork 
8.  The Correlation Coefficient: Definition 
7.  CHAID: Its Original Intent 
6.  Multivariate Regression Trees: An Alternative Method 
5.  Market Segment Classification Modeling with Machine Learning 
4.  Maximizing the Lift in Database Marketing
3.  Direct Response Marketing 
2.  Discrimination Between Alternative Binary Response Models
1.  An Alternative Response Model

 VOLUME 10f (2006)
10.  Workforce Optimization 
9.  Unconventional Thinking for Increasing Profits 
8.  Exploratory Data Analysis for Large and Complex Data 
7.  Financial Intelligence: Understanding Profit Drivers and Growing Profitability 
6.  CRM Segmentation for Targeted Marketing 
5.  CRM for the Publication Industry: Subscriber-Centric Targeted Market Modeling 
4.  CRM: Cross-Sell and Up-Sell to Improve Response Rates and Increase Revenue 
3.  Decile Analysis Primer: Cum Lift for Response Model 
2.  A Machine Learning Approach to Conjoint Analysis 
1.  The Banking Industry Problem-Solution: Reduce Costs, Increase Profits by Data Mining and Modeling
 
VOLUME 10e (2006)
 
10.  Latent Class Analysis and Modeling: A Pharmaceutical Case Study 
9.  Enhancing Model Performance
8.  Risk Analytics for Telecommunication  
7.  A Variable Selection Method that Provides a Unique Ranking of Variable Importance 
6.  Telecommunication Fraud Reduction: Analytical Approaches
5.  Optimizing Customer Loyalty 
4.  CHAID for Uncovering Relationships: A Data Mining Tool 
3.  Fraud Detection: Beyond the Rules-Based Approach
2.  Trigger Marketing: Predicting the Next Best Offer to Give Customers 
1.  Data Preparation: Never Drop Original Variables, Always Create Copies of Them 

VOLUME 10d (2006)
10.  A Unique Data Mining Tool for Direct Marketing 
9.  A Genetic Logistic Regression Model: A Model-free Approach to Identifying Responders to a CRM Solicitation 
8.  Assessing the Importance of Variables in Database Response Models  
7.  Expanding Your Statistical Computing Toolbox 
6.  When Statistical Model Performance is Poor: Try Something New, and Try It Again 
5.  Analysis and Modeling for Today's Data 
4.  Building a Database Zipcode Acquisition Model 
3.  A Phat Example of the GenIQ Model's Predictive Power  
2.  GenIQ-Parkinson's Law: The GenIQ Model Expands to Fill the Time Available for Model Completion 
1. When Data Are Too Large to Handle in the Memory of Your Computer 

VOLUME 10c (2006)
10.  Algorithmic Methods: Non-Statistical Methods Solving Statistical Problems 
9.  Using the GenIQ Model to Insure the Validation of a Model is Unbiased 
8.  Rare Event Sampling 
7.  Data Preparation for Determining Sample Size 
6.  Data Preparation for Big Data
5.  Generating a Random Sample of Alphabet Letters: Why? 
4.  The 80/20 Rule: Revised for Data Preparation 
3.  Response-Approval Model: An Effective Approach for Implementation
2.  Trend Extrapolation: Will the Trend Bend?
1.  Technical Report #12: Counting the Number of Records in a By-Group 

VOLUME 10b (2006)
10.  Modeling a Distribution with a Mass at Zero 
9.  A New Method of Modeling Missing Data: Deliverance of Discarded, Incomplete Cases
8.  A Genetic Model to Identify Titanic Survivors  
7.  Technical Report #11: Calculating Complete-case Analysis Sample Size
6. Technical Report #10: Counting Missing Values for Any Variable  
5.  Marketing Mix Model: A Genetic Approach 
4. Technical Report #9: Calculating the Average Correlation Coefficient of a Correlation Matrix
3.  Rethink The Regression Model: Think GenIQ Model 
2. Technical Report #8: Scoring An Oblique Principal Component  
1. Handling Qualitative Attributes: Upgrading Discrete Heritable Information

VOLUME 10a (2006)
 
10. Marketing Mix Model: Right Offer, Right Time, and Right Channel
9.  A Regression Tree Approach for Optimizing Price and Package Offerings
8.  Technical Report #7: Creating Time-on-File Variable
7. Model Selection Is A Problem
6. Customer-Value Based Segmentation: An Overview
5. A New Method for Collections & Recovery Models
4. Genetic Data Mining Method for the Proper Use of the Correlation Coefficient
3. Data Mining 101
2. Data Mining Paradigm
1. A Database Marketing Model for Zero-inflated Data

DM STAT-1 DIGEST G - GenIQ Model Cognate Articles

DM STAT-1 DIGEST I - Data Mining and Its Applications

DM STAT-1 DIGEST II - CRM Applications

DM STAT-1 DIGEST III - Logistic Regression and Related Issues


DM STAT-1 DIGEST IV - Data Prep, Missing Data, Data Cleaning, Sampling, etc.


DM STAT-1 DIGEST V - Novel Uses of CHAID

DM STAT-1 DIGEST VI - Useful SAS Programs

DM STAT-1 DIGEST VII - Common Problems/Proper Solutions


DM STAT-1 DIGEST VIII - Market Segmentation


VOLUME 9b (2005)
5.  A Genetic Jackknife Method: 3-in-1 Tool for Variable Selection, Data Mining and Model Building
4. Identifying Your Best Customers: Descriptive, Predictive and Look-Alike Profiling
3. A Very Automatic Coding of Dummy Variables
2. A Simple Data Cleaning Method for Boosting the Reliability and Performance of Database Models
1. Automatic Coding of Dummy Variables

VOLUME 9a (2005)
6. Contact Center Analytics: Driving Costs Down and Revenue Up
5. A Better Method for Building a High-value Customer Model
4. Technical Report #5: Collapsing Multiple Observations For An Individual Into A Single Observation
3. Model Selection by Means of Natural Selection
2. An Advanced Analytic Approach for Increasing the Value of Customer Retention
1. High Performance Computing for Discovering Interesting and Previously Unknown Information in Direct Marketing Data

VOLUME 8b (2004)
6. Sensitivity Analysis for Database Marketing Models
5. A Model-free Approach to Conjoint Analysis for Optimizing Price and Package Offerings
4. A Simple Bootstrap Variable Selection Method for Building Database Marketing Models
3. A Very Automatic Coding of Dummy Variables
2. Determining Which Variables in a Model Are Its Most Important Predictors: The Predictive Contribution Coefficient
1. "How Large a Sample is Required to Build a Database Response Model?"
 
VOLUME 8a (2004)
8. A Hybrid Statistics-Machine Learning Paradigm for Database Response Modeling
7. Statistics versus Machine Learning: A Significant Difference for Database Response Modeling
   
   
6. Building a CRM Model for Identifying Profitable Leads: The Genetic Contact-Profit Model
5. A New Technique for B-to-B Lead Generation: The Genetic Contact-Conversion Model
   
4. A New CRM Method for Generating Successful Leads: The Genetic Contact-Conversion Model
   
3. Building A Database Response Model for Categorical Data
2. A New Jackknife Method: 3-in-1 Tool for Variable Selection, Data Mining and Model Building
1. A New CRM Method for Identifying High-value Responders

VOLUME 7b (2003)
8. A New Data Mining Method for Identifying Extreme Response Segments
7. The Best-of-Generation Database Model: The GenIQ Model
6. A New Method of Decile Analysis Optimization for Database Models
5. A Genetic Approach to Building a Database Marketing Censored Regression Model
4. A Genetic Imputation Method for Database Modeling
3. A New Method for Including Qualitative Information in Database Models
2. Data Mining for Predictive Value of Discarded Individuals with Missing Data
1. A Non-Imputation Methodology for Database Modeling with Missing Data

VOLUME 7a (2003)
 
7. Sample Balancing for Extremely Small Population Response Rates
6. Sample Balancing for Database Response Models
   
5. The Working Concepts for Building a Database Acquisition Model
4. The Working Concepts for Building a Database Retention Model
3. The Working Concepts for Building a Database Attrition Model
   
2. A Simple Method for Assessing Linear Trend and Seasonality Components in Database Models
1. A Simple Data Cleaning Method for Boosting the Reliability and Performance of Database Models

VOLUME 6 (2002)
4. Interpretation of Coefficient-free Models (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)
3. Visualization of Database Models (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)
2. Quasi-MAID: An Alternative Method for Multivariate Regression (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)
1. A Simple Data Mining Method for Variable Assessment (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)

VOLUME 5 (2001)
4. Rapid Statistical Calculations for Determining the Success of Marketing Campaigns (also will appear in Journal of Targeting, Measurement and Analysis for Marketing, 2002)
3. Technical Report #4: Building and Scoring A Logistic Regression Model (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)
2. Technical Report #3: Creating A Bootstrap Sample (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)
1. The Importance of the Regression Coefficient (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)

VOLUME 4 (2000)
4. A Comparison of Two Popular Machine Learning Methods: Common Pitfalls (also will appear in Journal of Targeting, Measurement and Analysis for Marketing, 2001)
3. Technical Report #2: Scoring A Principal Component (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)
2. Finding the Best Variables for Direct Marketing Models (also will appear in Journal of Targeting, Measurement and Analysis for Marketing, 2000)
1. CHAID As a Method for Filling In Missing Values (also will appear in Journal of Targeting, Measurement and Analysis for Marketing, 2000)

VOLUME 3 (1999)
4. Genetic Modeling in Direct Marketing (appears in Journal of Research Council of Direct Marketing Association, 1999)
3. Technical Report #1: Automatic Coding of Dummy Variables (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)
2. CHAID for Specifying a Model with Interaction Variables (appears in Journal of Targeting, Measurement and Analysis for Marketing, 1999)
1. Identifying Your Best Customers: Descriptive, Predictive and Look-Alike Profiling (appears in Journal of Targeting, Measurement and Analysis for Marketing, 1999)

VOLUME 2 (1998)
4. Profile Curves: A Method of Multivariate Comparison of Groups (appears in Journal of Research Council of Direct Marketing Association, 1999)
3. What Do My Customers Look Like? Look At The Stars! (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)
2. Alternative Direct Marketing Response Models: Linear Probability, Logit And Probit Models (appears in Journal of Targeting, Measurement and Analysis for Marketing, Volume Seven, Number 3, 1999)
1. Assessment of Direct Marketing Response Models (appears in Journal of Targeting, Measurement and Analysis for Marketing, Volume Seven, Number 1, 1998)

VOLUME 1 (1997)
5. Market Segment Classification Modelling with Logistic Regression (appears in Journal of Targeting, Measurement and Analysis for Marketing, Volume Seven, Number 4, 1999)
4. Direct Marketing Models Using Genetic Algorithms (appears in Journal of Targeting, Measurement and Analysis for Marketing, Volume Six, Number 4, 1998)
3. Bootstrapping In Direct Marketing: A New Approach for Validating Response Models (appears in Journal of Targeting, Measurement and Analysis for Marketing, Volume Six, Number 2, 1997)
2. CHAID For Interpreting A Logistic Regression Model (appears in Journal of Targeting, Measurement and Analysis for Marketing, Volume Six, Number 3, 1998)
1. A New Modelling Technique for Maximizing Profits from Solicitations (appears in Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data)