DM Stat-1 Articles

Tukey's Bulging Rule for Straightening Data
Bruce Ratner, Ph.D.

"A very effective and simple technique for straightening data is re-expressing the variables, which uses Tukey’s Ladder of Powers and the Bulging Rule. Before presenting the details of the technique, it is worth discussing the importance of straight-line relationships or straight data."
- Ratner, B., Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data, CRC Press, Boca Raton, 2006. The following is an excerpt from Chapter 3, pages 39-41.

3.5.2 Bulging Rule

The Bulging Rule states the following:

  1. If the data have a shape similar to that shown in the first quadrant, then the data analyst tries re-expressing by going up-ladder for X, Y, or both.
  2. If the data have a shape similar to that shown in the second quadrant, then the data analyst tries re-expressing by going down-ladder for X and/or up-ladder for Y.
  3. If the data have a shape similar to that shown in the third quadrant, then the data analyst tries re-expressing by going down-ladder for X, Y, or both.
  4. If the data have a shape similar to that shown in the fourth quadrant, then the data analyst tries re-expressing by going up-ladder for X and/or down-ladder for Y.
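
The four cases above can be sketched as a small lookup table plus a search over ladder rungs. This is only an illustrative sketch, not code from the book: the names `LADDER`, `BULGING_RULE`, `reexpress`, `candidate_powers`, `pearson_r` and `straighten_x` are hypothetical, the particular rungs listed are one common choice for the Ladder of Powers (with power 0 taken as the logarithm), and using the absolute correlation coefficient as a measure of straightness is an assumption made here for convenience.

```python
import math

# Assumed rungs of the Ladder of Powers, lowest to highest (0 means log).
LADDER = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]

# Direction to try for (X, Y) in each quadrant, as stated in the rule.
BULGING_RULE = {
    1: ("up", "up"),
    2: ("down", "up"),
    3: ("down", "down"),
    4: ("up", "down"),
}


def reexpress(values, power):
    """Re-express positive values with a ladder power (0 means log)."""
    if power == 0:
        return [math.log(v) for v in values]
    return [v ** power for v in values]


def candidate_powers(direction):
    """Rungs above or below the raw data (power 1), nearest rung first."""
    if direction == "up":
        return [p for p in LADDER if p > 1]
    return [p for p in reversed(LADDER) if p < 1]


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def straighten_x(xs, ys, quadrant):
    """Try the rule's direction for X only; return (power, |r|) of the best rung."""
    x_direction, _ = BULGING_RULE[quadrant]
    best = (1, abs(pearson_r(xs, ys)))  # start from the raw data
    for power in candidate_powers(x_direction):
        r = abs(pearson_r(reexpress(xs, power), ys))
        if r > best[1]:
            best = (power, r)
    return best


# Toy data: Y rises quickly and then flattens, a second-quadrant bulge.
xs = [float(v) for v in range(1, 11)]
ys = [v ** 0.5 for v in xs]
print(straighten_x(xs, ys, quadrant=2))
```

The sketch re-expresses X only; the same loop could be repeated for Y using the second element of each `BULGING_RULE` entry. Note that negative powers reverse the order of the values, which is why the absolute value of the correlation is compared.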

Re-expressing is an important, yet fallible, part of EDA detective work. While it typically straightens the data, it can also degrade the information in them. Here is why: going too far down-ladder can squeeze the data so much that their values become indistinguishable, resulting in a loss of information. Going too far up-ladder can pull the data apart so much that the new far-apart values lie within an artificial range, resulting in a spurious gain of information. ... An excellent real-case illustration follows (pages 41-50 in the book).
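
The squeezing and stretching effects can be seen on a toy example. This is not the book's real-case illustration; the data values, the helper names `distinct_after_rounding` and `spread_ratio`, and the choice of two decimal places as the recording precision are all assumptions made for the sketch.

```python
def distinct_after_rounding(values, decimals=2):
    """Count how many values remain distinguishable at a fixed precision."""
    return len({round(v, decimals) for v in values})


def spread_ratio(values):
    """Ratio of the largest to the smallest value (positive data only)."""
    return max(values) / min(values)


raw = [10.0, 20.0, 50.0, 100.0]
far_down = [v ** -2 for v in raw]  # far down-ladder: reciprocal square
far_up = [v ** 3 for v in raw]     # far up-ladder: cube

# Down-ladder squeezes the values together at the assumed precision.
print(distinct_after_rounding(raw), distinct_after_rounding(far_down))

# Up-ladder stretches the range far beyond that of the raw data.
print(spread_ratio(raw), spread_ratio(far_up))
```

At the assumed precision, several of the re-expressed values collapse onto the same number after going far down-ladder, while going far up-ladder inflates the ratio between the largest and smallest values.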


Related Articles:
0. Tukey's Bulging Rule: What to Do When It Fails
1. The Correlation Coefficient: Definition
2. When Data Are Not Straight
3. Data Mining and Its Applications



For more information about this article, call Bruce Ratner at 516.791.3544,
1 800 DM STAT-1, or e-mail at br@dmstat1.com.