Binning or discretization is the process of transforming numerical variables
into categorical counterparts.
An example is to bin values for Age into categories such as 20-39, 40-59, and
60-79. Modeling methods based on frequency tables (e.g., decision trees)
usually discretize numerical variables. Moreover, binning
may improve the accuracy of predictive models by reducing noise or
non-linearity. Finally, binning allows easy identification of outliers and of
invalid or missing values of numerical variables.
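
As a minimal sketch of the Age example, the snippet below maps each value to one of the three categories and labels anything missing or outside the covered range. The names `EDGES`, `LABELS`, and `bin_age`, the sample ages, and the `"missing"` / `"out_of_range"` labels are illustrative assumptions, not part of any particular library.

```python
from bisect import bisect_right

# Hypothetical bin edges for the Age example: [20, 40), [40, 60), [60, 80).
EDGES = [20, 40, 60, 80]
LABELS = ["20-39", "40-59", "60-79"]


def bin_age(age):
    """Return the category label for an age, flagging missing or out-of-range values."""
    if age is None:
        return "missing"
    if age < EDGES[0] or age >= EDGES[-1]:
        return "out_of_range"
    # bisect_right gives the index of the first edge strictly greater than age,
    # so subtracting 1 yields the index of the bin that contains it.
    return LABELS[bisect_right(EDGES, age) - 1]


# Made-up sample values, including a missing one and two implausible ones.
ages = [23, 39, 40, 61, None, 150, -5]
binned = [bin_age(a) for a in ages]
print(binned)
# ['20-39', '20-39', '40-59', '60-79', 'missing', 'out_of_range', 'out_of_range']
```

Giving missing and out-of-range values their own labels, rather than dropping them, is one way to make them easy to spot once the variable has been binned.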