Overview
Collections of data can be divided into two fundamental groups: categorical and numerical.
Categorical data are qualitative (i.e., text) values. Examples include regions or product types. Categorical data can be further divided into:
- Ordinal data, where categories can be ordered in a certain way (e.g., education levels like "High School," "Bachelor's Degree," and "Master's Degree").
- Nominal data, where categories have no inherent order (e.g., colors or product types).
Numerical data are quantitative (i.e., numerical or temporal) values. These are measurable quantities like temperature, age, income, or year. Numerical data can be further divided into:
- Continuous data, where values can take any number within a range and are often represented on a continuous scale (e.g., height, weight, or time).
- Discrete data, where values are distinct and countable, often expressed as whole numbers (e.g., number of children).
- Binned data, where values are grouped into intervals (e.g., grouping ages into 0-10, 11-20, etc.).
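As an illustration of binning, the following minimal Python sketch groups whole-number ages into the 0-10, 11-20, ... intervals used in the example above. The function name `bin_label` and the `width` parameter are hypothetical and are not part of Mappica.

```python
def bin_label(age, width=10):
    """Return the label of the interval containing a whole-number age.

    Intervals follow the 0-10, 11-20, 21-30, ... pattern: the first bin
    holds 0..width, and each later bin holds `width` values.
    (Hypothetical helper for illustration only.)
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    if age <= width:
        return f"0-{width}"
    index = (age - 1) // width      # 11..20 -> 1, 21..30 -> 2, ...
    low = index * width + 1
    high = (index + 1) * width
    return f"{low}-{high}"


ages = [3, 10, 11, 20, 21, 47]
labels = [bin_label(a) for a in ages]
print(labels)
```

Once values are replaced by their interval labels, they behave like ordered categories rather than individual measurements.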
In Mappica, text fields are treated as categorical data when used to construct axes and color ramps. Number, percent, currency, and date fields are always treated as continuous data.
Implications for Color Assignment
When you select a chart, map, or table's Color Field, color assignment is handled as follows:
- If you choose a categorical field (i.e., text data type), you can assign a distinct color to each unique value. These are "Categorical Colors."
- If you choose a numerical field (i.e., number, percent, currency, or date data type), you can set up a Color Gradient. These are "Continuous Colors."
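The rule above can be sketched in Python as a lookup from a field's data type to a color mode, plus a simple assignment of one color per unique value. This is an illustrative sketch, not Mappica's implementation: the names `color_mode` and `assign_categorical_colors`, the lowercase type strings, and the idea of cycling through a palette are all assumptions.

```python
# Data types as described in the text; the string spellings are assumed.
CATEGORICAL_TYPES = {"text"}
CONTINUOUS_TYPES = {"number", "percent", "currency", "date"}


def color_mode(data_type):
    """Return 'categorical' or 'continuous' for a field's data type."""
    kind = data_type.lower()
    if kind in CATEGORICAL_TYPES:
        return "categorical"
    if kind in CONTINUOUS_TYPES:
        return "continuous"
    raise ValueError(f"unknown data type: {data_type!r}")


def assign_categorical_colors(values, palette):
    """Give each unique value its own color, in first-seen order.

    Cycling the palette when values outnumber colors is an assumption
    made for this sketch.
    """
    unique = list(dict.fromkeys(values))  # de-duplicate, keep order
    return {v: palette[i % len(palette)] for i, v in enumerate(unique)}


regions = ["North", "South", "North", "East"]
colors = assign_categorical_colors(regions, ["#1f77b4", "#ff7f0e", "#2ca02c"])
print(color_mode("text"), colors)
```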
When you select continuous colors, you can also select a Color Scale. The options available are "Linear", "Discrete", "Square Root", "Logarithmic", or "Bi-symmetric Logarithmic":
- The Linear scale maps values evenly across the color gradient. This is the default scale, and works well for evenly distributed data.
- The Discrete scale divides the color gradient into bins, assigning each bin a distinct color. If you are using a predefined gradient, you will be able to choose the total number of Color Bins. If you have created a custom gradient, a bin will be created for each color you add.
- The Square Root scale applies a square root transformation, enhancing the visibility of smaller values while moderately compressing higher ones. It can be suitable for dataset fields with dense low values and/or high outliers, and provides a less extreme adjustment compared to the Logarithmic scale.
- The Logarithmic scale uses a logarithmic transformation, which strongly compresses high values and makes differences among smaller values more apparent. This can work well for dataset fields spanning multiple orders of magnitude but is unsuitable for fields containing zero or negative values, for which the logarithm is undefined.
- The Bi-symmetric Logarithmic scale handles both positive and negative values symmetrically, applying a logarithmic transformation away from zero while keeping values near zero linear. This can be helpful for dataset fields with wide-ranging positive and negative numbers.
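To make the differences between these scales concrete, the sketch below maps a value to a position between 0 and 1 along a gradient under each transformation. It is a rough illustration under stated assumptions, not Mappica's actual code: the function names are hypothetical, the bi-symmetric logarithm is approximated as sign(x) * log(1 + |x|), and the discrete scale is assumed to use equal-width bins.

```python
import math


def _symlog(x):
    """Sign-preserving log: near-linear around zero, logarithmic far away.

    Assumed form sign(x) * log(1 + |x|); the real scale may differ.
    """
    return math.copysign(math.log1p(abs(x)), x)


TRANSFORMS = {
    "linear": lambda x: x,
    "sqrt": math.sqrt,
    "log": math.log10,
    "symlog": _symlog,
}


def scale_position(value, lo, hi, scale="linear"):
    """Map value in [lo, hi] to a position in [0, 1] along the gradient."""
    if scale == "log" and lo <= 0:
        raise ValueError("logarithmic scale needs strictly positive values")
    if scale == "sqrt" and lo < 0:
        raise ValueError("square root scale needs non-negative values")
    f = TRANSFORMS[scale]
    value = min(hi, max(lo, value))      # clamp to the data range
    t_lo, t_hi = f(lo), f(hi)
    if t_hi == t_lo:
        return 0.0                       # degenerate range
    return (f(value) - t_lo) / (t_hi - t_lo)


def discrete_bin(value, lo, hi, bins):
    """Return the index (0..bins-1) of the equal-width bin holding value."""
    t = scale_position(value, lo, hi, "linear")
    return min(bins - 1, int(t * bins))


# The same value lands at very different gradient positions per scale.
for name in ("linear", "sqrt", "log"):
    print(name, round(scale_position(100, 1, 10_000, name), 3))
print("symlog", scale_position(0, -1000, 1000, "symlog"))
print("bin", discrete_bin(75, 0, 100, 4))
```

For a field ranging from 1 to 10,000, the value 100 sits near the very start of a linear gradient, somewhat further along under the square root scale, and at the midpoint under the logarithmic scale, which is why the latter two reveal more variation among small values.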
The three buttons above this choropleth map demonstrate different ways of incorporating colors. In the "Linear" and "Discrete" examples, a numerical field has been selected as the Color Field, with the color scale set to "Linear" or "Discrete", respectively. In the "Categorical" example, a text field has been selected as the Color Field, resulting in categorical color assignments.