Data Visualization

Color Theory

Shape is less powerful than color.

Edward Tufte's principles:

  1. Chartjunk
  2. Data-ink and the data-ink ratio
  3. Importance of substance over design

Chartjunk

Reduce junk in a graph: remove decoration that does not convey data.

Data-ink

A large share of ink on a graphic should present data-information, the ink changing as the data change. Data-ink is the non-erasable core of a graphic, the non-redundant ink arranged in response to variation in the numbers represented.

Data-ink ratio: data-ink / total ink used to print the graphic.
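The ratio above can be sketched as a small function. This is only an illustration: the function name `data_ink_ratio` and the idea of measuring ink as pixel counts are assumptions, not something the notes specify.

```python
def data_ink_ratio(data_ink, total_ink):
    """Return data-ink divided by total ink.

    Both arguments are in the same (hypothetical) unit, e.g. pixel counts.
    """
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Example: 30 of the 40 inked units in a chart encode data.
ratio = data_ink_ratio(30, 40)
```

A ratio closer to 1 means more of the ink is carrying data.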

8 Gestalt principles

  1. Law of proximity: objects placed close together are perceived as a group
  2. Law of similarity
  3. Law of closure: our perception fills gaps
  4. Law of symmetry
  5. Law of common fate: points moving in the same direction are perceived as a group
  6. Law of continuity
  7. Law of good gestalt: elements tend to be perceived as grouped together if they form a pattern that is regular, simple or ordered
  8. Law of past experience: under certain circumstances, visual stimuli are categorized according to past experience

Types of diagrams

  1. Timeline diagrams
  2. Template
  3. Flowchart
  4. Checklist
  5. Mindmap

Line charts are commonly preferred for time series visualization. Bar plots are used for comparing totals across several groups. Stacked plots are an extension of bar plots, for working with several categories within the same group. For example, deaths per country in 2020-2021 can be split by cause, with COVID as one category among several.

A box plot displays the minimum, the 25th percentile, the median, the 75th percentile and the maximum. It is useful for showing the spread of data and for drawing inferences. Scatter plots help reveal associations between two variables. Decision trees are good for representing logical classifications. Histograms are used to plot quantitative data, grouped into bins or intervals. A histogram shows the distribution of a variable, whereas a bar chart compares variables.
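As a rough sketch of the numbers behind two of these charts, the snippet below computes the five values a box plot displays and the bin counts a histogram draws. The helper names and sample data are made up for illustration, and the "inclusive" quantile method is one of several conventions; plotting libraries may use a different one.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def histogram_counts(values, bin_width, start):
    """Count values per bin; each bin is [left, left + bin_width)."""
    counts = {}
    for v in values:
        k = int((v - start) // bin_width)  # index of the bin containing v
        left = start + k * bin_width       # left edge of that bin
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(sample)
bins = histogram_counts(sample, bin_width=3, start=1)
```

The histogram keys are bin edges on a continuous axis, which is what distinguishes it from a bar chart keyed by category.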

D3.js is very useful for building these charts. Some key points are:

  1. Simplicity
  2. Selection
  3. Filtering
  4. Brushing and Linking
  5. Zooming and scaling
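Brushing and linking can be sketched independently of any charting library: a range selected (brushed) in one view yields a set of record ids, and a second view highlights the same records. The record fields and function names below are hypothetical, and this is plain Python rather than D3.js code.

```python
# Hypothetical records shared by two views of the same data.
records = [
    {"id": 1, "x": 1.0, "y": 10.0},
    {"id": 2, "x": 2.5, "y": 7.0},
    {"id": 3, "x": 3.0, "y": 12.0},
    {"id": 4, "x": 5.0, "y": 4.0},
]


def brush(rows, x_min, x_max):
    """Brushing: return ids of rows whose x lies in the selected range."""
    return {r["id"] for r in rows if x_min <= r["x"] <= x_max}


def linked_highlight(rows, selected_ids):
    """Linking: for another view, flag each row as highlighted or not."""
    return [(r["id"], r["id"] in selected_ids) for r in rows]


selected = brush(records, 2.0, 3.5)
highlights = linked_highlight(records, selected)
```

Filtering follows the same pattern, except that unselected rows are dropped from the view instead of being drawn without a highlight.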

Interactive Data Visualization

TIBCO Spotfire

  1. Parallel coordinate plot.
  2. Treemap is useful for 2D visualizations of large hierarchical data.

Trifecta. Ranking of visual encodings (in order of decreasing intuitive accuracy):

  1. Position on a common scale, then position on non-aligned scales
  2. Length
  3. Slope
  4. Angle
  5. Area
  6. Volume
  7. Color
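One way to use this ranking is as a lookup when choosing how to encode a variable. The sketch below assumes the ordering listed above; the string labels and the `best_encoding` helper are invented for illustration.

```python
# Ranking as listed above, most accurate first.
ENCODING_RANK = [
    "position_common_scale",
    "position_non_aligned_scale",
    "length",
    "slope",
    "angle",
    "area",
    "volume",
    "color",
]


def best_encoding(available):
    """Return the highest-ranked encoding among those available, or None."""
    for name in ENCODING_RANK:
        if name in available:
            return name
    return None


# Example: a chart type that only offers color, area and length.
choice = best_encoding({"color", "area", "length"})
```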

Data Visualization tools

  1. TIBCO Spotfire
  2. Trifecta
  3. Qlik
  4. Tableau
  5. Microsoft Power BI
  6. Alteryx
  7. SAS
  8. SAP
  9. Sisense
  10. MicroStrategy
  11. Salesforce
  12. Datawatch
  13. Zoomdata

Other tools include D3.js, R charts (ggplot2 package), Pentaho, SAP Lumira, TIBCO Spotfire, QlikView, JasperSoft, and MicroStrategy.
