Data Visualization
Color Theory
Shape is a less powerful visual cue than color.
Edward Tufte's principles:
- Chartjunk
- Data-ink and the data-ink ratio
- Importance of substance over design
Chartjunk
Reduce junk in a graph: remove decoration and other ink that carries no data.
Data-ink
A large share of ink on a graphic should present data-information, the ink changing as the data change. Data-ink is the non-erasable core of a graphic, the non-redundant ink arranged in response to variation in the numbers represented.
Data-ink ratio: data-ink / total ink used to print the graphic.
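As a rough illustration of the ratio, the sketch below computes it for a hypothetical chart before and after erasing non-data ink. The function name `data_ink_ratio` and the ink amounts are assumptions made for the example; in practice "ink" might be approximated by pixel counts or mark areas.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Return data-ink divided by the total ink used to print the graphic."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical ink amounts (arbitrary units), chosen only for illustration.
bars = 30.0          # ink that encodes the data itself
decoration = 70.0    # gridlines, borders, background shading

before = data_ink_ratio(bars, bars + decoration)
# Erasing 50 units of decoration leaves the data ink untouched.
after = data_ink_ratio(bars, bars + decoration - 50.0)

print(f"before: {before:.2f}, after: {after:.2f}")
```

Removing decoration raises the ratio without touching the data, which is the sense in which the chartjunk and data-ink principles point the same way.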
8 Gestalt principles
- Law of proximity: objects placed close together are perceived as a group, so keep objects of the same type evenly and closely spaced
- Law of similarity
- Law of closure: our perception fills in gaps
- Law of symmetry
- Law of common fate: points moving together are perceived as a group
- Law of continuity
- Law of good gestalt: elements tend to be perceived as grouped together if they form a pattern that is regular, simple, or orderly
- Law of past experience: under some circumstances, visual stimuli are categorized according to past experience
Types of diagrams
- Timeline diagrams
- Template
- Flowchart
- Checklist
- Mindmap
Line charts are commonly preferred for time series visualization. Bar plots suit comparisons between totals across several groups. Stacked plots extend bar plots to show several categories within the same group. For example, a stacked bar per country can split the death rate for 2020-2021 by cause, with COVID as one prominent segment alongside the other causes. A box plot displays the minimum, the 25th percentile, the median, the 75th percentile, and the maximum; it is useful for showing the spread of the data and for drawing inferences. Scatter plots help reveal associations between variables (association mining). Decision trees are good for representing logical classifications. Histograms plot quantitative data grouped into bins or intervals; a histogram shows the distribution of a variable, whereas a bar chart compares variables.
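The numbers behind a box plot and a histogram can be made concrete with a small sketch that uses only the Python standard library. The helper names `five_number_summary` and `histogram_counts` and the sample values are assumptions for illustration. Quartile conventions differ between tools; this sketch uses the `inclusive` method of `statistics.quantiles`.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def histogram_counts(values, num_bins):
    """Count values falling into num_bins equal-width bins over [min, max]."""
    low, high = min(values), max(values)
    if high == low:
        # All values identical: everything lands in the first bin.
        return [len(values)] + [0] * (num_bins - 1)
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        index = min(int((v - low) / width), num_bins - 1)
        counts[index] += 1
    return counts


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data
print(five_number_summary(sample))
print(histogram_counts(sample, 4))
```

The first function summarizes spread in five numbers, while the second keeps the shape of the distribution by counting values per bin.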
D3.js is very useful for interactive visualization. Some key pointers are:
- Simplicity
- Selection
- Filtering
- Brushing and Linking
- Zooming and scaling
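Brushing and linking can be illustrated independently of any charting library: a range selected (brushed) in one view yields a set of record ids, and every other view highlights the same records. The sketch below shows only that data logic in Python. It does not use the D3.js API, and the record fields, function names, and values are hypothetical.

```python
records = [  # made-up records shared by two views
    {"id": 1, "price": 10, "rating": 4.5},
    {"id": 2, "price": 25, "rating": 3.0},
    {"id": 3, "price": 40, "rating": 4.8},
    {"id": 4, "price": 55, "rating": 2.1},
]


def brush(rows, field, low, high):
    """Return the ids of rows whose field lies in the brushed range [low, high]."""
    return {row["id"] for row in rows if low <= row[field] <= high}


def linked_view(rows, field, selected_ids):
    """Return (value, highlighted) pairs for a second view showing another field."""
    return [(row[field], row["id"] in selected_ids) for row in rows]


# Brush prices between 20 and 45 in the first view ...
selected = brush(records, "price", 20, 45)
# ... and the rating view highlights the same records.
print(linked_view(records, "rating", selected))
```

Filtering differs only in that unselected rows would be dropped from the second view rather than shown without a highlight.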
Interactive Data Visualization
TIBCO Spotfire
- Parallel coordinate plot.
- Treemap is useful for 2D visualizations of large hierarchical data.
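One simple way to lay out a treemap is the slice-and-dice scheme: each node's rectangle is divided among its children in proportion to their sizes, alternating the split direction at each level. The sketch below is a minimal version under stated assumptions: the dictionary keys `name`, `size`, and `children`, the function names, and the sample hierarchy are all hypothetical, sizes are assumed to be positive, and this is not a description of how Spotfire or any particular tool implements treemaps.

```python
def node_size(node):
    """Size of a leaf, or the summed size of all leaves below an inner node."""
    children = node.get("children")
    if not children:
        return node["size"]
    return sum(node_size(child) for child in children)


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Return (name, x, y, width, height) rectangles for every leaf."""
    if out is None:
        out = []
    children = node.get("children")
    if not children:
        out.append((node["name"], x, y, w, h))
        return out
    total = node_size(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in children:
        frac = node_size(child) / total
        if depth % 2 == 0:
            # Even depth: place children side by side along the x axis.
            slice_and_dice(child, x + offset * w, y, frac * w, h, depth + 1, out)
        else:
            # Odd depth: stack children along the y axis.
            slice_and_dice(child, x, y + offset * h, w, frac * h, depth + 1, out)
        offset += frac
    return out


tree = {  # made-up hierarchy
    "name": "root",
    "children": [
        {"name": "a", "size": 2},
        {"name": "b", "children": [
            {"name": "b1", "size": 1},
            {"name": "b2", "size": 1},
        ]},
    ],
}

for rect in slice_and_dice(tree, 0, 0, 4, 2):
    print(rect)
```

Each leaf's area ends up proportional to its size, which is what lets a treemap show a large hierarchy in a fixed 2D space. Slice-and-dice tends to produce thin rectangles, which is why other layouts are often preferred in practice.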
Trifacta. Ranking of visual encodings (in order of decreasing intuitive accuracy):
- Position on a common scale, then position on non-aligned scales
- Length
- Slope
- Angle
- Area
- Volume
- Color
Data Visualization tools
- TIBCO Spotfire
- Trifacta
- Qlik
- Tableau
- Microsoft Power BI
- Alteryx
- SAS
- SAP
- Sisense
- MicroStrategy
- Salesforce
- Datawatch
- Zoomdata
Other tools include D3.js, R charts (ggplot2 package), Pentaho, SAP Lumira, TIBCO Spotfire, QlikView, Jaspersoft, and MicroStrategy.