Initial writeup: Sept 2022. Last update: Sept 2023.
Looker Studio, formerly Google Data Studio, is a free and quick solution for interactive visual data analysis. Following is a comprehensive list for instructor reference. Depending on the underlying datasets and the tasks at hand, only a fraction are applicable. Feel free to cherry pick for your workshops.
Knowledge tree
- Register GDS account
- Register GCP account (need credit card; for BigQuery connection)
- Create first first blank dashboard; create more pages; create static page elements like labels
- Connect to data source of Google Sheets tab
- Add basic charts: pie / bar/ line
- Understand visualisation basics:
- Dimension: used to group data records/ rows, usually presented as is.
- Metric: numeric values that can be aggregated (for a group), or presented as is.
- Filter: select a subset if data
- Add time series: (note, different from just "line")
- Change date dimension aggregation: day, week, month. quarter, year
- Use break down dimension to generate multi-lines on one time series plot (depends on your data, usually "long table") (Ref: wide v.s. long)
- Use multiple metric columns to generate multi-lines on one times series plot (depends on your data, usually "wide table")
- Add table:
- Variant on metric columns: table with bars; table with heatmap
- Sort by columns
- Move columns between dimensions and metrics (for those can be both)
- Chart refinements: ("Style" tab)
- Change color/ font size/ number precision
- Turn a series into cumulative one
- Add reference line/ reference band
- Add trend line (e.g. regression using polynomials)
- Understand formula: (full list of functions)
- Row level transformations: (applied row by row)
- Arithmetic, e.g.
ABS()
- Conditional, e.g.
IF()
- Date
- Geo
- Text
- Arithmetic, e.g.
- Aggregation/ window function -- Key difference: those function are applied on a group of records/ rows implied by the dimensions selected in a chart. e.g.
AVG()
,MAX()
.
- Row level transformations: (applied row by row)
- Formula exercise:
- Round one column to single decimal point
- Get absolute value of a column
- Concatenate two or more string/ text columns
- Format clickable URLs from text columns
- Discretise a continuous variable to ordinal/ categorical variable (with
IF
/CASE
)- Example 1: Turn # of cases into "high"/ "low" texts
- Example 2: Create age group buckets from integer column to plot histogram
- Extract date parts from a date time variable, e.g. hour of day, day, month, year
- Use window function to understand the data distribution:
MIN
,MAX
,MEDIAN
,PERCENTILE
- Add Map:
- Filled map: "choropleth"
- Scatter map: variants can be bubble map, heatmap
- Add control (filter that applies on all charts in a dashboard page):
- Add Drop down list control
- Add Advanced text search control
- Add Date range filter control (special design in GDS)
- Change the default window to relative to today, e.g. past week
- Understand data blend: enrich data on the fly. The blended tables are first aggregated according to dimensions configured
- Data blend exercise
- Find two time series with daily granularity and blend them into one view. e.g. table 1 contains # of confirmed cases by day, table 2 contains dine-out behaviour by day.
- Use blend to enrich and further aggregate one fact table. e.g. table 1 contains # of confirmed cases by Constituency Area (~400 in HK), table 2 contains Constituency Area to District Council (18 in HK)
- Try other data sources:
- Connect to data source of BigQuery table
- Connect to MySQL (or Google CloudSQL) database
- Try custom SQL query integration
- Publish dashboard:
- Set some pages to hide in view mode and only visible in edit mode
- Use URL parameters to make direct links to filtered views that are frequently visited