Tutorial: Using BigQuery to Analyze CrUX Data

Last blog I gave some examples of how we can use the Chrome User Experience report (CrUX) to gain some insights about site speed. This blog I will continue to show you how to use bigquery to compare your site with the competitors.

Prerequisite:

  •  Log into Google Cloud,
  • Create a project for the CrUX work
  • Avigate to BigQuery console
  • Add the chrome-ux-report dataset and explore the way the tables are structured in ‘preview’

Step one: Figure out what is the origin of your site and the competitor site

like syntax is preferred (Take care of the syntax difference between Standard SQL and T-SQL)

 -- created by: Jacqui Wu
  -- data source: Chrome-ux-report(202003)
  -- last update: 12/05/2020
  
SELECT
  DISTINCT origin
FROM
  `chrome-ux-report.all.202003`
WHERE
  origin LIKE '%yoursite'

Step two: Figure out what should be queried in the select clause?

What we can query from CrUX?

The specific elements that Google is sharing are:

  • “Origin”, which consists of the protocol and hostname, as we used in step one, which can make sure the URL link
  • Effective Connection Type (4G, 3G, etc), which can be queried as the network
  • Form Factor (desktop, mobile, tablet), which can be queried as the device
  • Percentile Histogram data for First Paint, First Contentful Paint, DOM Content Loaded and onLoad (these are all nested, so if we want to query them, we need to unnest them)

Here I create a SQL query of FCP percentage in different sites, which measures the time from navigation to the time when the browser renders the first bit of content from the DOM.

This is an important milestone for users because it provides feedback that the page is actually loading.

SQL queries: 

  -- created by: Jacqui Wu
  -- data source: Chrome-ux-report(202003) in diffrent sites
  -- last update: 12/05/2020
  -- Comparing fcp metric in Different Sites

SELECT origin, form_factor.name AS device, effective_connection_type.name AS conn, "first contentful paint" AS metric, bin.start/1000 AS bin, SUM(bin.density) AS volume
FROM(  
SELECT origin, form_factor, effective_connection_type, first_contentful_paint.histogram.bin as bins
FROM `chrome-ux-report.all.202003`
WHERE origin IN ("your site URL link", "competitor A site URL link", "competitor B site URL link")
)
CROSSS JOIN UNNEST(bins) AS bin
GROUP BY origin, device, conn, bin

Step 3: Export the results to the Data Studio(Google visualization tool)

Here are some tips may be useful

  1. Line chart is preferred for comparing different sites in Visual Selection
  2. Set x-axis to bin(which we already calculate it to seconds) and y-axis to percentage of fcp
  3. Set filter(origin, device, conn) in Filtering section

Wrapping up

This post explored the data pipeline we can use CrUX report to analyze the site performance. In the future, I will write more about CrUX.

If you are interested in or have any problems with CrUX or Business Intelligence, feel free to contact me.

Or you can connect with me through my LinkedIn.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s