Commit e7d41ab

Tech report: Dedupe technology records (#60)
* versions
* tech filter
* new table with versions
* typo
* versions table
* fix
* no retries
* tech_report_* tables
* clusters renamed
* lint
* adjust export config
* fix clustering
* origin renamed
* deduplicated good_cwv
* include minor
* fix
* cleanup
* pattern fix
* tech detections only
* fix
* relaxed pattern
* remove hashing (#59)
* dedupe technologies
* cleanup
1 parent e8e6c86 commit e7d41ab

2 files changed: +29 -16 lines changed

definitions/output/reports/cwv_tech_categories.js

Lines changed: 9 additions & 7 deletions
@@ -55,6 +55,14 @@ technology_stats AS (
   GROUP BY
     technology,
     categories
+),
+
+total_pages AS (
+  SELECT
+    client,
+    COUNT(DISTINCT root_page) AS origins
+  FROM pages
+  GROUP BY client
 )

 SELECT
@@ -82,11 +90,5 @@ SELECT
     COALESCE(MAX(IF(client = 'mobile', origins, 0))) AS mobile
   ) AS origins,
   NULL AS technologies
-FROM (
-  SELECT
-    client,
-    COUNT(DISTINCT root_page) AS origins
-  FROM pages
-  GROUP BY client
-)
+FROM total_pages
 `)

definitions/output/reports/tech_report_technologies.js

Lines changed: 20 additions & 9 deletions
@@ -21,13 +21,22 @@ WITH pages AS (

 tech_origins AS (
   SELECT
-    client,
     technology,
-    COUNT(DISTINCT root_page) AS origins
-  FROM pages
-  GROUP BY
-    client,
-    technology
+    STRUCT(
+      MAX(IF(client = 'desktop', origins, 0)) AS desktop,
+      MAX(IF(client = 'mobile', origins, 0)) AS mobile
+    ) AS origins
+  FROM (
+    SELECT
+      client,
+      technology,
+      COUNT(DISTINCT root_page) AS origins
+    FROM pages
+    GROUP BY
+      client,
+      technology
+  )
+  GROUP BY technology
 ),

 technologies AS (
@@ -53,7 +62,6 @@ total_pages AS (
 )

 SELECT
-  client,
   technology,
   description,
   category,
@@ -66,11 +74,14 @@ USING(technology)
 UNION ALL

 SELECT
-  client,
   'ALL' AS technology,
   NULL AS description,
   NULL AS category,
   NULL AS category_obj,
-  origins
+  NULL AS similar_technologies,
+  STRUCT(
+    MAX(IF(client = 'desktop', origins, 0)) AS desktop,
+    MAX(IF(client = 'mobile', origins, 0)) AS mobile
+  ) AS origins
 FROM total_pages
 `)
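
For readers who want to see the effect of the `tech_origins` change outside SQL, here is a minimal JavaScript sketch. It is not part of the commit: the function name `pivotOrigins` and the sample rows are hypothetical. It collapses per-(client, technology) counts into one record per technology with `desktop` and `mobile` fields that default to 0, which is roughly what `STRUCT(MAX(IF(client = ..., origins, 0)) ...)` combined with `GROUP BY technology` appears to do.

```javascript
// Hypothetical helper, for illustration only: turn rows of
// { client, technology, origins } into one record per technology,
// mirroring STRUCT(MAX(IF(client = 'desktop', origins, 0)) AS desktop, ...).
function pivotOrigins(rows) {
  const byTechnology = new Map();
  for (const { client, technology, origins } of rows) {
    if (!byTechnology.has(technology)) {
      // A missing client stays at 0, like the IF(..., origins, 0) fallback.
      byTechnology.set(technology, {
        technology,
        origins: { desktop: 0, mobile: 0 },
      });
    }
    const record = byTechnology.get(technology);
    if (client === 'desktop' || client === 'mobile') {
      record.origins[client] = Math.max(record.origins[client], origins);
    }
  }
  return [...byTechnology.values()];
}

// Made-up sample data: two clients for one technology, one client for another.
const sampleRows = [
  { client: 'desktop', technology: 'WordPress', origins: 120 },
  { client: 'mobile', technology: 'WordPress', origins: 150 },
  { client: 'mobile', technology: 'jQuery', origins: 300 },
];

const pivoted = pivotOrigins(sampleRows);
console.log(JSON.stringify(pivoted, null, 2));
```

With this shape each technology appears once in the output, so the separate `client` column removed from the outer `SELECT` statements is no longer needed to tell the two counts apart.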
