Documentation Generated: November 10, 2025
This is a Looker analytics project named stackoverflow_test for analyzing StackOverflow data. The project connects to BigQuery on Google Cloud Platform (GCP) and implements a semantic layer with views, explores, a dashboard, a caching strategy, and data quality tests.
The project uses three source tables from BigQuery:
Table: prj-s-dlp-dq-sandbox-0b3c.EK_test_data.stackoverflow_users
Contains user profile information and reputation metrics.
Columns:
- id (INT64) - Unique user identifier
- display_name (STRING) - User's display name
- about_me (STRING) - User's bio/about section
- age (STRING) - User's age
- creation_date (TIMESTAMP) - Account creation date
- last_access_date (TIMESTAMP) - Last login date
- location (STRING) - User's location (contains many NULL values)
- reputation (INT64) - User's reputation score
- up_votes (INT64) - Total up votes received
- down_votes (INT64) - Total down votes received
- views (INT64) - Total profile views
- profile_image_url (STRING) - URL to profile image
- website_url (STRING) - User's website URL
Table: prj-s-dlp-dq-sandbox-0b3c.EK_test_data.stackoverflow_badges
Contains badge achievement records for users.
Columns:
- id (INT64) - Unique badge record identifier
- name (STRING) - Badge name
- date (TIMESTAMP) - Date badge was earned
- user_id (INT64) - ID of user who earned the badge
- class (INT64) - Badge class/tier (values: 1, 2, or 3)
- tag_based (BOOL) - Whether the badge is tag-based
Table: prj-s-dlp-dq-sandbox-0b3c.EK_test_data.stackoverflow_comments
Contains comment records on posts.
Columns:
- id (INT64) - Unique comment identifier
- text (STRING) - Comment text content
- creation_date (TIMESTAMP) - Date comment was posted
- post_id (INT64) - ID of the post being commented on
- user_id (INT64) - ID of comment author
- user_display_name (STRING) - Display name of comment author
- score (INT64) - Comment score/rating
The semantic layer includes 3 LookML views that provide well-organized access to the source data:
File: views/stackoverflow_users.view.lkml
Dimensions:
- id (Primary Key, Hidden)
- display_name - User's display name
- about_me - Bio information
- age - User age
- profile_image_url - Profile image URL
- website_url - User's website
- location - Geographic location
- creation_date (Dimension Group) - Account creation with timeframes: time, date, week, month, raw
- creation_month_year - Formatted month+year
- last_access_date (Dimension Group) - Last access with timeframes: time, date, week, month, raw
- last_access_month_year - Formatted month+year
Measures:
- reputation - Sum of reputation scores
- up_votes - Sum of up votes
- down_votes - Sum of down votes
- views - Sum of profile views
- count - Total number of users
File: views/stackoverflow_badges.view.lkml
Dimensions:
- id (Primary Key, Hidden)
- name - Badge name
- user_id - User identifier
- class - Badge class/tier
- tag_based - Boolean indicator for tag-based badges
- date (Dimension Group) - Badge date with timeframes: time, date, week, month, raw
- date_month_year - Formatted month+year
Measures:
- count - Total number of badges
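As a rough illustration of how the badges view described above might look in LookML, here is a minimal sketch. It assumes the table is referenced by its full name rather than through a manifest constant, omits most labels and descriptions for brevity, and leaves out the formatted month+year dimension; the actual file may differ.

```lookml
# Sketch of views/stackoverflow_badges.view.lkml (not the project's actual file).
view: stackoverflow_badges {
  sql_table_name: `prj-s-dlp-dq-sandbox-0b3c.EK_test_data.stackoverflow_badges` ;;

  # Hidden primary key, used for joins and symmetric aggregates.
  dimension: id {
    primary_key: yes
    hidden: yes
    type: number
    sql: ${TABLE}.id ;;
  }

  dimension: name {
    label: "Badge Name"
    type: string
    sql: ${TABLE}.name ;;
  }

  dimension: user_id {
    type: number
    sql: ${TABLE}.user_id ;;
  }

  dimension: class {
    description: "Badge class/tier (1, 2, or 3)"
    type: number
    sql: ${TABLE}.class ;;
  }

  dimension: tag_based {
    type: yesno
    sql: ${TABLE}.tag_based ;;
  }

  # Generates date_time, date_date, date_week, date_month, and date_raw.
  dimension_group: date {
    type: time
    timeframes: [raw, time, date, week, month]
    datatype: timestamp
    sql: ${TABLE}.date ;;
  }

  measure: count {
    label: "Total Badges"
    type: count
  }
}
```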
File: views/stackoverflow_comments.view.lkml
Dimensions:
- id (Primary Key, Hidden)
- text - Comment text
- post_id - Post identifier
- user_id - Comment author user ID
- user_display_name - Comment author name
- creation_date (Dimension Group) - Comment creation with timeframes: time, date, week, month, raw
- creation_month_year - Formatted month+year
Measures:
- score - Sum of comment scores
- count - Total number of comments
The project includes 5 explores providing different analytical perspectives:
Label: StackOverflow. Badges with Users
Joins the badges view with users data using user_id = id relationship (many-to-one). Enables analysis of badge distribution by user characteristics.
Base View: stackoverflow_badges (labeled "Badges")
Joined View: stackoverflow_users (labeled "Users")
Label: StackOverflow. Comments with Users
Joins the comments view with users data using user_id = id relationship (many-to-one). Enables analysis of comment activity and patterns by user.
Base View: stackoverflow_comments (labeled "Comments")
Joined View: stackoverflow_users (labeled "Users")
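The joined explores described above could be declared roughly as follows. This is a sketch, not the contents of explores/explores.lkml: the explore name is taken from the dashboard tile sources listed in this README, while the datagroup name is a hypothetical placeholder.

```lookml
# Sketch of a joined explore: badges as the base view, users joined many-to-one.
explore: stackoverflow_badges_users {
  label: "StackOverflow. Badges with Users"
  view_name: stackoverflow_badges
  view_label: "Badges"
  persist_with: stackoverflow_default_datagroup  # hypothetical datagroup name

  join: stackoverflow_users {
    view_label: "Users"
    type: left_outer            # keep badges even when no user row matches
    relationship: many_to_one   # many badges belong to one user
    sql_on: ${stackoverflow_badges.user_id} = ${stackoverflow_users.id} ;;
  }
}
```

The comments explore would follow the same shape, with stackoverflow_comments as the base view.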
Label: StackOverflow. Badges
Single-view explore of badge data without joins.
Label: StackOverflow. Comments
Single-view explore of comment data without joins.
Label: StackOverflow. Users
Single-view explore of user data without joins.
File: LookML_Dashboards/stackoverflow_badges_and_comments.dashboard.lookml
A comprehensive dashboard providing key metrics and insights into badges and comments data.
Layout: Newspaper (responsive grid)
Filters:
- Badge Name (Tag List, Multiple values allowed)
- User Display Name (Tag List, Multiple values allowed)
Tiles:
- Dashboard Description (Text Tile)
  - Overview of dashboard purpose and contents
- Total Badges (Single Value)
  - Count of all badges in the system
  - Source: stackoverflow_badges_users explore
- Total Comments (Single Value)
  - Count of all comments in the system
  - Source: stackoverflow_comments_users explore
- Total Users (Single Value)
  - Count of all users in the system
  - Source: stackoverflow_badges_users explore
- Total Reputation (Single Value)
  - Sum of all user reputation points
  - Source: stackoverflow_badges_users explore
- Badges by Class (Bar Chart)
  - Distribution of badges by class level (1, 2, 3)
  - Top 25 results sorted by count descending
  - Source: stackoverflow_badges_users explore
- Top 25 Users by Comments (Column Chart)
  - Shows which users have posted the most comments
  - Displays user display name vs comment count
  - Limited to top 25 users
  - Source: stackoverflow_comments_users explore
- Comments Detail (Table)
  - Detailed table view of comments
  - Columns: Comment ID, Post ID, User ID, User Display Name, Score
  - Limited to 500 rows
  - Source: stackoverflow_comments_users explore
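For orientation, one filter and one tile from the dashboard described above might be expressed in dashboard LookML roughly like this. The dashboard title, the filter and element names, and the row/col/width/height values are assumptions made for illustration; only the file name, layout, explore, and field names come from this README.

```lookml
# Sketch only: title, names, and layout values are illustrative assumptions.
- dashboard: stackoverflow_badges_and_comments
  title: "StackOverflow Badges and Comments"
  layout: newspaper
  filters:
  - name: Badge Name
    title: Badge Name
    type: field_filter
    allow_multiple_values: true
    ui_config:
      type: tag_list
      display: popover
    model: stackoverflow_test
    explore: stackoverflow_badges_users
    field: stackoverflow_badges.name
  elements:
  - name: Total Badges
    title: Total Badges
    type: single_value
    model: stackoverflow_test
    explore: stackoverflow_badges_users
    fields: [stackoverflow_badges.count]
    # Apply the dashboard-level Badge Name filter to this tile.
    listen:
      Badge Name: stackoverflow_badges.name
    row: 0
    col: 0
    width: 6
    height: 4
```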
File: datagroups.lkml
Implements a 12-hour refresh schedule with 24-hour cache retention.
Configuration:
- Interval Trigger: Every 12 hours (midnight and noon)
- Max Cache Age: 24 hours
- Applied To: All 5 explores
This strategy balances query performance with data freshness, ensuring:
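- The interval trigger invalidates cached results every 12 hours, so queries pick up fresh data at least that often
- Max Cache Age caps the lifetime of any cached result at 24 hours, as a fallback if a trigger check is missed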
- Reduced database load during business hours
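The configuration above could be written as a datagroup along these lines. The datagroup name is a hypothetical placeholder; the two parameter values come from the settings listed above.

```lookml
# Sketch of datagroups.lkml; the datagroup name is an assumption.
datagroup: stackoverflow_default_datagroup {
  interval_trigger: "12 hours"  # trigger the datagroup on a fixed 12-hour interval
  max_cache_age: "24 hours"     # upper bound on how long cached results may be served
}
```

Each explore would then opt in with `persist_with: stackoverflow_default_datagroup`.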
The project includes 2 data quality tests to ensure data integrity:
File: data_tests/data_tests.lkml
Purpose: Validates that all users have a non-null display name
Logic:
- Tests the stackoverflow_users explore
- Ensures the display_name field is populated for every user record
- Uses sorting to surface any null values
Assert: NOT is_null(${stackoverflow_users.display_name})
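A data test matching this description might look like the sketch below. The test and assert names and the row limit are assumptions; the assert expression is the one given above. Sorting ascending is intended to bring NULL display names to the top, since BigQuery orders NULLs first in ascending sorts.

```lookml
# Sketch of a display_name test; test name, assert name, and limit are hypothetical.
test: users_display_name_is_not_null {
  explore_source: stackoverflow_users {
    column: display_name {}
    sorts: [stackoverflow_users.display_name: asc]
    limit: 1
  }
  assert: display_name_is_populated {
    expression: NOT is_null(${stackoverflow_users.display_name}) ;;
  }
}
```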
File: data_tests/data_tests.lkml
Purpose: Validates that all badge class values are 1, 2, or 3
Logic:
- Tests the stackoverflow_badges explore
- Ensures badge classification is consistent and within expected values
- Filters out valid records to identify any invalid classifications
Assert: ${stackoverflow_badges.class} = 1 OR ${stackoverflow_badges.class} = 2 OR ${stackoverflow_badges.class} = 3
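The class check could be sketched in the same way. This version simply queries the distinct class values and asserts on each returned row; it does not reproduce the filter the project uses to exclude valid records, and the test name, assert name, and limit are hypothetical.

```lookml
# Sketch of a badge class test; names and limit are assumptions.
test: badge_class_is_valid {
  explore_source: stackoverflow_badges {
    column: class {}
    limit: 10
  }
  assert: class_is_1_2_or_3 {
    expression: ${stackoverflow_badges.class} = 1 OR ${stackoverflow_badges.class} = 2 OR ${stackoverflow_badges.class} = 3 ;;
  }
}
```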
File: stackoverflow_test.model.lkml
Connection: badal_internal_projects (BigQuery in GCP)
Includes:
- datagroups.lkml - Caching definitions
- /views/*.view.lkml - All view files
- /explores/*.lkml - All explore files
- /LookML_Dashboards/*.dashboard.lookml - All dashboard files
- /data_tests/*.lkml - All data test files
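Taken together, the connection and includes listed above suggest a model file shaped roughly like this. It is a sketch assuming the datagroup file sits at the project root, as the directory layout indicates; the real file may contain additional parameters.

```lookml
# Sketch of stackoverflow_test.model.lkml.
connection: "badal_internal_projects"

include: "/datagroups.lkml"                       # caching definitions
include: "/views/*.view.lkml"                     # all view files
include: "/explores/*.lkml"                       # all explore files
include: "/LookML_Dashboards/*.dashboard.lookml"  # all dashboard files
include: "/data_tests/*.lkml"                     # all data test files
```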
stackoverflow_test/
├── README.md (this file)
├── CLAUDE.md (project instructions)
├── manifest.lkml (table constants)
├── datagroups.lkml (caching configuration)
├── stackoverflow_test.model.lkml (main model file)
├── views/
│   ├── stackoverflow_users.view.lkml
│   ├── stackoverflow_badges.view.lkml
│   └── stackoverflow_comments.view.lkml
├── explores/
│   └── explores.lkml
├── LookML_Dashboards/
│   └── stackoverflow_badges_and_comments.dashboard.lookml
├── data_tests/
│   └── data_tests.lkml
├── tasks/
│   ├── task_1_views.md
│   ├── task_2_explores.md
│   ├── task_3_reporting.md
│   ├── task_4_caching.md
│   ├── task_5_datatests.md
│   ├── task_6_documentation.md
│   ├── best_practices.md
│   └── task_execution_tracker.md
└── dashboard_examples/
    └── (10+ example files for reference)
- Naming Conventions
  - Follows LookML naming standards
  - No spaces or special characters in filenames
  - Proper file extensions (.view.lkml, .explore.lkml, .dashboard.lookml, .lkml)
- Dimension and Measure Design
  - All dimensions and measures have labels and descriptions
  - Measures are defined as hidden dimensions first, then as measures
  - NULL values in measures converted to 0 for proper aggregation
  - Measures use consistent value format: "#,##0.00"
- Primary Keys
  - All views include a hidden primary key
  - ID field marked as primary_key: yes and hidden: yes
- Table Constants
  - Table names defined as constants in manifest.lkml
  - Constants referenced with the @{constant_name} syntax, wrapped in backticks, in each view's sql_table_name
- Date/Time Dimensions
  - Uses dimension_group with type: time
  - Includes multiple timeframes (time, date, week, month, raw)
  - Formatted date dimensions included for user-friendly display
  - Proper group_label format for formatted date dimensions
- Join Relationships
  - All joins include relationship parameter (many_to_one)
  - Joins use left_outer type for preserving base view records
  - Join predicates use raw timeframe for date fields
- Caching Strategy
  - All explores use persist_with parameter
  - Consistent datagroup applied across project
  - Configurable refresh schedule balancing performance and freshness
- Documentation
  - Comprehensive descriptions for all explores and dashboards
  - Data quality tests validate critical business rules
  - This README provides complete project overview
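Several of the conventions above (table constants, hidden dimensions behind measures, NULL-to-0 conversion, value formats, and formatted date dimensions) can be illustrated in one short sketch. The constant name, the reputation_raw dimension name, the labels and descriptions, and the '%b %Y' format string are assumptions chosen for illustration, not the project's actual definitions.

```lookml
# manifest.lkml (constant name is hypothetical)
constant: STACKOVERFLOW_USERS_TABLE {
  value: "prj-s-dlp-dq-sandbox-0b3c.EK_test_data.stackoverflow_users"
}

# views/stackoverflow_users.view.lkml (excerpt, sketch only)
view: stackoverflow_users {
  # Backticks quote the BigQuery table name; @{...} substitutes the constant.
  sql_table_name: `@{STACKOVERFLOW_USERS_TABLE}` ;;

  dimension: id {
    primary_key: yes
    hidden: yes
    type: number
    sql: ${TABLE}.id ;;
  }

  # Hidden dimension that converts NULL to 0 before aggregation.
  dimension: reputation_raw {
    hidden: yes
    type: number
    sql: COALESCE(${TABLE}.reputation, 0) ;;
  }

  measure: reputation {
    label: "Reputation"
    description: "Sum of user reputation scores"
    type: sum
    sql: ${reputation_raw} ;;
    value_format: "#,##0.00"
  }

  dimension_group: creation_date {
    type: time
    timeframes: [raw, time, date, week, month]
    datatype: timestamp
    sql: ${TABLE}.creation_date ;;
  }

  # Formatted month+year, grouped with the dimension group in the field picker.
  dimension: creation_month_year {
    group_label: "Creation Date"
    label: "Creation Month Year"
    description: "Account creation month and year, for example Nov 2025"
    type: string
    sql: FORMAT_TIMESTAMP('%b %Y', ${creation_date_raw}) ;;
  }
}
```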
- Connect to Data: Ensure your Looker instance is connected to the BigQuery project containing the source tables
- Run Data Tests: Execute the data tests to validate source data quality
- Explore Data: Start with the single-view explores (Badges, Comments, Users) for basic analysis
- Use Joined Explores: Use the joined explores (Badges with Users, Comments with Users) for more sophisticated analysis
- Dashboard: Access the pre-built dashboard for executive summary views
- Data tests should be run regularly to monitor data quality
- The 12-hour caching schedule can be adjusted in datagroups.lkml if needed
- New dashboards and explores can be added following the established patterns
- Always include descriptions and labels for new dimensions and measures