1 change: 1 addition & 0 deletions .env
@@ -0,0 +1 @@
#DATABASE_URL=postgresql://postgres:password@localhost/solaceassignment
3 changes: 3 additions & 0 deletions .eslintrc.json
@@ -0,0 +1,3 @@
{
  "extends": "next/core-web-vitals"
}
36 changes: 36 additions & 0 deletions .gitignore
@@ -0,0 +1,36 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.js
.yarn/install-state.gz

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# local env files
.env*.local

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts
242 changes: 242 additions & 0 deletions DISCUSSION.md
@@ -0,0 +1,242 @@
# Solace Advocate Search Improvements

This document outlines the backend and data schema changes I made to improve scalability, maintainability, and performance for the Solace assignment. We'll start with the backend and data model, then cover the frontend in a later section.

NOTE: The goal of this exercise is to communicate my ability and proficiency in building systems. I am a big fan of small, digestible pull requests, but splitting the amount of improvements I wanted to do into lots of small PRs would have made the goal of this exercise very hard to communicate. It would be taxing to ask someone to glue 5-8 small PRs together in their head while trying to understand all the changes, so I decided to spend more time on this doc and step through everything I did.

---

![Screenshot of the Solace Advocate Search UI](./screenshot.png)

[![Watch a walkthrough video on Loom](https://cdn.loom.com/sessions/thumbnails/c67e8b910b404b289c55d8aab891205a-899f85c1e09d2c59-full-play.gif)](https://www.loom.com/share/c67e8b910b404b289c55d8aab891205a)

## Introduction

The original backend schema used a flat structure with embedded JSONB arrays for specialties and stored city and degree as plain text fields. While this works for small datasets, it does not scale well for larger, relational data or for efficient querying and filtering. My changes focus on leveraging relational database best practices, improving query performance, and making the API more maintainable and extensible. I learned early on that you are not smarter than the people who wrote Postgres: this is relational data, and relational databases handle relational data really, really well.

---

## Backend & Data Schema Changes

### 1. **Normalized Data Model**

#### **Original Schema Example**

```typescript
const advocates = pgTable('advocates', {
  id: serial('id').primaryKey(),
  firstName: text('first_name').notNull(),
  lastName: text('last_name').notNull(),
  city: text('city').notNull(),
  degree: text('degree').notNull(),
  specialties: jsonb('payload').default([]).notNull(),
  yearsOfExperience: integer('years_of_experience').notNull(),
  phoneNumber: bigint('phone_number', { mode: 'number' }).notNull(),
  createdAt: timestamp('created_at').default(sql`CURRENT_TIMESTAMP`),
});
```

#### **My Improved Schema**

- **Cities and Degrees** are now separate tables, referenced by foreign keys in the `advocates` table.
- **Specialties** are modeled as a many-to-many relationship using a linking table (`advocate_specialties`).
- **Advocates** reference `cityId` and `degreeId` as FKs, and their specialties are joined via the linking table.

```typescript
export const cities = pgTable('cities', {
  id: integer('id').primaryKey(),
  name: varchar('name', { length: 255 }).notNull().unique(),
});

export const degrees = pgTable('degrees', {
  id: integer('id').primaryKey(),
  name: varchar('name', { length: 255 }).notNull().unique(),
});

export const specialties = pgTable('specialties', {
  id: integer('id').primaryKey(),
  name: varchar('name', { length: 255 }).notNull().unique(),
});

export const advocates = pgTable('advocates', {
  id: serial('id').primaryKey(),
  firstName: varchar('first_name', { length: 255 }).notNull(),
  lastName: varchar('last_name', { length: 255 }).notNull(),
  cityId: integer('city_id')
    .notNull()
    .references(() => cities.id),
  degreeId: integer('degree_id')
    .notNull()
    .references(() => degrees.id),
  yearsOfExperience: integer('years_of_experience').notNull(),
  phoneNumber: bigint('phone_number', { mode: 'number' }).notNull(),
  createdAt: timestamp('created_at').default(sql`CURRENT_TIMESTAMP`),
});

export const advocateSpecialties = pgTable('advocate_specialties', {
  advocateId: integer('advocate_id')
    .notNull()
    .references(() => advocates.id),
  specialtyId: integer('specialty_id')
    .notNull()
    .references(() => specialties.id),
});
```

#### **Why Normalize?**

- **Performance:** Foreign key (FK) indexes are much faster for filtering and joining than scanning and matching strings or arrays in a JSONB column.
- **Scalability:** As the dataset grows, relational queries with proper indexes remain fast, while JSONB array scans become slow.
- **Data Integrity:** Referential integrity is enforced by the database, preventing orphaned or invalid references.
- **Flexibility:** Adding, removing, or updating specialties, cities, or degrees is trivial and does not require updating every advocate record.

#### **Why Not JSONB for Specialties?**

- While Postgres can index JSONB, querying for "all advocates with specialty X" requires converting arrays and using special operators, which is more complex and less efficient than a join on a linking table. Converting the JSONB to an array and then dealing with all the scalar headache isn't worth it. Plus, we would be doing that conversion on every query, which is unnecessary repeated work. I went with the KISS approach (keep it simple, stupid).
- Relational databases are designed to handle relationships—using a linking table is idiomatic, efficient, and easy to query.

---

### 2. **API Endpoints for Reference Data**

- Created separate API routes for `cities`, `degrees`, and `specialties` (e.g., `/api/cities`, `/api/degrees`, `/api/specialties`).
- These endpoints return lists of options for use in dropdowns and filters.
- This separation allows for easy caching and reduces unnecessary database load.

---

### 3. **Caching for Static Data**

- Implemented a simple in-memory cache for endpoints like `cities`, `degrees`, and `specialties` since this data rarely changes.
- This reduces database queries and improves response times.
- For production, a distributed cache like Redis would be recommended for scalability and multi-instance support. For this exercise, I wrote a simple in-memory cache.
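As a sketch of the kind of in-memory cache described above (class and key names are illustrative, not the actual implementation):

```typescript
// Minimal in-memory TTL cache; stale entries are evicted lazily on read.
type Entry<T> = { value: T; expiresAt: number };

class TtlCache<T> {
  private store = new Map<string, Entry<T>>();
  constructor(private ttlMs: number) {}

  get(key: string): T | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key); // expired: drop and fall through to a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

// Hypothetical usage for the /api/cities endpoint:
const cityCache = new TtlCache<string[]>(60_000);
cityCache.set("all", ["Phoenix", "Denver"]);
console.log(cityCache.get("all")); // [ 'Phoenix', 'Denver' ]
```

On a cache miss the route handler would query the database and re-populate the cache before responding.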

---

### 4. **Advocate Query Improvements**

- The `/api/advocates` endpoint supports filtering by all fields, including specialties, cities, degrees, and years of experience.
- Used efficient SQL queries with joins and filters on indexed columns.
- For specialties, used a subquery to filter advocates by selected specialties via the linking table.
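The shape of that specialty subquery looks roughly like this (a sketch with illustrative literal IDs, using the table and column names from the schema above):

```sql
SELECT a.*
FROM advocates a
WHERE a.id IN (
  SELECT asp.advocate_id
  FROM advocate_specialties asp
  WHERE asp.specialty_id IN (1, 2)  -- selected specialty IDs
)
LIMIT 10 OFFSET 0;
```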

#### **Counting Results**

- To get the total count for pagination, I run a separate `SELECT COUNT(*)` query with the same filters.
- While Postgres supports window functions to get the count in a single query, Drizzle's documentation recommends two queries for clarity and compatibility.
- If performance becomes an issue, we could switch to a window function like:
```sql
SELECT *, COUNT(*) OVER() as total_count
FROM advocates
LEFT JOIN ...
WHERE ...
LIMIT ... OFFSET ...
```
This would return the total count with each row, eliminating the need for a second query.

---

### 5. **Indexing for Fast Search**

- Added GIN indexes with the `pg_trgm` extension on `first_name` and `last_name` to support fast, case-insensitive substring search (`ILIKE '%name%'`).
- This is critical for performance at scale and is handled via a migration file, not in the schema definition.
- Without a GIN index, the Postgres query planner would perform a sequential scan of the entire table, evaluating every record for filtering if no other selective indexes are present. This results in linear time complexity (O(n)), which does not scale well as the table grows.
- Chose `varchar` over `text` for first and last name fields to semantically indicate that these are short strings. In PostgreSQL, both types are stored inline for small values, but using `varchar` makes the intent clearer and can help prevent accidental misuse for large text data. If a large `text` value is accidentally stored, PostgreSQL will move it to the `pg_toast` table for out-of-line storage, which adds overhead and can negatively impact query performance.
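A sketch of what that migration could contain (index names are assumed, not the literal migration file):

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX IF NOT EXISTS idx_advocates_first_name_trgm
  ON advocates USING GIN (first_name gin_trgm_ops);

CREATE INDEX IF NOT EXISTS idx_advocates_last_name_trgm
  ON advocates USING GIN (last_name gin_trgm_ops);
```

With these in place, `ILIKE '%name%'` predicates can use an index scan instead of a sequential scan.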

---

### 6. **Preventing Too Many Database Connections in Development**

During development, I encountered the error `PostgresError: sorry, too many clients already`. This happens because, in environments with hot reloading (like Next.js), the backend code can be re-executed multiple times, causing a new Postgres client to be created on each reload. As a result, the database quickly hits its connection limit, leading to this error.

To fix this, I used the `globalThis` object to store the Postgres client and Drizzle instance. Before creating a new client, the code checks if one already exists on `globalThis` and reuses it if available. This ensures that only a single connection pool is maintained during development, preventing connection leaks and excessive client creation. This pattern is widely recommended for Node.js/Next.js projects and is safe because `globalThis` persists across module reloads in development; in production there is no hot reloading, so the module is evaluated only once anyway.
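The pattern can be sketched generically like this (in the real code the cached value is the Postgres client and Drizzle instance; a stub object stands in here so the sketch is self-contained):

```typescript
// Cache an expensive singleton on globalThis so hot reloads reuse it.
type Db = { createdAt: number };

function getDb(): Db {
  const g = globalThis as typeof globalThis & { _db?: Db };
  if (!g._db) {
    // Real code would do: g._db = drizzle(postgres(process.env.DATABASE_URL))
    g._db = { createdAt: Date.now() };
  }
  return g._db;
}

const a = getDb();
const b = getDb(); // simulates a second module evaluation after a reload
console.log(a === b); // true
```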

---

## **Summary**

- **Normalized the schema** for scalability and performance.
- **Split out reference data** (cities, degrees, specialties) into their own tables and API endpoints.
- **Used a linking table** for specialties to support efficient many-to-many queries.
- **Implemented caching** for static data.
- **Optimized advocate queries** for filtering and pagination.
- **Added proper indexes** for fast search.

---

## Frontend & UX Improvements

The original frontend was a simple React page with a single search box, a reset button, and a table that displayed all advocates. Filtering was done entirely on the client, and all data was fetched and held in memory. While this works for small datasets, it does not scale and does not provide a modern, user-friendly experience.

---

### 1. **Advanced Filtering & UX Rationale**

- **Specialty, Degree, and Location Filters:**
I introduced multi-select dropdowns for specialties, degrees, and cities. This allows users to filter advocates by any combination of these attributes.

- **OR Logic:** I chose to implement all filters (specialties, degrees, and locations) as an "OR" condition. This means advocates are shown if they match _any_ of the selected specialties, _any_ of the selected degrees, or _any_ of the selected cities. I intentionally avoided mixing "AND" and "OR" logic between filter types, as this can be confusing for users, especially since some fields (like location and degree) are mutually exclusive for a single advocate. Keeping all filters as "OR" conditions provides a consistent and predictable experience, similar to what users expect from major platforms like Amazon.

- **Pillboxes for Active Filters:**
I added pillboxes to visually represent each active filter. This provides immediate feedback to the user about what filters are applied and allows for quick removal of any filter by clicking the "X" on the pill. This is a familiar pattern from e-commerce and search UIs, reducing cognitive load and making the interface more intuitive.
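The OR semantics above can be sketched as a predicate (types and names are illustrative; the actual filtering happens server-side in SQL):

```typescript
type Advocate = { cityId: number; degreeId: number; specialtyIds: number[] };
type Filters = { cityIds: number[]; degreeIds: number[]; specialtyIds: number[] };

// An advocate is shown if NO filters are selected, or if it matches ANY
// selected city, ANY selected degree, or ANY selected specialty.
function matches(a: Advocate, f: Filters): boolean {
  const nothingSelected =
    f.cityIds.length + f.degreeIds.length + f.specialtyIds.length === 0;
  return (
    nothingSelected ||
    f.cityIds.includes(a.cityId) ||
    f.degreeIds.includes(a.degreeId) ||
    f.specialtyIds.some((id) => a.specialtyIds.includes(id))
  );
}

const adv: Advocate = { cityId: 1, degreeId: 9, specialtyIds: [3] };
console.log(matches(adv, { cityIds: [2], degreeIds: [], specialtyIds: [3] })); // true
```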

---

### 2. **Pagination for Scalability**

- **Why Pagination:**
The original implementation fetched all advocates and filtered them on the client. This approach does not scale: fetching and rendering millions of records is not feasible and would result in poor performance and a poor user experience.
- **Server-Side Pagination:**
I implemented server-side pagination, fetching only the advocates needed for the current page. This keeps the UI fast and responsive, regardless of the total dataset size, and is a best practice for scalable web applications. We can control how beefy our servers are, but we can't control the specs of the client's computer; it is always better to keep that control and push the heavy lifting onto systems we manage.
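The page-to-range math behind that is simple (a sketch; the actual query parameter names are assumed):

```typescript
// Translate a 1-indexed page number into SQL LIMIT/OFFSET values.
function pageToRange(page: number, pageSize: number): { limit: number; offset: number } {
  const safePage = Math.max(1, Math.floor(page)); // clamp bad input to page 1
  return { limit: pageSize, offset: (safePage - 1) * pageSize };
}

console.log(pageToRange(3, 25)); // { limit: 25, offset: 50 }
```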

---

### 3. **Multi-Select Dropdowns**

- **Componentization:**
I built a reusable `MultiSelectFilter` component for specialties, degrees, and cities. This component supports selecting multiple options, displays the count of selected items, and is keyboard accessible.
- **UX:**
Multi-select dropdowns are a familiar and efficient way for users to select multiple filters without cluttering the UI with dozens of checkboxes.

---

### 4. **Branding & Visual Design**

- **Tailwind Customization:**
I configured Tailwind CSS to use the Solace brand color (`#347866`) throughout the UI for buttons, pillboxes, and highlights. This ensures a cohesive and professional look that matches the company's identity. One concern I did not address is the accessibility (a11y) of the color scheme; it might not be friendly for people with red/green color blindness.
- **Brand Logo:**
I extracted the SVG logo from the Solace website and created a reusable React component for it, placing it prominently in the page header.

---

### 5. **Other UX Enhancements**

- **Clear Filter Feedback:**
The UI always shows which filters are active, and users can remove any filter with a single click.
- **Responsive Layout:**
The filter section and table are responsive and look good on various screen sizes.
- **Accessible Controls:**
All form controls are labeled, and the dropdowns are keyboard navigable.
- **No Sorting for Now:**
I chose not to implement column sorting, as most fields are text-based and do not lend themselves to meaningful ordering. The only numeric field, years of experience, can be filtered by setting a minimum value, which is more useful for this context.

---

### 6. **Specialties Display in the Table**

- **Reducing Visual Noise:**
Initially, displaying all specialties for each advocate in the table made the UI cluttered and overwhelming, especially for advocates with many specialties. To improve readability and reduce noise, I truncated the list to show only the first two specialties by default. If an advocate has more specialties, a "+N more" link appears, allowing users to expand and view the full list on demand. This keeps the table clean and focused, while still making all information accessible.

- **Prioritizing Filtered Specialties:**
When a user filters by specialty, advocates who match the filter may have many specialties. To make the filtered results more meaningful and user-friendly, I biased the display order so that the specialties matching the user's filter appear first in the truncated list. This ensures that the most relevant information is immediately visible, and users can quickly see why a particular advocate matched their search criteria, even before expanding the full list.
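The truncation and bias logic can be sketched like this (function and variable names are illustrative, not the actual component code):

```typescript
// Show at most `max` specialties, with filter matches first; report how
// many are hidden behind the "+N more" link.
function displaySpecialties(
  all: string[],
  filtered: string[],
  max = 2,
): { shown: string[]; moreCount: number } {
  // Matching specialties first, preserving original order within each group.
  const prioritized = [
    ...all.filter((s) => filtered.includes(s)),
    ...all.filter((s) => !filtered.includes(s)),
  ];
  return {
    shown: prioritized.slice(0, max),
    moreCount: Math.max(0, prioritized.length - max),
  };
}

const r = displaySpecialties(["Trauma", "PTSD", "Sleep"], ["Sleep"]);
console.log(r.shown, r.moreCount); // [ 'Sleep', 'Trauma' ] 1
```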

---

### 7. **Summary of Frontend Improvements**

- **Modern, scalable filtering UI** with multi-selects and pillboxes.
- **Server-side pagination** for performance and scalability.
- **Brand-consistent design** with custom colors and logo.
- **Component-based architecture** for maintainability and reuse.
- **User-centric UX decisions** inspired by best practices from leading platforms.

---
42 changes: 41 additions & 1 deletion README.md
@@ -1 +1,41 @@
# SolaceNextJS
## Solace Candidate Assignment

This is a [Next.js](https://nextjs.org/) project bootstrapped with [`create-next-app`](https://github.com/vercel/next.js/tree/canary/packages/create-next-app).

## Getting Started

Install dependencies

```bash
npm i
```

Run the development server:

```bash
npm run dev
```

## Database setup

The app is configured to return a default list of advocates. This will allow you to get the app up and running without needing to configure a database. If you’d like to configure a database, you’re encouraged to do so. You can uncomment the url in `.env` and the line in `src/app/api/advocates/route.ts` to test retrieving advocates from the database.

1. Feel free to use whatever configuration of postgres you like. The project is set up to use docker-compose.yml to set up postgres. The url is in .env.

```bash
docker compose up -d
```

2. Create a `solaceassignment` database.

3. Push migration to the database

```bash
npx drizzle-kit push
```

4. Seed the database

```bash
curl -X POST http://localhost:3000/api/seed
```
14 changes: 14 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,14 @@
services:
  db:
    image: postgres
    restart: always
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: solaceassignment
    volumes:
      - psql:/var/lib/postgresql/data
    ports:
      - 5432:5432
volumes:
  psql:
11 changes: 11 additions & 0 deletions drizzle.config.ts
@@ -0,0 +1,11 @@
const config = {
  dialect: "postgresql",
  schema: "./src/db/schema.ts",
  dbCredentials: {
    url: process.env.DATABASE_URL,
  },
  verbose: true,
  strict: true,
};

export default config;