1 change: 1 addition & 0 deletions .env
@@ -0,0 +1 @@
#DATABASE_URL=postgresql://postgres:password@localhost/solaceassignment
3 changes: 3 additions & 0 deletions .eslintrc.json
@@ -0,0 +1,3 @@
{
  "extends": "next/core-web-vitals"
}
36 changes: 36 additions & 0 deletions .gitignore
@@ -0,0 +1,36 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.js
.yarn/install-state.gz

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# local env files
.env*.local

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts
242 changes: 242 additions & 0 deletions DISCUSSION.md
@@ -0,0 +1,242 @@
# Solace Advocate Search Improvements

This document outlines the backend and data schema changes I made to improve scalability, maintainability, and performance for the Solace assignment. We'll start with the backend and data model, then cover the frontend in a later section.

NOTE: The goal of this exercise is to communicate my ability and proficiency in building systems. I am a big fan of small, digestible pull requests, but splitting the amount of improvements I wanted to do into lots of small PRs would have made the goal of this exercise very hard to communicate. It would be taxing to ask someone to glue 5-8 small PRs together in their head while trying to understand all the changes, so I decided to spend more time on this doc and step through everything I did.

---

![Screenshot of the Solace Advocate Search UI](./screenshot.png)

[![Watch a walkthrough video on Loom](https://cdn.loom.com/sessions/thumbnails/c67e8b910b404b289c55d8aab891205a-899f85c1e09d2c59-full-play.gif)](https://www.loom.com/share/c67e8b910b404b289c55d8aab891205a)

## Introduction

The original backend schema used a flat structure with embedded JSONB arrays for specialties and stored city and degree as plain text fields. While this works for small datasets, it does not scale well for larger, relational data or for efficient querying and filtering. My changes focus on leveraging relational database best practices, improving query performance, and making the API more maintainable and extensible. I learned early on that you are not smarter than the people who wrote Postgres: this is relational data, and relational databases handle relational data really, really well.

---

## Backend & Data Schema Changes

### 1. **Normalized Data Model**

#### **Original Schema Example**

```typescript
const advocates = pgTable('advocates', {
  id: serial('id').primaryKey(),
  firstName: text('first_name').notNull(),
  lastName: text('last_name').notNull(),
  city: text('city').notNull(),
  degree: text('degree').notNull(),
  specialties: jsonb('payload').default([]).notNull(),
  yearsOfExperience: integer('years_of_experience').notNull(),
  phoneNumber: bigint('phone_number', { mode: 'number' }).notNull(),
  createdAt: timestamp('created_at').default(sql`CURRENT_TIMESTAMP`),
});
```

#### **My Improved Schema**

- **Cities and Degrees** are now separate tables, referenced by foreign keys in the `advocates` table.
- **Specialties** are modeled as a many-to-many relationship using a linking table (`advocate_specialties`).
- **Advocates** reference `cityId` and `degreeId` as FKs, and their specialties are joined via the linking table.

```typescript
export const cities = pgTable('cities', {
  id: integer('id').primaryKey(),
  name: varchar('name', { length: 255 }).notNull().unique(),
});

export const degrees = pgTable('degrees', {
  id: integer('id').primaryKey(),
  name: varchar('name', { length: 255 }).notNull().unique(),
});

export const specialties = pgTable('specialties', {
  id: integer('id').primaryKey(),
  name: varchar('name', { length: 255 }).notNull().unique(),
});

export const advocates = pgTable('advocates', {
  id: serial('id').primaryKey(),
  firstName: varchar('first_name', { length: 255 }).notNull(),
  lastName: varchar('last_name', { length: 255 }).notNull(),
  cityId: integer('city_id')
    .notNull()
    .references(() => cities.id),
  degreeId: integer('degree_id')
    .notNull()
    .references(() => degrees.id),
  yearsOfExperience: integer('years_of_experience').notNull(),
  phoneNumber: bigint('phone_number', { mode: 'number' }).notNull(),
  createdAt: timestamp('created_at').default(sql`CURRENT_TIMESTAMP`),
});

export const advocateSpecialties = pgTable('advocate_specialties', {
  advocateId: integer('advocate_id')
    .notNull()
    .references(() => advocates.id),
  specialtyId: integer('specialty_id')
    .notNull()
    .references(() => specialties.id),
});
```

#### **Why Normalize?**

- **Performance:** Foreign key (FK) indexes are much faster for filtering and joining than scanning and matching strings or arrays in a JSONB column.
- **Scalability:** As the dataset grows, relational queries with proper indexes remain fast, while JSONB array scans become slow.
- **Data Integrity:** Referential integrity is enforced by the database, preventing orphaned or invalid references.
- **Flexibility:** Adding, removing, or updating specialties, cities, or degrees is trivial and does not require updating every advocate record.

#### **Why Not JSONB for Specialties?**

- While Postgres can index JSONB, querying for "all advocates with specialty X" requires converting arrays and using special operators, which is more complex and less efficient than a join on a linking table. Converting the JSONB to an array and then dealing with all the scalar headache isn't worth it. Plus, we would be doing that conversion on every query, which is unnecessary repeated work. I went with the KISS approach (keep it simple, stupid).
- Relational databases are designed to handle relationships—using a linking table is idiomatic, efficient, and easy to query.

---

### 2. **API Endpoints for Reference Data**

- Created separate API routes for `cities`, `degrees`, and `specialties` (e.g., `/api/cities`, `/api/degrees`, `/api/specialties`).
- These endpoints return lists of options for use in dropdowns and filters.
- This separation allows for easy caching and reduces unnecessary database load.

---

### 3. **Caching for Static Data**

- Implemented a simple in-memory cache for endpoints like `cities`, `degrees`, and `specialties` since this data rarely changes.
- This reduces database queries and improves response times.
- For production, a distributed cache like Redis would be recommended for scalability and multi-instance support. For this exercise, I wrote a simple in-memory cache.
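As a sketch of the kind of in-memory cache described above (class and key names are illustrative, not the actual implementation):

```typescript
// Minimal in-memory TTL cache; stale entries are evicted lazily on read.
type Entry<T> = { value: T; expiresAt: number };

class TtlCache<T> {
  private store = new Map<string, Entry<T>>();
  constructor(private ttlMs: number) {}

  get(key: string): T | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key); // expired: drop and fall through to a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

// Hypothetical usage for the /api/cities endpoint:
const cityCache = new TtlCache<string[]>(60_000);
cityCache.set("all", ["Phoenix", "Denver"]);
console.log(cityCache.get("all")); // [ 'Phoenix', 'Denver' ]
```

On a cache miss the route handler would query the database and re-populate the cache before responding.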

---

### 4. **Advocate Query Improvements**

- The `/api/advocates` endpoint supports filtering by all fields, including specialties, cities, degrees, and years of experience.
- Used efficient SQL queries with joins and filters on indexed columns.
- For specialties, used a subquery to filter advocates by selected specialties via the linking table.
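The shape of that specialty subquery looks roughly like this (a sketch with illustrative literal IDs, using the table and column names from the schema above):

```sql
SELECT a.*
FROM advocates a
WHERE a.id IN (
  SELECT asp.advocate_id
  FROM advocate_specialties asp
  WHERE asp.specialty_id IN (1, 2)  -- selected specialty IDs
)
LIMIT 10 OFFSET 0;
```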

#### **Counting Results**

- To get the total count for pagination, I run a separate `SELECT COUNT(*)` query with the same filters.
- While Postgres supports window functions to get the count in a single query, Drizzle's documentation recommends two queries for clarity and compatibility.
- If performance becomes an issue, we could switch to a window function like:
```sql
SELECT *, COUNT(*) OVER() as total_count
FROM advocates
LEFT JOIN ...
WHERE ...
LIMIT ... OFFSET ...
```
This would return the total count with each row, eliminating the need for a second query.

---

### 5. **Indexing for Fast Search**

- Added GIN indexes with the `pg_trgm` extension on `first_name` and `last_name` to support fast, case-insensitive substring search (`ILIKE '%name%'`).
- This is critical for performance at scale and is handled via a migration file, not in the schema definition.
- Without a GIN index, the Postgres query planner would perform a sequential scan of the entire table, evaluating every record for filtering if no other selective indexes are present. This results in linear time complexity (O(n)), which does not scale well as the table grows.
- Chose `varchar` over `text` for first and last name fields to semantically indicate that these are short strings. In PostgreSQL, both types are stored inline for small values, but using `varchar` makes the intent clearer and can help prevent accidental misuse for large text data. If a large `text` value is accidentally stored, PostgreSQL will move it to the `pg_toast` table for out-of-line storage, which adds overhead and can negatively impact query performance.
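A sketch of what that migration could contain (index names are assumed, not the literal migration file):

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX IF NOT EXISTS idx_advocates_first_name_trgm
  ON advocates USING GIN (first_name gin_trgm_ops);

CREATE INDEX IF NOT EXISTS idx_advocates_last_name_trgm
  ON advocates USING GIN (last_name gin_trgm_ops);
```

With these in place, `ILIKE '%name%'` predicates can use an index scan instead of a sequential scan.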

---

### 6. **Preventing Too Many Database Connections in Development**

During development, I encountered the error `PostgresError: sorry, too many clients already`. This happens because, in environments with hot reloading (like Next.js), the backend code can be re-executed multiple times, causing a new Postgres client to be created on each reload. As a result, the database quickly hits its connection limit, leading to this error.

To fix this, I used the `globalThis` object to store the Postgres client and Drizzle instance. Before creating a new client, the code checks if one already exists on `globalThis` and reuses it if available. This ensures that only a single connection pool is maintained during development, preventing connection leaks and excessive client creation. This pattern is widely recommended for Node.js/Next.js projects and is safe because `globalThis` persists across module reloads in development; in production there is no hot reloading, so the module is evaluated only once anyway.
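The pattern can be sketched generically like this (in the real code the cached value is the Postgres client and Drizzle instance; a stub object stands in here so the sketch is self-contained):

```typescript
// Cache an expensive singleton on globalThis so hot reloads reuse it.
type Db = { createdAt: number };

function getDb(): Db {
  const g = globalThis as typeof globalThis & { _db?: Db };
  if (!g._db) {
    // Real code would do: g._db = drizzle(postgres(process.env.DATABASE_URL))
    g._db = { createdAt: Date.now() };
  }
  return g._db;
}

const a = getDb();
const b = getDb(); // simulates a second module evaluation after a reload
console.log(a === b); // true
```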

---

## **Summary**

- **Normalized the schema** for scalability and performance.
- **Split out reference data** (cities, degrees, specialties) into their own tables and API endpoints.
- **Used a linking table** for specialties to support efficient many-to-many queries.
- **Implemented caching** for static data.
- **Optimized advocate queries** for filtering and pagination.
- **Added proper indexes** for fast search.

---

## Frontend & UX Improvements

The original frontend was a simple React page with a single search box, a reset button, and a table that displayed all advocates. Filtering was done entirely on the client, and all data was fetched and held in memory. While this works for small datasets, it does not scale and does not provide a modern, user-friendly experience.

---

### 1. **Advanced Filtering & UX Rationale**

- **Specialty, Degree, and Location Filters:**
I introduced multi-select dropdowns for specialties, degrees, and cities. This allows users to filter advocates by any combination of these attributes.

- **OR Logic:** I chose to implement all filters (specialties, degrees, and locations) as an "OR" condition. This means advocates are shown if they match _any_ of the selected specialties, _any_ of the selected degrees, or _any_ of the selected cities. I intentionally avoided mixing "AND" and "OR" logic between filter types, as this can be confusing for users, especially since some fields (like location and degree) are mutually exclusive for a single advocate. Keeping all filters as "OR" conditions provides a consistent and predictable experience, similar to what users expect from major platforms like Amazon.

- **Pillboxes for Active Filters:**
I added pillboxes to visually represent each active filter. This provides immediate feedback to the user about what filters are applied and allows for quick removal of any filter by clicking the "X" on the pill. This is a familiar pattern from e-commerce and search UIs, reducing cognitive load and making the interface more intuitive.
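The OR semantics above can be sketched as a predicate (types and names are illustrative; the actual filtering happens server-side in SQL):

```typescript
type Advocate = { cityId: number; degreeId: number; specialtyIds: number[] };
type Filters = { cityIds: number[]; degreeIds: number[]; specialtyIds: number[] };

// An advocate is shown if NO filters are selected, or if it matches ANY
// selected city, ANY selected degree, or ANY selected specialty.
function matches(a: Advocate, f: Filters): boolean {
  const nothingSelected =
    f.cityIds.length + f.degreeIds.length + f.specialtyIds.length === 0;
  return (
    nothingSelected ||
    f.cityIds.includes(a.cityId) ||
    f.degreeIds.includes(a.degreeId) ||
    f.specialtyIds.some((id) => a.specialtyIds.includes(id))
  );
}

const adv: Advocate = { cityId: 1, degreeId: 9, specialtyIds: [3] };
console.log(matches(adv, { cityIds: [2], degreeIds: [], specialtyIds: [3] })); // true
```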

---

### 2. **Pagination for Scalability**

- **Why Pagination:**
The original implementation fetched all advocates and filtered them on the client. This approach does not scale: fetching and rendering millions of records is not feasible and would result in poor performance and a poor user experience.
- **Server-Side Pagination:**
I implemented server-side pagination, fetching only the advocates needed for the current page. This keeps the UI fast and responsive, regardless of the total dataset size, and is a best practice for scalable web applications. We can control how beefy our servers are, but we can't control the specs of the client's computer; it is always better to keep that control and push the heavy lifting onto systems we manage.
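The page-to-range math behind that is simple (a sketch; the actual query parameter names are assumed):

```typescript
// Translate a 1-indexed page number into SQL LIMIT/OFFSET values.
function pageToRange(page: number, pageSize: number): { limit: number; offset: number } {
  const safePage = Math.max(1, Math.floor(page)); // clamp bad input to page 1
  return { limit: pageSize, offset: (safePage - 1) * pageSize };
}

console.log(pageToRange(3, 25)); // { limit: 25, offset: 50 }
```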

---

### 3. **Multi-Select Dropdowns**

- **Componentization:**
I built a reusable `MultiSelectFilter` component for specialties, degrees, and cities. This component supports selecting multiple options, displays the count of selected items, and is keyboard accessible.
- **UX:**
Multi-select dropdowns are a familiar and efficient way for users to select multiple filters without cluttering the UI with dozens of checkboxes.

---

### 4. **Branding & Visual Design**

- **Tailwind Customization:**
I configured Tailwind CSS to use the Solace brand color (`#347866`) throughout the UI for buttons, pillboxes, and highlights. This ensures a cohesive and professional look that matches the company's identity. One concern I did not address is the accessibility (a11y) of the color scheme; it might not be friendly for people with red/green color blindness.
- **Brand Logo:**
I extracted the SVG logo from the Solace website and created a reusable React component for it, placing it prominently in the page header.

---

### 5. **Other UX Enhancements**

- **Clear Filter Feedback:**
The UI always shows which filters are active, and users can remove any filter with a single click.
- **Responsive Layout:**
The filter section and table are responsive and look good on various screen sizes.
- **Accessible Controls:**
All form controls are labeled, and the dropdowns are keyboard navigable.
- **No Sorting for Now:**
I chose not to implement column sorting, as most fields are text-based and do not lend themselves to meaningful ordering. The only numeric field, years of experience, can be filtered by setting a minimum value, which is more useful for this context.

---

### 6. **Specialties Display in the Table**

- **Reducing Visual Noise:**
Initially, displaying all specialties for each advocate in the table made the UI cluttered and overwhelming, especially for advocates with many specialties. To improve readability and reduce noise, I truncated the list to show only the first two specialties by default. If an advocate has more specialties, a "+N more" link appears, allowing users to expand and view the full list on demand. This keeps the table clean and focused, while still making all information accessible.

- **Prioritizing Filtered Specialties:**
When a user filters by specialty, advocates who match the filter may have many specialties. To make the filtered results more meaningful and user-friendly, I biased the display order so that the specialties matching the user's filter appear first in the truncated list. This ensures that the most relevant information is immediately visible, and users can quickly see why a particular advocate matched their search criteria, even before expanding the full list.
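The truncation and bias logic can be sketched like this (function and variable names are illustrative, not the actual component code):

```typescript
// Show at most `max` specialties, with filter matches first; report how
// many are hidden behind the "+N more" link.
function displaySpecialties(
  all: string[],
  filtered: string[],
  max = 2,
): { shown: string[]; moreCount: number } {
  // Matching specialties first, preserving original order within each group.
  const prioritized = [
    ...all.filter((s) => filtered.includes(s)),
    ...all.filter((s) => !filtered.includes(s)),
  ];
  return {
    shown: prioritized.slice(0, max),
    moreCount: Math.max(0, prioritized.length - max),
  };
}

const r = displaySpecialties(["Trauma", "PTSD", "Sleep"], ["Sleep"]);
console.log(r.shown, r.moreCount); // [ 'Sleep', 'Trauma' ] 1
```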

---

### 7. **Summary of Frontend Improvements**

- **Modern, scalable filtering UI** with multi-selects and pillboxes.
- **Server-side pagination** for performance and scalability.
- **Brand-consistent design** with custom colors and logo.
- **Component-based architecture** for maintainability and reuse.
- **User-centric UX decisions** inspired by best practices from leading platforms.

---
42 changes: 41 additions & 1 deletion README.md
@@ -1 +1,41 @@
# SolaceNextJS
## Solace Candidate Assignment

This is a [Next.js](https://nextjs.org/) project bootstrapped with [`create-next-app`](https://github.com/vercel/next.js/tree/canary/packages/create-next-app).

## Getting Started

Install dependencies

```bash
npm i
```

Run the development server:

```bash
npm run dev
```

## Database setup

The app is configured to return a default list of advocates. This will allow you to get the app up and running without needing to configure a database. If you’d like to configure a database, you’re encouraged to do so. You can uncomment the url in `.env` and the line in `src/app/api/advocates/route.ts` to test retrieving advocates from the database.

1. Feel free to use whatever configuration of postgres you like. The project is set up to use docker-compose.yml to set up postgres. The url is in .env.

```bash
docker compose up -d
```

2. Create a `solaceassignment` database.

3. Push migration to the database

```bash
npx drizzle-kit push
```

4. Seed the database

```bash
curl -X POST http://localhost:3000/api/seed
```
14 changes: 14 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,14 @@
services:
  db:
    image: postgres
    restart: always
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: solaceassignment
    volumes:
      - psql:/var/lib/postgresql/data
    ports:
      - 5432:5432
volumes:
  psql:
11 changes: 11 additions & 0 deletions drizzle.config.ts
@@ -0,0 +1,11 @@
const config = {
  dialect: "postgresql",
  schema: "./src/db/schema.ts",
  dbCredentials: {
    url: process.env.DATABASE_URL,
  },
  verbose: true,
  strict: true,
};

export default config;