Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions .jules/sentinel.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,3 +2,13 @@
**Vulnerability:** The application was passing unvalidated HTML variables, specifically `citation.formattedHtml`, to React's `dangerouslySetInnerHTML` prop in multiple components (`src/components/wiki/sortable-citation.tsx`, `src/app/cite/page.tsx`, `src/app/share/[code]/page.tsx`).
**Learning:** This is a classic pattern for Cross-Site Scripting (XSS). If a citation's contents originated from an untrusted source or were maliciously formatted, an attacker could execute arbitrary scripts in a user's session when the citation is rendered.
**Prevention:** Always sanitize any untrusted or dynamic HTML before rendering it in React. In a Next.js (SSR) application, use a library like `isomorphic-dompurify` to safely strip malicious scripts from the HTML payload on both the client and server side without hydration errors.

## 2025-02-28 - [SSRF via Unrestricted Node.js fetch]
**Vulnerability:** The application used Node.js native `fetch` to scrape arbitrary URLs supplied by the user (`/api/lookup/url`) without checking the resolved IP address.
**Learning:** This exposes internal services (e.g., `127.0.0.1`, AWS metadata `169.254.169.254`, or VPC private subnets) to Server-Side Request Forgery (SSRF). Attackers can bypass domain-based blocklists by pointing their domain to an internal IP (DNS rebinding) or simply providing a local IP.
**Prevention:** To prevent SSRF when using native `fetch`, validate protocols (e.g., only `http:` and `https:`) and perform pre-fetch DNS validation using `dns.lookup`. Ensure the resolved IP does not fall within private, loopback, or reserved IP ranges before initiating the HTTP request.

## 2025-02-28 - [DNS Rebinding Mitigation Tradeoffs]
**Vulnerability:** Even when performing pre-fetch DNS validation before calling native `fetch(url)` in Node.js to protect against SSRF, an attacker can use DNS Rebinding to switch the IP address between the Time-Of-Check (TOC) and Time-Of-Use (TOU).
**Learning:** Fixing TOCTOU completely in native `fetch` by rewriting the URL to use the resolved IP breaks Server Name Indication (SNI) and TLS certificate validation for HTTPS requests, making it impractical without introducing other major flaws.
**Prevention:** Full protection against DNS rebinding while preserving SNI in Node.js requires using lower-level custom HTTP agents (like `undici` directly) with pinned DNS resolution. Since this is an architectural change, the initial pre-fetch validation check provides the best defense-in-depth tradeoff for this specific codebase constraints.
54 changes: 54 additions & 0 deletions src/app/api/lookup/url/route.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,13 @@ import { NextRequest } from 'next/server';
// Mock fetch globally
global.fetch = vi.fn();

// Mock dns promises
vi.mock('dns/promises', () => ({
lookup: vi.fn().mockResolvedValue({ address: '93.184.216.34', family: 4 }),
}));

import { lookup } from 'dns/promises';

describe('URL Lookup API', () => {
beforeEach(() => {
vi.clearAllMocks();
Expand Down Expand Up @@ -38,6 +45,53 @@ describe('URL Lookup API', () => {
expect(data.error).toContain('Invalid URL format');
});

it('should return error for invalid protocol', async () => {
const request = new NextRequest('http://localhost/api/lookup/url', {
method: 'POST',
body: JSON.stringify({ url: 'file:///etc/passwd' }),
});

const response = await POST(request);
const data = await response.json();

expect(response.status).toBe(400);
expect(data.success).toBe(false);
expect(data.error).toContain('Invalid protocol');
});

it('should return error for private IP addresses (SSRF protection)', async () => {
// Mock lookup to return a private IP
vi.mocked(lookup).mockResolvedValueOnce({ address: '127.0.0.1', family: 4 });

const request = new NextRequest('http://localhost/api/lookup/url', {
method: 'POST',
body: JSON.stringify({ url: 'http://localhost:8080/admin' }),
});

const response = await POST(request);
const data = await response.json();

expect(response.status).toBe(403);
expect(data.success).toBe(false);
expect(data.error).toContain('Access to internal networks is forbidden');
});

it('should return error if DNS lookup fails', async () => {
vi.mocked(lookup).mockRejectedValueOnce(new Error('ENOTFOUND'));

const request = new NextRequest('http://localhost/api/lookup/url', {
method: 'POST',
body: JSON.stringify({ url: 'http://this-domain-does-not-exist-1234.com' }),
});

const response = await POST(request);
const data = await response.json();

expect(response.status).toBe(400);
expect(data.success).toBe(false);
expect(data.error).toContain('Could not resolve hostname');
});
Comment on lines +62 to +93
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new SSRF tests cover invalid protocol, loopback IPv4, and DNS failures, but they don’t exercise important bypass cases introduced by the implementation: IPv6 ULA/link-local (e.g. fdxx::/8, fe8x::/10), multi-address DNS results, and redirects to internal networks. Adding test cases for these would help ensure the protections actually hold for the intended threat model.

Copilot uses AI. Check for mistakes.

it('should extract metadata from HTML', async () => {
const mockHtml = `
<html>
Expand Down
62 changes: 62 additions & 0 deletions src/app/api/lookup/url/route.ts
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,43 @@
*/

import { NextRequest, NextResponse } from 'next/server';
import { lookup } from 'dns/promises';
import type { SourceType } from '@/types';

// Utility to check if an IP is private/local
function isPrivateIP(ip: string): boolean {
// IPv4 Private/Reserved Ranges
if (
ip.startsWith('10.') ||
ip.startsWith('127.') ||
ip.startsWith('169.254.') ||
ip.startsWith('192.168.') ||
ip === '0.0.0.0' ||
ip === '255.255.255.255'
) {
return true;
}

// 172.16.0.0/12
if (ip.startsWith('172.')) {
const secondOctet = parseInt(ip.split('.')[1], 10);
if (secondOctet >= 16 && secondOctet <= 31) return true;
}

// IPv6 Private/Reserved Ranges
if (
ip === '::1' ||
ip === '::' ||
ip.toLowerCase().startsWith('fc00:') ||
ip.toLowerCase().startsWith('fd00:') ||
ip.toLowerCase().startsWith('fe80:')
) {
return true;
}

return false;
Comment on lines 13 to +46
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IPv6 checks in isPrivateIP are incomplete for the CIDR ranges being targeted: fc00::/7 is not equivalent to startsWith('fc00:')/startsWith('fd00:') (e.g. fd12::1 would be allowed), and fe80::/10 is not equivalent to startsWith('fe80:') (e.g. fe9a:: would be allowed). This leaves bypasses for internal/ULA/link-local IPv6 addresses; consider using a proper IP parser + CIDR range checks (e.g., ipaddr.js or similar) instead of string prefixes.

Suggested change
import type { SourceType } from '@/types';
// Utility to check if an IP is private/local
function isPrivateIP(ip: string): boolean {
// IPv4 Private/Reserved Ranges
if (
ip.startsWith('10.') ||
ip.startsWith('127.') ||
ip.startsWith('169.254.') ||
ip.startsWith('192.168.') ||
ip === '0.0.0.0' ||
ip === '255.255.255.255'
) {
return true;
}
// 172.16.0.0/12
if (ip.startsWith('172.')) {
const secondOctet = parseInt(ip.split('.')[1], 10);
if (secondOctet >= 16 && secondOctet <= 31) return true;
}
// IPv6 Private/Reserved Ranges
if (
ip === '::1' ||
ip === '::' ||
ip.toLowerCase().startsWith('fc00:') ||
ip.toLowerCase().startsWith('fd00:') ||
ip.toLowerCase().startsWith('fe80:')
) {
return true;
}
return false;
import * as ipaddr from 'ipaddr.js';
import type { SourceType } from '@/types';
// Utility to check if an IP is private/local
function isPrivateIP(ip: string): boolean {
try {
const addr = ipaddr.parse(ip);
if (addr.kind() === 'ipv4') {
return (
addr.match(ipaddr.parseCIDR('10.0.0.0/8')) ||
addr.match(ipaddr.parseCIDR('127.0.0.0/8')) ||
addr.match(ipaddr.parseCIDR('169.254.0.0/16')) ||
addr.match(ipaddr.parseCIDR('172.16.0.0/12')) ||
addr.match(ipaddr.parseCIDR('192.168.0.0/16')) ||
addr.toString() === '0.0.0.0' ||
addr.toString() === '255.255.255.255'
);
}
return (
addr.toString() === '::1' ||
addr.toString() === '::' ||
addr.match(ipaddr.parseCIDR('fc00::/7')) ||
addr.match(ipaddr.parseCIDR('fe80::/10'))
);
} catch {
return false;
}

Copilot uses AI. Check for mistakes.
}

interface MetadataResult {
title?: string;
description?: string;
Expand Down Expand Up @@ -77,6 +112,33 @@
);
}

// SSRF Protection: Only allow HTTP and HTTPS
if (parsedUrl.protocol !== 'http:' && parsedUrl.protocol !== 'https:') {
return NextResponse.json(
{ success: false, error: 'Invalid protocol' },
{ status: 400 }
);
}

Comment on lines +115 to +122
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with the pre-fetch DNS/IP validation, fetch() will follow redirects by default. A request to a public URL that 3xx-redirects to an internal host/IP would bypass the current guard because only the original hostname is validated. Consider using redirect: 'manual' and explicitly validating (and optionally following) redirect targets with the same SSRF checks.

Copilot uses AI. Check for mistakes.
// SSRF Protection: DNS rebinding and internal network check
let resolvedIp: string;
try {
const { address } = await lookup(parsedUrl.hostname);
if (isPrivateIP(address)) {
return NextResponse.json(
{ success: false, error: 'Access to internal networks is forbidden' },
{ status: 403 }
);
}
resolvedIp = address;

Check failure on line 133 in src/app/api/lookup/url/route.ts

View workflow job for this annotation

GitHub Actions / Lint

'resolvedIp' is assigned a value but never used. Allowed unused vars must match /^_/u
Comment on lines +126 to +133
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dns.lookup(parsedUrl.hostname) only returns a single address. If the hostname has multiple A/AAAA records (or changes between the check and fetch), the pre-check can pass on a public IP while the actual fetch resolves to a private/loopback/link-local IP. Consider calling lookup with { all: true } and rejecting if any resolved address is private, and/or pinning resolution at request time via a custom dispatcher/agent if you need stronger DNS-rebinding resistance.

Suggested change
const { address } = await lookup(parsedUrl.hostname);
if (isPrivateIP(address)) {
return NextResponse.json(
{ success: false, error: 'Access to internal networks is forbidden' },
{ status: 403 }
);
}
resolvedIp = address;
const addresses = await lookup(parsedUrl.hostname, { all: true });
if (addresses.length === 0) {
return NextResponse.json(
{ success: false, error: 'Could not resolve hostname' },
{ status: 400 }
);
}
if (addresses.some(({ address }) => isPrivateIP(address))) {
return NextResponse.json(
{ success: false, error: 'Access to internal networks is forbidden' },
{ status: 403 }
);
}
resolvedIp = addresses[0].address;

Copilot uses AI. Check for mistakes.
} catch (dnsError) {
Comment on lines +124 to +134
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolvedIp is assigned but never used. Either remove it, or use it for something meaningful (e.g., logging/telemetry), otherwise it becomes dead code and can mislead readers into thinking the resolved IP is being used/pinned for the subsequent request.

Copilot uses AI. Check for mistakes.
console.warn(`DNS lookup failed for ${parsedUrl.hostname}:`, dnsError);
return NextResponse.json(
{ success: false, error: 'Could not resolve hostname' },
{ status: 400 }
);
}

// Detect the source type from the URL and grab any type-specific hints
const detection = detectSourceTypeFromUrl(parsedUrl);

Expand Down
Loading