Commit b8de7c1

Add look up table for FQN to cache
1 parent 4da4dde commit b8de7c1

8 files changed

Lines changed: 169 additions & 53 deletions

docs/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -26,6 +26,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
2626
- **Attribute constructor signature help.** Signature help and named parameter completion now fire inside PHP 8 attribute parentheses (`#[Route('/path', <>)]`), showing the attribute class's constructor parameters.
2727
- **Closure variable scope isolation.** Variables declared outside a closure are no longer offered as completions inside the closure body unless captured via `use()`. Previously outer variables leaked into closure scope.
2828

29+
### Changed
30+
31+
- **Faster class lookups.** Class resolution now uses an O(1) hash-map lookup by fully-qualified name instead of scanning every parsed file. Projects with hundreds of parsed files see reduced latency on completion, hover, and go-to-definition requests.
32+
2933
### Fixed
3034

3135
- **Signature help on function definitions.** Signature help no longer fires when the cursor is inside a function or method definition's parameter list (e.g. `function foo(int $a, |)`). Previously it could incorrectly show the signature of a same-named global function.

docs/todo.md

Lines changed: 1 addition & 1 deletion
@@ -76,11 +76,11 @@ parallel file processing.
7676
| 76 | Inline Variable | Medium | Code Actions | [actions.md §7](todo/actions.md#7-inline-variable) |
7777
| 77 | Extract Variable | Medium | Code Actions | [actions.md §8](todo/actions.md#8-extract-variable) |
7878
| 78 | Inline Function/Method | High | Code Actions | [actions.md §9](todo/actions.md#9-inline-functionmethod) |
79-
| 82 | FQN secondary index for `find_class_in_ast_map` | Low | Performance | [performance.md §1](todo/performance.md#1-fqn-secondary-index-for-find_class_in_ast_map) |
8079
| 84 | `HashSet` dedup in inheritance merging | Low | Performance | [performance.md §4](todo/performance.md#4-hashset-dedup-in-inheritance-merging) |
8180
| 87 | Reference-counted `ClassInfo` (`Arc<ClassInfo>`) | Medium | Performance | [performance.md §2](todo/performance.md#2-reference-counted-classinfo-arcclassinfo) |
8281
| 88 | Early-exit and `Cow` return in `apply_substitution` | Low | Performance | [performance.md §7](todo/performance.md#7-recursive-string-substitution-in-apply_substitution) |
8382
| 90 | Lazy autoload file indexing | Medium | Indexing | [indexing.md §2.5](todo/indexing.md#phase-25-lazy-autoload-file-indexing) |
83+
| 91 | Non-Composer function/constant discovery | Low | Indexing | [indexing.md §2.6](todo/indexing.md#phase-26-non-composer-function-and-constant-discovery) |
8484

8585
---
8686

docs/todo/indexing.md

Lines changed: 99 additions & 5 deletions
@@ -327,6 +327,102 @@ limitation applies to the classmap scanner today.
327327

328328
---
329329

330+
## Phase 2.6: Non-Composer function and constant discovery
331+
332+
**Goal:** In projects without Composer, discover standalone functions,
333+
`define()` constants, and top-level `const` declarations across the
334+
entire workspace so that function completion, go-to-definition, and
335+
constant resolution work without a `vendor/composer/autoload_files.php`
336+
manifest.
337+
338+
### Problem
339+
340+
Composer projects separate class discovery from function/constant
341+
discovery. Classes live in PSR-4/classmap directories and are found
342+
by namespace-to-path mapping. Functions and constants live in files
343+
listed explicitly in `autoload.files` (and transitively via
344+
`require_once` chains). PHPantom already handles both paths: the
345+
classmap scanner (Phase 1) covers classes, and the autoload-file
346+
walker covers functions and constants.
347+
348+
Non-Composer projects have no such separation. A single PHP file may
349+
define classes, standalone functions, and constants side by side.
350+
Today PHPantom's self-scan (Phase 1 fallback) only extracts class
351+
names from these files. Functions and constants are invisible until
352+
the user happens to open the file that defines them, at which point
353+
`update_ast` populates `global_functions` and `global_defines`.
354+
355+
This means non-Composer projects get no function name completion, no
356+
go-to-definition for cross-file functions, and no constant resolution
357+
for anything outside the currently open files.
358+
359+
### Relationship to Phase 2.5
360+
361+
Phase 2.5 replaces eager `update_ast` calls on Composer autoload files
362+
with a lightweight byte-level scan that extracts function names, constant
363+
names, and class names without building a full AST. The scanner design
364+
in Phase 2.5 (recognising `function`, `define(`, and `const` keywords
365+
alongside class keywords) is exactly what non-Composer discovery needs.
366+
367+
The difference is scope, not mechanism:
368+
369+
| Scenario | Files to scan for classes | Files to scan for functions/constants |
370+
|---|---|---|
371+
| **Composer** | PSR-4 + classmap directories + vendor packages | `autoload_files.php` entries only |
372+
| **Non-Composer** | All PHP files in workspace | All PHP files in workspace |
373+
374+
In Composer mode, the classmap scan and the autoload-file scan are
375+
separate passes over disjoint file sets. In non-Composer mode, a single
376+
pass over all workspace files extracts classes, functions, and constants
377+
together. The byte-level scanner from Phase 2.5 handles both cases: it
378+
just runs on a wider set of files and populates additional indices.
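
To make the scanner idea concrete, here is a minimal sketch of a single-pass keyword scan. It is not PHPantom's scanner: `ScanResult`, `scan_declarations`, and the helper functions are hypothetical names, and the sketch assumes lowercase keywords and ignores comments, strings, namespaces, and nesting, so methods and class constants are reported alongside top-level declarations.

```rust
/// Declaration names found by one pass over a PHP source file.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Default, PartialEq)]
pub struct ScanResult {
    pub classes: Vec<String>,
    pub functions: Vec<String>,
    pub constants: Vec<String>,
}

/// PHP identifiers consist of ASCII letters, digits, `_`, or bytes >= 0x80.
fn is_ident_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_' || b >= 0x80
}

/// Skips whitespace from `pos`, then reads one identifier if present.
fn read_ident(src: &[u8], mut pos: usize) -> Option<String> {
    while pos < src.len() && src[pos].is_ascii_whitespace() {
        pos += 1;
    }
    let start = pos;
    while pos < src.len() && is_ident_byte(src[pos]) {
        pos += 1;
    }
    (pos > start).then(|| String::from_utf8_lossy(&src[start..pos]).into_owned())
}

/// Skips whitespace from `pos`, then reads one quoted string literal.
fn read_quoted(src: &[u8], mut pos: usize) -> Option<String> {
    while pos < src.len() && src[pos].is_ascii_whitespace() {
        pos += 1;
    }
    let quote = *src.get(pos)?;
    if quote != b'\'' && quote != b'"' {
        return None;
    }
    let start = pos + 1;
    let len = src[start..].iter().position(|&b| b == quote)?;
    Some(String::from_utf8_lossy(&src[start..start + len]).into_owned())
}

/// Walks the bytes once, recording the name that follows each keyword.
pub fn scan_declarations(source: &str) -> ScanResult {
    let src = source.as_bytes();
    let mut out = ScanResult::default();
    let mut pos = 0;
    while pos < src.len() {
        if !is_ident_byte(src[pos]) {
            pos += 1;
            continue;
        }
        let start = pos;
        while pos < src.len() && is_ident_byte(src[pos]) {
            pos += 1;
        }
        // `$class` or `$function` is a variable, not a keyword.
        if start > 0 && src[start - 1] == b'$' {
            continue;
        }
        match &src[start..pos] {
            b"class" | b"interface" | b"trait" | b"enum" => {
                out.classes.extend(read_ident(src, pos))
            }
            // Closures (`function (`) yield no identifier and are skipped.
            b"function" => out.functions.extend(read_ident(src, pos)),
            b"const" => out.constants.extend(read_ident(src, pos)),
            b"define" if src.get(pos) == Some(&b'(') => {
                out.constants.extend(read_quoted(src, pos + 1))
            }
            _ => {}
        }
    }
    out
}
```

Calling `scan_declarations` on a file's contents returns all three name lists from one pass. The same function could be run on `autoload_files.php` entries or on every workspace file, which matches the "scope, not mechanism" distinction described above.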
379+
380+
### Implementation
381+
382+
Extend the Phase 1 self-scan fallback (the path taken when no
383+
`composer.json` exists) to also extract function and constant names:
384+
385+
1. **Scanner.** Reuse the extended byte-level scanner from Phase 2.5.
386+
When scanning a file, extract class declarations (as today) plus
387+
function declarations and `define()`/`const` constants. This is a
388+
single pass per file with no additional I/O.
389+
390+
2. **Indices.** Populate three indices from the scan results:
391+
- `classmap` — FQN → file path (already done by Phase 1).
392+
- `autoload_function_index` — function FQN → file path (new,
393+
same structure as Phase 2.5).
394+
- `autoload_constant_index` — constant name → file path (new,
395+
same structure as Phase 2.5).
396+
397+
3. **Resolution.** No additional resolution changes beyond Phase 2.5.
398+
`find_or_load_function` and constant resolution already consult
399+
the autoload indices when Phase 2.5 is in place. The only
400+
difference is that non-Composer mode populates those indices from
401+
a workspace walk instead of from `autoload_files.php`.
402+
403+
4. **Composer mode.** When Composer is present, the workspace-wide
404+
function/constant scan is unnecessary because `autoload_files.php`
405+
already tells us which files to scan. The broader scan only runs
406+
in non-Composer mode (no `composer.json`) or `"self"` strategy
407+
mode.
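
The steps above can be sketched roughly as follows. Only the three index names come from the plan; `FileSymbols`, `WorkspaceIndices`, `record_file`, and the `include_functions_and_constants` flag are hypothetical, and keeping the first definition when a name appears in several files is an assumption rather than documented behaviour.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Names the scanner extracted from one file (hypothetical shape).
#[derive(Debug, Default)]
pub struct FileSymbols {
    pub classes: Vec<String>,
    pub functions: Vec<String>,
    pub constants: Vec<String>,
}

/// The three indices named in step 2, each mapping a name to a file path.
#[derive(Debug, Default)]
pub struct WorkspaceIndices {
    pub classmap: HashMap<String, PathBuf>,
    pub autoload_function_index: HashMap<String, PathBuf>,
    pub autoload_constant_index: HashMap<String, PathBuf>,
}

/// Inserts `name -> path` unless the name is already indexed
/// (first definition wins; an assumption made for this sketch).
fn index_names(index: &mut HashMap<String, PathBuf>, names: &[String], path: &Path) {
    for name in names {
        index
            .entry(name.clone())
            .or_insert_with(|| path.to_path_buf());
    }
}

impl WorkspaceIndices {
    /// Records one scanned file. Classes are always indexed; functions and
    /// constants only when the workspace-wide scan applies (step 4: no
    /// `composer.json`, or the `"self"` strategy).
    pub fn record_file(
        &mut self,
        path: &Path,
        symbols: &FileSymbols,
        include_functions_and_constants: bool,
    ) {
        index_names(&mut self.classmap, &symbols.classes, path);
        if include_functions_and_constants {
            index_names(&mut self.autoload_function_index, &symbols.functions, path);
            index_names(&mut self.autoload_constant_index, &symbols.constants, path);
        }
    }
}
```

Passing `false` for the flag models Composer mode, where the function and constant indices are filled from `autoload_files.php` entries instead of the workspace walk.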
408+
409+
### Effort and dependencies
410+
411+
**Effort:** Low. This is a thin integration layer on top of Phase 2.5's
412+
scanner. The scanner already exists; the only new work is calling it on
413+
all workspace files (instead of just autoload files) and populating
414+
the function/constant indices from the results.
415+
416+
**Dependencies:** Phase 2.5 (the scanner and the autoload indices must
417+
exist before this phase can populate them from a different file set).
418+
Phase 1 (the workspace file walk for non-Composer projects must exist).
419+
420+
**Sequencing:** This phase should land immediately after Phase 2.5 or
421+
as part of the same PR, since the two share the scanner and index
422+
structures.
423+
424+
---
425+
330426
## Phase 3: Parallel file processing
331427

332428
**Goal:** Speed up workspace-wide operations (find references,
@@ -440,11 +536,9 @@ scanning, and complete completion item detail.
440536

441537
**Prerequisites (from [performance.md](performance.md)):**
442538

443-
- **§1 FQN secondary index.** The second pass calls `update_ast` on
444-
every file, populating `ast_map` with thousands of entries. Without
445-
a FQN index, every `find_class_in_ast_map` call (Phase 1 of
446-
`find_or_load_class`) becomes an O(thousands) linear scan. With the
447-
index, it is O(1).
539+
- **§1 FQN secondary index.** ✅ Done. `fqn_index` provides O(1)
540+
lookups by fully-qualified name, so the second pass populating
541+
`ast_map` with thousands of entries no longer causes linear scans.
448542
- **§2 `Arc<ClassInfo>`.** Full indexing stores a `ClassInfo` for every
449543
class in the project. Without `Arc`, every resolution clones the
450544
entire struct out of the map. With `Arc`, retrieval is a

docs/todo/performance.md

Lines changed: 14 additions & 45 deletions
@@ -16,51 +16,20 @@ within the same impact tier.
1616
---
1717

1818
## 1. FQN secondary index for `find_class_in_ast_map`
19-
**Impact: High · Effort: Low**
20-
21-
`find_class_in_ast_map` is the Phase 1 lookup in `find_or_load_class`.
22-
It iterates **every file's classes** in the `ast_map` to find a class
23-
by short name + namespace match. In a project with hundreds of parsed
24-
files, this O(files × classes_per_file) scan runs for every class
25-
lookup in every resolution chain. A single completion request that
26-
resolves `$this->` can invoke this dozens of times as it walks the
27-
inheritance chain, loads traits, resolves interfaces, and processes
28-
mixins.
29-
30-
The `class_index` (`HashMap<String, String>`) already maps FQN → URI
31-
but stops short: after finding the URI, the code still iterates all
32-
classes in that file to find the right one.
33-
34-
Under full background indexing (indexing.md Phase 5), "all files"
35-
means every PHP file in the project. The linear scan becomes the
36-
dominant bottleneck.
37-
38-
### Fix
39-
40-
Add a secondary index `HashMap<String, ClassInfo>` (or
41-
`HashMap<String, Arc<ClassInfo>>` if §2 lands first) that maps
42-
fully-qualified class names directly to their parsed `ClassInfo`.
43-
This turns every Phase 1 lookup into an O(1) hash lookup.
44-
45-
### Maintenance
46-
47-
The index must be updated in `update_ast_inner` (when files are
48-
opened/changed) and in `parse_and_cache_content_versioned` (when
49-
files are loaded on demand via classmap, PSR-4, or stubs). Both
50-
code paths already maintain `ast_map` and `class_index`, so adding
51-
a third insertion is straightforward.
52-
53-
When a file is re-parsed, remove all old entries for that file's
54-
classes (snapshot FQNs before overwriting `ast_map`) and insert
55-
the new ones. This mirrors the existing `class_index` maintenance
56-
in `update_ast_inner`.
57-
58-
### Migration path
59-
60-
Once the FQN index is in place, `find_class_in_ast_map` becomes a
61-
single hash lookup. The linear scan can be kept as a fallback for
62-
edge cases (e.g. anonymous classes that don't have stable FQNs) but
63-
should never be the primary path.
19+
**Impact: High · Effort: Low (fixed)**
20+
21+
**Status:** Fixed. `Backend` now carries a `fqn_index`
22+
(`Arc<RwLock<HashMap<String, ClassInfo>>>`) that maps fully-qualified
23+
class names directly to their parsed `ClassInfo`.
24+
`find_class_in_ast_map` performs an O(1) hash lookup against this
25+
index before falling back to the linear `ast_map` scan (which now
26+
only serves as a safety net for anonymous classes or race conditions
27+
during initial indexing).
28+
29+
The index is maintained in both `update_ast_inner` (stale entries
30+
removed via the `old_fqns` snapshot, new entries inserted alongside
31+
`class_index`) and `parse_and_cache_content_versioned` (entries
32+
inserted after the `ast_map` write).
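
The maintenance rule described above can be illustrated with a simplified, lock-free sketch. `ClassInfo` here is a two-field stand-in for the real struct, and `fqn_of` and `reindex_file` are hypothetical helpers; the actual code operates on `Arc<RwLock<HashMap<String, ClassInfo>>>` inside `update_ast_inner`.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real `ClassInfo` (assumption: only the two
/// fields needed to build an FQN are modelled here).
#[derive(Clone, Debug, PartialEq)]
pub struct ClassInfo {
    pub name: String,
    pub file_namespace: Option<String>,
}

/// Builds the fully-qualified name from namespace and short name.
pub fn fqn_of(cls: &ClassInfo) -> String {
    match &cls.file_namespace {
        Some(ns) if !ns.is_empty() => format!("{}\\{}", ns, cls.name),
        _ => cls.name.clone(),
    }
}

/// Replaces one file's entries in the index after a re-parse.
///
/// `old_fqns` is the snapshot of FQNs the file defined before the re-parse;
/// `classes` is the freshly parsed set.
pub fn reindex_file(
    fqn_index: &mut HashMap<String, ClassInfo>,
    old_fqns: &[String],
    classes: &[ClassInfo],
) {
    // Drop stale entries first so a renamed namespace leaves nothing behind.
    for old in old_fqns {
        fqn_index.remove(old);
    }
    for cls in classes {
        // Anonymous classes are internal bookkeeping and stay out of the index.
        if cls.name.starts_with("__anonymous@") {
            continue;
        }
        fqn_index.insert(fqn_of(cls), cls.clone());
    }
}
```

Removing by snapshot rather than by scanning the whole map keeps each re-parse proportional to the number of classes in the edited file.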
6433

6534
---
6635

src/lib.rs

Lines changed: 10 additions & 0 deletions
@@ -190,6 +190,14 @@ pub struct Backend {
190190
/// Populated during `update_ast` (using the file's namespace + class
191191
/// short name) and during server initialization for autoload files.
192192
pub(crate) class_index: Arc<RwLock<HashMap<String, String>>>,
193+
/// Secondary index mapping fully-qualified class names directly to
194+
/// their parsed `ClassInfo`.
195+
///
196+
/// This turns every Phase 1 lookup in [`find_or_load_class`] into an
197+
/// O(1) hash lookup instead of scanning all files in `ast_map`.
198+
/// Maintained alongside `class_index` in `update_ast_inner` and
199+
/// `parse_and_cache_content_versioned`.
200+
pub(crate) fqn_index: Arc<RwLock<HashMap<String, ClassInfo>>>,
193201
/// Composer classmap: fully-qualified class name → file path on disk.
194202
///
195203
/// Parsed from `<vendor>/composer/autoload_classmap.php` during server
@@ -317,6 +325,7 @@ impl Backend {
317325
global_functions: Arc::new(RwLock::new(HashMap::new())),
318326
global_defines: Arc::new(RwLock::new(HashMap::new())),
319327
class_index: Arc::new(RwLock::new(HashMap::new())),
328+
fqn_index: Arc::new(RwLock::new(HashMap::new())),
320329
classmap: Arc::new(RwLock::new(HashMap::new())),
321330
stub_index: stubs::build_stub_class_index(),
322331
stub_function_index: stubs::build_stub_function_index(),
@@ -458,6 +467,7 @@ impl Backend {
458467
global_functions: Arc::clone(&self.global_functions),
459468
global_defines: Arc::clone(&self.global_defines),
460469
class_index: Arc::clone(&self.class_index),
470+
fqn_index: Arc::clone(&self.fqn_index),
461471
classmap: Arc::clone(&self.classmap),
462472
stub_index: self.stub_index.clone(),
463473
resolved_class_cache: Arc::clone(&self.resolved_class_cache),

src/parser/ast_update.rs

Lines changed: 12 additions & 2 deletions
@@ -303,13 +303,20 @@ impl Backend {
303303
// that files with multiple namespace blocks produce correct FQNs.
304304
{
305305
let mut idx = self.class_index.write();
306+
let mut fqn_idx = self.fqn_index.write();
306307
// Remove stale entries from previous parses of this file.
307308
// When a file's namespace changes (e.g. while the user is
308309
// typing a namespace declaration), old FQNs linger under
309310
// the previous namespace and pollute completions.
310311
idx.retain(|_, uri| uri != &uri_string);
311312

312-
for (class, class_ns) in &classes_with_ns {
313+
// Remove stale fqn_index entries for FQNs that belonged to
314+
// the previous version of this file.
315+
for old_fqn in &old_fqns {
316+
fqn_idx.remove(old_fqn);
317+
}
318+
319+
for (i, (class, class_ns)) in classes_with_ns.iter().enumerate() {
313320
// Anonymous classes (named `__anonymous@<offset>`) are
314321
// internal bookkeeping — they should never appear in
315322
// cross-file lookups or completion results.
@@ -321,7 +328,10 @@ impl Backend {
321328
} else {
322329
class.name.clone()
323330
};
324-
idx.insert(fqn, uri_string.clone());
331+
idx.insert(fqn.clone(), uri_string.clone());
332+
// The `classes` vec already has `file_namespace` set,
333+
// so use it for the fqn_index entry.
334+
fqn_idx.insert(fqn, classes[i].clone());
325335
}
326336
}
327337

src/resolution.rs

Lines changed: 16 additions & 0 deletions
@@ -201,6 +201,22 @@ impl Backend {
201201
.write()
202202
.insert(uri.to_owned(), file_namespace);
203203

204+
// Populate the fqn_index so that `find_class_in_ast_map` can
205+
// resolve these classes via O(1) hash lookup.
206+
{
207+
let mut fqn_idx = self.fqn_index.write();
208+
for cls in &classes {
209+
if cls.name.starts_with("__anonymous@") {
210+
continue;
211+
}
212+
let fqn = match &cls.file_namespace {
213+
Some(ns) if !ns.is_empty() => format!("{}\\{}", ns, cls.name),
214+
_ => cls.name.clone(),
215+
};
216+
fqn_idx.insert(fqn, cls.clone());
217+
}
218+
}
219+
204220
// Selectively invalidate the resolved-class cache for the
205221
// classes defined in this file. Loading a new file from disk
206222
// (classmap, PSR-4, stubs) should not nuke cached resolutions

src/util.rs

Lines changed: 13 additions & 0 deletions
@@ -621,6 +621,19 @@ impl Backend {
621621
/// Returns a cloned `ClassInfo` if found, or `None`.
622622
pub(crate) fn find_class_in_ast_map(&self, class_name: &str) -> Option<ClassInfo> {
623623
let normalized = class_name.strip_prefix('\\').unwrap_or(class_name);
624+
625+
// ── Fast path: O(1) lookup via fqn_index ──
626+
// For namespace-qualified names the FQN is the normalized name
627+
// itself. For bare names (no backslash) the key is the short
628+
// name, which the index holds for global-namespace classes.
629+
if let Some(cls) = self.fqn_index.read().get(normalized) {
630+
return Some(cls.clone());
631+
}
632+
633+
// ── Slow fallback: linear scan of ast_map ──
634+
// Covers classes that are absent from the fqn_index: anonymous
635+
// classes (never inserted) and entries not yet written because
636+
// of race conditions during initial indexing.
624637
let last_segment = short_name(normalized);
625638
let expected_ns: Option<&str> = if normalized.contains('\\') {
626639
Some(&normalized[..normalized.len() - last_segment.len() - 1])
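
As a standalone illustration of the fast-path-then-fallback shape in this hunk, the sketch below uses plain `HashMap`s instead of the locked fields on `Backend`. `ClassInfo` is a simplified stand-in, `find_class` is a hypothetical free function, and matching bare names by short name alone in the fallback is an assumption, since the full fallback body is not shown in the diff.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real `ClassInfo` (assumption).
#[derive(Clone, Debug, PartialEq)]
pub struct ClassInfo {
    pub name: String,
    pub file_namespace: Option<String>,
}

/// Looks a class up by name: hash lookup first, linear scan second.
///
/// `fqn_index` maps FQN -> class; `ast_map` maps file URI -> classes.
pub fn find_class(
    fqn_index: &HashMap<String, ClassInfo>,
    ast_map: &HashMap<String, Vec<ClassInfo>>,
    class_name: &str,
) -> Option<ClassInfo> {
    // A leading backslash marks an absolute name; the index keys omit it.
    let normalized = class_name.strip_prefix('\\').unwrap_or(class_name);

    // Fast path: a single hash lookup.
    if let Some(cls) = fqn_index.get(normalized) {
        return Some(cls.clone());
    }

    // Slow fallback: split into namespace and short name, then scan
    // every class in every parsed file.
    let (expected_ns, short) = match normalized.rsplit_once('\\') {
        Some((ns, short)) => (Some(ns), short),
        None => (None, normalized),
    };
    ast_map
        .values()
        .flatten()
        .find(|cls| {
            cls.name == short
                && match expected_ns {
                    Some(ns) => cls.file_namespace.as_deref() == Some(ns),
                    // Assumption: a bare name matches on short name alone.
                    None => true,
                }
        })
        .cloned()
}
```

The fallback costs time proportional to the total number of parsed classes, so it only pays off as a safety net for names the index does not hold.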
