Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++20] [Modules] Support module level lookup #122887

Merged
merged 1 commit into from
Jan 15, 2025

Conversation

ChuanqiXu9
Copy link
Member

@ChuanqiXu9 ChuanqiXu9 commented Jan 14, 2025

Close #90154

This patch is also an optimization to the lookup process to utilize the information provided by export keyword.

Previously, in the lookup process, the export keyword only takes part in the check part, it doesn't get involved in the lookup process. That said, previously, in a name lookup for 'name', we would load all of declarations with the name 'name' and check if these declarations are valid or not. It works well. But it is inefficient since it may load declarations that may not be wanted.

Note that this patch actually did a trick in the lookup process instead of bring module information to DeclarationName or considering module information when deciding if two declarations are the same. So it may not be a surprise to me if there are missing cases. But it is not a regression. It should be already the case. Issue reports are welcomed.

In this patch, I tried to split the big lookup table into a lookup table as before and a module local lookup table, which takes a combination of the ID of the DeclContext and hash value of the primary module name as the key. And refactored DeclContext::lookup() method to take the module information. So that a lookup in a DeclContext won't load declarations that are local to other modules.

And also I think it is already beneficial to split the big lookup table since it may reduce the conflicts during lookups in the hash table.

BTW, this patch introduced a regression for a reachability rule in C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.

This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup table should be empty for them.


On the API side, this patch unfortunately add a maybe-confusing argument Module *NamedModule to ExternalASTSource::FindExternalVisibleDeclsByName(). People may think we can get the information from the first argument const DeclContext *DC. But sadly there are declarations (e.g., namespace) can appear in multiple different modules as a single declaration. So we have to add additional information to indicate this.

@ChuanqiXu9 ChuanqiXu9 added clang:modules C++20 modules and Clang Header Modules skip-precommit-approval PR for CI feedback, not intended for review labels Jan 14, 2025
@ChuanqiXu9 ChuanqiXu9 self-assigned this Jan 14, 2025
@llvmbot llvmbot added clang Clang issues not falling into any other category lldb clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jan 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-lldb
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-modules

Author: Chuanqi Xu (ChuanqiXu9)

Changes

Close #90154

This patch is also an optimization to the lookup process to utilize the information provided by export keyword.

Previously, in the lookup process, the export keyword only takes part in the check part, it doesn't get involved in the lookup process. That said, previously, in a name lookup for 'name', we would load all of declarations with the name 'name' and check if these declarations are valid or not. It works well. But it is inefficient since it may load declarations that may not be wanted.

Note that this patch actually did a trick in the lookup process instead of bring module information to DeclarationName or considering module information when deciding if two declarations are the same. So it may not be a surprise to me if there are missing cases. But it is not a regression. It should be already the case. Issue reports are welcomed.

In this patch, I tried to split the big lookup table into a lookup table as before and a module local lookup table, which takes a combination of the ID of the DeclContext and hash value of the primary module name as the key. And refactored DeclContext::lookup() method to take the module information. So that a lookup in a DeclContext won't load declarations that are local to other modules.

And also I think it is already beneficial to split the big lookup table since it may reduce the conflicts during lookups in the hash table.

BTW, this patch introduced a regression for a reachability rule in C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.

This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup table should be empty for them.


Patch is 76.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122887.diff

35 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/include/clang/AST/DeclBase.h (+10)
  • (modified) clang/include/clang/AST/ExternalASTMerger.h (+2-1)
  • (modified) clang/include/clang/AST/ExternalASTSource.h (+8-2)
  • (modified) clang/include/clang/Sema/MultiplexExternalSemaSource.h (+2-1)
  • (modified) clang/include/clang/Serialization/ASTBitCodes.h (+6)
  • (modified) clang/include/clang/Serialization/ASTReader.h (+28-4)
  • (modified) clang/include/clang/Serialization/ASTWriter.h (+13-3)
  • (modified) clang/lib/AST/DeclBase.cpp (+19-4)
  • (modified) clang/lib/AST/ExternalASTMerger.cpp (+2-1)
  • (modified) clang/lib/AST/ExternalASTSource.cpp (+3-3)
  • (modified) clang/lib/Interpreter/CodeCompletion.cpp (+4-2)
  • (modified) clang/lib/Sema/MultiplexExternalSemaSource.cpp (+4-3)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+158-37)
  • (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+52-17)
  • (modified) clang/lib/Serialization/ASTReaderInternals.h (+56-16)
  • (modified) clang/lib/Serialization/ASTWriter.cpp (+213-60)
  • (modified) clang/lib/Serialization/ASTWriterDecl.cpp (+11-2)
  • (modified) clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp (+2-2)
  • (modified) clang/test/CXX/module/basic/basic.link/p2.cppm (+1-2)
  • (modified) clang/test/CXX/module/module.import/p2.cpp (+2-8)
  • (modified) clang/test/CXX/module/module.interface/p7.cpp (+4-6)
  • (modified) clang/test/CXX/module/module.reach/p5.cpp (+1-2)
  • (modified) clang/test/Modules/Reachability-template-default-arg.cpp (+1-2)
  • (modified) clang/test/Modules/cxx20-10-1-ex2.cpp (+1-2)
  • (modified) clang/test/Modules/deduction-guide3.cppm (+1-3)
  • (added) clang/test/Modules/module-local-with-templates.cppm (+68)
  • (added) clang/test/Modules/pr90154.cppm (+25)
  • (modified) clang/unittests/AST/ExternalASTSourceTest.cpp (+2-1)
  • (modified) lldb/source/Plugins/ExpressionParser/Clang/ASTUtils.h (+6-4)
  • (modified) lldb/source/Plugins/ExpressionParser/Clang/ClangASTSource.cpp (+2-1)
  • (modified) lldb/source/Plugins/ExpressionParser/Clang/ClangASTSource.h (+5-3)
  • (modified) lldb/source/Plugins/ExpressionParser/Clang/ClangExternalASTSourceCallbacks.cpp (+2-1)
  • (modified) lldb/source/Plugins/ExpressionParser/Clang/ClangExternalASTSourceCallbacks.h (+2-1)
  • (modified) lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCDeclVendor.cpp (+2-1)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 9eeb872aa57d79..142ed6e19b0a79 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -316,6 +316,8 @@ C++23 Feature Support
 C++20 Feature Support
 ^^^^^^^^^^^^^^^^^^^^^
 
+- Implemented module level lookup for C++20 modules. (#GH90154)
+
 
 Resolutions to C++ Defect Reports
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index 77abd8b657a616..b999ae6724e3cd 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -836,6 +836,10 @@ class alignas(8) Decl {
     return isFromASTFile() ? getImportedOwningModule() : getLocalOwningModule();
   }
 
+  /// Get the top level owning named module that owns this declaration if any.
+  /// \returns nullptr if the declaration is not owned by a named module.
+  Module *getTopLevelOwningNamedModule() const;
+
   /// Get the module that owns this declaration for linkage purposes.
   /// There only ever is such a standard C++ module.
   Module *getOwningModuleForLinkage() const;
@@ -2711,6 +2715,12 @@ class DeclContext {
                    bool Deserialize = false) const;
 
 private:
+  /// Lookup all external visible declarations and the external declarations
+  /// within the same module specified by \param NamedModule. We can't
+  /// get it from \param this since the same declaration may be declared in
+  /// multiple modules. e.g., namespace.
+  lookup_result lookupImpl(DeclarationName Name, Module *NamedModule) const;
+
   /// Whether this declaration context has had externally visible
   /// storage added since the last lookup. In this case, \c LookupPtr's
   /// invariant may not hold and needs to be fixed before we perform
diff --git a/clang/include/clang/AST/ExternalASTMerger.h b/clang/include/clang/AST/ExternalASTMerger.h
index ec4cfbe2175c02..46f187c5e06948 100644
--- a/clang/include/clang/AST/ExternalASTMerger.h
+++ b/clang/include/clang/AST/ExternalASTMerger.h
@@ -141,7 +141,8 @@ class ExternalASTMerger : public ExternalASTSource {
 
   /// Implementation of the ExternalASTSource API.
   bool FindExternalVisibleDeclsByName(const DeclContext *DC,
-                                      DeclarationName Name) override;
+                                      DeclarationName Name,
+                                      Module *NamedModule) override;
 
   /// Implementation of the ExternalASTSource API.
   void
diff --git a/clang/include/clang/AST/ExternalASTSource.h b/clang/include/clang/AST/ExternalASTSource.h
index 4d7ff822fceb75..52f47802095171 100644
--- a/clang/include/clang/AST/ExternalASTSource.h
+++ b/clang/include/clang/AST/ExternalASTSource.h
@@ -51,6 +51,7 @@ class RecordDecl;
 class Selector;
 class Stmt;
 class TagDecl;
+class Module;
 
 /// Abstract interface for external sources of AST nodes.
 ///
@@ -145,12 +146,17 @@ class ExternalASTSource : public RefCountedBase<ExternalASTSource> {
   /// Find all declarations with the given name in the given context,
   /// and add them to the context by calling SetExternalVisibleDeclsForName
   /// or SetNoExternalVisibleDeclsForName.
+  /// \param NamedModule, when this is set the external module local
+  /// declarations within the same module of \param NamedModule will be found
+  /// too. The \param NamedModule may be different than the owning module of
+  /// \param DC since the same namespace can appear in multiple module units.
   /// \return \c true if any declarations might have been found, \c false if
   /// we definitely have no declarations with tbis name.
   ///
   /// The default implementation of this method is a no-op returning \c false.
-  virtual bool
-  FindExternalVisibleDeclsByName(const DeclContext *DC, DeclarationName Name);
+  virtual bool FindExternalVisibleDeclsByName(const DeclContext *DC,
+                                              DeclarationName Name,
+                                              Module *NamedModule);
 
   /// Load all the external specializations for the Decl \param D if \param
   /// OnlyPartial is false. Otherwise, load all the external **partial**
diff --git a/clang/include/clang/Sema/MultiplexExternalSemaSource.h b/clang/include/clang/Sema/MultiplexExternalSemaSource.h
index 0c92c52854c9e7..08d6143f7caaf3 100644
--- a/clang/include/clang/Sema/MultiplexExternalSemaSource.h
+++ b/clang/include/clang/Sema/MultiplexExternalSemaSource.h
@@ -95,7 +95,8 @@ class MultiplexExternalSemaSource : public ExternalSemaSource {
   /// Find all declarations with the given name in the
   /// given context.
   bool FindExternalVisibleDeclsByName(const DeclContext *DC,
-                                      DeclarationName Name) override;
+                                      DeclarationName Name,
+                                      Module *NamedModule) override;
 
   bool LoadExternalSpecializations(const Decl *D, bool OnlyPartial) override;
 
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index aac165130b7192..40dae25f7b54b7 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -738,6 +738,8 @@ enum ASTRecordTypes {
   CXX_ADDED_TEMPLATE_SPECIALIZATION = 74,
 
   CXX_ADDED_TEMPLATE_PARTIAL_SPECIALIZATION = 75,
+
+  UPDATE_MODULE_LOCAL_VISIBLE = 76,
 };
 
 /// Record types used within a source manager block.
@@ -1334,6 +1336,10 @@ enum DeclCode {
   /// into a DeclContext via DeclContext::lookup.
   DECL_CONTEXT_VISIBLE,
 
+  /// A record containing the set of declarations that are
+  /// only visible from DeclContext in the same module.
+  DECL_CONTEXT_MODULE_LOCAL_VISIBLE,
+
   /// A LabelDecl record.
   DECL_LABEL,
 
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 9f978762a6fb6b..ea12adaec3ee81 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -353,6 +353,7 @@ class ASTIdentifierLookupTrait;
 
 /// The on-disk hash table(s) used for DeclContext name lookup.
 struct DeclContextLookupTable;
+struct ModuleLocalLookupTable;
 
 /// The on-disk hash table(s) used for specialization decls.
 struct LazySpecializationInfoLookupTable;
@@ -523,9 +524,14 @@ class ASTReader
   /// in the chain.
   DeclUpdateOffsetsMap DeclUpdateOffsets;
 
+  struct LookupBlockOffsets {
+    uint64_t LexicalOffset;
+    uint64_t VisibleOffset;
+    uint64_t ModuleLocalOffset;
+  };
+
   using DelayedNamespaceOffsetMapTy =
-      llvm::DenseMap<GlobalDeclID, std::pair</*LexicalOffset*/ uint64_t,
-                                             /*VisibleOffset*/ uint64_t>>;
+      llvm::DenseMap<GlobalDeclID, LookupBlockOffsets>;
 
   /// Mapping from global declaration IDs to the lexical and visible block
   /// offset for delayed namespace in reduced BMI.
@@ -631,6 +637,9 @@ class ASTReader
   /// Map from a DeclContext to its lookup tables.
   llvm::DenseMap<const DeclContext *,
                  serialization::reader::DeclContextLookupTable> Lookups;
+  llvm::DenseMap<const DeclContext *,
+                 serialization::reader::ModuleLocalLookupTable>
+      ModuleLocalLookups;
 
   using SpecLookupTableTy =
       llvm::DenseMap<const Decl *,
@@ -659,6 +668,8 @@ class ASTReader
   /// Updates to the visible declarations of declaration contexts that
   /// haven't been loaded yet.
   llvm::DenseMap<GlobalDeclID, DeclContextVisibleUpdates> PendingVisibleUpdates;
+  llvm::DenseMap<GlobalDeclID, DeclContextVisibleUpdates>
+      PendingModuleLocalVisibleUpdates;
 
   using SpecializationsUpdate = SmallVector<UpdateData, 1>;
   using SpecializationsUpdateMap =
@@ -696,7 +707,8 @@ class ASTReader
   /// Read the record that describes the visible contents of a DC.
   bool ReadVisibleDeclContextStorage(ModuleFile &M,
                                      llvm::BitstreamCursor &Cursor,
-                                     uint64_t Offset, GlobalDeclID ID);
+                                     uint64_t Offset, GlobalDeclID ID,
+                                     bool IsModuleLocal);
 
   bool ReadSpecializations(ModuleFile &M, llvm::BitstreamCursor &Cursor,
                            uint64_t Offset, Decl *D, bool IsPartial);
@@ -1132,6 +1144,10 @@ class ASTReader
   /// Number of visible decl contexts read/total.
   unsigned NumVisibleDeclContextsRead = 0, TotalVisibleDeclContexts = 0;
 
+  /// Number of module local visible decl contexts read/total.
+  unsigned NumModuleLocalVisibleDeclContexts = 0,
+           TotalModuleLocalVisibleDeclContexts = 0;
+
   /// Total size of modules, in bits, currently loaded
   uint64_t TotalModulesSizeInBits = 0;
 
@@ -1444,6 +1460,9 @@ class ASTReader
   const serialization::reader::DeclContextLookupTable *
   getLoadedLookupTables(DeclContext *Primary) const;
 
+  const serialization::reader::ModuleLocalLookupTable *
+  getModuleLocalLookupTables(DeclContext *Primary) const;
+
   /// Get the loaded specializations lookup tables for \p D,
   /// if any.
   serialization::reader::LazySpecializationInfoLookupTable *
@@ -2119,7 +2138,8 @@ class ASTReader
   /// The current implementation of this method just loads the entire
   /// lookup table as unmaterialized references.
   bool FindExternalVisibleDeclsByName(const DeclContext *DC,
-                                      DeclarationName Name) override;
+                                      DeclarationName Name,
+                                      Module *NamedModule) override;
 
   /// Read all of the declarations lexically stored in a
   /// declaration context.
@@ -2607,6 +2627,10 @@ inline bool shouldSkipCheckingODR(const Decl *D) {
          (D->isFromGlobalModule() || D->isFromHeaderUnit());
 }
 
+/// Calculate a hash value for the primary module name of the given module.
+/// \returns std::nullopt if M is not a C++ standard module.
+std::optional<unsigned> getPrimaryModuleHash(const Module *M);
+
 } // namespace clang
 
 #endif // LLVM_CLANG_SERIALIZATION_ASTREADER_H
diff --git a/clang/include/clang/Serialization/ASTWriter.h b/clang/include/clang/Serialization/ASTWriter.h
index adb7cce522a803..53b09cc914392e 100644
--- a/clang/include/clang/Serialization/ASTWriter.h
+++ b/clang/include/clang/Serialization/ASTWriter.h
@@ -492,6 +492,10 @@ class ASTWriter : public ASTDeserializationListener,
   /// file.
   unsigned NumVisibleDeclContexts = 0;
 
+  /// The number of module local visible declcontexts written to the AST
+  /// file.
+  unsigned NumModuleLocalDeclContexts = 0;
+
   /// A mapping from each known submodule to its ID number, which will
   /// be a positive integer.
   llvm::DenseMap<const Module *, unsigned> SubmoduleIDs;
@@ -587,11 +591,15 @@ class ASTWriter : public ASTDeserializationListener,
   uint64_t WriteSpecializationInfoLookupTable(
       const NamedDecl *D, llvm::SmallVectorImpl<const Decl *> &Specializations,
       bool IsPartial);
-  void GenerateNameLookupTable(ASTContext &Context, const DeclContext *DC,
-                               llvm::SmallVectorImpl<char> &LookupTable);
+  void
+  GenerateNameLookupTable(ASTContext &Context, const DeclContext *DC,
+                          llvm::SmallVectorImpl<char> &LookupTable,
+                          llvm::SmallVectorImpl<char> &ModuleLocalLookupTable);
   uint64_t WriteDeclContextLexicalBlock(ASTContext &Context,
                                         const DeclContext *DC);
-  uint64_t WriteDeclContextVisibleBlock(ASTContext &Context, DeclContext *DC);
+  void WriteDeclContextVisibleBlock(ASTContext &Context, DeclContext *DC,
+                                    uint64_t &VisibleBlockOffset,
+                                    uint64_t &ModuleLocalBlockOffset);
   void WriteTypeDeclOffsets();
   void WriteFileDeclIDsMap();
   void WriteComments(ASTContext &Context);
@@ -624,7 +632,9 @@ class ASTWriter : public ASTDeserializationListener,
   unsigned DeclParmVarAbbrev = 0;
   unsigned DeclContextLexicalAbbrev = 0;
   unsigned DeclContextVisibleLookupAbbrev = 0;
+  unsigned DeclModuleLocalVisibleLookupAbbrev = 0;
   unsigned UpdateVisibleAbbrev = 0;
+  unsigned ModuleLocalUpdateVisibleAbbrev = 0;
   unsigned DeclRecordAbbrev = 0;
   unsigned DeclTypedefAbbrev = 0;
   unsigned DeclVarAbbrev = 0;
diff --git a/clang/lib/AST/DeclBase.cpp b/clang/lib/AST/DeclBase.cpp
index fb701f76231bcd..42daaa4f3dcc37 100644
--- a/clang/lib/AST/DeclBase.cpp
+++ b/clang/lib/AST/DeclBase.cpp
@@ -1850,15 +1850,28 @@ void DeclContext::buildLookupImpl(DeclContext *DCtx, bool Internal) {
   }
 }
 
+Module *Decl::getTopLevelOwningNamedModule() const {
+  if (getOwningModule() &&
+      getOwningModule()->getTopLevelModule()->isNamedModule())
+    return getOwningModule()->getTopLevelModule();
+
+  return nullptr;
+}
+
 DeclContext::lookup_result
 DeclContext::lookup(DeclarationName Name) const {
+  return lookupImpl(Name, cast<Decl>(this)->getTopLevelOwningNamedModule());
+}
+
+DeclContext::lookup_result DeclContext::lookupImpl(DeclarationName Name,
+                                                   Module *NamedModule) const {
   // For transparent DeclContext, we should lookup in their enclosing context.
   if (getDeclKind() == Decl::LinkageSpec || getDeclKind() == Decl::Export)
-    return getParent()->lookup(Name);
+    return getParent()->lookupImpl(Name, NamedModule);
 
   const DeclContext *PrimaryContext = getPrimaryContext();
   if (PrimaryContext != this)
-    return PrimaryContext->lookup(Name);
+    return PrimaryContext->lookupImpl(Name, NamedModule);
 
   // If we have an external source, ensure that any later redeclarations of this
   // context have been loaded, since they may add names to the result of this
@@ -1889,7 +1902,8 @@ DeclContext::lookup(DeclarationName Name) const {
     if (!R.second && !R.first->second.hasExternalDecls())
       return R.first->second.getLookupResult();
 
-    if (Source->FindExternalVisibleDeclsByName(this, Name) || !R.second) {
+    if (Source->FindExternalVisibleDeclsByName(this, Name, NamedModule) ||
+        !R.second) {
       if (StoredDeclsMap *Map = LookupPtr) {
         StoredDeclsMap::iterator I = Map->find(Name);
         if (I != Map->end())
@@ -2115,7 +2129,8 @@ void DeclContext::makeDeclVisibleInContextImpl(NamedDecl *D, bool Internal) {
     if (ExternalASTSource *Source = getParentASTContext().getExternalSource())
       if (hasExternalVisibleStorage() &&
           Map->find(D->getDeclName()) == Map->end())
-        Source->FindExternalVisibleDeclsByName(this, D->getDeclName());
+        Source->FindExternalVisibleDeclsByName(
+            this, D->getDeclName(), D->getTopLevelOwningNamedModule());
 
   // Insert this declaration into the map.
   StoredDeclsList &DeclNameEntries = (*Map)[D->getDeclName()];
diff --git a/clang/lib/AST/ExternalASTMerger.cpp b/clang/lib/AST/ExternalASTMerger.cpp
index 8bad3b36244e15..c01b2e33aec632 100644
--- a/clang/lib/AST/ExternalASTMerger.cpp
+++ b/clang/lib/AST/ExternalASTMerger.cpp
@@ -472,7 +472,8 @@ static bool importSpecializationsIfNeeded(Decl *D, ASTImporter *Importer) {
 }
 
 bool ExternalASTMerger::FindExternalVisibleDeclsByName(const DeclContext *DC,
-                                                       DeclarationName Name) {
+                                                       DeclarationName Name,
+                                                       Module *NamedModule) {
   llvm::SmallVector<NamedDecl *, 1> Decls;
   llvm::SmallVector<Candidate, 4> Candidates;
 
diff --git a/clang/lib/AST/ExternalASTSource.cpp b/clang/lib/AST/ExternalASTSource.cpp
index 543846c0093af8..4a29f4944f73c0 100644
--- a/clang/lib/AST/ExternalASTSource.cpp
+++ b/clang/lib/AST/ExternalASTSource.cpp
@@ -90,9 +90,9 @@ ExternalASTSource::GetExternalCXXBaseSpecifiers(uint64_t Offset) {
   return nullptr;
 }
 
-bool
-ExternalASTSource::FindExternalVisibleDeclsByName(const DeclContext *DC,
-                                                  DeclarationName Name) {
+bool ExternalASTSource::FindExternalVisibleDeclsByName(const DeclContext *DC,
+                                                       DeclarationName Name,
+                                                       Module *NamedModule) {
   return false;
 }
 
diff --git a/clang/lib/Interpreter/CodeCompletion.cpp b/clang/lib/Interpreter/CodeCompletion.cpp
index bbc8830d76bc00..9092d4705ca58a 100644
--- a/clang/lib/Interpreter/CodeCompletion.cpp
+++ b/clang/lib/Interpreter/CodeCompletion.cpp
@@ -228,7 +228,8 @@ class ExternalSource : public clang::ExternalASTSource {
   ExternalSource(ASTContext &ChildASTCtxt, FileManager &ChildFM,
                  ASTContext &ParentASTCtxt, FileManager &ParentFM);
   bool FindExternalVisibleDeclsByName(const DeclContext *DC,
-                                      DeclarationName Name) override;
+                                      DeclarationName Name,
+                                      Module *NamedModule) override;
   void
   completeVisibleDeclsMap(const clang::DeclContext *childDeclContext) override;
 };
@@ -271,7 +272,8 @@ ExternalSource::ExternalSource(ASTContext &ChildASTCtxt, FileManager &ChildFM,
 }
 
 bool ExternalSource::FindExternalVisibleDeclsByName(const DeclContext *DC,
-                                                    DeclarationName Name) {
+                                                    DeclarationName Name,
+                                                    Module *NamedModule) {
 
   IdentifierTable &ParentIdTable = ParentASTCtxt.Idents;
 
diff --git a/clang/lib/Sema/MultiplexExternalSemaSource.cpp b/clang/lib/Sema/MultiplexExternalSemaSource.cpp
index 54944267b4868a..c19a0f980c1e9a 100644
--- a/clang/lib/Sema/MultiplexExternalSemaSource.cpp
+++ b/clang/lib/Sema/MultiplexExternalSemaSource.cpp
@@ -107,11 +107,12 @@ MultiplexExternalSemaSource::hasExternalDefinitions(const Decl *D) {
   return EK_ReplyHazy;
 }
 
-bool MultiplexExternalSemaSource::
-FindExternalVisibleDeclsByName(const DeclContext *DC, DeclarationName Name) {
+bool MultiplexExternalSemaSource::FindExternalVisibleDeclsByName(
+    const DeclContext *DC, DeclarationName Name, Module *NamedModule) {
   bool AnyDeclsFound = false;
   for (size_t i = 0; i < Sources.size(); ++i)
-    AnyDeclsFound |= Sources[i]->FindExternalVisibleDeclsByName(DC, Name);
+    AnyDeclsFound |=
+        Sources[i]->FindExternalVisibleDeclsByName(DC, Name, NamedModule);
   return AnyDeclsFound;
 }
 
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 7361cace49dd7b..06853a227215e0 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -1235,7 +1235,7 @@ unsigned DeclarationNameKey::getHash() const {
 }
 
 ModuleFile *
-ASTDeclContextNameLookupTrait::ReadFileRef(const unsigned char *&d) {
+ASTDeclContextNameLookupTraitBase::ReadFileRef(const unsigned char *&d) {
   using namespace llvm::support;
 
   uint32_t ModuleFileID =
@@ -1244,12 +1244,12 @@ ASTDeclContextNameLookupTrait::ReadFileRef(const unsigned char *&d) {
 }
 
 std::pair<unsigned, unsigned>
-ASTDeclContextNameLookupTrait::ReadKeyDataLength(const unsigned char *&d) {
+ASTDeclContextNameLookupTraitBase::ReadKeyDataLength(const unsigned char *&d) {
   return readULEBKeyDataLength(d);
 }
 
-ASTDeclContextNameLookupTrait::internal_key_type
-ASTDeclContextNameLookupTrait::ReadKey(const unsigned char *d, unsigned) {
+DeclarationNameKey
+ASTDeclContextNameLookupTraitBase::ReadKeyBase(const unsigned char *&d) {
   using namespace llvm::support;
 
   auto Kind = (DeclarationName::NameKind)*d++;
@@ -1283,10 +1283,13 @@ ASTDeclContextNameLookupTrait::ReadKey(const unsigned char *d, unsigned) {
   return DeclarationNameKey(Kind, Data);
 }
 
-void ASTDeclContextNameLookupTrait::ReadDataInto(internal_key_type,
-                                                 const unsigned char *d,
-                                                 unsigned DataLen,
-                                                 data_type_builder &Val) {
+ASTDeclContextNameLookupTrait::internal_key_type
+ASTDeclContextNameLookupTrait::ReadKey(const unsigned char *d, unsigned) {
+  return ReadKeyBase(d);
+}
+
+void ASTDeclContextNameLookupTraitBase::ReadDataIntoImpl(
+    const unsigned char *d, unsigned DataLen, data_type_builder &Val) {
   using namespace llvm::support;
 
   for (unsigned NumDecls = DataLen / sizeof(DeclI...
[truncated]

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLDB API adjustments look fine. Just left some minor comments re. documentation

@@ -2711,6 +2715,12 @@ class DeclContext {
bool Deserialize = false) const;

private:
/// Lookup all external visible declarations and the external declarations
/// within the same module specified by \param NamedModule. We can't
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we replace \param here (and other places around the patch) with \c (or \ref)? IIUC \param is just used to start a parameter description.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link

github-actions bot commented Jan 14, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines 154 to 155
/// multiple modules. \return \c true if any declarations might have been
/// found, \c false if we definitely have no declarations with tbis name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// multiple modules. \return \c true if any declarations might have been
/// found, \c false if we definitely have no declarations with tbis name.
/// multiple modules.
///
/// \return \c true if any declarations might have been found, and \c false
/// if we definitely have no declarations with this name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.

@@ -145,12 +146,18 @@ class ExternalASTSource : public RefCountedBase<ExternalASTSource> {
/// Find all declarations with the given name in the given context,
/// and add them to the context by calling SetExternalVisibleDeclsForName
/// or SetNoExternalVisibleDeclsForName.
/// \return \c true if any declarations might have been found, \c false if
/// we definitely have no declarations with tbis name.
/// \param DC since the same namespace can appear in multiple module units.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this just the DeclContext that's used for lookup? Do we need to mention namespaces and multiple modules here? Maybe we can leave that for the NamedModule description

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done.

Close llvm#90154

This patch is also an optimization to the lookup process to utilize the
information provided by `export` keyword.

Previously, in the lookup process, the `export` keyword only takes part
in the check part, it doesn't get involved in the lookup process. That
said, previously, in a name lookup for 'name', we would load all of
declarations with the name 'name' and check if these declarations are
valid or not. It works well. But it is inefficient since it may load
declarations that may not be wanted.

Note that this patch actually did a trick in the lookup process instead
of bring module information to DeclarationName or considering module
information when deciding if two declarations are the same. So it may
not be a surprise to me if there are missing cases. But it is not a
regression. It should be already the case. Issue reports are welcomed.

In this patch, I tried to split the big lookup table into a lookup table
as before and a module local lookup table, which takes a combination of
the ID of the DeclContext and hash value of the primary module name as the key.
And refactored `DeclContext::lookup()` method to take the module
information. So that a lookup in a DeclContext won't load declarations
that are local to **other** modules.

And also I think it is already beneficial to split the big lookup table
since it may reduce the conflicts during lookups in the hash table.

BTW, this patch introduced a **regression** for a reachability rule in
C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.

This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup
table should be empty for them.
@ChuanqiXu9
Copy link
Member Author

Now CI passes. I want to land this in 20.x and give it some baking times so that we can find issues in it if any. Post commit review comments are welcomed as always.

@ChuanqiXu9 ChuanqiXu9 merged commit 7201cae into llvm:main Jan 15, 2025
9 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 15, 2025

LLVM Buildbot has detected a new failure on builder openmp-s390x-linux running on systemz-1 while building clang,lldb at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/6804

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: 1200 seconds without output running [b'ninja', b'-j 4', b'check-openmp'], attempting to kill

ChuanqiXu9 added a commit to ChuanqiXu9/llvm-project that referenced this pull request Jan 15, 2025
…ther TU

Close llvm#61427

And this is also helpful to implement
llvm#112294 partially.

The implementation strategy mimics
llvm#122887. This patch split the
internal declarations from the general lookup table so that other TU
can't find the internal declarations.
@nikic
Copy link
Contributor

nikic commented Jan 15, 2025

Looks like this change has some compile-time impact even if modules are not used: https://llvm-compile-time-tracker.com/compare.php?from=edc02351dd11cc4a39b7c541b26b71c6f36c8e55&to=7201cae106260aeb3e9bbbb7d5291ff30f05076a&stat=instructions:u It seems to add about 0.5% to C++ compilations.

cc @AaronBallman on the process here. I find it a bit concerning that big changes get landed without approval to make it into the LLVM 20 release. This is not how things are supposed to work.

hubert-reinterpretcast added a commit that referenced this pull request Jan 15, 2025
#122887 uses `Module` to refer
to `clang::Module` in a test that has `using namespace llvm;`.

This causes lookup ambiguity with `llvm::Module` if the headers involved
expose that name (e.g., for downstream codebases).
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 15, 2025
llvm/llvm-project#122887 uses `Module` to refer
to `clang::Module` in a test that has `using namespace llvm;`.

This causes lookup ambiguity with `llvm::Module` if the headers involved
expose that name (e.g., for downstream codebases).
@ChuanqiXu9
Copy link
Member Author

Looks like this change has some compile-time impact even if modules are not used: https://llvm-compile-time-tracker.com/compare.php?from=edc02351dd11cc4a39b7c541b26b71c6f36c8e55&to=7201cae106260aeb3e9bbbb7d5291ff30f05076a&stat=instructions:u It seems to add about 0.5% to C++ compilations.

Then it looks it relates to https://github.com/llvm/llvm-project/pull/122887/files#diff-e32652ec4ba35b724c68c1c00c7f522d33f0f9beab16ac437140adc0f2843c01R1861-R1874. It is not expected to introduce the regression if modules are not used. I'll revert this and try to send another PR.

cc @AaronBallman on the process here. I find it a bit concerning that big changes get landed without approval to make it into the LLVM 20 release. This is not how things are supposed to work.

I am the maintainer for modules&serialization. I think I can land PR directly. And also there are lacking reviewers on the topics of C++20 modules. There are multiple experiences that a patch waiting for approval for months without rare actual reviews. And after I am the maintainer, I'll try to land patches that only affecting C++20 modules if I am confident. And the outcoming looks good for these years.

ChuanqiXu9 added a commit that referenced this pull request Jan 16, 2025
ChuanqiXu9 added a commit to ChuanqiXu9/llvm-project that referenced this pull request Jan 16, 2025
ExternalASTSource::FindExternalVisibleDeclsByName

Part of llvm#122887.

I split this to test where the performance regession comes from if
modules are not used.
nikic pushed a commit to nikic/llvm-project that referenced this pull request Jan 16, 2025
…sByName

Part of llvm#122887.

I split this to test where the performance regession comes from if
modules are not used.
@AaronBallman
Copy link
Collaborator

Looks like this change has some compile-time impact even if modules are not used: https://llvm-compile-time-tracker.com/compare.php?from=edc02351dd11cc4a39b7c541b26b71c6f36c8e55&to=7201cae106260aeb3e9bbbb7d5291ff30f05076a&stat=instructions:u It seems to add about 0.5% to C++ compilations.

cc @AaronBallman on the process here. I find it a bit concerning that big changes get landed without approval to make it into the LLVM 20 release. This is not how things are supposed to work.

We generally trust the maintainer's discretion but prefer there to be review before committing (https://llvm.org/docs/CodeReview.html#must-code-be-reviewed-prior-to-being-committed). In this case, @ChuanqiXu9 is the expert in the area and is the one who primarily reviews all modules patches anyway, so I'm inclined to trust his judgement.

I want to land this in 20.x and give it some baking times so that we can find issues in it if any.

FWIW, if we feel the need for it to have baking time, we usually would do that after the branch. (We get significant testing from community stakeholders like Google, Intel, etc because they pull from top of tree and test a bunch of their massive internal code bases on it.) If there are problems with the patch, pulling it out of the branch is more work and impacts folks like the release managers. If it were me, I probably would have waited to land this one, but I also don't think we need a process-related revert either.

@ChuanqiXu9
Copy link
Member Author

Looks like this change has some compile-time impact even if modules are not used: https://llvm-compile-time-tracker.com/compare.php?from=edc02351dd11cc4a39b7c541b26b71c6f36c8e55&to=7201cae106260aeb3e9bbbb7d5291ff30f05076a&stat=instructions:u It seems to add about 0.5% to C++ compilations.
cc @AaronBallman on the process here. I find it a bit concerning that big changes get landed without approval to make it into the LLVM 20 release. This is not how things are supposed to work.

We generally trust the maintainer's discretion but prefer there to be review before committing (https://llvm.org/docs/CodeReview.html#must-code-be-reviewed-prior-to-being-committed). In this case, @ChuanqiXu9 is the expert in the area and is the one who primarily reviews all modules patches anyway, so I'm inclined to trust his judgement.

Thank you!

I want to land this in 20.x and give it some baking times so that we can find issues in it if any.

FWIW, if we feel the need for it to have baking time, we usually would do that after the branch. (We get significant testing from community stakeholders like Google, Intel, etc because they pull from top of tree and test a bunch of their massive internal code bases on it.) If there are problems with the patch, pulling it out of the branch is more work and impacts folks like the release managers. If it were me, I probably would have waited to land this one, but I also don't think we need a process-related revert either.

Yeah, understood. I think it is based on the risky level. For this patch, although it has some lines, it is relatively trivial to me. So in my mind, its risky is more or less controllable. For other patches which has higher risky level, I generally choose to land them in the early of the release circle and give its longer baking time. e.g, the Reduced BMI.

@nikic
Copy link
Contributor

nikic commented Jan 16, 2025

Thanks for the context! "I'm landing this without approval because I'm the maintainer for this component" is a lot less scary than "I'm landing this without approval because there's no time to wait on a review".

ChuanqiXu9 added a commit to ChuanqiXu9/llvm-project that referenced this pull request Jan 17, 2025
ExternalASTSource::FindExternalVisibleDeclsByName

Part of llvm#122887.

I split this to test where the performance regession comes from if
modules are not used.
ChuanqiXu9 added a commit that referenced this pull request Jan 17, 2025
…leDeclsByName (#123152)

Part for relanding #122887.

I split this to test where the performance regession comes from if
modules are not used.
ChuanqiXu9 added a commit to ChuanqiXu9/llvm-project that referenced this pull request Jan 17, 2025
Close llvm#90154

This patch is also an optimization to the lookup process to utilize the
information provided by `export` keyword.

Previously, in the lookup process, the `export` keyword only takes part
in the check part, it doesn't get involved in the lookup process. That
said, previously, in a name lookup for 'name', we would load all of
declarations with the name 'name' and check if these declarations are
valid or not. It works well. But it is inefficient since it may load
declarations that may not be wanted.

Note that this patch actually did a trick in the lookup process instead
of bring module information to DeclarationName or considering module
information when deciding if two declarations are the same. So it may
not be a surprise to me if there are missing cases. But it is not a
regression. It should be already the case. Issue reports are welcomed.

In this patch, I tried to split the big lookup table into a lookup table
as before and a module local lookup table, which takes a combination of
the ID of the DeclContext and hash value of the primary module name as
the key. And refactored `DeclContext::lookup()` method to take the
module information. So that a lookup in a DeclContext won't load
declarations that are local to **other** modules.

And also I think it is already beneficial to split the big lookup table
since it may reduce the conflicts during lookups in the hash table.

BTW, this patch introduced a **regression** for a reachability rule in
C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.

This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup
table should be empty for them.

---

On the API side, this patch unfortunately add a maybe-confusing argument
`Module *NamedModule` to
`ExternalASTSource::FindExternalVisibleDeclsByName()`. People may think
we can get the information from the first argument `const DeclContext
*DC`. But sadly there are declarations (e.g., namespace) can appear in
multiple different modules as a single declaration. So we have to add
additional information to indicate this.
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 17, 2025
…ternalVisibleDeclsByName (#123152)

Part for relanding llvm/llvm-project#122887.

I split this to test where the performance regession comes from if
modules are not used.
ChuanqiXu9 added a commit that referenced this pull request Jan 17, 2025
Close #90154

This patch is also an optimization to the lookup process to utilize the
information provided by `export` keyword.

Previously, in the lookup process, the `export` keyword only takes part
in the check part, it doesn't get involved in the lookup process. That
said, previously, in a name lookup for 'name', we would load all of
declarations with the name 'name' and check if these declarations are
valid or not. It works well. But it is inefficient since it may load
declarations that may not be wanted.

Note that this patch actually did a trick in the lookup process instead
of bring module information to DeclarationName or considering module
information when deciding if two declarations are the same. So it may
not be a surprise to me if there are missing cases. But it is not a
regression. It should be already the case. Issue reports are welcomed.

In this patch, I tried to split the big lookup table into a lookup table
as before and a module local lookup table, which takes a combination of
the ID of the DeclContext and hash value of the primary module name as
the key. And refactored `DeclContext::lookup()` method to take the
module information. So that a lookup in a DeclContext won't load
declarations that are local to **other** modules.

And also I think it is already beneficial to split the big lookup table
since it may reduce the conflicts during lookups in the hash table.

BTW, this patch introduced a **regression** for a reachability rule in
C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.

This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup
table should be empty for them.
ChuanqiXu9 added a commit to ChuanqiXu9/llvm-project that referenced this pull request Jan 17, 2025
…ther TU

Close llvm#61427

And this is also helpful to implement
llvm#112294 partially.

The implementation strategy mimics
llvm#122887. This patch split the
internal declarations from the general lookup table so that other TU
can't find the internal declarations.
ChuanqiXu9 added a commit that referenced this pull request Jan 17, 2025
…ther TU (#123059)

Close #61427

And this is also helpful to implement
#112294 partially.

The implementation strategy mimics
#122887. This patch split the
internal declarations from the general lookup table so that other TU
can't find the internal declarations.
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 17, 2025
… found by other TU (#123059)

Close llvm/llvm-project#61427

And this is also helpful to implement
llvm/llvm-project#112294 partially.

The implementation strategy mimics
llvm/llvm-project#122887. This patch split the
internal declarations from the general lookup table so that other TU
can't find the internal declarations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category lldb skip-precommit-approval PR for CI feedback, not intended for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++20] [Modules] We didn't implement lookup in module level actually
6 participants