
Context-Aware Sanitization #550

Open
lukehinds opened this issue Jan 11, 2025 · 3 comments

Comments

@lukehinds
Contributor

lukehinds commented Jan 11, 2025

Overview

Context-Aware Sanitization would give CodeGate a way to selectively filter, transform, or block AI-generated code suggestions that would overwrite declared code snippets, based on contextual rules. Our existing “secret scanning on the fly” feature already redacts and encrypts tokens (such as passwords, API keys, or other sensitive data) before they are passed to the LLM and restores the original values on the return path; Context-Aware Sanitization would extend this a step further by letting users define custom constraints and exceptions for certain code snippets.

For example, a user might declare that:

  • Environment variables (e.g., DB_HOST) must never be overwritten or read by the LLM.
  • Specific database tables (e.g., Users, Payments) must never be dropped, altered, etc.
  • Regulated data fields (e.g., PII or compliance-related data) must never be exposed or modified.
  • Path names (e.g., /User/LukeHinds/folder) must never be exposed or modified.

This ensures that AI recommendations delivered via an assistant, agent, or MCP do not violate the user's rules by changing the content of the declared code snippets (e.g., the examples above).

NOTE: As always, start small and simple, and validate; the following acts as a guideline for where this could lead.

Key Objectives

  1. Granular Control
    Users can define item-specific or file-specific rules that precisely dictate what the AI assistant can modify.
  2. Pipeline Integration
    Leverages the same intercept-and-filter pipeline approach used for “secret scanning on the fly”.
  3. User-Friendly Rule Management
    A simple UI for creating, editing, and deleting these sanitization rules.
  4. Reduced Risk Exposure
    Protects not just secrets but also critical code sections, databases, or environment configurations from unintentional or even malicious changes recommended by the LLM via an agent, assistant, or MCP instance.

Possible Functionality

  1. Rule Definition & Storage
    • Extend the database schema introduced by CodeGate Projects to store “sanitization rules.”
    • Each rule may include (a rough sketch follows this list):
      • Target: Regex patterns, file paths, environment variable names, DB table references, etc.
      • Action: block, sanitize, or warn (e.g., do not allow changes, automatically mask references, or show a warning to the user).
      • Scope: Could apply globally or to a specific project/folder path.
  2. Intercept & Scan Process
    • Inbound (Prompt to LLM): When an assistant/agent triggers an AI recommendation, CodeGate first checks the prompt or code snippet context against the stored rules. If any “do-not-touch” references appear, they are either masked, removed, or flagged depending on the rule.
    • Outbound (LLM response to Developer): After receiving the LLM’s suggestion, CodeGate restores any disallowed references (similar to how secret scanning and redaction currently work). If a rule is violated, CodeGate can block or modify the suggestion before it’s rendered in the IDE.
  3. UI & Configuration
    • Integrates into the upcoming CodeGate Projects dashboard, where users can:
      • Create rules with name, target pattern, and action.
      • Edit existing rules to refine or broaden their scope.
      • Delete rules that are no longer relevant.
    • Provide an optional “rules wizard” to simplify the creation of patterns for common tasks (e.g., environment variables, DB schema references).
  4. Alerts & Logging
    • Whenever the AI suggestion triggers a sanitization event (e.g., an attempt to rename a restricted table), log the event locally.
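
A rough sketch of what a rule record and the intercept check from items 1 and 2 above might look like. The field names, action values, and the `apply_rules` helper are assumptions for illustration only, not a settled design:

```python
import re
from dataclasses import dataclass

# Hypothetical rule record; fields mirror the Target/Action/Scope idea above.
@dataclass
class SanitizationRule:
    name: str
    target: str        # regex: file path, env var name, DB table reference, ...
    action: str        # "block" | "sanitize" | "warn"
    scope: str = "*"   # "*" = global, otherwise a project/folder path

def apply_rules(text: str, rules: list[SanitizationRule]) -> tuple[str, list[str]]:
    """Check a prompt (inbound) or an LLM suggestion (outbound) against the rules.

    Returns the possibly-masked text plus the names of any triggered rules.
    """
    triggered = []
    for rule in rules:
        if re.search(rule.target, text):
            triggered.append(rule.name)
            if rule.action == "sanitize":
                # Mask the reference so it never reaches the LLM (or the IDE).
                text = re.sub(rule.target, "<redacted>", text)
            elif rule.action == "block":
                raise ValueError(f"blocked by rule {rule.name!r}")
            # "warn" leaves the text untouched; the event is only logged/alerted.
    return text, triggered
```

The same check could run in both directions of the pipeline: inbound to mask or block before the prompt reaches the LLM, and outbound to restore or block before the suggestion is rendered in the IDE.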

Relationship to Other Work

  • Dependencies:
    This feature depends on Codegate Workspaces (repos) #454 (the forthcoming “CodeGate Projects” feature), which will provide the underlying data model and UI scaffolding for storing and managing sanitization rules for specific code bases (i.e., repos).
  • Extension of “secret scanning on the fly”:
    We may be able to reuse the existing pipeline where tokens are redacted and encrypted prior to LLM submission, ensuring a uniform approach. Context-Aware Sanitization will simply plug additional scanning rules into that pipeline.

Technical Considerations

  • Performance:
    • The scanning must be efficient enough to avoid significant latency when interacting with the LLM.
    • Caching or indexing rules may be required to handle complex patterns at scale.
  • Rule Definition Language:
    • Could be JSON or YAML-based, with flexible placeholders for environment variable references, database schemas, etc.
    • Provide default or “starter” rules for common use cases (a sketch of one possible format follows this list).
  • Edge Cases:
    • Overly broad patterns might inadvertently block or sanitize large portions of code suggestions. We need fallback warnings or safe checks so the user isn’t blindsided by excessive sanitization.
    • Collaboration across teams and projects: rules could be project-specific or globally enforced.
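
As a strawman for the rule definition language, a “starter” rule could be a small YAML document parsed with PyYAML before being stored. The keys below are assumptions rather than a final schema:

```python
import yaml  # PyYAML

# Hypothetical "starter" rule protecting the Users and Payments tables.
STARTER_RULE = r"""
name: protect-critical-tables
target: '\b(DROP|ALTER)\s+TABLE\s+(Users|Payments)\b'
action: block
scope: '*'
"""

rule = yaml.safe_load(STARTER_RULE)
print(rule["name"], rule["action"])  # -> protect-critical-tables block
```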

Example Use Cases

  1. Prevent DB Drops
    • Rule: block modifications to Users or Payments table.
    • Outcome: Any suggestion that attempts a DROP TABLE Users; or modifies a “Payments” schema is automatically replaced with a sanitized placeholder (or flagged to the developer).
  2. Lock Down Environment Variables
    • Rule: warn for any attempt to reassign environment variables in .env files.
    • Outcome: Developer sees a warning if the AI tries to inject code that modifies these variables.
  3. Mask Sensitive Info
    • Rule: sanitize any /User/lukehinds/mycode references in code suggestions or prompts, to avoid leaking unwanted information.
    • Outcome: Replace or redact the reference so that the LLM never sees the actual sensitive string, nor attempts to change it (a rough illustration of these checks follows this list).
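
A rough illustration of how the first and third use cases above might fire, reusing the hypothetical `SanitizationRule`/`apply_rules` sketch from earlier (all names and values are illustrative):

```python
rules = [
    SanitizationRule(name="prevent-db-drops",
                     target=r"\bDROP\s+TABLE\s+(Users|Payments)\b",
                     action="block"),
    SanitizationRule(name="mask-home-path",
                     target=r"/User/lukehinds/mycode",
                     action="sanitize"),
]

suggestion = "run('rm -rf /User/lukehinds/mycode')  # and then: DROP TABLE Users;"
try:
    cleaned, hits = apply_rules(suggestion, rules)
except ValueError as err:
    # The "block" rule stops the suggestion before it reaches the IDE.
    print(err)  # blocked by rule 'prevent-db-drops'
```

If only the "sanitize" rule had matched, `cleaned` would contain the suggestion with the path replaced by a placeholder and `hits` would list the triggered rule name.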

Feasibility

  • Overall Feasibility: Medium to High
    • We already have a foundation with “secret scanning on the fly”; adding user-defined patterns is a logical next step.
    • Main complexity lies in how granular and flexible the rule definitions become, and ensuring the solution remains performant and user-friendly. We should start with something very simple and validate with users ASAP.

User request reference from @aj47 that was the inspiration for this idea.

@lukehinds lukehinds changed the title Context-Aware Sanitization User driven Context-Aware Sanitization Jan 11, 2025
@lukehinds lukehinds changed the title User driven Context-Aware Sanitization Context-Aware Sanitization Jan 11, 2025
@jhrozek
Contributor

jhrozek commented Jan 11, 2025

I think this is a pretty powerful request and we might want to take it step by step (i.e. split it into multiple issues).

Without thinking too much the first two things that came to my mind are:

  • a soft version of many of the rules could be done by tuning the system prompts, something we already have on the roadmap for when we have a way to set repo-specific (workspace-specific) system prompts. This doesn't prevent a malfunctioning LLM (or even worse, an evil LLM) from doing that, but it could go a long way
  • it appears that we'll want to make our pipelines configurable. I would prefer not to include a full Minder-sized policy engine, but we should think about a way to extensibly build policies. Maybe we could start by a) exposing which pipeline steps run, and in which order, and b) giving the steps a way to read their config, which might just be a generic YAML/JSON file read (and interpreted) by the pipeline step (a rough sketch below).
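
A very rough sketch of what (b) could look like, with the pipeline order and each step's opaque config kept in a single YAML document; the step names and layout are invented purely for illustration:

```python
import yaml

# Which steps run, in which order, plus a per-step config blob that only the
# step itself interprets (names, keys, and file names are hypothetical).
PIPELINE_CONFIG = """
steps:
  - name: secret-scanning
    config: {}
  - name: context-aware-sanitization
    config:
      rules_file: sanitization-rules.yaml
"""

# Dummy stand-ins for the real pipeline step implementations.
STEP_REGISTRY = {
    "secret-scanning": lambda text, cfg: text,
    "context-aware-sanitization": lambda text, cfg: text,
}

def run_pipeline(prompt: str, config_text: str) -> str:
    config = yaml.safe_load(config_text)
    for step in config["steps"]:
        handler = STEP_REGISTRY[step["name"]]
        # Each step receives its own config dict and returns the (possibly
        # rewritten) prompt.
        prompt = handler(prompt, step.get("config") or {})
    return prompt
```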

More thinking tbd :-)

@lukehinds
Contributor Author

@jhrozek I agree, and I was thinking the prompt route would be the first to investigate, but as you say, sometimes these things can be brittle and vary from model to model. I don't think we need a policy engine per se; starting with regexes that can be set by the user should be sufficient. Are you thinking of being able to plug in evaluation engines? I agree that is a lot more extensive (but very interesting too, and I had not even thought of that).

To reiterate though, yes, simple and light should be the first approach, but not in a way that would make it difficult to extend and refine over time.

@ptelang
Contributor

ptelang commented Jan 12, 2025

To address these use cases, CodeGate will need to use a combination of (a) contextual, rules-based inspection/filtering/sanitization of the request (user->llm) and response (llm->user), and (b) system prompts attached to the request. Another idea is to integrate CodeGate with bandit to identify and report any CWEs in the LLM-generated code.
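
On the bandit idea, one low-effort shape for that integration could be running the bandit CLI over the generated snippet and surfacing any findings; a rough sketch (the temp-file approach and helper name are just assumptions):

```python
import json
import subprocess
import tempfile

def scan_generated_code(code: str) -> list[dict]:
    """Run bandit over an LLM-generated snippet and return its findings."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
        tmp.write(code)
        path = tmp.name
    # bandit's JSON formatter writes the report to stdout, which makes the
    # results easy to post-process and attach to an alert.
    result = subprocess.run(
        ["bandit", "-f", "json", path],
        capture_output=True,
        text=True,
    )
    report = json.loads(result.stdout)
    return report.get("results", [])
```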
