Fix for auto-submit in Grok explained (solution included) #195

@gr4ssi

Hi, I pointed Claude Code at the repo and it fixed Auto Submit in Grok for me, which wasn't working before. Love it with working auto submit.
Full disclosure: I don't wanna be a jerk who submits shitty AI bug reports, but it actually works now in a fixed dev build on my side, so I asked it to generate a comprehensive bug report.

Here you go, hope this helps someone:

[Bug Fix] Auto Submit toggle broken on all supported platforms (ChatGPT, Gemini, Grok, DeepSeek, Mistral, Perplexity, and more)

Summary

The Auto Submit toggle in the MCP SuperAssistant sidebar had no effect on any supported chat platform. The extension correctly detected tool calls, pasted their results into the chat input field, but then silently stopped — never submitting the message. The user was required to press Enter manually every single time.

Investigation revealed three distinct root causes that compounded each other:

  1. ChatGPT: insertText() wrote text directly to the DOM via innerHTML, bypassing ProseMirror's internal state — so the editor considered itself empty and the submit button's click handler silently did nothing.
  2. All adapters: submitForm() called submitButton.click() and immediately returned true, without checking whether the click actually triggered submission. On the React-based UIs, a programmatic click (isTrusted: false) does not reliably result in a submission.
  3. Grok: The SUBMIT_BUTTON selectors no longer matched the live Grok UI, and the Enter-key fallback path only queried the first CHAT_INPUT selector (which also didn't match), making both submission paths fail silently.

A bonus issue was also found: chrome-extension/manifest.ts referenced icon-16.png in the icons map, a file that does not exist in public/, preventing Chrome from loading the extension at all in some configurations.

All four issues are fixed by the changes described below.


Affected Platforms

Every supported platform was broken by at least Bug 2. Bug 1 additionally broke ChatGPT specifically. Bug 3 additionally broke Grok specifically.

| Platform | Adapters affected |
|---|---|
| ChatGPT | chatgpt.adapter.ts (Bugs 1 + 2) |
| Gemini | gemini.adapter.ts (Bug 2) |
| DeepSeek | deepseek.adapter.ts (Bug 2) |
| GitHub Copilot | ghcopilot.adapter.ts (Bug 2) |
| Grok | grok.adapter.ts (Bugs 2 + 3) |
| Kimi | kimi.adapter.ts (Bug 2) |
| Mistral | mistral.adapter.ts (Bug 2) |
| Perplexity | perplexity.adapter.ts (Bug 2) |
| QwenChat | qwenchat.adapter.ts (Bug 2) |
| OpenRouter | openrouter.adapter.ts (Bug 2) |
| T3Chat | t3chat.adapter.ts (Bug 2) |
| Z AI | z.adapter.ts (Bug 2) |

How to Reproduce

Prerequisites

  • MCP SuperAssistant Chrome extension installed (v0.6.0 from store)
  • An MCP server connected and a tool configured to return results
  • Any supported chat platform open in a tab (e.g., chatgpt.com or grok.com; I tested with Grok)

Steps

  1. Open ChatGPT (or any supported platform).
  2. In the MCP SuperAssistant sidebar, enable Auto Insert and Auto Submit.
  3. Send a message that triggers an MCP tool call.
  4. Observe the tool call appear in the sidebar and its result get pasted into the chat input.

Expected: The result is automatically submitted and the conversation continues.
Actual: The text sits in the chat input box. Nothing is submitted. The user must press Enter manually.

Confirming the ChatGPT-specific root cause (Bug 1)

With the old code, after text insertion on ChatGPT, open DevTools and run:

// With the old innerHTML approach:
const editor = document.querySelector('#prompt-textarea');
console.log(editor.textContent); // "tool result text" — visually present
// But ProseMirror's internal state is empty — the submit button remains grey/disabled

Root Cause Analysis

Bug 1 — ChatGPT insertText(): Direct DOM manipulation bypasses ProseMirror's internal state

File: pages/content/src/plugins/adapters/chatgpt.adapter.ts

ChatGPT's input field is a ProseMirror editor rendered in a <div id="prompt-textarea" contenteditable="true">. ProseMirror keeps its own immutable document state, separate from the visible DOM, and changes that state only through transactions. User-like input, such as beforeinput/input events carrying the correct inputType or a call to document.execCommand('insertText'), is turned into a transaction; wholesale DOM replacement from outside the editor is not reliably picked up the same way.

The old insertText() implementation set targetElement.innerHTML directly and then dispatched a generic new Event('input', { bubbles: true }). This made text visible in the browser but left ProseMirror's internal state empty. React saw an empty editor and kept the submit button logically disabled. Both button.click() and Enter key simulation then failed silently — the text was visible but the underlying state said "nothing to submit".

Before (simplified — old logic):

// chatgpt.adapter.ts — old insertText()
targetElement.innerHTML = '';
const paragraph = document.createElement('p');
paragraph.textContent = newContent;
targetElement.appendChild(paragraph);
// Generic Event does NOT update ProseMirror's state tree:
targetElement.dispatchEvent(new Event('input', { bubbles: true }));

After (fixed):

// chatgpt.adapter.ts — new insertText()
// Select all existing content first
const selection = window.getSelection();
const range = document.createRange();
range.selectNodeContents(targetElement);
if (selection) {
  selection.removeAllRanges();
  selection.addRange(range);
}

// execCommand('insertText') routes through ProseMirror's beforeinput → state
// transaction → DOM update chain. Its internal state is correctly updated.
const inserted = document.execCommand('insertText', false, newContent);

if (!inserted) {
  // execCommand unavailable — fall back to direct DOM + proper InputEvent
  targetElement.innerHTML = '';
  const paragraph = document.createElement('p');
  paragraph.textContent = newContent;
  targetElement.appendChild(paragraph);
  targetElement.dispatchEvent(new InputEvent('input', {
    inputType: 'insertText',
    data: newContent,
    bubbles: true,
    cancelable: true,
    composed: true,
  }));
}

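The key distinction: execCommand('insertText') goes through the browser's normal editing pipeline, so ProseMirror sees it as user input and produces a proper state transaction. A plain new Event('input') without the correct inputType is ignored. Note that document.execCommand is deprecated, although current browsers still support it, which is why the fallback branch above is kept.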
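
To make the state-versus-DOM mismatch concrete, here is a deliberately tiny toy model. It is not ProseMirror's API; ToyEditor and its methods are made up for illustration, and the only assumption is that the submit logic consults internal state rather than the visible text.

```typescript
// Toy model only: not ProseMirror's real API. It mimics an editor whose
// submit logic reads an internal document rather than the visible DOM.
class ToyEditor {
  private doc = '';   // internal state the submit handler consults
  visibleText = '';   // stands in for the contenteditable's DOM text

  // Path taken by real typing or execCommand('insertText'):
  // the editor processes the input and updates state and view together.
  applyInput(text: string): void {
    this.doc = text;
    this.visibleText = text;
  }

  // Path taken by the old code: the view is overwritten behind the editor's back.
  overwriteView(text: string): void {
    this.visibleText = text;
  }

  canSubmit(): boolean {
    return this.doc.trim().length > 0;
  }
}

const viaDomWrite = new ToyEditor();
viaDomWrite.overwriteView('tool result text');

const viaInput = new ToyEditor();
viaInput.applyInput('tool result text');
```

Both editors show the same text, but only viaInput reports that it can submit, which mirrors the "text visible, button disabled" symptom described above.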


Bug 2 — All adapters: submitForm() returns true immediately after button.click() with no verification

Files: All 12 adapter files (see table above).

Every adapter's submitForm() previously followed this pattern:

// Old pattern — representative example, consistent across all adapters
async submitForm(): Promise<boolean> {
  const submitButton = document.querySelector(this.selectors.SUBMIT_BUTTON);
  if (!submitButton) return false;
  if (submitButton.disabled) return false;

  submitButton.click();  // <-- optimistically assumes this worked
  return true;           // <-- no verification whatsoever
}

For React-based chat UIs (ChatGPT, Gemini, Perplexity, Mistral, etc.), button.click() dispatches a click event with isTrusted: false. React's synthetic event system does not filter on isTrusted itself, but individual onClick handlers can check event.isTrusted and short-circuit when the event is not user-initiated. Even when the handler does fire, if the editor's internal state is stale (see Bug 1), it reads "empty" and suppresses the submission silently.

Some adapters (Perplexity, QwenChat, DeepSeek, Kimi, T3Chat, Z) had submitWithEnterKey() methods, but these were only invoked when the submit button was not found or was disabled — never when .click() fired but submission failed to happen.

Fix: A new protected method performSubmitWithFallback() was added to BaseAdapterPlugin, and all 12 adapters were updated to use it:

// base.adapter.ts — new shared method
protected async performSubmitWithFallback(
  submitButton: HTMLButtonElement,
  chatInputSelector: string
): Promise<void> {
  submitButton.click();

  // Wait for the UI to process the click
  await new Promise<void>(resolve => setTimeout(resolve, 300));

  // Walk selector list to find the actual editable root
  let inputEl: Element | null = null;
  for (const sel of chatInputSelector.split(',')) {
    const el = document.querySelector(sel.trim());
    if (el) {
      if (el.tagName === 'TEXTAREA' || el.tagName === 'INPUT' || (el as HTMLElement).isContentEditable) {
        inputEl = el;
      } else {
        inputEl = el.closest('[contenteditable="true"]') || el.closest('textarea') || el;
      }
      break;
    }
  }

  if (!inputEl) return;

  // If the input is now empty, the button click worked — nothing more to do
  const content =
    (inputEl as HTMLTextAreaElement).value?.trim() ||
    inputEl.textContent?.trim() ||
    '';
  if (content === '') return;

  // Fallback A: keyboard Enter simulation with composed:true (crosses shadow DOM)
  (inputEl as HTMLElement).focus();
  for (const eventType of ['keydown', 'keypress', 'keyup']) {
    inputEl.dispatchEvent(
      new KeyboardEvent(eventType, {
        key: 'Enter',
        code: 'Enter',
        keyCode: 13,
        which: 13,
        bubbles: true,
        cancelable: true,
        composed: true,   // Required to cross shadow DOM boundaries
      })
    );
  }

  // Fallback B: 'insertParagraph' beforeinput event, the signal browsers emit
  // when the user presses Enter in a contenteditable. ProseMirror-based editors
  // (ChatGPT, Mistral) are the intended targets; they may treat it as submit.
  await new Promise<void>(resolve => setTimeout(resolve, 100));
  const stillHasContent =
    (inputEl as HTMLTextAreaElement).value?.trim() ||
    inputEl.textContent?.trim();
  if (stillHasContent) {
    inputEl.dispatchEvent(
      new InputEvent('beforeinput', {
        inputType: 'insertParagraph',
        bubbles: true,
        cancelable: true,
        composed: true,
      })
    );
  }
}

All 12 adapters updated from:

submitButton.click();
return true;

To:

await this.performSubmitWithFallback(submitButton, this.selectors.CHAT_INPUT);
return true;

The method's logic is: try the click, wait 300 ms, then check whether the input field is empty. If it is, the click worked. If content remains, escalate to keyboard Enter simulation, then to beforeinput insertParagraph. The three stages are meant to cover both React click handlers and editor-level input handling. Note that the method returns void, so submitForm() still returns true even if all three stages fail.
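
One possible follow-up, sketched here under assumptions, is to have the escalation report whether any stage actually cleared the input, so submitForm() could return a meaningful boolean. The names submitWithVerification, SubmitStage and VerifiedSubmitOptions are hypothetical and do not exist in the extension; the stages and the emptiness check are injected so the sketch runs without a DOM.

```typescript
// Hypothetical names throughout; none of these exist in the extension.
type SubmitStage = { name: string; run: () => void | Promise<void> };

interface VerifiedSubmitOptions {
  stages: SubmitStage[];                 // e.g. click, Enter keys, insertParagraph
  isInputEmpty: () => boolean;           // true once the chat input has been cleared
  settleMs?: number;                     // how long to wait after each stage
  sleep?: (ms: number) => Promise<void>; // injectable so tests need no real timers
}

// Runs each stage in order and returns the name of the first stage after
// which the input was observed empty, or null if none of them submitted.
async function submitWithVerification(
  opts: VerifiedSubmitOptions,
): Promise<string | null> {
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms)));
  const settleMs = opts.settleMs ?? 300;

  for (const stage of opts.stages) {
    await stage.run();
    await sleep(settleMs);
    if (opts.isInputEmpty()) return stage.name;
  }
  return null;
}
```

An adapter could then end submitForm() with something like `return (await submitWithVerification({ ... })) !== null;`. An empty input is only a heuristic for "submitted": a UI that clears the field late, or not at all, would need a different signal.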


Bug 3 — Grok adapter: stale UI selectors + broken fallback path

File: pages/content/src/plugins/adapters/grok.adapter.ts

Problem A — Stale SUBMIT_BUTTON selectors

The old selector list for Grok's submit button was:

// Old — grok.adapter.ts
SUBMIT_BUTTON: 'button[aria-label="Submit"], button.send-button, button.chat-submit, button.submit-button',

None of these selectors match Grok's current UI on grok.com. As a result, submitButton was always null. The code fell through to the Enter-key fallback, which also failed (see Problem B).

Problem B — Broken Enter-key fallback

When no submit button was found, the original code used:

// Old — grok.adapter.ts (simplified)
const chatInput = document.querySelector(this.selectors.CHAT_INPUT.split(', ')[0]);
// ^ Only queries the FIRST selector: 'textarea[aria-label="Ask Grok anything"]'
// This selector didn't match the current Grok UI either.
// But insertText() iterates ALL selectors and DID find a matching element further down.

if (chatInput) {
  chatInput.dispatchEvent(new KeyboardEvent('keydown', { key: 'Enter', bubbles: true }));
  // ^ Only keydown, no keypress/keyup. No composed:true.
} else {
  logger.error('Could not find chat input for Enter key fallback');
  // ^ This branch was always hit, so nothing was ever submitted.
}

The mismatch: insertText() iterates all selectors (splitting on ', '), finds a match further down the list, and successfully pastes text. The old fallback path checked only split(', ')[0], the first selector, which didn't match, so it logged an error and returned false. The text was visible in the UI but submission always failed.

Fix A — Updated selectors

// New — grok.adapter.ts
CHAT_INPUT: [
  'textarea[aria-label="Ask Grok anything"]',
  'textarea[placeholder*="Grok"]',
  'textarea[placeholder*="Ask"]',
  'textarea[data-testid="grok-compose-input"]',
  'textarea[data-testid*="input"]',
  'textarea[class*="compose"]',
  'textarea[spellcheck="false"]',
  'textarea[data-gramm="false"]',
  'div[contenteditable="true"][data-lexical-editor="true"]',
  'div[contenteditable="true"]',
  'textarea',                           // generic last-resort fallback
].join(', '),

SUBMIT_BUTTON: [
  'button[aria-label="Submit"]',
  'button[aria-label="Send message"]',
  'button[aria-label*="Send"]',
  'button[aria-label*="send"]',
  'button[data-testid="send-button"]',
  'button[data-testid*="submit"]',
  'button[data-testid*="send"]',
  'button[type="submit"]',
  'button.send-button',
  'button.chat-submit',
  'button.submit-button',
].join(', '),

Fix B — Rewritten Enter-key fallback path

The no-button fallback now iterates all CHAT_INPUT selectors (matching the approach used by insertText()) and dispatches the full set of keyboard events with composed: true:

// New — grok.adapter.ts: Enter-key fallback when no submit button found
let chatInput: HTMLElement | null = null;
for (const sel of this.selectors.CHAT_INPUT.split(',')) {
  chatInput = document.querySelector<HTMLElement>(sel.trim());
  if (chatInput) break;
}

if (chatInput) {
  chatInput.focus();
  for (const eventType of ['keydown', 'keypress', 'keyup']) {
    chatInput.dispatchEvent(new KeyboardEvent(eventType, {
      key: 'Enter',
      code: 'Enter',
      keyCode: 13,
      which: 13,
      bubbles: true,
      cancelable: true,
      composed: true,  // crosses shadow DOM boundaries
    }));
  }
  return true;
}
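
Since the original bug came from insertText() and the fallback resolving the input element in two different ways, one option is a single shared lookup helper. The sketch below is an assumption, not code from the repository: findFirstMatch is a hypothetical name, and the query function is injected so the example runs without a browser.

```typescript
// Hypothetical helper; `query` is injected so the sketch runs without a DOM.
// In a content script it would be: sel => document.querySelector<HTMLElement>(sel)
function findFirstMatch<T>(
  selectorList: string,
  query: (selector: string) => T | null,
): { selector: string; element: T } | null {
  for (const raw of selectorList.split(',')) {
    const selector = raw.trim();
    if (selector === '') continue;
    const element = query(selector);
    if (element !== null) return { selector, element };
  }
  return null;
}

// Fake "DOM" in which only the contenteditable selector matches.
const fakeDom = new Map<string, string>([
  ['div[contenteditable="true"]', '<editor>'],
]);
const inputSelectors =
  'textarea[aria-label="Ask Grok anything"], div[contenteditable="true"], textarea';

const hit = findFirstMatch(inputSelectors, sel => fakeDom.get(sel) ?? null);
// The old approach looked only at the first selector and found nothing.
const oldStyleHit = fakeDom.get(inputSelectors.split(', ')[0]) ?? null;
```

Splitting on ',' is safe for the selector lists shown here, but it would break a selector that contains a comma itself, for example inside :is(...) or an attribute value.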

Bonus Fix — Missing icon-16.png in manifest

File: chrome-extension/manifest.ts

The icons map referenced icon-16.png:

// Old — manifest.ts
icons: {
  128: 'icon-128.png',
  34:  'icon-34.png',
  16:  'icon-16.png',   // <-- does not exist in public/
},

Only icon-34.png and icon-128.png exist in chrome-extension/public/. Chrome logs:

Could not load icon 'icon-16.png' specified in 'icons'.

In strict environments this prevents the extension from loading at all.

Fix: Point the 16 entry to icon-34.png (Chrome scales it down automatically) and ensure icon-16.png is not listed in web_accessible_resources:

// New — manifest.ts
icons: {
  128: 'icon-128.png',
  34:  'icon-34.png',
  16:  'icon-34.png',   // icon-16.png doesn't exist; Chrome scales 34px fine
},
// web_accessible_resources — icon-16.png removed
web_accessible_resources: [
  {
    resources: ['*.js', '*.css', 'content/*.css', '*.svg', 'icon-128.png', 'icon-34.png'],
    matches: ['*://*/*'],
  },
],
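
A small build-time check could catch this class of error before the extension is loaded. This is a sketch under assumptions: findMissingIcons is a hypothetical helper, and in a real script the set of existing files would come from listing chrome-extension/public/.

```typescript
// Hypothetical build-time check; `existingFiles` would come from listing
// chrome-extension/public/ (for example with fs.readdirSync) in a real script.
function findMissingIcons(
  icons: Record<number, string>,
  existingFiles: ReadonlySet<string>,
): string[] {
  const missing = new Set<string>();
  for (const size of Object.keys(icons)) {
    const file = icons[Number(size)];
    if (!existingFiles.has(file)) missing.add(file);
  }
  return Array.from(missing);
}

const publicDir = new Set<string>(['icon-34.png', 'icon-128.png']);
const oldIcons = { 128: 'icon-128.png', 34: 'icon-34.png', 16: 'icon-16.png' };
const newIcons = { 128: 'icon-128.png', 34: 'icon-34.png', 16: 'icon-34.png' };

const missingBefore = findMissingIcons(oldIcons, publicDir); // ['icon-16.png']
const missingAfter = findMissingIcons(newIcons, publicDir);  // []
```

Failing the build when the returned list is non-empty would surface the problem earlier than Chrome's load-time error.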

Files Changed

| File | Change |
|---|---|
| pages/content/src/plugins/adapters/base.adapter.ts | Added protected performSubmitWithFallback() method |
| pages/content/src/plugins/adapters/chatgpt.adapter.ts | Fixed insertText() to use execCommand; updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/gemini.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/deepseek.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/ghcopilot.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/grok.adapter.ts | Updated stale selectors; rewrote Enter-key fallback; updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/kimi.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/mistral.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/perplexity.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/qwenchat.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/openrouter.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/t3chat.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| pages/content/src/plugins/adapters/z.adapter.ts | Updated submitForm() to use performSubmitWithFallback() |
| chrome-extension/manifest.ts | Fixed icon-16.png reference → icon-34.png; removed from web_accessible_resources |

Testing

Confirmed working after fix

  • Grok (grok.com) — Auto Submit now submits the tool result without manual Enter.
  • Chrome (Manifest V3, unpacked extension) — Extension loads without icon errors.

Additional platforms to verify

The following platforms share the same code path through performSubmitWithFallback() and the logic is sound, but live testing on each is recommended before merging:

  • ChatGPT (chatgpt.com)
  • Gemini (gemini.google.com)
  • Perplexity (perplexity.ai)
  • Mistral (chat.mistral.ai)
  • DeepSeek (chat.deepseek.com)
  • GitHub Copilot (github.com)
  • Kimi (kimi.com)
  • QwenChat (chat.qwen.ai)
  • OpenRouter (openrouter.ai)
  • T3Chat (t3.chat)
  • Z AI (chat.z.ai)

Environment

| Item | Value |
|---|---|
| Browser | Chrome (Manifest V3) |
| Extension version (broken) | v0.6.0 |
| Extension version (fixed) | v0.6.2 - my own version bump ;) |
| OS | Windows (WSL2 for build), Chrome on Windows |
| Node / pnpm | As specified in package.json / pnpm-workspace.yaml |
| Manifest version | 3 |

Technical Background: Why execCommand and composed: true matter

document.execCommand('insertText') vs. innerHTML

ProseMirror treats the DOM as a projection of an internal state object, and every change to that state goes through a transaction. From an extension's content script, the most dependable way to trigger such a transaction is to produce input the editor treats as typing, which execCommand('insertText') does by firing beforeinput/input events with inputType: 'insertText'. Setting innerHTML directly modifies the DOM projection without going through that path, so the state tree can stay out of sync, and ProseMirror's next render cycle may even overwrite the injected content.

composed: true on keyboard events

Modern browser UIs increasingly use Shadow DOM to encapsulate components. An event dispatched from inside a shadow tree without composed: true stops at the shadow root and never reaches listeners on the host document. Setting composed: true allows the synthetic keyboard event to bubble across shadow DOM boundaries and reach the framework-level handlers that process the Enter key for submission. For an input in the regular (light) DOM, the flag makes no difference.
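
The effect of the flag can be pictured with a toy propagation model. This is a simplification for illustration only, not the browser's event-path algorithm: ToyNode and reachedNodes are made-up names, and the model assumes a bubbling event whose target sits inside a single shadow tree.

```typescript
// Toy model of bubbling through one shadow boundary; not the real DOM algorithm.
type ToyNode = { name: string; isShadowRoot?: boolean };

// `ancestors` is ordered from the event target outward to the document.
function reachedNodes(ancestors: ToyNode[], composed: boolean): string[] {
  const reached: string[] = [];
  for (const node of ancestors) {
    reached.push(node.name);
    // A non-composed event does not propagate past the shadow root.
    if (node.isShadowRoot && !composed) break;
  }
  return reached;
}

const ancestorChain: ToyNode[] = [
  { name: 'textarea' },
  { name: 'shadow-root', isShadowRoot: true },
  { name: 'chat-widget (host)' },
  { name: 'document' },
];

const withoutComposed = reachedNodes(ancestorChain, false);
const withComposed = reachedNodes(ancestorChain, true);
```

In this model the non-composed event is seen only by the textarea and the shadow root, while the composed one also reaches the host element and the document, where framework-level listeners typically live.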

beforeinput with inputType: 'insertParagraph'

The Input Events Level 2 spec defines insertParagraph as the inputType fired when a user presses Enter in a contenteditable. Editors built on ProseMirror (ChatGPT) and Lexical (Perplexity) listen for beforeinput on their root element and can treat it as a "submit" or "new paragraph" command. Dispatching it is closer to what the browser itself emits than a raw keydown with key: 'Enter', though a synthetic event is still untrusted and an editor is free to ignore it.
