@michaelraskansky
Contributor

Description

This PR introduces the directSend feature, which lets Lambda handlers send LLM tokens directly to clients via AppSync GraphQL mutations. This bypasses the SNS/SQS messaging path to reduce latency during streaming responses.

Motivation

The default token delivery path (Lambda → SNS → SQS → WebSocket) introduces unnecessary latency for streaming LLM responses. By sending tokens directly via AppSync, we achieve:

  • Lower latency for token-by-token streaming
  • Reduced infrastructure overhead
  • Better user experience for real-time conversations

Changes

Core Implementation

  • Configuration: Added directSend?: boolean to SystemConfig (default: false)
  • AppSync Utility: New appsync.py module with SigV4-authenticated GraphQL mutations
  • Routing Logic: Updated websocket.py to route LLM_NEW_TOKEN actions based on DIRECT_SEND env var
  • Model Interfaces: Applied to all interfaces (langchain, idefics, bedrock-agents)
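The routing decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in websocket.py: `choose_route`, `direct_send_enabled`, the route labels, and the `FINAL_RESPONSE` member are hypothetical names, and only `LLM_NEW_TOKEN` and `DIRECT_SEND` come from this PR. Environment variables are always strings, so the flag is compared explicitly; `bool("false")` is `True` in Python.

```python
import os
from enum import Enum
from typing import Mapping, Optional


class ChatbotAction(Enum):
    # LLM_NEW_TOKEN is named in this PR. FINAL_RESPONSE is a hypothetical
    # stand-in for any non-token action.
    LLM_NEW_TOKEN = "llm_new_token"
    FINAL_RESPONSE = "final_response"


def direct_send_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when DIRECT_SEND is the string "true" (case-insensitive)."""
    env = os.environ if env is None else env
    return env.get("DIRECT_SEND", "false").strip().lower() == "true"


def choose_route(action: ChatbotAction,
                 env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a delivery path: "appsync" for tokens when enabled, else "sns"."""
    if action is ChatbotAction.LLM_NEW_TOKEN and direct_send_enabled(env):
        return "appsync"
    # Non-token actions, and all actions when the flag is off, keep the
    # default SNS/SQS path.
    return "sns"
```

With this shape, an unset or `false` flag leaves every action on the existing path, which matches the opt-in behavior of the feature.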

Files Changed (11)

  • cli/magic-config.ts - Configuration wizard support
  • lib/shared/types.ts - Type definition
  • lib/shared/layers/python-sdk/python/genai_core/utils/appsync.py - NEW: Direct send implementation
  • lib/shared/layers/python-sdk/python/genai_core/utils/websocket.py - Routing logic
  • lib/model-interfaces/{langchain,idefics,bedrock-agents}/index.ts - Interface updates
  • lib/aws-genai-llm-chatbot-stack.ts - Pass GraphQL API to interfaces
  • tests/shared/test_*.py - Unit tests

Testing

  • Unit tests for routing logic and AppSync integration
  • Deployed and tested in dev environment
  • Verified token streaming with Claude 3 Haiku
  • Confirmed backward compatibility (feature is opt-in)
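As an illustration of what a mutation-formatting test could look like, here is a small self-contained sketch. The mutation text, the `publishResponse` field and its arguments, and `build_mutation_payload` are assumptions made for the example. They are not taken from appsync.py or the project's GraphQL schema.

```python
import json
import unittest

# Hypothetical mutation text; the real schema may use different names.
PUBLISH_MUTATION = """
mutation PublishResponse($sessionId: String!, $userId: String!, $data: String!) {
  publishResponse(sessionId: $sessionId, userId: $userId, data: $data) {
    sessionId
    userId
    data
  }
}
"""


def build_mutation_payload(session_id: str, user_id: str, token_event: dict) -> str:
    """Serialize a GraphQL request body, passing the token event as a variable."""
    return json.dumps({
        "query": PUBLISH_MUTATION,
        "variables": {
            "sessionId": session_id,
            "userId": user_id,
            # The event is JSON-encoded into a single string variable.
            "data": json.dumps(token_event),
        },
    })


class BuildMutationPayloadTest(unittest.TestCase):
    def test_token_round_trips_through_variables(self):
        event = {"token": 'say "hi"'}
        payload = json.loads(build_mutation_payload("s-1", "u-1", event))
        self.assertIn("publishResponse", payload["query"])
        self.assertEqual(payload["variables"]["sessionId"], "s-1")
        self.assertEqual(json.loads(payload["variables"]["data"]), event)
```

Passing the token through GraphQL variables rather than interpolating it into the query string means quotes and newlines in model output need no manual escaping.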

Backward Compatibility

  • Breaking Changes: None
  • Default Behavior: Disabled (directSend: false)
  • Migration: Opt-in via configuration

Deployment Notes

When enabled, Lambda functions require:

  • APPSYNC_ENDPOINT environment variable
  • DIRECT_SEND=true environment variable
  • IAM permissions: appsync:GraphQL (query/mutation)
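The IAM permission is needed because requests to the endpoint are SigV4-signed. The following is a standard-library sketch of how such a signature could be computed for a POST to APPSYNC_ENDPOINT. It is not the implementation in appsync.py, which may well delegate signing to botocore. `sign_graphql_request` and its parameters are hypothetical, and the signing service name is assumed to be `appsync`.

```python
import datetime
import hashlib
import hmac
from typing import Dict, Optional
from urllib.parse import urlparse


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sign_graphql_request(endpoint: str, body: str, access_key: str,
                         secret_key: str, region: str,
                         session_token: Optional[str] = None,
                         now: Optional[datetime.datetime] = None) -> Dict[str, str]:
    """Return headers (including Authorization) for a SigV4-signed POST of body."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    url = urlparse(endpoint)

    headers = {
        "content-type": "application/json",
        "host": url.netloc,
        "x-amz-date": amz_date,
    }
    if session_token:
        # Temporary credentials (as in Lambda) must sign the session token too.
        headers["x-amz-security-token"] = session_token

    # Step 1: canonical request (no query string for a GraphQL POST).
    signed_headers = ";".join(sorted(headers))
    canonical_headers = "".join(f"{k}:{headers[k]}\n" for k in sorted(headers))
    payload_hash = hashlib.sha256(body.encode("utf-8")).hexdigest()
    canonical_request = "\n".join(
        ["POST", url.path or "/", "", canonical_headers, signed_headers, payload_hash]
    )

    # Step 2: string to sign, scoped to date/region/service.
    scope = f"{date_stamp}/{region}/appsync/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Step 3: derive the signing key and compute the signature.
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, "appsync", "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    headers["authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return headers
```

The returned headers would be sent with the same `body` bytes that were hashed. Any change to the body after signing invalidates the signature.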

Checklist

  • Tests added and passing
  • Documentation updated
  • Backward compatible
  • Tested in deployed environment

Michael Raskansky added 2 commits November 19, 2025 14:01
This feature enables Lambda handlers to send tokens directly to clients
via AppSync, bypassing the SNS/SQS path for improved latency.

Changes:
- Add directSend configuration option
- Add AppSync direct send utility
- Update websocket handler to support direct send
- Update model interfaces (langchain, idefics) to support direct send

feat: apply directSend to bedrock-agents interface

Extend directSend feature to the new bedrock-agents interface to enable
direct token streaming via AppSync for Bedrock Agent responses.

docs: add directSend configuration documentation

fix: correct import and boolean logic in websocket.py

- Import ChatbotAction from genai_core.types instead of index
- Fix the DIRECT_SEND environment variable check so that its value is evaluated as a boolean
- Fix grammar in configuration prompt

test: add unit tests for directSend feature

- Test directSend routing logic for token vs non-token actions
- Test environment variable configuration
- Test AppSync mutation formatting
- Remove verbose request/response logging to reduce CloudWatch costs
- Only log errors when status != 200
- Keep X-Ray tracing for observability