Commit c633d75

Author: Bob Strahan
Merge branch 'develop' v0.3.19
2 parents b419a3c + a3bbea0, commit c633d75

103 files changed: +9660 -1453 lines changed


.gitlab-ci.yml

Lines changed: 20 additions & 0 deletions
@@ -16,6 +16,7 @@ image: public.ecr.aws/docker/library/python:3.13-bookworm

 stages:
   - developer_tests
+  - deployment_validation
   - integration_tests

 developer_tests:
@@ -93,4 +94,23 @@ integration_tests:
     - poetry install
     - make put
     - make wait
+
+deployment_validation:
+  stage: deployment_validation
+  rules:
+    - when: always
+
+  before_script:
+    - apt-get update -y
+    - apt-get install curl unzip python3-pip -y
+    # Install AWS CLI
+    - curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
+    - unzip awscliv2.zip
+    - ./aws/install
+    # Install PyYAML for template analysis
+    - pip install PyYAML
+
+  script:
+    # Check if service role has sufficient permissions for main stack deployment
+    - python3 scripts/validate_service_role_permissions.py
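
The diff above does not show the contents of `scripts/validate_service_role_permissions.py`, so the following is only a hypothetical sketch of the kind of check such a job might run. It assumes the CloudFormation template has already been parsed into a dict (for example with PyYAML) and that the role's allowed IAM actions are available as a list of patterns. The function names `required_services` and `uncovered_services` are illustrative, not taken from the repository. It also assumes the resource-type service name equals the IAM action prefix, which does not hold for every service (Step Functions, for instance, uses `states`).

```python
from fnmatch import fnmatchcase


def required_services(template: dict) -> set:
    """Collect lowercase service names from resource types such as AWS::S3::Bucket."""
    services = set()
    for resource in template.get("Resources", {}).values():
        parts = resource.get("Type", "").split("::")
        # Skip custom resources and malformed types; only AWS::<Service>::<Kind> counts.
        if len(parts) >= 2 and parts[0] == "AWS":
            services.add(parts[1].lower())
    return services


def uncovered_services(template: dict, allowed_actions: list) -> set:
    """Return services for which no allowed action pattern names a matching prefix."""
    # "s3:GetObject" -> "s3"; a bare "*" stays "*" and matches every service.
    prefixes = [action.split(":")[0].lower() for action in allowed_actions]
    return {
        service
        for service in required_services(template)
        if not any(fnmatchcase(service, prefix) for prefix in prefixes)
    }


if __name__ == "__main__":
    # Hypothetical template and policy, used only to exercise the sketch.
    template = {
        "Resources": {
            "Bucket": {"Type": "AWS::S3::Bucket"},
            "Fn": {"Type": "AWS::Lambda::Function"},
        }
    }
    missing = uncovered_services(template, ["s3:*"])
    print("Services without any allowed action:", sorted(missing))
```

A real validator would need a mapping from resource types to the specific IAM actions CloudFormation calls, so treat this as a coarse first pass rather than a permission proof.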

CHANGELOG.md

Lines changed: 66 additions & 0 deletions
@@ -5,8 +5,74 @@ SPDX-License-Identifier: MIT-0

 ## [Unreleased]

+
+## [0.3.19]
+
+### Added
+
+- **Error Analyzer (Troubleshooting Tool) for AI-Powered Failure Diagnosis**
+  - Introduced intelligent AI-powered troubleshooting agent that automatically diagnoses document processing failures using Claude Sonnet 4 with the Strands agent framework
+  - **Key Capabilities**: Natural language query interface, intelligent routing between document-specific and system-wide analysis, multi-source data correlation (CloudWatch Logs, DynamoDB, Step Functions), root cause identification with actionable recommendations, evidence-based analysis with collapsible log details
+  - **Web UI Integration**: Accessible via "Troubleshoot" button on failed documents with real-time job status, progress tracking, automatic job resumption, and formatted results (Root Cause, Recommendations, Evidence sections)
+  - **Tool Ecosystem**: 8 specialized tools including analyze_errors (main router), analyze_document_failure, analyze_recent_system_errors, CloudWatch log search tools, DynamoDB integration tools, and Lambda context retrieval - additional tools will be added as the feature evolves.
+  - **Configuration**: Configurable via Web UI including model selection (Claude Sonnet 4 recommended), system prompt customization, max_log_events (default: 5), and time_range_hours_default (default: 24)
+  - **Documentation**: Comprehensive guide in `docs/error-analyzer.md` with architecture diagrams, usage examples, best practices, troubleshooting guide.
+
+- **Claude Sonnet 4.5 Model Support**
+  - Added support for Claude Sonnet 4.5 and Claude Sonnet 4.5 - Long Context models
+  - Available for configuration across all document processing steps
+
+
+### Fixed
+- **Problem with setting correctly formatted WAF IPv4 CIDR range** - #73
+
+- **Duplicate Step Functions Executions on Document Reprocess - [GitHub Issue #66](https://github.com/aws-solutions-library-samples/accelerated-intelligent-document-processing-on-aws/issues/66)**
+  - Eliminated duplicate workflow executions when reprocessing large documents (>40MB, 500+ pages)
+  - **Root Cause**: S3 `copy_object` operations were triggering multiple "Object Created" events for large files, causing `queue_sender` to create duplicate document entries and workflow executions
+  - **Solution**: Refactored `reprocess_document_resolver` to directly create fresh Document objects and queue to SQS, completely bypassing S3 event notifications
+  - **Benefits**: Eliminates unnecessary S3 copy operations (cost savings)
+
+## [0.3.18]
+
+### Added
+
+- **Lambda Function Execution Cost Metering for Complete Cost Visibility**
+  - Added Lambda execution cost tracking to all core processing functions across all three processing patterns
+  - **Dual Metrics**: Tracks both invocation counts ($0.20 per 1M requests) and GB-seconds duration ($16.67 per 1M GB-seconds) aligned with official AWS Lambda pricing
+  - **Context-Specific Tracking**: Separate cost attribution for each processing step enabling granular cost analysis per document processing context
+  - **Automatic Integration**: Lambda costs automatically integrate with existing cost reporting infrastructure and appear alongside AWS service costs (Textract, Bedrock, SageMaker)
+  - **Configuration Integration**: Added Lambda pricing entries to all 7 configuration files in `config_library/` using official US East pricing
+
+### Fixed
+- Defect in v0.3.17 causing workflow tracker failure to (1) update status of failed workflows, and (2) update reporting database for all workflows #72
+
+
+## [0.3.17]
+
 ### Added

+- **Edit Sections Feature for Modifying Class/Type and Reprocessing Extraction**
+  - Added Edit Sections interface for Pattern-2 and Pattern-3 workflows with reprocessing optimization
+  - **Key Features**: Section management (create, update, delete), classification updates, page reassignment with overlap detection, real-time validation
+  - **Selective Reprocessing**: Only modified sections are reprocessed while preserving existing data for unmodified sections
+  - **Processing Pipeline**: All functions (OCR/Classification/Extraction/Assessment) automatically skip redundant operations based on data presence
+  - **Pattern Compatibility**: Full functionality for Pattern-2/Pattern-3, informative modal for Pattern-1 explaining BDA not yet supported
+
+- **Analytics Agent Schema Optimization for Improved Performance**
+  - **Embedded Database Overview**: Complete table listing and guidance embedded directly in system prompt (no tool call needed)
+  - **On-Demand Detailed Schemas**: `get_table_info(['specific_tables'])` loads detailed column information only for tables actually needed by the query
+  - **Significant Performance Gains**: Eliminates redundant tool calls on every query while maintaining token efficiency
+  - **Enhanced SQL Guidance**: Comprehensive Athena/Trino function reference with explicit PostgreSQL operator warnings to prevent common query failures like `~` regex operator mistakes
+  - **Faster Time-to-Query**: Agent has immediate access to table overview and can proceed directly to detailed schema loading for relevant tables
+
+### Changed
+- Add UI code lint/validation to publish.py script
+
+### Fixed
+- Fix missing data in Glue tables when using a document class that contains a dash (-).
+- Added optional Bedrock Guardrails support to (a) Agent Analytics and (b) Chat with Document
+- Fixed regressions on Permission Boundary support for all roles, and added automated tests to prevent recurrence - fixes #70
+
 ## [0.3.16]

 ### Added
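
The 0.3.18 entry above quotes two Lambda rates: $0.20 per 1M requests and $16.67 per 1M GB-seconds. As a rough worked example of how those two metrics combine, the sketch below computes a cost from an invocation count, a memory size, and an average duration. The function name `lambda_cost` and its parameters are hypothetical and are not taken from the repository's metering code. It also ignores free-tier allowances and billing-granularity rounding.

```python
# Rates quoted in the changelog entry (US East pricing, per the entry).
REQUEST_PRICE_PER_MILLION = 0.20
GB_SECOND_PRICE_PER_MILLION = 16.67


def lambda_cost(invocations: int, memory_mb: int, avg_duration_seconds: float) -> dict:
    """Estimate Lambda cost from request count and GB-seconds of duration."""
    # GB-seconds = invocations x memory in GB x seconds per invocation.
    gb_seconds = invocations * (memory_mb / 1024) * avg_duration_seconds
    request_cost = invocations / 1_000_000 * REQUEST_PRICE_PER_MILLION
    duration_cost = gb_seconds / 1_000_000 * GB_SECOND_PRICE_PER_MILLION
    return {
        "gb_seconds": gb_seconds,
        "request_cost": request_cost,
        "duration_cost": duration_cost,
        "total_cost": request_cost + duration_cost,
    }


if __name__ == "__main__":
    # Illustrative inputs: 1,000 invocations of a 512 MB function, 2 s each.
    estimate = lambda_cost(invocations=1_000, memory_mb=512, avg_duration_seconds=2.0)
    for name, value in estimate.items():
        print(f"{name}: {value:.6f}")
```

For these illustrative inputs the duration term (1,000 GB-seconds) is far larger than the request term, which is typical for multi-second functions.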

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.3.16
+0.3.19
