Commit c719cf2

Update: [AEA-6153] - Update RAG Ingestion Type (#373)
## Summary
Update prompts and change chunking type. Remove citations.

---------

Co-authored-by: bencegadanyi1-nhs <bence.gadanyi1@nhs.net>
1 parent bfb1ee8 commit c719cf2

4 files changed: +34 -37 lines changed

Lines changed: 17 additions & 33 deletions
@@ -1,40 +1,24 @@
 # 1. Persona & Logic
 You are an AI assistant for onboarding guidance. Follow these strict rules:
-* **Strict Evidence:** If the answer is missing, do not infer or use external knowledge.
-* **The "List Rule":** If a term (e.g. `on-hold`) exists only in a list/dropdown without a specific definition in the text, you **must** state it is "listed but undefined." Do NOT invent definitions.
-* **Decomposition:** Split multi-part queries into numbered sub-questions (Q1, Q2).
-* **Correction:** Always output `National Health Service England (NHSE)` instead of `NHSD`.
-* **RAG Scores:** `>0.9`: Diamond | `0.8-0.9`: Gold | `0.7-0.8`: Silver | `0.6-0.7`: Bronze | `<0.6`: Scrap (Ignore).
-* **Smart Guidance:** If no information can be found, provide next step direction.
+- **Strict Evidence:** If the answer is missing, do not infer or use external knowledge.
+- **Grounding:** NEVER use your own internal training data, online resources, or prior knowledge.
+- **Decomposition:** Split multi-part queries into numbered sub-questions (Q1, Q2).
 
 # 2. Output Structure
-1. *Summary:* Concise overview (Max 200 chars).
-2. *Answer:* Core response in `mrkdwn` (Max 800 chars).
-3. *Next Steps:* If the answer contains no information, provide useful helpful directions.
-4. Separator: Use "------"
-5. Bibliography: All retrieved documents using the `<cit>` template.
+**Summary**
+2-3 sentences maximum.
 
-# 3. Formatting Rules (`mrkdwn`)
-Use British English.
-* **Bold (`*`):** Headings, Subheadings, Source Names (e.g. `*NHS England*`).
-* **Italic (`_`):** Citations and Titles (e.g. `_Guidance v1_`).
-* **Blockquote (`>`):** Quotes (>1 sentence) and Tech Specs/Examples.
-* **Inline Code (`\``):** System/Field Names and Technical Terms (e.g. `HL7 FHIR`).
-* **Links:** `<text|link>`
-
-# 4. Bibliography Template
-Return **ALL** sources using this exact format:
-<cit>index||summary||excerpt||relevance score</cit>
+**Answer**
+Prioritize detail and specification, focus on the information direct at the question.
 
-# 5. Example
-"""
-*Summary*
-This is a concise, clear answer - without going into a lot of depth.
+# 3. Styling Rules (`mrkdwn`)
+Use British English.
+- **Bold (`*`):** Headings, Subheadings, Source Names, and important information/ exceptions (e.g. `*NHS England*`).
+- **Italic (`_`):** Citations and Titles (e.g. `_Guidance v1_`).
+- **Blockquote (`>`):** Quotes (>1 sentence) and Tech Specs/Examples (e.g. `HL7 FHIR`).
+- **Links:** `[text](link)`.
 
-*Answer*
-A longer answer, going into more detail gained from the knowledge base and using critical thinking.
-------
-<cit>1||Example name||This is the precise snippet of the pdf file which answers the question.||0.98</cit>
-<cit>2||Another example file name||A 500 word text excerpt which gives some inference to the answer, but the long citation helps fill in the information for the user, so it's worth the tokens.||0.76</cit>
-<cit>3||A useless example file's title||This file doesn't contain anything that useful||0.05</cit>
-"""
+# 4. Format Rules
+- NEVER use in-line references or citations (e.g., do not write "(search result 1)" or "[1]").
+- Do NOT refer to the search results by number or name in the body of the text.
+- Do NOT add a "Citations" section at the end of the response.
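
The new "Format Rules" in the prompt forbid in-line references and a trailing "Citations" section. One way to check a model response against those rules is a small post-check like the sketch below. The helper name and the three patterns are illustrative assumptions and are not part of this commit.

```typescript
// Hypothetical post-check for the prompt's "Format Rules".
// The patterns are illustrative assumptions, not taken from the repository.
const citationPatterns: {name: string; pattern: RegExp}[] = [
  // Numeric markers such as "[1]"; markdown links like [text](link) are not matched.
  {name: "numeric reference", pattern: /\[\d+\]/},
  // Phrases such as "(search result 1)".
  {name: "search result reference", pattern: /\(search results? \d+\)/i},
  // A line that consists only of the word "Citations", optionally decorated.
  {name: "citations section", pattern: /^[#*_\s]*citations[*_:\s]*$/im}
]

// Returns the names of every rule the response appears to break.
function findCitationViolations(response: string): string[] {
  return citationPatterns
    .filter(({pattern}) => pattern.test(response))
    .map(({name}) => name)
}

const violations = findCitationViolations("Register the endpoint first [1].")
console.log(violations)
```

A check like this could run in a unit test over sample responses; it only detects surface patterns and would not catch a source referred to by name.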
Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-<user_query>{{user_query}}<user_query>
 
-# CONTEXT
 <search_results>$search_results$<search_results>
+
+<user_query>{{user_query}}<user_query>
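
After this change the search results come before the user query in the template. To preview the filled-in prompt locally, a minimal sketch could substitute both placeholder styles in one pass. `renderTemplate` is a hypothetical helper for local previews only; the source does not show how the placeholders are resolved in the deployed system.

```typescript
// Hypothetical local preview helper; not how the deployed service fills the template.
// Replaces {{name}} and $name$ placeholders in a single pass, so substituted
// values are never scanned again for further placeholders.
function renderTemplate(template: string, values: ReadonlyMap<string, string>): string {
  return template.replace(
    /\{\{(\w+)\}\}|\$(\w+)\$/g,
    (match: string, curly: string | undefined, dollar: string | undefined) => {
      const name = curly ?? dollar
      if (name === undefined) return match
      // Unknown placeholders are left untouched so they are easy to spot.
      return values.get(name) ?? match
    }
  )
}

// Template body as it reads after this commit (tags copied verbatim).
const promptTemplate = [
  "",
  "<search_results>$search_results$<search_results>",
  "",
  "<user_query>{{user_query}}<user_query>"
].join("\n")

const preview = renderTemplate(
  promptTemplate,
  new Map([
    ["search_results", "Example retrieved passage"],
    ["user_query", "How do I start onboarding?"]
  ])
)
console.log(preview)
```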

packages/cdk/resources/BedrockPromptSettings.ts

Lines changed: 2 additions & 2 deletions
@@ -34,8 +34,8 @@ export class BedrockPromptSettings extends Construct {
 
     this.inferenceConfig = {
       temperature: 0,
-      topP: 1,
-      maxTokens: 1500,
+      topP: 0.3,
+      maxTokens: 1024,
       stopSequences: [
         "Human:"
       ]
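
This hunk lowers `topP` from 1 to 0.3 and `maxTokens` from 1500 to 1024. A minimal sketch of a sanity check for settings like these follows. The `InferenceConfig` interface and the accepted ranges are assumptions made for illustration; they mirror the fields visible in the diff, not the CDK construct's own types.

```typescript
// Hypothetical shape mirroring the fields visible in the diff; not the CDK type.
interface InferenceConfig {
  temperature: number
  topP: number
  maxTokens: number
  stopSequences: string[]
}

// Values as they stand after this commit.
const inferenceConfig: InferenceConfig = {
  temperature: 0,
  topP: 0.3,
  maxTokens: 1024,
  stopSequences: ["Human:"]
}

// Returns a list of problems; an empty list means every check passed.
// The ranges are assumed for illustration and may differ per model.
function validateInferenceConfig(config: InferenceConfig): string[] {
  const problems: string[] = []
  if (config.temperature < 0 || config.temperature > 1) {
    problems.push("temperature must be within [0, 1]")
  }
  if (config.topP < 0 || config.topP > 1) {
    problems.push("topP must be within [0, 1]")
  }
  if (!Number.isInteger(config.maxTokens) || config.maxTokens <= 0) {
    problems.push("maxTokens must be a positive integer")
  }
  return problems
}

console.log(validateInferenceConfig(inferenceConfig))
```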

packages/cdk/resources/VectorKnowledgeBaseResources.ts

Lines changed: 13 additions & 0 deletions
@@ -3,6 +3,7 @@ import {Role} from "aws-cdk-lib/aws-iam"
 import {Bucket} from "aws-cdk-lib/aws-s3"
 import {CfnKnowledgeBase, CfnDataSource} from "aws-cdk-lib/aws-bedrock"
 import {
+  ChunkingStrategy,
   ContentFilterStrength,
   ContentFilterType,
   Guardrail,
@@ -162,6 +163,18 @@ export class VectorKnowledgeBaseResources extends Construct {
           bucketArn: props.docsBucket.bucketArn,
           inclusionPrefixes: ["processed/"]
         }
+      },
+      vectorIngestionConfiguration: {
+        chunkingConfiguration: {
+          ...ChunkingStrategy.HIERARCHICAL_TITAN.configuration,
+          hierarchicalChunkingConfiguration: {
+            overlapTokens: 60,
+            levelConfigurations: [
+              {maxTokens: 1000}, // Parent chunk configuration,
+              {maxTokens: 300} // Child chunk configuration
+            ]
+          }
+        }
       }
     })
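
To get a feel for what the parent size of 1000, child size of 300, and overlap of 60 mean in practice, the toy sketch below splits a token array into parent windows and then splits each parent into overlapping child windows. It is not the Bedrock chunker: array items stand in for tokens, and applying the same overlap at both levels is an assumption made for illustration.

```typescript
// Toy sliding-window splitter; array items stand in for tokens.
function slidingWindows<T>(tokens: T[], maxTokens: number, overlapTokens: number): T[][] {
  if (overlapTokens < 0 || overlapTokens >= maxTokens) {
    throw new Error("overlapTokens must be non-negative and smaller than maxTokens")
  }
  const stride = maxTokens - overlapTokens
  const windows: T[][] = []
  for (let start = 0; start < tokens.length; start += stride) {
    windows.push(tokens.slice(start, start + maxTokens))
    // Stop once a window reaches the end, so no trailing window is pure overlap.
    if (start + maxTokens >= tokens.length) break
  }
  return windows
}

interface ParentChunk<T> {
  parent: T[]
  children: T[][]
}

// Assumption: the same overlap is applied at the parent and child levels.
function hierarchicalChunks<T>(
  tokens: T[],
  parentMaxTokens: number,
  childMaxTokens: number,
  overlapTokens: number
): ParentChunk<T>[] {
  return slidingWindows(tokens, parentMaxTokens, overlapTokens).map((parent) => ({
    parent,
    children: slidingWindows(parent, childMaxTokens, overlapTokens)
  }))
}

// A 1000-token document with the values from this commit.
const sampleTokens = Array.from({length: 1000}, (_, index) => index)
const sampleChunks = hierarchicalChunks(sampleTokens, 1000, 300, 60)
for (const chunk of sampleChunks) {
  console.log(chunk.parent.length, chunk.children.map((child) => child.length))
}
```

Under these assumptions a full 1000-token parent yields four child windows, each sharing 60 tokens with its neighbour.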
