
Commit ccc47fb

feat: support fim special tokens configuration (#62)
* fix: support for chat.models override base configuration
* fix: some non-functional bug
* feat: support fim special tokens configuration
1 parent e2070d4 commit ccc47fb

11 files changed: +367 -133 lines

.cspell.json (+4 -1)

@@ -11,13 +11,16 @@
   "useGitignore": true,
   "ignorePaths": [],
   "words": [
+    "codeqwen",
+    "identifer",
     "inversify",
+    "lancedb",
     "ollama",
     "onnx",
     "openai",
     "phodal",
-    "identifer",
     "qianfan",
+    "qwen",
     "tolist",
     "tongyi",
     "xenova"

docs/configuration.md (+52 -21)

@@ -105,42 +105,75 @@ Choose a default model provider to give code generation.
 
 Model for overwrite provider in the provider completion model
 
-### Template
+### FIM Special Tokens
 
-Customize your modeling cue template.
+Fill-in-the-middle (FIM) is a special prompt format, supported by code completion models, that can complete code between two already-written blocks of code.
 
-> [!IMPORTANT]
-> Variables use string replacement, please fill in strictly according to instructions.
+See [Code Completions](./features/code-completion.md).
 
-The recommended format is FIM ( filling in the middle ), for example:
+```json
+{
+  "autodev.completions.fimSpecialTokens": {
+    "prefix": "<PRE>",
+    "suffix": "<SUF>",
+    "middle": "<MID>"
+  }
+}
+```
 
-```sh
-<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>
+### Parameters
 
-# or
+Model generation parameters.
 
-<PRE>{prefix}<SUF>{suffix}<MID>
+> Please don't modify these unless you know what you're doing.
+
+```json
+{
+  "autodev.completions.parameters": {
+    "temperature": 0,
+    "top_p": 0.9,
+    "max_tokens": 500
+  }
+}
 ```
 
+### Stops
+
+Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
+
+### Request Delay
+
+Delay before a code auto-completion request is sent, to avoid excessive consumption of API tokens. `requestDelay` only works if `Autodev: Enable Code Completions` is enabled.
+
+### Enable Legacy Mode
+
+Use the legacy `/v1/completions` endpoint instead of `/v1/chat/completions`. Only works with the `openai` provider.
+
+> Will be deprecated once infill is universally supported or OpenAI drops "/v1/completions" support.
+
+### Template
+
+Customize your model prompt template.
+
+> Please use [FIM Special Tokens](#fim-special-tokens) instead.
+
 Available Variables:
 
 - `prefix` Code before the cursor.
 - `suffix` Code after the cursor.
 - `language` Current editing file language, for example: "javascript".
 
-### Stops
-
-Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
-
-### Enable Legacy Mode
+The recommended format is FIM (fill-in-the-middle), for example:
 
-> Only working `openai` provider
+```sh
+<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>
 
-Use legacy `/v1/completion` instead of `/v1/chat/completion`
+# or
 
-### Request Delay
+<PRE>{prefix}<SUF>{suffix}<MID>
+```
 
-Code auto-completion delay request time. Avoid excessive consumption of API tokens. `requestDelay` only works if `Autodev: Enable Code Completions` is enabled.
+Variables use string replacement, please fill in strictly according to the instructions.
 
 ## Embeddings
 
@@ -201,6 +234,4 @@ See [ollama](https://ollama.com/)
 
 ### Transformers
 
-local model runtime. For codebase and embedding only.
-
-
+Local model runtime. For codebase and embedding only.
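Taken together, the new completion settings documented above can be combined in user settings. A minimal sketch, assuming the defaults declared in the package.json schema further down; the stop sequences here are purely illustrative:

```jsonc
{
  // Up to 4 sequences that cut generation short (illustrative values).
  "autodev.completions.stops": ["\n\n", "<|endoftext|>"],
  // Milliseconds to wait after typing pauses before requesting a completion.
  "autodev.completions.requestDelay": 500,
  // Legacy /v1/completions endpoint; only works with the openai provider.
  "autodev.completions.enableLegacyMode": false
}
```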

docs/features/code-completion.md (+74 -19)

@@ -7,53 +7,108 @@ nav_order: 1
 
 Automatically completes your code based on the position of your cursor.
 
-## Enable Feature
+> We have validated codegemma, codeqwen, and codellama; other models need to be tested on your own.
+
+## Enable Code Completions
 
 Not enabled by default, see [AutoDev: Code Completion](../configuration.md#code-completion)
 
 ```jsonc
 {
-  "autodev.openai.apiKey": "sk-xxxxx", // Your openai api key
-  "autodev.completions.enable": true // Enabled Inline Completions
+  "autodev.completions.enable": true, // Enable or disable
 }
 ```
 
-Now let's try writing some code.
+Next, select a pre-trained code model variant that specializes in code completion and in generating code from prefixes and/or suffixes.
 
-## Select code Model
+## Fill-in-the-middle
 
-You can hope that you use specific code models instead of dialog models
+Fill-in-the-middle (FIM) is a special prompt format, supported by code completion models, that can complete code between two already-written blocks of code.
 
-```jsonc
+[codellama](https://ollama.com/blog/how-to-prompt-code-llama) expects a specific format for infilling code:
+
+```json
+{
+  "autodev.completions.fimSpecialTokens": {
+    "prefix": "<PRE>",
+    "suffix": "<SUF>",
+    "middle": "<MID>"
+  }
+}
+```
+
+[codeqwen](https://github.com/QwenLM/CodeQwen1.5/blob/main/examples/CodeQwen1.5-base-fim.py) expects a specific format for infilling code:
+
+```json
+{
+  "autodev.completions.fimSpecialTokens": {
+    "prefix": "<fim_prefix>",
+    "suffix": "<fim_suffix>",
+    "middle": "<fim_middle>"
+  }
+}
+```
+
+[codegemma](https://ai.google.dev/gemma/docs/formatting) expects a specific format for infilling code:
+
+```json
 {
-  "autodev.completions.model": "gpt-4o" // Overriding the default chat model
+  "autodev.completions.fimSpecialTokens": {
+    "prefix": "<|fim_prefix|>",
+    "suffix": "<|fim_suffix|>",
+    "middle": "<|fim_middle|>"
+  }
 }
 ```
 
-Recommended to use a specially trained code model, or a base model that supports fim.
+For other models, please select the appropriate special format.
+
+**TIP:** codeqwen and codegemma can be used with the same format.
+
+## Best practice
 
-## Connect to local model
+Due to limited resources, we used "ollama" to verify model reliability.
 
-Here is an example of ollama, see [OpenAI compatibility](https://github.com/ollama/ollama/blob/main/docs/openai.md) for details.
+### Use CodeQwen
+
+Support for `codeqwen:7b-code-v1.5-q5_1`.
+
+> The most stable model available.
 
 ```jsonc
 {
-  "autodev.openai.baseURL": "http://127.0.0.1:11434/v1/", // Your local service url
-  "autodev.openai.apiKey": "sk-xxxxx", // Your local service api key
-  "autodev.completions.model": "codeqwen:7b-code-v1.5-q5_1" // Overriding the default chat model
+  "autodev.completions.provider": "ollama",
+  "autodev.completions.model": "codeqwen:7b-code-v1.5-q5_1",
 }
 ```
 
-If your self-built service is deployed in a mode that does not support chat, you may need to enable [legacy mode](#enable-legacy-mode).
+### Use CodeLlama
+
+Support for `codellama:7b`, `codellama:7b-code`, and `codellama:7b-instruct`.
+
+> Unstable generation.
+
+```jsonc
+{
+  "autodev.completions.provider": "ollama",
+  "autodev.completions.model": "codellama:7b-code",
+  "autodev.completions.fimSpecialTokens": {
+    "prefix": "<PRE>",
+    "suffix": "<SUF>",
+    "middle": "<MID>",
+  },
+}
+```
 
-## Enable Legacy Mode
+### Use CodeGemma
 
-The default is the traditional `/v1/completions` instead of `/v1/chat/completions`, but you can fall back to the old mode.
+Support for `codegemma:2b-code`.
 
-> Only working on openai provider
+> Unstable generation.
 
 ```jsonc
 {
-  "autodev.completions.enableLegacyMode": true
+  "autodev.completions.provider": "ollama",
+  "autodev.completions.model": "codegemma:2b-code",
}
 ```
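Putting the pieces together, a full local setup pairs a provider and model with matching FIM tokens and generation parameters. A sketch for the codellama case described above, using the setting keys declared in the package.json schema below; values are illustrative:

```jsonc
{
  "autodev.completions.enable": true,
  "autodev.completions.provider": "ollama",
  "autodev.completions.model": "codellama:7b-code",
  // codellama's infill tokens, per the ollama prompting guide linked above.
  "autodev.completions.fimSpecialTokens": {
    "prefix": "<PRE>",
    "suffix": "<SUF>",
    "middle": "<MID>"
  },
  // Conservative sampling for repeatable completions (the schema defaults).
  "autodev.completions.parameters": {
    "temperature": 0,
    "top_p": 0.9,
    "max_tokens": 500
  }
}
```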

package.json (+68 -12)

@@ -166,7 +166,7 @@
       "properties": {
         "autodev.completions.enable": {
           "type": "boolean",
-          "default": true,
+          "default": false,
           "description": "%configuration.completions.enable.description%",
           "order": 1
         },
@@ -180,38 +180,94 @@
             "tongyi",
             "ollama"
           ],
-          "default": "openai",
+          "default": "ollama",
           "order": 1
         },
         "autodev.completions.model": {
           "type": "string",
           "description": "%configuration.completions.model.description%",
+          "default": "codeqwen:7b-code-v1.5-q5_1",
           "order": 2
         },
-        "autodev.completions.template": {
-          "type": "string",
-          "description": "%configuration.completions.template.description%",
-          "order": 3
+        "autodev.completions.parameters": {
+          "type": "object",
+          "properties": {
+            "temperature": {
+              "type": "number",
+              "default": 0
+            },
+            "top_p": {
+              "type": "number",
+              "default": 0.9
+            },
+            "max_tokens": {
+              "type": "integer",
+              "default": 500
+            }
+          },
+          "additionalProperties": false,
+          "default": {
+            "temperature": 0,
+            "top_p": 0.9,
+            "max_tokens": 500
+          },
+          "description": "%configuration.completions.parameters.description%",
+          "order": 4
+        },
+        "autodev.completions.fimSpecialTokens": {
+          "type": "object",
+          "description": "%configuration.completions.fimSpecialTokens.description%",
+          "properties": {
+            "prefix": {
+              "type": "string",
+              "description": "The prefix of the special token.",
+              "default": "<|fim_prefix|>"
+            },
+            "suffix": {
+              "type": "string",
+              "description": "The suffix of the special token.",
+              "default": "<|fim_suffix|>"
+            },
+            "middle": {
+              "type": "string",
+              "description": "The middle of the special token.",
+              "default": "<|fim_middle|>"
+            }
+          },
+          "additionalProperties": false,
+          "default": {
+            "prefix": "<|fim_prefix|>",
+            "suffix": "<|fim_suffix|>",
+            "middle": "<|fim_middle|>"
+          },
+          "order": 5
         },
         "autodev.completions.stops": {
           "type": "array",
           "items": {
             "type": "string"
           },
+          "additionalProperties": false,
+          "default": [],
          "description": "%configuration.completions.stops.description%",
           "order": 4
         },
+        "autodev.completions.requestDelay": {
+          "type": "integer",
+          "default": 500,
+          "markdownDescription": "%configuration.completions.requestDelay.markdownDescription%",
+          "order": 6
+        },
         "autodev.completions.enableLegacyMode": {
           "type": "boolean",
           "default": false,
           "description": "Use legacy \"/v1/completions\" instead of \"/v1/chat/completions\"",
           "markdownDescription": "%configuration.completions.enableLegacyMode.markdownDescription%",
           "order": 7
         },
-        "autodev.completions.requestDelay": {
-          "type": "integer",
-          "default": 500,
-          "markdownDescription": "%configuration.completions.requestDelay.markdownDescription%",
+        "autodev.completions.template": {
+          "type": "string",
+          "description": "%configuration.completions.template.description%",
           "order": 8
         }
       },
@@ -545,7 +601,7 @@
         "title": "%command.explainCode.title%"
       },
       {
-        "command": "autodev.codelens..optimizeCode",
+        "command": "autodev.codelens.optimizeCode",
        "title": "%command.optimizeCode.title%"
       },
       {
@@ -830,7 +886,7 @@
         "when": "false"
       },
      {
-        "command": "autodev.codelens..optimizeCode",
+        "command": "autodev.codelens.optimizeCode",
         "when": "false"
       },
       {
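Although the Template setting is superseded by FIM Special Tokens, the `autodev.completions.template` string kept at order 8 above still applies plain string replacement. A minimal sketch, assuming the variables documented in docs/configuration.md:

```jsonc
{
  // {prefix} and {suffix} are replaced with the code before and after the cursor.
  "autodev.completions.template": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
}
```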
