
Commit c9f48fd

Merge pull request #13 from JetBrains-Research/tigina/prompts_tools_adjusting
Updated project structure
2 parents 8212291 + 0869b58 commit c9f48fd

30 files changed: +1159 -746 lines changed

configs/template_generation/agent/few_shot.yaml

+1 -1
@@ -6,4 +6,4 @@ model_kwargs:
   seed: 76097149
 prompt:
   _target_: src.eval.prompts.few_shot_prompt.FewShotPrompt
-  execution_system_prompt_path: configs/template_generation/prompts/fleet_execution_system_prompt.md
+  execution_system_prompt_path: configs/template_generation/prompts/fleet_execution_system_prompt_v1.md
Original file line numberDiff line numberDiff line change
@@ -1,45 +1,14 @@
 _target_: src.eval.agents.openai_langchain_agent.OpenAILangchainAgent
 name: planning
-model_name: gpt-3.5-turbo-1106
+model_name: gpt-4-1106-preview
 temperature: 0
 model_kwargs:
   seed: 76097149
 prompt:
   _target_: src.eval.prompts.planning_prompt.PlanningPrompt
-  model_name: gpt-3.5-turbo-1106
+  model_name: gpt-4-1106-preview
   temperature: 0
   model_kwargs:
     seed: 76097149
-  planning_system_prompt: |
-    We are developing a benchmark to assess the quality of project template generation by code agents.
-    The benchmark includes the task of generating project template code based on a description provided in natural language.
-    We will use template repositories in Python, Java, and Kotlin from GitHub as the source of data for this task.
-    The task is to use the given file system API to reproduce a given template based on its textual description.
-    Template is a small compilable project that contains small examples of all mentioned in description libraries, technologies, functionality.
-    The result template should contain the whole file structure of this project, including the code inside files and configuration files for project building and deployment if required.
-    The code should be compilable and contain a minimal example of the required functionality as specified in the description.
-    The file system API provides functionality to explore, create, delete, and modify files and directories in the working directory.
-
-    Now, you should provide a detailed step-by-step action plan for handling the task at hand, clearly emphasizing the sequential order of operations.
-    Your plan should:
-    - Specify the task's end goal and break down the process into individual actions, using bullet points for clarity.
-    - Abstractly describe the logical and conditional flow between actions.
-    - Do not execute any functions or write any code. This is a planning-only phase, intended to create a blueprint for the execution phase.
-  execution_system_prompt: |
-    We are developing a benchmark to assess the quality of project template generation by code agents.
-    The benchmark includes the task of generating project template code based on a description provided in natural language.
-    We will use template repositories in Python, Java, and Kotlin from GitHub as the source of data for this task.
-    Your task is to use the given file system API to reproduce a given template based on its textual description.
-    You should create the file structure of this project, including the code inside files and configuration files for project building and deployment if required.
-    The code should be compilable and contain a minimal example of the required functionality as specified in the description.
-    The file system API provides functionality to explore, create, delete, and modify files and directories in the working directory.
-
-    As a support, you also provided with a plan, which you should follow during the task. For each step of the plan you should:
-    - Call the appropriate function with the required arguments, corresponding directly to the outlined plan's next action.
-    - Evaluate the function call's result and adjust the next steps accordingly, maintaining fidelity to the plan's intended sequence.
-    - If a function call fails or produces undesired outcomes, stop execution process.
-    - Persist through different strategies, avoiding repetition of unsuccessful attempts, until the task is completed or a maximum of 50 steps have been taken.
-    - If you are stuck, do not try to repeat the request, as it will not help. Instead, try to explore the reason for the failure, modify the input or rollback some previous actions which led to the error.
-    - Document solely the function calls and their outcomes. Refrain from additional commentary or explanatory text during this execution phase.
-    Begin execution with the plan start point, adhering strictly to the prescribed operations.
-    The project’s template root directory has already been created, so you don't need to create it. Produce files inside this directory. Treat it as the root directory and use relative paths starting with an empty string ''.
+  planning_system_prompt_path: configs/template_generation/prompts/planning_planning_system_prompt_v2.md
+  execution_system_prompt_path: configs/template_generation/prompts/planning_execution_system_prompt_v2.md
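
Reviewer note: this hunk swaps the inline `planning_system_prompt` and `execution_system_prompt` blocks for `*_path` keys that point at Markdown files. The code that consumes these keys is not part of this excerpt, so the following is only a minimal sketch of how such a path could be resolved into prompt text. The helper name `load_prompt` is hypothetical and is not taken from the repository.

```python
from pathlib import Path


def load_prompt(prompt_path: str) -> str:
    """Read a system prompt from a text/Markdown file (hypothetical helper)."""
    path = Path(prompt_path)
    if not path.is_file():
        # Fail early with a clear message instead of sending an empty prompt.
        raise FileNotFoundError(f"Prompt file not found: {path}")
    return path.read_text(encoding="utf-8")
```

Keeping prompts in versioned files (`..._v1.md`, `..._v2.md`) makes it possible to switch prompt revisions by editing one config value rather than the YAML body.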

configs/template_generation/config.yaml

+3 -3
@@ -3,12 +3,12 @@ hydra:
   job:
     name: ${agent.name}_${agent.model_name}
   run:
-    dir: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/template_generation/${hydra:job.name}
+    dir: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/fleet/template_generation/${hydra:job.name}
   job_logging:
     root:
       handlers: [console, file]
 defaults:
   - _self_
   - data_source: hf
-  - env: fleet
-  - agent: few_shot
+  - env: code_engine
+  - agent: planning

configs/template_generation/data.yaml

+1 -1
@@ -1,4 +1,4 @@
-data_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data
+data_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/fleet
 repos_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/repos
 projects_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/ide_templates
 archives_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/ide_templates_arch

configs/template_generation/env/code_engine.yaml

+2 -1
@@ -2,4 +2,5 @@ _target_: src.eval.envs.code_engine_env.CodeEngineEnv
 host: '127.0.0.1'
 port: '5050'
 docker_image_name: 'mariatigina/code-engine:latest'
-docker_container_name: 'code-engine'
+docker_container_name: 'code-engine'
+tools_path: 'configs/template_generation/tools/code_engine_tools_v2.jsonl'
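
Reviewer note: the new `tools_path` key points at a `.jsonl` file. The file itself is not shown in this excerpt, so the sketch below assumes the usual JSON Lines layout of one JSON object per line; the `name` and `description` fields in the usage example are illustrative assumptions, and `parse_jsonl` is a hypothetical helper rather than the environment's actual loader.

```python
import json
from typing import Any, Dict, Iterable, List


def parse_jsonl(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSON Lines input: one JSON object per non-empty line."""
    records: List[Dict[str, Any]] = []
    for line_number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped:
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            records.append(json.loads(stripped))
        except json.JSONDecodeError as error:
            # Report the offending line so a broken tool spec is easy to find.
            raise ValueError(f"Invalid JSON on line {line_number}: {error}") from error
    return records


# Usage with two assumed tool descriptions (field names are illustrative only).
sample = [
    '{"name": "create-project-tree", "description": "Create directories and files"}',
    '{"name": "set-file-text", "description": "Write text to a file"}',
]
tool_names = [tool["name"] for tool in parse_jsonl(sample)]
```

For a file on disk, the same function can be fed an open file handle, since iterating a text file yields its lines.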
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
+_target_: src.eval.envs.few_shot_env.FewShotEnv
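
Reviewer note: the configs in this commit rely on `_target_` keys, which Hydra resolves to a class through `hydra.utils.instantiate`. As a rough illustration of the idea only, here is a simplified standard-library stand-in; it is not Hydra's implementation and skips features such as nested targets and interpolation.

```python
import importlib
from typing import Any, Dict


def instantiate(config: Dict[str, Any]) -> Any:
    """Build an object from a dict with a dotted `_target_` path (simplified sketch)."""
    params = dict(config)  # copy so the caller's config is not mutated
    module_path, _, attribute = params.pop("_target_").rpartition(".")
    target = getattr(importlib.import_module(module_path), attribute)
    # Remaining keys are passed as keyword arguments to the target.
    return target(**params)


# Usage with a standard-library class in place of a project class.
value = instantiate({"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4})
```

Under this scheme a config such as the one above, which has only a `_target_` key, would construct the class with no arguments.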

configs/template_generation/env/fleet.yaml

-2
This file was deleted.
+2-2
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,3 @@
 repos_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/repos
-gen_templates_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/template_generation
-metrics_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/metrics
+gen_templates_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/fleet/template_generation
+metrics_path: /Users/Maria.Tigina/PycharmProjects/agents-eval-data/fleet/metrics

configs/template_generation/prompts/fleet_execution_system_prompt.md → configs/template_generation/prompts/fleet_execution_system_prompt_v1.md

+4 -54
@@ -10,20 +10,15 @@ DO NOT create any code snippets on this step.
 2.2 Create project structure with directories and files, using tree command output.
 
 3. Content generation step:
-2.1 Add code snippet: ```${'\n'}CONTENT_GENERATION${'\n'}```
-2.3 For each file in the project add code block with file with filepath as type and content
+3.1 Add code snippet: ```${'\n'}CONTENT_GENERATION${'\n'}```
+3.2 For each file in the project add code block with file with filepath as type and content
 
 Strictly follow next rules:
 - ALWAYS write example code in the files without any placeholders.
 - ALWAYS add README file in the root of the project with short but comprehensive description of project's the structure.
 - ALWAYS add README_BUILD file in the root of the project with short but comprehensive description how to build project – describe all build dependencies.
 - ALWAYS add language notation 'PROJECT' for project structure
 
-4. Validation step:
-3.1 Add code snippet: ```${'\n'}VALIDATION${'\n'}```
-3.2 Verbally analyse file content for mistake, errors and incoherence. When file is OK: add code snippet with file path for correct files.
-3.3 Change files according to analysis: add file path and then code block with CORRECTED file content
-
 REMEMBER, I don't have fingers to print myself and my work DEPENDS on YOU.
 
 ############
@@ -96,37 +91,8 @@ fun main() {
 }
 ```
 
-```
-VALIDATION
-```
-
-```/kotlin-project/build.gradle
-OK
-```
-
-```/kotlin-project/gradle.properties
-OK
-```
-
-```/kotlin-project/settings.gradle
-OK
-```
-
-```/kotlin-project/src/main/kotlin/Main.kt
-OK
-```
-
-kotlin-project/README.md file content is missing. Fixing it by adding content.
-```/kotlin-project/README.md
-[Description of the generated project]
-```
-
-kotlin-project/README_BUILD.md file content is missing. Fixing it by adding content.
-```/kotlin-project/README_BUILD.md
-The project uses
-- Gradle build system.
-- Java SDK version 18
-```
+############
+Example 2:
 
 ```
 PLANNING
@@ -227,19 +193,3 @@ The project uses
 - Maven build system
 - Java SDK version 18
 ```
-
-```
-VALIDATION
-```
-
-```/groovy-project/pom.xml
-OK
-```
-
-```/groovy-project/README.md
-OK
-```
-
-```/groovy-project/README_BUILD.md
-OK
-```
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
+We are developing a benchmark to assess the quality of project template generation by code agents.
+The benchmark includes the task of generating project template code based on a description provided in natural language.
+We will use template repositories in Python, Java, and Kotlin from GitHub as the source of data for this task.
+Your task is to use the given file system API to reproduce a given template based on its textual description.
+You should create the file structure of this project, including the code inside files and configuration files for project building and deployment if required.
+The code should be compilable and contain a minimal example of the required functionality as specified in the description.
+The file system API provides functionality to explore, create, delete, and modify files and directories in the working directory.
+
+As a support, you also provided with a plan, which you should follow during the task. For each step of the plan you should:
+- Call the appropriate function with the required arguments, corresponding directly to the outlined plan's next action.
+- Evaluate the function call's result and adjust the next steps accordingly, maintaining fidelity to the plan's intended sequence.
+- If a function call fails or produces undesired outcomes, stop execution process.
+- Persist through different strategies, avoiding repetition of unsuccessful attempts, until the task is completed or a maximum of 50 steps have been taken.
+- If you are stuck, do not try to repeat the request, as it will not help. Instead, try to explore the reason for the failure, modify the input or rollback some previous actions which led to the error.
+- Document solely the function calls and their outcomes. Refrain from additional commentary or explanatory text during this execution phase.
+Begin execution with the plan start point, adhering strictly to the prescribed operations.
+The project’s template root directory has already been created, so you don't need to create it. Produce files inside this directory. Treat it as the root directory and use relative paths starting with an empty string ''.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
+For each step of the provided plan you should:
+- Call the appropriate function with the required arguments, corresponding directly to the outlined plan's next action.
+- Evaluate the function call's result and adjust the next steps accordingly, maintaining fidelity to the plan's intended sequence.
+- If a function call fails or produces undesired outcomes, try to solve the problem by fining the reason of the fail and selecting workaround path.
+- Persist through different strategies, avoiding repetition of unsuccessful attempts, until the task is completed.
+- If you are stuck, do not try to repeat the request, as it will not help. Instead, try to explore the reason for the failure, modify the input or rollback some previous actions which led to the error.
+- Document solely the function calls and their outcomes. Refrain from additional commentary or explanatory text during this execution phase.
+Begin execution with the plan start point, adhering strictly to the prescribed operations.
+The project’s template root directory has already been created, so you don't need to create it. Produce files inside this directory. Treat it as the root directory and use relative paths starting with an empty string ''.
+
+############
+Example 1:
+
+DESCRIPTION:
+Generate template for project named `ktor-samples__chat` which should be mainly written on Kotlin programing language and aligns following description:
+A chat application written with Ktor using WebSockets and Sessions.
+
+PLAN:
+Detailed Step-by-Step Action Plan for Generating the Project Template
+1. Define the Project Directory Structure
+2. Generate files content and Write Files
+
+EXECUTION:
+1. Using function calling invoke `create-project-tree` function with `parentDirectory` argument is "" and `tree_structure` argument equals to
+```
+ktor-samples__chat/
+├── README.md
+├── build.gradle
+├── tests/
+└── src/
+    ├── backendMain/
+    │   ├── kotlin/
+    │   │   ├── ChatApplication.kt
+    │   │   └── ChatServer.kt
+    │   └── resources/
+    │       ├── application.conf
+    │       ├── logback.xml
+    │       └── web/
+    │           └── index.html
+    ├── backendTest/
+    │   └── kotlin/
+    │       └── ChatApplicationTest.kt
+    └── frontendMain/
+        └── kotlin/
+            └── main.kt
+```
+Make sure your tree in format of command line tool tree format and contains only │├└ and ─ symbols for brunch drawing
+
+2. For each file in file tree using function calling invoke `set-file-text` function.
+For example, you can start with `filePath` argument equals to 'ktor-samples__chat/chat/src/backendMain/kotlin/ChatApplication.kt' and `text` equals to
+```
+package io.ktor.samples.chat.backend
+
+import io.ktor.server.application.*
+import io.ktor.server.engine.*
+import io.ktor.server.http.content.*
+
+fun main() {
+    embeddedServer(Netty, port = 8080) {
+        ChatApplication().apply { main() }
+    }.start(wait = true)
+}
+```
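
Reviewer note: the prompt above asks the model to pass a `tree`-style listing to `create-project-tree`. How the tool parses that listing is not shown in this commit, so the following is only a sketch of one way to turn such text into relative paths. It assumes four-character indentation units and a trailing `/` on directory names, as in the example; `tree_to_paths` is a hypothetical name.

```python
from typing import List

CONNECTORS = ("├── ", "└── ")


def tree_to_paths(tree_text: str) -> List[str]:
    """Convert `tree`-style text into relative paths (directories keep a trailing '/')."""
    paths: List[str] = []
    stack: List[str] = []  # directory names from the root down to the current parent
    for line in tree_text.splitlines():
        if not line.strip():
            continue
        # A line holds at most one connector; find() returns -1 for the other one.
        position = max(line.find(connector) for connector in CONNECTORS)
        if position == -1:
            depth, name = 0, line.strip()  # the root line has no connector
        else:
            depth, name = position // 4 + 1, line[position + 4:].strip()
        del stack[depth:]  # drop directories that are deeper than this entry
        is_directory = name.endswith("/")
        name = name.rstrip("/")
        paths.append("/".join(stack + [name]) + ("/" if is_directory else ""))
        if is_directory:
            stack.append(name)
    return paths


sample_tree = """demo/
├── README.md
└── src/
    ├── main/
    │   └── App.kt
    └── test/
"""
sample_paths = tree_to_paths(sample_tree)
```

A parser of this kind depends on exact indentation, which is one reason the prompt insists on the `tree` command's drawing symbols.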
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
+We are developing a benchmark to assess the quality of project template generation by code agents.
+The benchmark includes the task of generating project template code based on a description provided in natural language.
+We will use template repositories in Python, Java, and Kotlin from GitHub as the source of data for this task.
+The task is to use the given file system API to reproduce a given template based on its textual description.
+Template is a small compilable project that contains small examples of all mentioned in description libraries, technologies, functionality.
+The result template should contain the whole file structure of this project, including the code inside files and configuration files for project building and deployment if required.
+The code should be compilable and contain a minimal example of the required functionality as specified in the description.
+The file system API provides functionality to explore, create, delete, and modify files and directories in the working directory.
+Now, you should provide a detailed step-by-step action plan for handling the task at hand, clearly emphasizing the sequential order of operations.
+Do not execute any functions or write any code. This is a planning-only phase, intended to create a blueprint for the execution phase.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
+We are developing a benchmark to assess the quality of project template generation by textual description using AI Code Agents.
+Template is a simple compilable project that contains small examples of all mentioned in description libraries, technologies, functionality.
+The task is to generate project template based on its textual description using the given file system API function calls.
+The file system API provides functionality to explore, create, delete, and modify files and directories in the working directory.
+The result project template should contain:
+1. The whole complete directory structure of this project
+2. Code inside files. The code should be compilable and contain a minimal example of the required functionality as specified in the description.
+3. README.md with template description and project setup instructions
+4. Configuration files for project building and deployment (requirements.txt for Python, build.gradle for Java, build.gradle.kts for Kotlin)
+Now, you should provide a detailed step-by-step action plan for handling the task at hand, clearly emphasizing the sequential order of operations.
+Do not execute any functions or write any code. This is a planning-only phase, intended to create a blueprint for the execution phase.
