-
To resolve the issue with the Groq API struggling to generate structured output for longer context examples in your coding assistant implementation, you can use the `with_structured_output` method:

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_groq import ChatGroq


class Code(BaseModel):
    """Code output"""

    prefix: str = Field(description="Description of the code snippet. Describes the detailed function usage, the arguments, outputs and gives an example if needed.")
    imports: str = Field(description="Code block import statements")
    function_name: str = Field(description="Name of the function")
    code: str = Field(description="Executable code block using PEP 8 code style. Not including import statements and not including applying the function")


chat = ChatGroq(
    temperature=0,
    model="llama3-70b-8192",
    # api_key=""  # Optional if not set as an environment variable
)

structured_llm = chat.with_structured_output(Code, include_raw=True)
response = structured_llm.invoke("Generate a Python function with detailed description, imports, and code.")
print(response)
```

This method leverages the tool-calling functionality to ensure the output matches the specified schema, which can help manage longer context examples more effectively. Additionally, Groq supports JSON mode, which can be specified if needed:

```python
structured_llm = chat.with_structured_output(Code, method="json_mode", include_raw=True)
response = structured_llm.invoke(
"Generate a Python function with detailed description, imports, and code. Respond in JSON with `prefix`, `imports`, `function_name`, and `code` keys."
)
print(response)
```

However, be aware that there are known issues with the Groq API's structured output functionality.
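Since the calls above use `include_raw=True`, a failed parse can be inspected instead of raising. A minimal sketch of such a check, following the return shape LangChain documents for `with_structured_output` with `include_raw=True`:

```python
# With include_raw=True, the result is a dict with the keys "raw" (the model
# message), "parsed" (the Code instance, or None) and "parsing_error"
# (the exception raised during schema parsing, if any).
if response["parsing_error"] is not None:
    print("Schema parsing failed:", response["parsing_error"])
    print("Raw model output:", response["raw"].content)
else:
    code_obj = response["parsed"]  # validated Code instance
    print(code_obj.function_name)
```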
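For a more robust pipeline, the structured Groq model can also be wrapped with a fallback. A minimal sketch using LangChain's `Runnable.with_fallbacks`, assuming an OpenAI key is available and reusing the `Code` schema above (the fallback model name is illustrative):

```python
from langchain_groq import ChatGroq
from langchain_openai import ChatOpenAI

groq_structured = ChatGroq(temperature=0, model="llama3-70b-8192").with_structured_output(Code)
openai_structured = ChatOpenAI(temperature=0, model="gpt-4o").with_structured_output(Code)

# The OpenAI runnable is only invoked if the Groq call raises an exception,
# e.g. a failed tool call or schema parse.
robust_llm = groq_structured.with_fallbacks([openai_structured])
response = robust_llm.invoke("Generate a Python function with detailed description, imports, and code.")
```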
-
Any solution here?
-
Yes, sure. The custom parser really depends on your problem and also on the way you prompt the LLM. I advise you to clearly prompt the LLM to generate an output with the structure you need/want, and then write a parser. My way to go was to adjust the prompt to get a reproducible output structure; I then used ChatGPT to write a parser for my outputs, given some examples. It works fine for me so far, I hope this helps you. The following example shows the instruction prompt, query, and parser for my problem.

**Instruction Prompt**

```python
prompt_instruct_gen = '''You are a coding assistant with expertise to write specific python functions.
Here is a python code example with a specific task \n ------- \n {code_example_context} \n ------- \n
Write the python code for a new set of parameters given in the user question.
Ensure the following JSON structure
\{{
"imports": "...", # of external tools
"code": """...""", # code block
\}}
\n Customize the code to answer the following setup and parameters:'''
```

**Query**

```python
query = f'''Write the python code to <<YOUR TASK>>
Given is the following set of parameters: \n ------- \n {params_input_file} \n ------- \n
Insert the generated function code in the <<<FUNCTION CONTENT>>> section.
Add a list of the variable parameter names in <<<VARIABLE PARAMETERS>>>.
Return the entire code block, including the function definition and the content, in executable Python format following PEP 8 style.
List the function arguments and the required dictionary keys in the function documentation header.
IMPORTANT: make sure you use the correct name for the custom import file. Depending on the project name defined in the parameters file
code:"""
imports ... # make sure of the correct import names depending on the project_name
def generate_geometry(argument1):
"""
Description of function.
Args:
argument1: Interactive instance ....
keys: <<<VARIABLE PARAMETERS>>>
Returns:
none
"""
<<<FUNCTION CONTENT>>>
return
"""
Respond in JSON with string keys: "imports": "...", "code": """...""".
'''
```

**Parser**

```python
import re


def custom_json_code_output_parser(input_string):
    """
    Parser for the code output into JSON format with the keys `imports` and `code`.
    The function searches for the defined keys and the defined JSON pattern in the
    input string and parses it into a dictionary.
    Arguments:
        input_string (str): code string generated by the LLM
    Return:
        parsed_dict (dict): parsed dictionary
    """
    # First, try to match a code value wrapped in triple quotes
    pattern = re.compile(
        r'"imports":\s*"([^"]+)",\s*'
        r'"code":\s*"""(.*?)"""\s*}', re.DOTALL
    )
    match = pattern.search(input_string)
    if not match:
        # Fall back to a code value wrapped in plain double quotes
        pattern = re.compile(
            r'"imports":\s*"([^"]+)",\s*'
            r'"code":\s*"(.*?)"\s*}', re.DOTALL
        )
        match = pattern.search(input_string)
    if not match:
        raise ValueError(f"Input string does not match the expected format! {input_string}")
    imports_raw, code_raw = match.groups()
    # Unescape literal \n and \" sequences emitted by the LLM
    imports = imports_raw.replace('\\n', '\n')
    code = (code_raw.replace('\\n', '\n')
                    .replace('\\"', '"'))
    # Create the result dictionary
    parsed_dict = {
        "imports": imports,
        "code": code,
    }
    return parsed_dict
```
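For illustration, here is a hypothetical raw LLM response run through the parser; the input string is made up to match the JSON structure requested in the prompt:

```python
# Made-up LLM output; the \\n stays a literal backslash-n until the parser
# unescapes it into a real newline.
raw_output = '''{
"imports": "import numpy as np",
"code": """def generate_geometry(argument1):\\n    return"""
}'''

parsed = custom_json_code_output_parser(raw_output)
print(parsed["imports"])  # -> import numpy as np
print(parsed["code"])     # -> the def line and the indented return on two lines
```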
-
I am facing the same problem, but it shows an API error, even though I have set the API key as an environment variable.
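A quick sanity check for that situation (a minimal sketch; `ChatGroq` reads `GROQ_API_KEY` from the environment by default, and the key can also be passed explicitly):

```python
import os

from langchain_groq import ChatGroq

# The variable must be visible to the Python process itself,
# not just exported in a shell profile that was never reloaded.
print(os.environ.get("GROQ_API_KEY"))  # should print the key, not None

# Passing the key explicitly bypasses the environment lookup entirely.
chat = ChatGroq(model="llama3-70b-8192", api_key=os.environ["GROQ_API_KEY"])
```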
-
I have a scenario similar to the question. In my case, I also expect a specific JSON output from the LLM, and I am using Groq for this. That being said, what I have observed is that, depending on the number of output tokens from Groq, I sometimes get the error. When the number of tokens is less than 2000, my responses are as expected. These input and output token values can be found on the Groq console page. I hope this helps someone.
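If the failures really do start around that output size, capping generation length is one workaround. A sketch under that assumption, reusing the `Code` schema from the first reply (the 2000 value comes from the observation above):

```python
from langchain_groq import ChatGroq

# Cap output below the ~2000-token threshold where structured output
# was observed to start failing.
chat = ChatGroq(temperature=0, model="llama3-70b-8192", max_tokens=2000)
structured_llm = chat.with_structured_output(Code, include_raw=True)
```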
-
Description
I'm trying to implement a coding assistant using an LLM with structured output (`llm.with_structured_output`), similar to the LangChain tutorial. Using the OpenAI API everything works fine, but with the Groq API the model struggles to generate structured output, especially with longer context examples. To me the model output looks quite okay, and I don't know why the tool/function caller struggles with it.
I use `llama3-70b-8192`; using a model with a larger context length results in the same error. Any idea what the problem is, or how to implement a fallback for a more robust pipeline?
Thanks for your help
The error message:
System Info
Platform: Linux
Python: 3.11.9
langchain 0.2.6
langchain-community 0.2.6
langchain-core 0.2.11
langchain-experimental 0.0.62
langchain-groq 0.1.6
langchain-nomic 0.1.2
langchain-openai 0.1.14
langchain-text-splitters 0.2.2
langgraph 0.1.5
langsmith 0.1.83