How does the OpenAI framework convert ChatCompletion training data into a model-ready format (SFT)? #2075

I have a question regarding the internal processing pipeline for Supervised Fine‑Tuning (SFT) when using the ChatCompletion-style JSON format for training data.

{
  "messages": [
    {
      "role": "system",
      "content": "You are an ..."
    },
    {
      "role": "user",
      "content": "Please tell me about the paper \"K205...\" ..."
    },
    {
      "role": "assistant",
      "content": "The paper titled \"K205R specific nanobody...\""
    }
  ]
}

How does the framework internally transform the ChatCompletion training data format into model-ready input, and how is the loss computed?
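
To make the question concrete, here is a minimal sketch of how open-source SFT pipelines commonly handle this kind of data. It is not a confirmed description of OpenAI's internal pipeline, which is not public. The ChatML-like markers (`<|im_start|>`, `<|im_end|>`), the whitespace tokenizer, and the helper names `render_and_mask`, `make_labels`, and `masked_cross_entropy` are all hypothetical and used only for illustration. The sketch assumes that only assistant tokens contribute to the loss and that other positions are labeled `-100`, the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
import math

IGNORE_INDEX = -100  # assumed "skip this position" label (PyTorch's default ignore_index)


def render_and_mask(messages):
    """Flatten chat messages into tokens plus a per-token loss mask.

    Hypothetical ChatML-like template and whitespace tokenizer, for illustration only.
    """
    tokens, mask = [], []
    for msg in messages:
        header = ["<|im_start|>", msg["role"]]
        body = msg["content"].split() + ["<|im_end|>"]
        trainable = msg["role"] == "assistant"  # assumption: train on assistant turns only
        tokens += header + body
        mask += [False] * len(header) + [trainable] * len(body)
    return tokens, mask


def make_labels(token_ids, mask):
    """Shift by one for next-token prediction: position t predicts token t + 1."""
    inputs = token_ids[:-1]
    labels = [tid if m else IGNORE_INDEX for tid, m in zip(token_ids[1:], mask[1:])]
    return inputs, labels


def masked_cross_entropy(logits, labels):
    """Mean negative log-likelihood over positions whose label is not IGNORE_INDEX."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[label]
        count += 1
    return total / count if count else 0.0


# Toy conversation standing in for one training example.
messages = [
    {"role": "system", "content": "You are helpful"},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello there"},
]
tokens, mask = render_and_mask(messages)
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
token_ids = [vocab[tok] for tok in tokens]
inputs, labels = make_labels(token_ids, mask)

# Stand-in for model output: uniform logits, one row per input position.
logits = [[0.0] * len(vocab) for _ in inputs]
loss = masked_cross_entropy(logits, labels)
print(loss)
```

With uniform logits the loss equals the log of the vocabulary size, which is a quick sanity check that only the three assistant positions (`Hello`, `there`, `<|im_end|>`) are being averaged. Whether system and user tokens are actually masked in the hosted fine-tuning service is exactly what this issue is asking.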
