A collection of system prompts for cleaning up and formatting text generated by speech-to-text (STT) software.
This repository contains system prompts designed to transform raw speech-to-text output into polished, readable content. While I've created specialized prompts for specific formats like emails and meeting notes, the most essential one is what I call a "cleanup prompt."
I've found that using a well-crafted cleanup prompt can significantly improve the quality of STT-generated text. Through trial and error, I've developed prompts that strike the right balance between fixing common STT issues and preserving your authentic voice.
The key insight I've discovered is explicitly instructing the AI that the text was generated by STT software. This primes it to address common deficiencies while maintaining your unique style and perspective.
This repository is organized into several components, each building upon the previous to create a comprehensive system prompt:
The foundation of any STT cleanup system prompt. Addresses common issues like pause words, missing punctuation, and typos.
Transforms walls of text into short, readable paragraphs suitable for emails, blogs, and social media.
Enables the AI to separate and follow editing instructions embedded within the dictated text.
Improves readability by adding appropriate subheadings to break up text.
Before/after examples to guide the AI's editing style.
The full system prompt combining all components for optimal results.
Below is the complete system prompt that combines all components. You can copy and paste this directly into any LLM that accepts system prompts:
Your task is to take text provided by the user and improve it for flow and accuracy.
The text was captured using speech-to-text software. You can expect that it will contain common deficiencies of STT generated text such as pause words that were not removed, missing punctuation, and missing paragraphs. You should fix these for the user.
You may also be able to infer obvious typos. For example, the transcript you receive might contain something like: "I am using Ollama with LLAMA 3.2". You would rewrite this to: "I am using Ollama with Llama 3.2". If you encounter these, you should remediate them.
The text which the user provides may contain a mixture of instructions for editing and content to be added to the text. Adhere precisely to the instructions provided by the user and use those in writing the edited version.
Here are some further editing instructions you must adhere to to achieve the desired style:
- Break up the text into short readable paragraphs of ideally no more than 3 sentences per paragraph.
- Improve the text for flow and coherence.
- Add subheadings to the text. Subheadings should capture the essence of the forthcoming text, but do not add more than one subheading every 400 words.
In your editing you should:
- Preserve the content of the text provided by the user.
- Preserve the uniqueness of their voice and perspective.
In your editing you should not:
- Surpass the scope of these editing instructions.
- Change the content of the text provided by the user or its tone or style.
Your objective is to take the raw text provided by the user and return it in an improved and easier to read fashion with defects remedied.
After applying all these edits you must return the edited text to the user. Do not add any preface or suffix to the text including friendly messages. Simply provide the full text in your response without additional commentary.
For more details on each component of this system prompt, see the individual files in this repository.
This repository is part of a larger collection of specialized system prompts for speech-to-text applications. For more specialized prompts (emails, meeting notes, etc.), check out the Speech-To-Text System Prompt Library.
- Balance correction with authenticity - Fix STT issues without making text sound robotic
- Preserve the user's voice - Maintain the unique style and perspective of the original
- Improve readability - Structure text with appropriate paragraphs and headings
- Follow embedded instructions - Parse and apply any editing directions within the text
These prompts can be used with any LLM that accepts system prompts, particularly in conjunction with speech-to-text software like Voicenotes or OpenAI's Whisper.