Mellea is a library for writing generative programs. Generative programming replaces flaky agents and brittle prompts with structured, maintainable, robust, and efficient AI workflows.
- A standard library of opinionated prompting patterns.
- Sampling strategies for inference-time scaling.
- Clean integration between verifiers and samplers.
- Batteries-included library of verifiers.
- Support for efficient checking of specialized requirements using activated LoRAs.
- Train your own verifiers on proprietary classifier data.
- Compatible with many inference services and model families. Control cost and quality by easily lifting and shifting workloads between inference providers, model families, and model sizes (see the sketch after this list).
- Easily integrate the power of LLMs into legacy code-bases (mify).
- Sketch applications by writing specifications and letting mellea fill in the details (generative slots).
- Get started by decomposing your large unwieldy prompts into structured and maintainable mellea problems.
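For example, here is a minimal sketch (not taken from the repository) of what lifting and shifting looks like: the generative program is written once against a session, and shifting to a different provider, model family, or model size only changes the backend configuration. The summarize helper, the instruction text, and the token limit below are illustrative assumptions; the classes and constants are the same ones used in the examples later in this README.
# A minimal lift-and-shift sketch; summarize() and its instruction are illustrative.
from mellea import MelleaSession
from mellea.backends import model_ids
from mellea.backends.ollama import OllamaModelBackend
from mellea.backends.types import ModelOption

def summarize(m: MelleaSession) -> str:
    # The generative program only sees the session, never the backend details.
    return str(m.instruct("Summarize, in three bullet points, why structured prompts are easier to maintain."))

# To shift the workload to another provider, model family, or model size,
# only this backend construction changes; summarize() stays the same.
m = MelleaSession(
    backend=OllamaModelBackend(
        model_id=model_ids.MISTRALAI_MISTRAL_0_3_7B,
        model_options={ModelOption.MAX_NEW_TOKENS: 200},
    )
)
print(summarize(m))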
You can get started with a local install, or by using Colab notebooks.
Install with uv:
uv pip install mellea
Install with pip:
pip install mellea
Note
mellea comes with some additional packages as defined in our pyproject.toml. If you would like to install all the extra optional dependencies, run the following commands:
uv pip install mellea[hf] # for Huggingface extras and Alora capabilities.
uv pip install mellea[watsonx] # for watsonx backend
uv pip install mellea[docling] # for docling
uv pip install mellea[all] # for all the optional dependencies
You can also install all the optional dependencies with uv sync --all-extras
Note
If running on an Intel Mac, you may get errors related to torch/torchvision versions. Conda maintains updated versions of these packages. You will need to create a conda environment and run conda install 'torchvision>=0.22.0' (this should also install pytorch and torchvision-extra). Then, you should be able to run uv pip install mellea. To run the examples, you will need to use python <filename> inside the conda environment instead of uv run --with mellea <filename>.
Note
If you are using python >= 3.13, you may encounter an issue where outlines cannot be installed due to Rust compiler issues (error: can't find Rust compiler). You can either downgrade to python 3.12 or install the Rust compiler to build the wheel for outlines locally.
To run a simple LLM request locally (using Ollama with a Granite model), start with the following code:
# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/tutorial/example.py
import mellea
m = mellea.start_session()
print(m.chat("What is the etymology of mellea?").content)
Note
Before running the example, you will need to download and install Ollama. Mellea can work with many different types of backends, but everything in this tutorial will "just work" on a MacBook running IBM's Granite 3.3 8B model.
Then run it:
uv run --with mellea docs/examples/tutorial/example.py
Fork and clone the repository:
git clone ssh://git@github.com/<my-username>/mellea.git && cd mellea/
Set up a virtual environment:
uv venv .venv && source .venv/bin/activate
Use uv pip to install from source with the editable flag:
uv pip install -e .[all]
If you are planning to contribute to the repo, you should also install all of the development requirements:
uv pip install .[all] --group dev --group notebook --group docs
or
uv sync --all-extras --all-groups
Ensure that you install the precommit hooks:
pre-commit install
Mellea supports validation of generation results through an instruct-validate-repair pattern. Below, the request "Write an email..." is constrained by two requirements: "be formal" and "Use 'Dear interns' as greeting." Using a simple rejection sampling strategy, the request is sent to the model up to three times (loop_budget) and each output is checked against the requirements using (in this case) LLM-as-a-judge.
# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/instruct_validate_repair/101_email_with_validate.py
from mellea import MelleaSession
from mellea.backends.types import ModelOption
from mellea.backends.ollama import OllamaModelBackend
from mellea.backends import model_ids
from mellea.stdlib.sampling import RejectionSamplingStrategy
# create a session with Mistral running on Ollama
m = MelleaSession(
    backend=OllamaModelBackend(
        model_id=model_ids.MISTRALAI_MISTRAL_0_3_7B,
        model_options={ModelOption.MAX_NEW_TOKENS: 300},
    )
)
# run an instruction with requirements
email_v1 = m.instruct(
    "Write an email to invite all interns to the office party.",
    requirements=["be formal", "Use 'Dear interns' as greeting."],
    strategy=RejectionSamplingStrategy(loop_budget=3),
)
# print result
print(f"***** email ****\n{str(email_v1)}\n*******")
Generative slots allow you to define functions without implementing them.
The @generative decorator marks a function as one that should be interpreted by querying an LLM.
The example below demonstrates how an LLM's sentiment classification
capability can be wrapped up as a function using Mellea's generative slots and
a local LLM.
# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/tutorial/sentiment_classifier.py#L1-L13
from typing import Literal
from mellea import generative, start_session
@generative
def classify_sentiment(text: str) -> Literal["positive", "negative"]:
    """Classify the sentiment of the input text as 'positive' or 'negative'."""

if __name__ == "__main__":
    m = start_session()
    sentiment = classify_sentiment(m, text="I love this!")
    print("Output sentiment is:", sentiment)
See the tutorial for more details.
Please refer to the Contributor Guide for detailed instructions on how to contribute.
Mellea was started by IBM Research in Cambridge, MA.