start gemini module
capjamesg committed Dec 13, 2023
1 parent 2fcc32b commit 8a0842c
Showing 8 changed files with 162 additions and 136 deletions.
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Roboflow

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
85 changes: 44 additions & 41 deletions README.md
@@ -3,64 +3,67 @@
<a align="center" href="" target="_blank">
<img
width="850"
src="https://media.roboflow.com/open-source/autodistill/autodistill-banner.png?3"
src="https://media.roboflow.com/open-source/autodistill/autodistill-banner.png"
>
</a>
</p>
</div>

# Autodistill Base Model Template
# Autodistill Gemini Module

**⚠️ Note: Before you start building a Base Model, check out our [Available Models](https://docs.autodistill.com/#available-models) directory to see if a model is already being implemented. If your desired model is being implemented, check the [Autodistill](https://github.com/autodistill/autodistill) GitHub Issues for progress. We encourage you to offer support to models you want to see in Autodistill if work is already being done on them.**
This repository contains the code supporting the Gemini base model for use with [Autodistill](https://github.com/autodistill/autodistill).

This repository contains a template for use in creating a Base Model for [Autodistill](https://github.com/autodistill/autodistill).
[Gemini](https://deepmind.google/technologies/gemini/), developed by Google, is a multimodal computer vision model that allows you to ask questions about images. You can use Gemini with Autodistill for image classification.

A Base Model is a large model that you can use for automatically labeling data. Autodistill enables you to connect Base Models to a smaller Target Model. A new model is trained using the Target Model architecture and your labeled data. This model will be smaller and thus more cost effective to run.

Autodistill is an ecosystem of Base and Target Models, with the main [Autodistill](https://github.com/autodistill/autodistill) repository acting as the bridge between the two.

This repository contains a starter template from which you can create a Base Model extension.
> [!NOTE]
> Using this project will incur billing charges for calls to the Gemini API.
> Refer to the [Google Cloud pricing](https://cloud.google.com/pricing/) page for more information and to estimate your expected costs. This package makes one API call per image you want to label.

Read the full [Autodistill documentation](https://autodistill.github.io/autodistill/).
## Steps to Build a Base Model

To build a base model, first rename the `src` directory to the name of the model you want to implement:

```
mkdir autodistill_model_name
```

Use underscores to separate words in the folder name.

Next, open the `model.py` file. This is the file where your model loading and inference code will be stored. If you need to write helper functions for use with your model -- for example, long methods for loading data, processing extensions -- you may opt to create new files to store the helper scripts.
## Installation

In `model.py`, replace the `Model` class name with the name of your model.
To use Gemini with autodistill, you need to install the following dependency:

Next, implement the following functions:

1. `__init__`: Code for loading the model.
2. `predict`: A function that takes in an image name, runs inference, and returns a `supervision` Detections object (object detection) or a `supervision` Classifications object (classification).

Replace the import statement in the `__init__.py` file in your model directory to point to your model. You only need to import the model, such as:

```bash
pip3 install autodistill-gemini
```
from autodistill_clip.clip_model import CLIP
```

Your version should be set in the `__init__.py` file as `0.1.0` before submitting your model for review.

Update the `setup.py` file to use the name of your model where appropriate. Add all of the requisite dependencies to the `install_requires` section.

Your Base Model should feature a README that shows a minimal example of how to use the base model. This should only be a few lines of code. Refer to `README_EXAMPLE.md` for an example of an Autodistill Base Model README. Feel free to copy this example and replace all parts as required.

Your package must be licensed under the same license as the model you are using (i.e. if your model uses an Apache 2.0 license, your Autodistill extension must use the same license). Your license should be in a file called `LICENSE`, stored in the root directory of your Autodistill extension GitHub repository.
## Quickstart

```python
from autodistill.detection import CaptionOntology
from autodistill_gemini import Gemini

# define an ontology to map class names to our Gemini prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = Gemini(
    ontology=CaptionOntology(
        {
            "person": "person",
            "a forklift": "forklift"
        }
    ),
    api_key="api-key",
    gcp_region="us-central1",
    gcp_project="project-name",
)

# predict() in this commit takes the prompt as a required second argument
result = base_model.predict("image.jpg", prompt="What is in this image?")

print(result)

# label a folder of images
base_model.label("./context_images", extension=".jpeg")
```
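
To illustrate the `{caption: class}` format described in the Quickstart comments, the sketch below uses a plain dictionary instead of `CaptionOntology`, so it assumes nothing about that class's internals. `split_ontology` is a hypothetical helper name, not part of this package.

```python
from typing import Dict, List, Tuple


def split_ontology(mapping: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split a {caption: class} mapping into parallel caption and class lists.

    Hypothetical helper for illustration only.
    """
    captions = list(mapping.keys())   # text sent to the base model
    classes = list(mapping.values())  # labels saved in the annotations
    return captions, classes


captions, classes = split_ontology({"person": "person", "a forklift": "forklift"})
print(captions)  # ['person', 'a forklift']
print(classes)   # ['person', 'forklift']
```

The caption can be more descriptive than the class name, as with `"a forklift"` and `"forklift"` here.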

Update your README to note the license applied to your package.
## License

When your Autodistill extension is ready for testing, open an Issue in the main [Autodistill](https://github.com/autodistill/autodistill) repository with a link to a public GitHub repository that contains your code.
This project is licensed under an [MIT license](LICENSE).

An Autodistill maintainer will review your code. If accepted, we will:
## 🏆 Contributing

1. Add your package to the [Autodistill documentation](https://docs.autodistill.com).
2. Package your project for PyPI and publish it as an official `autodistill` extension.
3. Announce your project on social media.
We love your input! Please see the core Autodistill [contributing guide](https://github.com/autodistill/autodistill/blob/main/CONTRIBUTING.md) to get started. Thank you 🙏 to all our contributors!
61 changes: 0 additions & 61 deletions README_EXAMPLE.md

This file was deleted.

3 changes: 0 additions & 3 deletions autodistill_base_model/__init__.py

This file was deleted.

21 changes: 0 additions & 21 deletions autodistill_base_model/model.py

This file was deleted.

3 changes: 3 additions & 0 deletions autodistill_gemini/__init__.py
@@ -0,0 +1,3 @@
from autodistill_gemini.gemini_model import Gemini

__version__ = "0.1.0"
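
The `setup.py` in this commit reads `__version__` from this file with a regular expression. A minimal sketch of that lookup, run against an in-memory copy of the file contents; `read_version` is a hypothetical helper name, and the error handling is an addition not present in `setup.py`.

```python
import re

# Same pattern that setup.py in this commit uses to locate the version string.
VERSION_PATTERN = r'__version__\s*=\s*[\'"]([^\'"]*)[\'"]'


def read_version(source: str) -> str:
    """Return the string assigned to __version__ in `source`.

    Hypothetical helper for illustration only.
    """
    match = re.search(VERSION_PATTERN, source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


init_py = 'from autodistill_gemini.gemini_model import Gemini\n\n__version__ = "0.1.0"\n'
print(read_version(init_py))  # prints 0.1.0
```

Reading the version as text avoids importing the package, and therefore its dependencies, at build time.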
85 changes: 85 additions & 0 deletions autodistill_gemini/gemini_model.py
@@ -0,0 +1,85 @@
import os
from dataclasses import dataclass

import numpy as np
import requests
import supervision as sv
from autodistill.detection import CaptionOntology, DetectionBaseModel

HOME = os.path.expanduser("~")


@dataclass
class Gemini(DetectionBaseModel):
    ontology: CaptionOntology
    api_key: str
    gcp_region: str
    gcp_project: str

    def __init__(
        self, ontology: CaptionOntology, api_key: str, gcp_region: str, gcp_project: str
    ) -> None:
        self.ontology = ontology
        self.api_key = api_key
        self.gcp_region = gcp_region
        self.gcp_project = gcp_project

    def predict(
        self, input: str, prompt: str, confidence: float = 0.5
    ) -> sv.Classifications:
        # Request body: one image reference plus the text prompt.
        payload = {
            "contents": {
                "role": "user",
                "parts": [
                    {
                        "fileData": {
                            "mimeType": "image/png",
                            "fileUri": input,
                        }
                    },
                    {"text": prompt},
                ],
            },
            "safety_settings": {
                "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                "threshold": "BLOCK_LOW_AND_ABOVE",
            },
            "generation_config": {
                "temperature": 0.4,
                "topP": 1.0,
                "topK": 32,
                "maxOutputTokens": 2048,
            },
        }

        response = requests.post(
            f"https://{self.gcp_region}-aiplatform.googleapis.com/v1/projects/{self.gcp_project}/locations/{self.gcp_region}/publishers/google/models/gemini-pro-vision:streamGenerateContent",
            json=payload,
            headers={"Authorization": f"Bearer {self.api_key}"},
        )

        # Response shape this code expects:
        # "candidates": [
        #     {
        #         "content": {
        #             "parts": [
        #                 {
        #                     "text": string
        #                 }
        #             ]
        #         },

        if not response.ok:
            raise Exception(response.text)

        response_body = response.json()

        text_response = response_body["candidates"][0]["content"]["parts"][0]["text"]

        prompts = self.ontology.prompts()

        # A class counts as present when its caption appears in the response text.
        is_in = [caption in text_response for caption in prompts]

        # Class IDs are the positions of the captions in the ontology.
        return sv.Classifications(
            class_id=np.arange(len(prompts)),
            confidence=np.array([1.0 if i else 0.0 for i in is_in]),
        )
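
The last step of `predict` turns free-form model text into per-class scores by substring matching. A minimal standalone sketch of that step, with no API call; `match_prompts` is a hypothetical helper name and the response text is made up for illustration.

```python
from typing import List


def match_prompts(text_response: str, prompts: List[str]) -> List[int]:
    """Return 1 for each prompt that occurs in the response text, else 0.

    Hypothetical helper mirroring the matching loop in predict().
    """
    return [1 if p in text_response else 0 for p in prompts]


scores = match_prompts(
    "The image shows a forklift in a warehouse.",  # made-up response text
    ["person", "a forklift"],
)
print(scores)  # [0, 1]
```

The match is case-sensitive and literal, so a response that says "A forklift" or "forklifts" alone would score 0 for the caption `"a forklift"`.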
19 changes: 9 additions & 10 deletions setup.py
@@ -1,27 +1,26 @@
import re

import setuptools
from setuptools import find_packages
import re

with open("./autodistill_base_model/__init__.py", 'r') as f:
with open("./autodistill_gemini/__init__.py", "r") as f:
content = f.read()
# from https://www.py4u.net/discuss/139845
version = re.search(r'__version__\s*=\s*[\'"]([^\'"]*)[\'"]', content).group(1)

with open("README.md", "r") as fh:
long_description = fh.read()

setuptools.setup(
name="autodistill-base-model",
name="autodistill-gemini",
version=version,
author="",
author_email="",
author="Roboflow",
author_email="[email protected]",
description="Model for use with Autodistill",
long_description=long_description,
long_description_content_type="text/markdown",
url="",
install_requires=[
# list your requires
],
url="https://github.com/autodistill/autodistill-gemini",
install_requires=["autodistill", "supervision", "requests"],
packages=find_packages(exclude=("tests",)),
extras_require={
"dev": ["flake8", "black==22.3.0", "isort", "twine", "pytest", "wheel"],