alexrudall · alexrudall · Aug 10, 2025 · Dec 19, 2024
diff --git a/README.md b/README.md
@@ -28,6 +28,7 @@ Stream text with GPT-4o, transcribe and translate audio with Whisper, or create
       - [Azure](#azure)
       - [Ollama](#ollama)
       - [Groq](#groq)
+      - [Gemini](#gemini)
     - [Counting Tokens](#counting-tokens)
     - [Models](#models)
     - [Chat](#chat)
@@ -277,6 +278,30 @@ client.chat(
 )
 ```
 
+#### Gemini
+
+[Gemini API Chat](https://ai.google.dev/gemini-api/docs/openai) is also broadly compatible with the OpenAI API, and is [currently in beta](https://ai.google.dev/gemini-api/docs/openai#current-limitations). Get an access token from [Google AI Studio](https://aistudio.google.com/app/apikey), then:
+
+```ruby
+client = OpenAI::Client.new(
+  access_token: "gemini_access_token_goes_here",
+  uri_base: "https://generativelanguage.googleapis.com/v1beta/openai/"
+)
+
+client.chat(
+  parameters: {
+    model: "gemini-1.5-flash", # Required.
+    messages: [{ role: "user", content: "Hello!" }], # Required.
+    temperature: 0.7,
+    stream: proc do |chunk, _bytesize|
+      print chunk.dig("choices", 0, "delta", "content")
+    end
+  }
+)
+
+# => Hello there! How can I help you today?
+```
+
 ### Counting Tokens
 
 OpenAI parses prompt text into [tokens](https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them), which are words or portions of words. (These tokens are unrelated to your API access_token.) Counting tokens can help you estimate your [costs](https://openai.com/pricing). It can also help you ensure your prompt text size is within the max-token limits of your model's context window, and choose an appropriate [`max_tokens`](https://platform.openai.com/docs/api-reference/chat/create#chat/create-max_tokens) completion parameter so your response will fit as well.
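The budgeting described in the Counting Tokens paragraph can be sketched roughly as follows. This is a minimal illustration, not the gem's API: `rough_token_estimate` and `max_completion_tokens` are hypothetical helpers, the 4-characters-per-token figure is only the common rule of thumb for English text, and the 8192-token context window is an assumed example value. For accurate counts, a real tokenizer (for example the `tiktoken_ruby` gem) would be needed.

```ruby
# Hypothetical helpers for illustration only; they are not part of ruby-openai.
# Rule of thumb: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4.0

# Estimate the number of tokens in a string, rounding up.
def rough_token_estimate(text)
  return 0 if text.nil? || text.empty?

  (text.length / CHARS_PER_TOKEN).ceil
end

# Estimate how many tokens remain for the completion once the prompt is
# accounted for. Returns 0 if the prompt alone fills the context window.
def max_completion_tokens(prompt, context_window:)
  [context_window - rough_token_estimate(prompt), 0].max
end

prompt = "Hello!"
puts rough_token_estimate(prompt)                         # 6 chars / 4.0, rounded up
puts max_completion_tokens(prompt, context_window: 8192)  # assumed window size
```

The second value could be passed as `max_tokens` in the chat parameters. Because the estimate is approximate, leaving some headroom below it is safer than using it exactly.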