Chat with any file using Gemini

The title says it all.

The latest Gemini Pro interface accepts a sequence of interleaved text and files (aka parts) as input. The raw JSON API is convenient and straightforward: a utility function can be wrapped up in about 20 lines of code.
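To make the idea of interleaved parts concrete, here is a minimal sketch that builds the contents payload from a mix of text and file references. The field names (text, file_data, mime_type, file_uri) follow the request body used in this article; build_contents itself is a hypothetical helper, not part of any SDK.

```python
def build_contents(*items):
    """Build a `contents` payload from strings and (mime_type, gs_uri) tuples.

    Hypothetical helper: a plain string becomes a text part, and a
    (mime_type, gs_uri) tuple becomes a file_data part. Order is preserved.
    """
    parts = []
    for item in items:
        if isinstance(item, str):
            parts.append({"text": item})
        else:
            mime_type, file_uri = item
            parts.append(
                {"file_data": {"mime_type": mime_type, "file_uri": file_uri}}
            )
    return {"role": "USER", "parts": parts}


contents = build_contents(
    "Summarize this page:",
    ("text/html", "gs://test-gemini-files/on-track.html"),
    "Answer in one sentence.",
)
```

The resulting dictionary can be passed as the "contents" field of the JSON body.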

Utility function

import requests

# Change to your own configuration
PROJECT_ID = "test-project-gdg-gemini"
LOCATION = "us-central1"
API_ENDPOINT = f"{LOCATION}-aiplatform.googleapis.com"
MODEL_ID = "gemini-1.0-pro-vision"

def request_gemini(prompt, gs_path):
  # Uses the global access_token; see the Caution section for one way to set it.
  # Streaming variant of the endpoint:
  # url = f'https://{API_ENDPOINT}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:streamGenerateContent'
  url = f'https://{API_ENDPOINT}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:generateContent'
  r = requests.post(
      url, 
      headers={
          "Authorization": f'Bearer {access_token}',
          "Content-Type": "application/json",
      },
      json={
        "contents": {
          "role": "USER",
          "parts": [
            {"text": prompt},
            {"file_data": {
              "mime_type": "text/html",
              "file_uri": gs_path
            }}
          ]
        },
        "generation_config": {
          "temperature": 0.2,
          "top_p": 0.1,
          "top_k": 16,
          "max_output_tokens": 2048,
          "candidate_count": 1,
          "stop_sequences": []
        },
        "safety_settings": {
          "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          # "threshold": "BLOCK_LOW_AND_ABOVE",
          "threshold": "BLOCK_NONE",
        }
      })
  return r.json()

def chat(prompt, gs_path):
  r = request_gemini(prompt, gs_path)
  t = r['candidates'][0]['content']['parts'][0]['text']
  return t

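def transfer_to_gs(url, gs_filename):
  # Download the page, save it locally, then copy it to the bucket
  r = requests.get(url)
  r.raise_for_status()  # stop early if the download failed
  with open(gs_filename, 'wb') as f:
    f.write(r.content)
  # Colab/Jupyter shell shortcut; change the bucket to your own
  !gsutil cp {gs_filename} gs://test-gemini-files/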

Caution:

  • Change the project to yours.
  • The !gsutil cp line is a Colab/Jupyter shortcut for copying files to Google Cloud Storage. In a production environment you may need to replace it with proper Python code.
  • Assign access_token using the auth method that suits your environment. One example for Colab is below.
access_token = !gcloud auth print-access-token
access_token = access_token[0]
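Outside Colab/Jupyter the ! shortcuts are not available. One possible replacement, sketched here under the assumption that the gsutil and gcloud command-line tools are installed and already authenticated, is to call them through subprocess. The function names and the run parameter are hypothetical and only there to keep the sketch easy to test.

```python
import os
import subprocess


def gsutil_cp_command(local_path, bucket):
    """Build the argv list equivalent to `!gsutil cp {local_path} gs://{bucket}/`."""
    return ["gsutil", "cp", local_path, f"gs://{bucket}/"]


def upload_to_gs(local_path, bucket, run=subprocess.run):
    """Copy a local file to the bucket root and return its gs:// URI."""
    # check=True raises CalledProcessError if gsutil exits with an error
    run(gsutil_cp_command(local_path, bucket), check=True)
    return f"gs://{bucket}/{os.path.basename(local_path)}"


def get_access_token(run=subprocess.run):
    """Return the token printed by `gcloud auth print-access-token`."""
    result = run(
        ["gcloud", "auth", "print-access-token"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With these in place, access_token = get_access_token() plays the role of the two Colab lines above, and upload_to_gs('on-track.html', 'test-gemini-files') replaces the !gsutil cp line.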

Usage

url = 'https://hupili.net/article/ultra-notes-on-track/'
transfer_to_gs(url, 'on-track.html')

# Check out the head of the downloaded file
!head -n30 on-track.html

# Chat with the uploaded file
chat('prompt here', 'gs://test-gemini-files/on-track.html')
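The utility function hard-codes mime_type as "text/html", which matches this HTML example. For other files, one option is to guess the type from the file name with the standard library, as in the sketch below. guess_mime_type is a hypothetical helper, and which MIME types the model actually accepts is not covered here.

```python
import mimetypes


def guess_mime_type(gs_path, default="application/octet-stream"):
    """Guess a MIME type from the file name at the end of a gs:// path."""
    # Only the last path segment matters for extension-based guessing
    filename = gs_path.rsplit("/", 1)[-1]
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type or default


print(guess_mime_type("gs://test-gemini-files/on-track.html"))
```

The returned value could then be used in place of the literal "text/html" in the file_data part.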

Below are some sample chat logs:

[Image: gemini-chat-with-html-quick-demo.png]

Notes

  • The file needs to be stored on Google Cloud Storage.
  • Each chat takes about 10-20 seconds to respond.
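The chat helper indexes straight into r['candidates'][0]['content']['parts'][0]['text'], which raises a bare KeyError or IndexError when the response has no usable candidate. A more defensive variant might look like the sketch below. It assumes the response shape used by chat, plus an {"error": {"message": ...}} payload for failed requests; extract_text is a hypothetical name.

```python
def extract_text(response):
    """Return the text of the first candidate, or raise a readable error."""
    if "error" in response:
        # Assumed error shape: {"error": {"message": "..."}}
        err = response["error"]
        message = err.get("message", err) if isinstance(err, dict) else err
        raise RuntimeError(f"API error: {message}")

    candidates = response.get("candidates") or []
    if not candidates:
        raise RuntimeError(f"No candidates returned: {response}")

    # A candidate may come back without content, so default to empty values
    parts = candidates[0].get("content", {}).get("parts", [])
    texts = [part["text"] for part in parts if "text" in part]
    if not texts:
        raise RuntimeError(f"First candidate has no text part: {candidates[0]}")
    return "".join(texts)


sample = {"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]}
print(extract_text(sample))
```

Inside chat, return extract_text(r) would then replace the direct indexing.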
