Trying Out the Newly Released Gemini Pro API

Let's try out the Gemini Pro API.

!pip install -q -U google-generativeai
import pathlib
import textwrap

import google.generativeai as genai

# Used to securely store your API key
from google.colab import userdata

from IPython.display import display
from IPython.display import Markdown

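def to_markdown(text):
  # Turn bullet characters into Markdown list items, then block-quote every line
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

# Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)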

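Outside Colab, `google.colab.userdata` is not importable, so the key has to come from somewhere else. Below is a minimal sketch of one way to handle both cases. `resolve_api_key` is a hypothetical helper of my own, not part of the SDK, and it checks the environment variable first, which is the reverse of the order used in the notebook.

```python
import os


def resolve_api_key(name="GOOGLE_API_KEY", env=None):
    """Return the API key from the environment, falling back to Colab secrets.

    `env` is injectable for testing; by default os.environ is used.
    """
    env = os.environ if env is None else env
    key = env.get(name)
    if key:
        return key
    try:
        # Only importable when running inside Google Colab.
        from google.colab import userdata
    except ImportError:
        raise RuntimeError(
            f"{name} is not set and Colab secrets are unavailable"
        ) from None
    return userdata.get(name)
```

The result could then be passed to `genai.configure(api_key=resolve_api_key())`.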

# List the models that support generateContent
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

Trying Gemini Pro

model = genai.GenerativeModel('gemini-pro')

%%time
response = model.generate_content("What is the meaning of life?")
CPU times: user 142 ms, sys: 18.9 ms, total: 161 ms
Wall time: 9.76 s

to_markdown(response.text)

The meaning of life is a profound and multifaceted question that has been pondered by philosophers, theologians, scientists, and artists throughout history. While there is no single, universally agreed-upon answer, various perspectives and philosophical approaches have emerged to address this inquiry.

  1. Existentialism:

    • According to existentialist philosophers like Jean-Paul Sartre and Albert Camus, life has no inherent meaning or purpose. Instead, individuals must create their own meaning through their actions, choices, and commitments.
  2. Theological Perspectives:

    • Religious traditions often provide theological explanations for the meaning of life. For example, in Christianity, the purpose of life is to glorify God and live in accordance with his commands.
  3. Hedonism:

    • Hedonists believe that the meaning of life is to pursue pleasure and avoid pain. For them, the pursuit of happiness and the maximization of enjoyable experiences constitute the ultimate purpose of life.
  4. Eudaimonia (Well-Being):

    • In ancient Greek philosophy, eudaimonia referred to a state of flourishing, happiness, and well-being. This perspective emphasizes the pursuit of a meaningful life through the cultivation of virtues, the development of one’s potential, and the pursuit of excellence.
  5. Utilitarianism:

    • According to utilitarianism, the meaning of life lies in maximizing the overall happiness or well-being of all sentient beings. It emphasizes the importance of ethical decision-making that promotes the greatest good for the greatest number.
  6. Absolutism:

    • Absolutists believe that there is one objective and universal meaning of life that applies to all individuals, often based on moral or religious principles. This perspective asserts that the purpose of life is to fulfill a predetermined destiny or follow a specific set of rules or beliefs.
  7. Naturalism:

    • Naturalists believe that the meaning of life is rooted in the natural world and human biology. They argue that life’s purpose is to survive, reproduce, and perpetuate one’s genes.
  8. Nihilism:

    • Nihilism posits that life lacks inherent meaning or purpose and that all values are ultimately meaningless. Nihilists may believe that the universe is fundamentally absurd and that any attempt to find meaning is futile.
  9. Personal Perspectives:

    • Many individuals find personal meaning in their relationships, work, hobbies, creative pursuits, or spiritual practices. The meaning of life can be highly subjective and unique to each person, often influenced by their experiences, values, and beliefs.

Ultimately, the meaning of life is a personal and ongoing inquiry, with no single answer that fits everyone. The pursuit of meaning itself can be a fulfilling and meaningful endeavor, leading to personal growth, fulfillment, and a sense of purpose.

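To see what the `to_markdown` helper does to a response like the one above, here is a small sketch that applies the same two steps to a plain string and skips the IPython `Markdown` wrapper. The name `to_quoted_markdown` and the sample text are made up for illustration.

```python
import textwrap


def to_quoted_markdown(text: str) -> str:
    """Apply the same transformation as to_markdown, but return a plain string."""
    # Bullet characters become indented Markdown list items.
    text = text.replace('•', '  *')
    # predicate=lambda _: True prefixes every line, including blank ones.
    return textwrap.indent(text, '> ', predicate=lambda _: True)


sample = "Perspectives:\n\n• Existentialism\n• Hedonism"
quoted = to_quoted_markdown(sample)
print(quoted)
```

Every line, blank ones included, ends up prefixed with `> `, so the whole response renders as a single block quote.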
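# Prompt (Japanese): "Pretend to be Chopper from One Piece and give a fairly long self-introduction."
response = model.generate_content("ワンピースのチョッパーになりきって、長めの自己紹介をしてください。", stream=True)
for chunk in response:
  # Print each chunk as soon as it arrives
  print(chunk.text)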

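With `stream=True`, the response is consumed chunk by chunk. If you also want the full text at the end, the chunks can be accumulated while they are displayed. Here is a minimal sketch, assuming each chunk exposes a `.text` attribute. `collect_stream` is a hypothetical helper, and the fake chunks stand in for a real streamed response so the example runs without an API key.

```python
from types import SimpleNamespace


def collect_stream(chunks, on_chunk=print):
    """Call on_chunk for each chunk's text as it arrives and return the joined text."""
    parts = []
    for chunk in chunks:
        on_chunk(chunk.text)
        parts.append(chunk.text)
    return "".join(parts)


# Stand-in for a streamed response (made-up content, not real model output).
fake_chunks = [SimpleNamespace(text="Hello, "), SimpleNamespace(text="I'm Chopper!")]
full_text = collect_stream(fake_chunks, on_chunk=lambda text: None)
```

With a real response this would be called as `collect_stream(response)`.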

Trying Gemini Pro Vision


model = genai.GenerativeModel('gemini-pro-vision')

Let's load an arbitrary image. I'll use one from Microsoft's Medprompt article, which I happened to have open in another tab.

!curl -o image.jpg https://www.microsoft.com/en-us/research/uploads/prod/2023/11/joint_medprompt_v1.png
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2091k  100 2091k    0     0  4459k      0 --:--:-- --:--:-- --:--:-- 4469k
from PIL import Image
img = Image.open('image.jpg')

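Note that the URL ends in `.png` while the file is saved as `image.jpg`. Pillow identifies the format from the file's contents rather than its extension, so `Image.open` still works. If you want to check what was actually downloaded, the leading bytes are enough. Here is a stdlib-only sketch; the function names are my own and it only covers PNG and JPEG.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG header
JPEG_SIGNATURE = b"\xff\xd8\xff"      # start-of-image marker plus next marker byte


def sniff_image_format(header: bytes) -> str:
    """Guess the image format from the first bytes of a file."""
    if header.startswith(PNG_SIGNATURE):
        return "png"
    if header.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of the file to identify its format."""
    with open(path, "rb") as f:
        return sniff_image_format(f.read(8))
```

Calling `sniff_file('image.jpg')` on the downloaded file should report `png` if the server really returned a PNG.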

# Sending only the image returns a description in English
response = model.generate_content(img)
to_markdown(response.text)


The left graph shows the performance of different fine-tuning methods on the MedQA dataset. The x-axis is the date of the experiment, and the y-axis is the accuracy on the MedQA test set. The lines show the performance of different methods. The legend at the top right corner of the graph shows the names of the methods. The right graph shows the performance of different models on the MedMCQA 2023 leaderboard. The models are listed on the x-axis, and the y-axis shows the accuracy on the MedMCQA test set. The points on the graph show the performance of each model. The legend at the bottom right corner of the graph shows the names of the models.

# Sending a prompt together with the image returns a description that follows the prompt
prompt = "Write a summary of what you can learning from the image. Be detailed as much as possible. Also mention which method was the best."
response = model.generate_content([prompt, img])
to_markdown(response.text)

The image shows a comparison of different fine-tuning methods for the MedQA task. The methods are:

  • No fine-tuning
  • Intensive fine-tuning
  • Simple Prompt

The best method is intensive fine-tuning, which achieves an accuracy of 90.2% on the MedQA test set. This method is followed by simple prompt tuning, which achieves an accuracy of 81.7%, and no fine-tuning, which achieves an accuracy of 67.2%.

The image also shows that the best method for each individual dataset varies. For example, on the MedMCQA Dev dataset, intensive fine-tuning achieves the best accuracy, while on the PubMedBERT dataset, simple prompt tuning achieves the best accuracy. This suggests that the best fine-tuning method for a particular dataset depends on the specific characteristics of the dataset.


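# Prompt (Japanese): "Summarize in detail, in Japanese, the insights that can be drawn from the image."
prompt = "画像から得られる知見を日本語で詳細にまとめてください。"
response = model.generate_content([prompt, img])
to_markdown(response.text)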


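(The response came back in Japanese; translated into English here.)

The models fall broadly into three groups. The first uses pre-trained language models as-is; this group includes BERT, RoBERTa, and XLNet. The second fine-tunes pre-trained language models; this group includes BioBERT, ClinicalBERT, and PubMedBERT. The third fine-tunes pre-trained language models on large datasets; this group includes Med-BERT, SciBERT, and BiomedicalBERT.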

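The vision calls above differ only in whether a text prompt is placed in front of the image. Here is a rough sketch that wraps that pattern so the same image can be queried with several prompts. `build_contents`, `ask_about_image`, and `EchoModel` are hypothetical names; `EchoModel` is just a stand-in for `genai.GenerativeModel` so the example runs without an API key.

```python
from types import SimpleNamespace


def build_contents(image, prompt=None):
    """Return the generate_content argument: the bare image, or [prompt, image]."""
    return image if prompt is None else [prompt, image]


def ask_about_image(model, image, prompts):
    """Send the same image once per prompt and collect the response texts."""
    return {
        prompt: model.generate_content(build_contents(image, prompt)).text
        for prompt in prompts
    }


class EchoModel:
    """Stand-in model that echoes the prompt instead of calling the API."""

    def generate_content(self, contents):
        prompt = contents[0] if isinstance(contents, list) else "(image only)"
        return SimpleNamespace(text=f"echo: {prompt}")


answers = ask_about_image(EchoModel(), "IMG", ["Describe it.", "Which method is best?"])
```

With the real model, `ask_about_image(model, img, [...])` would issue one request per prompt, which makes it easy to compare English and Japanese prompts side by side.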






The Colab used for this post: https://colab.research.google.com/drive/1A240Twl0lA0Idpl6D2U6yzR3XeweaD_o?usp=sharing

