Generate images using Imagen


The Firebase AI Logic SDKs give you access to the Imagen models via the Imagen API, so you can generate images from a text prompt. With this capability, you can do things like:

  • Generate images from prompts written in natural language
  • Generate images in a wide range of formats and styles
  • Render text in images

Note that Firebase AI Logic doesn't yet support all the features available for Imagen models. Learn more in Supported capabilities and features later on this page.

Jump to code for text-only input

GeminiImagen 模型之间进行选择

The Firebase AI Logic SDKs support image generation using either a Gemini model or an Imagen model. For most use cases, start with Gemini, and then choose Imagen for specialized tasks where image quality is critical.

Note that the Firebase AI Logic SDKs don't yet support image input (like for editing) with Imagen models. So, if you want to work with input images, you can use a Gemini model instead.

Choose Gemini when you want to:

  • Use world knowledge and reasoning to generate contextually relevant images.
  • Seamlessly blend text and images.
  • Embed accurate visuals within long text sequences.
  • Edit images conversationally while maintaining context.

Choose Imagen when you want to:

  • Prioritize image quality, photorealism, artistic detail, or specific styles (for example, impressionism or anime).
  • Explicitly specify the aspect ratio or format of generated images.

Before you begin

Click your Gemini API provider to view provider-specific content and code on this page.

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen API provider, and create an ImagenModel instance.

Models that support this capability

The Gemini Developer API supports image generation only with the latest stable Imagen 3 model, but not the other Imagen models. These Imagen model limitations apply no matter how you access the Gemini Developer API.

  • imagen-3.0-generate-002

Generate images from text-only input

You can ask an Imagen model to generate images by prompting with text. You can generate one image or multiple images.

Generate one image from text-only input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

You can ask an Imagen model to generate a single image by prompting with text.

Make sure to create an ImagenModel instance and call generateImages.

Swift


import FirebaseAI

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())


// Create an `ImagenModel` instance with a model that supports your use case
let model = ai.imagenModel(modelName: "imagen-3.0-generate-002")

// Provide an image generation prompt
let prompt = "An astronaut riding a horse"

// To generate an image, call `generateImages` with the text prompt
let response = try await model.generateImages(prompt: prompt)

// Handle the generated image
guard let image = response.images.first else {
  fatalError("No image in the response.")
}
let uiImage = UIImage(data: image.data)

Kotlin


// Using this SDK to access Imagen models is a Preview release and requires opt-in
@OptIn(PublicPreviewAPI::class)
suspend fun generateImage() {
  // Initialize the Gemini Developer API backend service
  // Create an `ImagenModel` instance with an Imagen model that supports your use case
  val imagenModel = Firebase.ai(backend = GenerativeBackend.googleAI()).imagenModel("imagen-3.0-generate-002")

  // Provide an image generation prompt
  val prompt = "An astronaut riding a horse"

  // To generate an image, call `generateImages` with the text prompt
  val imageResponse = imagenModel.generateImages(prompt)

  // Handle the generated image
  val image = imageResponse.images.first()

  val bitmapImage = image.asBitmap()
}

Java


// Initialize the Gemini Developer API backend service
// Create an `ImagenModel` instance with an Imagen model that supports your use case
ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.googleAI()).imagenModel(
        /* modelName */ "imagen-3.0-generate-002");

ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

// Provide an image generation prompt
String prompt = "An astronaut riding a horse";

// To generate an image, call `generateImages` with the text prompt
Futures.addCallback(model.generateImages(prompt), new FutureCallback<ImagenGenerationResponse<ImagenInlineImage>>() {
    @Override
    public void onSuccess(ImagenGenerationResponse<ImagenInlineImage> result) {
        if (result.getImages().isEmpty()) {
            Log.d("TAG", "No images generated");
            return;
        }
        Bitmap bitmap = result.getImages().get(0).asBitmap();
        // Use the bitmap to display the image in your UI
    }

    @Override
    public void onFailure(Throwable t) {
        // ...
    }
}, Executors.newSingleThreadExecutor());

Web


import { initializeApp } from "firebase/app";
import { getAI, getImagenModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });


// Create an `ImagenModel` instance with an Imagen 3 model that supports your use case
const imagenModel = getImagenModel(ai, { model: "imagen-3.0-generate-002" });

// Provide an image generation prompt
const prompt = "An astronaut riding a horse.";

// To generate an image, call `generateImages` with the text prompt
const response = await imagenModel.generateImages(prompt);

// If fewer images were generated than were requested,
// then `filteredReason` will describe the reason they were filtered out
if (response.filteredReason) {
  console.log(response.filteredReason);
}

if (response.images.length === 0) {
  throw new Error("No images in the response.");
}

const image = response.images[0];

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);


// Initialize the Gemini Developer API backend service
// Create an `ImagenModel` instance with an Imagen model that supports your use case
final model =
  FirebaseAI.googleAI().imagenModel(model: 'imagen-3.0-generate-002');

// Provide an image generation prompt
const prompt = 'An astronaut riding a horse.';

// To generate an image, call `generateImages` with the text prompt
final response = await model.generateImages(prompt);

if (response.images.isNotEmpty) {
  final image = response.images[0];
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

Using Imagen is not yet supported for Unity, but check back soon!

Learn how to choose a model appropriate for your use case and app.

Generate multiple images from text-only input

Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page.

By default, Imagen models generate only one image per request. However, you can ask an Imagen model to generate multiple images per request by providing an ImagenGenerationConfig when creating the ImagenModel instance.

Make sure to create an ImagenModel instance and call generateImages.

Swift


import FirebaseAI

// Initialize the Gemini Developer API backend service
let ai = FirebaseAI.firebaseAI(backend: .googleAI())


// Create an `ImagenModel` instance with a model that supports your use case
let model = ai.imagenModel(
  modelName: "imagen-3.0-generate-002",
  // Configure the model to generate multiple images for each request
  // See: https://firebase.google.com/docs/ai-logic/model-parameters
  generationConfig: ImagenGenerationConfig(numberOfImages: 4)
)

// Provide an image generation prompt
let prompt = "An astronaut riding a horse"

// To generate images, call `generateImages` with the text prompt
let response = try await model.generateImages(prompt: prompt)

// If fewer images were generated than were requested,
// then `filteredReason` will describe the reason they were filtered out
if let filteredReason = response.filteredReason {
  print(filteredReason)
}

// Handle the generated images
let uiImages = response.images.compactMap { UIImage(data: $0.data) }

Kotlin


// Using this SDK to access Imagen models is a Preview release and requires opt-in
@OptIn(PublicPreviewAPI::class)
suspend fun generateImage() {
  // Initialize the Gemini Developer API backend service
  // Create an `ImagenModel` instance with an Imagen model that supports your use case
  val imagenModel = Firebase.ai(backend = GenerativeBackend.googleAI()).imagenModel(
      modelName = "imagen-3.0-generate-002",
      // Configure the model to generate multiple images for each request
      // See: https://firebase.google.com/docs/ai-logic/model-parameters
      generationConfig = ImagenGenerationConfig(numberOfImages = 4)
  )

  // Provide an image generation prompt
  val prompt = "An astronaut riding a horse"

  // To generate images, call `generateImages` with the text prompt
  val imageResponse = imagenModel.generateImages(prompt)

  // If fewer images were generated than were requested,
  // then `filteredReason` will describe the reason they were filtered out
  if (imageResponse.filteredReason != null) {
    Log.d(TAG, "FilteredReason: ${imageResponse.filteredReason}")
  }

  for (image in imageResponse.images) {
    val bitmap = image.asBitmap()
    // Use the bitmap to display the image in your UI
  }
}

Java


// Configure the model to generate multiple images for each request
// See: https://firebase.google.com/docs/ai-logic/model-parameters
ImagenGenerationConfig imagenGenerationConfig = new ImagenGenerationConfig.Builder()
        .setNumberOfImages(4)
        .build();

// Initialize the Gemini Developer API backend service
// Create an `ImagenModel` instance with an Imagen model that supports your use case
ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.googleAI()).imagenModel(
        /* modelName */ "imagen-3.0-generate-002",
        /* imageGenerationConfig */ imagenGenerationConfig);

ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

// Provide an image generation prompt
String prompt = "An astronaut riding a horse";

// To generate images, call `generateImages` with the text prompt
Futures.addCallback(model.generateImages(prompt), new FutureCallback<ImagenGenerationResponse<ImagenInlineImage>>() {
    @Override
    public void onSuccess(ImagenGenerationResponse<ImagenInlineImage> result) {
        // If fewer images were generated than were requested,
        // then `filteredReason` will describe the reason they were filtered out
        if (result.getFilteredReason() != null){
            Log.d("TAG", "FilteredReason: " + result.getFilteredReason());
        }

        // Handle the generated images
        List<ImagenInlineImage> images = result.getImages();
        for (ImagenInlineImage image : images) {
            Bitmap bitmap = image.asBitmap();
            // Use the bitmap to display the image in your UI
        }
    }

    @Override
    public void onFailure(Throwable t) {
        // ...
    }
}, Executors.newSingleThreadExecutor());

Web


import { initializeApp } from "firebase/app";
import { getAI, getImagenModel, GoogleAIBackend } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });


// Create an `ImagenModel` instance with an Imagen 3 model that supports your use case
const imagenModel = getImagenModel(
  ai,
  {
    model: "imagen-3.0-generate-002",
    // Configure the model to generate multiple images for each request
    // See: https://firebase.google.com/docs/ai-logic/model-parameters
    generationConfig: {
      numberOfImages: 4
    }
  }
);

// Provide an image generation prompt
const prompt = "An astronaut riding a horse.";

// To generate images, call `generateImages` with the text prompt
const response = await imagenModel.generateImages(prompt);

// If fewer images were generated than were requested,
// then `filteredReason` will describe the reason they were filtered out
if (response.filteredReason) {
  console.log(response.filteredReason);
}

if (response.images.length === 0) {
  throw new Error("No images in the response.");
}

const images = response.images;

Dart

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);


// Initialize the Gemini Developer API backend service
// Create an `ImagenModel` instance with an Imagen model that supports your use case
final model =
  FirebaseAI.googleAI().imagenModel(
    model: 'imagen-3.0-generate-002',
    // Configure the model to generate multiple images for each request
    // See: https://firebase.google.com/docs/ai-logic/model-parameters
    generationConfig: ImagenGenerationConfig(numberOfImages: 4),
);

// Provide an image generation prompt
const prompt = 'An astronaut riding a horse.';

// To generate images, call `generateImages` with the text prompt
final response = await model.generateImages(prompt);

// If fewer images were generated than were requested,
// then `filteredReason` will describe the reason they were filtered out
if (response.filteredReason != null) {
  print(response.filteredReason);
}

if (response.images.isNotEmpty) {
  for (final image in response.images) {
    // Process the image
  }
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

Unity

Using Imagen is not yet supported for Unity, but check back soon!

Learn how to choose a model appropriate for your use case and app.



Supported features and requirements

Imagen models offer many features related to image generation. This section describes what's supported when using the models with Firebase AI Logic.

Supported capabilities and features

Firebase AI Logic supports these features of Imagen models.

  • Generating people and faces (given that your Firebase project has approval from Google Cloud)

  • Generating text within generated images

  • Adding a watermark to generated images

  • Configuring image generation parameters, like the number of generated images, aspect ratio, and watermarking

  • Configuring safety settings
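As an illustrative sketch of the configuration parameters listed above, the plain object below uses the field names from the Web SDK samples on this page (numberOfImages appears in the code samples; aspectRatio and addWatermark are assumptions to verify against the current SDK reference):

```javascript
// Hypothetical ImagenGenerationConfig sketch for the Web SDK.
// Field names other than `numberOfImages` are assumptions; check the
// Firebase AI Logic model-parameters documentation before relying on them.
const generationConfig = {
  numberOfImages: 2,   // up to 4 images per request
  aspectRatio: "16:9", // one of "1:1", "3:4", "4:3", "9:16", "16:9"
  addWatermark: true,  // add a non-visible watermark to generated images
};

// The config would be passed when creating the model instance, for example:
// getImagenModel(ai, { model: "imagen-3.0-generate-002", generationConfig })
console.log(generationConfig.numberOfImages);
```

This mirrors the multi-image Web sample later on this page, which passes a `generationConfig` object to `getImagenModel`.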

Firebase AI Logic does not support these advanced features of Imagen models.

Note that using most of these features requires being on an approved list of users, even when using Imagen models server-side.

  • Image editing or manipulation features, which includes upscaling images

  • Including images in the request to the model (like for few-shot learning)

  • Verifying digital watermarks using the SDKs
    If you want to verify that an image has a watermark, you can upload the image into Vertex AI Studio using its Media tab.

  • Generating "live images" from text (MP4 generation)

  • Generating images using a predefined style

  • Setting the language of the input text

  • Enabling includeSafetyAttributes, which means that safetyAttributes.categories and safetyAttributes.scores can't be returned

  • Disabling prompt enhancement (the enhancePrompt parameter), which means that an LLM-based prompt rewriting tool will always automatically add more detail to the provided prompt to deliver higher quality images that better reflect it

  • Writing a generated image directly into Google Cloud Storage as part of the response from the model (the storageUri parameter). Instead, images are always returned as base64-encoded image bytes in the response.
    If you want to upload generated images to Cloud Storage, you can use Cloud Storage for Firebase.

Specifications and limitations

Property (per request)
Maximum input tokens: 480 tokens
Maximum output images: 4 images
Supported output image resolutions (pixels):
  • 1024x1024 pixels (1:1 aspect ratio)
  • 896x1280 (3:4 aspect ratio)
  • 1280x896 (4:3 aspect ratio)
  • 768x1408 (9:16 aspect ratio)
  • 1408x768 (16:9 aspect ratio)
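The resolution table above can be expressed as a simple lookup, which is handy when sizing UI placeholders before a response arrives. The helper name `expectedSize` is hypothetical; the width/height values come straight from the table:

```javascript
// Supported output resolutions per aspect ratio (from the table above).
const RESOLUTIONS = {
  "1:1": { width: 1024, height: 1024 },
  "3:4": { width: 896, height: 1280 },
  "4:3": { width: 1280, height: 896 },
  "9:16": { width: 768, height: 1408 },
  "16:9": { width: 1408, height: 768 },
};

// Look up the output size to expect for a requested aspect ratio.
function expectedSize(aspectRatio) {
  const size = RESOLUTIONS[aspectRatio];
  if (!size) throw new Error(`Unsupported aspect ratio: ${aspectRatio}`);
  return size;
}

console.log(expectedSize("16:9")); // { width: 1408, height: 768 }
```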



What else can you do?

Learn how to control content generation.

Learn more about the supported models.

Learn about the models available for various use cases and their quotas and pricing.

