Analyze image files using the Gemini API

You can ask a Gemini model to analyze image files that you provide either inline (base64-encoded) or via URL. When you use Vertex AI in Firebase, you can make this request directly from your app.

With this capability, you can do things like:

Caption images or answer questions about images
Write a short story or a poem about an image
Detect objects in an image and return bounding box coordinates for them
Label or categorize a set of images by sentiment, style, or other characteristics
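For the bounding box use case, the coordinates usually need to be mapped back onto the image before they can be drawn. The sketch below assumes that each box arrives as `[yMin, xMin, yMax, xMax]` on a 0 to 1000 scale; that format is an assumption not stated on this page, and `toPixelBox` is a hypothetical helper, so check the actual model output before relying on it.

```javascript
// Assumption: box = [yMin, xMin, yMax, xMax], each value on a 0-1000 scale.
// toPixelBox is a hypothetical helper, not part of any SDK.
function toPixelBox(box, imageWidth, imageHeight) {
  const [yMin, xMin, yMax, xMax] = box;
  const scaleX = imageWidth / 1000;
  const scaleY = imageHeight / 1000;
  return {
    left: Math.round(xMin * scaleX),
    top: Math.round(yMin * scaleY),
    width: Math.round((xMax - xMin) * scaleX),
    height: Math.round((yMax - yMin) * scaleY),
  };
}

// Map a normalized box onto a 2000 x 1000 pixel image.
const pixelBox = toPixelBox([100, 200, 500, 600], 2000, 1000);
console.log(pixelBox);
```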


See other guides for additional options for working with images:
Generate structured output, Multi-turn chat, Generate images

Before you begin

If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the Vertex AI service, and create a GenerativeModel instance.

To test and iterate on your prompts, and even get a generated code snippet, we recommend using Vertex AI Studio.

Need a sample image file?

You can use this publicly available file with a MIME type of image/jpeg (view or download the file): https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg

Send image files (base64-encoded) and receive text

Make sure that you've completed the Before you begin section of this guide before trying this sample.

You can ask a Gemini model to generate text by prompting with text and images, providing each input file's mimeType and the file itself. Find the requirements and recommendations for input files later on this page.
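As a rough illustration of "the mimeType and the file itself", the sketch below builds the two part shapes used with the Web SDK: an inline part that embeds base64 data, and a part that points at a file by URL. The helper names (`toBase64`, `inlineImagePart`, `urlImagePart`) are hypothetical, and whether a given URL is accepted depends on the API's file requirements.

```javascript
// Encode raw bytes as base64 (btoa is available in browsers and in Node.js 16+).
function toBase64(bytes) {
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Inline part: the image bytes travel inside the request.
function inlineImagePart(bytes, mimeType) {
  return { inlineData: { data: toBase64(bytes), mimeType } };
}

// URL part: the request only carries a reference to the file.
function urlImagePart(fileUri, mimeType) {
  return { fileData: { mimeType, fileUri } };
}

// Two bytes stand in for real image data here.
const inlinePart = inlineImagePart(new Uint8Array([72, 105]), "image/png");
const urlPart = urlImagePart(
  "https://storage.googleapis.com/cloud-samples-data/generative-ai/image/scones.jpg",
  "image/jpeg"
);
console.log(inlinePart, urlPart);
```

Either part can then be placed in the array passed to `generateContent`, next to the text prompt.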

Swift

You can call generateContent() to generate text from multimodal input of text and images.

Single file input

import FirebaseVertexAI
import UIKit

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image, prompt)
print(response.text ?? "No text in response.")

Multiple file input

import FirebaseVertexAI
import UIKit

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To generate text output, call generateContent and pass in the prompt
let response = try await model.generateContent(image1, image2, prompt)
print(response.text ?? "No text in response.")

Note: The example above uses a convenient way of handling platform-native image types (UIImage, NSImage, CIImage, and CGImage) in multimodal prompts. These image types (regardless of their original format) are converted client-side to JPEG at 80% quality before being sent to the server. This means that you don't need to specify a MIME type when you provide images inline as in the example above.

For more control over image formats and conversions, you can provide images as InlineDataPart and supply a specific MIME type. For example: InlineDataPart(data: Data(/* PNG Data */), mimeType: "image/png").

Kotlin

You can call generateContent() to generate text from multimodal input of text and images.

For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope.

Single file input

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Multiple file input

For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope.

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
  image(bitmap1)
  image(bitmap2)
  text("What is different between these pictures?")
}

// To generate text output, call generateContent with the prompt
val response = generativeModel.generateContent(prompt)
print(response.text)

Note: The example above uses a convenient way of handling platform-native image types (Bitmap) in multimodal prompts. These image types (regardless of their original format) are converted client-side to JPEG at 80% quality before being sent to the server. This means that you don't need to specify a MIME type when you provide images inline as in the example above.

For more control over image formats and conversions, you can provide images as InlineDataPart and supply a specific MIME type. For example: content { inlineData(/* PNG as byte array */, "image/png") }.

Java

You can call generateContent() to generate text from multimodal input of text and images.

For Java, the methods in this SDK return a ListenableFuture.

Single file input

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
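GenerativeModelFutures model = GenerativeModelFutures.from(gm);

// Executor on which the callbacks below run (an assumption; use one that suits your app)
Executor executor = Executors.newSingleThreadExecutor();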

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content content = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Multiple file input

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
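GenerativeModelFutures model = GenerativeModelFutures.from(gm);

// Executor on which the callbacks below run (an assumption; use one that suits your app)
Executor executor = Executors.newSingleThreadExecutor();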

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To generate text output, call generateContent with the prompt
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        String resultText = result.getText();
        System.out.println(resultText);
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);

Web

You can call generateContent() to generate text from multimodal input of text and images.

Single file input

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What's in this picture?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To generate text output, call generateContent with the text and image
  const result = await model.generateContent([prompt, imagePart]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Multiple file input

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  // Prepare images for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To generate text output, call generateContent with the text and images
  const result = await model.generateContent([prompt, ...imageParts]);

  const response = result.response;
  const text = response.text();
  console.log(text);
}

run();

Dart

You can call generateContent() to generate text from multimodal input of text and images.

Single file input

import 'dart:io';

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To generate text output, call generateContent with the text and image
final response = await model.generateContent([
  Content.multi([prompt,imagePart])
]);
print(response.text);

Multiple file input

import 'dart:io';

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To generate text output, call generateContent with the text and images
final response = await model.generateContent([
  Content.multi([prompt, ...imageParts])
]);
print(response.text);

Learn how to choose a model and optionally a location appropriate for your use case and app.

Stream the response

Make sure that you've completed the Before you begin section of this guide before trying this sample.

You can achieve faster interactions by not waiting for the entire result from the model generation, and instead use streaming to handle partial results. To stream the response, call generateContentStream.
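The idea of handling partial results can be sketched without any SDK. Below, `fakeStream` is a hypothetical stand-in for the `result.stream` object used in the Web samples on this page (chunks that expose `text()`), and `collectStream` is an illustrative helper that reports each chunk as it arrives while also keeping the full text.

```javascript
// Hypothetical stand-in for `result.stream`: yields chunk-like objects
// that expose text(), mirroring the Web samples on this page.
async function* fakeStream(pieces) {
  for (const piece of pieces) {
    yield { text: () => piece };
  }
}

// Handle each chunk as soon as it arrives and accumulate the full text.
async function collectStream(stream, onChunk) {
  let fullText = "";
  for await (const chunk of stream) {
    const chunkText = chunk.text();
    onChunk(chunkText);
    fullText += chunkText;
  }
  return fullText;
}

// collectStream returns a promise that resolves to the complete text.
const streamed = collectStream(
  fakeStream(["Scones ", "with ", "jam"]),
  (chunkText) => console.log(chunkText)
);
```

In an app, the `onChunk` callback would typically update the UI, which is where the faster perceived response comes from.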

View example: Stream generated text from image files

Swift

You can call generateContentStream() to stream generated text from multimodal input of text and images.

Single file input

import FirebaseVertexAI
import UIKit

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image = UIImage(systemName: "bicycle") else { fatalError() }

// Provide a text prompt to include with the image
let prompt = "What's in this picture?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Multiple file input

import FirebaseVertexAI
import UIKit

// Initialize the Vertex AI service
let vertex = VertexAI.vertexAI()

// Create a `GenerativeModel` instance with a model that supports your use case
let model = vertex.generativeModel(modelName: "gemini-2.0-flash")

guard let image1 = UIImage(systemName: "car") else { fatalError() }
guard let image2 = UIImage(systemName: "car.2") else { fatalError() }

// Provide a text prompt to include with the images
let prompt = "What's different between these pictures?"

// To stream generated text output, call generateContentStream and pass in the prompt
let contentStream = try model.generateContentStream(image1, image2, prompt)
for try await chunk in contentStream {
  if let text = chunk.text {
    print(text)
  }
}

Kotlin

You can call generateContentStream() to stream generated text from multimodal input of text and images.

For Kotlin, the methods in this SDK are suspend functions and need to be called from a Coroutine scope.

Single file input

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)

// Provide a prompt that includes the image specified above and text
val prompt = content {
  image(bitmap)
  text("What developer tool is this mascot from?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
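  // chunk.text is nullable; fall back to an empty string so "null" is never appended
  val text = chunk.text.orEmpty()
  print(text)
  fullResponse += text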
}

Multiple file input

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
val generativeModel = Firebase.vertexAI.generativeModel("gemini-2.0-flash")

// Loads an image from the app/res/drawable/ directory
val bitmap1: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky)
val bitmap2: Bitmap = BitmapFactory.decodeResource(resources, R.drawable.sparky_eats_pizza)

// Provide a prompt that includes the images specified above and text
val prompt = content {
    image(bitmap1)
    image(bitmap2)
    text("What's different between these pictures?")
}

// To stream generated text output, call generateContentStream with the prompt
var fullResponse = ""
generativeModel.generateContentStream(prompt).collect { chunk ->
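  // chunk.text is nullable; fall back to an empty string so "null" is never appended
  val text = chunk.text.orEmpty()
  print(text)
  fullResponse += text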
}

Java

You can call generateContentStream() to stream generated text from multimodal input of text and images.

For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library.

Single file input

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);

// Provide a prompt that includes the image specified above and text
Content prompt = new Content.Builder()
        .addImage(bitmap)
        .addText("What developer tool is this mascot from?")
        .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        // getText() can return null for a chunk without text
        if (chunk != null) {
            fullResponse[0] += chunk;
        }
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
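    public void onSubscribe(Subscription s) {
        // Signal demand; a Reactive Streams publisher emits nothing until items are requested
        s.request(Long.MAX_VALUE);
    }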
});

Multiple file input

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
GenerativeModel gm = FirebaseVertexAI.getInstance()
        .generativeModel("gemini-2.0-flash");
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Bitmap bitmap1 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky);
Bitmap bitmap2 = BitmapFactory.decodeResource(getResources(), R.drawable.sparky_eats_pizza);

// Provide a prompt that includes the images specified above and text
Content prompt = new Content.Builder()
    .addImage(bitmap1)
    .addImage(bitmap2)
    .addText("What's different between these pictures?")
    .build();

// To stream generated text output, call generateContentStream with the prompt
Publisher<GenerateContentResponse> streamingResponse = model.generateContentStream(prompt);

final String[] fullResponse = {""};

streamingResponse.subscribe(new Subscriber<GenerateContentResponse>() {
    @Override
    public void onNext(GenerateContentResponse generateContentResponse) {
        String chunk = generateContentResponse.getText();
        // getText() can return null for a chunk without text
        if (chunk != null) {
            fullResponse[0] += chunk;
        }
    }

    @Override
    public void onComplete() {
        System.out.println(fullResponse[0]);
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
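    public void onSubscribe(Subscription s) {
        // Signal demand; a Reactive Streams publisher emits nothing until items are requested
        s.request(Long.MAX_VALUE);
    }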
});

Web

You can call generateContentStream() to stream generated text from multimodal input of text and images.

Single file input

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the image
  const prompt = "What do you see?";

  // Prepare image for input
  const fileInputEl = document.querySelector("input[type=file]");
  const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

  // To stream generated text output, call generateContentStream with the text and image
  const result = await model.generateContentStream([prompt, imagePart]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Multiple file input

import { initializeApp } from "firebase/app";
import { getVertexAI, getGenerativeModel } from "firebase/vertexai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Vertex AI service
const vertexAI = getVertexAI(firebaseApp);

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(vertexAI, { model: "gemini-2.0-flash" });

// Converts a File object to a Part object.
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

async function run() {
  // Provide a text prompt to include with the images
  const prompt = "What's different between these pictures?";

  const fileInputEl = document.querySelector("input[type=file]");
  const imageParts = await Promise.all(
    [...fileInputEl.files].map(fileToGenerativePart)
  );

  // To stream generated text output, call generateContentStream with the text and images
  const result = await model.generateContentStream([prompt, ...imageParts]);

  for await (const chunk of result.stream) {
    const chunkText = chunk.text();
    console.log(chunkText);
  }
}

run();

Dart

You can call generateContentStream() to stream generated text from multimodal input of text and images.

Single file input

import 'dart:io';

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

// Provide a text prompt to include with the image
final prompt = TextPart("What's in the picture?");
// Prepare images for input
final image = await File('image0.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// To stream generated text output, call generateContentStream with the text and image
final response = model.generateContentStream([
  Content.multi([prompt,imagePart])
]);
await for (final chunk in response) {
  print(chunk.text);
}

Multiple file input

import 'dart:io';

import 'package:firebase_vertexai/firebase_vertexai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI service and create a `GenerativeModel` instance
// Specify a model that supports your use case
final model =
      FirebaseVertexAI.instance.generativeModel(model: 'gemini-2.0-flash');

final (firstImage, secondImage) = await (
  File('image0.jpg').readAsBytes(),
  File('image1.jpg').readAsBytes()
).wait;
// Provide a text prompt to include with the images
final prompt = TextPart("What's different between these pictures?");
// Prepare images for input
final imageParts = [
  InlineDataPart('image/jpeg', firstImage),
  InlineDataPart('image/jpeg', secondImage),
];

// To stream generated text output, call generateContentStream with the text and images
final response = model.generateContentStream([
  Content.multi([prompt, ...imageParts])
]);
await for (final chunk in response) {
  print(chunk.text);
}

Requirements and recommendations for input image files

See "Supported input files and requirements for the Vertex AI Gemini API" for detailed information about the following:

Different options for providing a file in a request (either inline or using the file's URL or URI)
Requirements and best practices for image files

Supported image MIME types

Gemini multimodal models support the following image MIME types:

Image MIME type	Gemini 2.0 Flash	Gemini 2.0 Flash-Lite
PNG - `image/png`
JPEG - `image/jpeg`
WebP - `image/webp`
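A client can check a file's MIME type against the three types listed above before sending a request. This is only a sketch: `SUPPORTED_IMAGE_MIME_TYPES` and `isSupportedImageMimeType` are hypothetical names, and the set simply mirrors the list on this page.

```javascript
// The image MIME types listed above for Gemini 2.0 Flash and Flash-Lite.
const SUPPORTED_IMAGE_MIME_TYPES = new Set([
  "image/png",
  "image/jpeg",
  "image/webp",
]);

// Returns true when the MIME type (for example, file.type) is in the list.
function isSupportedImageMimeType(mimeType) {
  return SUPPORTED_IMAGE_MIME_TYPES.has(String(mimeType).toLowerCase());
}

console.log(isSupportedImageMimeType("image/png"));
console.log(isSupportedImageMimeType("image/gif"));
```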

Limits per request

There isn't a specific limit to the number of pixels in an image. However, larger images are scaled down and padded to fit a maximum resolution of 3072 x 3072 while preserving their original aspect ratio.
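To get a feel for the downscaling, the following sketch computes the size an image would have after being fitted inside 3072 x 3072 with its aspect ratio preserved. The exact server-side resizing and rounding are not specified here, so `scaledSize` is a hypothetical approximation and not the service's algorithm.

```javascript
const MAX_SIDE = 3072;

// Fit (width, height) inside maxSide x maxSide, keeping the aspect ratio.
// Images that already fit are left unchanged (the scale is capped at 1).
function scaledSize(width, height, maxSide = MAX_SIDE) {
  const scale = Math.min(1, maxSide / width, maxSide / height);
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

console.log(scaledSize(6144, 3072));
console.log(scaledSize(800, 600));
```

Under these assumptions, a 6144 x 3072 image is halved on both axes, while an 800 x 600 image keeps its size.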

Here's the maximum number of image files allowed in a prompt request:

Gemini 2.0 Flash and Gemini 2.0 Flash-Lite: 3000 images
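If an app might collect more images than one request allows, it can split them into groups up front. The sketch below is one possible approach, with `splitIntoRequests` as a hypothetical helper; the default limit is the 3000-image figure given above.

```javascript
const MAX_IMAGES_PER_REQUEST = 3000;

// Split a list of image parts into groups that each respect the limit.
function splitIntoRequests(imageParts, limit = MAX_IMAGES_PER_REQUEST) {
  const batches = [];
  for (let i = 0; i < imageParts.length; i += limit) {
    batches.push(imageParts.slice(i, i + limit));
  }
  return batches;
}

// 7000 placeholder parts are split into three groups.
const batches = splitIntoRequests(new Array(7000).fill({}));
console.log(batches.map((batch) => batch.length));
```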

What else can you do?

Learn how to count tokens before sending long prompts to the model.
Set up Cloud Storage for Firebase so that you can include large files in your multimodal requests and have a more managed solution for providing files in prompts. Files can include images, PDFs, video, and audio.
Start thinking about preparing for production, including setting up Firebase App Check to protect the Gemini API from abuse by unauthorized clients. Also, make sure to review the production checklist.

Try out other capabilities

Build multi-turn conversations (chat).
Generate text from text-only prompts.
Generate structured output (like JSON) from both text and multimodal prompts.
Generate images from text prompts.
Use function calling to connect generative models to external systems and information.

Learn how to control content generation

Understand prompt design, including best practices, strategies, and example prompts.
Configure model parameters like temperature and maximum output tokens (for Gemini) or aspect ratio and person generation (for Imagen).
Use safety settings to adjust the likelihood of getting responses that may be considered harmful.
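As an illustration of the model parameters mentioned above, this sketch assembles the options object that the Web samples pass to `getGenerativeModel`, with a `generationConfig` carrying `temperature` and `maxOutputTokens`. The helper `buildModelParams` is hypothetical and the values are placeholders, not recommendations.

```javascript
// Build the options object for getGenerativeModel(vertexAI, params).
// buildModelParams is an illustrative helper; the values are placeholders.
function buildModelParams(modelName, { temperature, maxOutputTokens }) {
  return {
    model: modelName,
    generationConfig: { temperature, maxOutputTokens },
  };
}

const modelParams = buildModelParams("gemini-2.0-flash", {
  temperature: 0.2,
  maxOutputTokens: 256,
});
console.log(modelParams);

// In an app, this object would be passed as:
//   const model = getGenerativeModel(vertexAI, modelParams);
```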

You can also experiment with prompts and model configurations using Vertex AI Studio.

Learn more about the supported models

Learn about the models available for various use cases and their quotas and pricing.

Give feedback about your experience with Vertex AI in Firebase
