Get keywords for images from the Google Cloud Vision API with C#

This post explores how to get keywords for images using the Google Cloud Vision API. To use this API you'll need a Google Console project. I'll be using a JSON server-to-server token. If you're not sure how to set this up, please consult the Quickstart.

According to Google the project has a lot to offer:

Google Cloud Vision API enables developers to understand the content of an image by encapsulating powerful machine learning models in an easy to use REST API. It quickly classifies images into thousands of categories (e.g., "sailboat", "lion", "Eiffel Tower"), detects individual objects and faces within images, and finds and reads printed words contained within images. You can build metadata on your image catalog, moderate offensive content, or enable new marketing scenarios through image sentiment analysis. Analyze images uploaded in the request or integrate with your image storage on Google Cloud Storage.

NuGet

Never build yourself what others have been sweating to build for you! Google has provided a fine NuGet package for the API.

Package Manager:
Install-Package Google.Apis.Vision.v1 -Version 1.49.0.2142

.NET CLI:
dotnet add package Google.Apis.Vision.v1 --version 1.49.0.2142

PackageReference:
<PackageReference Include="Google.Apis.Vision.v1" Version="1.49.0.2142" />

nuget.org/packages/Google.Apis.Vision.v1

Credentials + Service

The Google API uses many types of credentials. I'm using the server-to-server credential in a JSON format. The following code will create those credentials. Don't forget to provide the scope, otherwise the credentials will fail!

/// <summary>
/// Creates the credentials.
/// </summary>
/// <param name="path">The path to credential file.</param>
/// <returns>The credentials.</returns>
public static GoogleCredential CreateCredentials(string path)
{
    GoogleCredential credential;
    using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
    {
        var c = GoogleCredential.FromStream(stream);
        credential = c.CreateScoped(VisionService.Scope.CloudPlatform);
    }

    return credential;
}

Now we can create the service using the credentials:

/// <summary>
/// Creates the service.
/// </summary>
/// <param name="applicationName">Name of the application.</param>
/// <param name="credentials">The credentials.</param>
/// <returns>The service.</returns>
public static VisionService CreateService(
    string applicationName, 
    IConfigurableHttpClientInitializer credentials)
{
    var service = new VisionService(
        new BaseClientService.Initializer()
        {
            ApplicationName = applicationName,
            HttpClientInitializer = credentials
        }
    );

    return service;
}

Features

The Cloud Vision API returns keywords based on features. You can use the following feature constants:

LABEL_DETECTION: Add labels based on image content (see Label Detection Tutorial)
TEXT_DETECTION: Perform Optical Character Recognition (OCR) on text within the image
SAFE_SEARCH_DETECTION: Determine image safe search properties on the image
FACE_DETECTION: Detect faces within the image (see Face Detection Tutorial)
LANDMARK_DETECTION: Detect geographic landmarks within the image
LOGO_DETECTION: Detect company logos within the image
IMAGE_PROPERTIES: Compute a set of properties about the image (such as the image's dominant colors)
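
Passing these values around as raw strings makes typos easy, so here is a minimal sketch of a constants class for them. The class name VisionFeatureTypes and its member names are my own invention, not part of the Google package. The comments note which property of the response I'd expect each feature to fill, based on the field names of the REST API.

```csharp
// Hypothetical helper (not part of Google.Apis.Vision.v1): named constants
// for the feature strings listed above, so a typo is caught by the compiler.
public static class VisionFeatureTypes
{
    public const string LabelDetection = "LABEL_DETECTION";             // LabelAnnotations
    public const string TextDetection = "TEXT_DETECTION";               // TextAnnotations
    public const string SafeSearchDetection = "SAFE_SEARCH_DETECTION";  // SafeSearchAnnotation
    public const string FaceDetection = "FACE_DETECTION";               // FaceAnnotations
    public const string LandmarkDetection = "LANDMARK_DETECTION";       // LandmarkAnnotations
    public const string LogoDetection = "LOGO_DETECTION";               // LogoAnnotations
    public const string ImageProperties = "IMAGE_PROPERTIES";           // ImagePropertiesAnnotation

    // All seven feature types from the list, handy for "request everything" experiments.
    public static readonly string[] All =
    {
        LabelDetection, TextDetection, SafeSearchDetection, FaceDetection,
        LandmarkDetection, LogoDetection, ImageProperties
    };
}
```

The constants can be used anywhere a feature string is expected, for example new Feature { Type = VisionFeatureTypes.LabelDetection }.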

Prepare image request

First, let's create an AnnotateImageRequest that represents the data for a single file.

/// <summary>
/// Creates the annotation image request.
/// </summary>
/// <param name="path">The path.</param>
/// <param name="featureTypes">The feature types.</param>
/// <returns>The request.</returns>
private static AnnotateImageRequest CreateAnnotationImageRequest(
    string path, 
    string[] featureTypes)
{
    if (!File.Exists(path))
    {
        throw new FileNotFoundException("Not found.", path);
    }

    var request = new AnnotateImageRequest();
    request.Image = new Image();

    var bytes = File.ReadAllBytes(path);
    request.Image.Content = Convert.ToBase64String(bytes);

    request.Features = new List<Feature>();

    foreach(var featureType in featureTypes)
    {
        request.Features.Add(new Feature() { Type = featureType });
    }

    return request;
}

Note that you'll need to new up any property you need to use on the AnnotateImageRequest. The API will not create collections like Features.

AnnotateAsync - single file

Now we can create an extension method that will extend the VisionService with a simple method to execute an annotation request for a single file. It ties the service and the AnnotateImageRequest together.

/// <summary>
/// Annotates the file asynchronously.
/// </summary>
/// <param name="service">The service.</param>
/// <param name="file">The file.</param>
/// <param name="features">The features.</param>
/// <returns>The annotation response.</returns>
public static async Task<AnnotateImageResponse> AnnotateAsync(
    this VisionService service, 
    FileInfo file, 
    params string[] features)
{
    var request = new BatchAnnotateImagesRequest();
    request.Requests = new List<AnnotateImageRequest>();
    request.Requests.Add(CreateAnnotationImageRequest(file.FullName, features));

    var result = await service.Images.Annotate(request).ExecuteAsync();

    if (result?.Responses?.Count > 0)
    {
        return result.Responses[0];
    }

    return null;
}

The API supports batched requests. Implementation should not be hard: you'll just need to add more requests (request.Requests.Add). I left it out of this tutorial.
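
For the curious, a batched version could look roughly like the sketch below. The class VisionBatchExtensions and the methods BuildBatchRequest and AnnotateBatchAsync are hypothetical names of mine. I'm assuming the responses come back in the same order as the requests, and I have not handled the per-call image limit that Google documents, so a large directory would still need to be split into chunks.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Google.Apis.Vision.v1;
using Google.Apis.Vision.v1.Data;

// Hypothetical class and method names; not part of the Google library.
public static class VisionBatchExtensions
{
    // Builds one BatchAnnotateImagesRequest holding one image request per file.
    public static BatchAnnotateImagesRequest BuildBatchRequest(
        IEnumerable<string> paths,
        params string[] features)
    {
        return new BatchAnnotateImagesRequest
        {
            Requests = paths
                .Select(path => new AnnotateImageRequest
                {
                    // The image bytes travel as a base64 string.
                    Image = new Image
                    {
                        Content = Convert.ToBase64String(File.ReadAllBytes(path))
                    },
                    Features = features
                        .Select(type => new Feature { Type = type })
                        .ToList()
                })
                .ToList()
        };
    }

    // Sends all files in a single call and returns the individual responses.
    // Assumption: the responses are in the same order as the paths.
    public static async Task<IList<AnnotateImageResponse>> AnnotateBatchAsync(
        this VisionService service,
        IEnumerable<string> paths,
        params string[] features)
    {
        var request = BuildBatchRequest(paths, features);
        var result = await service.Images.Annotate(request).ExecuteAsync();

        // Return an empty list rather than null when nothing came back.
        return result?.Responses ?? new List<AnnotateImageResponse>();
    }
}
```

Usage would be something like var responses = await service.AnnotateBatchAsync(paths, "LABEL_DETECTION"), where paths holds the full file names. Building the request in a separate method keeps that part testable without calling the API.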

Proof of concept

All the components are ready. I've created a directory with some files from the excellent Unsplash project and I ran the following program:

//find the files
var ext = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".png", ".jpg", ".gif" };
var dir = @"C:\temp\Examples";
var files =
    Directory
        .GetFiles(dir, "*.*", SearchOption.AllDirectories)
        .Where(f => ext.Contains(Path.GetExtension(f)))
        .Select(f => new FileInfo(f))
        .ToArray();

//create service
var credentials = GoogleVisionApi.CreateCredentials("user_credentials.json");
var service = GoogleVisionApi.CreateService("MyApplication", credentials);

//process each file
foreach (var file in files)
{
    string f = file.FullName;
    Console.WriteLine("Reading " + f + ":");

    var task = service.AnnotateAsync(file, "LABEL_DETECTION");
    var result = task.Result;

    // Fall back to an empty array: String.Join throws on a null array.
    var keywords = result?.LabelAnnotations?.Select(s => s.Description).ToArray()
        ?? Array.Empty<string>();
    var words = String.Join(", ", keywords);
    Console.WriteLine(words);

    f += ".keywords.txt";
    File.WriteAllText(f, words);
}
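
The program above writes every label the API returns. Each label annotation also carries a score, so if you only want the more confident keywords you could filter on it. Below is a minimal sketch; the class KeywordFilter and the method SelectKeywords are hypothetical names of mine, and any threshold you pass in is an arbitrary choice to tune for your own images.

```csharp
using System.Collections.Generic;
using System.Linq;
using Google.Apis.Vision.v1.Data;

// Hypothetical helper; not part of the Google library.
public static class KeywordFilter
{
    // Returns the descriptions of labels scoring at least minScore, best first.
    // A null input (no labels found) yields an empty array.
    public static string[] SelectKeywords(
        IEnumerable<EntityAnnotation> labels,
        float minScore)
    {
        return (labels ?? Enumerable.Empty<EntityAnnotation>())
            // Score is nullable; a missing score is treated as 0.
            .Where(l => l.Score.GetValueOrDefault() >= minScore)
            .OrderByDescending(l => l.Score.GetValueOrDefault())
            .Select(l => l.Description)
            .ToArray();
    }
}
```

In the loop above this would replace the Select call, for example var keywords = KeywordFilter.SelectKeywords(result?.LabelAnnotations, 0.75f).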

Results

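This produces one list of keywords per image. The lists for my nine example images are reproduced at the end of this post, below the comments.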

Final thoughts

The keywords are fine, but I had expected more from the Cloud Vision API. I'm missing keywords like: sun, puppy, strawberry, pots, window-sill. Maybe the service will improve. I can't wait to experiment with other services. Hopefully the code will help your project.

Kees C. Bakker says:
June 21, 2016 at 00:54

Everything has a price, so be sure to check out the pricing before you begin: https://cloud.google.com/vision/docs/pricing

Robin says:
March 6, 2017 at 15:17

Can you share code for Android ?

  Kees C. Bakker says:
  March 15, 2017 at 12:06

  No idea how. You need Xamarin code?

    Dai Oliveira says:
    October 6, 2017 at 01:31

    Hi! Can you share de Xamarin Code?

      Dai Oliveira says:
      October 6, 2017 at 04:14

      My code is done! Thank you for share this post with us!

widad says:
June 26, 2017 at 18:22

thanks very helpful :)

GM SaHito says:
October 14, 2018 at 09:32

it works offline.. not dependent on internate???


Results: keywords returned for each of the nine example images (images not shown)

Image 1: light, silhouette, hand, lighting, sunrise
Image 2: plant, tree, leaf, branch
Image 3: shore, beach, sea, walkway, coast
Image 4: vacation, sun tanning, photo shoot
Image 5: bird, nature, animal, wildlife, vertebrate
Image 6: pet, dog, mammal, animal, vertebrate
Image 7: plant, flower, flowering plant, floristry, floral design
Image 8: crowd, city, road, street, vehicle
Image 9: dish, food, meal, produce, breakfast