MP3: Libraries, APIs, and JSON

To build great things today it helps to stand on the shoulders of giants. MP3 introduces you to two different ways of using code provided by others: external libraries and remote application programming interfaces (APIs). You also get practice working with data formatted in JavaScript Object Notation (JSON), a simple and widely supported data exchange format. We’ll also continue to sharpen your Android UI design abilities.

MP3 is due Friday 3/15/2019 @ 5PM. To receive full credit, you must submit by this deadline. In addition, 10% of your grade on MP3 is for submitting code that earns at least 50 points by Monday 3/11/2019 @ 5PM. As usual, late submissions will be subject to the MP late submission policy.

1. Learning Objectives

MP3 introduces you to integrating external libraries and APIs into your Android projects. We’ll show you how to use the Microsoft Cognitive Services API to analyze image data, and you’ll learn how to process the results using the Google GSON library.

We’ll also continue to reinforce the learning objectives from previous MPs (0, 1, and 2). In particular MP3 forces you to do more with Android UI layout than you have in the past—this will help prepare you for the final project.

2. Assignment Structure

Like previous Android MPs, MP3 is split into two pieces:

  • /lib/: a small library that extracts certain pieces of information from the JSON image data returned by the Microsoft Cognitive Services API using the Google GSON library.

  • /app/: an Android app for you to use for your own interactive testing. The Android app is almost complete, but needs a few modifications from you. And, for the app to work correctly, you need to finish the library in lib.

Note that you need to create RecognizePhoto.java in the proper directory and add it to the edu.illinois.cs.cs125.spring2019.mp3.lib package.

2.1. MP3 Image Recognition Library (lib)

The MP3 app handles uploading photos to the Microsoft Cognitive Services API. However, once the results are returned it relies on your library to extract a few pieces of information:

  • The image width, height, and format: using getWidth, getHeight, and getFormat.

  • The autogenerated image caption: using getCaption

  • Whether the image contains a dog or a cat: using isADog and isACat

  • Whether you’ve been rickrolled: that is, whether the photo contains Rick Astley, using isRick.

As always, you may find our official MP3 online documentation useful as you understand what you need to do. If you believe that the documentation is unclear, please post on the forum and we’ll offer clarification as needed. To complete this part of the assignment you’ll want to review the section on JSON below.

2.2. MP3 App

The MP3 app handles obtaining images—either ones saved previously on the device, captured by the camera, or downloaded from the internet. It also uploads those images to the Microsoft Cognitive Services API and updates the UI appropriately—or at least parts of it.

Most of the code that you write for MP3 will go toward completing the image recognition library. However, there are a few changes that you’ll need to make to the app to earn full points and ensure that the user interface functions properly. To complete the UI you’ll need to do the following things:

  • Ensure that your app is requesting data properly from the Microsoft Cognitive Services API. Investigate Tasks.java to ensure that your API call is retrieving all the necessary information.

  • Add and update UI components to display information about the photo and the autogenerated caption. Note that this TextView doesn’t even exist in the layout yet! You’ll need to create it and update it appropriately—both when a photo is loaded into the app and when the photo is cleared. You should use the information returned by your RecognizePhoto library to set the photo information and caption properly.

  • Add and update dog and cat icons appropriately. When the image is a dog you should show a dog icon, and when it is a cat you should show a cat icon. If it is neither a dog nor a cat you should not show either animal icon. Again, these UI components don’t even exist yet! You get to add them to the UI and update them appropriately, using the results from your photo recognition library.

  • Do something when you find Rick. You do want to celebrate, right? Maybe try something like this.

2.3. Obtaining MP3

Use this GitHub Classroom invitation link to fork your copy of MP3. Once your repository has been created, import it into Android Studio following our assignment Git workflow guide.

2.4. Your Goal

At this point you should be familiar with the requirements from previous MPs. See the grading description below. However, note that for MP3 we have not provided a complete test suite. You will have to figure out how to test your Rick Astley detection yourself.

3. APIs

API stands for application programming interface. The better you become at understanding and using existing APIs, the faster and more easily you will be able to build powerful apps and programs.

3.1. What is an API?

In the most general terms, an API is "a set of clearly defined methods of communication between various software components". For MP3 and your final project we are particularly interested in the subset of remote web-based APIs. These APIs are:

  • remote: they are accessed over the internet, and

  • web-based: they are accessed using standard web protocols.

One way of thinking about APIs is that somewhere, in some data center full of computers, is a computer that will run certain functions for you if you ask it nicely. To use that functionality you don’t have to know where that computer is, or how the function works. You just have to learn how to get the data to the API in the format that it expects, and understand how to interpret the results.

An API is like a collection of functions, with each function providing different functionality. Using an API is very much like calling a function, except that to use a remote API we need to figure out how to get the data to the API and get the results back.
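To make the function-call analogy concrete, here is a minimal Java sketch of the first half of a remote call: packaging the arguments into a request URL. The host api.example.com, the /analyze path, and the parameter names are hypothetical placeholders rather than the real Cognitive Services endpoint, and the sketch only builds the URL without sending anything.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class RemoteCallSketch {
    /** Build a request URL by encoding each argument as a query parameter. */
    static String buildRequestUrl(String endpoint, Map<String, String> arguments) {
        StringBuilder url = new StringBuilder(endpoint);
        char separator = '?';
        for (Map.Entry<String, String> argument : arguments.entrySet()) {
            // Encoding keeps characters such as commas and spaces from breaking the URL
            url.append(separator)
                .append(URLEncoder.encode(argument.getKey(), StandardCharsets.UTF_8))
                .append('=')
                .append(URLEncoder.encode(argument.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical endpoint and parameter names, chosen only for illustration
        Map<String, String> arguments = new LinkedHashMap<>();
        arguments.put("features", "caption,tags");
        arguments.put("language", "en");
        System.out.println(buildRequestUrl("https://api.example.com/analyze", arguments));
    }
}
```

A local function call would pass the same arguments directly. The remote version has to flatten them into text that can travel over the network, and the results have to be turned back into program values when they come back.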

3.2. Microsoft Cognitive Services

MP3 requires you to use a portion of the Microsoft Cognitive Services API. It provides a variety of features focused on enabling interaction with users in intuitive ways—processing vision and speech data, and providing intelligent search, recommendation, and natural language services.

Why should you use this API? Well, imagine that you want to extract information from photos taken by users of your new app. You have two options:

  • Spend several years mastering the sophisticated machine learning and computer vision algorithms required to implement your own solution…​

  • …​or, use the Microsoft Cognitive Services Computer Vision API.

By using APIs you are truly able to stand on the shoulders of giants. Don’t waste your time solving problems that you aren’t interested in solving! Chances are that somebody else is waiting for you to solve the problem that you want to solve. Get on to that and focus on changing the world, not re-solving problems that others have already solved.

3.3. Using the Cognitive Services Computer Vision API

To get a sense of what the Cognitive Services Computer Vision API can do, experiment with some of the examples on this page. For each feature, upload your own images to get a sense of what kind of capabilities this API has.

MP3 focuses on the image analysis feature: the first one listed on this page. Go through the sample images and see if you can understand the results returned by the API:

  1. Are the results accurate?

  2. In cases where they are inaccurate, can you figure out why?

  3. What kind of information is reported by the API?

  4. What parts of it are you most surprised by and why?

3.4. Gaining API Access

Like many remote APIs, gaining programmatic access to the Microsoft Cognitive Services API in your app requires a key. Keys allow API providers to control who uses their services, and allow providers to begin to charge API users if their usage exceeds various thresholds.

Happily, many remote APIs provide free access for usage that is more than sufficient to develop and test your own programs. And, as a student, you also have access to many free programs offered by companies to introduce you to their APIs and services. So you can try out a lot of things without paying a dime. Of course, once your app built using the Microsoft Cognitive API takes off and is being used by one million people, you’ll need to start shelling out some money to Microsoft. But let’s get there first.

So the first step to gaining access to the Cognitive Services API is to get an API key. First, use this link to create a free Azure for Students account. This provides free access to many existing Microsoft APIs as well as $100 of free cloud credits.

Next, use this link to start the process of creating an API key for cognitive services. Make sure that your key is created in the West Central region! If it is not you’ll have to modify other parts of MP3 for your key to work.

The screencast above also shows you how we use the Microsoft Cognitive Services API in MP3 and how to add your Microsoft Cognitive Services API key to your project so that you can make your own requests. Specifically, you need to create a file called secrets.properties in the app folder of your project and add the following content to that file:

API_KEY=<Your Cognitive Services API Key>

You should replace <Your Cognitive Services API Key>, including the angle brackets, with the key that you obtained following the instructions above.
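The file uses the standard Java properties format, with one KEY=value pair per line. Purely as an illustration, the sketch below parses that format with java.util.Properties and attaches the key to a request as an HTTP header. The MP3 starter code may consume the file differently, so you should not need to write this yourself. The header name Ocp-Apim-Subscription-Key, the host api.example.com, and the class and method names are all assumptions made for the sketch, and no network request is actually sent.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Properties;

class ApiKeySketch {
    /** Read API_KEY from text in the secrets.properties format. */
    static String readApiKey(Reader source) {
        Properties secrets = new Properties();
        try {
            secrets.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String key = secrets.getProperty("API_KEY");
        if (key == null || key.isEmpty()) {
            throw new IllegalStateException("API_KEY is missing");
        }
        return key;
    }

    /** Prepare (but do not send) a request that carries the key in a header. */
    static HttpURLConnection prepareRequest(String url, String apiKey) {
        try {
            // openConnection() only creates the object; nothing is sent yet
            HttpURLConnection connection =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod("POST");
            // Assumed header name; check the API documentation for the real one
            connection.setRequestProperty("Ocp-Apim-Subscription-Key", apiKey);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String key = readApiKey(new StringReader("API_KEY=not-a-real-key\n"));
        HttpURLConnection request = prepareRequest("https://api.example.com/analyze", key);
        System.out.println(request.getRequestProperty("Ocp-Apim-Subscription-Key"));
    }
}
```

Keeping the key in a separate file rather than in your source code makes it easier to avoid committing it to a public repository.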

Of course, like any artificial intelligence system, the Microsoft Cognitive Services API is not perfect. We’ve seen it produce some very amusing results. If you find a good one, post it on the forum for us to giggle at.

4. JSON

Object-oriented languages make it easy to model data internally by designing classes. But at times we need to exchange data between two different programs or systems, possibly implemented in different languages. That requires representing the data in a format that both systems can understand. JSON (JavaScript Object Notation) is one popular data exchange format in wide use on the internet, and frequently used to communicate with web APIs.

JSON is both simple and incredibly powerful. It is based on only two different principles, but can represent a wide variety of different data. Using the Microsoft Cognitive Services API requires understanding JSON, and completing MP3 requires that you implement several simple JSON parsing tasks.

4.1. What is JSON?

Imagine we have an instance of the following Java class:

public class Person {
    public String name;
    public int age;

    Person(String setName, int setAge) {
        name = setName;
        age = setAge;
    }
}
Person geoffrey = new Person("Geoffrey", 38);

Now imagine we want to send this information to another computer program: for example, from an Android app written in Java to a web application programming interface (API) that could be written in Java, Python, or any other language. How do we represent this information in a way that is correct and complete, yet also portable?

JSON (JavaScript Object Notation) has become a popular answer to that question. While it is named after JavaScript, the language that introduced JSON, JSON is now supported by pretty much every common programming language. This allows an app written in Java to communicate with a web API written in Python, or a web application written in JavaScript to communicate with a web backend written in Rust.

Enough talk. Here’s how the object above could be represented in JSON:

{
  "name": "Geoffrey",
  "age": 38
}

JSON has only two ways to structure data: objects and arrays. Above you see an example object. Like a Java object, it has named variables (name, age), each of which takes on a particular value ("Geoffrey", 38). Here’s another example. The following instance of this Java class:

public class Course {
    public String name;
    public int enrollment;
    public double averageGrade;

    Course(String setName, int setEnrollment, double setAverageGrade) {
        name = setName;
        enrollment = setEnrollment;
        averageGrade = setAverageGrade;
    }
}
Course cs125 = new Course("CS 125", 500, 3.9);

would be represented as this JSON string:

{
  "name": "CS 125",
  "enrollment": 500,
  "averageGrade": 3.9
}

JSON can also represent arrays. This Java array:

int[] array = new int[] { 1, 2, 10, 8 };

would be represented using this JSON string:

[1, 2, 10, 8]

We can also represent nested objects and objects with array instance variables:

public class Person {
    public String name;
    public int age;

    Person(String setName, int setAge) {
        name = setName;
        age = setAge;
    }
}
public class Course {
    public String name;
    public int enrollment;
    public double averageGrade;
    public Person instructor;
    public int[] grades;

    Course(String setName, int setEnrollment,
        double setAverageGrade, Person setInstructor,
        int[] setGrades) {
        name = setName;
        enrollment = setEnrollment;
        averageGrade = setAverageGrade;
        instructor = setInstructor;
        grades = setGrades;
    }
}
Course cs125 = new Course("CS 125", 500, 3.9,
  new Person("Geoffrey", 38), new int[] { 4, 4, 3 });

This instance would be represented as the following JSON string:

{
  "name": "CS 125",
  "enrollment": 500,
  "averageGrade": 3.9,
  "instructor": {
    "name": "Geoffrey",
    "age": 38
  },
  "grades": [
    4,
    4,
    3
  ]
}
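To see how nesting falls out of the two building blocks, here is a hand-rolled serialization sketch for a Person and a Course like the ones above. It is for illustration only: it does not escape special characters in strings, the toJson method names are our own invention, and in MP3 you should let GSON handle JSON. The output is on one line rather than indented. JSON ignores whitespace between tokens, so both forms describe the same data.

```java
import java.util.Arrays;

class NestedJsonSketch {
    static class Person {
        String name;
        int age;

        Person(String setName, int setAge) {
            name = setName;
            age = setAge;
        }

        /** An object becomes { ... } with one "name": value pair per field. */
        String toJson() {
            return "{\"name\": \"" + name + "\", \"age\": " + age + "}";
        }
    }

    static class Course {
        String name;
        int enrollment;
        double averageGrade;
        Person instructor;
        int[] grades;

        Course(String setName, int setEnrollment, double setAverageGrade,
               Person setInstructor, int[] setGrades) {
            name = setName;
            enrollment = setEnrollment;
            averageGrade = setAverageGrade;
            instructor = setInstructor;
            grades = setGrades;
        }

        /** A nested object is just another { ... } used as a value. */
        String toJson() {
            return "{\"name\": \"" + name + "\""
                + ", \"enrollment\": " + enrollment
                + ", \"averageGrade\": " + averageGrade
                + ", \"instructor\": " + instructor.toJson()
                // For an int[], Arrays.toString happens to produce valid JSON
                + ", \"grades\": " + Arrays.toString(grades)
                + "}";
        }
    }

    public static void main(String[] args) {
        Course cs125 = new Course("CS 125", 500, 3.9,
            new Person("Geoffrey", 38), new int[] {4, 4, 3});
        System.out.println(cs125.toJson());
    }
}
```

Notice that Course.toJson never needs to know what fields a Person has. It simply drops the instructor’s JSON in as a value, which is why JSON can describe arbitrarily deep structures with only objects and arrays.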

4.2. Parsing JSON

Because JSON is supported by many different programming languages, many web APIs return data in JSON format. The Microsoft Cognitive Services API is one of them. To utilize this data, you must first parse it or deserialize it. The process of converting a Java object—or object in any language—to JSON is called serialization. The reverse process is called deserialization.

Happily, good libraries exist to parse JSON in every programming language. Java is no exception. We have included the Google GSON JSON parsing library in your project for you to use. Note that you must use the GSON library to parse JSON for MP3. Attempts to add other JSON parsing libraries to your project will fail during remote grading.

One way to use GSON is to create a class that matches your JSON string. So if you were provided with this JSON from a web API:

{
  "number": 0,
  "caption": "I'm a zero"
}

you would design this Java class to represent it:

public class Result {
    public int number;
    public String caption;
}

Note how our class mirrors both the names (number, caption) and types (int, String) from the JSON result.

However, when you are working with unfamiliar JSON data, as you are in MP3, we suggest that you not create new classes and instead use the general-purpose classes that GSON provides, such as JsonParser and JsonObject. Here’s an example of how to do this given the JSON string shown above:

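import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
// Parse the raw JSON text (jsonString) and treat the top-level value as an object
JsonParser parser = new JsonParser();
JsonObject result = parser.parse(jsonString).getAsJsonObject();
// Look up each field by name and convert it to the matching Java type
int number = result.get("number").getAsInt();
String caption = result.get("caption").getAsString();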

Note that for MP3 we will not grade any additional class files you add to your lib directory. So we suggest you follow our example above.

4.3. Example JSON

Here is some example JSON produced by the Microsoft Cognitive Services API. You may want to consult this as you begin work on your image recognition functions. The app will also display the JSON returned for the photo that you have loaded below the image after the API request completes.

5. Grading

MP3 is worth 100 points total, broken down as follows:

  1. 55 points: RecognizePhoto.java

    • 5 points for getWidth

    • 5 points for getHeight

    • 5 points for getFormat

    • 10 points for getCaption

    • 10 points for isADog

    • 10 points for isACat

    • 10 points for isRick

  2. 25 points: MainActivity.java

    • 5 points for making an API request properly when the button is clicked

    • 5 points for setting the metadata properly

    • 5 points for setting the caption properly

    • 10 points for adjusting the animal icons properly

  3. 10 points for no checkstyle violations

  4. 10 points for committing code that earns at least 50 points before Monday 3/11/2019 @ 5PM.

5.1. Test Cases

As in previous MPs, we have provided test cases for MP3. Please review the MP0 testing instructions.

However, unlike previous MPs we have not provided complete test cases for MP3. Specifically, we have not provided a test for isRick. This is intentional, and designed to force you to do your own local testing. It is also designed to not give away exactly what features of the JSON returned by the Microsoft Cognitive Services API you will need to look at to complete isRick.

5.2. Autograding

Like previous MPs, we have provided you with an autograding script that you can use to estimate your current grade as often as you want. Please review the MP0 autograding instructions. However, as described above, note that the local test suite will not test isRick, while the remote test suite will.

6. Submitting Your Work

Follow the instructions from the submitting portion of the CS 125 workflow instructions.

And remember, you must submit something that earns 50 points before Monday 3/11/2019 @ 5PM to earn 10 points on the assignment.

7. Academic Integrity

If you cheat, we will make you watch this over and over again:



Created 10/24/2021
Updated 10/24/2021