Exceptions and Hash Functions
Review: Types of Exceptions
Java exceptions are broken into three distinct categories:
-
Checked exceptions: these are for places where you know something might go wrong and it’s out of your control
-
Unchecked exceptions (or runtime errors): these are unanticipated errors usually caused by something dumb that you (the programmer) did wrong
-
Errors: these are reserved for serious system problems that are probably not recoverable
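As a rough illustration, the three categories correspond to where a class sits in Java's Throwable hierarchy. The sketch below classifies a few standard exceptions; the class name ExceptionKinds and the method kindOf are made up for this example.

```java
import java.io.IOException;

class ExceptionKinds {
    // Classify a Throwable by its position in Java's class hierarchy.
    static String kindOf(Throwable t) {
        if (t instanceof Error) {
            return "error";      // serious system problem
        } else if (t instanceof RuntimeException) {
            return "unchecked";  // compiler does not force you to handle it
        } else {
            return "checked";    // compiler makes you catch or declare it
        }
    }

    public static void main(String[] args) {
        System.out.println(kindOf(new IOException("disk unplugged")));          // checked
        System.out.println(kindOf(new NullPointerException("forgot to init"))); // unchecked
        System.out.println(kindOf(new OutOfMemoryError("heap exhausted")));     // error
    }
}
```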
Exception Handling Strategies
Here are reasonable strategies for handling each kind of exception:
-
Errors: don’t try to handle these, just go bye-bye
-
Unchecked exceptions: try to avoid these by improving your code
-
Checked exceptions: try to handle these and have your program continue running, or exit gracefully… but don’t go on unless you can.
Working with Exceptions
Java exceptions are just another kind of Java object—and they have some useful features, particularly when debugging:
-
toString: like every other Java Object, exceptions can be printed
-
getMessage: retrieves just the message associated with this exception
-
printStackTrace: prints a stack trace for the error showing what caused it and what other functions were involved
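Here is a minimal sketch of those three methods in action. The class ExceptionInspection and the method provoke are hypothetical names; the exception is thrown on purpose just so we have something to look at.

```java
class ExceptionInspection {
    // Throw an exception on purpose and hand it back for inspection.
    static Exception provoke() {
        try {
            throw new IllegalStateException("engine stalled");
        } catch (IllegalStateException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        Exception e = provoke();
        System.out.println(e);              // toString: class name plus message
        System.out.println(e.getMessage()); // just the message
        e.printStackTrace();                // where it was thrown and who called whom
    }
}
```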
Rethrowing Exceptions
Sometimes you want to record that an error happened but don’t know how to handle it.
In that case you can rethrow it from the catch block:
// URISyntaxException is checked, so rethrowing it means declaring it
static URI createURI(final String input) throws URISyntaxException {
    // Example where we handle URISyntaxExceptions
    try {
        return new URI(input);
    } catch (URISyntaxException e) {
        // Log that something went wrong
        Log.e(TAG, input + " is not a valid URI");
        // Rethrow the exception so the caller can decide what to do
        throw e;
    }
}
Throwing Your Own Exceptions
So how do we handle a case like this?
class StringStorage {
    /**
     * Create a new object to store strings.
     *
     * @param storageSize the size of the StringStorage,
     *                    must be positive
     */
    public StringStorage(final int storageSize) {
        if (storageSize <= 0) {
            // what now?
        }
    }
}
throw
To throw an exception in Java we use the throw keyword:
Exception e = new Exception("you did something awful");
throw e;
throw Well
If you need to throw an exception:
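-
Look for an existing Exception class that’s a good fit
-
Or, create your own:
public class MyException extends Exception {
    // Needed so that we can pass a message when throwing
    public MyException(final String message) {
        super(message);
    }
}
throw new MyException("bad bad");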
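Going back to the StringStorage constructor, one way to fill in the "what now?" is to throw an existing exception. This is a sketch, not the only answer: picking IllegalArgumentException is my choice here, and the strings field and capacity method are hypothetical additions so the example does something.

```java
class StringStorage {
    // Hypothetical backing array, added only for this example
    private final String[] strings;

    public StringStorage(final int storageSize) {
        if (storageSize <= 0) {
            // Refuse to build an object that can never work
            throw new IllegalArgumentException(
                "storageSize must be positive, got " + storageSize);
        }
        strings = new String[storageSize];
    }

    int capacity() {
        return strings.length;
    }

    public static void main(String[] args) {
        System.out.println(new StringStorage(4).capacity());
        try {
            new StringStorage(0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

IllegalArgumentException is unchecked, which fits the earlier advice: a bad size is a programmer mistake to fix, not something to recover from.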
finally
Java’s try-catch also supports a finally block. It is always executed after either the try or the catch completes:
try {
    System.out.println("start");
    couldError();
    System.out.println("done");
} catch (Exception e) {
    System.out.println("catch");
} finally {
    System.out.println("finally");
}
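To see both paths, here is a self-contained sketch that records which blocks run. The class FinallyOrder and the fail flag are invented for this example and stand in for couldError either succeeding or throwing.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyOrder {
    // Record which blocks run, with or without an exception inside the try.
    static List<String> run(boolean fail) {
        List<String> steps = new ArrayList<>();
        try {
            steps.add("start");
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            steps.add("done");
        } catch (IllegalStateException e) {
            steps.add("catch");
        } finally {
            steps.add("finally");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [start, done, finally]
        System.out.println(run(true));  // [start, catch, finally]
    }
}
```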
Intelligent try Usage
You can make intelligent use of try-catch blocks to avoid repetitive sanity checking like this:
JsonParser parser = new JsonParser();
JsonObject info = parser.parse(json).getAsJsonObject();
if (!info.has("metadata")) {
    return 0;
}
JsonObject metadata = info.getAsJsonObject("metadata");
if (!metadata.has("width")) {
    return 0;
}
// JsonObject.get returns the member as a JsonElement
JsonElement width = metadata.get("width");
return width.getAsInt();
Instead, wrap the whole thing in a single try-catch. (This is particularly nice when you can chain calls together.)
try {
    JsonParser parser = new JsonParser();
    return parser.parse(json)
        .getAsJsonObject()
        .getAsJsonObject("metadata")
        .get("width")
        .getAsInt();
} catch (Exception e) {
    return 0;
}
Questions About Exceptions?
Let’s Imagine…
Imagine I told you that there was a function with the following properties:
-
Determinism: it can convert an arbitrary amount of data into a single limited-size value. If we repeat the computation on the same data, we get the same value.
-
Uniformity: over many inputs, each output value is equally likely.
-
Efficiency: it is efficient to compute.
Hash Functions
A hash function is any function that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.
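Java already ships with a simple example: String.hashCode maps a string of any length to a 32-bit int. The sketch below (HashDemo is a made-up class name) shows the arbitrary-size-in, fixed-size-out behavior and the determinism. Note that hashCode is not designed to be especially uniform, so treat it as an illustration rather than a model hash function.

```java
class HashDemo {
    // Java's built-in String hash: any length in, one 32-bit int out.
    static int hash(String data) {
        return data.hashCode();
    }

    public static void main(String[] args) {
        String small = "a";
        String large = "a".repeat(1_000_000);
        System.out.println(hash(small)); // one int
        System.out.println(hash(large)); // still just one int
        // Same data, same hash, every time
        System.out.println(hash(large) == hash("a".repeat(1_000_000)));
    }
}
```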
What Could We Do With Such A Function?
It may not seem obvious at first, but hash functions have many, many uses.
Example: Download Verification
Imagine the following scenario:
-
You need to download a 120GB file to install a particular piece of software.
-
It’s possible that, along the way, some data gets corrupted—either by the network or by your disk, who knows.
-
So before you install the software you want to be sure that you downloaded the file correctly.
Without A Hash Function
Without a hash function, what do we have to do?
-
Download the 120GB file.
-
Download it again. (Slow.)
-
Compare the two to make sure that they are the same. (Also slow.)
But…
Remember, I have a function with the following properties:
-
Determinism: it can convert an arbitrary amount of data into a single limited-size value. If we repeat the computation on the same data, we get the same value.
-
Uniformity: over many inputs, each output value is equally likely.
-
Efficiency: it is efficient to compute.
With A Hash Function
With a hash function, what do we do?
-
Download the 120GB file.
-
Download a hash of the file: maybe only a few bytes.
-
Compute the hash of the file locally and make sure that it matches.
Example Download With md5sum
md5 is a popular hash function that produces a 128-bit value.
We’re expecting an md5 hash value of d95bacb4ccd59657a5ac2bf66b35ebcc:
$ md5 mactex-20170524.pkg
MD5 (mactex-20170524.pkg) = d95bacb4ccd59657a5ac2bf66b35ebcc
$
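The same check can be sketched in Java using the standard MessageDigest class. Md5Check, md5Hex, and matches are names I made up, and the "file" here is a short byte array; for a real 120GB download you would feed the file to the digest in chunks rather than load it all into memory.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Check {
    // Hex-encoded MD5 digest of the given bytes (128 bits, so 32 hex digits).
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide
            throw new IllegalStateException(e);
        }
    }

    // True when the local copy hashes to the published value.
    static boolean matches(byte[] downloaded, String expectedHex) {
        return md5Hex(downloaded).equalsIgnoreCase(expectedHex);
    }

    public static void main(String[] args) {
        byte[] file = "hello world".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(file));
        System.out.println(matches(file, "5eb63bbbe01eeed093cb22bb8f5acdc3"));
    }
}
```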
Example: Fingerprinting Content
Imagine the following scenario.
-
You sent me foo.docx at some point.
-
(I deleted it because it was a .docx file, so in reality scenario over.)
-
But let’s pretend that you can’t remember if you sent me the latest version.
Without a Hash Function
Without a hash function, what do we do?
-
You send me the file again.
-
(And I delete it again.)
But…
Remember, I have a function with the following properties:
-
Determinism: it can convert an arbitrary amount of data into a single limited-size value. If we repeat the computation on the same data, we get the same value.
-
Uniformity: over many inputs, each output value is equally likely.
-
Efficiency: it is efficient to compute.
With a Hash Function
With a hash function, what do we do?
-
You compute the hash of your file.
-
I compute the hash of my file.
-
If they are the same, we’re done.
-
Otherwise you send me your copy.
Example Content Hash with git
git uses hashes (the SHA-1 algorithm) to fingerprint files and commits:
[Figure: github example]
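For file contents, git's SHA-1 fingerprint is computed over a short header ("blob", the size in bytes, and a zero byte) followed by the content itself. The sketch below reproduces that for in-memory data; GitBlobHash and blobId are hypothetical names, and it only covers the SHA-1 object format, not commits or trees.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class GitBlobHash {
    // SHA-1 over "blob <size>\0" followed by the content bytes.
    static String blobId(byte[] content) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            String header = "blob " + content.length + "\0";
            sha1.update(header.getBytes(StandardCharsets.UTF_8));
            sha1.update(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : sha1.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "hello world\n".getBytes(StandardCharsets.UTF_8);
        // Should agree with: echo "hello world" | git hash-object --stdin
        System.out.println(blobId(content));
    }
}
```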
Example git push
More or less, here’s what happens when you push to GitHub.com:
-
Your computer says: "Hi GitHub.com, I have the following files: a6efc501d57b88df337fe904483d25732bb3e45e, 4e292499a1194d0493bd5350408fe3254d2335d3, 20da0fbbf8e8c279bb1edbbe0ac5ae40349edceb, …"
-
Server: "OK, I’ve got 4e292499a1194d0493bd5350408fe3254d2335d3 and a6efc501d57b88df337fe904483d25732bb3e45e but I need 20da0fbbf8e8c279bb1edbbe0ac5ae40349edceb and …"
-
Your computer: "OK, sending those now…"
Hash Collisions
If a hash function produces the same hash for two different inputs this is called a collision.
-
In some cases, particularly if the size of the hash is small, collisions are expected and we plan to deal with them.
-
If the size of the hash is large enough and the hash function is uniform, collisions should never happen and the world will end if they do. (Or at least git will stop working and my world will end.)
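Here is a small-hash collision you can try yourself. Java's String.hashCode is only 32 bits, and the strings "Aa" and "BB" happen to hash to the same value. The sketch (CollisionDemo, collide, and storeBoth are invented names) also shows one case where collisions are planned for: HashMap keeps both keys anyway.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Two different inputs with the same hash value collide.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    // HashMap expects collisions and still stores both keys separately.
    static int storeBoth(String a, String b) {
        Map<String, Integer> map = new HashMap<>();
        map.put(a, 1);
        map.put(b, 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
        System.out.println(collide("Aa", "BB"));
        System.out.println(storeBoth("Aa", "BB"));
    }
}
```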
The Birthday Paradox
In a room with 100 students, what is the probability that at least two will share the same birthday? More than 99.9999%
-
Does anybody know how many you need to get a 50% chance? Only 23!
-
This is bad for our hash functions… collisions are more likely than we might think!
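The numbers above can be checked with a few lines of arithmetic: multiply together the chances that each new person misses every birthday already taken, then subtract from 1. This sketch assumes 365 equally likely birthdays (no leap years); BirthdayParadox and sharedBirthday are names made up for the example.

```java
class BirthdayParadox {
    // Probability that at least two of `people` share a birthday,
    // assuming 365 equally likely days.
    static double sharedBirthday(int people) {
        double allDistinct = 1.0;
        for (int i = 0; i < people; i++) {
            // Person i must avoid the i birthdays already taken
            allDistinct *= (365.0 - i) / 365.0;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        System.out.println(sharedBirthday(22));  // just under 50%
        System.out.println(sharedBirthday(23));  // just over 50%
        System.out.println(sharedBirthday(100)); // very close to 1
    }
}
```

The same reasoning applies to hashes: swap 365 days for the number of possible hash values, and people for the number of items hashed.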
Announcements
-
The final project description has been posted. Please get started!
-
I have office hours MWF from 10AM–12PM in Siebel 2227. Please stop by!
-
Remember to provide feedback on the course using the anonymous feedback form.
-
I’ve started to respond to existing feedback on the forum.