Okio
====
Okio is a library that complements `java.io` and `java.nio` to make it much
easier to access, store, and process your data. It started as a component of
[OkHttp][1], the capable HTTP client included in Android. It's well-exercised
and ready to solve new problems.

ByteStrings and Buffers
-----------------------
Okio is built around two types that pack a lot of capability into a
straightforward API:
* [**ByteString**][3] is an immutable sequence of bytes. For character data, `String`
is fundamental. `ByteString` is String's long-lost brother, making it easy to
treat binary data as a value. This class is ergonomic: it knows how to encode
and decode itself as hex, base64, and UTF-8.
* [**Buffer**][4] is a mutable sequence of bytes. Like `ArrayList`, you don't need
to size your buffer in advance. You read and write buffers as a queue: write
data to the end and read it from the front. There's no obligation to manage
positions, limits, or capacities.
Internally, `ByteString` and `Buffer` do some clever things to save CPU and
memory. If you encode a UTF-8 string as a `ByteString`, it caches a reference to
that string so that if you decode it later, there's no work to do.
`Buffer` is implemented as a linked list of segments. When you move data from
one buffer to another, it _reassigns ownership_ of the segments rather than
copying the data across. This approach is particularly helpful for multithreaded
programs: a thread that talks to the network can exchange data with a worker
thread without any copying or ceremony.

Sources and Sinks
-----------------
An elegant part of the `java.io` design is how streams can be layered for
transformations like encryption and compression. Okio includes its own stream
types called [`Source`][5] and [`Sink`][6] that work like `InputStream` and
`OutputStream`, but with some key differences:
* **Timeouts.** The streams provide access to the timeouts of the underlying
I/O mechanism. Unlike the `java.io` socket streams, both `read()` and
`write()` calls honor timeouts.
* **Easy to implement.** `Source` declares three methods: `read()`, `close()`,
and `timeout()`. There are no hazards like `available()` or single-byte reads
that cause correctness and performance surprises.
* **Easy to use.** Although _implementations_ of `Source` and `Sink` have only
three methods to write, _callers_ are given a rich API with the
[`BufferedSource`][7] and [`BufferedSink`][8] interfaces. These interfaces give you
everything you need in one place.
* **No artificial distinction between byte streams and char streams.** It's all
data. Read and write it as bytes, UTF-8 strings, big-endian 32-bit integers,
little-endian shorts; whatever you want. No more `InputStreamReader`!
* **Easy to test.** The `Buffer` class implements both `BufferedSource` and
`BufferedSink` so your test code is simple and clear.
Sources and sinks interoperate with `InputStream` and `OutputStream`. You can
view any `Source` as an `InputStream`, and you can view any `InputStream` as a
`Source`. Similarly for `Sink` and `OutputStream`.

Presentations
-------------
[A Few “Ok” Libraries][ok_libraries_talk] ([slides][ok_libraries_slides]): An introduction to Okio
and three libraries written with it.
[Decoding the Secrets of Binary Data][encoding_talk] ([slides][encoding_slides]): How data encoding
works and how Okio does it.
[Ok Multiplatform!][ok_multiplatform_talk] ([slides][ok_multiplatform_slides]): How we changed
Okio's implementation language from Java to Kotlin.

Requirements
------------
Okio supports Android 4.0.3+ (API level 15+) and Java 7+.
Okio depends on the [Kotlin standard library][kotlin]. It is a small library with strong
backward-compatibility.

Recipes
-------
We've written some recipes that demonstrate how to solve common problems with
Okio. Read through them to learn about how everything works together.
Cut-and-paste these examples freely; that's what they're for.
### Read a text file line-by-line ([Java][ReadFileLineByLine]/[Kotlin][ReadFileLineByLineKt])
Use `Okio.source(File)` to open a source stream to read a file. The returned
`Source` interface is very small and has limited uses. Instead we wrap the
source with a buffer. This has two benefits:
* **It makes the API more powerful.** Instead of the basic methods offered by
`Source`, `BufferedSource` has dozens of methods to address most common
problems concisely.
* **It makes your program run faster.** Buffering allows Okio to get more done
with fewer I/O operations.
Each `Source` that is opened needs to be closed. The code that opens the stream
is responsible for making sure it is closed.
=== "Java"
Here we use Java's `try` blocks to close our sources automatically.
```java
public void readLines(File file) throws IOException {
  try (Source fileSource = Okio.source(file);
       BufferedSource bufferedSource = Okio.buffer(fileSource)) {
    while (true) {
      String line = bufferedSource.readUtf8Line();
      if (line == null) break;

      if (line.contains("square")) {
        System.out.println(line);
      }
    }
  }
}
```
=== "Kotlin"
Note that static `Okio` methods become extension functions (`Okio.source(file)` =>
`file.source()`), and `use` is used to automatically close the streams:
```kotlin
@Throws(IOException::class)
fun readLines(file: File) {
  file.source().use { fileSource ->
    fileSource.buffer().use { bufferedFileSource ->
      while (true) {
        val line = bufferedFileSource.readUtf8Line() ?: break
        if ("square" in line) {
          println(line)
        }
      }
    }
  }
}
```
The `readUtf8Line()` API reads all of the data up to the next line delimiter:
either `\n`, `\r\n`, or the end of the file. It returns that data as a string,
omitting the delimiter at the end. When it encounters empty lines the method
will return an empty string. If there isn't any more data to read it will
return null.
The above program can be written more compactly by inlining the `fileSource`
variable and by using a fancy `for` loop instead of a `while`:
```java
public void readLines(File file) throws IOException {
  try (BufferedSource source = Okio.buffer(Okio.source(file))) {
    for (String line; (line = source.readUtf8Line()) != null; ) {
      if (line.contains("square")) {
        System.out.println(line);
      }
    }
  }
}
```
In Kotlin, we can wrap invocations of `source.readUtf8Line()` into the `generateSequence` builder to
create a sequence of lines that will end once null is returned. Plus, transforming streams is easy
thanks to the extension functions:
```kotlin
@Throws(IOException::class)
fun readLines(file: File) {
  file.source().buffer().use { source ->
    generateSequence { source.readUtf8Line() }
      .filter { line -> "square" in line }
      .forEach(::println)
  }
}
```
The `readUtf8Line()` method is suitable for parsing most files. For certain
use-cases you may also consider `readUtf8LineStrict()`. It is similar but it
requires that each line is terminated by `\n` or `\r\n`. If it encounters the
end of the file before that it will throw an `EOFException`. The strict variant
also permits a byte limit to defend against malformed input.
```java
public void readLines(File file) throws IOException {
  try (BufferedSource source = Okio.buffer(Okio.source(file))) {
    while (!source.exhausted()) {
      String line = source.readUtf8LineStrict(1024L);
      if (line.contains("square")) {
        System.out.println(line);
      }
    }
  }
}
```
Here's a similar example written in Kotlin:
```kotlin
@Throws(IOException::class)
fun readLines(file: File) {
  file.source().buffer().use { source ->
    while (!source.exhausted()) {
      val line = source.readUtf8LineStrict(1024)
      if ("square" in line) {
        println(line)
      }
    }
  }
}
```
### Write a text file ([Java][WriteFile]/[Kotlin][WriteFileKt])
Above we used a `Source` and a `BufferedSource` to read a file. To write, we use
a `Sink` and a `BufferedSink`. The advantages of buffering are the same: a more
capable API and better performance.
```java
public void writeEnv(File file) throws IOException {
  try (Sink fileSink = Okio.sink(file);
       BufferedSink bufferedSink = Okio.buffer(fileSink)) {
    for (Map.Entry<String, String> entry : System.getenv().entrySet()) {
      bufferedSink.writeUtf8(entry.getKey());
      bufferedSink.writeUtf8("=");
      bufferedSink.writeUtf8(entry.getValue());
      bufferedSink.writeUtf8("\n");
    }
  }
}
```
There isn't an API to write a line of input; instead we manually insert our own
newline character. Most programs should hardcode `"\n"` as the newline
character. In rare situations you may use `System.lineSeparator()` instead of
`"\n"`: it returns `"\r\n"` on Windows and `"\n"` everywhere else.
We can write the above program more compactly by inlining the `fileSink`
variable and by taking advantage of method chaining:
=== "Java"
```Java
public void writeEnv(File file) throws IOException {
  try (BufferedSink sink = Okio.buffer(Okio.sink(file))) {
    for (Map.Entry<String, String> entry : System.getenv().entrySet()) {
      sink.writeUtf8(entry.getKey())
          .writeUtf8("=")
          .writeUtf8(entry.getValue())
          .writeUtf8("\n");
    }
  }
}
```
=== "Kotlin"
```Kotlin
@Throws(IOException::class)
fun writeEnv(file: File) {
  file.sink().buffer().use { sink ->
    for ((key, value) in System.getenv()) {
      sink.writeUtf8(key)
      sink.writeUtf8("=")
      sink.writeUtf8(value)
      sink.writeUtf8("\n")
    }
  }
}
```
In the above code we make four calls to `writeUtf8()`. Making four calls is
more efficient than the code below because the VM doesn't have to create and
garbage collect a temporary string.
```java
sink.writeUtf8(entry.getKey() + "=" + entry.getValue() + "\n"); // Slower!
```
### UTF-8 ([Java][ExploreCharsets]/[Kotlin][ExploreCharsetsKt])
In the above APIs you can see that Okio really likes UTF-8. Early computer
systems suffered many incompatible character encodings: ISO-8859-1, ShiftJIS,
ASCII, EBCDIC, etc. Writing software to support multiple character sets was
awful and we didn't even have emoji! Today we're lucky that the world has
standardized on UTF-8 everywhere, with some rare uses of other charsets in
legacy systems.
If you need another character set, `readString()` and `writeString()` are there
for you. These methods require that you specify a character set. Otherwise you
may accidentally create data that is only readable by the local computer. Most
programs should use the UTF-8 methods only.
When encoding strings you need to be mindful of the different ways that strings
are represented and encoded. When a glyph has an accent or another adornment
it may be represented as a single complex code point (`é`) or as a simple code
point (`e`) followed by its modifiers (`´`). When the entire glyph is a single
code point that's called [NFC][nfc]; when it's multiple it's [NFD][nfd].
Though we use UTF-8 whenever we read or write strings in I/O, when they are in
memory Java Strings use an obsolete character encoding called UTF-16. It is a
bad encoding because it uses a 16-bit `char` for most characters, but some don't
fit. In particular, most emoji use two Java chars. This is problematic because
`String.length()` returns a surprising result: the number of UTF-16 chars and
not the natural number of glyphs.
| | Café 🍩 | Café 🍩 |
| --------------------: | :---------------------------| :------------------------------|
| Form | [NFC][nfc] | [NFD][nfd] |
| Code Points | `c  a  f  é       🍩     ` | `c  a  f  e  ´       🍩     ` |
| UTF-8 bytes | `43 61 66 c3a9 20 f09f8da9` | `43 61 66 65 cc81 20 f09f8da9` |
| String.codePointCount | 6 | 7 |
| String.length | 7 | 8 |
| Utf8.size | 10 | 11 |
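The counts in the table can be reproduced with the JDK alone. The following is a minimal sketch and not part of Okio: the class name `StringSizes` and its helpers are hypothetical, and it counts UTF-8 bytes by actually encoding the string, which is exactly the work that `Utf8.size()` lets you skip.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical helper class for illustration only; it uses no Okio APIs.
final class StringSizes {
  /** Counts UTF-8 bytes the simple way: by encoding the whole string. */
  static int utf8ByteCount(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  /** Summarizes the three different "sizes" a Java string can report. */
  static String describe(String s) {
    return "codePointCount=" + s.codePointCount(0, s.length())
        + " length=" + s.length()
        + " utf8Bytes=" + utf8ByteCount(s);
  }

  public static void main(String[] args) {
    // U+00E9 is the precomposed é; the doughnut U+1F369 needs a surrogate pair.
    String nfc = "Caf\u00e9 \uD83C\uDF69";
    // NFD splits é into 'e' followed by the combining acute accent U+0301.
    String nfd = Normalizer.normalize(nfc, Normalizer.Form.NFD);

    System.out.println("NFC: " + describe(nfc));
    System.out.println("NFD: " + describe(nfd));
  }
}
```

Running it should print 6, 7 and 10 for the NFC string and 7, 8 and 11 for the NFD string, matching the table.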
For the most part Okio lets you ignore these problems and focus on your data.
But when you need them, there are convenient APIs for dealing with low-level
UTF-8 strings.
Use `Utf8.size()` to count the number of bytes required to encode a string as
UTF-8 without actually encoding it. This is handy in length-prefixed encodings
like protocol buffers.
Use `BufferedSource.readUtf8CodePoint()` to read a single variable-length code
point, and `BufferedSink.writeUtf8CodePoint()` to write one.
### Golden Values ([Java][GoldenValue]/[Kotlin][GoldenValueKt])
Okio likes testing. The library itself is heavily tested, and it has features
that are often helpful when testing application code. One pattern we've found to
be quite useful is “golden value” testing. The goal of such tests is to confirm
that data encoded with earlier versions of a program can safely be decoded by
the current program.
We'll illustrate this by encoding a value using Java Serialization, though we
must disclaim that Java Serialization is an awful encoding system and most
programs should prefer other formats like JSON or protobuf! In any case, here's
a method that takes an object, serializes it, and returns the result as a
`ByteString`:
=== "Java"
```Java
private ByteString serialize(Object o) throws IOException {
  Buffer buffer = new Buffer();
  try (ObjectOutputStream objectOut = new ObjectOutputStream(buffer.outputStream())) {
    objectOut.writeObject(o);
  }
  return buffer.readByteString();
}
```
=== "Kotlin"
```Kotlin
@Throws(IOException::class)
private fun serialize(o: Any?): ByteString {
  val buffer = Buffer()
  ObjectOutputStream(buffer.outputStream()).use { objectOut ->
    objectOut.writeObject(o)
  }
  return buffer.readByteString()
}
```
There's a lot going on here.

1. We create a buffer as a holding space for our serialized data. It's a convenient
replacement for `ByteArrayOutputStream`.
2. We ask the buffer for its output stream. Writes to a buffer or its output stream
always append data to the end of the buffer.
3. We create an `ObjectOutputStream` (the encoding API for Java serialization) and
write our object. The try block takes care of closing the stream for us. Note
that closing a buffer has no effect.
4. Finally we read a byte string from the buffer. The `readByteString()` method
allows us to specify how many bytes to read; here we don't specify a count in
order to read the entire thing. Reads from a buffer always consume data from
the front of the buffer.
With our `serialize()` method handy we are ready to compute and print a golden
value.
=== "Java"
```Java
Point point = new Point(8.0, 15.0);
ByteString pointBytes = serialize(point);
System.out.println(pointBytes.base64());
```
=== "Kotlin"
```Kotlin
val point = Point(8.0, 15.0)
val pointBytes = serialize(point)
println(pointBytes.base64())
```
We print the `ByteString` as [base64][base64] because it's a compact format
that's suitable for embedding in a test case. The program prints this:
```
rO0ABXNyAB5va2lvLnNhbXBsZXMuR29sZGVuVmFsdWUkUG9pbnTdUW8rMji1IwIAAkQAAXhEAAF5eHBAIAAAAAAAAEAuAAAAAAAA
```
That's our golden value! We can embed it in our test case using base64 again
to convert it back into a `ByteString`:
=== "Java"
```Java
ByteString goldenBytes = ByteString.decodeBase64("rO0ABXNyAB5va2lvLnNhbXBsZ"
+ "XMuR29sZGVuVmFsdWUkUG9pbnTdUW8rMji1IwIAAkQAAXhEAAF5eHBAIAAAAAAAAEAuA"
+ "AAAAAAA");
```
=== "Kotlin"
```Kotlin
val goldenBytes = ("rO0ABXNyACRva2lvLnNhbXBsZXMuS290bGluR29sZGVuVmFsdWUkUG9pbnRF9yaY7cJ9EwIAA" +
"kQAAXhEAAF5eHBAIAAAAAAAAEAuAAAAAAAA").decodeBase64()
```
The next step is to deserialize the `ByteString` back into our value class. This
method reverses the `serialize()` method above: we append a byte string to a
buffer then consume it using an `ObjectInputStream`:
=== "Java"
```Java
private Object deserialize(ByteString byteString) throws IOException, ClassNotFoundException {
  Buffer buffer = new Buffer();
  buffer.write(byteString);
  try (ObjectInputStream objectIn = new ObjectInputStream(buffer.inputStream())) {
    return objectIn.readObject();
  }
}
```
=== "Kotlin"
```Kotlin
@Throws(IOException::class, ClassNotFoundException::class)
private fun deserialize(byteString: ByteString): Any? {
  val buffer = Buffer()
  buffer.write(byteString)
  ObjectInputStream(buffer.inputStream()).use { objectIn ->
    return objectIn.readObject()
  }
}
```
Now we can test the decoder against the golden value:
=== "Java"
```Java
ByteString goldenBytes = ByteString.decodeBase64("rO0ABXNyAB5va2lvLnNhbXBsZ"
+ "XMuR29sZGVuVmFsdWUkUG9pbnTdUW8rMji1IwIAAkQAAXhEAAF5eHBAIAAAAAAAAEAuA"
+ "AAAAAAA");
Point decoded = (Point) deserialize(goldenBytes);
assertEquals(new Point(8.0, 15.0), decoded);
```
=== "Kotlin"
```Kotlin
val goldenBytes = ("rO0ABXNyACRva2lvLnNhbXBsZXMuS290bGluR29sZGVuVmFsdWUkUG9pbnRF9yaY7cJ9EwIAA" +
"kQAAXhEAAF5eHBAIAAAAAAAAEAuAAAAAAAA").decodeBase64()!!
val decoded = deserialize(goldenBytes) as Point
assertEquals(point, decoded)
```
With this test we can change the serialization of the `Point` class without
breaking compatibility.
### Write a binary file ([Java][BitmapEncoder]/[Kotlin][BitmapEncoderKt])
Encoding a binary file is not unlike encoding a text file. Okio uses the same
`BufferedSink` and `BufferedSource` interfaces for both. This is handy for binary
formats that include both byte and character data.
Writing binary data is more hazardous than text because if you make a mistake it
is often quite difficult to diagnose. Avoid such mistakes by being careful
around these traps:
* **The width of each field.** This is the number of bytes used. Okio doesn't
include a mechanism to emit partial bytes. If you need that, you'll need to
do your own bit shifting and masking before writing.
* **The endianness of each field.** All fields that have more than one byte
have _endianness_: whether the bytes are ordered most-significant to least
(big endian) or least-significant to most (little endian). Okio uses the `Le`
suffix for little-endian methods; methods without a suffix are big-endian.
* **Signed vs. Unsigned.** Java doesn't have unsigned primitive types (except
for `char`!) so coping with this is often something that happens at the
application layer. To make this a little easier Okio accepts `int` types for
`writeByte()` and `writeShort()`. You can pass an “unsigned” byte like 255
and Okio will do the right thing.
| Method | Width | Endianness | Value | Encoded Value |
| :----------- | ----: | :--------- | --------------: | :------------------------ |
| writeByte | 1 | | 3 | `03` |
| writeShort | 2 | big | 3 | `00 03` |
| writeInt | 4 | big | 3 | `00 00 00 03` |
| writeLong | 8 | big | 3 | `00 00 00 00 00 00 00 03` |
| writeShortLe | 2 | little | 3 | `03 00` |
| writeIntLe | 4 | little | 3 | `03 00 00 00` |
| writeLongLe | 8 | little | 3 | `03 00 00 00 00 00 00 00` |
| writeByte | 1 | | Byte.MAX_VALUE | `7f` |
| writeShort | 2 | big | Short.MAX_VALUE | `7f ff` |
| writeInt | 4 | big | Int.MAX_VALUE | `7f ff ff ff` |
| writeLong | 8 | big | Long.MAX_VALUE | `7f ff ff ff ff ff ff ff` |
| writeShortLe | 2 | little | Short.MAX_VALUE | `ff 7f` |
| writeIntLe | 4 | little | Int.MAX_VALUE | `ff ff ff 7f` |
| writeLongLe | 8 | little | Long.MAX_VALUE | `ff ff ff ff ff ff ff 7f` |
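The byte orders in the table can be checked without Okio. This sketch uses the JDK's `java.nio.ByteBuffer` as a stand-in for `BufferedSink`; the class `EndiannessDemo` and its methods are hypothetical names chosen for illustration, and the output format simply mirrors the "Encoded Value" column.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; ByteBuffer stands in for BufferedSink here.
final class EndiannessDemo {
  /** Formats bytes as space-separated, two-digit lowercase hex. */
  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(String.format("%02x", b & 0xff));
    }
    return sb.toString();
  }

  /** Encodes a 4-byte int in the requested byte order. */
  static String encodeInt(int value, ByteOrder order) {
    return hex(ByteBuffer.allocate(4).order(order).putInt(value).array());
  }

  /** Encodes the low 16 bits of value as a 2-byte short in the requested byte order. */
  static String encodeShort(int value, ByteOrder order) {
    return hex(ByteBuffer.allocate(2).order(order).putShort((short) value).array());
  }

  public static void main(String[] args) {
    System.out.println(encodeInt(3, ByteOrder.BIG_ENDIAN));      // like writeInt(3)
    System.out.println(encodeInt(3, ByteOrder.LITTLE_ENDIAN));   // like writeIntLe(3)
    System.out.println(encodeShort(Short.MAX_VALUE, ByteOrder.BIG_ENDIAN));
    System.out.println(encodeShort(Short.MAX_VALUE, ByteOrder.LITTLE_ENDIAN));
  }
}
```

The four lines printed are `00 00 00 03`, `03 00 00 00`, `7f ff` and `ff 7f`, the same encodings the table lists for the corresponding Okio methods.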
This code encodes a bitmap following the [BMP file format][bmp].
=== "Java"
```Java
void encode(Bitmap bitmap, BufferedSink sink) throws IOException {
  int height = bitmap.height();
  int width = bitmap.width();

  int bytesPerPixel = 3;
  int rowByteCountWithoutPadding = (bytesPerPixel * width);
  int rowByteCount = ((rowByteCountWithoutPadding + 3) / 4) * 4;
  int pixelDataSize = rowByteCount * height;
  int bmpHeaderSize = 14;
  int dibHeaderSize = 40;

  // BMP Header
  sink.writeUtf8("BM"); // ID.
  sink.writeIntLe(bmpHeaderSize + dibHeaderSize + pixelDataSize); // File size.
  sink.writeShortLe(0); // Unused.
  sink.writeShortLe(0); // Unused.
  sink.writeIntLe(bmpHeaderSize + dibHeaderSize); // Offset of pixel data.

  // DIB Header
  sink.writeIntLe(dibHeaderSize);
  sink.writeIntLe(width);
  sink.writeIntLe(height);
  sink.writeShortLe(1); // Color plane count.
  sink.writeShortLe(bytesPerPixel * Byte.SIZE);
  sink.writeIntLe(0); // No compression.
  sink.writeIntLe(pixelDataSize); // Size of bitmap data including padding.
  sink.writeIntLe(2835); // Horizontal print resolution in pixels/meter. (72 dpi).
  sink.writeIntLe(2835); // Vertical print resolution in pixels/meter. (72 dpi).
  sink.writeIntLe(0); // Palette color count.
  sink.writeIntLe(0); // 0 important colors.

  // Pixel data.
  for (int y = height - 1; y >= 0; y--) {
    for (int x = 0; x < width; x++) {
      sink.writeByte(bitmap.blue(x, y));
      sink.writeByte(bitmap.green(x, y));
      sink.writeByte(bitmap.red(x, y));
    }

    // Padding for 4-byte alignment.
    for (int p = rowByteCountWithoutPadding; p < rowByteCount; p++) {
      sink.writeByte(0);
    }
  }
}
```
=== "Kotlin"
```Kotlin
@Throws(IOException::class)
fun encode(bitmap: Bitmap, sink: BufferedSink) {
  val height = bitmap.height
  val width = bitmap.width
  val bytesPerPixel = 3
  val rowByteCountWithoutPadding = bytesPerPixel * width
  val rowByteCount = (rowByteCountWithoutPadding + 3) / 4 * 4
  val pixelDataSize = rowByteCount * height
  val bmpHeaderSize = 14
  val dibHeaderSize = 40

  // BMP Header
  sink.writeUtf8("BM") // ID.
  sink.writeIntLe(bmpHeaderSize + dibHeaderSize + pixelDataSize) // File size.
  sink.writeShortLe(0) // Unused.
  sink.writeShortLe(0) // Unused.
  sink.writeIntLe(bmpHeaderSize + dibHeaderSize) // Offset of pixel data.

  // DIB Header
  sink.writeIntLe(dibHeaderSize)
  sink.writeIntLe(width)
  sink.writeIntLe(height)
  sink.writeShortLe(1) // Color plane count.
  sink.writeShortLe(bytesPerPixel * Byte.SIZE_BITS)
  sink.writeIntLe(0) // No compression.
  sink.writeIntLe(pixelDataSize) // Size of bitmap data including padding.
  sink.writeIntLe(2835) // Horizontal print resolution in pixels/meter. (72 dpi).
  sink.writeIntLe(2835) // Vertical print resolution in pixels/meter. (72 dpi).
  sink.writeIntLe(0) // Palette color count.
  sink.writeIntLe(0) // 0 important colors.

  // Pixel data.
  for (y in height - 1 downTo 0) {
    for (x in 0 until width) {
      sink.writeByte(bitmap.blue(x, y))
      sink.writeByte(bitmap.green(x, y))
      sink.writeByte(bitmap.red(x, y))
    }

    // Padding for 4-byte alignment.
    for (p in rowByteCountWithoutPadding until rowByteCount) {
      sink.writeByte(0)
    }
  }
}
```
The trickiest part of this program is the format's required padding. The BMP
format expects each row to begin on a 4-byte boundary so it is necessary to add
zeros to maintain the alignment.
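To make the padding arithmetic concrete, here is a small sketch that isolates the `rowByteCount` computation from the encoder. The class `BmpRowPadding` and its method names are hypothetical; the formula is the same round-up-to-a-multiple-of-4 expression the encoder uses.

```java
// Hypothetical helper class that isolates the row padding arithmetic.
final class BmpRowPadding {
  /** Bytes in one row after rounding up to the next multiple of 4. */
  static int paddedRowByteCount(int width, int bytesPerPixel) {
    int unpadded = bytesPerPixel * width;
    // Adding 3 and then using integer division rounds up to a 4-byte boundary.
    return ((unpadded + 3) / 4) * 4;
  }

  /** Number of zero bytes appended to each row. */
  static int paddingBytes(int width, int bytesPerPixel) {
    return paddedRowByteCount(width, bytesPerPixel) - bytesPerPixel * width;
  }

  public static void main(String[] args) {
    for (int width = 1; width <= 5; width++) {
      System.out.println("width=" + width
          + " rowByteCount=" + paddedRowByteCount(width, 3)
          + " padding=" + paddingBytes(width, 3));
    }
  }
}
```

With 3 bytes per pixel, widths 1 through 5 give row sizes of 4, 8, 12, 12 and 16 bytes, so the padding is 1, 2, 3, 0 and 1 bytes respectively. Only widths that are multiples of 4 need no padding.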
Encoding other binary formats is usually quite similar. Some tips:
* Write tests with golden values! Confirming that your program emits the
expected result can make debugging easier.
* Use `Utf8.size()` to compute the number of bytes of an encoded string. This
is essential for length-prefixed formats.
* Use `Float.floatToIntBits()` and `Double.doubleToLongBits()` to encode
floating point values.
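As a sketch of the last tip, the following shows what the floating point conversions produce. The class name `FloatBitsDemo` is hypothetical; in an Okio program the resulting `int` or `long` would then be passed to a method such as `writeInt()` or `writeLong()`, which this JDK-only example leaves out.

```java
// Hypothetical demo class; it shows only the bit conversions, not the Okio write.
final class FloatBitsDemo {
  /** IEEE 754 bit pattern of a float, as hex. */
  static String floatBitsHex(float value) {
    return Integer.toHexString(Float.floatToIntBits(value));
  }

  /** IEEE 754 bit pattern of a double, as hex. */
  static String doubleBitsHex(double value) {
    return Long.toHexString(Double.doubleToLongBits(value));
  }

  public static void main(String[] args) {
    System.out.println(floatBitsHex(1.0f));  // 3f800000
    System.out.println(doubleBitsHex(1.0));  // 3ff0000000000000

    // Decoding reverses the conversion.
    float decoded = Float.intBitsToFloat(Float.floatToIntBits(8.15f));
    System.out.println(decoded == 8.15f);    // true
  }
}
```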
### Communicate on a Socket ([Java][SocksProxyServer]/[Kotlin][SocksProxyServerKt])
Sending and receiving data over the network is a bit like writing and reading
files. We use `BufferedSink` to encode output and `BufferedSource` to decode
input. Like files, network protocols can be text, binary, or a mix of both. But
there are also some substantial differences between the network and the
file system.
With a file you're either reading or writing but with the network you can do
both! Some protocols handle this by taking turns: write a request, read a
response, repeat. You can implement this kind of protocol with a single thread.
In other protocols you may read and write simultaneously. Typically you'll want
one dedicated thread for reading. For writing you can use either a dedicated
thread or use `synchronized` so that multiple threads can share a sink. Okio's
streams are not safe for concurrent use.
Sinks buffer outbound data to minimize I/O operations. This is efficient but it
means you must manually call `flush()` to transmit data. Typically
message-oriented protocols flush after each message. Note that Okio will
automatically flush when the buffered data exceeds some threshold. This is
intended to save memory and you shouldn't rely on it for interactive protocols.
Okio builds on `java.io.Socket` for connectivity. Create your socket as a server
or as a client, then use `Okio.source(Socket)` to read and `Okio.sink(Socket)`
to write. These APIs also work with `SSLSocket`. You should use SSL unless you
have a very good reason not to!
Cancel a socket from any thread by calling `Socket.close()`; this will cause its
sources and sinks to immediately fail with an `IOException`. You can also
configure timeouts for all socket operations. You don't need a reference to the
socket to adjust timeouts: `Source` and `Sink` expose timeouts directly. This
API works even if the streams are decorated.
As a complete example of networking with Okio we wrote a [basic SOCKS
proxy][SocksProxyServer] server. Some highlights:
=== "Java"
```Java
Socket fromSocket = ...
BufferedSource fromSource = Okio.buffer(Okio.source(fromSocket));
BufferedSink fromSink = Okio.buffer(Okio.sink(fromSocket));
```
=== "Kotlin"
```Kotlin
val fromSocket: Socket = ...
val fromSource = fromSocket.source().buffer()
val fromSink = fromSocket.sink().buffer()
```
Creating sources and sinks for sockets is the same as creating them for files.
Once you create a `Source` or `Sink` for a socket you must not use its
`InputStream` or `OutputStream`, respectively.
=== "Java"
```Java
Buffer buffer = new Buffer();
for (long byteCount; (byteCount = source.read(buffer, 8192L)) != -1; ) {
  sink.write(buffer, byteCount);
  sink.flush();
}
```
=== "Kotlin"
```Kotlin
val buffer = Buffer()
var byteCount: Long
while (source.read(buffer, 8192L).also { byteCount = it } != -1L) {
  sink.write(buffer, byteCount)
  sink.flush()
}
```
The above loop copies data from the source to the sink, flushing after each
read. If we didn't need the flushing we could replace this loop with a single
call to `BufferedSink.writeAll(Source)`.
The `8192` argument to `read()` is the maximum number of bytes to read before
returning. We could have passed any value here, but we like 8 KiB because that's
the largest value Okio can do in a single system call. Most of the time
application code doesn't need to deal with such limits!
=== "Java"
```Java
int addressType = fromSource.readByte() & 0xff;
int port = fromSource.readShort() & 0xffff;
```
=== "Kotlin"
```Kotlin
val addressType = fromSource.readByte().toInt() and 0xff
val port = fromSource.readShort().toInt() and 0xffff
```
Okio uses signed types like `byte` and `short`, but often protocols want
unsigned values. The bitwise `&` operator is Java's preferred idiom to convert
a signed value into an unsigned value. Here's a cheat sheet for bytes, shorts,
and ints:
| Type | Signed Range | Unsigned Range | Signed to Unsigned |
| :---- | :---------------------------: | :--------------- | :-------------------------- |
| byte | -128..127 | 0..255 | `int u = s & 0xff;` |
| short | -32,768..32,767 | 0..65,535 | `int u = s & 0xffff;` |
| int | -2,147,483,648..2,147,483,647 | 0..4,294,967,295 | `long u = s & 0xffffffffL;` |
Java has no primitive type that can represent unsigned longs.
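The cheat sheet can be exercised directly. This is a minimal sketch with hypothetical names (`UnsignedDemo` and its three helpers); each helper is just the masking expression from the matching table row, widened to a type large enough to hold the unsigned range.

```java
// Hypothetical demo class wrapping the masking idioms from the cheat sheet.
final class UnsignedDemo {
  /** byte -128..127 becomes int 0..255. */
  static int unsignedByte(byte s) {
    return s & 0xff;
  }

  /** short -32,768..32,767 becomes int 0..65,535. */
  static int unsignedShort(short s) {
    return s & 0xffff;
  }

  /** int becomes long 0..4,294,967,295; the L suffix keeps the mask 64 bits wide. */
  static long unsignedInt(int s) {
    return s & 0xffffffffL;
  }

  public static void main(String[] args) {
    System.out.println(unsignedByte((byte) -1));    // 255
    System.out.println(unsignedShort((short) -1));  // 65535
    System.out.println(unsignedInt(-1));            // 4294967295
    System.out.println(unsignedByte((byte) 0x7f));  // 127, non-negative values are unchanged
  }
}
```

On Java 8 and later, `Byte.toUnsignedInt()`, `Short.toUnsignedInt()` and `Integer.toUnsignedLong()` perform the same conversions, and `Long.toUnsignedString()` can format a `long` as if it were unsigned.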
### Hashing ([Java][Hashing]/[Kotlin][HashingKt])
We're bombarded by hashing in our lives as Java programmers. Early on we're introduced to the
`hashCode()` method, something we know we need to override otherwise unforeseen bad things happen.
Later we're shown `LinkedHashMap` and its friends. These build on that `hashCode()` method to
organize data for fast retrieval.
Elsewhere we have cryptographic hash functions. These get used all over the place. HTTPS
certificates, Git commits, BitTorrent integrity checking, and Blockchain blocks all use
cryptographic hashes. Good use of hashes can improve the performance, privacy, security, and
simplicity of an application.
Each cryptographic hash function accepts a variable-length stream of input bytes and produces a
fixed-length byte string value called the “hash”. Hash functions have these important qualities:
* Deterministic: each input always produces the same output.
* Uniform: each output byte string is equally likely. It is very difficult to find or create pairs
of different inputs that yield the same output. This is called a “collision”.
* Non-reversible: knowing an output doesn't help you to find the input. Note that if you know some
possible inputs you can hash them to see if their hashes match.
* Well-known: the hash is implemented everywhere and rigorously understood.
Good hash functions are very cheap to compute (dozens of microseconds) and expensive to reverse
(quintillions of millennia). Steady advances in computing and mathematics have caused once-great hash
functions to become inexpensive to reverse. When choosing a hash function, beware that not all are
created equal! Okio supports these well-known cryptographic hash functions:
* **MD5**: a 128-bit (16 byte) cryptographic hash. It is both insecure and obsolete because it is
inexpensive to reverse! This hash is offered because it is popular and convenient for use in
legacy systems that are not security-sensitive.
* **SHA-1**: a 160-bit (20 byte) cryptographic hash. It was recently demonstrated that it is
feasible to create SHA-1 collisions. Consider upgrading from SHA-1 to SHA-256.
* **SHA-256**: a 256-bit (32 byte) cryptographic hash. SHA-256 is widely understood and expensive
to reverse. This is the hash most systems should use.
* **SHA-512**: a 512-bit (64 byte) cryptographic hash. It is expensive to reverse.
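The digest sizes listed above, and the deterministic property, can be observed with `java.security.MessageDigest` from the JDK. This is a sketch only: `DigestSizes` and its helpers are hypothetical names, and it calls `MessageDigest` directly rather than going through Okio's `ByteString` or `Buffer` methods.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; it calls the JDK directly instead of Okio.
final class DigestSizes {
  /** Hashes input with the named JDK algorithm, e.g. "SHA-256". */
  static byte[] digest(String algorithm, byte[] input) {
    try {
      return MessageDigest.getInstance(algorithm).digest(input);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalArgumentException("unsupported algorithm: " + algorithm, e);
    }
  }

  /** Lowercase hex, the conventional human-readable form of a hash. */
  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02x", b & 0xff));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    byte[] input = "square".getBytes(StandardCharsets.UTF_8);
    for (String algorithm : new String[] {"MD5", "SHA-1", "SHA-256", "SHA-512"}) {
      byte[] hash = digest(algorithm, input);
      System.out.println(algorithm + ": " + hash.length + " bytes, " + hex(hash));
    }
  }
}
```

Each algorithm always yields the same number of bytes (16, 20, 32 and 64) regardless of the input length, and hashing the same input twice yields the same value.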
Each hash creates a `ByteString` of the specified length. Use `hex()` to get the conventional
human-readable form. Or leave it as a `ByteString` because that's a convenient model type!
Okio can produce cryptographic hashes from byte strings:
=== "Java"
```Java
ByteString byteString = readByteString(new File("README.md"));
System.out.println(" md5: " + byteString.md5().hex());
System.out.println(" sha1: " + byteString.sha1().hex());
System.out.println("sha256: " + byteString.sha256().hex());
System.out.println("sha512: " + byteString.sha512().hex());
```
=== "Kotlin"
```Kotlin
val byteString = readByteString(File("README.md"))
println(" md5: " + byteString.md5().hex())
println(" sha1: " + byteString.sha1().hex())
println(" sha256: " + byteString.sha256().hex())
println(" sha512: " + byteString.sha512().hex())
```
From buffers:
=== "Java"
```Java
Buffer buffer = readBuffer(new File("README.md"));
System.out.println(" md5: " + buffer.md5().hex());
System.out.println(" sha1: " + buffer.sha1().hex());
System.out.println("sha256: " + buffer.sha256().hex());
System.out.println("sha512: " + buffer.sha512().hex());
```
=== "Kotlin"
```Kotlin
val buffer = readBuffer(File("README.md"))
println(" md5: " + buffer.md5().hex())
println(" sha1: " + buffer.sha1().hex())
println(" sha256: " + buffer.sha256().hex())
println(" sha512: " + buffer.sha512().hex())
```
While streaming from a source:
=== "Java"
```Java
try (HashingSink hashingSink = HashingSink.sha256(Okio.blackhole());
     BufferedSource source = Okio.buffer(Okio.source(file))) {
  source.readAll(hashingSink);
  System.out.println("sha256: " + hashingSink.hash().hex());
}
```
=== "Kotlin"
```Kotlin
sha256(blackholeSink()).use { hashingSink ->
  file.source().buffer().use { source ->
    source.readAll(hashingSink)
    println(" sha256: " + hashingSink.hash.hex())
  }
}
```
While streaming to a sink:
=== "Java"
```Java
try (HashingSink hashingSink = HashingSink.sha256(Okio.blackhole());
     BufferedSink sink = Okio.buffer(hashingSink);
     Source source = Okio.source(file)) {
  sink.writeAll(source);
  sink.close(); // Emit anything buffered.
  System.out.println("sha256: " + hashingSink.hash().hex());
}
```
=== "Kotlin"
```Kotlin
sha256(blackholeSink()).use { hashingSink ->
  hashingSink.buffer().use { sink ->
    file.source().use { source ->
      sink.writeAll(source)
      sink.close() // Emit anything buffered.
      println(" sha256: " + hashingSink.hash.hex())
    }
  }
}
```
Okio also supports HMAC (Hash-based Message Authentication Code), which combines a secret and a hash.
Applications use HMAC for data integrity and authentication.
=== "Java"
```Java
ByteString secret = ByteString.decodeHex("7065616e7574627574746572");
System.out.println("hmacSha256: " + byteString.hmacSha256(secret).hex());
```
=== "Kotlin"
```Kotlin
val secret = "7065616e7574627574746572".decodeHex()
println("hmacSha256: " + byteString.hmacSha256(secret).hex())
```
As with hashing, you can generate an HMAC from a `ByteString`, `Buffer`, `HashingSource`, or
`HashingSink`. Note that Okio doesn't implement HMAC for MD5. Okio uses Java's
`java.security.MessageDigest` for cryptographic hashes and `javax.crypto.Mac` for HMAC.
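Because Okio delegates to `javax.crypto.Mac`, the same value can be computed with the JDK alone,
which can be handy for cross-checking a result. The following is a minimal sketch, not Okio's
implementation: the class `HmacSketch`, the method `hmacSha256Hex`, and the message bytes are
hypothetical names and data chosen for illustration. The secret matches the one used in the
example above, whose hex decodes to the ASCII text `peanutbutter`.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class HmacSketch {
  /** Computes HMAC-SHA256 of {@code message} under {@code secret} as lowercase hex. */
  static String hmacSha256Hex(byte[] secret, byte[] message) {
    try {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(secret, "HmacSHA256"));
      byte[] digest = mac.doFinal(message);

      // Hex-encode the 32-byte result, two characters per byte.
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (GeneralSecurityException e) {
      // HmacSHA256 is a standard JDK algorithm, so this is not expected in practice.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Same secret as the hex string 7065616e7574627574746572.
    byte[] secret = "peanutbutter".getBytes(StandardCharsets.US_ASCII);
    // Hypothetical message; substitute the bytes you want to authenticate.
    byte[] message = "hello world\n".getBytes(StandardCharsets.UTF_8);
    System.out.println("hmacSha256: " + hmacSha256Hex(secret, message));
  }
}
```

Given the same secret and the same bytes, this should print the same hex string as
`byteString.hmacSha256(secret).hex()`.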
### Encryption and Decryption
Use `Okio.cipherSink(Sink, Cipher)` or `Okio.cipherSource(Source, Cipher)` to encrypt or decrypt a
stream using a block cipher.
Callers are responsible for initializing the encryption or decryption cipher with the chosen
algorithm, the key, and any algorithm-specific parameters such as the initialization vector. The
following example shows typical usage with AES encryption, in which the `key` and `iv` parameters
should both be 16 bytes long.
```java
void encryptAes(ByteString bytes, File file, byte[] key, byte[] iv)
    throws GeneralSecurityException, IOException {
  Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
  cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
  try (BufferedSink sink = Okio.buffer(Okio.cipherSink(Okio.sink(file), cipher))) {
    sink.write(bytes);
  }
}

ByteString decryptAesToByteString(File file, byte[] key, byte[] iv)
    throws GeneralSecurityException, IOException {
  Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
  cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
  try (BufferedSource source = Okio.buffer(Okio.cipherSource(Okio.source(file), cipher))) {
    return source.readByteString();
  }
}
```
In Kotlin, these encryption and decryption methods are extensions on `Sink` and `Source`:
```kotlin
fun encryptAes(bytes: ByteString, file: File, key: ByteArray, iv: ByteArray) {
  val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
  cipher.init(Cipher.ENCRYPT_MODE, SecretKeySpec(key, "AES"), IvParameterSpec(iv))
  val cipherSink = file.sink().cipherSink(cipher)
  cipherSink.buffer().use {
    it.write(bytes)
  }
}

fun decryptAesToByteString(file: File, key: ByteArray, iv: ByteArray): ByteString {
  val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
  cipher.init(Cipher.DECRYPT_MODE, SecretKeySpec(key, "AES"), IvParameterSpec(iv))
  val cipherSource = file.source().cipherSource(cipher)
  return cipherSource.buffer().use {
    it.readByteString()
  }
}
```
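The functions above take `key` and `iv` as given. One way to produce them, sketched here with the
JDK only and no Okio calls, is to draw both from `java.security.SecureRandom`. The class
`AesParametersSketch` and its methods `randomBytes` and `aesCbc` are hypothetical names for this
illustration. The in-memory round trip is included only to show that decryption needs the same key
and IV that were used for encryption.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class AesParametersSketch {
  private static final SecureRandom RANDOM = new SecureRandom();

  /** Returns {@code byteCount} cryptographically strong random bytes. */
  static byte[] randomBytes(int byteCount) {
    byte[] result = new byte[byteCount];
    RANDOM.nextBytes(result);
    return result;
  }

  /** Runs AES/CBC/PKCS5Padding over {@code input} in memory, in the given cipher mode. */
  static byte[] aesCbc(int mode, byte[] key, byte[] iv, byte[] input) {
    try {
      Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
      cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
      return cipher.doFinal(input);
    } catch (GeneralSecurityException e) {
      // Thrown for a bad key or IV length, or for corrupt padding when decrypting.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    byte[] key = randomBytes(16); // 16 bytes selects AES-128.
    byte[] iv = randomBytes(16);  // One AES block; generate a new IV for each message.

    byte[] plaintext = "hello".getBytes(StandardCharsets.UTF_8);
    byte[] ciphertext = aesCbc(Cipher.ENCRYPT_MODE, key, iv, plaintext);
    byte[] decrypted = aesCbc(Cipher.DECRYPT_MODE, key, iv, ciphertext);

    System.out.println("round trip ok: " + Arrays.equals(plaintext, decrypted));
  }
}
```

The same `key` and `iv` arrays could then be passed to `encryptAes` and `decryptAesToByteString`.
Since the IV is needed again for decryption, it has to be stored or transmitted alongside the
ciphertext.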
File System Examples
--------------------
Okio recently gained a multiplatform file system API. These examples work on JVM, native, and
Node.js platforms. In the examples below, `fileSystem` is an instance of [FileSystem] such as
`FileSystem.SYSTEM` or `FakeFileSystem`.
Read all of `readme.md` as a string:
```kotlin
val path = "readme.md".toPath()
val entireFileString = fileSystem.read(path) {
  readUtf8()
}
```
Read all of `thumbnail.png` as a [ByteString][3]:
```kotlin
val path = "thumbnail.png".toPath()
val entireFileByteString = fileSystem.read(path) {
  readByteString()
}
```
Read all lines of `/etc/hosts` into a `List<String>`:
```kotlin
val path = "/etc/hosts".toPath()
val allLines = fileSystem.read(path) {
  generateSequence { readUtf8Line() }.toList()
}
```
Read the prefix of `index.html` that precedes the first `<html>` substring:
```kotlin
val path = "index.html".toPath()
val untilHtmlTag = fileSystem.read(path) {
  val htmlTag = indexOf("<html>".encodeUtf8())
  if (htmlTag != -1L) readUtf8(htmlTag) else null
}
```
Write `readme.md` as a string:
```kotlin
val path = "readme.md".toPath()
fileSystem.write(path) {
  writeUtf8(
    """
    |Hello, World
    |------------
    |
    |This is a sample file.
    |""".trimMargin()
  )
}
```
Write `data.bin` as a [ByteString][3]:
```kotlin
val path = "data.bin".toPath()
fileSystem.write(path) {
  val byteString = "68656c6c6f20776f726c640a".decodeHex()
  write(byteString)
}
```
Write `readme.md` from a `List<String>`:
```kotlin
val path = "readme.md".toPath()
val lines = listOf(
  "Hello, World",
  "------------",
  "",
  "This is a sample file.",
  ""
)
fileSystem.write(path) {
  for (line in lines) {
    writeUtf8(line)
    writeUtf8("\n")
  }
}
```
Generate `binary.txt` programmatically:
```kotlin
val path = "binary.txt".toPath()
fileSystem.write(path) {
  for (i in 1 until 100) {
    writeUtf8("$i ${i.toString(2)}")
    writeUtf8("\n")
  }
}
```
Releases
--------
Our [change log][changelog] has release history.
```kotlin
implementation("com.squareup.okio:okio:2.10.0")
```
<details>
<summary>Snapshot builds are also available</summary>
```kotlin
repositories {
  maven {
    url = uri("https://oss.sonatype.org/content/repositories/snapshots/")
  }
}

dependencies {
  implementation("com.squareup.okio:okio:2.10.0")
}
```
</details>
R8 / ProGuard
--------
If you are using R8 or ProGuard, add the options from [this file][proguard].
License
--------
Copyright 2013 Square, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
[1]: https://github.com/square/okhttp
[3]: https://square.github.io/okio/2.x/okio/okio/-byte-string/index.html
[4]: https://square.github.io/okio/2.x/okio/okio/-buffer/index.html
[5]: https://square.github.io/okio/2.x/okio/okio/-source/index.html
[6]: https://square.github.io/okio/2.x/okio/okio/-sink/index.html
[7]: https://square.github.io/okio/2.x/okio/okio/-buffered-source/index.html
[8]: https://square.github.io/okio/2.x/okio/okio/-buffered-sink/index.html
[changelog]: http://square.github.io/okio/changelog/
[javadoc]: https://square.github.io/okio/2.x/okio/okio/index.html
[nfd]: https://docs.oracle.com/javase/7/docs/api/java/text/Normalizer.Form.html#NFD
[nfc]: https://docs.oracle.com/javase/7/docs/api/java/text/Normalizer.Form.html#NFC
[base64]: https://tools.ietf.org/html/rfc4648#section-4
[bmp]: https://en.wikipedia.org/wiki/BMP_file_format
[kotlin]: https://kotlinlang.org/
[ok_libraries_talk]: https://www.youtube.com/watch?v=WvyScM_S88c
[ok_libraries_slides]: https://speakerdeck.com/jakewharton/a-few-ok-libraries-droidcon-mtl-2015
[encoding_talk]: https://www.youtube.com/watch?v=T_p22jMZSrk
[encoding_slides]: https://speakerdeck.com/swankjesse/decoding-the-secrets-of-binary-data-droidcon-nyc-2016
[ok_multiplatform_talk]: https://www.youtube.com/watch?v=Q8B4eDirgk0
[ok_multiplatform_slides]: https://speakerdeck.com/swankjesse/ok-multiplatform
[ReadFileLineByLine]: https://github.com/square/okio/blob/master/samples/src/jvmMain/java/okio/samples/ReadFileLineByLine.java
[ReadFileLineByLineKt]: https://github.com/square/okio/blob/master/samples/src/jvmMain/kotlin/okio/samples/ReadFileLineByLine.kt
[WriteFile]: https://github.com/square/okio/blob/master/samples/src/jvmMain/java/okio/samples/WriteFile.java
[WriteFileKt]: https://github.com/square/okio/blob/master/samples/src/jvmMain/kotlin/okio/samples/WriteFile.kt
[ExploreCharsets]: https://github.com/square/okio/blob/master/samples/src/jvmMain/java/okio/samples/ExploreCharsets.java
[ExploreCharsetsKt]: https://github.com/square/okio/blob/master/samples/src/jvmMain/kotlin/okio/samples/ExploreCharsets.kt
[FileSystem]: https://square.github.io/okio/2.x/okio/okio/-file-system/index.html
[GoldenValue]: https://github.com/square/okio/blob/master/samples/src/jvmMain/java/okio/samples/GoldenValue.java
[GoldenValueKt]: https://github.com/square/okio/blob/master/samples/src/jvmMain/kotlin/okio/samples/GoldenValue.kt
[BitmapEncoder]: https://github.com/square/okio/blob/master/samples/src/jvmMain/java/okio/samples/BitmapEncoder.java
[BitmapEncoderKt]: https://github.com/square/okio/blob/master/samples/src/jvmMain/kotlin/okio/samples/BitmapEncoder.kt
[SocksProxyServer]: https://github.com/square/okio/blob/master/samples/src/jvmMain/java/okio/samples/SocksProxyServer.java
[SocksProxyServerKt]: https://github.com/square/okio/blob/master/samples/src/jvmMain/kotlin/okio/samples/SocksProxyServer.kt
[Hashing]: https://github.com/square/okio/blob/master/samples/src/jvmMain/java/okio/samples/Hashing.java
[HashingKt]: https://github.com/square/okio/blob/master/samples/src/jvmMain/kotlin/okio/samples/Hashing.kt
[proguard]: https://github.com/square/okio/blob/master/okio/src/jvmMain/resources/META-INF/proguard/okio.pro