Basic Riak API Operations

Riak currently offers three ways to interact with the database:

  1. HTTP Interface (often referred to as RESTful)
  2. Native Erlang Interface
  3. Protocol Buffers (PB) Interface

For this tutorial, we are going to work with the HTTP Interface and use curl commands through the terminal to test Riak out and help you get acquainted with it quickly. (You can find links to details about the HTTP and PB APIs at the bottom of this page.)

Required Knowledge
  • Client ID – All requests should include the X-Riak-ClientId header, which can be any string that uniquely identifies the client, for purposes of tracing object modifications in the vector clock.
  • URL Escaping – Buckets, keys, and link specifications may not contain unescaped slashes. Use a URL-escaping library or replace slashes with %2F.
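Both requirements can be sketched in a few lines of Python. This is only an illustration, not part of Riak or of any client library: the helper names `object_url` and `client_headers` are hypothetical, and the base URL is assumed to be the node used in the curl examples of this tutorial.

```python
from urllib.parse import quote

# Assumption: the same node the curl examples in this tutorial talk to.
BASE_URL = "http://127.0.0.1:8091"


def object_url(bucket, key=None, base=BASE_URL):
    """Build /riak/<bucket>[/<key>] with slashes inside the names escaped."""
    # safe="" makes quote() turn "/" into %2F instead of leaving it alone.
    parts = [quote(bucket, safe="")]
    if key is not None:
        parts.append(quote(key, safe=""))
    return base + "/riak/" + "/".join(parts)


def client_headers(client_id):
    """Header that identifies this client; any unique string will do."""
    return {"X-Riak-ClientId": client_id}


print(object_url("test", "2011/01/report"))
print(client_headers("fast-track-client"))
```

The key "2011/01/report" ends up in the URL as "2011%2F01%2Freport", so Riak sees one key rather than extra path segments.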

Object/Key Operations

Most of the interactions you’ll have with Riak will be setting or retrieving the value of a key. This section describes how to do that.

Read an Object

Here is the basic command formation for retrieving a specific key from a bucket.

GET /riak/bucket/key

The body of the response will be the contents of the object (if it exists). Simple, right?

Riak understands many HTTP-defined headers, like Accept for content-type negotiation (relevant when dealing with siblings, see the sibling examples for the HTTP API), and If-None-Match/ETag and If-Modified-Since/Last-Modified for conditional requests.

Riak also accepts many query parameters, including r for setting the R-value for this get request (R Values will be explained more in the final section of the Fast Track Tutorial). Omitting the r query parameter is equivalent to specifying r=2.

Normal response codes:

  • 200 OK
  • 300 Multiple Choices
  • 304 Not Modified

Typical error codes:

  • 404 Not Found

So, with that in mind, try this command. This will request (GET) the key “doc2” from the bucket “test.”

$ curl -v http://127.0.0.1:8091/riak/test/doc2

This should return a 404 Not Found as the key “doc2” does not exist, but that’s not a bad thing. This is the response we wanted to receive.
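The same read can be sketched in Python with the standard library. The function names `build_get` and `fetch` are hypothetical, the base URL is assumed to match the curl example, and `fetch` needs a running node, so it is defined here but not called.

```python
import urllib.error
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:8091"  # assumption: node from the curl examples


def build_get(bucket, key, r=None, etag=None):
    """Prepare (but do not send) a GET for /riak/<bucket>/<key>."""
    url = f"{BASE_URL}/riak/{quote(bucket, safe='')}/{quote(key, safe='')}"
    if r is not None:
        # Optional R-value for this one read.
        url += "?" + urlencode({"r": r})
    request = urllib.request.Request(url, method="GET")
    if etag is not None:
        # Conditional request: an unchanged object yields 304 Not Modified.
        request.add_header("If-None-Match", etag)
    return request


def fetch(request):
    """Send the request and return (status, body)."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urllib raises for 300, 304 and 404; hand them back as statuses.
        return err.code, err.read()


request = build_get("test", "doc2", r=2)
print(request.get_method(), request.full_url)
```

Calling `fetch(build_get("test", "doc2"))` against the tutorial node should return a 404 status for the missing key, mirroring the curl command above.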

Store an object with existing or user-defined key

Your application will often have its own method of generating the keys for its data. If so, storing that data is easy. The basic request looks like this.

PUT /riak/bucket/key
POST is also a valid verb, for compatibility’s sake.

Some request headers are required for PUTs:

  • Content-Type – must be set for the stored object. Set it to the type you expect to receive back when you next request the object.
  • X-Riak-Vclock – if the object already exists, the vector clock attached to the object when it was read; if the object is new, this header may be omitted.

Other request headers are optional for PUTs:

  • X-Riak-Meta-YourHeader – any additional metadata headers that should be stored with the object.
  • Link – user- and system-defined links to other resources. Read more about Links.

Similar to how GET requests support the “r” query parameter, PUT requests also support these parameters:

  • r – how many replicas need to agree when retrieving an existing object before the write (integer value, default is 2)
  • w – how many replicas to write to before returning a successful response (integer value, default is 2)
  • dw – how many replicas to commit to durable storage before returning a successful response (integer value, default is 0)
  • returnbody – whether to return the contents of the stored object (boolean string, default is “false”)

Normal status codes:

  • 200 OK
  • 204 No Content
  • 300 Multiple Choices

If returnbody=true, any of the response headers expected from a GET request may be present. Like a GET request, 300 Multiple Choices may be returned if siblings existed or were created as part of the operation, and the response can be dealt with similarly.

Let’s give it a shot. Try running this in a terminal.

$ curl -v -X PUT -d '{"bar":"baz"}' -H "Content-Type: application/json" \
  -H "X-Riak-Vclock: a85hYGBgzGDKBVIszMk55zKYEhnzWBlKIniO8mUBAA==" \
  "http://127.0.0.1:8091/riak/test/doc?returnbody=true"
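For comparison, here is a rough Python sketch of how the headers and query parameters described above fit together in one request. `build_put` is a hypothetical helper, not a Riak client call, and the request is only constructed, never sent.

```python
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:8091"  # assumption: node from the curl examples


def build_put(bucket, key, body, content_type, vclock=None, meta=None, **params):
    """Prepare a PUT for /riak/<bucket>/<key>; params become the query string."""
    url = f"{BASE_URL}/riak/{quote(bucket, safe='')}/{quote(key, safe='')}"
    if params:
        # For example w=2, dw=1 or returnbody="true".
        url += "?" + urlencode(params)
    request = urllib.request.Request(url, data=body, method="PUT")
    request.add_header("Content-Type", content_type)  # required
    if vclock is not None:
        # Only needed when updating an object that was read earlier.
        request.add_header("X-Riak-Vclock", vclock)
    for name, value in (meta or {}).items():
        # Optional user metadata stored alongside the object.
        request.add_header("X-Riak-Meta-" + name, value)
    return request


request = build_put("test", "doc", b'{"bar":"baz"}', "application/json",
                    w=2, returnbody="true")
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; with returnbody=true the response body should carry the stored object.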

Store a new object and assign random key

If your application would rather leave key-generation up to Riak, issue a POST request to the bucket URL instead of a PUT to a bucket/key pair:

POST /riak/bucket

If you don’t pass Riak a “key” name after the bucket, it will know to create one for you.

Supported headers are the same as for bucket/key PUT requests, though X-Riak-Vclock will never be relevant for these POST requests. Supported query parameters are also the same as for bucket/key PUT requests.

Normal status codes:

  • 201 Created

This command will store an object in the bucket “test” and assign it a key:

$ curl -v -d 'this is a test' -H "Content-Type: text/plain" \
  http://127.0.0.1:8091/riak/test

In the output, the Location header will give you the key for that object. To view the newly created object, append the value of the Location header to “http://127.0.0.1:8091” and open the resulting URL in your browser.

If you’ve done it correctly, you should see the value (which is “this is a test”).
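Turning the Location header into a full URL can be sketched as follows. The header value used here is a made-up placeholder (Riak generates the real key), and both function names are hypothetical.

```python
from urllib.parse import unquote, urljoin

BASE_URL = "http://127.0.0.1:8091"  # assumption: node from the curl examples


def object_url_from_location(location, base=BASE_URL):
    """Resolve the Location header of a 201 Created response to a full URL."""
    return urljoin(base, location)


def key_from_location(location):
    """The generated key is the last path segment of the Location header."""
    return unquote(location.rstrip("/").rsplit("/", 1)[-1])


# Placeholder value for illustration only, not a key Riak actually produced.
location = "/riak/test/EXAMPLEKEY"
print(object_url_from_location(location))
print(key_from_location(location))
```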

Delete an object

Lastly, you’ll need to know how to delete keys.

The command, as you can probably guess, follows a predictable pattern and looks like this:

DELETE /riak/bucket/key

The normal response codes for a DELETE operation are 204 No Content and 404 Not Found.

404 responses are “normal” in the sense that DELETE operations are idempotent and not finding the resource has the same effect as deleting it.

Try this:

$ curl -v -X DELETE http://127.0.0.1:8091/riak/test/test2
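Because both 204 and 404 count as normal outcomes, client code can treat either as "the key is gone". A small sketch under that reading, with hypothetical function names; `delete_object` needs a running node and is therefore only defined, not called.

```python
import urllib.error
import urllib.request


def delete_succeeded(status):
    """204 (deleted) and 404 (already absent) leave the key in the same state."""
    return status in (204, 404)


def delete_object(url):
    """Issue a DELETE against an object URL and report whether the key is gone."""
    request = urllib.request.Request(url, method="DELETE")
    try:
        with urllib.request.urlopen(request) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 404; it is still a normal outcome here.
        status = err.code
    return delete_succeeded(status)


print(delete_succeeded(204), delete_succeeded(404), delete_succeeded(500))
```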

Bucket Properties and Operations

Buckets are essentially a flat namespace in Riak and have little significance beyond their ability to allow the same key name to exist in multiple buckets and to provide some per-bucket configurability.

How Many Buckets Can I Have?

Currently, buckets come with virtually no cost except for when you modify the default bucket properties. Modified Bucket properties are gossiped around the cluster and therefore add to the amount of data sent around the network. In other words, buckets using the default bucket properties are free.

Setting a bucket’s properties

There is no need to “create” buckets in Riak. They pop into existence when keys are added to them, and disappear when all keys have been removed from them.

However, in addition to providing a namespace for keys, the properties of a bucket also define some of the behaviors that Riak will implement for the values stored in the bucket.

To set these properties, issue a PUT to the bucket’s URL:

PUT /riak/bucket

The body of the request should be a JSON object with a single entry “props.” Unmodified bucket properties may be omitted.

Important headers:

  • Content-Type: application/json

The most important properties to consider for your bucket are:

  • n_val – the number of replicas for objects in this bucket (defaults to 3); n_val should be an integer greater than 0 and less than the number of partitions in the ring.
    Changing n_val after keys have been added to the bucket is not advisable as it may result in failed reads because the new value may not be replicated to all the appropriate partitions.
  • allow_mult – true or false (defaults to false); Riak maintains any sibling objects caused by things like concurrent writes or network partitions. With allow_mult set to false, clients will only get the most-recent-by-timestamp object.

Let’s go ahead and alter the properties of a bucket. The following PUT will set a modified n_val of 5 on the bucket called “test.”

$ curl -v -X PUT -H "Content-Type: application/json" -d '{"props":{"n_val":5}}' \
  http://127.0.0.1:8091/riak/test
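The request body is easier to get right when it is generated rather than typed by hand. A minimal sketch, assuming the same node as the curl example; `build_set_props` is a hypothetical helper and only checks the lower bound on n_val, since the ring size is not known to it.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:8091"  # assumption: node from the curl examples


def build_set_props(bucket, **props):
    """Prepare a PUT that sets only the given bucket properties."""
    n_val = props.get("n_val")
    if n_val is not None and (not isinstance(n_val, int) or n_val < 1):
        raise ValueError("n_val must be an integer greater than 0")
    # Unmodified properties are simply left out of the "props" entry.
    body = json.dumps({"props": props}).encode("utf-8")
    url = f"{BASE_URL}/riak/{quote(bucket, safe='')}"
    request = urllib.request.Request(url, data=body, method="PUT")
    request.add_header("Content-Type", "application/json")
    return request


request = build_set_props("test", n_val=5)
print(request.full_url, request.data.decode("utf-8"))
```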

GET Buckets

Here is how you use the HTTP API to retrieve (or “GET”) the bucket properties and/or keys:

GET /riak/bucket_name

Again, quite simple. (Are you starting to see a pattern?)

The optional query parameters are:

  • props=true|false – whether to return the bucket properties (defaults to “true”)
  • keys=true|false|stream – whether to return the keys stored in the bucket (defaults to “false”); see the HTTP API's list keys for details about dealing with a keys=stream response

With that in mind, go ahead and run this command. This will GET the bucket information that we just set with the sample command above:

$ curl -v http://127.0.0.1:8091/riak/test

You can also view this Bucket information through any browser by going to “http://127.0.0.1:8091/riak/test”
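The two query parameters can be combined in one URL. The sketch below only builds that URL; `bucket_url` is a hypothetical name and the base URL is assumed to be the tutorial node.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:8091"  # assumption: node from the curl examples


def bucket_url(bucket, props=True, keys="false"):
    """Build the URL for reading a bucket's properties and/or keys."""
    if keys not in ("true", "false", "stream"):
        raise ValueError("keys must be 'true', 'false' or 'stream'")
    query = urlencode({"props": "true" if props else "false", "keys": keys})
    return f"{BASE_URL}/riak/{quote(bucket, safe='')}?{query}"


# Properties only (the defaults), then keys without properties.
print(bucket_url("test"))
print(bucket_url("test", props=False, keys="true"))
```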

So, that’s how the HTTP API works. There are more nuances to what you can do with it, and an in-depth reading of the HTTP API page (linked below) is highly recommended. It will give you details on the headers, parameters, and status codes that you should keep in mind when using the HTTP Interface. But, for the purpose of the Fast Track, this whirlwind tour will be sufficient.

What’s Next? Now that we have worked a bit with the HTTP interface to Riak, let's load some sample data and run some more advanced queries using MapReduce.

Additional Reading for this Section