What the #%@! Is REST?

by msimpson 5/21/2009 2:03:00 PM

I'm currently rewriting this site using ASP.NET MVC, REST, Dojo, and some other technologies.  I've mentioned this to friends and coworkers, and many have asked what's the big deal about REST.  I admit that until one really examines it closely, it's pretty underwhelming.  "HTTP requests?  I've been doing that for years!"  I was in that skeptical camp for a long time, until I bought and read a book a friend of mine recommended.  But now I have a fuller appreciation of what REST is about and what distinguishes it from RPC-style architectures.  There's already a lot of material on the web about REST, but I thought I would describe it in my own words - hence this blog post.

About REST 

REST ("Representational State Transfer") is a term coined by Roy Fielding in his well-known dissertation.  I think one can build a RESTful application without fully understanding what Roy means by the term, but you could say that REST is "how the web is designed to work" - meaning in specific terms, it leverages HTTP.

Now I'll put on my Captain Obvious hat and point out some things about HTTP.  It defines a few key things - resources for example, which are documents, data, and other things that are accessible via the web, and also a very limited set of verbs ("get", "post", "put", "delete", etc.) for operating on those resources.  This is an important point - there are only a few standard verbs, but a potentially infinite set of "nouns" (resources).  REST people speak of HTTP's constrained set of verbs as HTTP's uniform interface.

Now imagine an application with a potentially unlimited set of nouns, PLUS a potentially unlimited set of verbs.  That's an RPC-style application.  RPC-style apps have a much less uniform interface, because they have an infinite set of verbs.
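
A rough sketch of the difference, in Python.  The class, function, and path names here are invented for illustration and don't come from any real service; the point is only that the RPC-style interface grows a new verb for each capability, while the REST-style handler keeps the same few verbs and lets the URI name the thing being acted on.

```python
# Hypothetical illustration: names and paths are made up for this sketch.

# RPC style: every new capability adds a new verb (method name).
class RpcPersonService:
    def __init__(self):
        self.people = {}

    def create_person(self, person_id, name):
        self.people[person_id] = name

    def rename_person(self, person_id, name):
        self.people[person_id] = name

    def fetch_person(self, person_id):
        return self.people[person_id]

    def purge_person(self, person_id):
        del self.people[person_id]


# REST style: a fixed set of verbs applied to an open-ended set of URIs.
store = {}

def handle(method, path, body=None):
    """Dispatch on the HTTP verb; the path alone names the resource."""
    if method == "GET":
        return (200, store[path]) if path in store else (404, None)
    if method == "PUT":
        store[path] = body
        return (200, body)
    if method == "DELETE":
        store.pop(path, None)
        return (204, None)
    return (405, None)  # verb not supported by this sketch
```

A client that understands `handle("GET", ...)` for one resource understands it for all of them, which is what the uniform interface buys you.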

Typically web services based on an RPC model use SOAP, a protocol that involves stuffing all the information necessary for a request (including the verb info) into a data structure, and sending that to the server using some transport mechanism.  SOAP itself doesn't require any particular transport, but almost all implementations use HTTP for transport. 

However, because all the info about the procedure call is in the SOAP "envelope", in effect SOAP rides on top of HTTP without leveraging it for anything but simple transport.  You could say SOAP is a tunneling protocol.
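
To illustrate the tunneling point, here is a small Python sketch that builds (but never sends) two requests for the same person.  The service URL, the `GetPerson` operation, and its namespace are assumptions made up for the example, and the envelope is simplified.

```python
from urllib.request import Request

# Hypothetical endpoint, operation name, and namespace; nothing is sent.
SOAP_BODY = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetPerson xmlns="http://www.myserver.com/people">
      <PersonID>123</PersonID>
    </GetPerson>
  </soap:Body>
</soap:Envelope>"""

# SOAP over HTTP: always a POST to the service URI; the real verb and noun
# ("GetPerson", "123") are buried in the body where HTTP can't see them.
soap_request = Request(
    "http://www.myserver.com/PersonService",
    data=SOAP_BODY.encode("utf-8"),
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": '"http://www.myserver.com/people/GetPerson"',
    },
    method="POST",
)

# REST: the HTTP method is the verb and the URI is the noun.
rest_request = Request("http://www.myserver.com/person/123", method="GET")

print(soap_request.get_method(), soap_request.full_url)
print(rest_request.get_method(), rest_request.full_url)
```

A proxy or cache sitting between client and server can reason about the second request (it's a GET of a specific URI), but it can only see the first as an opaque POST.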

This is innocuous enough, and in fact SOAP does bring some things to the table that are not in HTTP, but for many applications, HTTP would suffice.  HTTP already contains mechanisms for authentication, caching, content expiration, exception indication, addressability, and other useful things.  In some cases (like authentication), SOAP and its extensions reinvent the wheel. 

With that background, I'll define a few more things central to REST.

A resource is the logical noun you're working with; strictly speaking, it doesn't physically exist.  The resource state lives on the server, and is not exposed directly.  Instead, the client deals with representations of a resource, which exist on a temporary basis.

A representation is simply some subset of the resource's information, in some format meaningful to the recipient.  If your logical resource is a "person", maintained in your server database as a record in a Person table with fields for PersonID, FirstName, LastName, etc., then the person record in the database is not the resource itself - it's the resource state.  The resource is the logical person - the person's soul as opposed to its body, to put it in pseudoreligious terms.   A representation might be a chunk of JSON data sent across the wire listing some subset of that information (or all of it).  Or it might be XML.  Or HTML.  Or plain text.  You can have multiple representations of the same resource.  HTTP and REST do not specify what the format is, just that there is one.
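
As a minimal sketch of that idea in Python: one piece of resource state, three representations.  The field values and function names are hypothetical; only the Person fields come from the example above.

```python
import json
import xml.etree.ElementTree as ET

# Resource state as it might sit in a Person table (values are hypothetical).
person_row = {"PersonID": 123, "FirstName": "Mike", "LastName": "Simpson"}

def to_json(row):
    """Full representation as JSON."""
    return json.dumps(row)

def to_xml(row):
    """Full representation as XML, one child element per field."""
    root = ET.Element("person")
    for field, value in row.items():
        ET.SubElement(root, field).text = str(value)
    return ET.tostring(root, encoding="unicode")

def to_text(row):
    """A partial representation: only the name, not the ID."""
    return f"{row['FirstName']} {row['LastName']}"

print(to_json(person_row))
print(to_xml(person_row))
print(to_text(person_row))
```

None of the three outputs is "the person"; each is just a view of the same underlying state.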

When a client requests data, works with it, updates it, deletes it, etc., there is another kind of state, called application state, which describes where the client is in its process, the current values of form fields, the history of user actions, etc.  With REST, application state may only be maintained on the client, while resource state may only be maintained on the server.  This puts the client in control of where it goes and what it does.  It also allows the server to be what is typically just called stateless - the server forgets about the client in between requests.  This allows the server to be simpler and scale better.
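
Here is one way to picture that split, sketched in Python with made-up data and names.  The server function remembers nothing between calls; the client keeps track of where it is in the list and says so on every request.

```python
# Resource state lives on the server (hypothetical data).
PEOPLE = ["Ann", "Bob", "Cy", "Dee", "Eve"]

def list_people(start=0, count=2):
    """Stateless handler: everything needed is in the request itself."""
    page = PEOPLE[start:start + count]
    next_start = start + count if start + count < len(PEOPLE) else None
    return {"items": page, "next_start": next_start}

# Application state (where the client is in the list) lives on the client.
client_position = 0
seen = []
while client_position is not None:
    response = list_people(start=client_position)
    seen.extend(response["items"])
    client_position = response["next_start"]

print(seen)
```

Because `list_people` holds no per-client cursor, any server in a farm could answer any of these requests.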

A URI provides addressability for a resource - the ability to reach a unique resource given an address.  Each resource might be exposed via multiple URIs, but each URI points to only one resource.  In the example above, you might have a URI like this:  http://www.myserver.com/person/123, where '123' is the person ID.  You could also have http://www.myserver.com/simpson/mike (if first + last name is unique).  When requesting a representation of the resource, you might specify it in the URI (http://www.myserver.com/person/123/xml), or use the "Accept" HTTP header.
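
For the "Accept" header route, a simplified Python sketch of how a server might pick a representation.  The list of supported types and the function names are assumptions for the example, and the parsing deliberately ignores partial wildcards such as `text/*`.

```python
# Media types this hypothetical service can produce, in server preference order.
SUPPORTED = ["application/json", "application/xml", "text/plain"]

def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q-value."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type, q = pieces[0], 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if media_type:
            entries.append((media_type, q))
    return sorted(entries, key=lambda e: e[1], reverse=True)

def choose_representation(accept_header):
    """Pick the first supported type the client accepts, or None (HTTP 406)."""
    for media_type, q in parse_accept(accept_header):
        if q == 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*":
            return SUPPORTED[0]
        if media_type in SUPPORTED:
            return media_type
    return None
```

With this in place the URI stays the same (`/person/123`) and only the header changes between a client that wants XML and one that wants JSON.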

A third principle is connectedness.  This just means that the client isn't required to know information that's out-of-band in order to follow from one resource to another or from one representation to another.  All the information needed is supplied in the representation returned in the current request.  In practical terms, this means doing things like returning URIs of related resources, instead of just their IDs.
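
A small Python sketch of what "return URIs, not IDs" might look like.  The field names and the `/company/...` path are invented for the example; only the host and `/person/...` pattern come from the URIs above.

```python
import json

BASE = "http://www.myserver.com"  # hypothetical host from the example above

def person_representation(person_id, first, last, employer_id, friend_ids):
    """Build a representation that links to related resources by URI."""
    return {
        "self": f"{BASE}/person/{person_id}",
        "firstName": first,
        "lastName": last,
        # Links, not bare IDs: the client never has to construct a URI itself.
        "employer": f"{BASE}/company/{employer_id}",
        "friends": [f"{BASE}/person/{fid}" for fid in friend_ids],
    }

doc = person_representation(123, "Mike", "Simpson", 7, [124, 125])
print(json.dumps(doc, indent=2))
```

If the server later moves companies to a different path, clients that follow the `employer` link keep working without any change.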

Advantages of REST

Here are some advantages of REST, as compared to SOAP/RPC:

  • REST is simple and extremely interoperable.  Everything speaks HTTP.  That makes REST suitable for public-facing applications that might be used by anyone.
  • REST is addressable - a unique URI exists for every resource.  By contrast, SOAP requests are usually funneled through a single URI for the service as a whole, so you can't refer to a particular resource or action directly.  You have to embed that stuff in the SOAP envelope.
  • REST has great performance.  There is no heavyweight XML/XSLT to parse (at least not necessarily).  With no session state, you avoid serialization costs on the server side.  True, you may have more requests, and you may need to push more representation data across the wire, but it's usually no worse than WebForms viewstate (which is gone in ASP.NET MVC, by the way), and HTTP traffic is easily compressed anyway.
  • REST has great scalability.  Stateless servers are easily scaled out across a web farm (as Clay pointed out below, you can also run RPC-style apps statelessly, but it's optional for them).  Plus, you can take advantage of HTTP's extensive capabilities for caching and optimization.
  • REST works better with AJAX.  A REST service can return data in JSON or another format easily parsed by JavaScript.  It is impractical for JavaScript to parse a SOAP response.
  • REST is more easily testable.  Ignoring "proper" unit testing for the purpose of this discussion, you can simply use a browser to hit the service directly.  You aren't going to do that with a SOAP service.

Disadvantages of REST

Nothing is perfect.  Here is where REST falls down:

  • Although there is a standard of sorts (WADL) for declaring the interface a REST-based site presents, it's not widely used at all.  It's also possible to describe a REST interface fairly well with WSDL 2.0.  But many people disagree that such description is necessary or desirable.  If you feel you do need to be able to describe your REST interface to clients, there is no standard that is likely to work across the board.
  • REST and HTTP by themselves do not address some more complicated scenarios that are specifically addressed by SOAP and its WS-* extensions - things like transactions, for example.  It is possible to model a transaction with REST by defining a resource for the entire transaction state, and handling the multiple parts behind the scenes on the server, but this feels somewhat clunky.  I would generalize this to say that REST presents an "impedance mismatch" when the problem is more naturally expressed in terms of verbs as opposed to nouns.
  • Tooling for SOAP is great these days.  Basically you can point your client at a service, push a button, and start calling the service on the spot.  By contrast, REST's lack of standardization and description means that manual effort will be needed to figure out what the interface looks like and how to call it correctly.  By and large this is easy to do because REST is simple, but it's still work that a SOAP app can simply avoid.
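
The transaction-as-resource idea from the second bullet can be sketched roughly like this in Python, using a bank transfer as the example.  All URIs, names, and status-code choices are assumptions for illustration; a real server would wrap the two account updates in a database transaction rather than rely on in-memory dictionaries.

```python
import itertools

accounts = {"/account/1": 100, "/account/2": 50}   # hypothetical resource state
transfers = {}
_next_id = itertools.count(1)

def post_transfer(source, target, amount):
    """POST /transfer: create a transfer resource; apply both updates or neither."""
    if source not in accounts or target not in accounts:
        return 404, None
    if amount <= 0:
        return 400, None
    if accounts[source] < amount:
        return 409, None  # conflicts with current state; nothing was changed
    accounts[source] -= amount
    accounts[target] += amount
    uri = f"/transfer/{next(_next_id)}"
    transfers[uri] = {"from": source, "to": target, "amount": amount}
    return 201, uri  # the new transfer is itself addressable
```

The client sees one noun ("transfer") and one POST; the multi-step work stays behind the scenes on the server.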

Choosing Between REST and SOAP/RPC 

Myself, I don't think any one approach is right for all situations.  These days, I would tend to use REST by default, unless the requirements are such that they couldn't be easily accomplished with it.  In such case I would go to SOAP with all its bells, whistles, foghorns, blinking lights, etc. etc.

REST and ASP.NET MVC

REST by itself does not say anything about what server technology to use.  In the Microsoft realm, Microsofties might say to use WCF to expose RESTful services.  This is certainly a good option, but ASP.NET MVC is also a good option.  While it's not RESTful by default because the action (the verb) is present in the URI as opposed to being driven by the HTTP request verb, ASP.NET MVC allows flexible routing and is very easily reconfigured to be RESTful.  It also gives you absolute control over the representations you serve to the client.  Bottom line, it's a perfect fit once reconfigured.

My experience working with REST and ASP.NET MVC has been a joy.  Having been used to large WebForms applications with big session state and authentication that depends on it, I've found it refreshing to rebuild my site (which would have blown away session state were I using it) and continue working without even having to log in again.  I can set it up such that I can close the browser, reopen it and keep working without logging in again, too.   It makes for a streamlined, pleasant development cycle.

Guidelines for a RESTful Application

With all that in mind, here are some guidelines I distilled from reading the book mentioned above:

URI Design

  • Make URIs descriptive (one school of thought says make them descriptive, and another says make them opaque; the author favors the former school).
  • One resource can map to one or more URIs, but each URI can map to only one resource.  Use the minimum number of URIs necessary.
  • Version your application or service; consider encoding the version number into the URI:  /v1/users/msimpson.
  • Model operations (such as transactions) that do not fit the standard methods as resources themselves.
  • Use path variables to encode hierarchy: /parent/child.
  • Put punctuation characters in path variables to avoid implying hierarchy where none exists:  /parent/child1;child2.  Use commas when the order of the scoping information is important, and semicolons when the order doesn’t matter.
  • Use query variables to imply inputs into an algorithm, for example: /search?q=wingnut&start=10.
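
To see how the URI shapes in the list above break down, here is a small Python sketch using the standard `urllib.parse` module.  The `describe` function and its output keys are my own invention for the example.

```python
from urllib.parse import urlsplit, parse_qs

def describe(uri):
    """Split a URI into path segments, unordered siblings, and query inputs."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    return {
        # Path variables encode hierarchy: /parent/child
        "hierarchy": segments,
        # Semicolons inside a segment separate peers whose order is irrelevant.
        "last_segment_items": segments[-1].split(";") if segments else [],
        # Query variables are inputs to an algorithm.
        "inputs": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }

print(describe("/v1/users/msimpson"))
print(describe("/parent/child1;child2"))
print(describe("/search?q=wingnut&start=10"))
```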

Representations

  • Design your representations, both incoming (request) and outgoing (response).  I would go through an analysis process up front to:
    • designate my resources;
    • design my URI structure;
    • define what verbs are allowed for each URI;
    • define the representations accepted and served by each combination of URI and verb.  It may not be necessary to do it all up front, but there should at least be conventions defined up front.
  • Representations should link to related resources/states.
  • Representations need not convey the entire state of a resource, only some of it.
  • Where an entity-body is needed for a given request, consider exposing a form so that the client will be able to figure out how to make the request.

Resource State

  • When creating a subordinate resource, use PUT when the client is in charge of determining the resource URI, and POST when the server is in charge.
  • Don’t allow clients to PUT representations that change a resource’s state in relative terms.
  • Don’t expose unsafe operations (operations that change resource state) through GET.
  • HEAD, GET, PUT and DELETE should be idempotent (repeating the same request should leave the resource in the same state as making it once).
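
The PUT/POST distinction and the idempotency rule above can be sketched in Python like this.  The in-memory store, function names, and URIs are assumptions for illustration only.

```python
import itertools

resources = {}
_ids = itertools.count(1)

def put(uri, representation):
    """Client chose the URI. Repeating the call leaves the same state."""
    created = uri not in resources
    resources[uri] = representation  # absolute state, never "add 10 to it"
    return (201 if created else 200), uri

def post(collection_uri, representation):
    """Server chooses the URI. Repeating the call creates another resource."""
    uri = f"{collection_uri}/{next(_ids)}"
    resources[uri] = representation
    return 201, uri
```

Calling `put("/users/msimpson", ...)` twice leaves one resource; calling `post("/users", ...)` twice leaves two.  A PUT that said "increase the balance by 10" would break this, which is why relative changes are discouraged.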

Application State

  • Don’t use cookies, even for a simple session ID, unless the client is in charge of the cookie value.
  • Authenticate on every request, rather than maintaining a server session which breaks statelessness.  Authentication can be done using any of several different mechanisms, but credentials should probably not be part of the URI.  Consider HTTP Basic or Digest authentication, or a similar solution built on the WWW-Authenticate response header and the Authorization request header.
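
As a rough Python sketch of per-request authentication with HTTP Basic: the credential store, realm, and function names below are hypothetical, and a real application would store salted password hashes and run only over HTTPS.

```python
import base64
import hmac

# Hypothetical credential store; a real one would hold salted password hashes.
USERS = {"msimpson": "s3cret"}

def authenticate(headers):
    """Check the Authorization header of a single request; no session is kept."""
    value = headers.get("Authorization", "")
    scheme, _, encoded = value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded).decode("utf-8")
    except ValueError:  # covers bad base64 and bad UTF-8
        return None
    username, sep, password = decoded.partition(":")
    expected = USERS.get(username)
    if sep and expected is not None and hmac.compare_digest(
        password.encode("utf-8"), expected.encode("utf-8")
    ):
        return username
    return None

def handle_request(headers):
    """Authenticate first, on every call, then do the work."""
    user = authenticate(headers)
    if user is None:
        # 401 plus a challenge tells the client which scheme to use.
        return 401, {"WWW-Authenticate": 'Basic realm="example"'}
    return 200, {"X-User": user}
```

Since the credentials travel with each request, restarting the server or sending the next request to a different machine makes no difference to the client.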

HTTP

  • Use the standard HTTP methods (chiefly HEAD, GET, PUT, POST, DELETE) to indicate what you’re trying to do. 
  • Use conditional GET (response headers Last-Modified and ETag, and request headers If-Modified-Since and If-None-Match).
  • Allow the client to cache data when appropriate, through the use of the Cache-Control response header.
  • Allow the client to make Look-Before-You-Leap (LBYL) requests, using the “Expect: 100-continue” request header.
  • Don’t put error messages in the representation; rather, use HTTP error codes appropriately.
  • Set the Content-Location response header if the URI requested is not the “primary” URI for the resource.
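
A simplified Python sketch of the conditional GET and caching guidelines above.  The function names and the 60-second lifetime are assumptions, and the comparison only handles a single ETag in If-None-Match (the real header may also carry a list or `*`).

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Derive a validator from the representation bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def conditional_get(body: bytes, request_headers: dict):
    """Return (status, headers, body), answering 304 when the client copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # Not Modified: no body is sent
    return 200, headers, body
```

On the first request the client gets the body plus an ETag; on later requests it echoes that ETag back and, if nothing changed, the server answers with a 304 and no body at all.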

Other

  • Be careful with Ajax, as it can break addressability and statelessness.  But paradoxically, it can actually allow statelessness if used well.  If used, employ an Ajax framework to hide things like browser differences, and include equivalents to browser navigation in the application.

Comments

5/21/2009 3:44:13 PM

msimpson

Hmmm... know what would be interesting? A protocol like SOAP but without the transport-agnosticism. Imagine if SOAP required HTTP and leveraged it to the hilt... SOAP would be a lot simpler and might perform better.

msimpson

5/22/2009 2:36:53 AM

Clay Lenhart

I'm trying to grasp it all. I think what makes REST unique is that the noun and verb are separated. This allows for caching since GET and HEAD are understood by proxy servers. (The address is another way of describing the verb/noun separation).

The connectedness sounds good, but practically I don't see how it will be an advantage. If the client knows that a link goes to a resource, it probably already knows the format of the URL. Even if it doesn't, why is it important to encapsulate the URL resource location from the client?

Stateless isn't unique to REST, and neither are web farms or testability. I think it is good that REST has this, but it isn't unique.

Having noun-based "business logic" calls to save data will likely mean either you make many non-transactional calls, or you have a weirdly named resource to update multiple resources.

I'm thinking that GETs should be REST based, but "saves" should be verbs/RPC. REST seems to "get" reads if you pardon the pun.




Clay Lenhart gb

5/22/2009 4:00:36 AM

msimpson

That's a good point about statelessness and web farms - I didn't mean to imply that you couldn't run an RPC-style app statelessly, just that with REST it's pretty much required, whereas with RPC it's an option. I've updated the post to reflect that. Thanks!

Connectedness is to a RESTful app what hyperlinking is to HTML... I think the whole point of it is to make it easy for clients to get around and to figure out how to use your system. It allows you to change your URI structure without breaking your clients. And to a certain extent it helps mitigate the lack of a formal description or specification of the service, though I'd still prefer, I think, to have a formal description of any service of any importance.

Your point about "business logic" calls is a valid one, and goes to what I was saying about there possibly being an impedance mismatch between the constrained set of REST verbs and the problem you're trying to solve. But I think having special resources like this can work better than you might think - for example in my app I've exposed the login/logoff process via a resource called "session" (POST to login, DELETE to log off). Many times I think it's actually useful to explicitly express the problem as a noun. In a banking system, for example, you might have a resource called "transfer" to describe updates to two different "account" resources, wrapped in a transaction. It's useful to have the word "transfer" in the application's vocabulary. In DDD terms, it's part of the "ubiquitous language" of the system and should be expressed formally.

In general though, I think REST works really well for simple CRUD operations, but can get more and more unwieldy the more complex the application becomes.

msimpson

5/22/2009 11:18:29 AM

Clay Lenhart

A session resource, that's really clever! I'm going to use it if I get the chance.

My point of view is REST is good, but isn't "Wow, this is awesome!". I attended a session where the instructor was so excited about REST. I just want to know what the excitement is all about. For example,

* REST is simple and extremely interoperable.

True, though many languages work with SOAP.

* REST is addressable

Honestly, I haven't encountered a situation where addressability would solve a problem. Utilizing proxy servers is a neat hack though.

* REST has great performance.

I agree here, but few apps really deal with performance problems that REST solves. For example, there was one project, where the DA layer passed an XML document to the BL layer via SOAP (so it is XML within XML, ouch), then the BL layer passed it to the presentation layer via SOAP. Each layer was a server, so this was going over the network. As you can imagine, the SOAP architecture was very slow. I don't think REST is the answer here, but instead they should have had a web farm with each layer being just a DLL within the same process.

* REST has great scalability.

SOAP can be stateless and scalable.

* REST works better with AJAX.

True, but HTTP calls are a pretty obvious choice.

* REST is more easily testable.

WebServiceStudio allows for SOAP testing. With the right tools, REST isn't any more testable than SOAP.

REST is definitely a tool to keep in the tool belt. It's just overblown.

GWT, on the other hand, is AWESOME! ;)

Clay Lenhart gb

6/14/2009 6:39:11 AM

pingback

Pingback from rapid-dev.net

The Technology Post for June 8th | rapid-DEV.net

rapid-dev.net

9/6/2009 3:06:46 AM

Josh Gough

Mike, this is a great article, extremely well summarized and explained.

Here are a couple of articles I've found very informative on this topic too:

This is an old one, but a great one:
http://www.prescod.net/rest/rest_vs_soap_overview/

Very simple, minimalist explanation that is surprisingly informative:
http://www.xfront.com/REST-Web-Services.html

Roy Fielding on hypertextness of REST APIs:
roy.gbiv.com/.../rest-apis-must-be-hypertext-driven

Steve Vinoski writes a lot about REST:
http://steve.vinoski.net/blog/category/rest/
In the past, he had done a lot of work on CORBA.

A comment on addressability and its usefulness:

I believe that this is the most important part of REST, but not just of REST, but of how the WWW itself works. To a larger scale, it's even the most important part of DNS, etc.

I see the URI as a conceptual, logical extension of a DNS name. It just happens to be application/port specific so that it gives further resolution to a resource.

As Mike explained, and IIS / Apache 404s have been telling us for years, "Resources" are at the core of the WWW. For too long many of us have thought of web servers as glorified FTP servers, delivering files, or at best server-executed-to-create-output files, when the user requests the URI.

Google has capitalized on REST and addressability better than anyone. They do this in two ways:

1) When they index web sites for search, the URL and how it is constructed, which words are in it, and how many are in the URL path rather than the query string, are all part of the relevance of a link. This makes addressability extremely important for Search-Engine-Optimization.

2) All of the APIs and services they make available tend to be RESTful, addressable. For example, when you build a chart using the Google Charts API, or when you click "Link to this map" in Google Maps, you get a single, addressable resource pointer that can be used to fully reconstruct the resource. The URL and its query string are thus the complete input to the backend algorithm. They similarly make Google Calendar feeds embeddable as a single URL.

Stefan Tilkov says this well in his summary here:

http://www.infoq.com/articles/rest-introduction

The way he describes addressability is simple: "Give everything an ID"... I like that simple way of looking at it.

"ID" could be the entire globally unique www.MyDomain.com/path/post/WhatTheHeckIsRest

Taking it further with the google example:

As Mike creates blog posts, google continues to index them both by content and by the URL, so when people search for "What the heck is rest", the first link that comes up is this very post. Google is then nothing more than a mapper of content search terms to physical, unique URIs -- addresses

-Josh








Josh Gough us

9/6/2009 8:20:26 AM

Josh Gough

Mike, a book I just picked up and plan to read after I read Pro ASP.NET MVC from Apress is "Effective REST Services in .NET". You can find a great interview on DotNetRocks about it also.
take care,
Josh

Josh Gough us

9/9/2009 8:20:38 AM

Josh Gough

Here is a preview of the .NET book:

books.google.com/books+rest+services+via+.net&printsec=frontcover&source=bl&ots=yoymCmYvhj&sig=MyE1Qxd3W9xeb9_67K4S01CONDQ&hl=en&ei=Ke-nSvKJOKqmtgfC-N2kCA&sa=X&oi=book_result&ct=result&resnum=6#v=onepage&q=&f=false

Josh Gough us
