I'm currently rewriting this site using ASP.NET MVC, REST, Dojo, and some other technologies. I've mentioned this to friends and coworkers, and many have asked what's the big deal about REST. I admit that until one really examines it closely, it's pretty underwhelming. "HTTP requests? I've been doing that for years!" I was in that skeptical camp for a long time, until I bought and read a book a friend of mine recommended. But now I have a fuller appreciation of what REST is about and what distinguishes it from RPC-style architectures. There's already a lot of material on the web about REST, but I thought I would describe it in my own words - hence this blog post.
About REST
REST ("Representational State Transfer") is a term coined by Roy Fielding in his well-known dissertation. I think one can build a RESTful application without fully understanding what Roy means by the term, but you could say that REST is "how the web is designed to work" - in practical terms, it means using HTTP the way it was intended to be used, rather than just as a transport.
Now I'll put on my Captain Obvious hat and point out some things about HTTP. It defines a few key things - resources for example, which are documents, data, and other things that are accessible via the web, and also a very limited set of verbs ("get", "post", "put", "delete", etc.) for operating on those resources. This is an important point - there are only a few standard verbs, but a potentially infinite set of "nouns" (resources). REST people speak of HTTP's constrained set of verbs as HTTP's uniform interface.
Now imagine an application with a potentially unlimited set of nouns, PLUS a potentially unlimited set of verbs. That's an RPC-style application. RPC-style apps have a much less uniform interface, because they have an infinite set of verbs.
Typically web services based on an RPC model use SOAP, a protocol that involves stuffing all the information necessary for a request (including the verb info) into a data structure, and sending that to the server using some transport mechanism. SOAP itself doesn't require any particular transport, but almost all implementations use HTTP for transport.
However, because all the info about the procedure call is in the SOAP "envelope", in effect SOAP rides on top of HTTP without leveraging it for anything but simple transport. You could say SOAP is a tunneling protocol.
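To make the tunneling point concrete, here is a rough sketch of the same "delete person 123" operation in both styles. The `/PersonService` endpoint, the `DeletePerson` element, and the `Request` class are made up for illustration; a real SOAP envelope would also carry namespaces and headers defined by the service.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """A bare-bones stand-in for an HTTP request (hypothetical type)."""
    method: str
    uri: str
    body: str = ""


def soap_style_delete(person_id: int) -> Request:
    # RPC style: always POST to the one service URI.
    # The verb ("DeletePerson") and the noun (the ID) are both hidden in the body.
    envelope = (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        "<soap:Body>"
        f"<DeletePerson><PersonID>{person_id}</PersonID></DeletePerson>"
        "</soap:Body>"
        "</soap:Envelope>"
    )
    return Request("POST", "/PersonService", envelope)


def rest_style_delete(person_id: int) -> Request:
    # REST style: the HTTP method is the verb, the URI is the noun, no body needed.
    return Request("DELETE", f"/person/{person_id}")
```

In the first case HTTP only sees "a POST to /PersonService"; in the second, any intermediary can tell what is being done to which resource.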
This is innocuous enough, and in fact SOAP does bring some things to the table that are not in HTTP, but for many applications, HTTP would suffice. HTTP already contains mechanisms for authentication, caching, content expiration, exception indication, addressability, and other useful things. In some cases (like authentication), SOAP and its extensions reinvent the wheel.
With that background, I'll define a few more things central to REST.
Resources are the logical noun you're working with; strictly speaking, they don't physically exist. The resource state lives on the server, and is not exposed directly. Instead, the client deals with representations of a resource, which exist on a temporary basis.
A representation is simply some subset of the resource's information, in some format meaningful to the recipient. If your logical resource is a "person", maintained in your server database as a record in a Person table with fields for PersonID, FirstName, LastName, etc., then the person record in the database is not the resource itself - it's the resource state. The resource is the logical person - the person's soul as opposed to its body, to put it in pseudoreligious terms. A representation might be a chunk of JSON data sent across the wire listing some subset of that information (or all of it). Or it might be XML. Or HTML. Or plain text. You can have multiple representations of the same resource. HTTP and REST do not specify what the format is, just that there is one.
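As a small sketch of that idea, the snippet below takes one piece of resource state and produces three different representations of it. The field values and function names are assumptions for the example; the field names follow the Person table described above.

```python
import json
from xml.etree import ElementTree as ET

# Resource state as it might sit in the Person table (values are made up).
PERSON_STATE = {"PersonID": 123, "FirstName": "Mike", "LastName": "Simpson"}


def as_json(state, fields=None):
    """JSON representation; pass `fields` to expose only a subset of the state."""
    chosen = {k: v for k, v in state.items() if fields is None or k in fields}
    return json.dumps(chosen, sort_keys=True)


def as_xml(state):
    """XML representation with one child element per field."""
    root = ET.Element("person")
    for key, value in state.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def as_text(state):
    """Plain-text representation: just the display name."""
    return f"{state['FirstName']} {state['LastName']}"
```

All three are representations of the same logical person; none of them is the resource itself.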
When a client requests data, works with it, updates it, deletes it, etc., there is another kind of state, called application state, which describes where the client is in its process, the current values of form fields, the history of user actions, etc. With REST, application state may only be maintained on the client, while resource state may only be maintained on the server. This puts the client in control of where it goes and what it does. It also allows the server to be what is typically just called stateless - the server forgets about the client in between requests. This allows the server to be simpler and scale better.
A URI provides addressability for a resource - the ability to reach a unique resource given an address. Each resource might be exposed via multiple URIs, but each URI points to only one resource. In the example above, you might have a URI like this: http://www.myserver.com/person/123, where '123' is the person ID. You could also have http://www.myserver.com/simpson/mike (if first + last name is unique). When requesting a representation of the resource, you might specify it in the URI (http://www.myserver.com/person/123/xml), or use the "Accept" HTTP header.
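Here is one way a server might decide which representation to send, sketched under some loud assumptions: the set of supported formats is invented, the URI suffix wins over the header, and the Accept parsing is deliberately naive (it ignores q-values and wildcards, which a real implementation should honor).

```python
# Media types this hypothetical service can serve, mapped to a short format name.
SUPPORTED = {
    "application/json": "json",
    "application/xml": "xml",
    "text/plain": "text",
}


def pick_format(path, accept=""):
    """Return the format name to serve for a request path and Accept header."""
    # 1. An explicit format as the last path segment, e.g. /person/123/xml.
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    if last_segment in SUPPORTED.values():
        return last_segment
    # 2. Otherwise the first media type in Accept that we support.
    #    Simplification: parameters such as ";q=0.9" are stripped and ignored.
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in SUPPORTED:
            return SUPPORTED[media_type]
    # 3. Fall back to a default representation (an arbitrary choice here).
    return "json"
```

For example, `pick_format("/person/123/xml")` and `pick_format("/person/123", "application/xml")` both select the XML representation of the same resource.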
A third principle is connectedness. This just means that the client isn't required to know information that's out-of-band in order to follow from one resource to another or from one representation to another. All the information needed is supplied in the representation returned in the current request. In practical terms, this means doing things like returning URIs of related resources, instead of just their IDs.
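A minimal sketch of what connectedness looks like in a representation: related resources are returned as full URIs the client can follow, not as bare IDs it has to know how to turn into URIs. The "friends" relationship and the key names are hypothetical; the base address is the example server used above.

```python
BASE = "http://www.myserver.com"


def person_representation(person_id, first_name, last_name, friend_ids):
    """Build a dict representation whose links are ready-to-follow URIs."""
    return {
        "firstName": first_name,
        "lastName": last_name,
        # Link to this resource itself.
        "self": f"{BASE}/person/{person_id}",
        # Links to related resources, instead of just their IDs.
        "friends": [f"{BASE}/person/{fid}" for fid in friend_ids],
    }
```

A client receiving this never needs out-of-band knowledge of the `/person/{id}` pattern; it simply follows the URIs it was given.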
Advantages of REST
Here are some advantages of REST, as compared to SOAP/RPC:
-
REST is simple and extremely interoperable. Everything speaks HTTP. That makes REST suitable for public-facing applications that might be used by anyone.
-
REST is addressable - a unique URI exists for every resource. By contrast, SOAP requests are usually funneled through a single URI for the service as a whole, so you can't refer to a particular resource or action directly. You have to embed that stuff in the SOAP envelope.
-
REST can perform very well. There is no heavyweight XML/XSLT to parse (at least not necessarily). With no session state, you avoid serialization costs on the server side. True, you may have more requests, and you may need to push more representation data across the wire, but it's usually no worse than WebForms viewstate (which is gone in ASP.NET MVC, by the way), and HTTP traffic is easily compressed anyway.
-
REST has great scalability. Stateless servers are easily scaled out across a web farm (as Clay pointed out below, you can also run RPC-style apps statelessly, but it's optional for them). Plus, you can take advantage of HTTP's extensive capabilities for caching and optimization.
-
REST works better with AJAX. A REST service can return data in JSON or another format easily parsed by JavaScript. Parsing a SOAP response in JavaScript is possible, but it's cumbersome enough to be impractical.
-
REST is more easily testable. Ignoring "proper" unit testing for the purpose of this discussion, you can simply use a browser to hit the service directly. You aren't going to do that with a SOAP service.
Disadvantages of REST
Nothing is perfect. Here is where REST falls down:
-
Although there is a standard of sorts (WADL) for declaring the interface a REST-based site presents, it's not widely used at all. It's also possible to describe a REST interface fairly well with WSDL 2.0. But many people disagree that such description is necessary or desirable. If you feel you do need to be able to describe your REST interface to clients, there is no standard that is likely to work across the board.
-
REST and HTTP by themselves do not address some more complicated scenarios that are specifically addressed by SOAP and its WS-* extensions - things like transactions, for example. It is possible to model a transaction with REST by defining a resource for the entire transaction state, and handling the multiple parts behind the scenes on the server, but this feels somewhat clunky. I would generalize this to say that REST presents an "impedance mismatch" when the problem is more naturally expressed in terms of verbs as opposed to nouns.
-
Tooling for SOAP is great these days. Basically you can point your client at a service, push a button, and start calling the service on the spot. By contrast, REST's lack of standardization and description means that manual effort will be needed to figure out what the interface looks like and how to call it correctly. By and large this is easy to do because REST is simple, but it's still work that a SOAP app can simply avoid.
Choosing Between REST and SOAP/RPC
Myself, I don't think any one approach is right for all situations. These days, I would tend to use REST by default, unless the requirements are such that they couldn't be easily accomplished with it. In such a case I would go to SOAP with all its bells, whistles, foghorns, blinking lights, etc. etc.
REST and ASP.NET MVC
REST by itself does not say anything about what server technology to use. In the Microsoft realm, Microsofties might say to use WCF to expose RESTful services. This is certainly a good option, but ASP.NET MVC is also a good option. While it's not RESTful by default because the action (the verb) is present in the URI as opposed to being driven by the HTTP request verb, ASP.NET MVC allows flexible routing and is very easily reconfigured to be RESTful. It also gives you absolute control over the representations you serve to the client. Bottom line, it's a perfect fit once reconfigured.
My experience working with REST and ASP.NET MVC has been a joy. Having been used to large WebForms applications with big session state and authentication that depends on it, I've found it refreshing to rebuild my site (which would have blown away session state were I using it) and continue working without even having to log in again. I can set it up such that I can close the browser, reopen it and keep working without logging in again, too. It makes for a streamlined, pleasant development cycle.
Guidelines for a RESTful Application
With all that in mind, here are some guidelines I distilled from reading the book mentioned above:
URI Design
-
Make URIs descriptive (one school of thought says make them descriptive, and another says make them opaque; the author favors the former school).
-
One resource can map to one or more URIs, but each URI can map to only one resource. Use the minimum number of URIs necessary.
-
Version your application or service; consider encoding the version number into the URI: /v1/users/msimpson.
-
Model operations (such as transactions) that do not fit the standard methods as resources themselves.
-
Use path variables to encode hierarchy: /parent/child.
-
Put punctuation characters in path variables to avoid implying hierarchy where none exists: /parent/child1;child2. Use commas when the order of the scoping information is important, and semicolons when the order doesn’t matter.
-
Use query variables to imply inputs into an algorithm, for example: /search?q=wingnut&start=10.
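The URI guidelines above can be sketched as a handful of small helpers. The function names are invented for this example; only the URI shapes come from the list.

```python
from urllib.parse import quote, urlencode


def hierarchy(*segments):
    """Path variables encode hierarchy: /parent/child."""
    return "/" + "/".join(quote(str(s), safe="") for s in segments)


def versioned(path, version=1):
    """Prefix a path with the service version: /v1/..."""
    return f"/v{version}{path}"


def unordered(parent, *children):
    """Semicolons join siblings whose order does not matter."""
    return hierarchy(parent) + "/" + ";".join(quote(c, safe="") for c in children)


def ordered(parent, *children):
    """Commas join siblings whose order matters."""
    return hierarchy(parent) + "/" + ",".join(quote(c, safe="") for c in children)


def search(**inputs):
    """Query variables carry inputs into an algorithm."""
    return "/search?" + urlencode(inputs)
```

With these, `versioned(hierarchy("users", "msimpson"))` gives `/v1/users/msimpson`, and `search(q="wingnut", start=10)` gives `/search?q=wingnut&start=10`.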
Representations
-
Design your representations, both incoming (request) and outgoing (response). I would go through an analysis process up front to:
-
designate my resources;
-
design my URI structure;
-
define what verbs are allowed for each URI;
-
define the representations accepted and served by each combination of URI and verb. It may not be necessary to do it all up front, but there should at least be conventions defined up front.
-
Representations should link to related resources/states.
-
Representations need not convey the entire state of a resource, only some of it.
-
Where an entity-body is needed for a given request, consider exposing a form so that the client will be able to figure out how to make the request.
Resource State
-
When creating a subordinate resource, use PUT when the client is in charge of determining the resource URI, and POST when the server is in charge.
-
Don’t allow clients to PUT representations that change a resource’s state in relative terms.
-
Don’t expose unsafe operations (operations that change resource state) through GET.
-
HEAD, GET, PUT and DELETE should be idempotent (making the same request several times should leave the resource in the same state as making it once).
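To illustrate the PUT/POST split and idempotence, here is a toy in-memory store. It is only a sketch: the class, its method names, and the choice of status codes to return are assumptions for the example, not anything prescribed by the book.

```python
import itertools


class PersonStore:
    """A toy in-memory resource store keyed by URI (hypothetical)."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def put(self, uri, representation):
        # The client chose the URI. Repeating the same PUT leaves the same state.
        created = uri not in self._items
        self._items[uri] = dict(representation)
        return 201 if created else 200

    def post(self, collection_uri, representation):
        # The server chooses the URI. Repeating a POST creates another resource.
        uri = f"{collection_uri}/{next(self._ids)}"
        self._items[uri] = dict(representation)
        return 201, uri

    def delete(self, uri):
        # Deleting something already gone leaves the state unchanged.
        self._items.pop(uri, None)
        return 204

    def get(self, uri):
        # Safe: reads state and never changes it.
        return self._items.get(uri)

    def count(self):
        return len(self._items)
```

Calling `put` twice with the same arguments leaves one resource; calling `post` twice leaves two, which is exactly why POST is the one method in the list that is not expected to be idempotent.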
Application State
-
Don’t use cookies, even for a simple session ID, unless the client is in charge of the cookie value.
-
Authenticate on every request, rather than maintaining a server session, which breaks statelessness. Authentication can be done using any of several different mechanisms, but credentials should probably not be part of the URI. Consider HTTP Basic or Digest authentication, or a similar solution involving the WWW-Authenticate response header.
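As a rough sketch of per-request authentication with HTTP Basic: the client sends an Authorization header on every request, and the server checks it each time without keeping a session. The function names, the realm, and the plain-text `valid_users` dict are assumptions for illustration only; a real service would store password hashes and run Basic authentication over HTTPS.

```python
import base64

CHALLENGE = {"WWW-Authenticate": 'Basic realm="example"'}


def basic_auth_header(username, password):
    """Client side: build the Authorization header sent with every request."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def check_request(headers, valid_users):
    """Server side: return (status, response_headers) using only this request."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Basic":
        return 401, CHALLENGE
    try:
        decoded = base64.b64decode(token).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return 401, CHALLENGE
    user, _, password = decoded.partition(":")
    if user in valid_users and valid_users[user] == password:
        return 200, {}
    return 401, CHALLENGE
```

Because nothing is remembered between calls, the server can be restarted or the request routed to a different machine without the client having to log in again.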
HTTP
-
Use the five core HTTP methods (HEAD, GET, PUT, POST, DELETE) to indicate what you’re trying to do.
-
Use conditional GET (response headers Last-Modified and ETag, and request headers If-Modified-Since and If-None-Match).
-
Allow the client to cache data when appropriate, through the use of the Cache-Control response header.
-
Allow the client to make Look-Before-You-Leap (LBYL) requests, using the “Expect: 100-continue” request header.
-
Don’t put error messages in the representation; rather, use HTTP status codes appropriately.
-
Set the Content-Location response header if the URI requested is not the “primary” URI for the resource.
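Conditional GET and caching from the list above can be sketched like this. It is a simplified model, not a real server: the function names are invented, the ETag is just a truncated hash of the body, the `max-age=60` value is arbitrary, and only an exact If-None-Match comparison is handled (no Last-Modified, no lists of tags).

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the representation's bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, request_headers: dict):
    """Return (status, response_headers, response_body) for a GET."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, body
```

The first request gets a 200 with an ETag; if the client echoes that tag back in If-None-Match and nothing has changed, it gets a 304 and saves the transfer.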
Other
-
Be careful with Ajax, as it can break addressability and statelessness. But paradoxically, it can actually allow statelessness if used well. If used, employ an Ajax framework to hide things like browser differences, and include equivalents to browser navigation in the application.