Features of the Resource-Oriented Architecture - Application State Versus Resource State
When we talk about “statelessness,” what counts as “state”? What’s the difference between persistent data, the useful server-side data that makes us want to use web services in the first place, and this state we’re trying to keep off the server? The Flickr web service lets you upload pictures to your account, and those pictures are stored on the server. It would be crazy to make the client send every one of its pictures along with every request to flickr.com, just to keep the server from having to store any state. That would defeat the whole point of the service. But what’s the difference between this scenario, and state about the client’s session, which I claim should be kept off the server?
The problem is one of terminology. Statelessness implies there’s only one kind of state and that the server should go without it. Actually, there are two kinds of state. From this point on in the book I’m going to distinguish between application state, which ought to live on the client, and resource state, which ought to live on the server.
When you use a search engine, your current query and your current page are bits of application state. This state is different for every client. You might be on page 3 of the search results for “jellyfish,” and I might be on page 1 of the search results for “mice.” The page number and the query are different because we took different paths through the application. Our respective clients store different bits of application state.
A web service only needs to care about your application state when you’re actually making a request. The rest of the time, it doesn’t even know you exist. This means that whenever a client makes a request, it must include all the application state the server will need to process it. The server might send back a page with links, telling the client about other requests it might want to make in the future, but then it can forget all about the client until the next request. That’s what I mean when I say a web service should be “stateless.” The client should be in charge of managing its own path through the application.
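To make this concrete, here is a minimal Python sketch of a stateless search handler. It is an illustration, not any real search engine's API: the `/search` path, the `q` and `page` parameter names, the `DOCUMENTS` list, and the `handle_request` function are all assumptions made up for the example. The point is only that the handler reads the query and the page number from the request itself and remembers nothing between calls.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical search index. This is resource state: it lives on the server
# and is the same for every client.
DOCUMENTS = [
    "jellyfish sting",
    "jellyfish bloom",
    "moon jellyfish",
    "mice in winter",
    "field mice",
]
PAGE_SIZE = 2  # assumed page size for the sketch


def handle_request(uri):
    """Answer one search request using only what the request carries."""
    params = parse_qs(urlparse(uri).query)
    query = params["q"][0]
    page = int(params.get("page", ["1"])[0])

    hits = [doc for doc in DOCUMENTS if query in doc]
    start = (page - 1) * PAGE_SIZE
    results = hits[start:start + PAGE_SIZE]

    # Send back a link to a request the client might want to make next.
    # The server keeps no record of which page this client is on.
    next_uri = None
    if start + PAGE_SIZE < len(hits):
        next_uri = "/search?" + urlencode({"q": query, "page": page + 1})
    return {"results": results, "next": next_uri}


# Two clients on different paths through the application.
yours = handle_request("/search?q=jellyfish&page=1")
yours_next = handle_request(yours["next"])
mine = handle_request("/search?q=mice&page=1")
```

Each client decides whether to follow the `next` link. Because the link carries both the query and the page number, any server that holds the same index could answer it.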
Resource state is the same for every client, and its proper place is on the server. When you upload a picture to Flickr, you create a new resource: the new picture has its own URI and can be the target of future requests. You can fetch, modify, and delete the “picture” resource through HTTP. It’s there for everybody: I can fetch it too. The picture is a bit of resource state, and it stays on the server until a client deletes it.
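As a rough sketch of what this means, the following Python models a server-side store of picture resources. It is not Flickr's API: the `PictureStore` class, its method names, and the `/pictures/{id}` URI layout are hypothetical, chosen to mirror the HTTP operations mentioned above.

```python
import itertools


class PictureStore:
    """Hypothetical in-memory stand-in for server-side resource state."""

    def __init__(self):
        self._pictures = {}
        self._ids = itertools.count(1)

    def post(self, data):
        # Creating a resource assigns it a URI of its own.
        uri = f"/pictures/{next(self._ids)}"
        self._pictures[uri] = data
        return uri

    def get(self, uri):
        # Every client that asks for this URI sees the same resource.
        return self._pictures.get(uri)

    def put(self, uri, data):
        # Modify an existing resource; unknown URIs are rejected.
        if uri not in self._pictures:
            raise KeyError(uri)
        self._pictures[uri] = data

    def delete(self, uri):
        self._pictures.pop(uri, None)


server = PictureStore()
uri = server.post(b"holiday photo")   # you upload a picture
seen_by_me = server.get(uri)          # I can fetch it too
server.delete(uri)                    # it stays until a client deletes it
after_delete = server.get(uri)
```

Nothing in the store depends on who is asking or what they requested earlier, which is what separates resource state from application state.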
Application state can show up when you don’t expect it. Lots of web services make you sign up for a unique string they call an API key or application key. You send in this key with every request, and the server uses it to restrict you to a certain number of requests a day. For instance, an API key for Google’s deprecated SOAP search API is good for 1,000 requests a day. That’s application state: it’s different for every client. Once you exceed the limit, the behavior of the service changes dramatically: on request 1,000 you get your data, and on request 1,001 you get an error. Meanwhile, I’m on request 402 and the service still works fine for me.
Of course, clients can’t be trusted to self-report this bit of application state: the temptation to cheat is too great. So servers keep this kind of application state on the server, violating statelessness. The API key is like the Rails _session_id cookie, a key into a server-side client session that lasts one day. This is fine as far as it goes, but there’s a scalability price to be paid. If the service is to be distributed across multiple machines, every machine in the cluster needs to know that you’re on request 1,001 and I’m on request 402 (technical term: session replication), so that every machine knows to deny you access and let me through. Alternatively, the load balancer needs to make sure that every one of your requests goes to the same machine in the cluster (technical term: session affinity). Statelessness removes this requirement. As a service designer, you only need to start thinking about data replication when your resource state needs to be split across multiple machines.
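The scalability price can be illustrated with a small Python sketch. Everything in it is an assumption made for the example: the `Machine` class, the `run` helper, the `key-you` API key, the round-robin dispatch, and a daily limit of 3 instead of 1,000 to keep the run short. It compares a cluster whose machines each keep their own request counts with one whose machines share a single count, which stands in for session replication.

```python
from collections import Counter

DAILY_LIMIT = 3  # assumed small limit; the text's example uses 1,000


class Machine:
    """One server in a hypothetical cluster."""

    def __init__(self, counts):
        # Maps API key -> requests served today. This is per-client
        # application state held on the server.
        self.counts = counts

    def handle(self, api_key):
        if self.counts[api_key] >= DAILY_LIMIT:
            return False  # over the limit: the client gets an error
        self.counts[api_key] += 1
        return True


def run(machines, api_key, n):
    # Round-robin dispatch, standing in for a load balancer with no
    # session affinity.
    return [machines[i % len(machines)].handle(api_key) for i in range(n)]


# No replication: each machine only knows about the requests it served.
isolated = [Machine(Counter()), Machine(Counter())]
isolated_results = run(isolated, "key-you", 6)

# Replication, modeled here as both machines sharing one Counter.
shared_counts = Counter()
replicated = [Machine(shared_counts), Machine(shared_counts)]
replicated_results = run(replicated, "key-you", 6)
```

In the isolated cluster all six requests succeed, because each machine sees only three of them. In the replicated cluster the fourth request onward is denied. Session affinity would get the same result a different way, by sending every request for `key-you` to the same machine.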