Brail’s Blog

April 17, 2009

APIs and Web Apps are Different

Filed under: Cloud Services, APIs — greg @ 2:29 pm

I have been working on some applications recently that include a traditional web app component as well as an API. This has reminded me of what I think is a key design principle for building APIs, namely this: APIs are different from web applications. You should treat them differently and deploy them as separate components.

To run a modern web application, you need:

  • Session-based login, so that you present a nice, pretty “login” screen
  • Session state, so your customers can store their shopping cart and whatever else you need
  • A presentation framework like JSP or JSF
  • A navigation framework like Struts
  • Stylesheets
  • Static content

To run an API, you don’t need or want any of that stuff. 

For instance, session-based “login” pages are the norm for web apps today because they look nicer than the “password” dialog box that your browser puts up when you use HTTP Basic authentication.

Many APIs have also been written that use this paradigm: there is a “login” method that takes a username/password and returns a session token that the client must use for each request. I suspect that this is common because it is how the original web application was written and the developers figured it was easiest to reuse the same paradigm, or perhaps it wasn’t even possible to expose the back-end services in any other way because the web app was baked together long before an API was conceived.

(Or perhaps API builders think that this is more efficient than validating the password on each request. But why? You have to validate the session ID anyway, or at least you had better, and if validating a password is causing performance problems then perhaps you need a better app server.)

While this pattern is fine for a web browser, it makes your API more difficult to use because now the API client must remember that session ID. It also makes life more difficult for proxies and other types of intermediaries because they may have to remember the session ID too. And since every API has a slightly different “login” method and a different way of retrieving and returning the session ID, there’s no standard library that the API client can use to make all this happen automatically.

On the other hand, an API that uses HTTP Basic authentication is easy to use and you can invoke it in lots of ways, from Java code to “curl” on your command line.
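To give a sense of how little client code HTTP Basic needs, here is a minimal Python sketch using only the standard library. The function names and the “/orders” path are made up for illustration; only the header format itself comes from the HTTP Basic scheme.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic "Authorization" header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Return a request that carries its credentials with it (nothing is sent yet)."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Hypothetical endpoint, for illustration only.
request = build_request("https://api.mycompany.com/orders", "Aladdin", "open sesame")
```

There is no “login” call and no session ID to remember: every request is complete on its own, which is also why a generic tool like curl can do the same thing with a single command-line option.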

Session state is another difference between a good web application and a useful API. It’s crucial for humans using your web site, but does your API need it? It will be called by other computers, which can remember what you tell them to remember. If your API is stateless, then you’ll have an easier time scaling it up, because session state introduces complexities to the load balancing layer, and if you want it replicated for high availability, to the app server layer too.
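On the server side, “stateless” can be as simple as checking the credentials that arrive with each request instead of looking up a session. The sketch below is one way to do that in Python. The USERS table, the fixed salt, and the function names are all assumptions made for the example; a real service would pull accounts from its own user store and use a per-user salt.

```python
import base64
import binascii
import hashlib
import hmac
from typing import Optional

DEMO_SALT = b"demo-salt"  # assumption: fixed salt, acceptable only in a sketch


def hash_password(password: str, salt: bytes = DEMO_SALT) -> bytes:
    """Derive a password hash so the table below never holds plain text."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Hypothetical account table standing in for a real user database.
USERS = {"greg": hash_password("s3cret")}


def authenticate(authorization_header: Optional[str]) -> Optional[str]:
    """Return the username if the header holds valid Basic credentials, else None.

    No session is created or consulted, so any server behind the load
    balancer can handle any request.
    """
    if not authorization_header or not authorization_header.startswith("Basic "):
        return None
    try:
        encoded = authorization_header[len("Basic "):]
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    expected = USERS.get(username)
    if expected is None:
        return None
    # Constant-time comparison avoids leaking how much of the hash matched.
    if hmac.compare_digest(expected, hash_password(password)):
        return username
    return None
```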

Furthermore, all the other “stuff” that you will deploy in your web app server in order to run a web app is, if anything, a detriment to the API. Frameworks like Struts, JSF, etc. use memory and disk space.

While we’re at it, think about those good old “three-tier architectures” from the 1990s. The whole point was to separate presentation and business logic. That was a good idea! If you’re building a web app and an API at the same time, it also makes sense to use the API as the business logic back end if you can.

So, if you can do that, then you will probably be releasing the web app component and the API component on different release schedules, which will be easier if you develop them, manage them, and deploy them separately.

What does this mean in the end? For a small application, perhaps this just means that you have two WAR files (or whatever you use) — one for the web app and another for the API. For a larger deployment, a good idea is to set up a separate DNS entry for your API, so that “api.mycompany.com” is different from “www.mycompany.com”. Perhaps at first these will both point to the same IP address, but over time you have the flexibility to change this if you wish. 

April 10, 2009

Why should you care about “Cloud Services?”

Filed under: Cloud Services — greg @ 1:18 pm

Everyone is talking about “cloud computing.” There’s a ton going on out there, and a ton of confusion. I’d like to talk not about cloud computing in general, but about the concept of “cloud services” and why they’re important.

IDC has posted an overview of what the term “cloud service” means, including an eight-point checklist. I’d like to offer my own, shorter version on this. In my opinion, a cloud service: 

  • Is accessible over the public Internet. 
  • Is accessed using standard web services protocols like XML or JSON over HTTP — proprietary client-side hardware and software are not acceptable.
  • Is billed elastically, so consumers can pay for what they use.
  • Scales elastically, so consumers don’t need to worry about capacity as long as they stick within their SLAs.

Now, this is different from “cloud computing.” “Cloud Computing” (again, see IDC) is about running your infrastructure in an elastic environment. You can run a cloud service “in the cloud” by building it on Google App Engine, or Microsoft Azure, or by building your own app and deploying it on your server in EC2. But you can also build a cloud service on an enterprise network and run it from your own DMZ, or run it on a server in a traditional co-location facility. In fact, I suspect that the majority of cloud services deployed over the next few years will run in traditional data centers and will not actually run “in the cloud!”

How is that different from an API or a web service?

Is a “web services API”, like I talked about last time, a “cloud service?” Absolutely. However, right now I have two problems with the generic term “API.” The first is that an “API” on the web today implies something like Twitter or Google, not an enterprise-type service so valuable that it’s worth paying money for. The second problem is that “API” means a lot of things other than “web service” and the distinction between, say, the “JDBC API” and the “Twitter API” is lost on less-technically-oriented people.

Then, is “cloud service” just another term for a “web service?” No. First, unless it’s available on the Internet, it’s not a “cloud service.” Second, “web service” has come to mean SOAP, WSDL, and other “enterprise-y” technologies. (For instance, see the Wikipedia definition, which as of today, at least, shows a diagram of a “web service” complete with SOAP, WSDL, and UDDI.) There’s no reason why you can’t build a web service without these technologies, but if you do you face an uphill battle against those who assume that “web service” means “SOAP.”

So, is a web application a cloud service? IDC thinks it is, but I’m not so sure. Salesforce.com pioneered “SaaS,” and they have offered an XML-based API for a long time now; that’s truly a “cloud service.” However, when you log in to the Salesforce.com web site using your browser, are you using a cloud service? Inasmuch as every interactive web site has aspects of a cloud service, then yes, but I think that’s making the definition too broad.

In the end, I like a very simple definition of a “cloud service” – a web service, running on the Internet, that can be used in an elastic way.

So what?

Most of you are probably already using cloud services. Twitter is a cloud service. The Salesforce.com APIs are cloud services, as are the APIs provided by UPS and FedEx. (And as far as I know, none of those cloud services “run in the cloud” — they run in traditional data centers or co-lo facilities.)

A cloud service is a great way to expose your core functionality in a lot of new ways. For example:

  • By offering transit times, rate quotes, and tracking as a cloud service, FedEx and UPS allow all sorts of third parties to integrate their services directly into their applications, where previously a user had to call or visit the web site. This saves UPS and FedEx money, and it makes it easier for their customers to use their services, which means they get more business.
  • By offering their catalog as a web service, Amazon makes it possible for all sorts of other retailers and manufacturers to produce a pretty, up-to-the-minute web-based storefront however they want, using whatever graphic design people are the best for the job — but when it comes to the boring, hard work of taking money and shipping products, that cool site can just delegate to Amazon without losing control of the user experience.

On the other hand, a ton of stuff today does not take advantage of this technology. Would it be easier to build an air-fare aggregation engine like Orbitz or Kayak if all the airlines offered a cloud service to get access to schedules and fares? Absolutely! Do the airlines want that? It doesn’t seem like it.

Or, what about all the systems today that communicate using FTP, or EDI VANs, or VPNs, or fax, or tapes being sent via UPS? Would those systems be a lot more effective if they communicated using cloud services instead? What do you think?

 

April 5, 2009

What we are seeing out there — Part 1: Web Services APIs

Filed under: Cloud Services, APIs — greg @ 9:55 pm

I thought I’d start this blog out by talking a little bit about what we’ve been seeing out there in the field. Coming up on two years with Sonoa, I’ve been talking to customers literally for hours every day, and I’ve lost track of the number of people we have spoken with. Over time patterns develop, and the longer we do this the more obvious the patterns become.

I’ve seen more and more companies of all sizes start to build web services APIs so that they can make their own capabilities available to a wider audience. The concept of an “API” on the Internet is not new but it’s been taking off, especially recently. At this point, if your company isn’t offering such an API, it probably will be before long.

What’s an API?

In computer science we’ve used the term “Application Programming Interface,” or “API,” for pretty much forever. Everything from the Windows operating system to the Oracle database has an API.

What I’m talking about is a little more specific — a “web services API”. That means some functionality that you can invoke over the Internet. Specifically, an “API” in this context:

  • Is invoked over the public Internet
  • Almost always uses HTTP (or HTTPS) as its communications protocol
  • Often uses XML to represent a response
  • Often uses either HTTP query parameters or some XML to represent a request

For these reasons, a good API often requires no special client-side software. The most successful APIs, like the Twitter API, can be used from anywhere on the Internet from literally any programming environment that can communicate using HTTP.

The most effective and popular APIs are also simple. For instance, in order to see the current status of Lance Armstrong on Twitter as an XML document, all I have to do is perform an HTTP GET to this URL:

http://www.twitter.com/users/show/lancearmstrong.xml
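In Python, that GET might look like the sketch below. The helper names are mine, and the URL pattern is simply the one shown above as it worked when this was written; the endpoint may not respond today, so treat this as an illustration of the shape of the call rather than something to depend on.

```python
import urllib.parse
import urllib.request


def user_show_url(screen_name: str) -> str:
    """Build the URL for a user's status document, following the pattern above."""
    safe_name = urllib.parse.quote(screen_name, safe="")
    return f"http://www.twitter.com/users/show/{safe_name}.xml"


def fetch_user_xml(screen_name: str, timeout: float = 10.0) -> str:
    """Perform the HTTP GET and return the response body as text.

    This makes a real network call, so it is defined here but not invoked.
    """
    with urllib.request.urlopen(user_show_url(screen_name), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

No SDK, no generated stubs, no login step: one URL and one standard-library call.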

Aren’t APIs just web services?

Well, yes they are. In fact, all web services match the description I put in the section above except for one thing — a web service does not necessarily have to be accessed using the public Internet. In fact, an awful lot of web services today run inside a corporate network and are never touched by the Internet.

But on the Internet the term “API” has become a lot more prevalent than the term “web service.” That is partly due to semantics: “web service” typically implies “SOA,” which to many people implies “SOAP,” which in turn makes people think, “big, complicated, and expensive.” (And if you don’t know what I was talking about in this paragraph then you probably had the same reaction.)

So from a purely pedantic perspective, an “API” is just a “web service.” But that doesn’t mean there isn’t an important distinction — typically, something called an “API” is designed for the Internet, to be consumed by many different types of clients, and if it is successful it is pretty simple. On the other hand, a SOAP web service running on a big “web services stack” inside a corporate network could be a thing of beauty that is widely adopted, or it could be a tangle of spaghetti that keeps one or two applications intimately intertwined for the foreseeable future.

Why build an API?

We will talk more about this in the future. The short answer is that an API is a great way to get more people to use the services that you offer. For instance, most Twitter traffic comes from its API, not from the web site. (If you’re using a desktop Twitter client like Twhirl or Spaz or even some of the iPhone or Blackberry apps, you’re using the Twitter API.) It also lets you focus on the functionality that your company provides rather than on presentation. Do you run a trucking company with great scheduling and routing capabilities? You can expose those as an API that your clients can embed into their own applications more easily than you can build a slick web site and keep it up to date with the latest fashions in web site design and implementation.

The Challenge for the API Builder

If you’re planning to build an API and expose it to the Internet then you’re going to have to face some challenges that you won’t necessarily find when building an internal web service. For instance:

Design. Like I said, the best APIs are the simplest, but designing a simple API isn’t easy. Plus, what’s simple to one user population isn’t simple to another. A “REST-style” API like Twitter’s is great for AIR programmers or Perl hackers but someone accessing it from inside a big web app server stack might actually find it easier to use a SOAP web service with a proper WSDL. On the other hand, a SOAP-only API would have been death for Twitter because it would have meant that those tens of thousands of Perl hackers would have had a heck of a time using it in the first place.

Compatibility. Let’s say you don’t get the design right the first time (and how often does that happen?). How many “old” versions of your API can you afford to keep running to keep clients functioning? Are you willing to tell your users, “Sorry, we changed the API and now you have to rewrite your apps”?

Authentication and Authorization. What does your API do? If it just lets you look up public information, maybe you don’t need authentication. But are you planning on using it with more sensitive data? Will people be using your API to spend money? They’re going to expect that they have to authenticate using a username and password at the very least. There are quite a few ways to do that — which one(s) will you choose? How will you manage all those accounts?

Threat protection. Is there a possibility that a malformed API request can send your servers off into la-la land, trying to execute an impossible query? Did you code everything right to prevent a SQL injection attack? What if a client sends your servers some bizarre XML? Will they run out of memory or crash?
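For the SQL injection question in particular, the usual defense is to pass client input as a bound parameter rather than pasting it into the query string. Here is a small sketch using Python’s built-in sqlite3 module. The “shipments” table and the function names are invented for the example.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical shipments table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shipments (id INTEGER PRIMARY KEY, customer TEXT, destination TEXT)"
    )
    conn.executemany(
        "INSERT INTO shipments (customer, destination) VALUES (?, ?)",
        [("acme", "Boston"), ("globex", "Denver")],
    )
    return conn


def find_shipments(conn: sqlite3.Connection, customer: str) -> list:
    """Look up shipments for one customer.

    The "?" placeholder makes the driver treat the input strictly as data,
    so quotes inside it cannot change the structure of the query.
    """
    cursor = conn.execute(
        "SELECT id, destination FROM shipments WHERE customer = ? ORDER BY id",
        (customer,),
    )
    return cursor.fetchall()
```

With this in place, a request for the customer named x' OR '1'='1 simply matches no rows instead of returning everyone’s shipments.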

Latency. Since the goal of your API is to provide a service over the Internet, you will have to live with anywhere up to several hundred milliseconds of latency just to get to and from your API. If each API request takes hundreds more milliseconds, or even several seconds, to run, then how will that affect the perception of your service?

Visibility. Who is using your API? How often? How do the patterns change over time? What kind of latency are they seeing? How many errors do they get? Do different users see a higher error rate? Is the user you signed up last week actually using the API? These are all questions you will want to answer in order to serve your customers better.
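Even a very small amount of bookkeeping can start to answer these questions. The sketch below keeps per-user call counts, error counts, and latency in memory. The class names are made up, and counting any status of 400 or above as an error is my assumption; a real deployment would more likely feed a log or metrics system than a Python dictionary.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class UserStats:
    """Running totals for one API user."""
    calls: int = 0
    errors: int = 0
    total_latency_ms: float = 0.0

    @property
    def error_rate(self) -> float:
        return self.errors / self.calls if self.calls else 0.0

    @property
    def mean_latency_ms(self) -> float:
        return self.total_latency_ms / self.calls if self.calls else 0.0


class ApiUsageLog:
    """In-memory record of who called the API and how it went."""

    def __init__(self) -> None:
        self._stats = defaultdict(UserStats)

    def record(self, user: str, latency_ms: float, status_code: int) -> None:
        stats = self._stats[user]
        stats.calls += 1
        stats.total_latency_ms += latency_ms
        if status_code >= 400:  # assumption: 4xx and 5xx both count as errors
            stats.errors += 1

    def stats_for(self, user: str) -> UserStats:
        # A user with no recorded calls comes back with zeros, which answers
        # "is the user we signed up last week actually using the API?"
        return self._stats[user]
```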

Rate limiting. How do you plan to limit user access to your API? Sometimes the right answer is to do nothing — and this is often the right answer for an internal system, where saying “no” is not an option. But for a public, Internet-based API, you owe it to yourself to at least protect your API against disaster — a user who decides today’s a great day to see if they can call your API 100 or 1000 times per second, or one that makes a programming mistake and codes up an infinite loop, or worse. And if you’re planning on a larger user population, then a formal set of quotas makes a lot of sense, which is why Twitter, Yahoo, Google, Amazon, and others all put limits on how much you’re allowed to use their APIs before you give them a call and let them know what you’re up to.
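The “protect against disaster” case does not take much code. Below is a sketch of a fixed-window limiter: each client gets a set number of calls per time window, and anything beyond that is refused until the window rolls over. The class name, the per-client key, and the injectable clock are choices made for this example; this is one simple technique among several, and I am not claiming it is how any of the companies mentioned implement their quotas.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` calls per client in each `window_seconds` window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._windows: Dict[str, Tuple[float, int]] = {}  # client -> (window start, count)

    def allow(self, client_id: str) -> bool:
        """Return True if this call is within the client's quota."""
        now = self.clock()
        start, count = self._windows.get(client_id, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._windows[client_id] = (start, count)
            return False
        self._windows[client_id] = (start, count + 1)
        return True
```

A caller that gets False back would typically receive an error response telling it to slow down. Note that the counts live in one process’s memory, so a limiter shared across several servers would need a shared store instead.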

Thanks for reading this far!

Over the next weeks and months, I hope to write more about these topics and many others.

 
