Basic Concepts

 


 

Requests

 

Responses

  • All responses are UTF-8 encoded.
  • Times are expressed in Unix epoch format.
  • List responses with zero elements include an empty container element and an HTTP response code of 200.
  • Responses to requests for a single element, when the element does not exist, return an error element and an HTTP response code of 404.
  • See Errors for description of status codes and error behavior.
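
As a small illustration of the epoch timestamps mentioned above, here is a sketch that converts between API timestamps and JavaScript Date objects. It assumes the timestamps are whole seconds (the usual Unix convention) rather than milliseconds, and the helper names are made up for this example.

```javascript
// Hypothetical helpers; assumes API timestamps are Unix epoch *seconds*.
function epochToDate(epochSeconds) {
    // JavaScript Dates count milliseconds, so scale up by 1000.
    return new Date(Number(epochSeconds) * 1000);
}

function dateToEpoch(date) {
    // Round down to whole seconds for use in arguments like min_date.
    return Math.floor(date.getTime() / 1000);
}
```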

 

Responses May Adjust Requested Time Periods

You'll notice that some responses contain data for slightly different time periods than those you requested with arguments like min_date, max_date, and so on. That's because the API adjusts your arguments in order to respond quickly.

 

You can still get exactly the data you need, but you may have to make additional requests. Each response includes the adjusted timestamps in the min_date and max_date properties of the root element. Your application must pay attention to the adjusted timestamps and, if necessary, make an additional request for adjacent time periods. Your application also must examine each element in the response and ignore those that fall outside the precise time period required.

 

Or you may find it easier to design your application to be flexible about the time periods.

 

In the response, the min_date property is always present. The max_date property is present only when the request included a maximum timestamp argument, such as max_date, max_submit_date, or max_promote_date.
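
The checks described in this section might be sketched as follows. This is only an outline under some assumptions: the response has already been parsed from JSON, each item carries its own timestamp in a property called date (a guess, so check the endpoint documentation), and both ends of the wanted period are treated as inclusive.

```javascript
// Sketch only. `items` is the list of elements from the response, and the
// per-item `date` property is an assumption, not taken from the API docs.
function trimToPeriod(response, items, wantedMin, wantedMax) {
    var kept = [], i, when;
    for (i = 0; i < items.length; i++) {
        when = Number(items[i].date);
        // Ignore items outside the precise period the application needs.
        if (when >= wantedMin && when <= wantedMax) {
            kept.push(items[i]);
        }
    }
    return {
        items: kept,
        // The adjusted period starts later than requested, so an
        // adjacent, earlier period still has to be fetched.
        needsEarlier: Number(response.min_date) > wantedMin,
        // max_date is only present when a maximum argument was sent.
        needsLater: response.max_date !== undefined &&
                    Number(response.max_date) < wantedMax
    };
}
```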

 

Implicit Time Periods

Many requests accept arguments like min_date and max_date. Even if you leave these arguments out, many API requests imply a time period to ensure a quick response.

 

For example, this request seems to ask for every Digg by every Digg user, ever (well, back to 2004, anyway):

 

http://services.digg.com/stories/diggs?appkey=http%3A%2F%2Fapidoc.digg.com

 

(That's an example of the List Events endpoint.)

 

We'd like to give you that information, but, as you can imagine, it would take some time to compile a response that includes millions of Diggs. (Sure, you'll only get up to 100 of them at a time, but we still need to figure out how many there are!)

 

In order to respond quickly, the API applies an implicit time period. For the List Events endpoint, the implicit time period is currently "the last hour." This may change from time to time, so your application should look to the min_date and/or max_date properties of the root element of the response for the time period applied.

 

To learn which requests use an implicit time period, see the documentation for each endpoint. An implicit time period is often not used when the request is already limited to, for example, a single user or a single story. The API always indicates whether an implicit time period was applied by including the min_date and/or max_date properties in the root element of the response.

 

Multiple Requests to Collect Continuous Data

Since the API may adjust the requested time period and apply implicit time periods to requests, how can an application collect continuous data? By making successive requests for adjacent time periods.

 

Imagine an application that needs to collect data from the latest to the earliest. That application should:

 

  1. Make an initial request with no min_date or max_date argument. The response will include only the latest items. The start of the implicit time period can be found in the min_date property of the root element of the response.
  2. Note the min_date property in the response, and make another request with that timestamp as the max_date argument. The response will include items which immediately precede that timestamp, and a new value in the min_date property.
  3. Repeat step 2 until the requests have gone back in time far enough to retrieve all of the data needed.

 

Note: It may be necessary to use the offset and count arguments to retrieve all of the data for each time period. See Using Offsets for more information.
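
The latest-to-earliest steps above could be sketched like this. It is not a definitive implementation: fetchPeriod is a hypothetical function that makes one API request with the given arguments and hands the parsed response to a callback, the items property name is assumed, and paging with offset and count is left out to keep the outline short.

```javascript
// Sketch only. `fetchPeriod(args, callback)` and `response.items` are
// assumptions made for this example.
function collectBackward(fetchPeriod, earliestWanted, callback, maxDate, storage) {
    storage = storage || [];
    var args = {};
    if (maxDate !== undefined) {
        args.max_date = maxDate;   // step 2: the previous min_date becomes max_date
    }
    fetchPeriod(args, function (response) {
        var i, minDate = Number(response.min_date);
        for (i = 0; i < response.items.length; i++) {
            storage.push(response.items[i]);
        }
        // Stop once we are far enough back, or if the period stops moving
        // (which would otherwise repeat the same request forever).
        if (isNaN(minDate) || minDate <= earliestWanted || minDate === maxDate) {
            callback(storage);
            return;
        }
        // Step 3: repeat with the new min_date.
        collectBackward(fetchPeriod, earliestWanted, callback, minDate, storage);
    });
}
```

Items arrive newest period first, so the collected list runs backward in time.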

 

Now consider an application that keeps track of current data in real time. That application should:

 

  1. Make an initial request with no min_date or max_date argument. The response will include only the latest items. The end of the implicit time period can be found in the timestamp property of the root element of the response--it'll be close to the time when the request was received.
  2. Note the timestamp in the response, and make another request with that timestamp as the min_date argument. The response will include items which immediately follow that timestamp, and a new value in the timestamp property.
  3. Repeat step 2 to get even more current data.

 

But remember, be polite (see Be Polite, Please! below): don't make requests every second, or our operations guys may see your application as an attacking robot. Consider making one request per minute, and processing or displaying the data between requests. That's how Stamen's Flash toolkit does its magic.
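
The real-time steps above might look roughly like this. As before, fetchPeriod is a hypothetical function that makes one API request and passes the parsed response to a callback, and the items property name is an assumption.

```javascript
// Sketch only. `fetchPeriod(args, callback)` and `response.items` are
// assumptions made for this example.
function makePoller(fetchPeriod, onItems) {
    var lastTimestamp;   // timestamp property of the previous response
    return function poll() {
        var args = {};
        if (lastTimestamp !== undefined) {
            args.min_date = lastTimestamp;   // step 2: continue where we left off
        }
        fetchPeriod(args, function (response) {
            lastTimestamp = response.timestamp;
            onItems(response.items);
        });
    };
}
```

To stay polite, an application could call poll() once and then schedule it with setInterval(poll, 60 * 1000), which gives one request per minute.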

 

Using Offsets

 

The offset and count arguments are used to ensure that each response is of a reasonable size. When a request specifies a large amount of data, such as many Diggs, users, or news items, each response provides only a portion of that data. Additional requests can retrieve, piece by piece, the desired result, by specifying an offset into the data and a count of elements to be returned. The maximum count for each request is limited. See the documentation of each endpoint for the specific limit.

 

Retrieving the full set of data in this way can require a fairly complex string of asynchronous requests, so example code is provided below to bootstrap common tasks.

 

Note: Example code generally assumes that you're using Prototype. You should be familiar with it before proceeding any further.

 

Fetching Friends

 

No, that wasn't an adjective. Fetching a user's friends is an example of the simplest usage of offsets: a single level of recursive callbacks.

 

Assuming that proxy.php is your proxy:

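function fetchFriends(storage, offset, callback) {
    storage = storage || [];
    offset = offset || 0;
    callback = callback || function() {};

    // The API URL travels inside the proxy_url argument, so encode it as a whole.
    var apiUrl = 'http://testapi.digg.internal/user/digitalgopher/fans?type=json&count=100&offset=' + offset;
    new Ajax.Request('/proxy.php?proxy_url=' + encodeURIComponent(apiUrl), {
        method: 'get',
        onComplete: function(transport) {
            // eval() is only acceptable here because the response comes from a trusted source.
            var i, fansChunk = eval('(' + transport.responseText + ')');

            // Store this chunk before deciding whether to stop, so the last chunk is not lost.
            for (i = 0; i < fansChunk.users.length; i++) {
                storage.push(fansChunk.users[i]);
            }

            // Stop when the final chunk has arrived, or once the offset passes 100
            // (a cap that keeps this example to a few requests).
            if (Number(fansChunk.count) + Number(fansChunk.offset) >= Number(fansChunk.total) || fansChunk.offset > 100) {
                callback(storage);
                return;
            }

            fetchFriends(storage, Number(fansChunk.offset) + 100, callback);
        }
    });
}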

 

Usage:

fetchFriends([], 0, function(storage) {
    dump(storage.inspect() + "\n");
});

 

User Agents

All API requests must include a User-Agent HTTP Header. A request without this header will receive no response.

Some commonly used languages do not send the User-Agent header by default.

In PHP, for example, one must explicitly set the user_agent setting in php.ini or through ini_set(). Example:

ini_set('user_agent', 'My-Application/2.5');

In Ruby, the User-Agent header can be explicitly included:

require 'open-uri'
URI.open('http://services.digg.com/user/sbwms', 'User-Agent' => 'My-Application/2.5')

(Thanks, Lynn.)

We welcome contributions of other examples.
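
In that spirit, here is a sketch for server-side JavaScript using the http module that ships with Node.js. The application name is a placeholder, as in the examples above, and the function names are made up for this example. (Browser scripts are a different matter: the browser sets the User-Agent header itself.)

```javascript
var http = require('http');

// Placeholder application name; use your own.
var USER_AGENT = 'My-Application/2.5';

// Build the options that carry the User-Agent header.
function requestOptions() {
    return { headers: { 'User-Agent': USER_AGENT } };
}

// Fetch a user and pass (error, statusCode, body) to the callback.
function fetchUser(username, callback) {
    var url = 'http://services.digg.com/user/' + encodeURIComponent(username);
    http.get(url, requestOptions(), function (res) {
        var body = '';
        res.setEncoding('utf8');   // all responses are UTF-8 encoded
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () { callback(null, res.statusCode, body); });
    }).on('error', callback);
}
```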

 

Be Polite, Please!

Use good judgment when designing your application. Don't make repeated requests for the same data frequently. Examples:

 

  • If you call the API to find out whether a story has been submitted, wait a minute or two between requests.
  • To track diggs as they occur, call the API once a minute to retrieve the diggs during the last minute. The response includes a "max_date" timestamp that you can use as the "min_date" argument for your next call.
  • If your application doesn't need real-time data, cache the API responses on your server or use a cached proxy. (More on caching below.)

 

We monitor API usage, and we may block applications that do silly things. Use good judgment to keep your application running smoothly!

 

Caching API Responses

 

Most applications don't need real-time API responses and so should use caching to avoid getting blocked. This is especially true if you have a high-traffic web site and you want to display Digg data directly on your web pages. All web sites, regardless of traffic, should use one of these methods to cache API responses:

 

  • Make the API request in your server-side code and cache the response.
  • Set up a caching proxy on your web site. If your web page calls the Digg API via AJAX, you'll need a proxy in any case, due to JavaScript security restrictions, and it's straightforward to add a cache to your proxy. You're welcome to use Services_Digg_Proxy from PEAR Services_Digg to implement your proxy, and you can easily add caching.
  • Use Digg's public API proxy (below). This is especially suitable for script tags in your web pages.
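
The first option above, caching in your own server-side code, can be as simple as remembering each response for a while. The sketch below shows one way to do that in memory. fetchFresh is a hypothetical function that performs the real API request, and the clock is passed in only to make the idea easy to test.

```javascript
// Sketch only. `fetchFresh(url, callback)` is an assumption made for this
// example; `now` is an optional clock returning epoch seconds.
function makeCachedFetch(fetchFresh, maxAgeSeconds, now) {
    var cache = {};
    now = now || function () { return Math.floor(new Date().getTime() / 1000); };
    return function (url, callback) {
        var entry = Object.prototype.hasOwnProperty.call(cache, url) ? cache[url] : null;
        if (entry && now() - entry.storedAt < maxAgeSeconds) {
            callback(entry.body);   // still fresh: no API request is made
            return;
        }
        fetchFresh(url, function (body) {
            cache[url] = { body: body, storedAt: now() };
            callback(body);
        });
    };
}
```

A maximum age of a minute or more fits the advice in Be Polite, Please!; a real server would probably also want to limit how large the cache can grow.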

 

Digg's Public API Proxy

 

Digg provides a public API proxy that any application can use. The background is kind of interesting:

 

Digg invites everyone to Create a Digg Widget for their own web sites. Digg widgets are just script tags embedded in any web page. The script tag loads JavaScript from the Digg API, using the JavaScript response type. Many web pages have Digg widgets, and millions of people open those web pages in their web browsers.

 

If Digg widgets called the API directly, there would be an unnecessary load on the API servers, especially as widgets don't need to display real-time information. So Digg widgets instead call the API through Digg's public API proxy, which caches the responses, reducing the load on the Digg API. Digg's public API proxy uses Services_Digg_Proxy, so you can read full documentation under PEAR Services_Digg. But here's the cookbook version:

 

To use Digg's public API proxy, just change the URL of any API request to http://digg.com/tools/services. The proxy takes an "endPoint" query string argument specifying the Digg API endpoint requested, together with the other query string arguments from the API request.

 

Here's an example. The following direct API request asks for information about a page on the Digg Blog at http://blog.digg.com/?p=98:

 

http://services.digg.com/stories?link=http%3A%2F%2Fblog.digg.com%2F%3Fp%3D98&appkey=http%3A%2F%2Fexample.com&type=xml

 

The corresponding call to Digg's public API proxy is:

 

http://digg.com/tools/services?endPoint=/stories&link=http%3A%2F%2Fblog.digg.com%2F%3Fp%3D98&type=xml&appkey=http%3A%2F%2Fexample.com

 

This request uses the XML response type. For use in a script tag, you would use the JavaScript response type instead.
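
The URL change described in this section is mechanical, so it can be sketched as a small helper. The function name is made up for this example, and it assumes the direct request starts with http://services.digg.com. The query string arguments are passed through unchanged, so their order follows the original request.

```javascript
// Sketch only; the function name is hypothetical.
function toProxyUrl(directUrl) {
    var prefix = 'http://services.digg.com';
    if (directUrl.indexOf(prefix) !== 0) {
        throw new Error('Not a direct Digg API URL: ' + directUrl);
    }
    var rest = directUrl.substring(prefix.length);   // e.g. "/stories?link=..."
    var q = rest.indexOf('?');
    var endPoint = q === -1 ? rest : rest.substring(0, q);
    var query = q === -1 ? '' : rest.substring(q + 1);
    // The endpoint path moves into the endPoint argument; the rest follows.
    return 'http://digg.com/tools/services?endPoint=' + endPoint +
           (query ? '&' + query : '');
}
```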

