You'll notice that some responses contain data for slightly different time periods than those you requested with arguments like min_date, max_date, and so on. That's because the API adjusts your arguments in order to respond quickly.
You can still get exactly the data you need, but you may have to make additional requests. Each response includes the adjusted timestamps in the min_date and max_date properties of the root element. Your application must pay attention to the adjusted timestamps and, if necessary, make an additional request for adjacent time periods. Your application also must examine each element in the response and ignore those that fall outside the precise time period required.
Or you may find it easier to design your application to be flexible about the time periods.
In the response, the min_date property is always present. The max_date property is present only when the request included a maximum timestamp argument, such as max_date, max_submit_date, or max_promote_date.
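As a rough sketch of that bookkeeping, the function below trims a response to the period you actually wanted and lists any adjacent periods you would still need to request. The min_date and max_date properties of the root are described above; the items array, the date property on each item, the Unix-seconds timestamps, and the inclusive boundaries are all assumptions made for illustration, so adapt them to the endpoint you're using.

```javascript
// wanted:   { min: <timestamp>, max: <timestamp> }  (the period you really need)
// response: parsed root element. min_date/max_date are documented above;
//           `items` and `item.date` are hypothetical names for this sketch.
function reconcile(wanted, response) {
  // Ignore elements that fall outside the precise period required.
  var items = response.items.filter(function (item) {
    return item.date >= wanted.min && item.date <= wanted.max;
  });

  // Work out which adjacent periods the adjusted response did not cover.
  var missing = [];
  if (response.min_date > wanted.min) {
    missing.push({ min_date: wanted.min, max_date: response.min_date });
  }
  // max_date is only present when the request included a maximum timestamp.
  if (response.max_date !== undefined && response.max_date < wanted.max) {
    missing.push({ min_date: response.max_date, max_date: wanted.max });
  }
  return { items: items, missing: missing };
}
```

Each entry in missing can be sent as the arguments of a follow-up request, and its response passed through reconcile again.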
Many requests allow you to specify arguments like min_date and max_date, but even if you do not include these arguments in your request, many API requests imply a time period, to ensure a quick response.
For example, this request seems to ask for every Digg by every Digg user, ever (well, back to 2004, anyway):
http://services.digg.com/stories/diggs?appkey=http%3A%2F%2Fapidoc.digg.com
(That's an example of the List Events endpoint.)
We'd like to give you that information, but, as you can imagine, it would take some time to compile a response that includes millions of Diggs. (Sure, you'll only get up to 100 of them at a time, but we still need to figure out how many there are!)
In order to respond quickly, the API applies an implicit time period. For the List Events endpoint, the implicit time period is currently "the last hour." This may change from time to time, so your application should look to the min_date and/or max_date properties of the root element of the response for the time period applied.
To learn which requests use an implicit time period, see the documentation for each endpoint. Often, an implicit time period is not used when the request is limited to, for example, a single user or a single story. The API always indicates whether an implicit time period was applied, by including the min_date and/or max_date properties in the root element of the response.
Since the API may adjust the requested time period and apply implicit time periods to requests, how can an application collect continuous data? By making successive requests for adjacent time periods.
Imagine an application that needs to collect data from the latest to the earliest. That application should:
Note: It may be necessary to use the offset and count arguments to retrieve all of the data for each time period. See Using Offsets for more information.
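One way to walk backwards through adjacent time periods might look like the sketch below: each request uses the min_date reported by the previous response as its max_date. This is our reading of the approach, not an official recipe. The request function is a hypothetical helper that sends one API request with the given arguments and hands the parsed root element to a callback; earliest is the oldest timestamp you care about.

```javascript
// Decide the arguments for the next (older) request, or null when finished.
function nextOlderRequest(previousArgs, response, earliest) {
  if (response.min_date <= earliest) return null;               // reached the target
  if (response.min_date === previousArgs.max_date) return null; // no progress, give up
  return { max_date: response.min_date };
}

// Collect the root element of every response, latest period first.
// `request(args, callback)` is a hypothetical helper, not part of the API.
function collectBackward(request, earliest, done, args, roots) {
  args = args || {};   // first request: no max_date, the API picks the latest period
  roots = roots || [];
  request(args, function (response) {
    roots.push(response);
    var next = nextOlderRequest(args, response, earliest);
    if (next === null) {
      done(roots);
      return;
    }
    collectBackward(request, earliest, done, next, roots);
  });
}
```

Elements that sit exactly on a boundary may show up in two adjacent responses, so you may want to de-duplicate them by whatever identifier the endpoint provides.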
Now consider an application that keeps track of current data in real time. That application should:
But remember, be polite: don't make requests every second, or our operations guys may see your application as an attacking robot. Consider making one request per minute, and processing or displaying the data between requests. That's how Stamen's Flash toolkit does its magic.
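Here is a minimal sketch of a polite polling loop, assuming one request per minute as suggested above. It remembers the newest timestamp it has seen and asks only for data after it. The request helper, the items array, and the date property on each item are hypothetical names used for illustration; min_date on the root is the adjusted start of the period, as described earlier.

```javascript
var POLL_INTERVAL_MS = 60 * 1000; // one request per minute

// Fold one response into the polling state.
// since: newest timestamp seen so far, or null before the first response.
function absorb(since, response) {
  var fresh = response.items.filter(function (item) {
    return since === null || item.date > since;
  });
  var newest = fresh.reduce(function (max, item) {
    return Math.max(max, item.date);
  }, since === null ? response.min_date : since);
  return {
    since: newest,
    fresh: fresh,
    // true when the API started later than we asked, so some data was skipped
    gap: since !== null && response.min_date > since
  };
}

// `request(args, callback)` is a hypothetical helper that performs one API call.
function startPolling(request, onFresh) {
  var since = null;
  return setInterval(function () {
    var args = since === null ? {} : { min_date: since };
    request(args, function (response) {
      var result = absorb(since, response);
      since = result.since;
      onFresh(result.fresh, result.gap);
    });
  }, POLL_INTERVAL_MS);
}
```

When gap is true, the application can decide whether to make one extra request for the skipped period or simply carry on.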
The offset and count arguments are used to ensure that each response is of a reasonable size. When a request specifies a large amount of data, such as many Diggs, users, or news items, each response provides only a portion of that data. Additional requests can retrieve, piece by piece, the desired result, by specifying an offset into the data and a count of elements to be returned. The maximum count for each request is limited. See the documentation of each endpoint for the specific limit.
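The arithmetic behind "piece by piece" is small enough to sketch on its own. The helper below assumes the root element of each response carries count, offset, and total properties (the same ones the example code in this section reads) and returns the offset for the next request, or null once everything has been retrieved. The function name is ours.

```javascript
// chunk: parsed root element of one response.
function nextOffset(chunk) {
  // The values may arrive as strings, so convert before adding.
  var count = Number(chunk.count);
  var next = Number(chunk.offset) + count;
  // An empty chunk means there is nothing more to fetch (and avoids looping forever).
  if (count === 0 || next >= Number(chunk.total)) {
    return null;
  }
  return next;
}
```

For example, with a total of 250 elements and a count of 100, the requests would use offsets 0, 100, and 200.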
Retrieving the full set of data in this way can require a fairly complex string of asynchronous requests, so example code is provided below to bootstrap common tasks.
Note: Example code generally assumes that you're using Prototype. You should be familiar with it before proceeding any further.
No, that wasn't an adjective. Fetching a user's friends is an example of the simplest usage of offsets: a single level of recursive callbacks.
Assuming that proxy.php is your proxy:
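function fetchFriends(storage, offset, callback) {
    storage = storage || [];
    offset = offset || 0;
    callback = callback || function() {};
    // Build the API URL, then encode it whole so the proxy receives it as a single argument.
    var apiUrl = 'http://testapi.digg.internal/user/digitalgopher/fans?type=json&count=100&offset=' + offset;
    new Ajax.Request('/proxy.php?proxy_url=' + encodeURIComponent(apiUrl), {
        method: 'get',
        onComplete: function(transport) {
            // Parse the JSON body with Prototype's String#evalJSON instead of a bare eval.
            var i, fansChunk = transport.responseText.evalJSON(true);
            // Store this chunk before testing for the end, so the last page is not dropped.
            for (i = 0; i < fansChunk.users.length; i++) {
                storage.push(fansChunk.users[i]);
            }
            // Stop on the last page. This example also stops after two pages (200 users).
            if (Number(fansChunk.count) + Number(fansChunk.offset) >= Number(fansChunk.total) || Number(fansChunk.offset) >= 100) {
                callback(storage);
                return;
            }
            // Otherwise request the next page.
            fetchFriends(storage, Number(fansChunk.offset) + 100, callback);
        }
    });
}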
Usage:
fetchFriends([], 0, function(storage) {
    dump(storage.inspect() + "\n");
});
All API requests must include a User-Agent HTTP Header. A request without this header will receive no response.
Some commonly used languages do not send the User-Agent header by default.
In PHP, for example, one must explicitly set the user_agent setting in php.ini or through ini_set(). Example:
ini_set('user_agent', 'My-Application/2.5');
In Ruby, the User-Agent header can be explicitly included:
require 'open-uri'
URI.open('http://services.digg.com/user/sbwms', 'User-Agent' => 'My-Application/2.5')
(Thanks, Lynn.)
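For JavaScript running under Node.js, a sketch along the same lines follows. To our knowledge the built-in http module does not add a User-Agent header on its own, so the header is set explicitly. The function names are ours, and the application name is the same placeholder used in the examples above.

```javascript
var http = require('http');

// Build request options that always carry a User-Agent header.
function apiRequestOptions(apiUrl, userAgent) {
  var url = new URL(apiUrl);
  return {
    hostname: url.hostname,
    path: url.pathname + url.search,
    headers: { 'User-Agent': userAgent }
  };
}

// Send the request and pass the whole response body to the callback.
function fetchWithUserAgent(apiUrl, userAgent, callback) {
  http.get(apiRequestOptions(apiUrl, userAgent), function (res) {
    var body = '';
    res.on('data', function (piece) { body += piece; });
    res.on('end', function () { callback(null, body); });
  }).on('error', callback);
}
```

Usage would be something like fetchWithUserAgent('http://services.digg.com/user/sbwms', 'My-Application/2.5', handler), where handler receives an error or the response body.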
We welcome contributions of other examples.
Use good judgment when designing your application. Don't make repeated requests for the same data frequently. Examples:
We monitor API usage, and we may block applications that do silly things. Use good judgment to keep your application running smoothly!
Most applications don't need real-time API responses and so should use caching to avoid getting blocked. This is especially true if you have a high-traffic web site and you want to display Digg data directly on your web pages. All web sites, regardless of traffic, should use one of these methods to cache API responses:
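Whatever method you choose, the core idea can be sketched as a small time-limited cache keyed by request URL. This is a generic illustration rather than a Digg-provided component: the function name, the in-memory storage, and the lifetime you pass in are all assumptions, and a real web site would more likely use a shared cache or files on disk.

```javascript
// ttlMs: how long a stored response stays valid, in milliseconds.
// now:   optional clock function (defaults to Date.now), handy for testing.
function createCache(ttlMs, now) {
  now = now || Date.now;
  var entries = new Map();
  return {
    // Return the cached body for this URL, or undefined if absent or expired.
    get: function (url) {
      var entry = entries.get(url);
      if (entry && now() - entry.storedAt < ttlMs) {
        return entry.body;
      }
      entries.delete(url);
      return undefined;
    },
    // Remember a response body for this URL.
    put: function (url, body) {
      entries.set(url, { storedAt: now(), body: body });
    }
  };
}
```

Before calling the API, look the URL up with get; only when it returns undefined make the request and store the result with put.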
Digg provides a public API proxy that any application can use. The background is kind of interesting:
Digg invites everyone to Create a Digg Widget for their own web sites. Digg widgets are just script tags embedded in any web page. The script tag loads Javascript from the Digg API, using the Javascript response type. Many web pages have Digg widgets, and millions of people open those web pages in their web browsers.
If Digg widgets called the API directly, there would be an unnecessary load on the API servers, especially as widgets don't need to display real-time information. So Digg widgets instead call the API through Digg's public API proxy, which caches the responses, reducing the load on the Digg API. Digg's public API proxy uses Services_Digg_Proxy, so you can read full documentation under PEAR Services_Digg. But here's the cookbook version:
To use Digg's public API proxy, just change the URL of any API request to http://digg.com/tools/services. The proxy takes an "endPoint" query string argument specifying the Digg API endpoint requested, together with the other query string arguments from the API request.
Here's an example. The following direct API request asks for information about a page on the Digg Blog at http://blog.digg.com/?p=98:
http://services.digg.com/stories?link=http%3A%2F%2Fblog.digg.com%2F%3Fp%3D98&appkey=http%3A%2F%2Fexample.com&type=xml
The corresponding call to Digg's public API proxy is:
http://digg.com/tools/services?endPoint=/stories&link=http%3A%2F%2Fblog.digg.com%2F%3Fp%3D98&type=xml&appkey=http%3A%2F%2Fexample.com
This request uses the XML response type. For use in a script tag, however, you would use the Javascript response type.
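If you'd rather not rewrite URLs by hand, the conversion can be sketched as a small function. It follows the rule described above (path becomes the endPoint argument, all other arguments are carried over); the function name is ours, and the argument order in the result differs slightly from the hand-written example, which should not matter.

```javascript
// Turn a direct API request URL into the equivalent public proxy URL.
function toProxyUrl(directUrl) {
  var direct = new URL(directUrl);
  // The endpoint path goes first, followed by the original arguments.
  var parts = ['endPoint=' + direct.pathname];
  direct.searchParams.forEach(function (value, name) {
    // searchParams decodes each value, so encode it again exactly once.
    parts.push(encodeURIComponent(name) + '=' + encodeURIComponent(value));
  });
  return 'http://digg.com/tools/services?' + parts.join('&');
}
```

Passing the direct request from the example above yields a proxy URL beginning with http://digg.com/tools/services?endPoint=/stories&link=...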