February 10: Pagination in a json-server API with the link header

Software engineering is chock-full of problems which seem boring and pedestrian at first, but unlock a world of intrigue and insight upon close study. Case in point? API pagination. Seriously, the title of this blog post is like a Seroquel for the soul — just looking at it makes me want to crawl into bed and cry. But APIs are a huge, important, fascinating topic — the stuff that billion-dollar fortunes and Supreme Court cases are made of.

If you manage to stay awake long enough to follow me to the end of this blog post, hopefully you’ll deepen your understanding of the request-response cycle in API-based web applications. That knowledge will give you the confidence to do way more than just display objects on a page when you’re building your own API-based app.

This topic is rather dense, so if you find it difficult to follow I encourage you to fork and clone this repository. Now, without further ado, let’s paginate!

Background: What’s pagination?

Think for a moment about what happens when a user browses content in a typical web application. The user scrolls through a list of objects, but that list is pretty short at first — let’s say for the sake of this example it’s fifty items long. When the user scrolls to object #50, the application pauses to load the next set of fifty objects to view; at object #100 the application pauses again to load the next set, and so on.

What’s going on behind the curtain? For each set of objects, the application sends a request to a web API, which comes with information about the nature of the request. The server hosting the API responds with the objects to be displayed, along with information about the context of the request. With each request, the document gets longer and longer as the application renders the result of each request.

Uh, okay… so?

JavaScript is fast — so fast that no human can reasonably tell it’s being run at all! HTTP requests, on the other hand, are slow as molasses by comparison. The delay isn’t terribly noticeable on a request for fifty objects, but what about a request for fifty thousand or fifty million objects?

The Twitter API, for example, contains billions and billions of tweets. Even if you’re not grabbing every tweet in the Twitterverse to show a single profile, an API request for a single user’s tweets will slow your application to a crawl if that profile has tens of thousands of them. That’s a huge problem when you’re delivering content to a fickle, picky user with things to do and places to be. One recent study showed users lose interest in a website after as little as ten seconds of waiting.

Enter pagination. Just about every web API allows (or forces) an engineer to limit the request to a certain size. The API may then give the engineer tools to paginate through the API with that limit — just like flipping through pages in a book. Each page represents a limit-sized chunk of the API’s data, so in our typical app above, where the limit is fifty objects, page 1 includes objects 1–50, page 2 includes objects 51–100, and so on. Limits with pagination give engineers all the flexibility and power of a huge API, with a built-in strategy to avoid the excessive load of a data-heavy HTTP request.
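The arithmetic behind a page is nothing more than an offset. Here’s a minimal sketch (the helper name pageRange is my own invention, not part of any API) that computes which objects a given page covers:

```javascript
// Given a page number and a per-page limit, compute which objects
// (1-indexed) that page covers -- e.g. page 2 with a limit of 50
// covers objects 51 through 100.
function pageRange( page, limit ) {
  const first = ( page - 1 ) * limit + 1;
  const last = page * limit;
  return { first, last };
}

console.log( pageRange( 1, 50 ) ); // { first: 1, last: 50 }
console.log( pageRange( 2, 50 ) ); // { first: 51, last: 100 }
```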

Starting up json-server

An actual web API lives on a server somewhere in the desert in California or Israel, and not on Josh’s computer in his apartment in Hell’s Kitchen… but the really fun ones cost money, and they often don’t give us full CRUD functions. So, to learn and test, we’ll use a package called json-server, a node module (an extension to basic JavaScript code) that simulates/mocks the behavior of an external API.

The documentation for json-server isn’t terribly helpful for beginners, but since web APIs are semi-standardized, the routes/responses we can use look pretty consistent with what we’d get from an external API. You can see for yourself in the repository for this blog post, which comes with a db.json file with population data for US counties. Fork and clone the repository, then install json-server through the terminal with the command npm install -g json-server. Once installed, type in json-server db.json and you’ll see a message showing you which port you’re listening to:

\{^_^}/ hi!

Loading db.json
Done

Resources
http://localhost:3000/us-counties

Home
http://localhost:3000

Type s + enter at any time to create a snapshot of the database

Now, in your favorite browser you can navigate over to http://localhost:3000/us-counties and see a giant list of counties with their populations as of the last census:

{
  "us-counties": [
    {
      "population": 55200,
      "state": "Alabama",
      "name": "Autauga",
      "type": "County"
    },
    {
      "population": 208107,
      "state": "Alabama",
      "name": "Baldwin",
      "type": "County"
    },
    ...

A quick look through the json-server documentation shows us the routes we can use and combine, following familiar, RESTful conventions. Route parameters begin with ? and can be chained with & — so, for example, the route http://localhost:3000/us-counties?state=New%20York&_sort=name gives us a list of the 62 counties in New York state in alphabetical order.

Take particular note of our pagination parameters, _page and _limit. These let us _limit the size of our API call and step through _pages sized to match that limit. So the route http://localhost:3000/us-counties?state=New%20York&_sort=population&_order=desc&_limit=10&_page=2 will show us the eleventh through twentieth most populous counties in New York state.
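Hand-assembling those query strings gets error-prone fast, so here’s one way to build them with the standard URLSearchParams class. (Note one assumption: URLSearchParams encodes the space in "New York" as a plus sign rather than %20; servers treat both as a space in query strings.)

```javascript
// Build a paginated json-server route without hand-escaping the query
// string. URLSearchParams handles the encoding for us; note it encodes
// the space in "New York" as "+" rather than "%20".
const params = new URLSearchParams( {
  state: "New York",
  _sort: "population",
  _order: "desc",
  _limit: "10",
  _page: "2",
} );
const url = `http://localhost:3000/us-counties?${ params }`;
console.log( url );
```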

Exploring the link header

Remember I said above that APIs respond to requests with data — a lot more data than just the objects requested. Google Chrome makes it easy to examine this data by giving us a Network tab in the browser console:

Take particular note of the Link header, which appears on pagination routes. Let’s keep paginating through counties in New York and examine the link headers on &_page=3 of the ?state=New%20York&_sort=population route:

<http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=1>; rel="first", <http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=2>; rel="prev", <http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=4>; rel="next", <http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=7>; rel="last"

How convenient! The link header gives us URLs for the first, previous, next and last pages of our API according to the current _limit! Even better, the prev and next links don’t appear on the first and last pages, respectively. We’ll take advantage of this to make sure a user can’t paginate to a nonexistent page. And it’s easy to use an ordinary, familiar, off-the-shelf JavaScript fetch() to get and work with link headers!

fetch( currentUrl ).then( response => response.headers.get( "Link" ) ).then(...);

Parsing the link header

Sounds easy so far, right? Unfortunately, as with everything in JavaScript, there’s a big catch — two of them, actually…

  1. The link header only appears on a _limited response with multiple _pages. If you try to call response.headers.get( "Link" ) on a route with only one page, you’ll get null.
  2. The link header is a string, not an array or object, and you’ll have to parse it into an Object to do anything useful with it.
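Catch #1 is easy to defuse with a guard before we ever parse. Here’s a minimal sketch: safeParseLinkHeader is my own (hypothetical) name for the wrapper, and it assumes some parse function like the parseLinkHeader we’ll write below.

```javascript
// Guard against catch #1: headers.get( "Link" ) yields null on a
// single-page response. Fall back to an empty object so downstream
// lookups like links.next are simply undefined instead of throwing.
function safeParseLinkHeader( linkHeader, parse ) {
  return linkHeader === null ? {} : parse( linkHeader );
}

console.log( safeParseLinkHeader( null, ( s ) => s ) ); // {}
```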

We have several options for parsing the link header, none of which are fall-off-a-log easy:

  • You can use an excellent free package called parse-link-header… but it must be installed separately with npm install parse-link-header, and it doesn’t play nice with a lot of browsers, especially mobile device browsers;
  • You can use a regular expression, like this brain bomb: /^(?:(?:(([^:\/#\?]+:)?(?:(?:\/\/)(?:(?:(?:([^:@\/#\?]+)(?:\:([^:@\/#\?]*))?)@)?(([^:\/#\?\]\[]+|\[[^\/\]@#?]+\])(?:\:([0-9]+))?))?)?)?((?:\/?(?:[^\/\?#]+\/+)*)(?:[^\?#]*)))?(\?[^#]+)?)(#.*)?/… but even if you haven’t read my previous blog posts, you can tell that working with regular expressions is about as fun as flushing your eyes with turpentine;
  • So, for this blog post, to demonstrate and explain, I’ll take the naïve approach and use my trusty set of JavaScript ginsu knives to .split(), .map(), .replace() and .slice() the link header into a five-star gourmet Object:
function parseLinkHeader( linkHeader ) {
  return Object.fromEntries( linkHeader.split( ", " ).map( header => header.split( "; " ) ).map( header => [ header[1].replace( /"/g, "" ).replace( "rel=", "" ), header[0].slice( 1, -1 ) ] ) );
}

Because this method is a festering cancerous tumor which violates every principle of good programming practice, here’s an identical version teased out into multiple lines, so we can understand what’s going on with our mental health (relatively) intact:

function parseLinkHeader( linkHeader ) {
  const linkHeadersArray = linkHeader.split( ", " ).map( header => header.split( "; " ) );
  const linkHeadersMap = linkHeadersArray.map( header => {
    const thisHeaderRel = header[1].replace( /"/g, "" ).replace( "rel=", "" );
    const thisHeaderUrl = header[0].slice( 1, -1 );
    return [ thisHeaderRel, thisHeaderUrl ];
  } );
  return Object.fromEntries( linkHeadersMap );
}

Here’s a line-by-line breakdown using the example from New York above:

  • First let’s whip up a variable, linkHeadersArray, by .split()ting individual routes and rels in the linkHeader by commas, and then splitting each pair of routes and rels by semicolons with .map() to yield the following nested array:
[
  [ '<http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=1>', 'rel="first"' ],
  [ '<http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=2>', 'rel="prev"' ],
  [ '<http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=4>', 'rel="next"' ],
  [ '<http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=7>', 'rel="last"' ]
]
  • Next, let’s turn linkHeadersArray into a linkHeadersMap with a .map() function that strips the angle brackets from the URL with .slice( 1, -1 ), strips everything except the attribute value from the rel with .replace(), and switches their places;
  • Finally, return an Object.fromEntries() of linkHeadersMap that finally gives us a custom-tailored, perfectly-cut Object that looks like this:
{
  first: 'http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=1',
  prev: 'http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=2',
  next: 'http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=4',
  last: 'http://localhost:3000/us-counties?state=New%20York&_limit=10&_page=7'
}

Putting the link header to use

Now that we’ve sacrificed a snow-white bull for the gods at midnight under a full moon to write a parseLinkHeader() method, we can put it to use in a callback, paginate( direction ). We can then listen for a 'click' on a button with addEventListener() and paginate to any of our four possible destination pages: the "first", "prev", "next", or "last" page:

let currentUrl = "http://localhost:3000/us-counties?_limit=20&_page=1";

function paginate( direction ) {
  fetch( currentUrl ).then( response => {
    let linkHeaders = parseLinkHeader( response.headers.get( "Link" ) );
    if ( !!linkHeaders[ direction ] ) {
      currentUrl = linkHeaders[ direction ];
      fetchCounties( linkHeaders[ direction ] );
    }
  } );
}

Remember how I said above that we’d take advantage of the fact that our parsed link header only gives us a "prev" or "next" _page if we aren’t on the "first" or "last" page, respectively? This makes it easy to prevent our user from paginating to the end of the universe while clicking around our app’s API. All we have to do is wrap our call to fetch the next page in a conditional asking if ( !!linkHeaders[ direction ] ), which returns true and executes the block only if our link header contains the direction we’re looking for. That way, our user can’t break our app by clicking to the previous page while on the first page, or the next page on the last page.
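To see the guard in action without a running server, here’s a quick demonstration. The lastPageLinks object is made up for illustration: it mimics what our parser returns for a last-page response, where json-server omits rel="next" entirely.

```javascript
// A made-up parsed link header for a "last page" response: the "next"
// key simply never exists, because json-server omits rel="next" there.
const lastPageLinks = {
  first: "http://localhost:3000/us-counties?_limit=20&_page=1",
  prev: "http://localhost:3000/us-counties?_limit=20&_page=3",
  last: "http://localhost:3000/us-counties?_limit=20&_page=4",
};

// The same truthiness check paginate() performs before fetching.
function canPaginate( links, direction ) {
  return !!links[ direction ];
}

console.log( canPaginate( lastPageLinks, "prev" ) ); // true
console.log( canPaginate( lastPageLinks, "next" ) ); // false
```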

Conclusion: Here I go, playin’ the star again — there I go, turn the page…

One more thing to remember from above: the link header only appears on a paginated route, and if you try to parse a null link header, you’ll drown in a fierce rip-current of Uncaught (in promise) TypeErrors. You’ll need to implement additional logic to make sure this never happens, and you can see examples of exactly that in this blog post’s repository. Most of this edge-case handling happens in a simple filter/search function I built, because some possible search queries are short (Delaware only has three counties!).

When implementing pagination, it’s your responsibility as the engineer to handle both paginated and non-paginated API routes without breaking your app. Get comfortable doing this, and you’ll be well on your way to building apps that serve hard-to-please users with robust, exciting universes of content to paginate through at lightning speed!

Oh geez, Josh Frank decided to go to Flatiron? He must be insane…