How to Choose a Drupal 7 Module for Viewing PDFs

A Case Study in the Art of Module Selection

Recently, a client asked me to add a new feature to the company's Drupal site: display PDF files in the browser. As I browsed the options on drupal.org, I realized that this was a perfect chance to document my actual decision-making process as I chose a new module. I'm always saying to choose modules wisely, but now you can see how I think this works in real life.

Define What You Want

The first step is to define what you want.

In my case, I wanted:

  • The ability to view PDF files in a web browser, similar to this example. The client would upload PDFs of the company newsletter, and visitors would be able to read them easily.

  • The site is Drupal 7, so the module would need to match that major version. (Drupal 7 has been out for a while now, so if a module developer hasn't come out with a Drupal 7 version yet, they probably won't.)

  • Although the client didn't specify this, I also wanted to avoid relying on a third-party service. For videos, I'm happy to post the content to YouTube or Vimeo and then embed it on a Drupal site, but for PDFs, I didn't think the possible extra exposure would outweigh the potential hassle, breakage, and expense. However, I was open to a third-party service if it was the only option.

  • Despite my wish to avoid a third-party service, I knew my choice would probably require a third-party JavaScript library. Although this would add an extra step on future upgrades, I generally feel better about running my own copy of a library rather than relying on a third-party service.

  • I wanted to keep the module as lightweight and specific as possible. I didn't want to get involved with some radically new way of handling or organizing media files. I wanted something more like Colorbox, which enlarges images for better viewing, but remains completely independent of how you choose to manage the image files. I had a hunch that the JavaScript library behind my eventual choice would be pdf.js, but I was open to other possibilities.

  • As usual, I wanted to follow the general guidelines for choosing a Drupal module. Basically, choose a module that has already been in use by a few thousand people (if possible) for a while, has a minimum of dependencies, doesn't require a licensing fee, and seems to be maintained by an active developer who plans to keep supporting the project into the future.
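If I wrote that wish list down as a checklist, a minimal Python sketch might look like the following. Everything here is an assumption made for illustration: the field names, the 2,000-install threshold standing in for "a few thousand", the dependency limit, and the sample modules. None of it comes from drupal.org.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One module under consideration. All field names are hypothetical."""
    name: str
    installs: int              # reported installs on the project page
    dependencies: int          # number of other modules it requires
    drupal7_release: bool      # has a release for Drupal 7
    actively_developed: bool   # recent commits or a responsive maintainer
    license_fee: bool          # the module or its library charges a fee
    third_party_service: bool  # relies on an external hosted service


def shortlist(candidates, min_installs=2000, max_dependencies=1,
              allow_third_party=False):
    """Return the names of candidates that pass every requirement."""
    keep = []
    for c in candidates:
        if not c.drupal7_release or c.license_fee:
            continue
        if c.third_party_service and not allow_third_party:
            continue
        if c.installs < min_installs or c.dependencies > max_dependencies:
            continue
        if not c.actively_developed:
            continue
        keep.append(c.name)
    return keep


# Made-up candidates, only to show the filter in action.
examples = [
    Candidate("module_a", 2900, 1, True, True, False, False),
    Candidate("module_b", 16000, 1, True, True, True, False),
    Candidate("module_c", 90, 0, True, True, False, True),
]
print(shortlist(examples))
```

The third-party rule is a switch rather than a hard failure because I was willing to relax it if nothing else worked. In practice the thresholds are judgment calls, not fixed numbers.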

Search on Drupal.org

With these goals in mind, the next step was a simple search on Drupal.org. Time to jump into the Ball Pit of Module Goodness.

"Comparison" Page for PDF Modules

My first stop was (or should have been) this page: a Comparison of PDF viewer modules. Drupal.org has an excellent tradition of documentation pages that outline the pros and cons of various modules in the same space. There's a central list of comparison pages, but they're also sprinkled throughout the site.

The PDF comparison page included four PDF viewer modules. I'll cover them here, as well as a couple of others I found from searching. I'll start with the candidates I decided to skip.

Now let's delve into the specifics of why these modules did (or mostly didn't) work for this project.

File Viewer

File Viewer uses the Internet Archive BookReader, which intrigued me because I'm an Internet Archive junkie.

Every time I go there, I feel tickles of fear and overwhelm at the mountains of books I can pluck from the ether.

That being said, the demonstration site looked a little ugly to me. I might live with it, but I doubted my client would, when pdf.js looks so much more stylish.

Also, on a second look at the project page, I saw the big bold announcement at the top: This module has been moved to PDF module formally. Fair enough. With fewer than 400 installs, merging with the more popular PDF module (which we'll cover in a moment) seems like a good move. Never download a module that has been merged, moved, or abandoned.

Google Viewer File Formatter

Google Viewer File Formatter is what it sounds like: a way to use Google Docs to embed displays of files in your web page. Although I liked the versatility of Google Docs, one of my goals was to remain independent of any third-party service.

Also, this module had fewer than 100 installs.

Ajax Document Viewer

Although "AJAX" is a general JavaScript term, Ajax Document Viewer turned out to rely on a specific third-party service. Only about 100 installs. Moving on...

Scald PDF

Scald PDF only had 40 installs, but I had to take a look, since it was clearly part of a larger project called (yes) Scald. As the Scald project page explained:

Scald is an innovative take on how to handle Media Atoms in Drupal.

That sentence raised two huge red flags: "innovative take" and the word "Media" paired with "Atom". "Atom" was obviously a repurposed word for "thing", which made it a red flag all by itself. Drupal has a penchant for these empty-box kinds of words: node, entity, feature... The more general the word, the more sweeping the changes may be.

As I scrolled down, my suspicions were confirmed. I read excited claims of how Scald would basically reinvent how I handled Media on my site.

Now, the truth is that Drupal's Media handling could use some reinventing. Scald isn't the only ambitious project in this space. However, with fewer than 1,000 installs so far, I didn't want to get in on the ground floor.

Sure, by this time next year, Scald might be the next Views. That would rock.

But it might also be abandonware, with a (small) trail of broken sites left to weep.

For now, I wanted to stick with a much less ambitious and perilous solution. Just display PDFs, please. That's all I was asking.

Shadowbox

Shadowbox surprised me: it claimed to be a single solution to displaying all kinds of media, from PDFs to images to video. This wasn't as sweeping as Scald, since it would only focus on displaying media without introducing whole new concepts like "Media Atoms". But I already like Colorbox, as I mentioned. I didn't want to have to rethink that decision.

However, I did note (with an inner groan) that with over 16,000 installs, Shadowbox could be a more powerful alternative in the same space. I had to take a look.

The Shadowbox Drupal module is basically a bridge to a JavaScript library, Shadowbox.js, so I checked out the library's website. There, I discovered two reasons to move on:

  • The library requires a license fee for commercial use. The fee was reasonable enough, but I try to avoid open-source software that isn't free.

  • A careful search of the FAQ revealed that, contrary to the description on the Drupal module page, PDFs are not 100% supported by the Shadowbox library. Oops. Good thing I checked.

The Two Contenders: "PDF" and "PDF Reader"

Having eliminated the rest, I now came to the two obvious contenders: PDF and PDF Reader.

These two projects had key similarities:

  • Both had nearly 3,000 installs, far more than the alternatives (except Shadowbox).

  • Both used the same external JavaScript library, pdf.js.

What about differences?

PDF Reader also had the option for Google Docs integration. In this particular case, I thought my client might like that, so I liked having the option.

Meanwhile, PDF was marked as Seeking co-maintainer(s). That could be a sign that the developer would soon abandon the project, but the most recent commit was a week ago, so at least the developer was still active.

PDF Reader, by contrast, was marked as Actively maintained, but the most recent commit was a year ago.

Without a clear winner, I decided to test them both.
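To show why neither module came out ahead on maintenance, here is a small Python sketch that compares the two signals I looked at: the status label on the project page and the age of the last commit. The function name and the 180-day staleness cutoff are my own assumptions; the labels and commit ages are the rough ones quoted above.

```python
def maintenance_signals(status_label, days_since_commit, stale_after_days=180):
    """Classify a project by whether its label and its commit history agree.

    stale_after_days is an arbitrary cutoff chosen for this sketch.
    """
    label_ok = status_label.lower().startswith("actively maintained")
    commits_ok = days_since_commit <= stale_after_days
    if label_ok and commits_ok:
        return "healthy"
    if not label_ok and not commits_ok:
        return "stale"
    return "mixed"


# Roughly what the two project pages showed.
print("PDF:", maintenance_signals("Seeking co-maintainer(s)", 7))
print("PDF Reader:", maintenance_signals("Actively maintained", 365))
```

Both projects land in the "mixed" bucket: each one looks good on one signal and doubtful on the other, so testing was the only way to break the tie.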

Testing the Contenders

I tested both modules on a copy of my live site. (No matter how solid and innocuous a module appears, never try it first on a live site. You could break your whole site.)

I was biased toward PDF Reader, because it seemed to have more options (such as Google Docs) than PDF. So I decided to try PDF first, to get it out of the way.

PDF Fail: Compilation Required?

However, when I installed PDF and read README.txt, I discovered a problem that I had seen but ignored on the project page. For some reason, this module seems to require that you compile pdf.js manually. Although the project page suggested that this wasn't necessarily required, README.txt suggested it was.

Since PDF Reader would use the exact same library without requiring this step, I decided to try it first after all. If it didn't work, I could always go back to PDF and try to manually compile pdf.js.

PDF Reader: Success! Sort of.

So, at long last, I tried PDF Reader. This module provides a new widget for displaying a File field. You add a file field to your desired content type and set the widget type to PDF Reader. Then, you create a node of this type and upload your PDF. The PDF appears embedded in a "box" on the page.

You can try different display options by editing the content type again and changing the display settings for the field.

I found that each display option had pros and cons:

  • The Google Docs reader worked fine as an embed, but when I clicked it to go full-screen, I wound up on a Google Docs page which apologized that my rate limit had been exceeded. Oops. Perhaps this would be more reliable if I hooked the module to a paying Google Apps account, but I didn't bother to find out, as I was pretty sure my client wouldn't like the display.

  • The pdf.js option worked wonderfully ... on Firefox and Chrome. But when I fired up Internet Explorer, the box appeared empty. Apparently, this is a problem with pdf.js itself, not the PDF Reader module. I suppose I should have expected this, given that pdf.js is developed by Mozilla and Internet Explorer is ... itself. Still, I was disappointed that I hadn't thought to confirm that pdf.js worked reliably across all browsers in the first place.

  • The embed option was the most reliable. This actually ran Adobe Reader in a box on the web page. My Firefox still preferred to run pdf.js, but I think this was a browser setting. Either way, as long as a visitor had either Firefox or a PDF viewer like Adobe Reader, the PDF would display.
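My test results for the two self-hosted options boil down to a small lookup, sketched here in Python. This only summarizes what I observed; it is not how PDF Reader or pdf.js decide anything internally, and the function and browser names are hypothetical. I left the Google Docs option out because its failure was a rate limit, not a browser issue.

```python
from typing import List

# Browsers where I saw the pdf.js display option render the PDF.
PDFJS_BROWSERS = {"firefox", "chrome"}


def working_options(browser: str, has_pdf_plugin: bool) -> List[str]:
    """List the display options that showed the PDF in my tests."""
    browser = browser.lower()
    options = []
    if browser in PDFJS_BROWSERS:
        options.append("pdf.js")
    # Embed worked with a PDF viewer such as Adobe Reader installed;
    # my Firefox displayed it even without one.
    if has_pdf_plugin or browser == "firefox":
        options.append("embed")
    return options


for browser, plugin in [("Firefox", False), ("Chrome", True),
                        ("Internet Explorer", True),
                        ("Internet Explorer", False)]:
    print(browser, plugin, working_options(browser, plugin))
```

Embed is the only option that covers Internet Explorer, and only when the visitor has a PDF viewer installed.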

Thus, in the end, my solution was to use the PDF Reader with the Embed display option. This option would allow me to attach a PDF to a Drupal node, and reliably display it on a Drupal web page.

Unfortunately, sometimes "reliable" isn't enough. After all this searching, I had to consider a third-party service after all.