Link Checkers

While preparing a quote for a client to convert their static web site to Drupal, I wanted to make sure that I knew how many pages the site spanned. Normally that's an easy task, but their previous webmaster did little in the way of organization, leaving dozens of old files littering the hosting account.

I thought perhaps that I could use a sitemap creator to crawl the site and give me a list of the pages. That was a good thought, but I soon realized that what would be more useful to me in the long run was a link checker—something that would check all the resources on a site and report what the site used and any missing pieces.

As always, my first stop was I found a number of sitemap tools and link checkers listed there, some of which I was able to bypass right away because they were no longer supported (date of last update was over two years ago) or had many poor reviews.

Since at this point I was still looking for a sitemap creator, I started with RAGE Sitemap Automator ($30). I was already familiar with RAGE products, having considered (and eventually passed on) RAGE Domainer as a tool to keep track of my domains. Sitemap Automator is a good-looking program, Mac-like in appearance, but I quickly decided this wasn't the class of product that I wanted. I really needed something to do link checking.

iGooMap ($25) and WebLight ($33) also generate sitemaps. HTML validation and link checking are built into WebLight, but the app is so non-intuitive and ugly that it simply wasn't an option for me. iGooMap has nothing going for it over Sitemap Automator.

Two tools from the same developer were next to come across my desktop. Integrity (Donation) is a basic link checker, while Scrutiny ($30) adds sitemap, SEO, and HTML validation features. As I began reviewing link checkers, certain features became important to me. Both of these applications let the user set a per-site delay for the scanner, so as not to overload the server, and both can pause a scan and optionally scan for missing images.
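To make the per-site delay idea concrete, here is a minimal Python sketch of how a scanner might space out its requests to one host. It is not how Integrity or Scrutiny work internally; the class name PerSiteThrottle and its methods are hypothetical, invented for illustration.

```python
import time
from urllib.parse import urlsplit


class PerSiteThrottle:
    """Enforce a minimum gap between requests to the same host.

    Hypothetical helper for illustration; not taken from any of the
    reviewed applications.
    """

    def __init__(self, default_delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.default_delay = default_delay
        self.delays = {}        # host -> delay in seconds (the "per-site" setting)
        self.last_request = {}  # host -> time of the previous request
        self.clock = clock      # injectable so the class can be tested offline
        self.sleep = sleep

    def set_delay(self, host, seconds):
        """Override the default delay for one host."""
        self.delays[host.lower()] = seconds

    def wait(self, url):
        """Sleep if the last request to this host was too recent.

        Returns the number of seconds slept.
        """
        host = (urlsplit(url).hostname or "").lower()
        delay = self.delays.get(host, self.default_delay)
        previous = self.last_request.get(host)
        pause = 0.0
        if previous is not None:
            pause = max(0.0, delay - (self.clock() - previous))
            if pause:
                self.sleep(pause)
        self.last_request[host] = self.clock()
        return pause
```

A crawler would call wait(url) immediately before each request, so a slow or connection-limited server sees at most one request per delay interval.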

However, Integrity has some bugs in the area of saving profiles (sites and their settings) and Scrutiny has some odd user interface issues that make me uncomfortable using it. Also, both suffer from a bug that makes the scanner treat URLs with 'www.' in them as a different site than those without the subdomain. The developer has indicated that this will be fixed in a later version.
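For illustration, the 'www.' problem can be sidestepped by comparing hosts only after stripping a leading 'www.'. The Python sketch below shows one way to do that; the function names site_key and is_internal are hypothetical, and it assumes the 'www.' host and the bare domain serve the same site, which is common but not guaranteed. It says nothing about how the developer actually plans to fix the bug.

```python
from urllib.parse import urlsplit


def site_key(url):
    """Return the host of `url`, lowercased, without a leading 'www.'.

    Assumption: 'www.example.com' and 'example.com' are the same site.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host


def is_internal(link, start_url):
    """True if `link` belongs to the same site as the scan's starting URL."""
    return site_key(link) == site_key(start_url)
```

With this comparison, a link to http://example.com/about found on http://www.example.com/ counts as internal, while other subdomains such as blog.example.com are still treated as separate sites.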

Two programs win honorable mention for being particularly ugly and non-Mac-like: LinkByLink ($80) and BLT (Better Link Tester) ($20). LinkByLink also loses points for being a Java application and for shipping with a non-working HTML validator. On the plus side, though, it lists results with content type and modification date, and it checks for some JavaScript and Flash files. BLT also picks up those additional resources and was particularly fast in its scan without taking down my test site (as others did if they didn't have a delay), but being ugly and non-intuitive takes it out of the running.

The final two apps bracket the extremes of pricing. LinkChecker is free, but using it comes with a high cost of complexity. Most of its settings are manipulated with an INI-style configuration file and there are no profiles. Also, I could not complete my survey of the test site because there is no delay between page checks. Two features, though, were present that I wish were in other products: robots.txt enforcement and checking of resources referenced in the HEAD tag.
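Both of those features are easy to picture in code. The following is a rough Python sketch using only the standard library; it is not LinkChecker's implementation, and the names allowed_by_robots and HeadResourceCollector, like the sample robots.txt rule in the note below, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, user_agent, url):
    """Check `url` against the text of a robots.txt file.

    The file content is passed in as a string so no network access is needed.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class HeadResourceCollector(HTMLParser):
    """Collect stylesheet, icon, and script URLs referenced inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.resources = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if not self.in_head:
            return
        attrs = dict(attrs)
        if tag == "link" and attrs.get("href"):
            self.resources.append(attrs["href"])
        elif tag == "script" and attrs.get("src"):
            self.resources.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
```

Given a robots.txt containing "Disallow: /private/" for all user agents, allowed_by_robots would reject any URL under /private/ and accept the rest. Feeding a page to HeadResourceCollector leaves the URLs from its link and script tags in the resources list, ready to be checked like any other link.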

Deep Trawl ($149) surprised me. Normally Java applications are ugly, but Deep Trawl departed so radically from the design of the other apps that it was easy to ignore the quirks that showed it wasn't a native Mac application. And the addition of such features as a built-in HTML validator, FTP client, HTML editor and scheduler (among others) makes the program well worth the price.

In the end, there's no clear winner for me. Only one app was truly Mac-like and it (RAGE Sitemap Automator) isn't really a link checker. Only one checked all resources (LinkChecker), but it suffers from being overly complex.

Of the rest, my faves are Deep Trawl and Scrutiny. Deep Trawl costs more than I want to spend right now; it also failed to complete the site survey because the server (apparently) limits connections from a single client and the app cannot be throttled by setting a delay between checks.

Scrutiny, as noted above, isn't a polished app. And there are features found in other apps I'd like it to have, like robots.txt exclusion and full resource checking. However, its price is more appealing than Deep Trawl's, so I suspect many users will overlook these shortcomings.