Site optimization and using Google Webmaster Tools

February 20, 2013 by christians in General

Optimizing Your Website

A webmaster’s primary responsibility is to make a website easily discoverable by people who search the web. To accomplish this, it is essential to ensure that the site is being indexed by major search engines. This post discusses how using Google Webmaster Tools can increase a website’s visibility in Google search results.

Google Webmaster Tools is a suite of tools that helps webmasters understand what Google sees when it crawls a website and how to improve the visibility of the website’s content. The first step in using Google Webmaster Tools is to verify that you have sufficient rights to the site you want to manage. Google offers a variety of methods to accomplish this verification. After you log in, you will see an option to add a site (shown in the following figure).

Google Webmaster Tools - Add and Verify a Site

After you click the option to add a site, you will be prompted to enter the site address and then verify ownership of the site via one of several available options. The recommended option (shown in the following figure) involves downloading a file that Google creates and saving that file to the root of the website. For sites that are hosted externally, connect to the server where the site files are installed, locate the root directory, and copy the downloaded Google file to that directory. Typically, this is the same directory in which the index.html file is found.

After you successfully copy the file, return to the Google Webmaster Tools page and click the address shown in step 3 of the instructions in the following figure. If you installed the file in the correct place, your browser will display a page at that address with the Google file name you copied shown in the upper left corner. This indicates that Google was able to find the file on your server, which verifies that you have appropriate rights to the site. After this verification process, you will have access to the Google Webmaster Tools suite.

Google Webmaster Tools Recommended Verification Method
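
Before you click the verification link, you can confirm on your own that the file is reachable from the web. The following is a minimal Python sketch of that check; the domain and the verification file name are placeholders, since Google generates a unique file name for each site.

    import urllib.error
    import urllib.request

    # Placeholders: substitute your own domain and the verification file name
    # that Google generated for your site.
    SITE = "http://www.example.com"
    VERIFICATION_FILE = "google1234567890abcdef.html"

    url = SITE + "/" + VERIFICATION_FILE
    try:
        with urllib.request.urlopen(url) as response:
            print("HTTP", response.status, "for", url)
            # Google's file contains a short verification string; show it as a sanity check.
            print(response.read().decode("utf-8", errors="replace").strip())
    except urllib.error.HTTPError as error:
        print("HTTP", error.code, "for", url, "- Google will not find the file there")

If the script prints a 200 status and the verification string, Google’s check should succeed; a 404 means the file is not in the directory Google expects.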

I have described only the recommended option; Google provides clear instructions for the alternative methods (shown in the following figure) if you choose a different course of verification.

Google Webmaster Tools - Alternative Verification Method

Google Webmaster Tools serves one main purpose: to help webmasters optimize the websites they manage. This optimization is good for both Google and the webmaster, because Google is in the business of delivering relevant search results and webmasters want to make their websites as visible as possible.

Google gathers information from websites by crawling them (perusing the website’s pages and indexing their data), and it performs this function in two ways: the deep crawl and the fresh crawl. Google’s deep crawl is an in-depth examination of a website’s pages, their content, links, and metadata. This examination provides the information Google needs to rank the website according to the characteristics Google has determined make a useful website.

It’s worth noting that Google’s crawl functions are intelligent enough to recognize tricky search engine optimization (SEO) techniques. One such example is known as Google bombing. In the early days of web crawling, websites that contained a high number of usable links would receive high page rankings. When Google realized that webmasters were intentionally placing numerous links on their pages to gain higher rankings, it began to penalize websites for the practice. Using SEO techniques that are less than genuine or deceptive to the crawler can result in lowered page rankings and cause a website to be buried so deep in search results that it becomes virtually invisible.

Google’s fresh crawl is more of a cursory look at a website’s basic content. The fresh crawl assumes that the basic structure of the website has remained the same since the last deep crawl, but that there might be newer content that needs updating. It does not dive deep into the website’s directories or links; it operates on a superficial level, taking note of simple changes to primary page content rather than deeper content and structure.

The frequency with which any given website is crawled varies. For webmasters who are interested in obtaining quantifiable data about specific changes to their website, the ambiguity of crawl frequency can be problematic. Google Webmaster Tools shows the crawl history of a website, which provides a general schedule from which you can determine roughly when the next crawl will take place. This information makes it possible to correlate website changes with the data provided by Google Analytics, which lets you quantify both the positive and negative consequences of changes to the site.

For those who have the ability to run queries against their web server logs, there is good news. Here at Wadeware, we verify the last crawl from the command line with LogParser 2.0. Using this method, we can run SQL queries against the web server’s log files to find each time the robots.txt file was requested. Because the robots.txt file is requested by most indexing spiders (programs designed to find and index data on websites across the internet), and not just Google’s spiders, the result is a .csv table full of data about who crawled the website and when. The time spent on each request, whether there was an error and what kind, the date, and other specifics are available to anyone with this capability.
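
We run that query with LogParser against IIS logs, but the same idea can be sketched for other setups. The snippet below is only an illustrative substitute for our LogParser query, not the query itself: it assumes an Apache- or NGINX-style combined-format access log at a hypothetical path and writes every robots.txt request to a .csv file.

    import csv
    import re

    # Hypothetical paths; point LOG_FILE at your own access log.
    LOG_FILE = "access.log"
    OUTPUT_CSV = "robots_requests.csv"

    # Rough pattern for the Apache/NGINX "combined" log format.
    LINE_RE = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
        r'"[^"]*" "(?P<agent>[^"]*)"'
    )

    with open(LOG_FILE) as log, open(OUTPUT_CSV, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["timestamp", "ip", "status", "user_agent"])
        for line in log:
            match = LINE_RE.match(line)
            if match and match.group("path") == "/robots.txt":
                writer.writerow([match.group("timestamp"), match.group("ip"),
                                 match.group("status"), match.group("agent")])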

Some of this data, including crawl dates, is also available in the Google Webmaster Tools interface, and Google does a great job of making it easy to view; however, we find that the previously described method provides more useful information. The main benefit is that a query can filter on more parameters and return information about every source that requests the robots.txt file. This method can be useful for webmasters because it provides the information needed to limit the spiders that crawl their sites. For example, if company A sells shoes online but only ships to the United States, there is no reason for a Chinese spider to be crawling its website. Examining who or what is accessing the robots.txt file yields a comprehensive list of all the spiders crawling the site, including the ones that should be banned.
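
As a rough follow-on to the sketch above, the snippet below tallies the user agents recorded in that .csv so you can see which spiders request robots.txt most often; any crawler you decide to exclude can then be listed in a robots.txt Disallow rule (which well-behaved spiders honor, though nothing forces them to).

    import csv
    from collections import Counter

    # Reads the .csv produced by the previous sketch (hypothetical file name).
    counts = Counter()
    with open("robots_requests.csv", newline="") as f:
        for row in csv.DictReader(f):
            counts[row["user_agent"]] += 1

    # Most frequent requesters of robots.txt first.
    for agent, hits in counts.most_common():
        print(f"{hits:6d}  {agent}")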

The Fetch as Google feature also proves to be instrumental in monitoring how changes to a website affect visibility in searches. For those concerned with getting new content indexed quickly, this tool serves as a crawl request to Google; however, one should not expect this request to result in a deep crawl. The Fetch as Google tool returns the first 100KB of data on a designated page. In addition, it allows you to see what Google sees when your page is crawled. For example, if there is content on your website that is not crawlable, using the Fetch as Google tool can reveal that problem and allow a webmaster to supply the content in a format that is crawlable. The Fetch as Google tool is a powerful way to receive feedback about your website content and its crawlability.
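
Fetch as Google itself runs only from the Webmaster Tools interface, but if you want a quick local look at the raw HTML a crawler is served, a fetch with a Googlebot-style user agent gives a rough approximation. The sketch below works under that assumption only; it does not reproduce what the tool actually does, and the URL is a placeholder.

    import urllib.request

    # Placeholder URL; Googlebot's real behavior differs, so treat this only as a
    # quick look at the raw HTML a crawler is served (no JavaScript execution).
    URL = "http://www.example.com/"
    GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

    request = urllib.request.Request(URL, headers={"User-Agent": GOOGLEBOT_UA})
    with urllib.request.urlopen(request) as response:
        # Fetch as Google reports roughly the first 100KB of a page, so mimic that cutoff.
        first_100kb = response.read(100 * 1024)
    print(first_100kb.decode("utf-8", errors="replace"))

If content you care about does not appear in that raw HTML, that is a hint it may not be crawlable in its current form.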

Google’s Index Status link on the left of the Google Webmaster Tools interface provides a graph of the total indexed material on the website. Uploading an accurate sitemap to Google Webmaster Tools significantly helps the Google spiders navigate the crawl process; you can upload one by clicking the Optimization link and then the Sitemaps link on the left menu. The more data the spiders are able to crawl, the more will be indexed, and at Wadeware our experience has been that the more data that is successfully indexed, the more impressions the website receives in Google searches. You can easily find a sitemap generator online with a simple Google search, or put one together yourself, as sketched below.
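
The sitemap format is simple enough to produce with a short script. The sketch below writes a minimal sitemap.xml from a hard-coded list of placeholder URLs; a real generator would walk the site or query its content database instead.

    from xml.sax.saxutils import escape

    # Placeholder URLs; a real generator would crawl the site or query its CMS.
    urls = [
        "http://www.example.com/",
        "http://www.example.com/about.html",
        "http://www.example.com/contact.html",
    ]

    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    for url in urls:
        lines.append("  <url><loc>" + escape(url) + "</loc></url>")
    lines.append("</urlset>")

    with open("sitemap.xml", "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

Once the file is in the site root, submit it through the Sitemaps link described above.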

A variety of useful tools are included in the Google Webmaster Tools suite. After webmasters determine that their site is being crawled, additional tools are available to help them understand how their website can be improved and made friendlier to the spiders that crawl it. By using Google Webmaster Tools, even webmasters with limited experience can make vast improvements in their websites’ visibility in search results.