We all know the importance of having an XML sitemap on our website. It is equally important to keep crawl errors to a minimum so that Googlebot can crawl your website easily. Needless to say, crawl errors are one of the clearest examples of a bad user experience.
Most of us use various third-party tools to generate XML sitemaps, while Google Webmaster Tools is the usual choice for checking crawl errors. What if we could get both of these services in one tool?
Jim Boykin has launched a free online tool to find broken links, check redirects and generate an XML sitemap. And that is not all! The tool provides a lot more information that we can use while optimizing our website.
How to go about using the tool
Being free does not mean the tool gives us any less data. It can crawl as many pages as you have on your website and allows 5 runs per day, per user. However, it can be used for larger websites (more than 1,000 pages) only if you own them or have access to optimize them. Such websites can be crawled only if we add the following code within the website's robots.txt file:
I carried out a test for a relatively small website and got a lot of insightful data.
Enter the URL of your website's home page and, optionally, your email address to receive an email notifying you of the completed crawl, with the generated reports attached.
Once we click on Ninja Check, our website is crawled in real time.
As soon as the crawl is complete, we receive a "Finished Crawling" message and a lot of data to work with!
The tool provides us with 6 tables along with an option to generate an XML sitemap:
- Internal links
- External links
- Internal errors (a subset of Internal Links)
- Internal redirects (another subset of Internal Links)
- External errors (a subset of External Links)
- External redirects (another subset of External Links)
Let us delve deeper into each type and look at what data we get and how we can use it in our day-to-day SEO.
The data fields in the Internal links report are:
URLs crawled on the site:
All the URLs the tool has captured. The XML sitemap is made up of all these URLs.
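To make the link between the crawled URLs and the sitemap concrete, here is a minimal sketch of how a list of URLs could be turned into a sitemap file. This is not the tool's own code; the `build_sitemap` function and the example URLs are assumptions made for illustration. Only the `urlset`, `url` and `loc` elements and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing each URL once, in first-seen order.

    Hypothetical helper: it only illustrates the idea of building a
    sitemap from a list of crawled URLs.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in urls:
        if url in seen:
            continue  # a crawl can reach the same URL more than once
        seen.add(url)
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Made-up crawl result used only as sample input.
crawled = [
    "https://www.example.com/",
    "https://www.example.com/about",
    "https://www.example.com/",
]
print(build_sitemap(crawled))
```

A real sitemap would usually be restricted to URLs that return 200 OK, so in practice we would filter the crawl results before passing them in.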
Link to The On Page Optimization SEO Tool for that URL
This link gives us access to another tool from Internet Marketing Ninjas that helps you analyze a single page for its on-page elements, providing data like:
- Title Tag
- Meta Description
- Meta Keywords
- Total Words on Page
- Words that are links on the page
- Words that are Not Links on the page
- Number of External Links
- Total Distinct Words
- Number of links on the page (internal and external)
We can even specify 5 keywords for which we would like to analyze the web page.
This can be an excellent tool for understanding which keywords have already been targeted on the page and for building strategies to target the rest of the mapped keywords.
URL's level from the domain root
The level at which the URL sits relative to the root domain. This can help us decide which URLs should be given more importance and how we can build a theme for our website by linking pages internally. If we are analyzing a competitor's website, it also shows what architecture they have followed.
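As a rough illustration of what a "level" can mean, the sketch below counts the path segments of a URL. This is an assumption: the tool may instead measure the number of clicks from the home page, and the `url_level` name is made up for this example.

```python
from urllib.parse import urlparse


def url_level(url):
    """Count the non-empty path segments of a URL; the home page is level 0.

    Assumption: "level" is approximated by path depth, which may differ
    from how the tool itself calculates it.
    """
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


# Example URLs are placeholders.
for page in [
    "https://www.example.com/",
    "https://www.example.com/blog/",
    "https://www.example.com/blog/seo/tools.html",
]:
    print(url_level(page), page)
```

Pages with a high level are good candidates for extra internal links from pages closer to the root.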
URL's returned HTTP status code
This field gives us the status code of the URL (200 OK, 400, 500 etc.). This is directly actionable data that we can use to reduce the crawl errors on our website. The same data is also broken out separately in the following two reports:
- Internal errors (in case of errors like 4xx, 5xx)
- Internal redirects (in case of redirects like 301, 302)
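The way status codes map onto the two subset reports can be sketched in a few lines. The function names and the sample data below are hypothetical; the grouping simply follows the standard HTTP classes (3xx for redirects, 4xx and 5xx for errors).

```python
def classify_status(code):
    """Return 'redirect' for 3xx, 'error' for 4xx/5xx, and 'ok' otherwise."""
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 600:
        return "error"
    return "ok"


def split_reports(crawl_results):
    """Split (url, status) pairs into the errors and redirects subsets."""
    errors = [(url, code) for url, code in crawl_results
              if classify_status(code) == "error"]
    redirects = [(url, code) for url, code in crawl_results
                 if classify_status(code) == "redirect"]
    return errors, redirects


# Made-up crawl data for illustration.
results = [
    ("https://www.example.com/", 200),
    ("https://www.example.com/old-page", 301),
    ("https://www.example.com/missing", 404),
]
errors, redirects = split_reports(results)
print("Errors:", errors)
print("Redirects:", redirects)
```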
Number of internal links the URL has within the site
This field gives the number of pages within the website that point towards the URL. After clicking on this number, you can also get the list of pages linking to the mentioned URL. This data can be very useful for pages that do not rank in spite of having optimized them in the best possible way. We can then plan accordingly on what pages need to be linked internally.
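If we export the link data, the same count is easy to reproduce ourselves. The sketch below assumes the links are available as (source page, target page) pairs; that layout and the `inbound_links` name are assumptions, not the tool's export format.

```python
from collections import defaultdict


def inbound_links(link_pairs):
    """Map each target URL to the sorted list of distinct pages linking to it."""
    sources = defaultdict(set)
    for source, target in link_pairs:
        sources[target].add(source)
    return {target: sorted(pages) for target, pages in sources.items()}


# Hypothetical internal links: (page containing the link, page linked to).
links = [
    ("/", "/services"),
    ("/blog", "/services"),
    ("/", "/contact"),
    ("/blog", "/services"),  # repeated link from the same page counts once
]
for target, pages in inbound_links(links).items():
    print(target, len(pages), pages)
```

Pages that show up with very few linking pages are the first ones to consider when planning new internal links.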
Link text used for the URL
This is the anchor text used while linking to the specified URL. Linking to pages with a keyword in the anchor text is ideally not recommended in the post-Penguin world. Anchor texts like the brand name, "click here" etc. are more natural. We can plan our internal linking strategy just by glancing at the data provided here.
Number of internal links on the page
This is the number of internal links present on the specified URL. Here too, after clicking on this number, you can get the list of pages that are linked to from the URL.
Number of external links on the page
This is the number of external links present on the specified URL. Here too, after clicking on this number, you can get the list of pages that are linked to from the URL. This data can help us decide how many external links should be allowed and how many of them should be follow/nofollow.
Size of the page
This field provides the page size in kilobytes. On clicking the page size number, we get access to a page speed tool where the page speed can be tested for a particular URL.
- This page speed tool also provides us with an option to compare the page speed of two web pages simultaneously.
Link to the Check Image Sizes, Alt Text, Header Checks and More
This option gives us access to the image checker tool by Internet Marketing Ninjas, which also provides a lot of data related to the images on a particular page. For my website I got a summary of the image report as below.
- This data helped me replace the redirected URLs with the actual URLs, replace the broken links and implement Alt tags for the 10 images.
The title tag text from the URL's page
The title tag text field provides us with the title of the page. This data can be useful in competitor analysis for digging into what keywords our competitors are targeting.
The description tag text from the URL's page
This provides us with the Meta Description on the page.
The keywords tag text from the URL's page
This provides us with the Meta Keywords on the page.
Contents, if used, of the anchor tag's "rel=" attribute
This tells us whether anchor tags on the page use the "rel" attribute and, if so, what it contains.
A couple of times I have seen the on-page elements of a website get reverted automatically because of code-level changes. The title, description and keywords fields here can help us audit our own website and check regularly that the optimized tags are still in place.
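For those who want to run such a check on their own between crawls, here is a minimal sketch using Python's standard `html.parser` module. It pulls out the title, the meta description, the meta keywords and any "rel" values on anchor tags from a page's HTML. The `TagAudit` class, the `audit_page` function and the sample HTML are all made up for this example and are not how the tool works internally.

```python
from html.parser import HTMLParser


class TagAudit(HTMLParser):
    """Collect the title, meta description/keywords and anchor rel values."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.rel_values = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name in ("description", "keywords"):
            self.meta[name] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("rel"):
            self.rel_values.append(attrs["rel"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_page(html):
    """Return the on-page elements found in an HTML string."""
    parser = TagAudit()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "description": parser.meta.get("description", ""),
        "keywords": parser.meta.get("keywords", ""),
        "rel": parser.rel_values,
    }


# Sample page used only for illustration.
sample = (
    "<html><head><title>Home</title>"
    '<meta name="description" content="Welcome to our site">'
    "</head><body>"
    '<a href="https://partner.example.com" rel="nofollow">Partner</a>'
    "</body></html>"
)
print(audit_page(sample))
```

Comparing the output against a saved copy of the tags we optimized is a simple way to spot a page whose elements have been reverted.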
The External links table lists the links on our website that point to some other domain. It includes the following data fields:
URL's returned HTTP status code
This field provides us with the status code of the external page that our website links to. This can help us monitor the pages we are linking to. The same data is also broken out separately in the following two reports:
- External errors (in case of errors like 4xx, 5xx)
- External redirects (in case of redirects like 301, 302)
Number of times that URL is linked to from within the site
This field provides us with the number of times the external page has been linked from our domain. After clicking on this number you can get a list of pages that link to the external domain. In case multiple pages are linking to a 404 page, immediate action can be taken.
External URL used in the link
This is the URL to which our domain has linked.
Link text used for the URL
The anchor text used in the link.
Internal page URL on which the link was first found
This field provides us with the page of our website on which the link was first found by the tool’s crawler.
The Internal errors table is a subset of the Internal links table. It provides us with a list of URLs within the website that return an error response like 404. This data can directly help us implement appropriate 301 redirects. The fields in this table are:
This field gives the type of error the page is rendering.
Number of Pages that Link
This field gives the number of pages that link to the error page. After clicking on this number, we can also get the list of pages that link to the error page. This can help us replace the error link on these pages with correct links.
This field gives us the URL of the error page.
The anchor text used while linking to the error pages.
First found on
This field provides us with the page of our website on which the error/broken link was first found by the tool's crawler.
The External errors table is a subset of the External links table. It provides a list of external URLs linked from our domain that give an error response like 404. This data can directly help us replace or remove such links from our website. The fields in this table are similar to the fields in the Internal errors table.
The Internal redirects table is another subset of the Internal links table. It provides a list of URLs within the website that have been redirected to some other URL. This data can help us replace the redirected links with the correct links.
Apart from the fields that we get in the internal/external errors tables, this table also provides us with the URL to which the 3xx page is redirected.
The External redirects table is another subset of the External links table. It provides us with a list of external URLs (pointing to some other domain) linked from the website that have been redirected to some other URL. This data can help us replace the redirected links with the correct links.
The fields in this table are similar to the fields in the Internal redirects table.
There are other tools on the market that provide similar data. But this tool is not only free but also easy to use and helpful. The big chunk of useful data, and its division into subsets for easy analysis, is the highlight of this tool for me.
Definitely worth a try!