Internet Marketing 3.0
Taking you to The Next Level

Posts Tagged ‘Indexing’

How your firewall can destroy your search engine rankings

Monday, October 20th, 2008

Some webmasters recently found their web sites delisted from the search engines for no apparent reason. They had not done anything wrong, and their web sites were optimized for search engines. Nevertheless, the sites had been removed from the search engines.

Poorly configured firewalls can block search engine spiders

It turned out that the delisted web sites were all hosted by the same hosting company, which used firewall software from SonicWALL Inc.

That firewall stopped the search engine spiders from accessing the web sites. Google, Yahoo, MSN and all other search engines that request the robots.txt file could no longer index the sites because the firewall blocked that request, on the following grounds:

“An attacker could retrieve robots.txt from the server, then use the contents of this file to discover the path of an unprotected administration interface for the server. The attacker may gain control of the webserver using this interface.

The information gathered from robots.txt could be used for system compromise and control of the web server.” (source)

These are the default security settings of the SonicWALL firewall, which means that your web site won’t be spidered by search engines if you use this firewall without customizing it.

A firewall with these settings will drop the connection to anyone requesting the robots.txt file so that it looks as if the web site is offline. From an SEO point of view, this is very bad for your web site because all good search engine spiders request the robots.txt file before indexing your web site.

What does this mean for you?

If your web site is not listed on search engines although it has many good incoming links and optimized web page content, you should ask your web host if their firewall blocks search engines that request the robots.txt file. Your web host might not be aware of the problem.
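Before contacting your web host, you can run a quick check yourself. The following is a minimal Python sketch, not a definitive diagnostic: it requests the robots.txt file and reports whether the server answered at all. The function name `check_robots_txt` and its `opener` parameter are made up for this illustration. A "no response" result is consistent with a firewall dropping the connection, but it can also have other causes, such as the server really being offline.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def check_robots_txt(site_url, opener=urllib.request.urlopen, timeout=10):
    """Describe what happens when robots.txt is requested from site_url.

    `opener` is a hypothetical hook so the function can be tried out
    without network access; by default it is urllib's own urlopen.
    """
    robots_url = urljoin(site_url, "/robots.txt")
    try:
        with opener(robots_url, timeout=timeout) as response:
            return f"reachable (HTTP {response.status})"
    except urllib.error.HTTPError as err:
        # The server answered, so the connection was not dropped.
        # A 404 only means that there is no robots.txt file.
        return f"server answered with HTTP {err.code}"
    except OSError as err:
        # No answer at all (reset, timeout, refused connection).
        # This is how a dropped connection looks from the outside.
        return f"no response ({err})"
```

To try it, call `check_robots_txt("http://www.example.com")` with your own address in place of the placeholder domain. If the home page loads in a browser but this check reports no response, the firewall question is worth raising with your host.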

Difficulties Faced by Google’s Indexing Robot

Friday, September 5th, 2008

Many webmasters don’t get high rankings on Google and other search engines simply because the indexing robot has difficulty indexing their web pages. Search engine robots are very simple software programs. If an indexing robot cannot find the content of your website immediately, it will skip your site and go to the next link in the list. For that reason, it is very important to make sure that search engine robots can index your web pages without problems.

Here are the top 5 elements that drive search engine robots away:

Element 1: Your robots.txt file is damaged or it contains a typo

If search engine robots misinterpret your robots.txt file, they might completely ignore your web pages.

Double check your robots.txt file and make sure that you use the disallow parameter only for web pages that you really don’t want to have indexed.
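One way to double check is to run your rules through a parser and ask it which URLs a robot may fetch. The sketch below uses the robots.txt parser from Python's standard library. The helper name `allowed`, the sample rules and the example.com URLs are illustrative assumptions, and real search engine robots may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Intended rule: keep robots out of the /private/ directory only.
INTENDED = """User-agent: *
Disallow: /private/
"""

# Typo: the directory name is missing, so the rule covers the whole site.
TYPO = """User-agent: *
Disallow: /
"""

PAGE = "http://www.example.com/products.html"
print(allowed(INTENDED, PAGE))  # the products page stays crawlable
print(allowed(TYPO, PAGE))      # the products page is blocked
```

A single missing directory name is enough to turn a narrow rule into one that shuts robots out of every page.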

Element 2: Your URLs contain too many variables

URLs with many variables can cause problems with search engine robots. If your URLs contain too many variables, search engine robots might ignore your pages.

Here’s Google’s official statement about web pages with many variables:

“Google indexes dynamically generated webpages, including .asp pages, .php pages, and pages with question marks in their URLs. However, these pages can cause problems for our crawler and may be ignored.”
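To find the URLs on your site that carry the most variables, you can count the query parameters in each one. This is a rough Python sketch. The function names and the threshold of two parameters are assumptions made for illustration; Google's statement above does not give a specific limit.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed threshold for this example only, not a published limit.
MAX_PARAMS = 2


def count_query_params(url):
    """Count the variables in the query string of a URL."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True))


def has_many_params(url, limit=MAX_PARAMS):
    """Return True if the URL has more variables than the limit."""
    return count_query_params(url) > limit


url = "http://www.example.com/item.php?cat=3&id=17&sort=asc&page=2"
print(count_query_params(url))  # 4
print(has_many_params(url))     # True
```

URLs flagged this way are candidates for rewriting into shorter, static-looking addresses.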

Element 3: You use session IDs in your URLs

Many search engines don’t index URLs that contain session IDs because they can lead to duplicate content problems: the same page appears under a different URL for every visitor. If possible, avoid session IDs in your URLs and use cookies to store them instead.
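The duplicate content problem is easy to see once the session ID is removed: URLs that looked different turn out to point to the same page. The Python sketch below strips session parameters from the query string. The parameter names in `SESSION_PARAMS` are assumptions (PHPSESSID is PHP's default name; your application may use another), and session IDs embedded in the path rather than the query string are not handled.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names; adjust to whatever your application uses.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid"}


def strip_session_id(url):
    """Return the URL without session ID variables in its query string."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


visitor_a = "http://www.example.com/cart.php?item=42&PHPSESSID=abc123"
visitor_b = "http://www.example.com/cart.php?item=42&PHPSESSID=xyz789"

# Both visitors see the same page under two different URLs.
print(strip_session_id(visitor_a) == strip_session_id(visitor_b))  # True
```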

Element 4: Your web pages contain too much code

Of course, your web pages can contain JavaScript code, CSS code and other script code that is not directly related to your content. However, too much of it can bury the content. Visit your website with a web browser and select “View source” or “View HTML source”. If it is difficult for you to spot the actual content of your website, then search engines might also have difficulty parsing your pages.
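If you prefer a number to eyeballing the source, you can estimate how much of a page is readable text. This Python sketch counts the characters of text outside script and style blocks and divides by the total size of the HTML. The class and function names are made up for this example, and the ratio is only a rough indicator; no search engine is known to publish a threshold for it.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Count text characters that are not inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.text_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def content_ratio(html):
    """Return the share of the HTML source that is readable text (0 to 1)."""
    if not html:
        return 0.0
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text_chars / len(html)
```

Pass the HTML source of a page to `content_ratio` and compare the result across your pages. A page where scripts and styles dominate scores close to 0; moving that code into external files raises the score.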

Element 5: Your website navigation causes problems

Fancy JavaScript or DHTML menus cannot be parsed by most search engine robots. Flash or AJAX menus are even worse when it comes to website navigation.
As mentioned above, search engine robots are very simple programs. They can follow HTML links; all other links can cause problems.
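To see your navigation the way such a simple robot might, you can extract only the plain HTML links from a page. The Python sketch below is an illustration of that idea, not a model of any particular search engine's crawler. The class name, the function name and the sample menu are assumptions.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href values of <a> tags, skipping javascript: pseudo-links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and not href.lower().startswith("javascript:"):
            self.links.append(href)


def crawlable_links(html):
    """Return the links a robot that only follows HTML links would find."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


MENU = """
<a href="/products.html">Products</a>
<a href="javascript:openMenu('services')">Services</a>
<span onclick="location.href='/contact.html'">Contact</span>
"""

print(crawlable_links(MENU))  # ['/products.html']
```

Of the three menu entries, only the plain HTML link is found. The pages behind the other two would need ordinary links elsewhere, for example in a footer or a site map, to be discovered.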

Optimized web page content and good inbound links are crucial for high search engine rankings. However, the best content and the best links won’t help you much if search engines cannot index your pages.

506 B Navbharat Estates
Zakaria Bunder Road
Sewri (W)
Mumbai - 400 015
India

Tel.: +91 22 2411 2836
        +91 22 3253 3724

Fax.: +91 22 2413 6007

Copyright© 2004 Convonix™ Inc. - The Search Engine Optimization Firm (The SEO Firm in India). All rights reserved.