Page 51 - LoudOffice_Guide-to-HTML_Part-II_Advanced.PDF
Search Engine ROBOTs
Most pages become listed when a search engine “robot” (also called a “spider”) visits a
site and indexes its content into the engine’s database. You can control this, since some of
you probably do not want every page to be indexed. The robots META
attribute was designed with this problem in mind. The format used is:
<meta name="robots" content="all | none | index | noindex | follow | nofollow">
The default for the robots attribute is "all", which allows the page to be indexed and its
links to be followed. "none" tells the spider not to index the page and not to follow the
hyperlinks on the page to other pages. "index" indicates that this page may be indexed by
the spider, while "follow" means that the spider is free to follow the links from this page
to other pages. The inverse is also true:
<meta name="robots" content="noindex">
This META tag tells the spider not to index this page, but allows it to follow the page’s
links and index those pages. "nofollow" allows the page itself to be indexed, but tells the
spider not to follow its links. As you can see, the robots attribute can be
very useful for Web developers. For more information about the robots attribute, visit the
W3C’s robot paper.
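As a rough illustration of where the tag goes, the sketch below places a robots META tag in the HEAD section of a page. The page title and the file name published.html are made up for the example. It also assumes the robot accepts several values separated by commas, which is the commonly documented form. Keep in mind that robots honor this tag voluntarily, so it is a request rather than a lock.

```html
<!DOCTYPE html>
<html>
<head>
  <title>Internal Draft Page</title> <!-- hypothetical page title -->
  <!-- Ask robots not to index this page, but let them follow its links.
       Comma-separated values are assumed to be supported by the robot. -->
  <meta name="robots" content="noindex, follow">
</head>
<body>
  <!-- published.html is a placeholder file name for illustration only -->
  <p>Draft content with <a href="published.html">a link to a published page</a>.</p>
</body>
</html>
```

Under the same assumption, content="index, nofollow" would ask for the opposite behavior: index this page, but do not follow its links.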
How to Get Listed
There is No Such Thing as Magic!
There is no panacea that will improve whether and how your site appears in search engines.
Learning the art of search engine placement takes time, patience, and experimentation.
Below are a few points to help you improve your ability to get your site listed:
1. Study the different types of search engines – Like so many other elements of
web design, search engines come in many forms and serve different functions. Some
sites (such as Excite) use “robots”, programs that browse the web from site to site,
reading the HTML code of a page and indexing it to an online database. Others are
directory-structured, often with actual humans adding sites to online
databases. Examples of these include Yahoo! and the Open Directory Project
(dmoz.org). Each looks at your site differently – robots can only attempt to analyze
your HTML code, while human-edited directories look at the page itself.
2. Submit Your Site – Visit the different search engines you wish to get your site listed
with and find out how to submit your site. Robot sites normally ask for some basic
information and then let their robots visit your site, while others require you to enter
your information, a description, keywords, etc. Read their instructions VERY
CAREFULLY, and don’t try submitting your site until it is as close to 100% ready as
possible. You can also find services to do so for a fee, and some are very credible,
but no matter what they say, there are no guarantees.
3. Make Sure Your Description and Keywords are Relevant – If your site is about
the gestation period of intestinal flu variants, then you would not be well served by using
“mickey mouse” as a keyword or in your description. The temptation is to stuff your
keywords and descriptions with “popular” search terms. Even though millions of
people look for information on Disney, while probably only a few dozen look up
information on flu variants, you are better off having relevant, rather than spam,
keywords. The reason for this is that a search robot may see “mickey mouse” in