Search Engine Optimisation
Techniques, advice, theories and ideas for search engine optimisation
Based on a few contributions by members of the newsgroup alt.internet.search-engines during October 2003 and much modified by me since. We did not get many contributions, so I have added much of my own and made a number of subsequent amendments.
Header area of html web page
The <head> area of each page should have a title, description and keywords.
The "description" should relate only to the visible body content of the page and be up to about 20 words or 120 characters. Write in complete sentences that appeal to the human reader and clearly explain what the page is about.
The <title> should relate to the description and body content and comprise up to about 10 words or 60 characters. Keep anything important within the first 40 characters and make the first word a keyword starting with a capital letter. Make it appeal to the human reader.
Keywords should also relate to the body text, in order of importance. Put in only important words. If the page is about two topics, put in two keywords; if the page is vague, put in more. If you use as many as 10 keywords, the value of each will be less than if you used fewer. See pages about keywords and keywords meta.
Write both title and description so that they appeal to readers who see them listed. Do use words that accurately match the page. Do start the first word with a capital letter. Do use a noun or 'big meaning' word as the first word. Don't use just the company name, unless it is so well known that people search for it. Don't use the same title or description for more than one page. Don't repeat anything so often that it makes bad reading. You can perhaps use a word twice in the title and twice in the description, provided both title and description are of reasonable length. Make it read nicely.
Example:
<meta name="Description" content="Ideas, advice and suggestions for writing a web page optimised for a search engine.">
<meta name="Keywords" content="search, engine, optimisation, advice, seo">
<title>Search Engine Optimisation SEO - Advice and suggestions.</title>
Body area of html web page
The body of the page should contain some valuable information for the viewer to read and learn something from. If the viewer stays a while and reacts favourably then the viewer has perhaps found what they want. This is good. If they back out after 3 seconds that is bad and there is little point in your page being listed by search engines.
Search engines try to analyse what your visitors want and do, in order to distinguish between good and bad pages. Try to make your page so good that people will read it all in detail, bookmark it, tell their friends about it, mention it in newsgroups, link to it, follow links from it, vote for it, write reviews about it etc.
Optimisation for page content quality is widely held to be the soundest approach to search engine optimisation. This means making the page satisfy the searcher's needs.
Within the body make sensible use of <h1>, <h2> etc. headings. Use <h1> for the most important heading.
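A minimal sketch of such a heading structure might look like the following. The heading wording and paragraph text are made up for illustration, and using a single <h1> with <h2> for the sections beneath it is an assumption of this sketch rather than a rule stated above.

```html
<!DOCTYPE html>
<html>
<head>
<title>Search Engine Optimisation SEO - Advice and suggestions.</title>
</head>
<body>
<!-- h1 for the most important heading on the page -->
<h1>Search Engine Optimisation</h1>
<p>Introductory text that tells the reader what the page is about.</p>

<!-- h2 for each main section under the h1 (section names are illustrative) -->
<h2>Header area of the page</h2>
<p>Text about the title, description and keywords.</p>

<h2>Body area of the page</h2>
<p>Text about the visible content.</p>
</body>
</html>
```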
Count all the words used in the visible body and in the title and description text. If the page is about three topics, make sure that these three words stand clear of all the others at the top of the frequency table, e.g.
search 8.7%
engine 7.2%
optimisation 5.13%
next 3.6% (Aim for less than half the one above, if possible)
In some instances a page may contain high percentages of certain spurious words. I have a page, for example, with a giant table that contains an "Updated" column of relatively unimportant and repetitive dates, which can therefore show a predominance of a word like Jan, Feb, March etc. This does not seem to be a problem. The search engines seem to understand what the page is about from the title and the heading, even if an analysis of the text content would give a different impression.
Other factors:
Outgoing links: Remember that the objective of your page is to satisfy the searcher's needs. However, this does not mean that everything the searcher is after has to be found on that specific page. It is quite acceptable to refer the searcher on to other relevant, helpful pages on your site or elsewhere on the internet. When you link out like this, research the pages you are linking to carefully for quality and a pleasant visitor experience. Avoid linking to slow sites with large image file sizes, to Flash pages, to anything with pop-ups etc - whatever you dislike or think your visitors will dislike. Link direct to specific pages that are good - you don't need to link to other web sites' home pages. Try to keep the number of outgoing links low. This helps visitors, as it is not easy for them to choose well amongst too many choices. Never put an active link to a rubbish site - it may cause your site to be downgraded. If there is some really valuable information on a dodgy site, you have a problem if you want to link to it. Google have provided an answer for this: use the rel="nofollow" attribute on the link and you should not be downgraded for linking to a dodgy area.
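As a small illustration of the rel="nofollow" construction, the fragment below shows an ordinary link next to a nofollow link. The addresses and link text are made up (example.com and example.org are reserved example domains); only the rel="nofollow" attribute itself comes from the advice above.

```html
<!-- Ordinary link to a page you are happy to vouch for -->
<a href="http://www.example.com/good-page.htm">A good, relevant page</a>

<!-- Link to useful information on a doubtful site:
     rel="nofollow" asks search engines not to treat the link as a vote -->
<a href="http://www.example.org/useful-page.htm" rel="nofollow">Useful information on a doubtful site</a>
```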
Incoming links - anchor text: If your page is mentioned on other pages on the internet then the other pages are effectively voting for yours. This is a search engine technique tried by Google and made public. As a result, many links have been created to try to improve Google PageRank (PR). Many useful links were created and this has resulted in much better connectivity on the internet, but there are now some extreme examples, as there has been an incentive to create vast numbers of spurious spam links. So don't go overboard: it is not a good idea to pay other sites to put links back to your site or to exchange links in reciprocal link schemes. It may cause you to be downgraded for link-spamming. PR is just one factor of many in search engine results positions, so don't be too worried about its optimisation. See my guess as to how Google PageRank (PR) works.
Writing links within your web pages: When you put links back to your home page use the format http://www.satsig.net/ and do not ever show the filename of the home page (index.html, default.html, index.htm etc) anywhere on your site pages. This avoids having multiple 'home' pages indexed. You can incorporate <base href="http://www.satsig.net/"> in the head area and then use relative links (just the filename, except for links to the home page, which use / ) elsewhere. If you are confused as to whether or not your web site name has www in front, make up your mind and then tell Google which one to use - see Google Webmaster Tools, reached from Google's webmaster guidelines. To help the search engine robots find all of your pages there needs to be at least one link leading to every page. MSN recommends that you should be able to get to any page via three link hops from the home page. That may be possible on some sites, but in my case I guess that 4 or 5 link hops are needed for some of the most obscure pages. Anyway, my idea is to put links to suit the likely visitor - what are they likely to want to see next?
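The base href arrangement described above can be sketched as follows. The page title and the file name example-page.htm are made up for illustration; the base address is the one quoted above.

```html
<!DOCTYPE html>
<html>
<head>
<title>Example page</title>
<!-- Relative links on this page are resolved against this address -->
<base href="http://www.satsig.net/">
</head>
<body>
<!-- Home page: link to / only, never to index.html or similar.
     With the base above this resolves to http://www.satsig.net/ -->
<a href="/">Home</a>

<!-- Other pages: just the filename (example-page.htm is a made-up name).
     This resolves to http://www.satsig.net/example-page.htm -->
<a href="example-page.htm">Example page</a>
</body>
</html>
```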
Page technical quality: It helps if your page is straightforward, as this makes it easier for the search engine to read. Avoid anything peculiar. If in doubt, start with a text only page with no formatting or graphics at all, then spend hours on optimising the usefulness to the reader, working on the language and paragraphs, and only then start gradually adding the fancy formatting and image padding. CSS can be useful for presentation and positioning if it reduces the overall page size. Validation of the page (http://validator.w3.org) is worthwhile as it may spot a serious mistake. Validators do indicate many minor errors but these are often overlooked by search engines.
Images: To get images properly indexed (and this applies only to interesting images that people might want to find), use the correct keywords in the image alt="text" and in the body text immediately adjacent to the image, on either side of it. Also use image file names like keyword1-keyword2.jpg, with a hyphen to separate the two words. If an image is a link it is helpful to have alt text on it to describe the destination page.
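A rough sketch of these image points is given below. All file names, alt text and surrounding sentences are made up for illustration, with "satellite" and "dish" standing in for keyword1 and keyword2.

```html
<!-- Interesting image: keywords in the hyphenated file name,
     in the alt text and in the text either side of the image -->
<p>This satellite dish is shown on its roof mount.</p>
<img src="satellite-dish.jpg" alt="Satellite dish on a roof mount">
<p>The satellite dish points towards the south.</p>

<!-- Image used as a link: the alt text describes the destination page -->
<a href="dish-alignment.htm"><img src="alignment-icon.gif" alt="Dish alignment instructions"></a>
```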
Avoiding getting downgraded: Don't use hidden text. Don't put links to dubious sites or link-farms. Don't participate in link schemes. Don't link to any site that demands a reciprocal link - the demand itself is evidence of wrongdoing - put links on merit for the benefit of your visitors. Don't overuse particular words or word pairs. Don't put up similar sites or duplicate pages. Don't buy incoming links in an attempt to get better ranked. You may sin mildly in some of these areas and be downgraded a little. Regarding hidden text, be careful that the text really does appear visible to the search engine. White coloured link text on a white background almost certainly counts as bad, even if it shows up visibly as blue or purple on the browser screen. You need to read the html code to spot these errors. Be extra careful if you ever use black backgrounds or white text on coloured backgrounds. When you later come to edit the page or copy bits to another page you may carry over underlying html code about the colours of the text that may not be obvious.
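One way the white-on-white link problem might arise is sketched below. This is an assumption about the kind of leftover code meant, not a confirmed case: the html asks for white text on a white page, but because the text is a link, browsers normally paint it in their own link colour, so nothing looks wrong on screen. The old-style bgcolor and font markup is used deliberately, as the sort of colour code that gets carried over when editing, and the file name is made up.

```html
<html>
<head>
<title>Hidden text pitfall</title>
</head>
<body bgcolor="#FFFFFF">
<!-- Risky: the code says white text on a white background.
     Browsers normally apply their own link colour to the <a>,
     so on screen this usually appears blue or purple. -->
<font color="#FFFFFF"><a href="example-page.htm">Example page</a></font>

<!-- Safer: no leftover colour code around the link -->
<a href="example-page.htm">Example page</a>
</body>
</html>
```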
Duplicate pages: This is to be avoided, but there are cases where large numbers of pages are virtually identical. An example is my forum where the "User profile display page" is 95% identical for around 2200 pages, the only difference being the username and email currently displayed. In these cases make sure the pages are not indexed by search engines, either by using NOINDEX, NOFOLLOW in a robots meta tag in the head area or by using your robots.txt file. Google provides a wildcard * option so you can easily stop large numbers of similar pages being indexed by using just one line in the robots.txt file. If you have all pages duplicated for the purpose of having "printer friendly" versions, the instructions are to block one of each pair in the robots.txt file.
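The NOINDEX NOFOLLOW option can be sketched as a robots meta tag in the head area of each near-duplicate page. The title and body text here are made up for illustration.

```html
<!DOCTYPE html>
<html>
<head>
<title>User profile</title>
<!-- Asks search engine robots not to index this page
     and not to follow the links on it -->
<meta name="robots" content="noindex, nofollow">
</head>
<body>
<p>Profile details that differ only slightly from page to page.</p>
</body>
</html>
```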
Dynamic pages: Dynamic pages with question mark ( ? ) parameters in the url may not be processed successfully by search engines. In particular avoid session variables where the same content is generated under a variety of url names. This can cause the search engine to index the same page countless times, all with identical content but with different url extensions. This gets the search robot annoyed and might easily be misinterpreted as deliberate page duplication spam - it certainly wastes the search engine's storage space. There is also something peculiar about the way PR works with dynamic pages. I suspect that a PR value is assigned only to the root url name and that the displayed PR given to all the pages with extension codes is just a display of the root PR divided by an estimate of the number of pages, to keep you happy. One good recommendation is to have proper static versions of all significant dynamic pages and then block your entire dynamic page system using the robots.txt file.
Doorway pages: I've never quite worked out what a doorway page is, but the thing to remember is that doorway pages are to be avoided, as it is said that search engines intensely dislike them. I think a doorway page is supposed to be an isolated, almost stand-alone page, with a link to your normal web site pages. Anyway, if you think you know what a doorway page is, try to avoid making one. I am sorry I can't be more helpful. I think I had several doorway pages once, that said "You have arrived here via a misspelled link - please click here to go to the intended page". I have deleted them in case they were being interpreted as spam and replaced them with proper 301 permanent redirects in the Apache server configuration file.
Content management system (CMS): If you are generating pages using a content management system (CMS), bear in mind that the search engine is interested only in the real core information that you are putting out, and that this is often best expressed by presenting the input data files direct to the search engine as text. So if you have anything interesting to say, put it up as plain text pages and make all the CMS type web pages NOINDEX NOFOLLOW to the search engines. In many instances a CMS just dilutes any genuine information amongst a load of repetitive dross.
Remember: Search engines are only interested in pages that are helpful in solving searchers' needs.
Further comments to me please by email to: Eric Johnston
All content Copyright Reserved (c) Satellite Signals Ltd. Started 10 Oct 2003. Last updated: 6 May 2008