Making your site Search Engine Friendly

From Joomla! Documentation

Revision as of 08:52, 13 July 2013 by Tom Hutchison

Why Create a Search Engine Friendly Site?

In order to add your pages to its database, a search engine (such as Google) will send out so-called crawlers, spiders or bots to harvest the text on your site. These bots cannot harvest things that are created by JavaScript, or 'see' images (though they do check alt attributes), and they don't play well with Flash files, if at all.

While all these things may make the site look better, they do little or nothing for search engine optimisation (SEO) unless you add descriptive information about those resources that is visible to a search engine (but not necessarily visible to your site visitors).

It is important that your website can be found by people who are looking for its content. You must therefore serve content to search engine 'bots' in a way that lets them interpret and analyse it, and judge how relevant it is to the search query.

For this to happen, you need to bring to the attention of the 'bots' important information about the page using various techniques detailed below - almost like a 'signpost' telling the 'bot' what the page contains. It will then compare what you tell it the page is about, with what it finds by itself, and run various algorithms to check if the page is in fact relevant. It also runs other checks to make sure that you are not trying to cheat the system using 'black hat' or 'grey hat' tactics to make your page rank higher.

It is also possible to add contextual information to your website which helps the 'bot' to understand the context of the information it is indexing, ultimately resulting in more appropriate search results pages when people are searching for topics.

Using a Sitemap

While search engines can usually find your pages by the way they are linked from other places on the internet, it is good practice to create a Sitemap which gives search engine 'bots' a list of the pages on your website - think of it as a map to find all the content on your site.
Sitemaps are important not only for search engines; they are also very helpful for people with disabilities, who may need a simple interface to view your site structure and navigate around the site without using your menu structures (see the W3C Working Group Note on Sitemaps).

A sitemap serves several purposes:

  • Provides a structured list showing an overview of all content on your website
  • Allows a visitor to quickly get an overview of your site structure
  • Provides an alternative way of navigating your website, without the need for complex menu structures
  • Provides search engines with a means of finding content which might not be available through your menu structures (e.g. landing pages)

Types of Sitemap

It is possible to provide sitemaps for specific types of information, including:

  • News
  • Mobile URLs
  • Images
  • Video

These specialist sitemaps allow you to provide information relating to the specific media type - for example with a video sitemap you can provide information about the running time, category and family friendly status; with image sitemaps you can specify the subject of the image, its license for use, and type of image.

Creating a sitemap

On a static site, creating a sitemap is simply a case of manually writing an XML file that follows the appropriate standards. On a dynamic site, where content changes regularly, this is not really an option - you would have to manually update the sitemap file every time you added new content!

For this reason there are several sitemap extensions available in the Sitemap category of the Joomla Extensions Directory which allow you to dynamically build a sitemap that meets the standards expected by search engines (see the Sitemaps protocol).

Most of these extensions work by choosing menu items which you wish to include in a sitemap, and specifying how often they change (see Update Frequency). It is also possible to include sub-pages from those menu items (for example, a menu item might lead to a category blog page, but you want to display all the articles which are shown on this page as individual items - another example might be a menu item pointing at a shop category page, and in the sitemap you would want to list the category, and then each product within it as a separate link).

Update Frequency

While you can manually specify in your Sitemap how frequently search engine spiders should visit your website, most search engines have in-built systems which automatically adjust the frequency of return visits based on how often the page in question has changed.

So, for example, if you tell a search engine bot to visit your page on a daily basis, but it finds on each visit that nothing has changed for a week, it may adjust the frequency of revisits accordingly and not return as often as you told it to. You can request, via the various webmaster portals, that the revisit rate be amended if required.

This would suggest, therefore, that if you have regularly changing content, your website will be 'spidered' more frequently - leading to content being indexed more quickly than on websites which do not change often.

It is generally sensible to specify pages which are static to be crawled less frequently than those which change regularly. For example, a static text article might be set with an update frequency of once a month, whereas your blog or news page may be set with an update frequency of once a day or once a week, depending on how often you add new content.

Google Webmaster Tools thread on Googlebot requests & sitemap frequencies

HTML Sitemaps

An HTML sitemap is essentially a table of contents for your site which you can make available to visitors of your website. This serves three purposes:

  1. It provides a place where visitors can go to easily get to any content on your site, even if it isn't necessarily easy to access by other navigation aids on the site
  2. It provides a centralised store of links to the content on your site that can be easily indexed by search engines
  3. It allows users with disabilities to be able to quickly navigate your website with a simple list of links, rather than through complex menus

At the very least, a sitemap should link to the main sections and pages within your site, but the more detailed you can make it, the better.

The sitemap extensions mentioned above can create sitemaps automatically based on Joomla content.

XML Sitemaps

XML Sitemaps are an easy way for webmasters to inform search engines about new and existing pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.

  1. An XML sitemap provides a list of links to the content on your site that can be easily indexed by search engines
  2. It is possible to create specific XML sitemaps for News, Mobile URLs, Images, and Video

Extensions are available that create XML sitemaps automatically based on Joomla content (see the Sitemap protocol for more detail).
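As a rough illustration of what such an extension produces, the sketch below builds a two-page XML sitemap with Python's standard library. The page URLs, dates and the `build_sitemap` helper are hypothetical; only the element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the Sitemap protocol. It is not how any particular Joomla extension works internally.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return sitemap XML (as a string) for a list of page dictionaries."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # The remaining fields are optional hints in the Sitemap protocol.
        for field in ("lastmod", "changefreq", "priority"):
            if field in page:
                ET.SubElement(url, field).text = str(page[field])
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages: a static article and a frequently updated blog.
pages = [
    {"loc": "https://www.example.com/about",
     "changefreq": "monthly", "priority": 0.5},
    {"loc": "https://www.example.com/blog",
     "lastmod": "2013-07-13", "changefreq": "daily", "priority": 0.8},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The static page is given a `monthly` change frequency and the blog a `daily` one, following the guidance in the Update Frequency section. A file served to search engines would normally also begin with an XML declaration, which this sketch leaves out.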

Title tag

The title tag is found in the head portion of your pages. This title tag becomes the clickable title in search engine result pages (SERPs). A title should be under 70 characters in length. It should also include your keywords for the specific page, as close to the start of the title tag as possible.

Google recommends that you create unique, descriptive page titles to describe to searchers what the page is about.

If a page title is not specified or, importantly, if Google determines that the title is not appropriate for the content being returned for the search term, algorithms may be used to generate alternative titles which are more relevant. In the screenshot below you can see the page title for this page (in large blue text).

SEF-Titles.png

Google recommends that you follow these key principles when creating a title:

  • Specify a unique title for every page
  • Make your title descriptive of the page content, and concise
  • Avoid keyword stuffing (repeatedly using similar words like "Foobar, foo bar, foobars, foo bars")
  • Avoid using generic titles - each page should have a unique title, ideally dynamically updated in relation to the content being displayed
  • Brand your titles, but do it concisely and in relation to the content being served
  • Use robots.txt carefully - don't disallow search engines from visiting your website

There are various Webmaster Tools which can be used to identify if there are problems with your listings in a particular search engine - it is always worth paying attention and correcting any problems.

Google support article on using titles for your web pages
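The length and keyword-placement advice in this section can be checked mechanically. The following is a minimal sketch, not an official tool: the `check_title` function is hypothetical, and treating a keyword as 'late' when it starts in the second half of the title is an assumption made purely for illustration.

```python
def check_title(title, keyword, max_length=70):
    """Return a list of warnings for a page title (an empty list means no issues found)."""
    warnings = []
    if len(title) >= max_length:
        warnings.append(f"title is {len(title)} characters; aim for under {max_length}")
    position = title.lower().find(keyword.lower())
    if position == -1:
        warnings.append(f"keyword '{keyword}' does not appear in the title")
    elif position > len(title) // 2:
        # Assumed threshold: the keyword starts in the second half of the title.
        warnings.append(f"keyword '{keyword}' appears late in the title")
    return warnings

# Hypothetical titles for illustration.
print(check_title("Joomla SEO Guide - Example Site", "Joomla SEO"))
print(check_title("Welcome to our website", "sitemap"))
```

A check like this only catches mechanical problems; whether a title is descriptive and unique still needs a human eye.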

Meta Description

The meta description tag allows you to provide a summary of the content on the page in one paragraph. The tag can be used by search engines to display a description of the page when displaying it in search engine result pages (SERPs).

Do I need to include a meta description? I thought search engines didn't use them any more?

With the advent of contextual 'semantic' markup, which allows a website to provide a context for the information, search engines will sometimes find more relevant information within the page itself or from other sources which link to that page (such as DMOZ) and use that in preference to your meta description. This does not mean that the meta description is not used; it simply means that if more relevant information is found to describe that page in search listings, a search engine may use that information rather than your specified description. In fact, Google recommends that every page on your site has its own unique meta description (see Google help on site titles & meta descriptions).

Recommendations for creating meta descriptions

When preparing meta descriptions for your pages, try to keep the total length under 155 characters. There is a limit to the number of characters displayed in search results pages, so if you exceed this your text may be truncated. The meta description (if used) is one of the key factors that a person searching uses to decide whether your link is relevant to their search, so make sure that it reads well, contains your keywords, and explains what is on the page. It is possible to dramatically improve your click-through rate in search engines by making sure your meta descriptions are well optimised and relevant to the content on the page.

Google recommends the following to ensure that you gain the most from your search engine indexing:

  • Ensure every page has unique, relevant meta descriptions
  • Ensure you apply metadata for listing pages (e.g. blog & list layouts) in addition to individual articles - this is commonly overlooked on Joomla! websites
  • Include factual information if relevant (e.g. blog articles may include the author, products might include the price or manufacturer)
  • Consider using automatically generated metadata - but make sure it's relevant, readable and accurate
  • Make your descriptions descriptive!

Google support article on using metadata
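The recommendations above (every page has a description, each is unique, and each stays under the display limit) lend themselves to a simple audit. The sketch below assumes you have already collected each page's meta description into a dictionary; the `audit_descriptions` function and the sample paths are hypothetical.

```python
from collections import defaultdict

MAX_DESCRIPTION_LENGTH = 155  # guideline from the text above, not a hard limit

def audit_descriptions(descriptions):
    """Group pages by problem.

    `descriptions` maps a page path to its meta description text.
    """
    problems = {"missing": [], "too_long": [], "duplicate": []}
    pages_by_text = defaultdict(list)
    for page, text in descriptions.items():
        text = text.strip()
        if not text:
            problems["missing"].append(page)
            continue
        if len(text) >= MAX_DESCRIPTION_LENGTH:
            problems["too_long"].append(page)
        pages_by_text[text.lower()].append(page)
    # Any description shared by more than one page counts as a duplicate.
    for pages in pages_by_text.values():
        if len(pages) > 1:
            problems["duplicate"].extend(pages)
    return problems

# Hypothetical site data for illustration.
site = {
    "/": "Handmade oak furniture built to order in Yorkshire.",
    "/blog": "Handmade oak furniture built to order in Yorkshire.",
    "/contact": "",
    "/about": "Our story. " * 20,
}
report = audit_descriptions(site)
print(report)
```

Listing pages such as `/blog` show up as duplicates here because they share the homepage description, which is the commonly overlooked case mentioned in the list above.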

Joomla! has a global setting for meta description doesn't it?

Joomla! does indeed have a global field for meta description, which you can find under Global Configuration > Site > Global Site Meta Description. This will be used on any page within your site which does not have its own meta description specified, so it can be a 'get out clause' if you don't want to create a meta description for each page. However, search engines do not like duplicate title or meta description tags, and using this approach you will inevitably have many pages with the same meta description.

Consider carefully if you wish to complete these fields - in most cases the meta description in global configuration can be left blank.

Keywords

Keyword Strategy

Before you even begin creating your Joomla site it is important to carry out a keyword research project to identify what the core keywords will be for your site, and associated keywords which complement or are relevant to the core keywords. The reason for this is that keywords should be at the centre of your website design - your site structure (content categories, product groupings and so forth), menu items, content articles and so on should all be optimised to make the best use of your core and associated keywords.

As a basic guideline, ask yourself (or your client) the following questions:

  1. What are my top 20 keywords?
  2. What other associated words and phrases are important? (you can use the Keyword Checker Tool from Adwords to find complementary terms)
  3. What questions will people be asking which my website will answer?
  4. What will be the most important pages on my website? (This is not always the homepage; often these will be pages within the website which contain specific information.)

Meta Keywords

The meta keyword tag found in the head of most Web documents now has little effect for Google, Bing and many of the other larger search engines. However, some search engines, such as Yahoo, do still make use of the meta tags as part of their algorithm, so this is something you should continue to provide.

While it is true that many search engines do not use the information, they do still read the information - so make sure that your tags are relevant and descriptive of the content. As a rule of thumb, your main keywords should feature in your title, meta description, meta keywords and within the first few paragraphs of the content. If they don't, then consider whether you need to include them as keywords.

Keyword Density

Keyword density used to be the buzzword of the Search Engine Optimisation (SEO) world. It has become less important since search engines moved their focus away from meta keywords in their ranking algorithms.

At a basic level, keyword density is the number of times a word appears in the readable content of a page, expressed as a percentage of the total word count. If you have ten readable words and one word is a keyword, the density of that keyword would be 10%. If you have 100 words and one of them is a keyword, you now have a density of 1%.

It is recommended that content pages should try to ensure that keyword density falls somewhere between 3.5% and 7%, as this is considered to be 'readable' without appearing 'spammy'. There are many online tools which will analyse your website to provide estimated keyword densities for your pages.
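The arithmetic above can be reproduced in a few lines. This is a simplified sketch: the `keyword_density` function is hypothetical, it splits text on anything that is not a letter, digit or apostrophe, and it counts exact single-word matches only, so 'sitemap' and 'sitemaps' are counted separately. Online tools may count differently.

```python
import re

def keyword_density(text, keyword):
    """Return the share of words in `text` equal to `keyword`, as a percentage."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return 100 * words.count(keyword.lower()) / len(words)

# Ten readable words, one of which is the keyword.
sample = "Our sitemap lists every page so visitors can browse easily"
print(keyword_density(sample, "sitemap"))  # 10.0
```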

Guidelines on keyword use

It is important to keep at the forefront of your mind that any content on your website should first and foremost be written for human beings, not for search engine 'bots'. When writing an article, for example, ensure that when you read the text back to yourself it makes grammatical sense to include the keywords you are using in the places they have been used. If it sounds strange or odd, or if you've repeated words unnecessarily, adjust your text.

Keywords can (and should) be used in your title, alias, URL, meta description, meta keywords and within the content itself, but you should not try to 'stuff' your content full of keywords - it should always pass the 'readability' test above. Content that does not make sense and/or is full of keywords for the sole purpose of raising the keyword density to a high level could cause your site to be penalised by search engines, which could result in a drop in search ranking position and potentially de-listing of the page in question. It is actually quite difficult to write an article that someone will enjoy reading or find usable while keeping a keyword density over 7%, so it should be obvious if your content falls into this category of potentially spammy material.

This section has a keyword density on the term “keyword” of 1.79% and on the term “keywords” of 2.05%. The combined density is approximately 3.8%. You can see how deliberate you would have to be to push a page's keyword density above 7%!

How to add meta keywords to your site

  • Add your core keywords to your Global Configuration > Site > Global site meta keywords if you wish to use global metadata
  • Add specific meta keywords (which may include core keywords) to articles using the metadata information (make sure the keywords listed here are in the content of the article, title, alias and meta description also)
  • Adding the data does not harm your rankings in search engines, and may help you with Yahoo and meta crawlers
  • Do not add more than 25 words in the metadata
  • Ensure that you separate words and phrases with a comma
  • Do not repeat keywords
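The last three rules (a 25-word ceiling, comma separation and no repeats) are easy to enforce before pasting keywords into Joomla. The sketch below is one possible way to tidy a keyword string; the `clean_meta_keywords` function is hypothetical and not part of Joomla, and stopping at the first phrase that would exceed the word limit is a design choice made for this example.

```python
def clean_meta_keywords(raw, max_words=25):
    """Return a comma-separated keyword string with repeats removed
    and at most `max_words` words in total."""
    seen = set()
    kept = []
    word_count = 0
    for phrase in raw.split(","):
        phrase = " ".join(phrase.split())  # collapse stray whitespace
        key = phrase.lower()
        if not phrase or key in seen:
            continue  # skip empty entries and case-insensitive repeats
        words_in_phrase = len(phrase.split())
        if word_count + words_in_phrase > max_words:
            break  # stop before the word limit is exceeded
        seen.add(key)
        kept.append(phrase)
        word_count += words_in_phrase
    return ", ".join(kept)

print(clean_meta_keywords("joomla, SEO,  joomla , search engine friendly,,seo"))
# joomla, SEO, search engine friendly
```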

Semantic HTML

What is Semantic HTML?

Semantic HTML is a way of using HTML coding to create or enhance the structure of a page. In other words, it's a way of using HTML markup - classes, divs, tags and so forth - to complement the actual words or resources on a page. This helps 'bots' and visitors using screen readers to understand the structure and context of the information on the page, along with its importance, relevance, and how it is related to other resources.

How to use Semantic HTML Markup

It is important to have an understanding of Semantic HTML if you are developing websites or writing content for them, as you will need to use the structural markup regularly.

An example of misuse of Semantic Markup can be found when an article has been written using normal text, but at some point in the text the writer wants to emphasise a particular phrase. They like the styling of the H1 tag, so they apply H1 to this phrase to 'make it look pretty'. Unfortunately, this is confusing to a search engine 'bot' and to users of screen readers, because they are told that this is the main heading of the page - rather than emphasised or important text.

Semantic HTML markup should only ever be used to add structure to a page, not to change the way it looks (this is done using Cascading Style Sheets (CSS) or in-line styling).

An example of Semantic HTML Markup

For example, let's say we have an article:

<h1>Using headings</h1>
This is an article about the importance of headings

<h2>Why use headings?</h2>
It is important to use headings so that search engine bots can tell what is an <strong>important</strong> part of your article

<h3>Types of headings</h3>
You can use set types of headings, but they should be ordered, and structured, within your page.  H1 should be your page title, with H2 being used for sub-headings of the page.  Any headings within your sub-headings should cascade using H3, H4, and H5 as appropriate.

<h2>Is it hard to implement headings?</h2>
It is really easy to implement headings, you just use the appropriate HTML code

<h3>Using headings on dynamic pages</h3>
On dynamic pages, simply wrap your main heading within a H1 (for example, the title of a category listing page would be H1) then wrap all subsequent headings in H2.

Here, a search engine 'bot' could clearly see the structure - h1, h2, h3 - but if we were to simply make these titles bold, underlined and larger font, it would be much more difficult to identify the structure. It is also possible to identify that the word 'important' is an emphasised word, something that is important within the page.
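To see roughly what a 'bot' or screen reader can recover from heading markup, the sketch below extracts the heading outline from the example article using Python's built-in HTML parser. It is a simplified illustration, not a description of how any search engine actually works; the `HeadingOutline` class and `skipped_levels` helper are hypothetical names.

```python
from html.parser import HTMLParser

class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None

def skipped_levels(headings):
    """Return headings that sit more than one level below the previous heading
    (a page that does not start with h1 is also reported)."""
    problems = []
    previous = 0
    for level, text in headings:
        if level > previous + 1:
            problems.append((level, text))
        previous = level
    return problems

# The headings from the example article above, without the body text.
page = """
<h1>Using headings</h1>
<h2>Why use headings?</h2>
<h3>Types of headings</h3>
<h2>Is it hard to implement headings?</h2>
<h3>Using headings on dynamic pages</h3>
"""
parser = HeadingOutline()
parser.feed(page)
for level, text in parser.headings:
    print("  " * (level - 1) + text)
```

If the same titles were marked up as bold, enlarged text instead of heading tags, `parser.headings` would be empty: the visual hierarchy would still be there for sighted readers, but nothing in the markup would describe it.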

Semantic HTML is also:

  • Easier to read (in the code)
  • Easier for accessibility purposes - screen readers function in a similar way to search engine bots to identify important headings
  • Potentially better for search engine optimisation

In the example provided below of this page in Google search results, you can see how the heading tags are being used by Google to identify smaller links within the main page which may be of interest to the person searching for a term (displayed in small blue hyperlinked text beneath the description) - another reason to ensure you structure your content well!

SEF-Titles.png

Microdata

Microdata is a more advanced form of Semantic HTML Markup which allows you to give even more contextual information about the content and structure of your website - for more information start with this page on Microdata.

Wikipedia article on Semantic HTML

Linking to other sites

Building links to other websites has always been an important part of search engine optimisation and is one of the most abused ways of attempting to gain prominence in search listings. As a result, search engines have cracked down on this kind of unnatural activity, heavily penalising websites which try to artificially increase their search engine rankings by building large volumes of reciprocal links with websites over a short period of time. In an attempt to address this problem, search engines regularly update their algorithms, and penalise any sites they find to be operating outside their best practice guidelines (See Google Updates for more information).

Safe link-building

While best practice recommendations change frequently, the 'rule of thumb' is to use common sense. Link building should be a natural process, and the websites which you link to, or that link to your website, should be related in some way to your site.

Anchor Text

Any links you have on your website which lead to an external resource should clearly inform your visitors what they are linking to without being spammy. For example, a link reading 'Joomla! Documentation Project' would be a clear indication that the link is directing people to the homepage of the Joomla! Documentation Project. If the link were written as 'here', it would not be so clear what page you might be visiting - and as you haven't used any keywords from the landing page, their site will not be getting the best benefit from your link.

The words which you use within the hyperlink are known as the 'anchor text'. Ideally these should contain information about the page to which you are linking, and it is even better if these words feature in the URL as well. An example where this is often done poorly on Joomla! websites is where 'Read more' links display 'Read More' rather than the title of the article in the hyperlinked text. Joomla 3.x and above can include the article title in the read more link, which is an improvement.

Another common anchor text which doesn't help the user or search engine 'bots' is a link placed only on the word 'here' - this doesn't tell visitors (or search engine 'bots') anything about the content on the page they are being directed to, unless they have read and understood the surrounding text. Using well-written anchor text also gives the person visiting your site confidence in the link they are clicking - they know what to expect when visiting the link, so they may be more likely to do so.

This is true for both internal links (if you're linking to another page or area on your website) and also on external links to other sites.
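Vague anchor text of this kind can be found automatically. The sketch below scans a fragment of HTML for links whose text is one of a few generic phrases. The phrase list, the `AnchorAudit` class and the sample URLs are all illustrative assumptions; you would adjust the list to suit your own site.

```python
from html.parser import HTMLParser

# Illustrative list of phrases that say nothing about the destination page.
GENERIC_ANCHORS = {"here", "click here", "read more", "more", "link"}

class AnchorAudit(HTMLParser):
    """Record links whose anchor text says nothing about the destination."""

    def __init__(self):
        super().__init__()
        self.vague_links = []
        self._href = None  # href of the link currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            if text.lower() in GENERIC_ANCHORS:
                self.vague_links.append((text, self._href))
            self._href = None

# Hypothetical content for illustration.
snippet = (
    '<p>Visit the <a href="https://www.example.com/docs">Joomla! Documentation Project</a>, '
    'or find it <a href="https://www.example.com/docs">here</a>. '
    '<a href="/news/site-launch">Read more</a></p>'
)
audit = AnchorAudit()
audit.feed(snippet)
for text, href in audit.vague_links:
    print(f"'{text}' -> {href}")
```

The descriptive link passes, while the 'here' and 'Read more' links are reported along with their destinations so that the text can be rewritten.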

A word of warning

It is important to note that with recent updates to Google's algorithms (named Panda, and more recently Penguin), sites with unnatural linking profiles may be down-ranked in search engine positions. One of the main factors in the Penguin update targeted websites which had a large amount of their traffic coming from keyword-stuffed anchor text on hyperlinks from low-value websites. Keep your links relevant, appropriate and related to what you are linking to.

Where possible, it's also wise to regularly check that your links are still valid. The user experience is tainted somewhat if 50% of the links on your site result in a 404 - Page Not Found error. There are tools available online which allow you to check broken links from your site - it is sensible to check this occasionally and correct any links which no longer work.

In short, use links appropriately, and build links to your site in an organic way. Natural link building adds value to your website for your visitors, and brings visitors to your site who are interested in your content. The user experience is improved greatly when you link to any article you may be referencing (whether internal or external), and search engines generally recognise this and favour it.

Important points to Remember

The following points are important to bear in mind:

  • Anything that requires a login will not be 'seen' by a search engine (though some search engines will allow you to tell them how to bypass these)
  • This article is a very basic introduction to the topics discussed
  • Search Engine Optimisation (SEO) is a moving target - it is not something you do once and forget about, but requires continual effort to maintain rankings in search engines
  • SEO is only the start - it might help people find your site through search engines but you still need to have an engaging and accessible website with useful content to retain and convert those visitors

Search Engine Optimisation is an ongoing task. The 'rules' used change frequently, and simply undertaking SEO work once will not guarantee you a high ranking for the rest of time. Unique content is important, but the user experience - the ability of a user to actually interact with your site and find what they need regardless of how or where they are accessing your site from - is becoming far more relevant. If a search engine finds it difficult to navigate your site (e.g. needs 7 'clicks' to reach an article) it will assume that real users will encounter similar difficulties, and you may not rank as well in search engines against competitors who have a better user experience.

Sitemaps can be helpful to provide search 'bots' with a list of pages on your site, but this does not negate the need for an easy to use, clear navigation structure.

Although Search Engine Optimisation is important, focusing on the basic elements of the user experience (easy navigation paths, unique and compelling content etc.) is often one of the best ways to ensure a higher ranking. Simple steps like ensuring appropriate Meta Keywords and Internal links may help to improve that experience further.