Sunday, May 24, 2015

MYTHS AND MISCONCEPTIONS ABOUT SEARCH ENGINES | Chapter 9

Over the past several years, a number of misconceptions have emerged about how the search engines operate. For the beginner SEO, this causes confusion about what's required to perform effectively. In this section, we'll explain the real story behind the myths.

Search Engine Submission

In classical SEO times (the late 1990s), search engines had submission forms that were part of the optimization process. Webmasters and site owners would tag their sites and pages with keyword information, and submit them to the engines. Soon after submission, a bot would crawl and include those resources in their index. Simple SEO!
Unfortunately, this process didn't scale very well, and the submissions were often spam, so the practice eventually gave way to purely crawl-based engines. Since 2001, search engine submission has not only been unnecessary but has become virtually useless. The engines all publicly note that they rarely use submitted URLs, and that the best practice is to earn links from other sites. This will expose your content to the engines naturally.
You can still sometimes find submission pages (here's one for Bing), but these are remnants of the past, and are unnecessary in the practice of modern SEO. If you hear a pitch from an SEO offering search engine submission services, run, don't walk, to a real SEO. Even if the engines used the submission service to crawl your site, you'd be unlikely to earn enough link juice to be included in their indices or rank competitively for search queries.

Meta Tags

Once upon a time, meta tags (in particular, the meta keywords tag) were an important part of the SEO process. You would include the keywords you wanted your site to rank for, and when users typed in those terms, your page could come up in a query. This process was quickly spammed to death, and was eventually dropped by all the major engines as an important ranking signal.
Other tags, in particular the title tag and meta description tag (covered previously in this guide), are crucial for quality SEO. Additionally, the meta robots tag is an important tool for controlling crawler access. So, while understanding the functions of meta tags is important, they're no longer the central focus of SEO.

Keyword Stuffing

Ever see a page that just looks spammy? Perhaps something like:
"Bob's cheap Seattle plumber is the best cheap Seattle plumber for all your plumbing needs. Contact a cheap Seattle plumber before it's too late."
Not surprisingly, a persistent myth in SEO revolves around the concept that keyword density (the number of instances of a given keyword divided by the total number of words on a page) is used by the search engines for relevancy and ranking calculations.
Despite being disproved time and again, this myth has legs. Many SEO tools still feed on the concept that keyword density is an important metric. It's not. Ignore it and use keywords intelligently and with usability in mind. The value from an extra 10 instances of your keyword on the page is far less than earning one good editorial link from a source that doesn't think you're a search spammer.
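For readers who want to see what the metric actually measures, here is a minimal sketch in Python. It assumes density means phrase occurrences divided by total words (some tools count differently), and the function name keyword_density is ours, purely for illustration. It is not something the engines are known to compute.

```python
import re

def keyword_density(text, keyword):
    """Occurrences of `keyword` (a word or phrase) divided by total word count."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase = re.findall(r"[a-z0-9']+", keyword.lower())
    if not words or not phrase:
        return 0.0
    n = len(phrase)
    # Slide a window of the phrase length across the page's words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)
    return hits / len(words)

sample = ("Bob's cheap Seattle plumber is the best cheap Seattle plumber "
          "for all your plumbing needs. Contact a cheap Seattle plumber "
          "before it's too late.")

# The phrase appears 3 times in 24 words.
print(keyword_density(sample, "cheap Seattle plumber"))  # 0.125
```

A high number here tells you only that the page repeats itself; it says nothing about whether the page deserves to rank.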

Paid Search Helps Bolster Organic Results

Put on your tin foil hats; it's time for the most common SEO conspiracy theory: spending on search engine advertising (pay per click, or PPC) improves your organic SEO rankings.
In our considerable experience and research, we've never seen evidence that paid advertising positively affects organic search results. Google, Bing, and Yahoo! have all erected walls in their organizations specifically to prevent this type of crossover.
At Google, advertisers spending tens of millions of dollars each month have noted that even they cannot get special access or consideration from the search quality or web spam teams. So long as the search engines maintain this separation, the notion that paid search bolsters organic results should remain a myth.

SEARCH ENGINE SPAM

As long as there is search, there will be spam. The practice of spamming the search engines—creating pages and schemes designed to artificially inflate rankings or abuse the ranking algorithms—has been rising since the mid-1990s.
The stakes are high. One SEO noted that a single day ranking atop Google's search results for the query "buy viagra" could bring upwards of $20,000 in affiliate revenue. So it's little wonder that manipulating the engines is such a popular activity. However, it has become increasingly difficult and, in our opinion, less and less worthwhile for two reasons:

1. Not Worth the Effort

Users hate spam, and the search engines have a financial incentive to fight it. Many believe that Google's greatest product advantage over the last 10 years has been its ability to control and remove spam better than its competitors. It's undoubtedly something all the engines spend a great deal of time, effort, and resources on. While spam still works on occasion, it generally takes more effort to succeed than producing good content, and the long-term payoff is virtually non-existent.
Instead of putting all that time and effort into something that the engines will throw away, why not invest in a value-added, long-term strategy instead?

2. Smarter Engines

Search engines have done a remarkable job identifying scalable, intelligent methodologies for fighting spam manipulation, making it dramatically more difficult to adversely affect their intended algorithms. Metrics like TrustRank, statistical analysis, and historical data have all driven down the value of search spam and made white hat SEO tactics (those that don't violate the search engines' guidelines) far more attractive.
More recently, Google's Panda update introduced sophisticated machine learning algorithms to combat spam and other low-value pages, and the search engines continue to innovate and raise the bar for delivering quality results.
We obviously don't recommend employing spam tactics. But to assist the large number of SEOs who seek help when their sites get penalized, banned, or flagged, it is worthwhile to review some of the factors the engines use to identify spam. For additional details about spam from the engines, see Google's Webmaster Guidelines and Bing's Webmaster FAQs (PDF).
The important thing to remember is this: manipulative techniques generally won't help you, and they often result in search engines imposing penalties on your site.

PAGE-LEVEL SPAM ANALYSIS

Search engines perform spam analysis across individual pages and entire websites (domains). We'll look first at how they evaluate manipulative practices on the URL level.

Keyword Stuffing

One of the most obvious and unfortunate spamming techniques, keyword stuffing, involves littering keyword terms or phrases repetitively on a page in order to make it appear more relevant to the search engines. As discussed above, this strategy is almost certainly ineffectual.
Scanning a page for stuffed keywords is not terribly challenging, and the engines' algorithms are all up to the task. You can read more about this practice, and Google's views on the subject, in a blog post from the head of their web spam team: SEO Tip: Avoid Keyword Stuffing.

Manipulative Linking

One of the most popular forms of web spam, manipulative link acquisition, attempts to exploit the search engines' use of link popularity in their ranking algorithms to artificially improve visibility. This is one of the most difficult forms of spamming for the search engines to overcome because it can come in so many forms. A few of the many ways manipulative links can appear include:
  • Reciprocal link exchange programs: Sites create link pages that point back and forth to one another in an attempt to inflate link popularity. The engines are very good at spotting and devaluing these as they fit a very particular pattern.
  • Link schemes: These include "link farms" and "link networks" where fake or low-value websites are built or maintained purely as link sources to artificially inflate popularity. The engines combat these by detecting connections between site registrations, link overlap, and other methods targeted at common link scheme tactics.
  • Paid links: Those seeking to earn higher rankings buy links from sites and pages willing to place a link in exchange for money. These sometimes evolve into larger networks of link buyers and sellers, and although the engines work hard to stop them (Google in particular has taken dramatic actions), they persist in providing value to many buyers and sellers (more on that perspective).
  • Low quality directory links: These are a frequent source of manipulation for many in the SEO field. A large number of pay-for-placement web directories exist to serve this market and pass themselves off as legitimate, with varying degrees of success. Google often takes action against these sites by removing the PageRank score from the toolbar (or reducing it dramatically), but won't do this in all cases.
There are many more manipulative link building tactics that the search engines have identified. In most cases, they have found algorithmic methods for reducing their impact. As new spam systems emerge, engineers will continue to fight them with targeted algorithms, human reviews, and the collection of spam reports from webmasters and SEOs.
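To illustrate why reciprocal link exchanges "fit a very particular pattern," here is a toy sketch in Python. It assumes you already have a list of (source site, target site) link pairs; the function name reciprocal_pairs and the sample domains are hypothetical. Real engines almost certainly use far richer signals than this.

```python
def reciprocal_pairs(links):
    """Return site pairs that link to each other, given (source, target) tuples."""
    # Ignore self-links and duplicate edges.
    edges = {(source, target) for source, target in links if source != target}
    # A pair is reciprocal when the reverse edge also exists.
    pairs = {tuple(sorted(edge)) for edge in edges if (edge[1], edge[0]) in edges}
    return sorted(pairs)

links = [
    ("a.example", "b.example"),
    ("b.example", "a.example"),  # points straight back: reciprocal
    ("a.example", "c.example"),  # one-way link
]
print(reciprocal_pairs(links))  # [('a.example', 'b.example')]
```

A single mutual link is perfectly normal on the web; it is large clusters of sites all pointing back and forth that stand out.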

Cloaking

A basic tenet of search engine guidelines is to show the same content to the engine's crawlers that you'd show to a human visitor. This means, among other things, not to hide text in the HTML code of your website that a normal visitor can't see.
When this guideline is broken, the engines call it "cloaking" and take action to prevent these pages from ranking in their results. Cloaking can be accomplished in any number of ways and for a variety of reasons, both positive and negative. In some cases, the engines may let practices that are technically cloaking pass because they contribute to a positive user experience. For more on the subject of cloaking and the levels of risk associated with various tactics, see our article on White Hat Cloaking.

Low Value Pages

Although it may not technically be considered web spam, the engines all have methods to determine if a page provides unique content and value to its searchers. The most commonly filtered types of pages are thin affiliate content, duplicate content, and dynamically-generated content pages that provide very little unique text or value. The engines are against including these pages and use a variety of content and link analysis algorithms to screen out low value pages.
Google's 2011 Panda update took aggressive steps to reduce low quality content across the web, and Google continues to iterate on this process.
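One textbook way to estimate how much two pages overlap is to compare their word "shingles" (short runs of consecutive words) using Jaccard similarity. The sketch below is only an illustration of that general technique under our own assumptions (three-word shingles, hypothetical function names); we don't know which methods the engines actually use.

```python
import re

def shingles(text, size=3):
    """Return the set of `size`-word runs found in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard_similarity(text_a, text_b, size=3):
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty pages are trivially identical
    return len(a & b) / len(a | b)

print(jaccard_similarity("the quick brown fox jumps",
                         "the quick brown fox sleeps"))  # 0.5
```

Scores near 1.0 suggest near-duplicate pages; scores near 0.0 suggest the text is largely unique.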

DOMAIN LEVEL SPAM ANALYSIS

In addition to scanning individual pages for spam, engines can also identify traits and properties across entire root domains or subdomains that could flag them as spam.

Linking Practices

Just as with individual pages, the engines can monitor the kinds of links and quality of referrals sent to a website. Sites that are clearly engaging in the manipulative activities described above in a consistent or seriously impactful way may see their search traffic suffer, or even have their sites banned from the index. You can read about some examples of this from past posts, including Widgetbait Gone Wild and the more recent coverage of the JC Penney Google penalty.

Trustworthiness

Websites that have earned trusted status are often treated differently from those that have not. SEOs have commented on the double standards that exist for judging big brand, high-importance sites compared to newer, independent sites. For the search engines, trust most likely has to do with the links your domain has earned. If you publish low-quality, duplicate content on your personal blog, then buy several links from spammy directories, you're likely to encounter considerable ranking problems. However, if you post that same content on Wikipedia, even with the same spammy links pointing to the URL, it would likely still rank tremendously well. Such is the power of domain trust and authority.
Trust can also be established through inbound links. A little duplicate content and a few suspicious links are far more likely to be overlooked if your site has earned hundreds of links from high-quality, editorial sources like CNN.com or Cornell.edu.

Content Value

As we've seen, an individual page's value is computed in part based on its uniqueness and the visitor's experience; the entire domain's value is assessed in the same way. Sites that primarily serve non-unique, non-valuable content may find themselves unable to rank, even if classic on- and off-page SEO is well-optimized. The engines simply don't want thousands of copies of Wikipedia filling up their indexes, so they use algorithmic and manual review methods to prevent this.
Search engines constantly evaluate the effectiveness of their own results. They measure when users click on a result, quickly hit the back button on their browser, and try another result. This indicates that the result they served didn't meet the user's expectations.
It's not enough just to rank for a query. Once you've earned your ranking, you have to prove it over and over again.

So How Do You Know If You’ve Been Bad?

It can be tough to know if your site or page actually has a penalty. Sometimes, search engines' algorithms change. Or maybe you changed something on your site that negatively impacted your rankings. Before you assume you've been penalized, check for the following:
Step 1: Rule Out
Once you’ve ruled out the list below, follow the flowchart beneath for more specific advice.

Errors

Errors on your site that may have inhibited or prevented crawling. Google's Webmaster Tools is a good, free place to start.

Changes

Changes to your site or pages that may have changed the way search engines view your content (on-page changes, internal link structure changes, content moves, etc.).

Similarity

Check for sites that share similar backlink profiles, and see if they’ve also lost rankings. When the engines update ranking algorithms, link valuation and importance can shift, causing ranking movements.

Duplicate Content

Modern websites are rife with duplicate content problems, especially when they scale to large size. Check out this post on duplicate content to identify common problems.
Step 2: Follow Flowchart


While this chart’s process won’t work for every situation, the logic has proven reliable in helping us identify spam penalties and mistaken flagging for spam by the engines, and separating those from basic ranking drops. This page from Google (and the embedded YouTube video) may also provide value on this topic.

Getting Penalties Lifted

The task of requesting reconsideration or re-inclusion in the engines is painful and often unsuccessful. It's also rarely accompanied by any feedback to let you know what happened or why. However, it is important to know what to do in the event of a penalty or banning.
  1.  If you haven't already, register your site with the engine's Webmaster Tools service (Google's and Bing's). This registration creates an additional layer of trust and connection between your site and the search engine teams.
  2.  Make sure to thoroughly review the data in your Webmaster Tools accounts, from broken pages to server or crawl errors to warnings or spam alert messages. Very often, what's initially perceived as a mistaken spam penalty is, in fact, related to accessibility issues.
  3.  Send your reconsideration/re-inclusion request through the engine's Webmaster Tools service rather than the public form; again, this creates a greater trust layer and a better chance of hearing back.
  4.  Full disclosure is critical to getting consideration. If you've been spamming, own up to everything you've done—links you've acquired, how you got them, who sold them to you, etc. The engines, particularly Google, want the details so they can improve their algorithms. Hold back, and they're likely to view you as dishonest, corrupt, or simply incorrigible (and they probably won't respond).
  5.  Remove or fix everything you can. If you've acquired bad links, try to get them taken down. If you've done any manipulation on your own site (over-optimized internal linking, keyword stuffing, etc.), get it off before you submit your request.
  6.  Get ready to wait. Responses can take weeks, even months, and re-inclusion itself, if it happens, is a lengthy process. Hundreds, perhaps thousands, of sites are penalized every week; you can imagine the request backlog.
  7.  If you run a large, powerful brand on the web, re-inclusion can be faster by going directly to an individual source at a conference or event. Engineers from all of the engines regularly participate in search industry conferences (SMX, SES, Pubcon, etc.). The value of quickly being re-included can be worth the price of admission.
Be aware that with the search engines, lifting a penalty is not their obligation or responsibility. Legally, they have the right to include or reject any site or page. Inclusion is a privilege, not a right; be cautious and don't apply SEO techniques that you're skeptical about.
Source: Moz
Author: Bryan Granse

SEARCH ENGINE TOOLS AND SERVICES | Chapter 8

SEOs tend to use a lot of tools. Some of the most useful are provided by the search engines themselves. Search engines want webmasters to create sites and content in accessible ways, so they provide a variety of tools, analytics and guidance. These free resources provide data points and unique opportunities for exchanging information with the engines.
Below we explain the common elements that each of the major search engines support and identify why they are useful.

Common Search Engine Protocols

1. Sitemaps

Think of a sitemap as a list of files that give hints to the search engines on how they can crawl your website. Sitemaps help search engines find and classify content on your site that they may not have found on their own. Sitemaps also come in a variety of formats and can highlight many different types of content, including video, images, news, and mobile.
You can read the full details of the protocols at Sitemaps.org. In addition, you can build your own sitemaps at XML-Sitemaps.com. Sitemaps come in three varieties:

XML

Extensible Markup Language (recommended format)
  • Pro: This is the most widely accepted format for sitemaps. It is extremely easy for search engines to parse and can be produced by a plethora of sitemap generators. Additionally, it allows for the most granular control of page parameters.
  • Con: Relatively large file sizes. Since XML requires an open tag and a close tag around each element, file sizes can get very large.

RSS

Really Simple Syndication or Rich Site Summary
  • Pro: Easy to maintain. RSS sitemaps can easily be coded to automatically update when new content is added.
  • Con: Harder to manage. Although RSS is a dialect of XML, it is actually much harder to manage due to its updating properties.

Txt

Text File
  • Pro: Extremely easy. The text sitemap format is one URL per line up to 50,000 lines.
  • Con: Does not provide the ability to add meta data to pages.
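If you'd rather generate an XML sitemap yourself than rely on a generator, the sketch below shows one way to do it with Python's standard library. The function name build_sitemap and the sample URLs are hypothetical; the urlset, url, loc, and lastmod elements and the namespace come from the Sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build sitemap XML from (url, lastmod-or-None) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped for us
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2015-05-24"),
    ("http://www.example.com/about?lang=en&ref=home", None),
])
print(sitemap_xml)
```

Save the result as sitemap.xml in your site's root directory. A text sitemap needs none of this: just write one URL per line.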

2. Robots.txt

The robots.txt file, a product of the Robots Exclusion Protocol, is a file stored on a website's root directory (e.g., www.google.com/robots.txt). The robots.txt file gives instructions to automated web crawlers visiting your site, including search crawlers.
By using robots.txt, webmasters can indicate to search engines which areas of a site they would like to disallow bots from crawling, as well as indicate the locations of sitemap files and crawl-delay parameters. You can read more details about this at the robots.txt Knowledge Center page.
The following commands are available:

Disallow

Prevents compliant robots from accessing specific pages or folders.

Sitemap

Indicates the location of a website’s sitemap or sitemaps.

Crawl Delay

Indicates the delay (in seconds) that a robot should wait between successive requests to a server.
An Example of Robots.txt
# Robots.txt www.example.com/robots.txt
User-agent: *
Disallow:

# Don’t allow spambot to crawl any pages
User-agent: spambot
Disallow: /

Sitemap: http://www.example.com/sitemap.xml
Warning: Not all web robots follow robots.txt. People with bad intentions (e.g., e-mail address scrapers) build bots that don't follow this protocol; and in extreme cases they can use it to identify the location of private information. For this reason, it is recommended that the location of administration sections and other private sections of publicly accessible websites not be included in the robots.txt file. Instead, these pages can utilize the meta robots tag (discussed next) to keep the major search engines from indexing their high-risk content.
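To check how a compliant crawler would read a robots.txt file like the example, you can use Python's built-in urllib.robotparser module. This is a minimal sketch: the rules are pasted in as a string rather than fetched from a live site, and "googlebot" simply stands in for any crawler not named in the file. The site_maps() call assumes Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow:

# Don't allow spambot to crawl any pages
User-agent: spambot
Disallow: /

Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "http://www.example.com/some-page.html"
print(parser.can_fetch("googlebot", page))  # True: an empty Disallow permits everything
print(parser.can_fetch("spambot", page))    # False: "Disallow: /" blocks the whole site
print(parser.site_maps())                   # ['http://www.example.com/sitemap.xml']
```

Keep in mind this only shows what a well-behaved robot would do; as the warning above notes, nothing forces a bot to obey.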

3. Meta Robots

The meta robots tag creates page-level instructions for search engine bots.
The meta robots tag should be included in the head section of the HTML document.
An Example of Meta Robots
<html>
  <head>
    <title>The Best Webpage on the Internet</title>
    <meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
  </head>
  <body>
    <h1>Hello World</h1>
  </body>
</html>
In the example above, “NOINDEX, NOFOLLOW” tells robots not to include the given page in their indexes, and also not to follow any of the links on the page.
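If you want to audit which pages on your own site carry these instructions, a short script can pull the directives out of a page's HTML. The sketch below uses Python's standard html.parser; the class and function names are our own, and it only looks at the generic "robots" meta tag, not engine-specific variants.

```python
from html.parser import HTMLParser

class MetaRobotsParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated and not case-sensitive.
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )

def robots_directives(html):
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives

PAGE = """<html>
  <head>
    <title>The Best Webpage on the Internet</title>
    <meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
  </head>
  <body><h1>Hello World</h1></body>
</html>"""

print(sorted(robots_directives(PAGE)))  # ['nofollow', 'noindex']
```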

4. Rel="Nofollow"

Remember how links act as votes? The rel=nofollow attribute allows you to link to a resource, while removing your "vote" for search engine purposes. Literally, "nofollow" tells search engines not to follow the link, although some engines still follow them to discover new pages. These links certainly pass less value (and in most cases no juice) than their followed counterparts, but are useful in various situations where you link to an untrusted source.
An Example of nofollow
<a href="http://www.example.com" title="Example" rel="nofollow">Example Link</a>
In the example above, the value of the link would not be passed to example.com as the rel=nofollow attribute has been added.
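To see which links on a page are followed and which are not, you can sort them by their rel attribute. Here is a minimal sketch using Python's standard html.parser; the class name LinkClassifier and the sample markup are hypothetical. Note that rel can hold several space-separated values, so the code checks for "nofollow" among them rather than comparing the whole string.

```python
from html.parser import HTMLParser

class LinkClassifier(HTMLParser):
    """Sorts a page's links into followed and nofollowed lists."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        rel_values = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)

classifier = LinkClassifier()
classifier.feed(
    '<a href="http://www.example.com" title="Example" rel="nofollow">Example Link</a>'
    '<a href="http://www.example.org/guide">A link we vouch for</a>'
)
print(classifier.nofollowed)  # ['http://www.example.com']
print(classifier.followed)    # ['http://www.example.org/guide']
```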

5. Rel="canonical"

Often, two or more copies of the exact same content appear on your website under different URLs. For example, the following URLs can all refer to a single homepage:
  • http://www.example.com/
  • http://www.example.com/default.asp
  • http://example.com/
  • http://example.com/default.asp
  • http://Example.com/Default.asp
To search engines, these appear as five separate pages. Because the content is identical on each page, this can cause the search engines to devalue the content and its potential rankings.
The canonical tag solves this problem by telling search robots which page is the singular, authoritative version that should count in web results.
An Example of rel="canonical" for the URL http://example.com/default.asp
<html>
  <head>
    <title>The Best Webpage on the Internet</title>
    <link rel="canonical" href="http://www.example.com">
  </head>
  <body>
    <h1>Hello World</h1>
  </body>
</html>
In the example above, rel=canonical tells robots that this page is a copy of http://www.example.com, and should consider the latter URL as the canonical and authoritative one.
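The sketch below shows, in Python, why the five URLs listed earlier look different to a crawler yet can be collapsed to one address. It rests on assumptions that are specific to this hypothetical site: www.example.com is the preferred host, default.asp is the index document, and its file name is not case-sensitive. Your own site's rules may differ, and the canonical tag remains the way to tell the engines your choice.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"   # assumption: the www version is canonical
INDEX_DOCUMENTS = {"default.asp"}    # assumption: served for the bare directory

def canonicalize(url):
    """Map URL variants of the same page to a single canonical URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # host names are never case-sensitive
    if host == "example.com":
        host = PREFERRED_HOST
    path = parts.path or "/"
    directory, _, filename = path.rpartition("/")
    if filename.lower() in INDEX_DOCUMENTS:
        path = directory + "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))

variants = [
    "http://www.example.com/",
    "http://www.example.com/default.asp",
    "http://example.com/",
    "http://example.com/default.asp",
    "http://Example.com/Default.asp",
]
print({canonicalize(url) for url in variants})  # {'http://www.example.com/'}
```

Five distinct strings go in and one comes out, which is exactly the consolidation the canonical tag asks the engines to perform.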

Search Engine Tools

Google Webmaster Tools

Key Features

Geographic Target - If a given site targets users in a particular location, webmasters can provide Google with information that will help determine how that site appears in its country-specific search results, and also improve Google search results for geographic queries.
Preferred Domain - The preferred domain is the one that a webmaster would like used to index their site's pages. If a webmaster specifies a preferred domain as http://www.example.com and Google finds a link to that site that is formatted as http://example.com, Google will treat that link as if it were pointing at http://www.example.com.
URL Parameters - You can indicate to Google information about each parameter on your site, such as "sort=price" and "sessionid=2". This helps Google crawl your site more efficiently.
Crawl Rate - The crawl rate affects the speed (but not the frequency) of Googlebot's requests during the crawl process.
Malware - Google will inform you if it has found any malware on your site. Malware creates a bad user experience, and hurts your rankings.
Crawl Errors - If Googlebot encounters significant errors while crawling your site, such as 404s, it will report these.
HTML Suggestions - Google looks for search engine-unfriendly HTML elements such as issues with meta descriptions and title tags.

Your Site on the Web

Statistics provided by search engine tools offer unique insight to SEOs, like keyword impressions, click-through rates, top pages delivered in search results, and linking statistics.

Site Configuration

This important section allows you to submit sitemaps, test robots.txt files, adjust sitelinks, and submit change of address requests when you move your website from one domain to another. This area also contains the Settings and URL parameters sections discussed above.

+1 Metrics

When users share your content on Google+ with the +1 button, this activity is often annotated in search results. Watch this illuminating video on Google+ to understand why this is important. In this section, Google Webmaster Tools reports the effect of +1 sharing on your site's performance in search results.

Labs

The Labs section of Webmaster Tools contains reports that Google considers still in the experimental stage, but which can nonetheless be useful to webmasters. One of the most important of these reports is Site Performance, which indicates how fast or slow your site loads for visitors.

Bing Webmaster Center

Key Features

Sites Overview - This interface provides a single overview of all your websites' performance in Bing powered search results. Metrics at a glance include clicks, impressions, pages indexed, and number of pages crawled for each site.
Crawl Stats - Here you can view reports on how many pages of your site Bing has crawled and discover any errors encountered. Like Google Webmaster Tools, you can also submit sitemaps to help Bing to discover and prioritize your content.
Index - This section allows webmasters to view and help control how Bing indexes their web pages. Again, similar to settings in Google Webmaster Tools, here you can explore how your content is organized within Bing, submit URLs, remove URLs from search results, explore inbound links, and adjust parameter settings.
Traffic - The traffic summary in Bing Webmaster Center reports impressions and click-through data by combining data from both Bing and Yahoo! search results. Reports here show average position as well as cost estimates if you were to buy ads targeting each keyword.

Moz Open Site Explorer

Moz's Open Site Explorer provides valuable insight into your website and links.

Features

Identify Powerful Links - Open Site Explorer sorts all of your inbound links by metrics that help you determine which links are most important.
Find the Strongest Linking Domains - This tool shows you the strongest domains linking to your domain.
Analyze Link Anchor Text Distribution - Open Site Explorer shows you the distribution of the text people used when linking to you.
Head to Head Comparison View - This feature allows you to compare two websites to see why one is outranking the other.
Social Share Metrics - Measure Facebook Shares, Likes, Tweets, and +1's for any URL.
Search engines have only recently started providing better tools to help webmasters improve their search results. This is a big step forward in SEO and the webmaster/search engine relationship. That said, the engines can only go so far to help webmasters. It is true today, and will likely be true in the future, that the ultimate responsibility for SEO lies with marketers and webmasters.
It is for this reason that learning SEO for yourself is so important.
Source: Moz
Author: Bryan Granse
