A crib sheet for search-savvy marketers

By · June 15, 2008 · Filed in iMedia Connection 2005-2008

Many of you recall John Battelle’s description of Google as “The Database of Intentions” in “The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture.”

Battelle thought Google was “sitting on a gold mine of information” due to the billions of queries going through its servers. He was right.

But does Google satisfy intention in a query? Perhaps not quite, since we all need better search results, right? New search technologies are launched each year trying to supplant Google. Commercial search has been growing since 1995, back when I was starting my first search engine positioning work on GlobalSafety.com. Best practices in those days were about installing proper meta data and submitting your site to search engines manually. This was before the term “search engine optimization” was coined, and prior to black hat vs. white hat and paid search advertising entering the search marketing landscape.

The history of search in a nutshell
While Tim Berners-Lee proposed the World Wide Web in 1989, the internet of those days was merely a collection of servers for information sharing among academics and government. In 1990, Berners-Lee developed the first web browser and called it WorldWideWeb. The browser enabled users to locate web pages and view text and graphics on a page. It was the breakthrough that made the web available to a wide audience of home and business computer users, spawning the commercial aspect of the web.

Even though we had the World Wide Web in the early nineties, it didn’t truly evolve until more advanced browsers like Netscape Navigator (1994) and Microsoft Internet Explorer (1995) were launched. We all know Internet Explorer became the browser of choice because it was bundled with the Microsoft Operating Systems (Windows 95 and 98). However, AOL and Mozilla may get the last laugh as Firefox gains market share over Internet Explorer.

In the early days, the storing and retrieving of files was done by FTP (File Transfer Protocol), a means for exchanging files over the internet. However, the data was not organized, which made it difficult to find anything. That’s when the first search engine, Archie, was born, organizing the files and making them available to users upon a query. Archie answered user queries by searching the FTP sites across the internet, indexing the files and giving users access to its database. As search tools matured, Veronica and Gopher located documents rather than files.

Many search engines and directories were launched between 1994 and 1996, including Yahoo, Open Directory Project (DMOZ), WebCrawler, AltaVista, LookSmart, Lycos, HotBot, Infoseek and Excite. AOL, MSN and Google joined the fray in 1997 and 1998. Besides these major players, there were a number of specialty search engines on the web as well.

The foundation of modern search marketing
Two major events took place in the late 1990s that laid the groundwork for today’s competitive search marketing environment: the advent of Google PageRank and pay-per-click advertising.

Search took a dramatic shift when Google introduced PageRank in 1998, propelling the new kid on the block on its path to search dominance. The differentiator was the inclusion of authoritative links in the organic algorithm, providing much needed relevance. Users noticed in a hurry, and Google practically became an overnight sensation.

Another bold move in late 1997 was the introduction of pay-per-click advertising on GoTo.com (aka Overture and currently Yahoo Search Marketing).

GoTo got off to a slow start, with many search practitioners dismissing its importance. But advertisers and customers loved it. Despite the bad rap of paying for position, pay-per-click eventually dominated internet marketing.

These two search innovations, linking to authority sites (PageRank) and pay-per-click advertising, were the twin forces that made Google the profitable search/media giant it is today.

Phases of search development
The evolution of search has been described in terms of software-style version numbers and has encompassed a number of dynamic changes over the past 13 years. Search Engine Land’s Danny Sullivan summarized the progression of search in his keynote at SMX-West 2008 as follows.

  • Search 1.0: Location and frequency of terms on the page — 1994-1998
  • Search 2.0: Off-page factors: PageRank and anchor text — 1998-2008
  • Search 3.0: Blended search — 2007-2008
  • Search 4.0: Personalized and social search — 2007-2008

Search progress can also be described in terms of algorithm development:

  1. Text Phase: Algorithms used the location and frequency of keywords on the page for ranking results (1994 to 1998).
  2. Link Phase: Algorithms focused more on off-page factors due to Google PageRank. Linking and anchor text became important for ranking results (1998 to present).
  3. Social Phase: Algorithms include past history and clickstream data; universal results include vertical databases (2007 to present).
  4. Behavior Phase: Algorithms to include user intent (2008 and beyond).

Google’s organic search algorithm uses personalization, blended search and social search to provide search results. It is the only search engine doing all that, even though Ask 3D includes blended search results. Google results are based on history and clickstream data, including your own behavior/clicks, those of people you know, and those of people in aggregate.

The screenshot below shows the effect of including personalization and vertical databases in web search results on a query from my computer for the movie “Zohan.” Besides the website, there are news results, YouTube and blog results. However, the same query would likely provide different search results for others because their web history and clickstream data are different from mine.

Currently, a typical Google search touches some 700 to 1,000 machines in Google’s data centers and can return a list of 5 million results in 0.16 seconds. Google is investing $2 billion each year into its 30 data centers and operates more than 467,000 servers.

Alternative search technologies
Alternatives to Google have come and gone over the years, but none have yet challenged Google’s popularity. Google itself noted the inadequacy of its content-based filtering system in 2007 by implementing Personalized, Social, and Blended Search. Below is a brief review of a few alternative search technologies.

  • Search engines like Quintura and Clusty provide a cluster of results based on tags and keywords. Users view results along with the clusters of relationships in graphic form. This shows how the query is related to other topics, which can be pursued or not.
  • Engines like Hakia and Powerset use technologies that apply logic to infer user intent, then they use that knowledge to improve search results. They use natural language processing to understand the meaning behind a user query and personalization to tailor queries to user interests.
  • Collarity uses recommendations-based technology to provide search results based partly on content analysis and partly on the recommendations of individuals and groups, calling it Content Search and Discovery. It is focused on improving website monetization through relevant site search results and content recommendations driven by the anonymous behavior of a site’s audience.
  • Rollyo was one of the first social search engines allowing users to create their own search engine of trusted sources. This resulted in a searchroll restricted to selected websites that can be shared with other users. Search sites that restrict data sources to provide better search results include A9 and verticals like Retrevo (electronics search engine).
  • Mahalo is a human-powered search engine, using guides (editors) to create search results for popular search terms. Users can edit a page on Mahalo if they are registered and logged in, although Mahalo guides check every edit for accuracy. When Mahalo has no page for a search term, it serves results from Google and other engines.
  • ChaCha and Wikia also use human editors in an attempt to provide better search results. ChaCha pays human guides to answer questions for mobile users. Ask a question like, “How old is Hillary Clinton?” and it will tell you she’s 60 and was born October 26, 1947. Wikia, on the other hand, generates machine results first, and then teams of volunteer tech enthusiasts filter the sites and rank search results within a community-based model similar to Wikipedia. Since inception in January, Wikia has about 25,000 mini articles and 60,000 edits to the wiki, along with 20,000 registered users. However, Wikia lacks volume, as its database hasn’t had enough time to fully evolve.

There are hundreds of alternative search engines looking for the Holy Grail of better search results. What can be used to meet the demands of modern-day searchers? The latest trend is to put the human touch back into search with a focus on human intent.

User intent can be used in search technology for the benefit of consumers and marketers alike. Recommendations-based search technology may be ideal for tapping into user intent. While most search engines use algorithms based on content filtering technology, collaborative filtering technology can be used to make recommendations based partly on content analysis and partly on the recommendations of a person or group.

Content filtering vs. collaborative filtering
Search engines currently use two types of search technology: content filtering and collaborative filtering. Most of today’s search engines use content filtering systems. While every search engine has its own secret sauce consisting of hundreds of ranking criteria, content relevancy is basically measured by density (the number of times the word occurs relative to the size of the document) and proximity (the occurrence of the terms near one another).
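The two measures above can be sketched in a few lines of Python. This is only an illustration, assuming a simple word tokenizer and a made-up sample document; the function names (`tokenize`, `density`, `proximity`) are hypothetical, and no real search engine is claimed to score pages this simply.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def density(term, tokens):
    """Occurrences of the term relative to the document length."""
    if not tokens:
        return 0.0
    return tokens.count(term) / len(tokens)

def proximity(term_a, term_b, tokens):
    """Smallest distance in words between two terms, or None if either is missing."""
    pos_a = [i for i, t in enumerate(tokens) if t == term_a]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)

# Made-up sample document for illustration only.
doc = "Safety gloves protect hands. Buy safety gloves and safety glasses online."
tokens = tokenize(doc)
print(density("safety", tokens))               # share of words that are "safety"
print(proximity("safety", "glasses", tokens))  # words between the two terms
```

A higher density and a smaller proximity value would both suggest a closer match to a query such as “safety glasses,” which is roughly what a content filtering system rewards.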

Recommendations-based algorithms with collaborative filtering offer the advantage of making recommendations based partly on content analysis and partly on the recommendations of individuals or groups. This brings the wisdom of crowds to users evaluating products on retail sites. However, ratings can be skewed as all users don’t rate all items, and ratings are subjective. Despite this, recommendations-based search results have proven useful.
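To make the idea concrete, here is a minimal sketch of user-based collaborative filtering. The ratings, user names and function names are invented for illustration, and cosine similarity is just one common choice of similarity measure; production recommenders are considerably more elaborate. Note that the users below do not all rate the same items, which is the sparsity problem mentioned above.

```python
from math import sqrt

# Hypothetical ratings (1-5) for illustration; users do not rate every item.
ratings = {
    "ann":  {"camera": 5, "tripod": 4, "phone": 1},
    "bob":  {"camera": 4, "tripod": 5, "lens": 5},
    "cara": {"camera": 1, "phone": 5, "headset": 4},
}

def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings):
    """Rank items the user has not rated by similarity-weighted ratings of others."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {i: scores[i] / weights[i] for i in scores if weights[i] > 0}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)

print(recommend("ann", ratings))
```

In this toy data, Ann’s tastes resemble Bob’s far more than Cara’s, so Bob’s favorite item ranks first among her recommendations.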

Text-based vs. behavior-driven search
Is the combination of content filtering and collaborative filtering good enough to provide optimum relevancy, or do we need more information to determine intent? Over the past decade, we have progressed from text-based to behavior-driven search.

Our first algorithms were based on text on the page with an emphasis on meta tags and keywords. Robots followed links across the web, making copies of new pages. The pages were stored in a database called the index. Robot crawlers indexed millions of pages in those days compared to the billions indexed today. This ask-and-find technology was fairly crude compared to the sophistication of today’s search algorithms.
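The index described above can be pictured as a lookup table from words to pages. The sketch below assumes a handful of made-up pages that a crawler has already copied; the URLs and the `build_index` and `search` functions are hypothetical and only illustrate the ask-and-find idea.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages (URL -> text); a real crawler would fetch
# these by following links across the web.
pages = {
    "example.com/gloves": "Safety gloves for industrial work",
    "example.com/glasses": "Safety glasses and goggles",
    "example.com/about": "About our industrial supply company",
}

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return URLs containing every query word, in sorted order."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    results = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(results)

index = build_index(pages)
print(search(index, "industrial safety"))
```

Matching is purely on the words present, with no notion of authority or intent, which is why results from this era were so easy to manipulate.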

The big change came about when Google introduced link analysis. The reasoning was that links from authoritative sites made your site more relevant for its subject matter. This worked well for a number of years until sites started paying for links, partially diluting the effectiveness of links to determine relevancy.
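The reasoning behind link analysis can be illustrated with a simplified PageRank calculation. This is a textbook-style sketch, not Google’s production algorithm: the link graph is invented, and the damping factor of 0.85 and the 50 iterations are assumed values chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration over a dict of page -> outbound links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each outbound link passes on an equal share of the page's rank.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outbound links spreads its rank evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: three sites link to "authority",
# which in turn links to one of them.
links = {
    "blog": ["authority"],
    "shop": ["authority", "blog"],
    "forum": ["authority"],
    "authority": ["blog"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

The page with the most inbound links ends up with the highest score, and the one page it links to benefits as well. That second effect is also why buying links from high-ranking sites became so tempting.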

In response, Google added a social component to search: It offered Google account holders a personalized homepage and started using a user’s web history in personalized search results. Google has been using information from users’ search history and Google homepage for well over a year. After personalization, Google started including its vertical databases in web results so users get blended search results of text documents, images, videos and many different types of content on a web search. This not only improves relevancy for users, it provides many new opportunities for marketers, as sites can create content to rank in multiple search results.

Behavior-driven search
The behavior phase has just begun and is manifest in product-driven search for ecommerce with behavior-driven algorithms. Behavior-driven search is based on the premise that each human has a set of core values that can be identified and measured. This information can then be used to determine user intent when a query is made. For instance, if you add human intent to the history and clickstream data, queries suddenly become more relevant because the matches come closer to the person’s query. But how do we discover human intent?

The behavior phase of search technology will be based on human needs rather than on content. This means new search algorithms will be based on the needs of users rather than the needs of content providers. This approach began with the recommendations-based technology used to power large ecommerce sites, as well as the voting algorithms on social sites like StumbleUpon, SeeqPod (music search), Scouta (media recommendations), and Criticker (movie reviews). Collective wisdom can be useful because it helps people make decisions by becoming aware of what other users like or dislike about any product or item.

Recommendations-based search engines
ReadWriteWeb provides a list of ten recommendation engines besides the usual suspects like Netflix and Pandora. Among these are MyStrands, MatchMine, Zync and StumbleUpon. Not only does recommender technology play an important role in ecommerce, it is increasingly being used in general search engine algorithms. That’s because it’s a well-known fact that people trust people they know more than advertisers when it comes to product recommendations. So, it’s a no-brainer that recommendations-based technology will become pervasive in the search industry as technologies develop and mature.

The power of site search
Recommendations-based technology is creating new ecommerce revenue opportunities and is increasing both customer retention rates and the number of shoppers who become buyers. This is possible with the one tool that lets shoppers personalize their shopping experience: site search. With advanced site search and good navigation, shoppers can immediately filter hundreds of thousands of SKUs (stock keeping units) for the five or ten products they are looking for. Every dynamically generated search results page is an opportunity for a retailer to offer the right merchandising message to match a unique shopper’s expressed intent.

This combination of search and merchandising is called searchandising, and it can greatly increase conversion rates and average order values. It’s a no-brainer for merchants to use the information from site search to tailor promotions for each shopper. For example, Mercado Software is an excellent site search engine for sophisticated ecommerce sites.

Customer-centric marketing
How does recommendations-based technology increase customer retention and sales? One large etailer creates a customer-centric experience by generating a microsite for each and every customer. Your site experience is dynamic rather than static and becomes more personalized as you continue to shop. Notice that after you conduct a few searches, the etailer will provide a “Recommended for you” box highlighting products related to your prior searches. In other words, if you search for an iPhone, it assumes you might be interested in other high-tech products. We’ve all seen those “customers with similar searches purchased …” promotions. The merchandising message is, “others like you bought these items, so you might think about buying this.”

You’d be surprised how effective this is, and it all takes place on the ecommerce homepage. The most important merchandising page on the site is completely automated, dynamic and personal.
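A “customers like you also bought” box can be approximated by counting which products show up together. The sketch below uses an invented order history and hypothetical function names; it is not the etailer’s actual method, and the same counting could just as well be applied to searches instead of purchases.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical order history; each set is one customer's purchases.
orders = [
    {"iphone", "headset", "case"},
    {"iphone", "case"},
    {"iphone", "headset"},
    {"camera", "tripod"},
    {"iphone", "case", "charger"},
]

def build_co_purchases(orders):
    """Count how often each pair of products appears in the same order."""
    co = defaultdict(Counter)
    for order in orders:
        for a, b in permutations(order, 2):
            co[a][b] += 1
    return co

def also_bought(product, co, top_n=3):
    """Products most often bought together with the given product."""
    return [item for item, _ in co[product].most_common(top_n)]

co = build_co_purchases(orders)
print(also_bought("iphone", co))
```

Because the counts update with every new order, the recommendations stay dynamic without anyone hand-picking the merchandise.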

This particular etailer’s revenue grew 37 percent in Q1 2008 compared to the same period in 2007, reaching $4.14 billion in net sales. Net income increased 30 percent to $143 million in Q1. Using similar technology, a smaller etailer increased revenue 400 percent in Q1 2008 compared to the same period in 2007. Not bad for so-called recessionary times.

So, what’s the potential for recommendations-based technology? Gartner research shows two-fifths of U.S. consumers expect retailers to provide personalized promotions. Yet, Forrester reports only 16 percent of retailers currently use personalized recommendations tools. There’s a huge market potential for recommendations-based technology.

VortexDNA
The latest innovation in recommender technology might be VortexDNA, a predictive modeling technology used to improve relevance for many types of online content. Because it is industry agnostic, it can be applied in a variety of situations, from improving search results to raising display ad click rates.

VortexDNA is a plug-in that can be used in ecommerce to suggest products to customers. It is also used in the insurance industry to improve the accuracy of establishing premium rates. In television, it can help to create programming that customers will watch. In human resources, it can help match job seekers with available jobs, cutting recruitment costs. In search, it helps provide more relevant results with better personalization.

The VortexDNA plug-in contains a map of human consciousness, expressed as a seven-digit number revealing the pattern of your beliefs. It uses a unique profiling of your purpose, values and life focus to derive this seven-digit number from the essence of your intention as shaped by your behavior and actions online.

Initially, you must download the MywebDNA Firefox extension, filling out a brief questionnaire to derive your DNA number. Your DNA number is generated from the surveys you take or from your click activity. When you search or browse, MywebDNA matches your numerical profile (who you are based on your purpose, values, and life focus) with a numerical profile of links on the web.

Tip: Unless you are experienced in beta testing new products, we don’t recommend downloading this plug-in.

The plug-in runs invisibly in the background as you browse, appearing only to circle those links that match your profile. Every time you click on a link, MywebDNA updates your DNA to make better predictions next time. You can also fill out additional surveys to enhance your profile.


Conclusion
Recommendations-based technologies are increasingly being used to make search results more relevant. They also have potential in the many applications that personalize the web experience: optimizing advertising placement, improving product referrals and tailoring website content to individual visitors.

Satisfying user intent in search queries from off-site search engines or on-site internal search is a big part of the answer for making results more relevant and meeting the demands of modern-day searchers.

What’s Next? Look for semantic search to rise above Google’s current algorithm. Powerset and Cognition Technologies use systems which understand English language structure and the definitions of words to retrieve search results.

Example: In a semantic search engine, a query asking “which NASCAR drivers lost to Mario Andretti” would return the drivers whom Andretti “defeated,” “triumphed over,” or “beat.” Google doesn’t answer the query correctly; it links to several pages on the driver Andretti and makes no reference to the drivers who lost to him. Powerset comes closer: its No. 6 link in response to the same query finds a page about Tom Sneva, who “lost the CART title to Mario Andretti by 13 points.”

Search will continue to evolve, giving marketers a much more predictable outcome for acquiring new traffic, increasing conversion rates, retaining loyal returning visitors, and extending business strategy online.
