The FAST and Fresh Machine

By Paul · February 12, 2002 · Filed in Search Engine Guide 2005-2007

Search technology is evolving rapidly, and Fast Search & Transfer (FAST) ranks among the best with its AllTheWeb public search engine. Like many search engine companies, FAST pursues a two-pronged business strategy: its public search engine delivers fresh results fast, and it also serves as a showcase and testing ground for OEM partners.

To quote Dr. Rob Rubin, FAST general manager of the Internet Business Unit, “The same search technology developed by FAST to power Internet portals like Lycos and T-Online is used to provide ASP-based hosted search solutions for major corporations like IBM. We distribute our technology in shrink-wrapped packages and OEM for companies like Reuters and TIBCO. We also supply major e-commerce vendors like eBay and BroadVision.”

Rubin went on to explain that FAST has modest financial goals for AllTheWeb because it’s used not only to deliver results to a sophisticated user base, but also to test FAST’s technology. “We’re looking to cover costs, not drive profits – that’s for our OEM partners to do with their destination sites,” Rubin said.

FAST did well financially, reporting a Q3 2001 revenue of $9.6 million, an increase of over 10 percent compared to the $8.7 million reported in Q2. Such performance doesn’t happen without some frugality. In 2001, FAST trimmed its overhead and narrowed its focus toward revenue-producing product suites such as FAST Web Search and FAST Data Search. By year-end, its workforce had shrunk from 230 to 170, and management resources became concentrated in Oslo, Norway, and Boston due to downsizing in the U.S., Norway, and the U.K.

Monetizing the Internet Business Unit

FAST consists of two major business units. The Internet Business Unit (AllTheWeb) competes with search engines like Google, Inktomi, and AltaVista. The Enterprise Business Unit, which powers Internet portals and provides OEM solutions to major corporations, competes with firms like Google, Inktomi, Autonomy, and Verity. In Q3 2001, the Enterprise Unit accounted for 60 percent of the revenue.

Plans are underway to further monetize the Internet unit by promoting AllTheWeb’s paid-inclusion PartnerSite service, which guarantees entry in the company’s databases for a per-URL fee (administered through partners like Terra Lycos). While this service does not influence positioning or ranking, it gives visibility to many different file types and dynamic content.

PartnerSite includes a 24-hour refresh and distribution through all of FAST’s customers using the same technology (e.g., IBM). This option isn’t cost-effective for small sites, but it can be ideal for those who have time to monitor and rewrite each page every 24 hours to achieve high rankings (provided their SEO operations include staffing for this redundant monitoring and modification).

What’s been done to enhance AllTheWeb? “FAST just introduced its linguistics tool kit to automatically detect phrases and provide accurate search results,” replied Rubin. I checked this out by entering “Who is Andrea Bocelli?” and “What is the weather in Rome, Italy?” Excellent references and fast!

The Deep Crawl

To index more of the Deep Web and run faster, FAST completely redesigned and relaunched its crawler technology last November. AllTheWeb can index multiple document types and crawls most dynamic content. For example, see the PDF documents indexed at Scirus or the video clips on Megasoccer.com.

How does dynamic indexing work? Stephen Baker, Director of Marketing and Business Development for the Internet Business Unit, answers that question for us. “FAST employs a real-time filter and fast indexing schedule for up-to-date information. Non-HTML files are indexed from HTTP and FTP servers using our state-of-the-art crawling and indexing technologies. The index supports over 200 file types, including PDF, Real Media, JPEGs, and a variety of executables. In addition, we have the ability to work with Web sites and content owners to incorporate database-driven content in the Web Search index through our FAST PartnerSite system.”

Companies with content in multiple formats can have their content indexed through FAST’s PartnerSite inclusion program. Firms like IBM have the option to decide which documents on their public Web sites they want to make available through the index. Note that many different document types are already available by FTP search on AllTheWeb.

FAST also created a real-time cluster in its data center to support news crawling, which is based on the same technology used to power Reuters. FAST uses this technology to continuously crawl 3,000 premier news sites, retaining the index for five days and continually refreshing it every two hours. The index is updated in sub-second latencies as new documents are crawled.
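
To make the retention and refresh figures concrete, here is a minimal sketch of how a five-day window and a two-hour crawl cycle might be tracked. It is not FAST's implementation: the function names, the dictionary layout, and the reading of "every two hours" as a per-site recrawl interval are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Figures taken from the article; how FAST applies them is an assumption.
RETENTION = timedelta(days=5)         # news documents stay in the index five days
REFRESH_INTERVAL = timedelta(hours=2)  # each site is revisited every two hours


def prune_expired(index, now):
    """Keep only documents crawled within the retention window.

    `index` is a hypothetical mapping of document URL -> crawl time.
    """
    return {url: crawled for url, crawled in index.items()
            if now - crawled <= RETENTION}


def sites_due_for_crawl(last_crawled, now):
    """Return the sites whose last crawl is at least one refresh interval old.

    `last_crawled` is a hypothetical mapping of site -> last crawl time.
    """
    return [site for site, crawled in last_crawled.items()
            if now - crawled >= REFRESH_INTERVAL]
```

A scheduler built on this idea would call `sites_due_for_crawl` in a loop and run `prune_expired` after each pass, so stale stories drop out as fresh ones arrive.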

Tips for Top Listings

FAST suggests that webmasters focus on the basics, with emphasis on leading-paragraph content. “Clean design, good title text, clear descriptions, and good linkage from appropriate reference points within the site will help improve your rankings,” said Rubin. “Always focus on improving your site content.”

“The FAST robot starts off by checking Open Directory Project (ODP) descriptions. If none are found, it will look at Meta descriptions. Again, if none are found, it will examine the first 250 characters of content. Linguistic analysis of keyword frequency may reveal suspicious patterns, such as too many adjectives or the overuse of keywords. Such documents are flagged as spam,” he continued. I might add that FAST won’t tolerate cloaking or spam and fights vigorously to eradicate both, which includes getting offenders blacklisted.
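
The fallback order Rubin describes (ODP description, then meta description, then the first 250 characters) can be sketched in a few lines, along with a crude keyword-overuse check. This is only an illustration under stated assumptions: the function names are hypothetical, and the 20 percent threshold and 20-word minimum are invented for the example, since the article gives no figures for FAST's linguistic analysis.

```python
import re
from collections import Counter

SNIPPET_LENGTH = 250  # from the article: first 250 characters of content


def pick_description(odp_description, meta_description, body_text):
    """Return the first available description, in the order the article gives."""
    for candidate in (odp_description, meta_description):
        if candidate and candidate.strip():
            return candidate.strip()
    # Neither description exists, so fall back to the start of the page content.
    return body_text.strip()[:SNIPPET_LENGTH]


def looks_keyword_stuffed(text, max_share=0.2, min_words=20):
    """Flag text in which one word makes up more than `max_share` of all words.

    Both thresholds are assumptions for illustration, not FAST's values.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < min_words:
        return False  # too short to judge
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words) > max_share
```

For example, `pick_description(None, "", page_text)` falls through to the first 250 characters of `page_text`, which is one reason a strong leading paragraph matters for pages without a directory listing or a meta description.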

Changing User Behavior

The percentage of users finding Web sites through search engines has fluctuated over the years (86 percent according to the GVU User Surveys and 42 percent or lower in other surveys). It’s a given that most people go to search engines seeking information. Rubin provided information from a Jupiter report (August 2001) in which Rob Leathern stated that (1) experienced users most often use search engines for needed data, and (2) about 30 percent of e-commerce purchases start at a general-purpose search engine. “FAST knows from its log investigations that people are launching their searches and navigating more using a public search engine,” added Rubin.

What’s in Store for the Future?

I think we’ve all noticed the trend toward categorization these days. Rubin predicts more navigation features in future search engines, much like “FAST Topics” (listings shown at the top of the page to help users further refine searches within specific categories). “Search companies will concentrate on better intelligence, leading users to the data they seek. This includes categorization and phrase matching with larger phrase dictionaries,” said Rubin.

Because of the continuing demand to access more Web content, Rubin believes we’ll see more specialized catalogs such as those created for science data at Scirus and for soccer information at Megasoccer.

Predicting a period of great innovation, Rubin says FAST is set to “attack the four pillars of search: relevancy, freshness, size, and speed.” Looks to me like they’ve got a good head start.