A Web search engine is a tool designed to search for information on the World Wide Web. The search results are usually presented in a list and are commonly called hits. The information may consist of web pages, images, information and other types of files. Some search engines also mine data available in databases or open directories. Unlike Web directories, which are maintained by human editors, search engines operate algorithmically or are a mixture of algorithmic and human input.

History

Timeline (full list)

Year  Engine                Event
1993  Aliweb                Launch
      JumpStation           Launch
1994  WebCrawler            Launch
      Infoseek              Launch
      Lycos                 Launch
1995  AltaVista             Launch
      Open Text Web Index   Launch[1]
      Magellan              Launch
      Excite                Launch
      SAPO                  Launch
1996  Dogpile               Launch
      Inktomi               Founded
      HotBot                Founded
      Ask Jeeves            Founded
1997  Northern Light        Launch
      Yandex                Launch
1998  Google                Launch
1999  AlltheWeb             Launch
      GenieKnows            Founded
      Naver                 Launch
      Teoma                 Founded
      Vivisimo              Founded
2000  Baidu                 Founded
      Exalead               Founded
2003  Info.com              Launch
2004  Yahoo! Search         Final launch
      A9.com                Launch
      Sogou                 Launch
2005  MSN Search            Final launch
      Ask.com               Launch
      GoodSearch            Launch
      SearchMe              Founded
2006  wikiseek              Founded
      Quaero                Founded
      Ask.com               Launch
      Live Search           Launch
      ChaCha                Beta Launch
      Guruji.com            Beta Launch
2007  wikiseek              Launched
      Sproose               Launched
      Wikia Search          Launched
      Blackle.com           Launched
2008  Powerset              Launched
      Viewzi                Launched
      Cuil                  Launched
      Boogami               Launched
      LeapFish              Beta Launch
      VADLO                 Launched
      Sperse! Search        Launched
      Duck Duck Go          Launched
      Searchme              Launched
2009  Bing                  Launched

Before there were web search engines there was a complete list of all webservers. The list was edited by Tim Berners-Lee and hosted on the CERN webserver. One historical snapshot from 1992 remains.[1] As more and more webservers went online the central list could not keep up. On the NCSA site new servers were announced under the title "What's New!" but no complete listing existed any more.[2]

The very first tool used for searching on the (pre-web) Internet was Archie.[3] The name stands for "archive" without the "v." It was created in 1990 by Alan Emtage, a student at McGill University in Montreal. The program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a searchable database of file names; however, Archie did not index the contents of these sites.

The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in the entire Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu information from specific Gopher servers. While the name of the search engine "Archie" was not a reference to the Archie comic book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.

In June 1993, Matthew Gray, then at MIT, produced what was probably the first web robot, the Perl-based World Wide Web Wanderer, and used it to generate an index called 'Wandex'. The purpose of the Wanderer was to measure the size of the World Wide Web, which it did until late 1995. The search engine Aliweb appeared in November 1993. Aliweb did not use a web robot, but instead depended on being notified by website administrators of the existence at each site of an index file in a particular format.

JumpStation (released in December 1993[4]) used a web robot to find web pages and to build its index, and used a web form as the interface to its query program. It was thus the first WWW resource-discovery tool to combine the three essential features of a web search engine (crawling, indexing, and searching) as described below. Because of the limited resources available on the platform on which it ran, its indexing and hence searching were limited to the titles and headings found in the web pages the crawler encountered.

One of the first "full text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it let users search for any word in any webpage, which has since become the standard for all major search engines. It was also the first one to be widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was launched and became a major commercial endeavor.

Soon after, many search engines appeared and vied for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista. Yahoo! was among the most popular ways for people to find web pages of interest, but its search function operated on its web directory, rather than full-text copies of web pages. Information seekers could also browse the directory instead of doing a keyword-based search.

In 1996, Netscape was looking to give a single search engine an exclusive deal as its featured search engine. There was so much interest that Netscape instead struck a deal with five of the major search engines: for $5 million per year, each search engine would be in a rotation on the Netscape search engine page. These five engines were Yahoo!, Magellan, Lycos, Infoseek and Excite.[citation needed]

Search engines were also known as some of the brightest stars in the Internet investing frenzy that occurred in the late 1990s.[5] Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some have taken down their public search engine, and are marketing enterprise-only editions, such as Northern Light. Many search engine companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.

Around 2000, the Google search engine rose to prominence.[citation needed] The company achieved better results for many searches with an innovation called PageRank. This iterative algorithm ranks web pages based on the number and PageRank of other web sites and pages that link there, on the premise that good or desirable pages are linked to more than others. Google also maintained a minimalist interface to its search engine. In contrast, many of its competitors embedded a search engine in a web portal.
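The iterative idea described above can be illustrated with a minimal sketch. It is not Google's implementation: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the four-page link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate a rank for each page in a small link graph.

    links maps each page to the list of pages it links to. Every link
    target must also appear as a key. damping and iterations are
    illustrative assumptions, not values taken from the article.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform rank
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: C is linked to by A, B and D.
EXAMPLE_LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
example_rank = pagerank(EXAMPLE_LINKS)
```

In this toy graph, page C ends up with the highest rank because three other pages link to it, while page D, which nothing links to, keeps only the baseline share.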

By 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! acquired Inktomi in 2002, and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! switched to Google's search engine and used it until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.

Microsoft first launched MSN Search (since re-branded Bing) in the fall of 1998, using search results from Inktomi. In early 1999, the site began to display listings from LookSmart blended with results from Inktomi, except for a short time in 1999 when results from AltaVista were used instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).

As of late 2007, Google was by far the most popular Web search engine worldwide.[6][7] A number of country-specific search engine companies have become prominent; for example, Baidu is the most popular search engine in the People's Republic of China.

How Web search engines work

A search engine operates in the following order:

  1. Web crawling
  2. Indexing
  3. Searching

Web search engines work by storing information about many web pages, which they retrieve from the WWW itself. These pages are retrieved by a Web crawler (sometimes also known as a spider), an automated Web browser which follows every link it sees. Exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags). Data about web pages are stored in an index database for use in later queries.

Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. The cached page always holds the text that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it. This problem might be considered a mild form of linkrot. Google's handling of it increases usability and satisfies the principle of least astonishment, since the user normally expects the search terms to be on the returned pages. This increased search relevance makes cached pages very useful, even beyond the fact that they may contain data that is no longer available elsewhere.
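The crawling and indexing steps can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any real engine: the three-page in-memory "web", the example.com URLs and the robots.txt rules are hypothetical, and no network requests are made. The exclusion check uses the standard library's urllib.robotparser.

```python
import re
from collections import defaultdict, deque
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://example.com/": (
        "Search engines crawl the web",
        ["http://example.com/a", "http://example.com/private/b"],
    ),
    "http://example.com/a": (
        "Engines build an index of words",
        ["http://example.com/"],
    ),
    "http://example.com/private/b": ("Secret page", []),
}

# Hypothetical robots.txt content excluding one directory for all crawlers.
ROBOTS_TXT = ["User-agent: *", "Disallow: /private/"]


def crawl(start, pages, robots):
    """Follow links breadth-first from start, skipping excluded URLs."""
    seen, queue, fetched = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        if url not in pages or not robots.can_fetch("*", url):
            continue  # unknown page, or excluded by robots.txt
        text, links = pages[url]
        fetched[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


def build_index(fetched):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in fetched.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


robots = RobotFileParser()
robots.parse(ROBOTS_TXT)
fetched = crawl("http://example.com/", PAGES, robots)
index = build_index(fetched)
```

The crawler here never fetches the page under /private/, so its words never reach the index. A real crawler would additionally download pages over HTTP, parse HTML for links, and limit its request rate.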

When a user enters a query into a search engine (typically by using keywords), the engine examines its index and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text. Most search engines support the use of the Boolean operators AND, OR and NOT to further specify the search query. Some search engines provide an advanced feature called proximity search, which allows users to define the distance between keywords.
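Boolean operators and proximity search map naturally onto set operations over an index that records word positions. The following is a minimal sketch under stated assumptions: the three sample documents, the function names and the definition of distance as a difference in word positions are all illustrative and not taken from any particular engine.

```python
import re
from collections import defaultdict

# Hypothetical document collection: id -> text.
DOCS = {
    1: "web search engine",
    2: "search the web with an engine",
    3: "web directory maintained by editors",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_positional_index(docs):
    """Map word -> document id -> list of word positions in that document."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, word in enumerate(tokenize(text)):
            index[word][doc_id].append(position)
    return index


def docs_with(index, word):
    """Return the set of document ids containing word."""
    return set(index.get(word, {}))


def near(index, first, second, max_distance):
    """Return documents where the two words occur within max_distance positions."""
    hits = set()
    for doc_id in docs_with(index, first) & docs_with(index, second):
        if any(
            abs(p1 - p2) <= max_distance
            for p1 in index[first][doc_id]
            for p2 in index[second][doc_id]
        ):
            hits.add(doc_id)
    return hits


positional_index = build_positional_index(DOCS)
web_and_search = docs_with(positional_index, "web") & docs_with(positional_index, "search")
web_or_directory = docs_with(positional_index, "web") | docs_with(positional_index, "directory")
web_not_search = docs_with(positional_index, "web") - docs_with(positional_index, "search")
search_near_engine = near(positional_index, "search", "engine", 1)
```

AND corresponds to set intersection, OR to union, and NOT to set difference. The proximity query matches only document 1, where "search" and "engine" are adjacent; in document 2 the same two words are five positions apart.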

The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of webpages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve.
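As one illustration of ranking, the sketch below scores documents with a simple TF-IDF weighting, a classic information-retrieval technique. It is an assumption chosen for the example and is not claimed to be the method of any engine named in this article. The sample documents and the function name rank_results are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical document collection: id -> text.
SAMPLE_DOCS = {
    "a": "search engine ranking ranking ranking",
    "b": "search engine history",
    "c": "cooking recipes",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_results(query, docs):
    """Return ids of matching documents, best match first, by TF-IDF score."""
    doc_tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    total_docs = len(docs)
    scores = {}
    for doc_id, tokens in doc_tokens.items():
        counts = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            # Number of documents that contain the term at least once.
            doc_freq = sum(1 for toks in doc_tokens.values() if term in toks)
            if doc_freq == 0 or not tokens:
                continue
            term_freq = counts[term] / len(tokens)        # frequent in this document
            rarity = math.log(1 + total_docs / doc_freq)  # rare across the collection
            score += term_freq * rarity
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)


ranked = rank_results("ranking search", SAMPLE_DOCS)
```

Document "a" ranks above "b" because it uses the rarer query term "ranking" repeatedly, and "c" is omitted because it contains neither query word. Production engines combine many more signals, such as link-based measures.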

Most Web search engines are commercial ventures supported by advertising revenue and, as a result, some employ the practice of allowing advertisers to pay money to have their listings ranked higher in search results. Those search engines which do not accept money for their search engine results make money by running search related ads alongside the regular search engine results. The search engines make money every time someone clicks on one of these ads.

References

Notes

The footnotes below are given in support of the statements above. Because some facts are proprietary secrets held by private companies and therefore not documented in journals, such facts are reasoned from facts that are public.

  • GBMW: Reports of 30-day punishment, re: Car maker BMW had its German website bmw.de delisted from Google, such as: Slashdot-BMW (05-Feb-2006).
  • INSIZ: Maximum size of webpages indexed by MSN/Google/Yahoo! ("100-kb limit"): Max Page-size (28-Apr-2006).
  1. ^ http://www.w3.org/History/19921103-hypertext/hypertext/DataSources/WWW/Servers.html
  2. ^ http://home.mcom.com/home/whatsnew/whats_new_0294.html
  3. ^ "Internet History - Search Engines" (from Search Engine Watch), Universiteit Leiden, Netherlands, September 2001, web: LeidenU-Archie.
  4. ^ Archive of NCSA what's new in December 1993 page
  5. ^ Gandal, Neil (2001). "The dynamics of competition in the internet search engine market". International Journal of Industrial Organization 19 (7): 1103–1117. doi:10.1016/S0167-7187(01)00065-0.
  6. ^ Nielsen NetRatings: August 2007 Search Share Puts Google On Top, Microsoft Holding Gains, SearchEngineLand, September 21, 2007
  7. ^ comScore: August 2007 Google Top Worldwide Search Engine; Baidu Beats Microsoft


The above information uses material from Wikipedia and is licensed under the GNU Free Documentation License.
This page was last archived by our server on Mon Jul 6 21:34:40 2009.

