| United States Patent | 8,131,712 |
| Thambidorai , et al. | March 6, 2012 |
| **Please see images for: (Certificate of Correction)** |
A corpus of documents is identified, such as a large corpus of web documents. A quality score is applied to each document, and at least some of the documents in the corpus are identified based on their respective quality scores. At least one query characteristic associated with a plurality of search queries, for instance the language of a query, is identified. A subset of documents in the corpus that satisfy the at least one query characteristic is identified. An index is built that includes both the documents identified by quality score and the subset identified by query characteristic.
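
The selection described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Document` fields, the `build_index` function, and the use of a fixed score threshold are all assumptions made for illustration, since the abstract does not say how quality scores are computed or how documents are chosen from them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    language: str         # hypothetical field standing in for a query characteristic
    quality_score: float  # hypothetical precomputed score; its derivation is not given


def build_index(corpus, quality_threshold, query_languages):
    """Return the set of doc_ids selected for the index.

    A document is selected if its quality score meets the threshold
    (an assumed selection rule), or if its language matches a language
    observed in the search queries.
    """
    by_quality = {d.doc_id for d in corpus if d.quality_score >= quality_threshold}
    by_characteristic = {d.doc_id for d in corpus if d.language in query_languages}
    # The index is the union of both selections.
    return by_quality | by_characteristic


corpus = [
    Document("a", "en", 0.9),
    Document("b", "en", 0.2),
    Document("c", "ta", 0.1),
    Document("d", "fr", 0.3),
]
selected = build_index(corpus, quality_threshold=0.5, query_languages={"ta"})
print(sorted(selected))  # ['a', 'c']
```

Here document `a` is selected on quality alone, while `c` is selected only because its language matches a query language despite a low score. That second path is what lets the index cover queries that the quality-selected documents would serve poorly.
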
| Inventors: | Thambidorai; Gautham (San Jose, CA), Lipkovitz; Eisar A. (San Francisco, CA), Nicolaou; Cosmos (Palo Alto, CA), Fan; Li (Los Altos, CA) |
|---|---|
| Assignee: | Google Inc. (Mountain View, CA) |
| Family ID: | 45758015 |
| Appl. No.: | 11/872,386 |
| Filed: | October 15, 2007 |
| Current U.S. Class: | 707/715; 707/724; 707/610 |
| Current CPC Class: | G06F 16/951 (20190101); G06F 16/29 (20190101) |
| Current International Class: | G06F 7/00 (20060101) |
| Field of Search: | 707/610, 715, 724 |
| 5893093 | April 1999 | Wills |
| 6526440 | February 2003 | Bharat |
| 6999932 | February 2006 | Zhou |
| 2002/0156917 | October 2002 | Nye |
| 2004/0194099 | September 2004 | Lamping et al. |
| 2004/0254932 | December 2004 | Gupta et al. |
| 2006/0200490 | September 2006 | Abbiss |
| 2007/0038634 | February 2007 | Glover et al. |
| 2007/0288422 | December 2007 | Cao |
| 2008/0281804 | November 2008 | Zhao et al. |