Selection Or Weighting Of Terms From Queries, Including Natural Language Queries (epo) Patents and Patent Applications (Class 707/E17.071)

METHOD AND SYSTEM FOR UPDATING A SEARCH ENGINE

Publication number: 20120011116

Abstract: A method and a system for maintaining the freshness of a search engine server's database. A popularity parameter is defined, and a popularity value is assigned to each link in the search engine's database. The most popular links are selected for updating the contents stored, or associated with, the site to which the links refer. In one embodiment, popularity is based at least in part on the search results generated by the search engine in response to user queries.

Type: Application

Filed: July 11, 2011

Publication date: January 12, 2012

Inventor: Jim McKeeth
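
The selection step in the abstract above (publication 20120011116) can be pictured with a minimal sketch. It assumes that popularity is simply the number of times a link appeared in logged search results, which is only one of the popularity parameters the abstract allows. The names `select_links_to_refresh` and `result_log` are hypothetical and do not come from the application.

```python
from collections import Counter


def select_links_to_refresh(result_log, k):
    """Return the k links that appeared most often in logged search results.

    result_log is an iterable of result lists, one list per user query.
    Popularity here is a plain appearance count (an assumption for this sketch).
    """
    popularity = Counter()
    for results in result_log:
        popularity.update(results)
    # most_common sorts by count, highest first
    return [link for link, _count in popularity.most_common(k)]


if __name__ == "__main__":
    log = [
        ["example.org/a", "example.org/b"],
        ["example.org/a", "example.org/c"],
        ["example.org/a", "example.org/b"],
    ]
    print(select_links_to_refresh(log, 2))
```

A crawler could then revisit only the returned links, so that the pages users actually see in results are refreshed first.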

METHOD AND SYSTEM FOR SEARCHING A WIDE AREA NETWORK

Publication number: 20110238662

Abstract: A method and system for searching a wide area network that enables users to find the information they seek more quickly and more easily than prior art search engines are disclosed. The method employs various innovative processes that can be included separately or in combination in different embodiments of the invention to improve searching on a wide area network.

Type: Application

Filed: June 6, 2011

Publication date: September 29, 2011

Applicant: HOSHIKO LIMITED LIABILITY COMPANY

Inventors: Brian Mark Shuster, Gary Stephen Shuster

GENERATING RECOMMENDED ITEMS IN UNFAMILIAR DOMAIN

Publication number: 20110213786

Abstract: The present invention provides a method and apparatus for generating recommended items for a current user in an unfamiliar domain. The method includes selecting a reference user of the current user, in a reference domain different from the unfamiliar domain, wherein the behavior of the current user and the behavior of the reference user have a user similarity index in the reference domain which satisfies a condition. The method further includes generating the recommended items in the unfamiliar domain for the current user according to history behavior data of the reference user in the unfamiliar domain. Even if there is little or no history behavior data of the current user in the unfamiliar domain, an effective recommendation can be made to the current user. To carry out the steps of the method, the apparatus includes a reference user determining module, a current user recommending module and a demarcating module.

Type: Application

Filed: February 25, 2011

Publication date: September 1, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Xian Wu, Quan Yuan, Xia Tian Zhang, Shiwan Zhao
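
As a rough illustration of the abstract above (publication 20110213786), the sketch below picks reference users by their similarity to the current user in a reference domain and then recommends what those users consumed in the unfamiliar domain. The abstract does not name a similarity index or a condition; Jaccard similarity and a fixed `min_similarity` threshold are assumptions made here, and all function and parameter names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two item collections (assumed similarity index)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(current, ref_history, target_history, min_similarity=0.5):
    """Recommend unfamiliar-domain items taken from similar reference users.

    ref_history:    user -> items consumed in the reference domain
    target_history: user -> items consumed in the unfamiliar domain
    """
    mine = ref_history.get(current, ())
    seen = set(target_history.get(current, ()))
    recommendations = []
    for other, theirs in ref_history.items():
        if other == current:
            continue
        # The "condition" on the similarity index is a simple threshold here.
        if jaccard(mine, theirs) >= min_similarity:
            for item in target_history.get(other, ()):
                if item not in seen:
                    seen.add(item)
                    recommendations.append(item)
    return recommendations
```

Because the current user's own history in the unfamiliar domain is only used to filter out items already seen, the function still returns results when that history is empty, which is the cold-start case the abstract highlights.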

INTEGRATION OF VOTER AND CONTRIBUTOR DATA INTO POLITICAL SOFTWARE AND COMPLIANCE SYSTEMS FOR PURPOSES OF SOLICITATION, COMPLIANCE, VETTING, AND CALLS TO ACTION

Publication number: 20110202542

Abstract: A system for gathering and analyzing data on individuals, comprising a processor connected to a network; a database server connected to the processor; a user database connected to the processor; and a user interface operable to interact with the processor and the user database; wherein the processor is operable to obtain information associated with a reference individual from sources connected to the network and store the information in the database server, wherein one source includes a social network database and the information includes related individual information; receive a search request from the user interface, search the information stored in the database server in response to the search request and generate a search result based thereon; determine a relevance value for the related individual information; cause the user interface to display the search result and relevance value; and store the search result in the user database as permitted by law.

Type: Application

Filed: February 14, 2011

Publication date: August 18, 2011

Applicant: ARISTOTLE INTERNATIONAL INC.

Inventors: Dean Aris Phillips, John Aris Phillips, James Xu, Brian Williams, Peter Kelly

Method and Apparatus for Generating a Web Page

Publication number: 20110106816

Abstract: A method, apparatus and computer readable medium generate a webpage using keywords identified from user input and user email communications. The keywords are identified, ranked, and transmitted to a server where a search engine uses one or more of the keywords to identify items of interest such as articles or videos. A web page is generated using selected items of interest or links to the items of interest which may then be displayed to a user as the user's homepage.

Type: Application

Filed: October 29, 2009

Publication date: May 5, 2011

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Wayne Crolley

Providing Increased Quality of Content to a User Over Time

Publication number: 20110099168

Abstract: A method for increasing quality of content provided to a user. Communities of practice a user is associated with are determined based on login data. A corresponding set of tags is retrieved for each of the communities of practice. All corresponding sets of tags are aggregated to define a role for the user. A personal set of tags associated with the user is retrieved. The personal set of tags is added to the aggregate of all corresponding sets of tags to create a new set of tags. A context of the user in the particular task is recorded. The new set of tags is filtered based on the context to create a sub-set of tags. A defined number of tag aware information sources are queried using the sub-set of tags. Content is received from the defined number of tag aware information sources based on the query. The content is outputted.

Type: Application

Filed: October 22, 2009

Publication date: April 28, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: John E. Boyer, Peter A. Coldicott, Edward E. Kelley, Eoin Lane

RELATED CONTACT AND RECORD RECOMMENDATION FOR PRODUCT DESIGN

Publication number: 20110093502

Abstract: In one embodiment, a description of a product design is received. The description is analyzed to determine terms describing the product design. The determined terms are compared to a plurality of stored records. A subset of the stored records is determined based on the comparison. The stored records are considered relevant to the description of the product design. A set of recommended contacts associated with the subset of stored records is then determined. The recommended contacts are considered to have knowledge relevant to the description of the product design. At least one of the subset of stored records and the set of recommended contacts is output. For example, at least one of the subset of stored records and the list of recommended contacts may be displayed on an interface that is being used to input the description of the product design.

Type: Application

Filed: October 15, 2009

Publication date: April 21, 2011

Applicant: Oracle International Corporation

Inventors: Anurag Batra, Steve Chalgren, Joel Nave

WEB-SCALE ENTITY SUMMARIZATION

Publication number: 20110078162

Abstract: Described is summarizing a web entity (e.g., a person, place, product or so forth) based upon the entity's appearance in web documents (e.g., on the order of hundreds of millions or billions of webpages). Webpages are separated into blocks, which are then processed according to various features to filter the number of blocks to further process, and rank the most relevant blocks with respect to the entity that remain. A redundancy removal mechanism removes redundant blocks, leaving a set of remaining blocks that are used to provide a summary of information that is relevant to the entity.

Type: Application

Filed: September 30, 2009

Publication date: March 31, 2011

Applicant: Microsoft Corporation

Inventors: Zaiqing Nie, Ji-Rong Wen, Liu Yang

RECRUITMENT SCREENING TOOL

Publication number: 20110078154

Abstract: A recruitment screening tool is configured to generate a respective resume score for each of a plurality of candidate resumes based on a comparison of each resume to predetermined resume criteria by a candidate screener. The recruitment screening tool compares each respective resume score to a predetermined scoring scale having at least one scoring threshold. A candidate recruiter is notified of each resume having a respective resume score greater than the at least one scoring threshold. The recruitment screening tool is also configured to perform quality control functions with respect to the candidate screener. The tool is also configured to generate at least one statistic based on at least one analysis factor with respect to content of the candidate resumes.

Type: Application

Filed: September 28, 2009

Publication date: March 31, 2011

Applicant: Accenture Global Services GmbH

Inventors: Simon Daniel Rickman, Stephen Michael McLaughlin

Interest Learning from an Image Collection for Advertising

Publication number: 20110072047

Abstract: Described herein is a technology that facilitates learning interests for advertising based on automated analysis of images. In several embodiments a person's interests are automatically learned based on the person's photographs for targeted advertising. Techniques are described that facilitate automatically detecting a user's interest from images and suggesting user-targeted ads. As described herein, these techniques include computer-annotating images with learned tags, performing topic learning to obtain an interest model, and performing advertisement matching and ranking based on the interest model.

Type: Application

Filed: September 21, 2009

Publication date: March 24, 2011

Applicant: Microsoft Corporation

Inventors: Xin-Jing Wang, Lei Zhang, Wei-Ying Ma

RANKING ENTITY RELATIONS USING EXTERNAL CORPUS

Publication number: 20110072025

Abstract: Exemplary methods and apparatuses are disclosed that may be used to provide or otherwise support ranking entity relations utilizing the vocabulary of at least one external corpus for use in search engine information management systems.

Type: Application

Filed: September 18, 2009

Publication date: March 24, 2011

Applicant: Yahoo!, Inc., a Delaware corporation

Inventors: Roelof van Zwol, Vanessa Murdock, Borkur Sigurbjornsson

AUTOMATICALLY FINDING CONTEXTUALLY RELATED ITEMS OF A TASK

Publication number: 20110066619

Abstract: Architecture for enabling a user to automatically recover documents and other information associated with work contexts and recover documents and other information artifacts associated with a specific project. The architecture enables monitoring and recording of activity information related to user interactions with information artifacts pertaining to a particular work context. The user can select a document having a portion of work content (e.g., a term or other type of reference item in a document) related to the work context. A lexical analysis is performed on the activity information and the reference item to identify lexical similarities. A list of candidate items (e.g., related documents) is inferred from the information artifacts based on the lexical similarities. The candidate items related to the work context are presented to the user, who can select specific items to reestablish the work context.

Type: Application

Filed: September 16, 2009

Publication date: March 17, 2011

Applicant: Microsoft Corporation

Inventors: George Perantatos, Kuldeep Karnawat, John S. Wana

Interactive writing aid to assist a user in finding information and incorporating information correctly into a written work

Publication number: 20110060761

Abstract: A machine and computer-implemented process that assists a user in authoring any written work in that it automatically searches multiple sources simultaneously on the world wide web or other designated database in order to provide automatic citation and/or information suggestions to an author's written work. The invention parses and sorts both user entered information and returned search results to create databases which assist in suggesting the most relevant information and citation suggestions to the user. The machine and computer-implemented process also provide automatic formatting, in a user pre-selected style, of both the written work and the citations which are automatically generated and suggested to the author based upon user defined presets and relevancy criteria. The invention described assists a user in finding information and the next step in a variety of processes.

Type: Application

Filed: September 8, 2009

Publication date: March 10, 2011

Inventor: Kenneth Peyton Fouts

SYSTEM AND METHOD FOR GENERATING A VALUATION OF REVENUE OPPORTUNITY FOR A KEYWORD FROM A VALUATION OF ONLINE SESSIONS ON A WEBSITE FROM USER ACTIVITIES FOLLOWING A KEYWORD SEARCH

Publication number: 20110055229

Abstract: An improved system and method for generating a valuation of online sessions on a website following a keyword search. The revenue generated for online sessions of users for pairs of keyword/referrer may be calculated from activities performed on the website during the online sessions and may be added to a sum representing the session value for the pair of keyword/referrer on the website. In an embodiment, the revenue opportunity of the pairs of keyword/referrer for a website may be estimated by multiplying the session value by the difference of a total count of clicks for multiple websites on search results by a referrer for the keyword and a count of clicks for the website on search results by a referrer for the keyword. The pairs of keyword/referrer may be ranked for the website by the estimated revenue opportunity, and then applied to optimize monetization of online content.

Type: Application

Filed: August 25, 2009

Publication date: March 3, 2011

Applicant: Yahoo! Inc.

Inventor: Anurag Kumar
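
The estimate described in the abstract above (publication 20110055229) multiplies a session value by the clicks that went to other websites. A minimal sketch of that arithmetic and the resulting ranking follows. The data layout (a dict keyed by keyword/referrer pairs) and the function names are assumptions for illustration only.

```python
def revenue_opportunity(session_value, total_clicks, site_clicks):
    """Session value times the clicks the website did not receive."""
    return session_value * (total_clicks - site_clicks)


def rank_pairs(stats):
    """Rank (keyword, referrer) pairs by estimated revenue opportunity.

    stats maps (keyword, referrer) -> (session_value, total_clicks, site_clicks).
    Returns the pairs ordered from largest to smallest opportunity.
    """
    scored = {pair: revenue_opportunity(*values) for pair, values in stats.items()}
    return sorted(scored, key=scored.get, reverse=True)
```

For example, a pair with a session value of 2.0, 100 total clicks and 40 clicks for the website has an estimated opportunity of 2.0 × (100 − 40) = 120.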

Relevance-Based Image Selection

Publication number: 20110047163

Abstract: A system, computer readable storage medium, and computer-implemented method present video search results responsive to a user keyword query. The video hosting system uses a machine learning process to learn a feature-keyword model associating features of media content from a labeled training dataset with keywords descriptive of their content. The system uses the learned model to provide video search results relevant to a keyword query based on features found in the videos. Furthermore, the system determines and presents one or more thumbnail images representative of the video using the learned model.

Type: Application

Filed: August 24, 2009

Publication date: February 24, 2011

Applicant: GOOGLE INC.

Inventors: Gal Chechik, Samy Bengio

PROFILE-BASED AND DICTIONARY BASED GRAPH CACHING

Publication number: 20110016154

Abstract: Methods and apparatuses are disclosed for caching portions of a Deterministic Finite Automata (DFA) graph during a compilation stage prior to a run-time stage that identifies attack traffic based on the graph. Cacheable components are identified based on a traffic profile, a dictionary of keywords, and/or a geometrical configuration of the graph. Techniques are disclosed for performing various types of caching alone or in combination with other types. Caching based on a dictionary or profile exploits a tendency of graph traversals performed during non-attack scenarios to remain near root nodes that correspond to the start of patterns designating blacklist traffic. By caching nodes that are near root nodes and that are visited frequently during peacetime (non-attack) scenarios, significant cache hits may be achieved during run-time execution. Caching graph components while compiling patterns using presently disclosed techniques avoids the need for expensive hardware to learn what and when to cache.

Type: Application

Filed: July 17, 2009

Publication date: January 20, 2011

Inventors: Rajan Goyal, Satyanarayana Lakshmipathi Billa, Jai Singh Rana

USING LINK STRUCTURE FOR SUGGESTING RELATED QUERIES

Publication number: 20110016115

Abstract: An approach is provided for determining related queries for a given search query based on the linking structure of electronic documents within a document set. Document titles are used to represent potential search queries and links between the electronic documents are used to determine relationships between the potential search queries. As such, the document set may be represented as a directed graph in which document titles (which represent potential search queries) are nodes and links are edges between the nodes. When a particular search query is received, a corresponding node is identified and related queries are determined by identifying other nodes having connections with that node.

Type: Application

Filed: September 24, 2010

Publication date: January 20, 2011

Applicant: MICROSOFT CORPORATION

Inventors: NICHOLAS ERIC CRASWELL, HUGH EVAN WILLIAMS, ARIEL J. LAZIER
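
The graph representation in the abstract above (publication 20110016115) can be sketched with plain dictionaries: document titles are nodes, links are directed edges, and related queries are the titles connected to the node that matches the query. Matching a query to a title by exact string equality and treating both incoming and outgoing links as "connections" are assumptions of this sketch. `build_graph` and `related_queries` are hypothetical names.

```python
def build_graph(links):
    """Build a directed graph from (source_title, target_title) link pairs."""
    graph = {}
    for source, target in links:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # make sure every title is a node
    return graph


def related_queries(graph, query):
    """Return titles linked to or from the node whose title equals the query."""
    if query not in graph:
        return []
    outgoing = graph[query]
    incoming = {title for title, targets in graph.items() if query in targets}
    return sorted((outgoing | incoming) - {query})
```

With links such as ("jaguar", "big cat") and ("leopard", "big cat"), a query for "big cat" would suggest "jaguar" and "leopard" as related queries.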

Community-Driven Relational Filtering of Unstructured Text

Publication number: 20110010331

Abstract: A method and apparatus for calculating a score for word selection, which may be used to preprocess sets of words prior to a dimensionality reduction process, employs information about relationships between words themselves (such as synonym relationships) or relationships between items with which the words are associated (such as products in a catalog). In some embodiments, the relationships are also community based; i.e., the relationships are established by a community of users. The relationships may be references to two or more word sets in which the word of interest is common. In one embodiment, the word sets are descriptions of products in an online catalog, the community is the group of people who view the catalog, and the relationships used for calculating the score for a particular word of interest are coreferences (e.g., viewing or purchasing) of pairs of products for which the catalog descriptions both include the particular word.

Type: Application

Filed: July 7, 2009

Publication date: January 13, 2011

Applicant: ART TECHNOLOGY GROUP, INC.

Inventors: Bruce D'AMBROSIO, Stephen JENSEN

METHOD AND SYSTEM FOR RECOMMENDATION OF CONTENT ITEMS

Publication number: 20100281025

Abstract: A method of generating recommendations for content items comprises providing a domain ontology where concepts are characterized by a term vector with terms and associated weights. Associated term sets, each of which comprises a set of terms that characterize a content item, are further provided. A concept set is generated for each associated term set by determining the concepts of the domain ontology that match the terms of the associated term set. In addition, a user profile for a user is provided where the user profile comprises at least some of the concepts of the ontology coupled with preference weights. Recommendations for content items are generated based on the plurality of associated concept sets and the user profile. The invention may allow improved and/or facilitated generation of recommendations from text based characterizing data.

Type: Application

Filed: May 4, 2009

Publication date: November 4, 2010

Applicant: MOTOROLA, INC.

Inventors: Dorothea Tsatsou, Paul C. Davis, Symeon Papadopoulos, Fotis Menemenis, Ben M. Bratu, George Kalfas, Ioannis Kompatsiaris

OPERATIONAL RELIABILITY INDEX FOR THE KNOWLEDGE MANAGEMENT SYSTEM

Publication number: 20100274789

Abstract: Embodiments of the present invention provide systems, methods, and computer program products for an operational reliability index (“ORI”) scoring system in the knowledge management system that is standardized and centralized across the channels and sub-channels in an organization. The ORI system scores the reliability or confidence of the channels, sub-channels, and applications in an organization. The ORI receives reliability data associated with one or more predictability factors related to a business application. The ORI determines predictability factor reliability scores for each of the one or more predictability factors based on the reliability data and weighted values assigned to the predictability factors. Weighted values are also assigned to the categories, applications, sub-channels, and channels.

Type: Application

Filed: April 22, 2009

Publication date: October 28, 2010

Applicant: BANK OF AMERICA CORPORATION

Inventors: Daniel Douglas Grace, Srinivas Darga, Eric Nathaniel Hunsaker, Bryce Robert Elliott, Rajaraman Viswanathan, Michael J. Schreder, Greg M. Lavelle, Darryl Alan Sansbury, Christine Roche, Rama Rao Pandrapagada

Method and system for representing information

Publication number: 20100274807

Abstract: A preferred method and system for dynamically and/or statically identifying, manipulating, registering and comparing information are disclosed. In a preferred method, the elements and their respective associations respective to a data string or corpus are identified and/or represented through corresponding network elements and/or configurations. In addition, this disclosure further teaches the methodology of implementing the disclosed methodology of “informational networks” to perform an information application such as that of a search engine while effectively avoiding semantic irrelevance or selecting only relevant information with restrictions of a given grammar.

Type: Application

Filed: April 23, 2009

Publication date: October 28, 2010

Inventor: Frank John Williams

SYSTEM AND METHOD FOR MAKING A RECOMMENDATION BASED ON USER DATA

Publication number: 20100274808

Abstract: There is described a system and computer-implemented method for providing a recommendation based on a sparse pattern of data. An exemplary method comprises determining a likelihood that an item for which no user preference data is available will be preferred. The exemplary method also comprises determining a likelihood that an item for which user preference data is available for users other than a particular user will be preferred based on the likelihood that the item for which no user preference data is available will be preferred. The exemplary method additionally comprises predicting that an item for which no user preference data relative to the particular user is available will be preferred if the likelihood that the particular user will prefer the item exceeds a certain level.

Type: Application

Filed: April 27, 2009

Publication date: October 28, 2010

Inventors: Martin B. Scholz, Rong Pan, Rajan Lukose

SYSTEM, METHOD, OR APPARATUS FOR CALIBRATING A RELEVANCE SCORE

Publication number: 20100268709

Abstract: Embodiments of methods, apparatuses, devices and systems associated with calibrating one or more relevance scores are disclosed.

Type: Application

Filed: April 21, 2009

Publication date: October 21, 2010

Applicant: Yahoo! Inc., a Delaware corporation

Inventor: Alex Cozzi

Determining relevancy and desirability of terms

Patent number: 7814112

Abstract: A system and method to sort search results based upon a desirability value is illustrated. This desirability value may be based upon the difference between a demand value and a supply value. Demand may be based upon user activity such as click-throughs, purchases, price, or location. Supply may be based upon a supply of keywords that may be the number of times a word is used in search or item title. The system and method may include receiving a search query, associating a first numerical value with a keyword that is a part of the search query, tracking user activity associated with the keyword, associating a second numerical value with the keyword based upon the user activity, finding a difference value between the first and second numerical values, associating this difference value with the keyword, sorting keywords based upon the difference values, and returning the search results of the sorting.

Type: Grant

Filed: February 28, 2007

Date of Patent: October 12, 2010

Assignee: eBay Inc.

Inventors: Raghav Gupta, Sichun Xu
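
The desirability value in the abstract above (patent 7814112) is described as the difference between a demand value and a supply value. The sketch below computes that difference per keyword and sorts on it. How demand and supply are measured is left open by the abstract, so the input numbers are taken as given. Sorting from most to least desirable, and all function names, are assumptions.

```python
def desirability(demand, supply):
    """Difference between a keyword's demand value and its supply value."""
    return demand - supply


def sort_keywords(demand_by_keyword, supply_by_keyword):
    """Return (keyword, desirability) pairs, most desirable first.

    A keyword missing from either mapping is treated as having a value of 0.
    Ties are broken alphabetically so the order is deterministic.
    """
    keywords = set(demand_by_keyword) | set(supply_by_keyword)
    scores = {
        kw: desirability(demand_by_keyword.get(kw, 0), supply_by_keyword.get(kw, 0))
        for kw in keywords
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A keyword that users click on often (high demand) but that appears in few item titles (low supply) rises to the top under this ordering.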

SYSTEM AND METHOD FOR PROVIDING AN OPTION TO AUTO-GENERATE A THREAD ON A WEB FORUM IN RESPONSE TO A CHANGE IN TOPIC

Publication number: 20100257186

Abstract: Methods and systems for providing an option to auto-generate a thread on a web forum in response to a change in topic are described. When a post is received on a thread in the web forum, wherein the thread includes one or more thread keywords and wherein each of the one or more thread keywords are associated with a relevancy score, the post is searched for the one or more thread keywords. The relevancy scores of any of the one or more thread keywords located within the post are added together to obtain a post total relevancy score. A query is then provided, to a user, for example, to auto-generate a new thread on the web forum when the post total relevancy score is less than a threshold relevancy score.

Type: Application

Filed: April 2, 2009

Publication date: October 7, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ami H. Dewar, Robert C. Leah, Nicholas E. Poore
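
The check described in the abstract above (publication 20100257186) adds up the relevancy scores of the thread keywords found in a post and compares the total with a threshold. A minimal sketch follows. It assumes single-word keywords and a simple lowercase word match, neither of which is specified in the abstract, and the function names are hypothetical.

```python
import re


def post_total_relevancy(post, keyword_scores):
    """Sum the relevancy scores of the thread keywords that occur in the post."""
    words = set(re.findall(r"[a-z0-9']+", post.lower()))
    return sum(
        score for keyword, score in keyword_scores.items() if keyword.lower() in words
    )


def should_offer_new_thread(post, keyword_scores, threshold):
    """True when the post looks off topic, i.e. its total is below the threshold."""
    return post_total_relevancy(post, keyword_scores) < threshold
```

A forum could call `should_offer_new_thread` when a post is submitted and, if it returns True, ask the user whether the post should start a new thread.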

Content-Based Information Retrieval

Publication number: 20100257202

Abstract: Content-based information retrieval is described. In an example, a query item such as an image, document, email or other item is presented and items with similar content are retrieved from a database of items. In an example, each time a query is presented, a classifier is formed based on that query and using a training set of items. For example, the classifier is formed in real-time and is formed in such a way that a limit on the proportion of the items in the database that will be retrieved is set. In an embodiment, the query item is analyzed to identify tokens in that item and subsets of those tokens are selected to form the classifier. For example, the subsets of tokens are combined using Boolean operators in a manner which is efficient for searching on particular types of database.

Type: Application

Filed: April 2, 2009

Publication date: October 7, 2010

Applicant: Microsoft Corporation

Inventors: Martin Szummer, Andrew Fitzgibbon, Lorenzo Torresani

APPARATUSES, METHODS AND SYSTEMS FOR IMPROVING THE RELEVANCY OF IPG SEARCH RESULTS ON A WIRELESS USER'S HANDSET AND TELEVISION

Publication number: 20100257165

Abstract: This disclosure details the implementation of apparatuses, methods and systems for improving the relevancy of Interactive Program Guide search results on a wireless user's handset and television (hereinafter, “IPG”). The IPG implements a search facility whereby users may enter search criteria into a wireless user's handset or television or the like and receive search results sorted to provide the most relevant results first. In one embodiment, a method is disclosed, comprising: receiving search criteria into a search engine; determining one or more search results in response to the search criteria; querying one or more databases of attributes; comparing the search results to the attributes; calculating weights for each of the search results according to one or more attributes; sorting the search results so the results are returned in order of relevance according to the weighting; and returning the results.

Type: Application

Filed: April 3, 2009

Publication date: October 7, 2010

Applicant: Verizon Patent and Licensing Inc.

Inventors: Zhiying Jin, Wenjie Liu, Juhong Liu, Jimena Velarde, Haosheng Guo, Angel Cordero, Martin Busse

Determining User Preference of Items Based on User Ratings and User Features

Publication number: 20100250556

Abstract: A set of item-item affinities for a plurality of items is determined based on collaborative-filtering techniques. A set of an item's nearest neighbor items based on the set of item-item affinities is determined. A set of user feature-item affinities for the plurality of items and a set of user features is determined based on least squared regression. A set of a user feature's nearest neighbor items is determined based in part on the set of user feature-item affinities. Compatible affinity weights for nearest neighbor items of each item and each user feature are determined and stored. Based on user features of a particular user and items a particular user has consumed, a set of nearest neighbor items comprising nearest neighbor items for user features of the user and items the user has consumed are identified as a set of candidate items, and affinity scores of candidate items are determined.

Type: Application

Filed: March 31, 2009

Publication date: September 30, 2010

Inventors: Seung-Taek Park, Wei Chu, Todd Beaupre, Deepak K. Agarwal, Scott Roy, Raghu Ramakrishnan

GRAPH BASED RE-COMPOSITION OF DOCUMENT FRAGMENTS FOR NAME ENTITY RECOGNITION UNDER EXPLOITATION OF ENTERPRISE DATABASES

Publication number: 20100250598

Abstract: Methods and systems are described that involve recognizing complex entities from text documents with the help of structured data and Natural Language Processing (NLP) techniques. In one embodiment, the method includes receiving a document as input from a set of documents, wherein the document contains text or unstructured data. The method also includes identifying a plurality of text segments from the document via a set of tagging techniques. Further, the method includes matching the identified plurality of text segments against attributes of a set of predefined entities. Lastly, a best matching predefined entity is selected for each text segment from the plurality of text segments. In one embodiment, the system includes a set of documents, each document containing text or unstructured data. The system also includes a database storage unit that stores a set of predefined entities, wherein each entity contains a set of attributes.

Type: Application

Filed: March 30, 2009

Publication date: September 30, 2010

Inventors: Falk Brauer, Wojciech Barczynski, Hong-Hai Do, Alexander Loser, Marcus Schramm

Calculating Web Page Importance

Publication number: 20100250555

Abstract: The page ranking technique described herein employs a Markov Skeleton Mirror Process (MSMP), which is a particular case of Markov Skeleton Processes, to model and calculate page importance scores. Given a web graph and its metadata, the technique builds an MSMP model on the web graph. It first estimates the stationary distribution of an embedded Markov chain (EMC) and views it as transition probability. It next computes the mean staying time using the metadata. Finally, it calculates the product of transition probability and mean staying time, which is actually the stationary distribution of MSMP. This is regarded as page importance.

Type: Application

Filed: March 27, 2009

Publication date: September 30, 2010

Applicant: Microsoft Corporation

Inventors: Bin Gao, Tie-Yan Liu
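
The final step in the abstract above (publication 20100250555) is a product of two quantities per page: the stationary probability of the chain over the web graph and the mean staying time. The sketch below illustrates only that arithmetic. It assumes the transition matrix and the mean staying times are already known, uses plain power iteration (which presumes a chain that converges), and normalises the result to sum to 1. None of these choices is taken from the application, and the function names are hypothetical.

```python
def stationary_distribution(P, iterations=200):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration, starting from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        # one step of pi <- pi * P
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def page_importance(P, mean_staying_time):
    """Stationary probability times mean staying time, normalised to sum to 1."""
    pi = stationary_distribution(P)
    raw = [p * t for p, t in zip(pi, mean_staying_time)]
    total = sum(raw)
    return [r / total for r in raw]
```

With two pages visited equally often but with mean staying times of 1 and 3, the second page receives three times the importance of the first.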

APPARATUS AND METHODS FOR CONCEPT-CENTRIC INFORMATION EXTRACTION

Publication number: 20100241639

Abstract: Disclosed are methods and apparatus for extracting (or annotating) structured information from web content. Web content of interest from a particular domain is represented as one or more tree instances having a plurality of branching nodes that each correspond to a web object such that the tree instances correspond to one or more structured data instances. The particular domain is associated with domain knowledge that includes one or more presentation rulesets that each specifies a particular structure for a set of data instances, a domain-specific concept labeler, one or more specified properties of the web objects in the tree instances, and a concept schema that specifies a representation of the data to be extracted from the web content. A structured data instance that conforms to the concept schema is extracted from the one or more tree instances based on the domain knowledge for the particular domain.

Type: Application

Filed: March 20, 2009

Publication date: September 23, 2010

Applicant: YAHOO! INC.

Inventors: Daniel Kifer, Srujana Merugu, Ankur Jain, Sathiya Keerthi Selvaraj, Alok S. Kirpal, Philip L. Bohannon, Raghu Ramakrishnan

Data mapping document design system

Patent number: 7801884

Abstract: A data mapping document design system provides a market differentiator that facilitates creating the technical specification for migrating legacy databases. The system addresses the significant technical problems associated with the immensely labor intensive, complex, and error prone endeavor of manually creating the technical specification. The system not only achieves cost and time savings in clearly measurable aspects of data migration such as migration project cost and completion timelines, but also achieves improvements in other harder to measure and track areas, such as data quality, and achieves reductions in subsequently discovered data errors.

Type: Grant

Filed: December 31, 2007

Date of Patent: September 21, 2010

Assignee: Accenture Global Services GmbH

Inventor: Alex George Zachariah

ASSESSMENT OF CORPORATE DATA ASSETS

Publication number: 20100228786

Abstract: The present invention provides a data processing system and a method of assessing the data value of a data assets inventory which comprises: a) preparing a data map on a computer database comprising inputting data types and data subtypes into said database, connecting a data storing location to the data subtypes and recording the data subtype occurrences in said database; b) assigning a weighting to each data subtype occurrence in said database to provide a data assets inventory and recording the data assets inventory in said database; c) preparing evaluation types on said database wherein the evaluation type has a calculation type attribute and wherein the evaluation type is either quantity independent or quantity dependent; d) connecting at least one evaluation type to each data subtype with a reference value and recording the reference value in said database; e) determining the data value of the data assets inventory and recording the data value in said database wherein when the evaluation type is quant

Type: Application

Filed: March 9, 2009

Publication date: September 9, 2010

Inventor: Tibor Török

METHOD AND APPARATUS FOR PROVIDING A PROGRAM GUIDE HAVING SEARCH PARAMETER AWARE THUMBNAILS

Publication number: 20100211584

Abstract: A method, apparatus, article of manufacture, and a memory structure for presenting a program guide for a video-on-demand system describing a plurality of media programs, each media program having a plurality of video frames. In one embodiment, the method comprises the steps of accepting a search request from a user, the search request comprising a search parameter having a search value; searching the media program database for the search value, the media program database having first metadata associated with a first individual video frame of the media program; and providing the program guide comprising a thumbnail depicting the first individual video frame of the media program associated with the first metadata to the user if the first metadata includes the search value.

Type: Application

Filed: February 19, 2009

Publication date: August 19, 2010

Applicant: HULU LLC

Inventors: Zhibing Wang, Ting-hao Yang, Yizhe Tang, Qian Chang

PERSONALIZED RECOMMENDATIONS ON DYNAMIC CONTENT

Publication number: 20100211568

Abstract: This disclosure describes systems and methods for selecting and/or ranking web-based content predicted to have the greatest interest to individual users. In particular, articles are ranked in terms of predicted interest for different users. This is done by optimizing an interest model and in particular through a method of bilinear regression and Bayesian optimization. The interest model is populated with data regarding users, the articles, and historical interest trends that types of users have expressed towards types of articles.

Type: Application

Filed: February 19, 2009

Publication date: August 19, 2010

Inventors: Wei Chu, Seung-Taek Park

METHOD FOR ORDER INVARIANT CORRELATED ENCRYPTING OF DATA AND SQL QUERIES FOR MAINTAINING DATA PRIVACY AND SECURELY RESOLVING CUSTOMER DEFECTS

Publication number: 20100198846

Abstract: According to one embodiment of the present invention, a method for debugging a computer system is provided. According to one embodiment of the invention, a method includes encrypting data and query program instructions using correlated order invariant encrypting, the data and query program instructions operating in a customer computer system. The encrypted data and encrypted query program instructions are then transferred to a servicing entity having a test system. The encrypted data and encrypted query program instructions are run on the test system to generate a set of results. The set of results are then used to generate a diagnosis of a problem with the customer computer system. Thus the customer problem can be resolved without the servicing entity having access to the customer's data and query program instructions.

Type: Application

Filed: January 30, 2009

Publication date: August 5, 2010

Applicant: International Business Machines Corporation

Inventor: Pramod S. Gupta

Modeling images as sets of weighted features

Publication number: 20100189354

Abstract: An apparatus, method, and computer program product are provided for generating an image representation. The method includes receiving an input digital image, extracting features from the image which are representative of patches of the image, generating weighting factors for the features based on location relevance data for the image, and weighting the extracted features with the weighting factors to form a representation of the image.

Type: Application

Filed: January 28, 2009

Publication date: July 29, 2010

Applicant: Xerox Corporation

Inventors: Teofilo E. de Campos, Florent C. Perronnin

INTEREST-BASED LOCATION TARGETING ENGINE

Publication number: 20100185642

Abstract: A method for targeted advertisement is provided, for which one or more tags relating to an advertisement is/are determined, one or more of the most representative entities for the tag(s) is/are determined, and the advertisement is targeted to the one or more most representative entities. In addition, for each of a plurality of tags, one or more of the most representative entities is/are determined based on term frequency-inverse document frequency, such that an entity is relatively more representative of a tag if the tag is more uniquely and/or frequently associated with the entity. For each tag, the associated entities may be divided into multiple categories, such that one or more most representative entities within each category is/are determined for each tag.

Type: Application

Filed: January 21, 2009

Publication date: July 22, 2010

Applicant: Yahoo! Inc.

Inventors: Christopher William Higgins, Marc Eliot Davis, Christopher Todd Paretti, Carrie Burgener, Rahul Nair, Simon P. King
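
The TF-IDF ranking mentioned in the abstract above can be sketched as follows, treating each entity as a "document" and each tag as a "term". The data layout, the function name, and the specific TF and IDF formulas are assumptions for illustration; the abstract states only that the score is based on term frequency-inverse document frequency.

```python
import math


def representative_entities(tag_counts, tag, top_k=1):
    """Rank entities for one tag by a simple TF-IDF score.

    tag_counts maps entity -> {tag: number of times the tag was applied}.
    TF  = share of the entity's tag uses that are this tag (frequency).
    IDF = log(total entities / entities carrying the tag) (uniqueness).
    """
    n = len(tag_counts)
    with_tag = [e for e, c in tag_counts.items() if c.get(tag, 0) > 0]
    if not with_tag:
        return []
    idf = math.log(n / len(with_tag))
    scores = {}
    for entity in with_tag:
        counts = tag_counts[entity]
        tf = counts[tag] / sum(counts.values())
        scores[entity] = tf * idf
    # Highest score first; ties broken by entity name for determinism.
    ranked = sorted(scores, key=lambda e: (-scores[e], e))
    return ranked[:top_k]


# Hypothetical tag data for three entities.
tag_counts = {
    "cafe_a": {"coffee": 8, "wifi": 2},
    "cafe_b": {"coffee": 2, "wifi": 8},
    "gym": {"fitness": 10},
}
targets = representative_entities(tag_counts, "coffee", top_k=1)
```

An advertisement tagged "coffee" would then be targeted at the entities returned for that tag.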

DETERMINING SUITABILITY OF ENTITY TO PROVIDE PRODUCTS OR SERVICES BASED ON FACTORS OF ACQUISITION CONTEXT

Publication number: 20100185660

Abstract: A method, system and computer program product for determining the suitability of an entity to provide products or services. Category and measurement data is received concerning the entity where each category is assigned a value based on the acquisition context. If the value assigned to a category exceeds a threshold, then the measurement data for that category is used in evaluating the entity. This measurement data is weighted according to the acquisition context. A binary value is generated for each weighted measurement value that exceeds a threshold. These binary values are summed and weighted according to the confidence that the source of the data is correct. Further, the past performance and reputation of the entity is used in applying a weight to the summed binary values to generate a suitability value. If the suitability value exceeds a threshold, then it is deemed suitable to conduct business with the entity.

Type: Application

Filed: September 30, 2009

Publication date: July 22, 2010

Applicant: Board of Regents, The University of Texas System

Inventors: Eric R. White, Phillip Bookert, Sheila Rosenberg
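
The scoring pipeline in the abstract above can be read as a chain of threshold and weighting steps. The sketch below follows that reading; the field names, the use of simple multiplication for every weighting step, and the sample numbers are all assumptions, since the abstract does not give formulas.

```python
def suitability_value(categories, source_confidence, reputation_weight,
                      category_threshold, measurement_threshold):
    """Compute a suitability value from category and measurement data.

    Each category is a dict with hypothetical keys:
      context_value  - value assigned from the acquisition context
      context_weight - weight applied to the category's measurements
      measurements   - list of raw measurement values
    """
    hits = 0
    for cat in categories:
        # Only categories whose context value exceeds the threshold are used.
        if cat["context_value"] <= category_threshold:
            continue
        for m in cat["measurements"]:
            # A weighted measurement above the threshold yields a binary 1.
            if m * cat["context_weight"] > measurement_threshold:
                hits += 1
    # Sum of binary values, weighted by source confidence and reputation.
    return hits * source_confidence * reputation_weight


def is_suitable(value, suitability_threshold):
    return value > suitability_threshold


# Hypothetical input: the second category is ignored for this context.
categories = [
    {"context_value": 0.9, "context_weight": 1.5, "measurements": [0.4, 0.8]},
    {"context_value": 0.2, "context_weight": 2.0, "measurements": [1.0]},
]
value = suitability_value(categories, source_confidence=0.5,
                          reputation_weight=1.5,
                          category_threshold=0.5,
                          measurement_threshold=1.0)
```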

SYSTEM AND METHOD FOR DETERMINING INTERVALS OF A SPACE FILLING CURVE IN A QUERY BOX

Publication number: 20100185692

Abstract: A system and method are disclosed for determining intervals of a space filling curve in a query box. The method includes the operation of providing a range-query box contained within a data set, wherein the data set has a plurality of elements in N dimensions. A space filling curve is applied to the data set. The space filling curve contacts each of the elements in the N dimensions. The space filling curve is also applied to a range-query box contained within the data set. An entry point of the space filling curve into the query box is determined. A first endpoint box is formed to cover an hquad of the space filling curve at the entry point that includes P×P elements, with a first value of P selected as one. The value of P is increased to expand the endpoint box around a next larger hquad of the space filling curve, until a size of the endpoint box is maximized without exiting the range-query box. The interval of the space filling curve in the endpoint box can then be determined.

Type: Application

Filed: January 20, 2009

Publication date: July 22, 2010

Inventors: Bin Zhang, William K. Wilkinson

NOVEL SYSTEMS AND METHODS FOR TRANSMITTING SYNTACTICALLY ACCURATE MESSAGES OVER A NETWORK

Publication number: 20100169352

Abstract: The present invention is directed to systems and methods for encoding and retrieving information from a variety of sources using novel search techniques. The systems and methods of the invention are capable of extracting all types of structural and relational information from a query or source data, allowing for the recognition of subtle differences in meaning. With a capability of discerning subtle differences in meaning that is beyond the search systems and methods presently available, the invention described herein is capable of repeatedly providing accurate and meaningful responses to a diverse set of queries.

Type: Application

Filed: December 31, 2008

Publication date: July 1, 2010

Inventors: John S. Flowers, Michael Farmer, Martin A. Quiroga, Gordon H. Fischer, John A. DeSanto

SYSTEMS AND METHODS TO SEARCH A DATA SOURCE BASED ON A COMPATIBILITY VIA A SPECIFICATION

Publication number: 20100153405

Abstract: Methods and systems to search a data source based on compatibility via a specification are disclosed. The system receives a query from a buyer that includes keywords and identifies at least one keyword in the query as application information. The application information describes a first application. Next, the system infers the other keywords in the query as item information that describes a first part that is sought on a network-based marketplace. The first part is a component of the first application. Next, the system associates the application information with specification identifiers respectively for parts that fit the first application. Next, the system searches the data storage device to identify listings that respectively describe an item for sale on the network-based marketplace, the listings to include listing specification information that matches at least one of the specification identifiers. The listings further include listing item information that matches the item information.

Type: Application

Filed: October 14, 2009

Publication date: June 17, 2010

Inventors: Brian M. Johnson, Bharat Kumar Venkat, Jennifer M. Dante, Raffi Tutundjian, Kristine Chin Aronson, Richard D. Henderson

Information Processing Apparatus and Information Processing Method

Publication number: 20100138408

Abstract: According to an aspect of the present invention, there is provided an information processing apparatus including: an associated information acquiring module configured to acquire associated information of content selected from plural contents data; a first search module configured to search the contents data for first content having first associated information that is relevant to the associated information of the selected content; a second search module configured to search the contents data for second content having second associated information that satisfies a given condition; and a content display module configured to display the first content and the second content.

Type: Application

Filed: June 30, 2009

Publication date: June 3, 2010

Applicant: Kabushiki Kaisha Toshiba

Inventor: Satoshi Sera

Method, device and software for querying and presenting search results

Publication number: 20100131484

Abstract: There is disclosed a method, device, and software for presenting search results in a response to an end-user query. Search results are combined from results from a plurality of indexes, each of the search results having an associated key field. Index entries of each of the plurality of indexes are queried using an index-specific search algorithm to obtain a set of matching search results for each index, each matching search result having a quality of match specific to its index. A relative priority is determined for each of the plurality of indexes and the matching search results from the plurality of indexes are combined into a merged list of ordered search results based on the determined priority. A search result from a lower priority index is discarded in favor of any matching search result from a higher priority index.

Type: Application

Filed: September 8, 2009

Publication date: May 27, 2010

Inventors: David B. Gosse, Tym D. Feindel, Jungho Kim, Justin R. Nutzman, Michael T. Winters, Jennifer L. Gosse
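
The priority-based merge described in the abstract above can be sketched as follows. The data shapes and the function name are assumptions for illustration. The sketch also assumes that two results "match" when they share the same key field, and that within one index results are ordered by their quality of match.

```python
def merge_results(results_by_index, index_priority):
    """Merge per-index search results into one ordered list.

    results_by_index: {index_name: [(key, quality), ...]}
    index_priority:   {index_name: rank}, where a lower rank means a
                      higher priority (an assumption of this sketch).
    """
    merged = []
    seen_keys = set()
    for name in sorted(results_by_index, key=lambda n: index_priority[n]):
        # Within an index, best quality of match first.
        ranked = sorted(results_by_index[name], key=lambda r: -r[1])
        for key, quality in ranked:
            # A result from a lower-priority index is discarded when a
            # higher-priority index already supplied the same key.
            if key in seen_keys:
                continue
            seen_keys.add(key)
            merged.append((name, key, quality))
    return merged


# Hypothetical indexes on a device: contacts outrank email.
results = {
    "email": [("bob", 0.95), ("carol", 0.6)],
    "contacts": [("alice", 0.7), ("bob", 0.9)],
}
priority = {"contacts": 0, "email": 1}
merged = merge_results(results, priority)
```

Note that the "bob" entry from the email index is dropped even though its quality score is higher, because quality of match is specific to each index and is not compared across indexes.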

Clustering Image Search Results Through Folding

Publication number: 20100131499

Abstract: A search results page contains images that are organized based on the visual features of those images; images that have common visual features are grouped together using either a folding or a reciprocal election technique. Images that pertain to a particular meaning of a query term are less likely to be scattered across the page. A group of images that have common visual features is represented on the page by a single representative image from that group. Consequently, space for more representative images becomes available on the image search results page. Thus, the search results page contains visually diverse representative images; space on the results page is not wasted by repeatedly showing the same image. The initial image search results page is therefore also more likely to contain representative images that otherwise would have occurred too far down a relevance-ranked list to be included within the initial search results page.

Type: Application

Filed: November 24, 2008

Publication date: May 27, 2010

Inventors: Reinier H. van Leuken, Roelof van Zwol

Method and Apparatus for Improving Performance of Approximate String Queries Using Variable Length High-Quality Grams

Publication number: 20100125594

Abstract: A computer process, called VGRAM, improves the performance of approximate string search algorithms in computers by using a carefully chosen dictionary of variable-length grams based on their frequencies in the string collection. A dynamic programming algorithm for computing a tight lower bound on the number of common grams shared by two similar strings in order to improve query performance is disclosed. A method for automatically computing a dictionary of high-quality grams for a workload of queries is also disclosed. Improvement in query performance is achieved by these techniques through a cost-based quantitative approach to deciding good grams for approximate string queries. One approach for answering approximate queries efficiently is based on discarding gram lists, and another is based on combining correlated lists. An indexing structure is reduced to a given amount of space, while retaining efficient query processing, by using algorithms in a computer based on discarding gram lists and combining correlated lists.

Type: Application

Filed: December 14, 2008

Publication date: May 20, 2010

Applicant: The Regents of the University of California

Inventors: Chen Li, Bin Wang, Xaochun Yang, Alexander Behm, Shengyue Ji, Jiaheng Lu

SYSTEM, METHOD, AND COMPUTER-READABLE MEDIUM FOR COSTING USER-DEFINED FUNCTIONS AND METHODS IN A DATABASE MANAGEMENT SYSTEM

Publication number: 20100121863

Abstract: A system, method, and computer-readable medium for the calculation of execution time estimates of user defined functions/user defined methods are provided. The execution of a UDF or UDM is timed several times at the time of the UDF/UDM creation, and an average execution time of the UDF/UDM is obtained. The resulting average execution time is then stored in a data dictionary where the optimizer may consult this value to factor it into the cost of execution of a query.

Type: Application

Filed: November 12, 2008

Publication date: May 13, 2010

Inventors: Michael Reed, Elizabeth Brealey, Kevin Virgil
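
The costing approach in the abstract above (time the function several times at creation, store the average, let the optimizer read it) can be sketched as follows. The dictionary layout, the function names, and the linear per-row cost formula are assumptions for illustration and do not describe any particular database system.

```python
import time


def register_udf(data_dictionary, name, func, sample_args, runs=5):
    """Time func several times at creation and store the average.

    data_dictionary is a plain dict standing in for the catalog;
    sample_args is a hypothetical representative argument tuple.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func(*sample_args)
        timings.append(time.perf_counter() - start)
    data_dictionary[name] = {
        "func": func,
        "avg_seconds": sum(timings) / runs,
    }


def estimated_query_cost(data_dictionary, udf_name, row_count, base_cost=0.0):
    """Optimizer-side estimate: base cost plus one UDF call per row."""
    per_call = data_dictionary[udf_name]["avg_seconds"]
    return base_cost + row_count * per_call


# Usage: register a trivial function, then cost a query over 10,000 rows.
catalog = {}
register_udf(catalog, "square", lambda x: x * x, sample_args=(3,))
cost = estimated_query_cost(catalog, "square", row_count=10_000)
```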

WEBSITE NETWORK AND ADVERTISEMENT ANALYSIS USING ANALYTIC MEASUREMENT OF ONLINE SOCIAL MEDIA CONTENT

Publication number: 20100121843

Abstract: Methods, apparatuses, and computer-readable media for generating a website network graph to model one or more networks of websites relevant to subject matter of interest in a category, wherein generating the website network graph includes performing one or more searches relating to the subject matter of interest in a search engine API using one or more relevant keywords in combination with the subject matter of interest, extracting search results from the one or more searches, and identifying online social media websites with content most relevant to the subject matter of interest based on the website network graph.

Type: Application

Filed: January 13, 2009

Publication date: May 13, 2010

Applicant: Buzzient, Inc.

Inventor: Andreas Goeldi

NAMED ENTITY TRANSLITERATION USING CORPORATE CORPORA

Publication number: 20100106484

Abstract: A document in a first language and an additional document in a second language may be reviewed. It may be determined if the additional document is sufficiently similar to the document. If the additional document is determined sufficiently similar to the document, a named entity in the document may be selected. The method may search for a similar named entity by comparing the named entity to a word in the additional document and determining if the named entity and word are sufficiently similar. If a similar word to the named entity is located, the named entity and the similar named entities may be stored as named entity transliterations.

Type: Application

Filed: October 21, 2008

Publication date: April 29, 2010

Applicant: MICROSOFT CORPORATION

Inventors: Raghavendra Udupa U, Saravanan Khrisnan, Arumugam Kumaran

Method of detecting and responding to changes in the online community's interests in real time

Publication number: 20100094879

Abstract: A method of locating relevant documents wherein documents are given a fingerprint comprising weights associated with particular topic categories of a classification system, each weight representing a degree to which the document relates to the particular topic category. Documents whose fingerprints have a predetermined degree of mathematical overlap with the fingerprint may be considered relevant. A user is alerted to new relevant documents. Advertisers can offer advertisements near search results that achieve a predefined amount of relevance to text submitted by the advertiser rather than bidding on keywords. Unwanted content may be blocked from the search and filters to further refine the search may be used. E-mail spam may be blocked using textual relevance rather than keywords. Visual cues linked to a hierarchy of relevance help display the relevant documents. The methods may be used in combination with keyword searching.

Type: Application

Filed: October 14, 2008

Publication date: April 15, 2010

Inventors: Stuart Donnelly, Can Deniz Akyuz
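
The fingerprint comparison in the abstract above can be sketched as follows. The abstract does not define "mathematical overlap", so cosine similarity is used here purely as a stand-in. The function names, the notion of a reference fingerprint (for example, one derived from the user's text), and the sample topic categories are likewise assumptions.

```python
import math


def overlap(fp_a, fp_b):
    """Cosine similarity between two fingerprints.

    A fingerprint maps topic category -> weight, where the weight is
    the degree to which the document relates to that category.
    """
    dot = sum(w * fp_b.get(topic, 0.0) for topic, w in fp_a.items())
    norm_a = math.sqrt(sum(w * w for w in fp_a.values()))
    norm_b = math.sqrt(sum(w * w for w in fp_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevant_documents(reference_fp, fingerprints, min_overlap):
    """Return ids of documents whose overlap meets the preset degree."""
    return [doc_id for doc_id, fp in fingerprints.items()
            if overlap(reference_fp, fp) >= min_overlap]


# Hypothetical fingerprints over three topic categories.
reference = {"sports": 0.8, "finance": 0.2}
documents = {
    "doc1": {"sports": 0.9, "finance": 0.1},
    "doc2": {"cooking": 1.0},
}
matches = relevant_documents(reference, documents, min_overlap=0.9)
```

Because the comparison works on topic weights rather than shared words, a document can be judged relevant without containing any of the keywords in the reference text.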