SEMANTIC SEARCH
PRESENTED BY: Group No: 13
Jai Mashalkar 113050007
Khushraj Madnani 113050041
Lahari Poddar 113050029

SEMANTIC SEARCH
Semantic search seeks to improve search accuracy by understanding the searcher's intent and the contextual meaning of terms as they appear in the searchable dataspace, whether on the Web or within a closed system, to generate more relevant results.

MOTIVATION FOR SEMANTIC SEARCH

SEMANTIC SEARCH TECHNOLOGY

• Semantically Relatable Sets
• Query Expansion
• Relevance Feedback

SEMANTICALLY RELATABLE SETS

SEMANTICALLY RELATABLE SET
A semantically relatable set (SRS) of a sentence is an unordered group of words in the sentence (not necessarily consecutive) that appear in the semantic graph of the sentence as linked nodes.

FORMS OF SRS
a. {CW, CW}
b. {CW, FW, CW}
c. {FW, CW}
CW: Content Word or Clause
FW: Function Word
Example: The girl borrowed a book on AI from library.
CW: girl, borrowed, book, AI, library
FW: the, a, on, from

THE GIRL BORROWED A BOOK ON AI FROM LIBRARY
[Semantic graph: "borrowed" (past tense) is linked to "girl" as agent (the: definite), to "book" as object (a: indefinite), and to "library" as place (from: modifier); "book" is linked to "AI" as modifier (on: modifier).]

THE GIRL BORROWED A BOOK ON AI FROM LIBRARY
Sets Formed:
a) {the, girl}
b) {girl, borrowed}
c) {borrowed, book}
d) {book, on, AI}
e) {borrowed, from, library}
f) {a, book}
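The sets listed for this sentence can be read off the semantic graph mechanically. The following is a minimal sketch, assuming a hypothetical edge-list representation of the graph: each edge links two content words and optionally carries a function word, and function words that mark a single content word (such as articles) are listed separately. The function name and the data layout are illustrative assumptions, not part of the original SRS work.

```python
def srs_from_graph(edges, markers):
    """Build SRS tuples from a toy semantic graph.

    edges:   list of (content_word, function_word_or_None, content_word)
    markers: list of (function_word, content_word), e.g. articles
    An SRS is unordered; tuples are used here only for readable output.
    """
    sets = []
    for left, fw, right in edges:
        # {CW, CW} when no function word links the pair, else {CW, FW, CW}
        sets.append((left, right) if fw is None else (left, fw, right))
    for fw, cw in markers:
        # {FW, CW}
        sets.append((fw, cw))
    return sets


# Hand-built graph for "The girl borrowed a book on AI from library."
edges = [
    ("girl", None, "borrowed"),
    ("borrowed", None, "book"),
    ("book", "on", "AI"),
    ("borrowed", "from", "library"),
]
markers = [("the", "girl"), ("a", "book")]
srs_sets = srs_from_graph(edges, markers)
```

In this sketch the graph is written by hand; producing it automatically from raw text would need a parser, which is outside the scope of the example.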

THE PROFESSOR ANNOUNCED THAT HE WILL CONDUCT AN EXTRA LECTURE ON SUNDAY
[Semantic graph: "announced" is linked to "professor" as agent (the: definite) and to the embedded clause SCOPE as object (that: modifier).]

THE PROFESSOR ANNOUNCED THAT HE WILL CONDUCT AN EXTRA LECTURE ON SUNDAY
[Semantic graph of SCOPE: "conduct" (will: future tense) is linked to "he" as agent, to "lecture" as object (an: indefinite), and to "sunday" as time (on: modifier); "lecture" is linked to "extra" as modifier.]

SETS FORMED
a) {the, professor}
b) {professor, announced}
c) {announced, that, SCOPE}
d) SCOPE: {he, conduct}
e) SCOPE: {will, conduct}
f) SCOPE: {conduct, lecture}
g) SCOPE: {conduct, on, sunday}
h) SCOPE: {extra, lecture}
i) SCOPE: {an, lecture}

SRS BASED SEARCH
• The relevance score for a document d is defined in terms of:
  Rq(d) = relevance of the document d to the query q
  |Sd| = number of sentences in the document d
  rq(s) = relevance of sentence s to the query q
• The relevance of the sentence s to the query q is defined in terms of:
  weight(srs) = weight of the SRS srs
  press(srs) = true if srs is present in sentence s, false otherwise
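One reading consistent with these definitions is that rq(s) sums weight(srs) over the query SRSs for which press(srs) is true, and that Rq(d) is the sum of rq(s) over the sentences of d divided by |Sd|. Both forms are assumptions made for illustration, as are the function names and the example weights. A minimal sketch under those assumptions:

```python
def sentence_relevance(query_srs_weights, sentence_srs):
    """rq(s): sum of weight(srs) over query SRSs present in s (assumed form)."""
    return sum(w for srs, w in query_srs_weights.items() if srs in sentence_srs)


def document_relevance(query_srs_weights, document):
    """Rq(d): sentence relevances summed and divided by |Sd| (assumed form)."""
    if not document:
        return 0.0
    total = sum(sentence_relevance(query_srs_weights, s) for s in document)
    return total / len(document)


# Query SRSs with made-up weights; frozenset models the unordered sets.
query = {
    frozenset({"book", "on", "AI"}): 2.0,
    frozenset({"girl", "borrowed"}): 1.0,
}
# A document is a list of sentences; each sentence is the set of its SRSs.
doc = [
    {frozenset({"book", "on", "AI"}), frozenset({"a", "book"})},
    {frozenset({"girl", "borrowed"}), frozenset({"book", "on", "AI"})},
]
score = document_relevance(query, doc)  # (2.0 + 3.0) / 2 = 2.5
```

Because a query SRS only counts when the whole set is present in a sentence, a document that contains the same words in unrelated positions scores nothing, which fits the high-precision behaviour described for SRS based search.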

ANALYSIS OF SRS
The SRS based search technique gives very high precision (the fraction of retrieved instances that are relevant) compared to tf-idf based search, but it falls short of tf-idf based search because of its low recall (the fraction of relevant instances that are retrieved).
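The two measures can be computed directly from the sets of retrieved and relevant documents. A short sketch with made-up document identifiers, showing the high-precision, low-recall situation described here:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Everything retrieved is relevant, but half of the relevant documents are missed.
p, r = precision_recall(["d1", "d2"], ["d1", "d2", "d3", "d4"])
```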

LOW RECALL
REASONS:
• Morphological Divergence. Eg: Apparel for man : Clothes for men
• Synonymy/Hypernymy/Hyponymy Divergence. Eg: Color : red/blue
• Physical Separation Divergence. Eg: Book on AI : AI book

LOW RECALL
ENHANCEMENTS:
• Stemming. Eg: Moving, moved, moves → move
• Word Similarity. Eg: Clothes ~ Apparel
• SRS Augmentation

QUERY EXPANSION

QUERY EXPANSION
Query expansion is the process of reformulating a seed query to improve retrieval performance.
Techniques involved:
• Finding synonyms of words.
• Finding all the various morphological forms of a word by stemming.
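Both techniques can be illustrated with a minimal sketch. The thesaurus entries and the suffix-stripping rule below are toy assumptions standing in for a real thesaurus and a real stemmer; note that a stem produced this way need not be a dictionary word.

```python
# Hypothetical two-entry thesaurus, for illustration only.
TOY_THESAURUS = {"clothes": {"apparel"}, "apparel": {"clothes"}}


def toy_stem(word):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_query(terms):
    """Add the stem and any thesaurus synonyms of each seed term."""
    expanded = set()
    for term in terms:
        term = term.lower()
        expanded.add(term)
        expanded.add(toy_stem(term))
        expanded |= TOY_THESAURUS.get(term, set())
    return expanded


expanded = expand_query(["Clothes", "moving"])
```

With this rule "moving", "moved" and "moves" all reduce to the same stem, so a document containing any of the three forms can match the expanded query once document terms are stemmed the same way.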

TYPES OF QUERY EXPANSION
GLOBAL: Examine word occurrences and relationships using a thesaurus, which can be constructed manually or automatically.
LOCAL: Use the top ranked documents retrieved by the original query.

GLOBAL QUERY EXPANSION
Manual Thesaurus Generation:
• Use of a controlled vocabulary (maintained by human editors) that is built up from sets of synonymous names for concepts.
Automatic Thesaurus Generation:
• Exploit word co-occurrence.
• Exploit grammatical relations or grammatical dependencies.
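The co-occurrence approach can be sketched in a few lines. This is a simplified illustration that assumes whitespace tokenisation and counts two terms as co-occurring whenever they share a document; the function name, the `top_n` parameter and the sample documents are made up for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_thesaurus(documents, top_n=2):
    """Map each term to the terms it most often shares a document with."""
    counts = defaultdict(Counter)
    for doc in documents:
        terms = set(doc.lower().split())
        for a, b in combinations(sorted(terms), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return {t: [w for w, _ in c.most_common(top_n)] for t, c in counts.items()}


docs = [
    "semantic search engine",
    "semantic search query",
    "keyword search engine",
]
thesaurus = cooccurrence_thesaurus(docs, top_n=1)
related_to_semantic = thesaurus["semantic"]  # "search" co-occurs twice
```

Terms related in this way are associated rather than synonymous, which is one reason an automatically built thesaurus can reduce precision when it is used for expansion.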

ANALYSIS OF QUERY EXPANSION
Query expansion is effective in increasing recall of relevant documents, but it may significantly decrease precision, particularly when the query contains ambiguous terms. In general, a domain-specific thesaurus is required for better performance.

RELEVANCE FEEDBACK

RELEVANCE FEEDBACK
i. Initially, the query given by the user is fired.
ii. Some results are retrieved.
iii. Analyze whether or not those results are relevant.
iv. Form a new query and produce the final search results by firing this modified query.

TYPES OF RELEVANCE FEEDBACK
Explicit Feedback: Feedback taken from users, who assess a given output (set of documents).
Eg: After a document is viewed, ask "Was this document helpful?"

ANALYSIS:
ADVANTAGE: It is able to capture the actual requirements and expectations of the user.
DISADVANTAGE:
• A large fraction of users may not be interested in participating in surveys and feedback.
• These surveys may be biased by the personal choices of users. E.g.: for a search on "inferno", most people may rank the pages of the musical band named Inferno above those of Inferno OS.

IMPLICIT FEEDBACK:
Feedback which is inferred from the actions of the user on the output documents.
Factors:
• Number of times a document is visited
• Duration of the visit on a particular URL
• Depth and number of links visited

ANALYSIS:
ADVANTAGE: The interaction time with the user is eliminated, as the system takes the user's feedback implicitly.
DISADVANTAGE:
• Number of hits on a URL: Users may tend to always click on the first document returned. Thus, if the search was initially not up to the mark, it may continue to perform poorly.
• Time spent on a URL: Sometimes the time taken to reject a document may be long enough for the algorithm to believe that it is relevant.
• Number and depth of links visited: This will rank a relevant document with followed links as relevant, but it will fail to rank a good document without links as relevant.

PSEUDO RELEVANCE FEEDBACK OR BLIND FEEDBACK:
Takes a query as input. From the top k ranked results for that query, some keywords are selected (according to their weights) and added to the query, which is then used for a further search.
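The loop can be sketched as follows. This is a minimal illustration, assuming a toy ranking function that counts query-term occurrences and assuming that a keyword's weight is its raw frequency in the top k documents; a real system would use its own ranking model and term weighting. All names and sample documents are hypothetical.

```python
from collections import Counter


def rank(query_terms, documents):
    """Rank document indices by the number of query-term occurrences."""
    scores = []
    for idx, doc in enumerate(documents):
        tokens = doc.lower().split()
        scores.append((sum(tokens.count(t) for t in query_terms), idx))
    # Highest score first; ties keep the original document order.
    return [idx for _, idx in sorted(scores, key=lambda p: (-p[0], p[1]))]


def pseudo_relevance_feedback(query_terms, documents, k=2, n_terms=2):
    """Expand the query with frequent terms from the top k results, then re-rank."""
    top_docs = rank(query_terms, documents)[:k]
    weights = Counter()
    for idx in top_docs:
        for token in documents[idx].lower().split():
            if token not in query_terms:
                weights[token] += 1
    new_terms = [t for t, _ in weights.most_common(n_terms)]
    expanded = list(query_terms) + new_terms
    return expanded, rank(expanded, documents)


docs = [
    "inferno os plan9 kernel",
    "inferno os kernel release",
    "inferno band tour",
    "kernel scheduling notes",
]
expanded_query, reranked = pseudo_relevance_feedback(
    ["inferno", "os"], docs, k=2, n_terms=1
)
```

The added term comes entirely from the initial top k documents, so a poor initial ranking feeds poor terms back into the query.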

ANALYSIS:
ADVANTAGE: It is a completely automated process and hence totally free from human bias.
DISADVANTAGE:
• The effectiveness depends heavily on the ranking algorithm used. If the top documents retrieved by the initial query are not very relevant, the final result will not be very good either.
• The term associations obtained for query expansion are restricted to co-occurrence based relationships in the feedback documents, so other types of term associations, such as lexical and semantic relations (morphological variants, synonyms), are not explicitly captured.

MULTILINGUAL PRF
Given a query in a language, we take the help of another language to ameliorate the well-known problems of PRF. The steps are:
i. Translation: L1 -> L2
ii. PRF performed in L2.
iii. Result back-translation: L2 -> L1
iv. Combination of feedback models of L1, L2.
v. Fetch a new ranked list of documents.
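Steps iii and iv can be sketched as follows. The sketch assumes that a feedback model is a term-to-probability mapping, that back-translation is a lookup in a hypothetical bilingual dictionary, and that the two models are combined by linear interpolation with weight `lam`. The slides do not specify the combination rule, so the interpolation and all names and values here are illustrative assumptions.

```python
def back_translate(model_l2, l2_to_l1):
    """Map an L2 feedback model (term -> probability) into L1 terms."""
    model_l1 = {}
    for term, prob in model_l2.items():
        if term in l2_to_l1:  # terms without a dictionary entry are dropped
            target = l2_to_l1[term]
            model_l1[target] = model_l1.get(target, 0.0) + prob
    total = sum(model_l1.values())
    # Renormalise so the surviving probabilities sum to 1.
    return {t: p / total for t, p in model_l1.items()} if total else {}


def combine(model_a, model_b, lam=0.5):
    """Linear interpolation of two term distributions (assumed rule)."""
    terms = set(model_a) | set(model_b)
    return {
        t: lam * model_a.get(t, 0.0) + (1 - lam) * model_b.get(t, 0.0)
        for t in terms
    }


# Placeholder tokens: t* are L1 terms, u* are L2 terms.
feedback_l1 = {"t1": 1.0}
feedback_l2 = {"u1": 0.5, "u2": 0.5}
l2_to_l1 = {"u1": "t1", "u2": "t2"}
combined = combine(feedback_l1, back_translate(feedback_l2, l2_to_l1))
```

Here the term "t2" reaches the combined model only through the assisting language, which is how back-translation can introduce related terms that the original feedback model did not contain.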

ANALYSIS OF MULTILINGUAL PRF
• Good Feedback from Assisting Language: If the feedback model in the assisting language contains good terms, then the back-translation process will introduce the corresponding feedback terms in the source language, thus leading to improved performance.
• Finding Synonyms/Morphological Variations: Another situation in which MultiPRF leads to large improvements is when it finds terms semantically or lexically related to the query terms which the original feedback model was unable to find.
• Abundance of documents in the assisting language on the web compared to the base language.

OBSERVATIONS

CONCLUDING REMARKS
• Semantic Search will be helpful for Research Search but won't be of much help for Navigational Search.
• Semantic Search performs better than traditional search methods for semantically meaningful sentences or phrases but will fall short for keyword-based search.
• To be able to use Semantic Search Engines to their full potential, users also need to get used to searching with meaningful queries instead of just keywords.

THE FUTURE AHEAD…
Semantic search may not be able to replace the traditional web completely, but it has the power to enhance it. With semantic search the web will become more intelligent, as it will be able to understand what we mean instead of searching just the keywords.

THANK YOU