
Query-based Opinion Summarization for Legal Blog Entries
Jack G. Conrad, Jochen L. Leidner, Frank Schilder, Ravi Kondadadi
Corporate Technology Research & Development
Twelfth International Conference on Artificial Intelligence & Law (ICAIL 2009)
Barcelona, Spain, 8-12 June 2009

OUTLINE • INTRODUCTION • RELATED WORK • SYSTEM • METHODOLOGY • RESULTS • CONCLUSIONS • FUTURE WORK • AI & LAW PROPOSAL

INTRODUCTION (1/4): Motivations
• Amount and rate of legal information flow increasing
• Demands on attorneys for work products very high
• Essential for productivity tools to be efficient
• Legal blogs provide a more immediate forum
  – Unmoderated, instantaneous, candid, terse
  – Contain rich viewpoints, individual or in aggregate
• Missing piece: ability to summarize blog entries
  – Legal professionals busy synthesizing traditional legal materials (cases, statutes, analytical documents)
    • Pressures due to case load and schedules immense
    • Increasingly impossible to keep up with information bandwidth
    • Means of consolidating, summarizing artifacts invaluable
J. G. Conrad, ICAIL 09, 11 June 2009


INTRODUCTION (3/4)
• Key contributions
  1. First work to perform multi-document opinion-based summarization on legal blog entries
  2. Extends the TAC evaluation of the opinion summarization task to assess the accuracy of measured polarity, using expert reviewers
  3. Presents a proposal to the AI & Law community: host a formal track to pursue the topic in a more structured, in-depth manner

INTRODUCTION (4/4)
• Opinion Mining for Legal Blogs: Prospective Applications
  1. Monitoring: follow what communities are saying about firms, products, services, topics
  2. Alerting: inform subscribers of unfavorable developments
  3. Profiling: represent litigation patterns of attorneys, courts...
  4. Tracking: study decisions of judges, reputations of firms...
  5. Exploration/Education: present law students with contrasting opinions


RELATED WORK (1/2)
Diagram of related work across three areas: Legal Domain (ICAIL, JURIX), Summarization (TREC, TAC, et al.), and Sentiment Analysis (ICWSM).
• Ashley & Aleven (1991 ff.): intelligent tutoring
• Hachey & Grover (2006): argumentative zoning
• Saravanan & Raman (2006): Conditional Random Fields
• Lerman & McDonald (2009): sentiment-modeled summarizers
• Conrad & Schilder (2007): blawg polarity classification
• Conrad, Leidner, Schilder and Kondadadi (2009): blawg sentiment summarization

RELATED WORK (2/2): TAC, the Text Analysis Conference (www.nist.gov/tac/)
– a new annual international workshop sponsored by NIST, the US National Institute of Standards & Technology
  • organizers disseminate NLP-type tasks and datasets
  • participants develop systems that solve the tasks
    – submit their results to NIST for evaluation
  • members can also propose new tasks for future workshops
– the sentiment summarization pilot task consisted of producing short, coherent sentiment summaries of blog text
  • Thomson Reuters R&D addressed the task
  • system produced multi-document summaries


SYSTEM (1/3): Workflow Diagram for Blawg Opinion Summarization
Topic: Google Net Neutrality
Sample Query: Has Google been a consistent supporter of Net neutrality?
Sample Target: Google Net Neutrality

They want freedom yet support neutrality? But Black doesn't believe that issues like net neutrality or privacy or copyrights can be considered in isolation; they're all of a piece. The Wall Street Journal attempted to kick up a controversy a couple weeks back with its pronouncement that major tech companies, including Microsoft and Google, were backing away from their commitment to network neutrality. As Ed Black, the group's president, puts it, "Since we represent innovators, we have continually taken a stand for competition policy that makes it possible for the next YouTube to make it out of the dormitory or garage, so that the best technology can prevail over current business models." "When it comes to broadband deployment, CCIA wants to see federal money only going to companies that roll out high-speed infrastructure: 25 Mbps fiber links to the home or 2 4 Mbps wireless links in areas where fiber laying might be too expensive. Freedom on the Internet is critical to vibrant communication and information exchange, which foster innovation and help drive our economy. Net Neutrality seeks to treat all traffic on the internet the same and prevent service providers from regulating availability or content. #6 Internet service providers are chomping at the bit to begin charging for priority access on the Internet. "At the core of these issues," he writes, "is the question of how firmly we are committed to a common ethic of promoting Internet openness, freedom,

SYSTEM (2/3): FastSum, design and application
• TR's legal blog opinion summarization system
  – multi-document summarization system
  – harnesses regression Support Vector Machine (SVM) for ranking candidate sentences
  – original system extended to sentiment
  – current system applied to legal domain (blawgs)
Timeline: Summarization (2007), Sentiment (2008), Legal (2009)

SYSTEM (3/3): FastSum Blog Opinion Summarization Processing
Key Modifications
• A.1 HTML parsing & clean-up module
• B.1 Question sentiment & target analyzer
• C.1 Sentence tagger
• C.2 Target overlap
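One way to picture the "C.2 Target overlap" modification is as a token-overlap score between the query target and each candidate sentence. The sketch below is only an illustration under that assumption; the slides do not specify how FastSum computes the overlap. The function names and the small stopword list are hypothetical.

```python
import re

# Hypothetical, tiny stopword list used only for this illustration.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "in", "on", "is", "be"}


def content_tokens(text):
    """Lowercase word tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def target_overlap(target, sentence):
    """Fraction of distinct target terms that also occur in the sentence."""
    target_terms = set(content_tokens(target))
    if not target_terms:
        return 0.0
    sentence_terms = set(content_tokens(sentence))
    return len(target_terms & sentence_terms) / len(target_terms)


target = "Google Net Neutrality"
s1 = "Net Neutrality seeks to treat all traffic on the internet the same."
s2 = "Internet service providers are chomping at the bit."
print(target_overlap(target, s1))  # two of the three target terms match
print(target_overlap(target, s2))  # no target term matches
```

A score like this could serve as one input feature to the sentence ranker; whether the real system normalizes by target length is not stated in the slides.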


METHODOLOGY (1/7)
• Application of Thomson Reuters' legal blog opinion summarization system
  1. Data collection via Web-based queries
     • submitted to Web Search Engine, Blog Search Engine
  2. Summary generation
     • via modified FastSum System
  3. Evaluation
     • human assessment
       – two assessors rated each summary
       – measures modeled on TAC metrics

METHODOLOGY (2/7)
Blog Search Engines Examined along with Their Properties

Engine                      Properties (selected)
General Blog Search Engines (Focus: Blogosphere)
  technorati.com            Includes authority score
  blogsearch.google.com     Date or relevancy ranking
  www.blogsearchengine.com  Focus on higher quality content
Legal Blog Search Engines (Focus: Blawgosphere)
  www.blawg.com             Generally shorter entries
  blawgsearch.justia.com    Date or relevancy ranking
  www.blawgrepublic.com     Generally shorter entries

METHODOLOGY (3/7)


METHODOLOGY (5/7): Evaluation
• Metrics used modeled on TAC (et al.) evaluation
  – Two metrics used:
    1. Responsiveness
    2. Linguistic Quality
  – Scale: five-point Likert [1-5]
    • 5 = high
    • 1 = low
  – Scores generally track those of TAC, though task not completely identical

METHODOLOGY (6/7): Evaluation: Responsiveness
Reviewer Guidelines for Responsiveness [1-5]

Grade  Meaning    Interpretation
(5)    Very good  On point relative to question, including polarity
(4)    Good       Addresses question, including at least partially the polarity
(3)    Adequate   Marginally relevant to the question, independent of polarity
(2)    Poor       May have overlap with question topic, and its polarity
(1)    Very poor  Misses the general point of question, polarity aside

METHODOLOGY (7/7): Evaluation: Linguistic Quality
Reviewer Guidelines for Linguistic Quality [1-5]

Dimension                Essential Considerations
Grammaticality           no datelines, system internal formatting, fragments, omissions, capitalization errors, etc.
Non-redundancy           no unnecessary repetition, especially among complete sentences, facts, noun phrases
Referential Clarity      easily identifiable pronouns and noun phrases, same with role in summary
Focus                    should have clear focus, sentences' information should relate only to rest of summary
Structure and Coherence  should be well-structured and organized, sentences tied together, not an information heap


RESULTS (1/2): Baseline Averages
Table columns: No. of Topic Queries; Polarity; Blogs per Summary; Responsiveness (Rater A, Rater B); Linguistic Quality (Rater A, Rater B)
Training Average: 2 topic queries; polarity +; values as extracted: 4.5, 2.5, 3.0, 2.5
Testing Average: 10 topic queries; polarity + / - / neut; values as extracted: 4.4, 2.1, 2.0, 2.3
• Scores comparable to those of TAC 2008 (in 2-3 range)
• Caveat: we scored for correct sentiment polarity; TAC didn't
• Kappa statistic for inter-rater agreement between pair, κ = 0.75
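For readers unfamiliar with the kappa statistic reported above, the following sketch computes unweighted Cohen's kappa for two raters. The slide does not say which kappa variant was used, so the unweighted form is an assumption, and the Likert ratings in the example are invented for illustration; they are not the study's data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical 1-5 Likert scores from two raters on ten summaries.
rater_a = [2, 3, 2, 1, 2, 3, 2, 2, 3, 1]
rater_b = [2, 3, 2, 1, 2, 2, 2, 3, 3, 1]
print(round(cohen_kappa(rater_a, rater_b), 3))
```

For ordinal scales such as the five-point Likert scale used here, a weighted kappa that gives partial credit for near misses is a common alternative.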

RESULTS (2/2): Sample FastSummary
Topic: Anonymous Internet Query Logs

to shorten the time they keep information about their users. Under the new policy, Yahoo will delete the last eight bits of the Internet Protocol, or I.P., address associated with a search query after 90 days. To start with, there's often not a one-to-one correspondence between IP addresses and Internet users. So, before you search, think. It will also alter so-called cookie data related to each search log and strip out any personal information, like a name, phone number, address or Social Security number, from the query. 2008 December 17 Internet WEB 2.0 Internet news Categories Internet Archives: January 2009 December 2008 November 2008 October 2008 September 2008 Yahoo Limits Retention of Personal Data Posted by admin | Internet | Wednesday 17 December 2008 4:16 am, the Internet search company, said that it would limit the time it holds identifiable personal information related to searches to 90 days to address the growing concerns of privacy advocates and government regulators. It turns out that Yahoo won't be deleting the contents of its search logs. Whether or not a search engine does this is usually disclosed in the search engine's privacy policy. If Yahoo's logs include information linking each user's various searches together, then even deleting the IP address entirely probably won't be enough to safeguard user privacy. Ms. Toth said she hoped that the new policy would make Yahoo's search service more attractive with users concerned about privacy. Obviously, Yahoo's new policy will do little to allay such concerns.

RESULTS (2/2): Sample FastSummary (annotated)
Same summary as on the previous slide, with four callouts marking passages: Topical overlap; Deficient; Useful to researcher; Display of sentiment.


CONCLUSIONS (1/1)
• Amount, rate of legal information flow growing
  – Summarization, identification of trends increasingly valuable
• Forums like TAC-opinion summarization beginning to study topic
• For certain legal research, such synopses can be very helpful
• Viewpoints, individually or in aggregate, can expand arguments, comprehension of underlying legal issues
• First effort to produce automatic opinion summaries for entries in legal blog space
• Based on multiple documents
• For pre-specified polarity
  – Trained on general, homogeneous news documents (okay)
  – Trained on specific heterogeneous legal blogs (better)
• Assessed by expert legal reviewers
  – Baseline scores in the low 2.0s out of 5 (comparable to TAC)


FUTURE WORK (1/1)
• Compare to other summarization systems/techniques
  – From TAC or elsewhere
• Test against model summaries and use the nugget pyramid evaluation method
• Train the ML component of FastSum on various blog entries, rather than general news
• Formalize the role input data has on result sets, and the impact output length has on results
• Incorporate more structure
  – Qualitative: best template to harness?
  – Quantitative: optimal length for each section?
• Leverage features from the legal domain
  – E.g., use a legal dictionary to help rank sentences


AI & LAW PROPOSAL (1/1)
• For AI & Law (IAAIL) and TAC (NIST)
  – NIST offers research groups a shared task in multi-document summarization
  – Why not focus on a shared task in the legal domain?
• Needs to be assessed by IAAIL, NIST communities to determine interest
  – Who would benefit?
    1. Legal practitioners: potentially highly beneficial results
    2. Legal researchers: thanks to valuable testbed
    3. AI & Law community: can breathe in new life, members
• What data collections could be used?
  • TAC uses the very large BLOG06 collection
  • Text Entailment uses the RTE collection; a hybrid also possible

Thank you! Questions?

INTRODUCTION (1/4): Motivations
• Modern legal information environment increasingly dynamic, fast-paced
  – Blawgs (legal blogs) provide a more immediate forum
    • Generally unmoderated, instantaneous, candid, terse
    • Viewpoints to be gleaned, in aggregate or individually, are rich
• Missing piece: ability to summarize blog entries
  – Legal professionals busy simply synthesizing traditional legal materials (cases, statutes, analytical documents)
    • Pressures due to case load and schedules immense
    • Increasingly impossible to keep up with information bandwidth
    • Means of consolidating, summarizing artifacts invaluable

AI & LAW PROPOSAL (1/1)
• For AI & Law (IAAIL) and TAC (NIST)
  – NIST offers research groups a shared task in multi-document summarization
  – Why not focus on such a shared task in the legal domain?
  – Needs to be assessed by IAAIL, NIST communities to determine interest
  – Potentially of great benefit to legal practitioners
  – Could raise the bar on current baseline system
  – Could use:
    • Blog-based data set like BLOG06, as used in TAC 2008, with a legal component
    • RTE (Recognizing Textual Entailment) data set, again with a legal component
    • a combination of the two

RELATED WORK (1/2)
– Ashley & Aleven (1991 ff.): produce intelligent tutoring applications to teach law students how to argue in the context of caselaw
– Farzindar & Lapalme (2004): present the LetSum system to summarize Canadian court decisions
– Hachey & Grover (2006): apply argumentative zoning to summarize decisions from the House of Lords
– Saravanan & Raman (2006): use statistical graphical models (CRFs) for legal summarization, while extracting rhetorical roles
– Lerman, B.-G., and McDonald (2009): show users have a strong preference for summarizers that model sentiment over non-sentiment baselines

SYSTEM (4/5)
• FastSum's legal blog opinion summarization system
  – Sequence of operation
    1. Pre-processing
       (a) tokenization
       (b) sentence splitting
       (c) boilerplate expression removal (e.g., 'Response by...')
    2. Question analysis
       (a) sentiment analysis (tagging)
       (b) target analysis (matching)
    3. Sentiment filter
       – sentences with proper polarity selected; else, filtered out
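The pre-processing and sentiment-filter steps above can be sketched in a few lines. This is a toy illustration, not FastSum's implementation: the boilerplate patterns, the polarity word lists, and all function names are hypothetical, and a real system would use a proper sentence splitter and sentiment tagger.

```python
import re

# Hypothetical boilerplate patterns; the slide names only 'Response by...' as an example.
BOILERPLATE = [
    re.compile(r"^\s*Response by\b.*$", re.IGNORECASE),
    re.compile(r"^\s*Posted by\b.*$", re.IGNORECASE),
]

# Toy polarity lexicons, invented for this illustration only.
POSITIVE = {"support", "supports", "good", "attractive", "helpful"}
NEGATIVE = {"little", "concerns", "unfavorable", "bad", "fails"}


def remove_boilerplate(lines):
    """Drop lines that match any boilerplate pattern."""
    return [ln for ln in lines if not any(p.match(ln) for p in BOILERPLATE)]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens (apostrophes kept inside words)."""
    return re.findall(r"[a-z']+", sentence.lower())


def polarity(sentence):
    """Return '+', '-' or 'neut' from simple lexicon counts."""
    tokens = tokenize(sentence)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "+" if score > 0 else "-" if score < 0 else "neut"


def sentiment_filter(sentences, wanted):
    """Keep only sentences whose polarity matches the one the question asks for."""
    return [s for s in sentences if polarity(s) == wanted]


raw = (
    "Response by admin\n"
    "Yahoo's new policy will do little to allay such concerns. "
    "The new policy could make search more attractive."
)
sentences = split_sentences(" ".join(remove_boilerplate(raw.splitlines())))
print(sentiment_filter(sentences, "-"))
```

In this sketch the wanted polarity is passed in directly; in the pipeline described on the slide it would come from the question analysis step.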

RELATED WORK (2/2): TAC, the Text Analysis Conference
– a new annual international workshop sponsored by NIST, the National Institute of Standards & Technology
  • organizers disseminate NLP-type tasks and datasets
  • participants develop systems that solve the tasks
    – submit their results to NIST for evaluation
  • members can also propose new tasks for future workshops
– the sentiment summarization pilot task consisted of producing short, coherent sentiment summaries of blog text
  • our system produced multi-document summaries
– Related Conferences:
  • TREC: the Text Retrieval Conference (started in 1992)
  • DUC: Document Understanding Conference (from 2001-07)
    – evaluated many automatic summarization systems during period

SYSTEM (5/5)
• FastSum's legal blog opinion summarization system
  – Sequence of operation (cont.)
    4. Feature extraction
       – focus largely on correspondence with terms in query
         • at different levels of granularity: title, description, document
       – also harness sentence-based features
         • length, position
    5. Sentence ranker
       – trained regression SVM on feature set; goal: summary worthiness
    6. Redundancy removal
       – basic idea: change relative importance of remaining sentences w.r.t. currently selected sentences
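The redundancy-removal idea, changing the relative importance of the remaining sentences with respect to those already selected, can be sketched as a greedy loop that discounts each remaining score by its word overlap with the latest pick. The Jaccard discount, the `penalty` parameter, and the scores below are assumptions made for illustration; the slides do not give FastSum's actual re-weighting formula.

```python
import re


def words(sentence):
    """Set of lowercase word tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_sentences(scored, k, penalty=1.0):
    """Greedily pick up to k sentences from (sentence, score) pairs.

    After each pick, every remaining score is multiplied by
    (1 - penalty * overlap with the picked sentence); penalty is assumed
    to lie in [0, 1].
    """
    remaining = [(s, float(score), words(s)) for s, score in scored]
    chosen = []
    while remaining and len(chosen) < k:
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        sent, _, sent_words = remaining.pop(best)
        chosen.append(sent)
        remaining = [
            (s, sc * (1.0 - penalty * jaccard(w, sent_words)), w)
            for s, sc, w in remaining
        ]
    return chosen


# Hypothetical ranker scores for three candidate sentences.
scored = [
    ("Yahoo will delete part of the IP address after 90 days.", 0.90),
    ("After 90 days Yahoo will delete part of the IP address.", 0.85),
    ("Privacy advocates remain concerned about search logs.", 0.60),
]
print(select_sentences(scored, 2))
```

Here the second candidate repeats the first almost word for word, so its score is discounted and the third sentence is selected instead.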

SYSTEM (5/5)
• FastSum's legal blog opinion summarization system
  – Sequence of operation (cont.)
    4. Feature extraction
       – topic word frequency (title, description)
       – content word frequency
       – document frequency
       – headline frequency
       – sentence-based features (length, position)
       Sample topic:
         <topic>
           <num> D0703A </num>
           <title> age discrimination </title>
           <narr> This expose documents the increasing occurrence of age discrimination in the workplace in Canada... </narr>
         </topic>
    5. Sentence ranker
       – trained regression SVM on feature set; goal: summary worthiness
    6. Redundancy removal
       – basic idea: change relative importance of remaining sentences w.r.t. currently selected sentences
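As a rough illustration of the feature-extraction step, the sketch below parses a topic record like the sample on this slide and computes a few per-sentence features of the kinds listed (topic word frequency, length, position). The exact feature definitions are assumptions, as is the mapping of the slide's "description" to the `<narr>` element; the function names are hypothetical. In the described system, vectors like these would be passed to the trained regression SVM, which is not reproduced here.

```python
import re
import xml.etree.ElementTree as ET

TOPIC_XML = """<topic>
<num> D0703A </num>
<title> age discrimination </title>
<narr> This expose documents the increasing occurrence of age discrimination in the workplace in Canada... </narr>
</topic>"""


def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def parse_topic(xml_text):
    """Return a dict mapping each child tag of <topic> to its stripped text."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def sentence_features(sentence, position, n_sentences, topic):
    """Illustrative features for one sentence at a 0-based position in its document."""
    sent_tokens = tokens(sentence)
    title_terms = set(tokens(topic.get("title", "")))
    narr_terms = set(tokens(topic.get("narr", "")))
    n = len(sent_tokens) or 1  # avoid division by zero for empty sentences
    return {
        # share of sentence tokens that appear in the topic title / narrative
        "title_freq": sum(t in title_terms for t in sent_tokens) / n,
        "narr_freq": sum(t in narr_terms for t in sent_tokens) / n,
        "length": len(sent_tokens),
        # 0.0 for the first sentence, 1.0 for the last
        "position": position / max(n_sentences - 1, 1),
    }


topic = parse_topic(TOPIC_XML)
feats = sentence_features("Age discrimination claims rose in Canada.", 0, 3, topic)
print(feats)
```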