LII Legal Information Institute Sara Frug Associate Director for Technology, LII Thomas Bruce Director, LII

Page 1

LII: Legal Information Institute

Sara Frug, Associate Director for Technology, LII

Thomas Bruce, Director, LII

Page 2

Who are we?

• Open-access legal publisher based at Cornell Law School (since 1992)
• Known for enhanced US Code and CFR
• Help people find and understand the law (for 30 million visitors last year)
• Collaborators with information scientists at Cornell and elsewhere

Page 3

What do we do?

• Provide a bottom-up (regulations-first) approach to the law
• Enrich primary text
• Extract metadata
• Connect things-in-the-world to their legal context
• Present these connections in an understandable way

Page 4

For example

Page 5

What makes this hard?

• Traditional semantic web quality requirements
• Combined with:
• Dirty source data
• Incomplete source data
• Performance of extraction technologies
• Formalizing and converting paper finding aids unearths flaws and mapping gaps

Page 6

What have we done?

• A lot of prototypes; some production features
• XML enhancement of US Code, CFR
• Metadata to RDF
• Feature development
• Visualization

Page 7

Data Sources

• FDsys eCFR XML
• Parallel Table of Authorities, Table of Popular Names of Legislation, etc.
• FR Thesaurus of Indexing Terms
• U.S. Government Manual
• Agency guidances
• Ontologies and vocabularies

Page 8

eCFR minimum feature set - and why it’s hard to build

Feature                    | Dependency                    | Challenge
Legislative references     | Bill structure metadata       | Metadata availability
Pinpoint cross-references  | Beneath-section structure     | Inconsistent enumeration
Breadcrumbs, TOCs          | Knowledge of title structure  | XML structure is volume-based
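
The "pinpoint cross-references" row depends on addressing structure beneath the section level. The following is a minimal sketch of that idea, not LII's parser: it assumes citations of one common form, such as `21 CFR 314.50(d)(1)(ii)`, and the pattern and the `parse_cfr_citation` helper are hypothetical names introduced for illustration. Sections with irregular numbering would need further patterns, which is one face of the "inconsistent enumeration" challenge.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CfrCitation(NamedTuple):
    title: int
    part: str
    section: str
    paragraphs: Tuple[str, ...]  # e.g. ("d", "1", "ii")


# Assumed pattern for one common citation form only,
# e.g. "21 CFR 314.50(d)(1)(ii)" or "40 C.F.R. § 60.2".
_CITE = re.compile(
    r"(?P<title>\d+)\s+C\.?F\.?R\.?\s+(?:§\s*)?"
    r"(?P<part>\d+)\.(?P<section>\d+[a-z]?)"
    r"(?P<paras>(?:\([A-Za-z0-9]+\))*)"
)


def parse_cfr_citation(text: str) -> Optional[CfrCitation]:
    """Return the first CFR citation found in text, or None."""
    m = _CITE.search(text)
    if m is None:
        return None
    # Split "(d)(1)(ii)" into its enumerated levels.
    paras = tuple(re.findall(r"\(([A-Za-z0-9]+)\)", m.group("paras")))
    return CfrCitation(int(m.group("title")), m.group("part"),
                       m.group("section"), paras)
```

Keeping the paragraph levels as an ordered tuple leaves open how each level maps onto the source XML, which the slide identifies as the hard part.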

Page 9

Definitions

• Definition extraction
• Definiendum parsing
• Scope detection and resolution
• Term-in-context disambiguation

M.Eng. Teams:
Fall 2015: Karthik Venkataramaiah, Dhwanish Pramthesh Shah, Shivananda Pujeri, Vishal Kumkar, Jigar Bhati; Supervisors: Sara Frug and Sylvia Kwakye, Ph.D.
Fall 2013: Deepthi Rajagopalan, Neha Kulkarni, and Siyu Zhan; Supervisor: Mohammad AL Asswad, Ph.D.
Fall 2012: Sarah Bouwman, Debraj Sinha; Supervisor: Nuria Casellas, Ph.D.
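
The first three steps on this slide can be illustrated with a small sketch. It assumes definitions written as "Term means ..." and scope phrases such as "As used in this part"; both regular expressions and the `extract_definitions` helper are hypothetical, and this is not the teams' implementation. Term-in-context disambiguation is not attempted here.

```python
import re
from typing import List, NamedTuple, Optional


class Definition(NamedTuple):
    term: str             # the definiendum
    text: str             # the defining phrase
    scope: Optional[str]  # structural unit the definition applies to, if stated


# Assumed scope phrasing; real regulatory text uses many more variants.
_SCOPE = re.compile(
    r"(?:for (?:the )?purposes of|as used in) this "
    r"(?P<unit>section|subpart|part|chapter)",
    re.IGNORECASE,
)

# Assumed definition form: a capitalized term, "means" or "includes",
# then the defining phrase up to the next period.
_DEF = re.compile(
    r"(?P<term>[A-Z][A-Za-z \-]*?)\s+(?:means|includes)\s+(?P<text>[^.]+)\."
)


def extract_definitions(paragraph: str) -> List[Definition]:
    """Extract (term, text, scope) triples from one paragraph."""
    scope_match = _SCOPE.search(paragraph)
    scope = scope_match.group("unit").lower() if scope_match else None
    return [
        Definition(m.group("term").strip(), m.group("text").strip(), scope)
        for m in _DEF.finditer(paragraph)
    ]
```

For example, `extract_definitions("As used in this part: Act means the Federal Food, Drug, and Cosmetic Act.")` yields one definition for the term "Act" with scope "part". A defining phrase that itself contains a period would be truncated by this pattern.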

Page 10

Linked Entities

• Ontologies and Vocabularies
• Agrovoc and Agris
• Drugbank
• MeSH
• DBpedia

M.Eng. Teams:
Fall 2015 (Drugbank): Arpitha Shivakumar, Sanapureddy Ram Sai, Sheena Jain; Domain expert: Caroline Young, JD, MLIS
Spring 2015: Jai Bhatt
Fall 2014 (FIBO, DBpedia): Jai Bhatt, Trupti Bavalatti, Meghana Pavagada Chandrashekar, Nikhil Navali, Surya Sumukh SP, Joshua Freeberg
Spring 2013 (Agrovoc, Agris): Yan Huang, Timothy Savard
Spring 2012 (UNSPSC): Jie Lin, Krithi Rai
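
One simple way to connect regulatory text to an external vocabulary is label matching. The sketch below shows that technique only; the slides do not say how the teams linked entities. The `VOCAB` table, its `example.org` URIs, and the `link_entities` helper are all hypothetical, and a real table would be loaded from a vocabulary such as MeSH or Agrovoc.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical label -> URI table. Keys must be lowercase.
VOCAB: Dict[str, str] = {
    "aspirin": "http://example.org/vocab/aspirin",
    "acetylsalicylic acid": "http://example.org/vocab/aspirin",
    "wheat": "http://example.org/vocab/wheat",
}


def link_entities(text: str, vocab: Dict[str, str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, uri) spans for vocabulary labels found in text."""
    if not vocab:
        return []
    # Try longer labels first so multi-word labels win over their substrings.
    labels = sorted(vocab, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), vocab[m.group(1).lower()])
        for m in pattern.finditer(text)
    ]
```

Exact label matching makes no attempt to tell apart different senses of the same word.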

Page 11

Bottom-up Vocabulary

• Parsed all language of the CFR
• Extracted broader, narrower, related
• Extracted obligations and requirements

M.Eng. Teams:
- Sharvari Marathe, Dallas Dias, Ankit Singh; Sanjna Venkataraman; Supervisor: Núria Casellas, Ph.D.
- Caleb Perkins; Supervisor: Mohammad AL Asswad, Ph.D.
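
Broader, narrower, and related are the relation names used by SKOS, so extracted term pairs can be written out as RDF triples. The sketch below serializes them as N-Triples lines using plain string formatting. The `BASE` namespace, the URI-minting rule in `concept_uri`, and the `to_ntriples` helper are assumptions for illustration, not the project's actual scheme.

```python
from typing import Iterable, List, Tuple

SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/cfr-vocab/"  # hypothetical namespace

ALLOWED = {"broader", "narrower", "related"}


def concept_uri(label: str) -> str:
    """Mint a URI from a label (assumed rule: lowercase, spaces to hyphens)."""
    return BASE + label.strip().lower().replace(" ", "-")


def to_ntriples(relations: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Turn (term, relation, term) tuples into N-Triples lines."""
    lines = []
    for subject, relation, obj in relations:
        if relation not in ALLOWED:
            raise ValueError(f"unsupported relation: {relation}")
        lines.append(
            f"<{concept_uri(subject)}> <{SKOS}{relation}> <{concept_uri(obj)}> ."
        )
    return lines
```

Labels containing characters that are not valid in an IRI would need escaping before use; this sketch does not handle that.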

Page 12

Topic Modeling

• Software: Mallet (David Mimno)
• Visualizer: fork of DFR Browser (Andrew Goldstone; fork by Josh Campbell)

M.Eng. Teams:
CFR: Eva Sharma, Shreya Roy Chowdhury, Lisha Murthy
Federal Register: Nivedhitha Sundarmurthi, Srinisha Ramaswamy, Shubhangi Kumar
Visualization: Joshua Campbell, Saicharan Shriram Mujumar, Anisha Venugopal Reddy

Page 13

Visualizer: Topic List

Page 14

Visualizer: Topic View

Page 15

Visualizer: Document View

Page 16

Visualizer: Stopword Evaluation

Page 17

Visualizer: Stability over Time

Page 18

Linked Data: CFR and Agency Structure

Page 19

Linked Data: CFR Cross-References

Page 20

CFR Entities

Page 21

CFR and Guidances

Page 22

Evaluation and Crowdsourcing

• We tried domain-expert-annotates-samples
• Domain experts were reliable for a few terms and a few paragraphs at a time and would otherwise find it difficult not to fall back on word-processing search tools
• We decided to separate evaluation of precision and recall
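
Separating the two measures is possible because they need different inputs. Precision needs judgments only on what the extractor produced, while recall needs a reference set of everything it should have found in a sample. A minimal sketch, with illustrative function names and made-up example sets:

```python
from typing import Set


def precision(extracted: Set[str], judged_correct: Set[str]) -> float:
    """Share of extracted items that reviewers judged correct."""
    if not extracted:
        return 0.0
    return len(extracted & judged_correct) / len(extracted)


def recall(extracted: Set[str], reference: Set[str]) -> float:
    """Share of the reference items that the extractor found."""
    if not reference:
        return 0.0
    return len(extracted & reference) / len(reference)


# Made-up example: three extracted terms, two judged correct,
# and a four-term reference set for the same sample.
extracted = {"act", "drug", "person"}
judged_correct = {"act", "drug"}
reference = {"act", "drug", "sponsor", "label"}

p = precision(extracted, judged_correct)  # 2 of 3 extracted are correct
r = recall(extracted, reference)          # 2 of 4 reference terms found
```

Judgments for `precision` can be gathered one extracted item at a time, whereas the `reference` set requires someone to read the whole sample.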

Page 23

All users see links:

Page 24

Interested users may evaluate

Page 25

Crowdsourcing: How it Can Work

Assessment | Expertise required
Was the definition phrase truncated? | Simple reading comprehension (any layperson)
Was the defined term parsed correctly? | Reading comprehension (any layperson)
Was a non-definition extracted / linked? | Familiarity with legal text (law student, careful reader)
Was a term marked when “the context otherwise requires”? | Familiarity with legal text or substantive domain expertise (careful student, lay expert on object of regulation)
Was scoping language extracted? | Familiarity with legal text (law student, careful reader)
Does the definition apply within this scope? (simple scope) | Familiarity with legal text (law student, careful lay reader)
Does the definition apply within this scope? (complex scope) | Experience with regulatory practice, regulatory drafting, or law librarianship (lawyer)

Page 26

Crowdsourcing: Looking Ahead

• Inspiration: Games With a Purpose and reCAPTCHA (Luis von Ahn)
• Make recall evaluation simple and granular enough for a knowledgeable audience to complete successfully

Page 27

More information

• The Law of Where I’m Standing Right Now - https://blog.law.cornell.edu/tbruce/2013/12/24/the-law-of-where-im-standing-right-now/
• Making Metasausage (legislative data modeling) - https://blog.law.cornell.edu/metasausage
• Linked Legal Data: A SKOS Vocabulary for the Code of Federal Regulations - http://www.semantic-web-journal.net/content/linked-legal-data-skos-vocabulary-code-federal-regulations
• What can an Index Do? - http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2443386

Page 28

Questions?

• Sara Frug <[email protected]>