Identification of Rhetorical Roles for Segmentation and Summarization of a Legal Judgment

Saravanan, M.; Ravindran, B.

doi:10.1007/s10506-010-9087-7

Published: 06 May 2010

Volume 18, pages 45–76, (2010)

Artificial Intelligence and Law

M. Saravanan¹ &
B. Ravindran¹


Abstract

Legal judgments are complex in nature, and hence a brief summary of the judgment, known as a headnote, is generated by experts to enable quick perusal. Headnote generation is a time-consuming process, and attempts have been made to automate it. The difficulty in interpreting such automatically generated summaries is that they are not coherent and do not convey the relative relevance of the various components of the judgment. A legal judgment can be segmented into coherent chunks based on the rhetorical roles played by the sentences. In this paper, a comprehensive system is proposed for labeling sentences with their rhetorical roles and extracting structured headnotes automatically from legal judgments. An annotated data set was created with the help of legal experts and used as training data. A machine learning technique, the Conditional Random Field, is applied to perform document segmentation by identifying the rhetorical roles. The present work also describes the application of probabilistic models for extracting key sentences and composing the relevant chunks in the form of a headnote. An understanding of the basic structures and distinct segments is shown to improve the final presentation of the summary. Moreover, by adding simple additional features, the system can be extended to other legal sub-domains. The proposed system has been empirically evaluated and found to be highly effective on both the segmentation and summarization tasks. The final summary, generated with the underlying rhetorical roles, improves the readability and efficiency of the system.
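The segmentation idea in the abstract, labeling each sentence with a rhetorical role and then grouping adjacent sentences that share a role, can be illustrated with a small sketch. This is not the authors' implementation: it omits Conditional Random Field training entirely and shows only how a linear-chain model's scores might be decoded (here with the standard Viterbi algorithm) and turned into segments. The role names, the `viterbi` and `segments` helpers, and all scores are hypothetical values chosen for illustration.

```python
from itertools import groupby

# Hypothetical role labels, for illustration only (not the paper's label set).
ROLES = ["FACTS", "ARGUMENT", "DECISION"]

def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence for a linear chain.

    emissions[t][k]: score of label k for sentence t.
    transitions[j][k]: score of moving from label j to label k.
    """
    n_labels = len(transitions)
    best = list(emissions[0])   # best score of any path ending in each label
    back = []                   # back-pointers, one list per later sentence
    for scores in emissions[1:]:
        step, pointers = [], []
        for k in range(n_labels):
            prev = max(range(n_labels), key=lambda j: best[j] + transitions[j][k])
            pointers.append(prev)
            step.append(best[prev] + transitions[prev][k] + scores[k])
        best = step
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    label = max(range(n_labels), key=lambda k: best[k])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    return path[::-1]

def segments(path):
    """Group consecutive sentences sharing a label into (role, start, end)."""
    out, start = [], 0
    for label, run in groupby(path):
        length = len(list(run))
        out.append((ROLES[label], start, start + length - 1))
        start += length
    return out

# Toy scores for five sentences (made-up numbers, not from the paper).
emissions = [
    [2.0, 0.5, 0.1],
    [1.5, 1.0, 0.1],
    [0.2, 2.0, 0.3],
    [0.1, 1.2, 1.0],
    [0.1, 0.3, 2.5],
]
# Staying in the same role is rewarded; moving backwards is penalized.
transitions = [
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
    [-2.0, -1.0, 1.0],
]

path = viterbi(emissions, transitions)
print(segments(path))
```

In a trained model the emission and transition scores would come from learned feature weights rather than hand-set numbers; the decoding and grouping steps would stay the same.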

Author information

Authors and Affiliations

Department of Computer Science and Engineering, IIT Madras, Chennai, India
M. Saravanan & B. Ravindran

Corresponding author

Correspondence to M. Saravanan.

About this article

Cite this article

Saravanan, M., Ravindran, B. Identification of Rhetorical Roles for Segmentation and Summarization of a Legal Judgment. Artif Intell Law 18, 45–76 (2010). https://doi.org/10.1007/s10506-010-9087-7

Published: 06 May 2010
Issue Date: March 2010
DOI: https://doi.org/10.1007/s10506-010-9087-7
