
Advanced techniques for legal document processing and retrieval

Artificial Intelligence and Law

Abstract

Considerable interest has been devoted in recent years to the study of models for textual databases that support an effective integration of search and navigation functions. In the field of legal databases, sophisticated models are all the more necessary because different types of texts must be related and combined effectively in order to solve legal problems.

In our research we have analysed several existing models, each providing specific benefits and exhibiting corresponding limitations, from both a functional and an economic viewpoint.

From a functional point of view, a distinctive feature of our model is the representation of relevant context information, aimed at improving retrieval accuracy. Because the framework makes multiple views (structural, conceptual and functional) over the legal texts available, it brings to the fore the issues of the transparency of the model and the incrementality of the search process. The model has been tested on a significant excerpt of the Italian banking regulations and fiscal law, embodied in the NaviLex experimental system.

On the other hand, sophisticated models imply complex text encodings, which in turn entail high costs for the manual indexing and authoring task. This well-known problem, which hampers the development of large, powerful systems, has been tackled with a set of specific linguistic tools, first tested in the Esprit II project Nomos and subsequently developed in research and development projects carried out in the Finsiel Group. These tools, devoted to the automatic extraction from texts of the information structures considered in the retrieval model, use shallow techniques amenable to effective large-scale text processing in the legal domain, in order to overcome the current limitations of traditional 'deep' NLP techniques.

This article presents an overview of our approach, providing a general description of the representation model and processing tools, and concentrating primarily on the representation features and search improvements related to the use of the functional context information.




Cite this article

Pietrosanti, E., Graziadio, B. Advanced techniques for legal document processing and retrieval. Artificial Intelligence and Law 7, 341–361 (1999). https://doi.org/10.1023/A:1008304118095


  • DOI: https://doi.org/10.1023/A:1008304118095
