
Ant: a process aware annotation software for regulatory compliance

  • Original Research
  • Published:
Artificial Intelligence and Law

Abstract

Accurate data annotation is essential to successfully implementing machine learning (ML) for regulatory compliance. Annotations allow organizations to train supervised ML algorithms and to adapt and audit the software they buy. The lack of annotation tools focused on regulatory data is slowing the adoption of established ML methodologies and process models, such as CRISP-DM, in various legal domains, including regulatory compliance. This article introduces Ant, an open-source annotation tool for regulatory compliance. Ant is designed to adapt to complex organizational processes and to keep compliance experts in control of ML projects. Drawing on Business Process Modeling (BPM), we show that Ant can help lift major technical bottlenecks to implementing regulatory compliance through software, such as access to multiple sources of heterogeneous data and the integration of process complexities into the ML pipeline. We provide empirical data to validate the performance of Ant, illustrate its potential to speed up the adoption of ML in regulatory compliance, and highlight its limitations.




Notes

  1. https://gitlab.ulb.be/rgyori/ant_paper.

  2. Celonis is an example of such software: https://www.celonis.com/blog/process-mining-and-internal-audit-a-match-made-in-heaven/.

  3. ComplianceAlpha is an example of such software and is available at https://www.acaglobal.com/our-solutions/compliancealpha/ecomms-surveillance.

  4. https://acpr.banque-france.fr/en/tech-sprint-2022-call-applications.

  5. Tools like Apache Tika (https://tika.apache.org/) can perform the extraction tasks. Tools like spaCy (https://spacy.io/) or Spark NLP (https://nlp.johnsnowlabs.com/) are suitable for the transformation phase.

  6. https://spark.apache.org/.

  7. https://www.mongodb.com/.

  8. https://redis.io/.

  9. https://www.djangoproject.com/.

  10. https://reactjs.org/.

  11. https://www.docker.com/.

References

  • Agarwal A, Ganesan B, Gupta A, Jain N, Karanam HP, Kumar A, Madaan N, Munigala V, Tamilselvam SG (2017) Cognitive compliance for financial regulations. IT Professional 19(4):28–35

    Article  Google Scholar 

  • Al-Shabandar R, Lightbody G, Browne F, Liu J, Wang H, Zheng H (2019) The application of artificial intelligence in financial compliance management. Proc Int Conf Artif Intell Adv Manuf. https://doi.org/10.1145/3358331.3358339

    Article  Google Scholar 

  • Anagnostopoulos I (2018) Fintech and regtech: Impact on regulators and banks. J Econ Bus 100:7–25. https://doi.org/10.1016/j.jeconbus.2018.07.003

    Article  Google Scholar 

  • Arner DW, Barberis J, Buckley RP (2015) The evolution of Fintech: a new post-crisis paradigm. Geo J Int 47:1271

    Google Scholar 

  • Arner DW, Barberis J, Buckey RP (2016) FinTech, RegTech, and the reconceptualization of financial regulation. Nw J Int Bus 37:371

    Google Scholar 

  • Asthana S, Kwatra S, Pandit S (2021) ML model change detection and versioning service. IEEE Int Conf Smart Data Serv (SMDS) 2021:82–84. https://doi.org/10.1109/SMDS53860.2021.00021

    Article  Google Scholar 

  • Aziz S, Dowling M (2019) Machine learning and AI for risk management. In: Lynn T, Mooney JG, Rosati P, Cummins M (eds) Disrupting finance: FinTech and strategy in the 21st century. Springer International Publishing, pp 33–50. https://doi.org/10.1007/978-3-030-02330-0_3

    Chapter  Google Scholar 

  • Bakhshinejad N, Soltani R, Nguyen U, Messina P (2022) A survey of machine learning based anti-money laundering solutions. Researchgate preprint. Accessed 5 Aug 2023

  • Bănărescu A (2015) Detecting and preventing fraud with data analytics. Proced Econ Finance 32:1827–1836. https://doi.org/10.1016/S2212-5671(15)01485-9

    Article  Google Scholar 

  • Becker M, Merz K, Buchkremer R (2020) RegTech the application of modern information technology solutions in regulatory affairs: areas of interest in research and practice. Intell Syst Account Finance Manag 27:161–167. https://doi.org/10.1002/isaf.1479

    Article  Google Scholar 

  • Bikaun T, Stewart M, Liu W (2022) QuickGraph: a rapid annotation tool for knowledge graph extraction from technical text. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp 270–278. https://aclanthology.org/2022.acl-demo.27

  • Bizzo BC, Ebrahimian S, Walters ME, Michalski MH, Andriole KP, Dreyer KJ, Kalra MK, Alkasab T, Digumarthy SR (2022) Validation pipeline for machine learning algorithm assessment for multiple vendors. PLoS ONE 17(4):e0267213

    Article  Google Scholar 

  • Bornstein A, Cattan A, Dagan I (2020) CoRefi: a crowd sourcing suite for coreference annotation. Proc Conf Empir Methods Natl Lang Process. https://doi.org/10.18653/v1/2020.emnlp-demos.27

    Article  Google Scholar 

  • Braun D, Matthes F (2021) NLP for consumer protection: battling illegal clauses in German terms and conditions in online shopping. Proc Worksh NLP Posit Impact. https://doi.org/10.18653/v1/2021.nlp4posimpact-1.10

    Article  Google Scholar 

  • Butler T, O’Brien L (2019) Understanding RegTech for digital regulatory compliance. Disrupting finance. Palgrave Pivot, pp 85–102

    Chapter  Google Scholar 

  • Cao L (2022) Ai in finance: Challenges, techniques, and opportunities. ACM Comput Surv (CSUR) 55(3):1–38

    Article  MathSciNet  Google Scholar 

  • Cardoso J (2005) About the data-flow complexity of web processes. In: 6th International Workshop on Business Process Modeling, Development, and Support: Business Processes and Support Systems: Design for Flexibility, pp 67–74

  • Castellanos-Ardila JP, Gallina B, Governatori G (2021) Compliance-aware engineering process plans: the case of space software engineering processes. Artif Intell Law 29(4):587–627. https://doi.org/10.1007/s10506-021-09285-5

    Article  Google Scholar 

  • Chalkidis I, Fergadiotis M, Malakasiotis P, Aletras N, Androutsopoulos I (2020) LEGAL-BERT: the muppets straight out of law school. Find Assoc Comput Linguist 2020:2898–2904. https://doi.org/10.18653/v1/2020.findings-emnlp.261

    Article  Google Scholar 

  • Chalkidis I, Fergadiotis M, Androutsopoulos I (2021) MultiEURLEX - a multi-lingual and multi-label legal document classification dataset for zero-shot cross-lingual transfer. Proc Conf Empir Methods Natl Lang Process. https://doi.org/10.18653/v1/2021.emnlp-main.559

    Article  Google Scholar 

  • Chamberlain J, Poesio M, Kruschwitz U (2016) Phrase detectives corpus 1.0 crowdsourced anaphoric conference. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pp 2039–2046. https://www.aclweb.org/anthology/L16-1323

  • Chapman P, Clinton J, Kerber R, Khabaza T, Reinartz T, Shearer C, Wirth R (2000) CRISP-DM 1.0: Step-by-step data mining guide. SPSS inc

  • Chen W-T, Styler W (2013) Anafora: a web-based general purpose annotation tool. In: Proceedings of the 2013 NAACL HLT Demonstration Session, pp 14–19. https://www.aclweb.org/anthology/N13-3004

  • Craja P, Kim A, Lessmann S (2020) Deep learning for detecting financial statement fraud. Decis Supp Syst 139:113421. https://doi.org/10.1016/j.dss.2020.113421

    Article  Google Scholar 

  • Cybulska A, Vossen P (2014) Using a sledgehammer to crack a nut? Lexical diversity and event coreference resolution. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pp 4545–4552. http://www.lrec-conf.org/proceedings/lrec2014/pdf/840_Paper.pdf

  • Day D, Goldschen A, Henderson J (2000) A framework for cross-document annotation. In: Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00). http://www.lrec-conf.org/proceedings/lrec2000/pdf/201.pdf

  • de Castilho R, Mújdricza-Maydt É, Yimam SM, Hartmann S, Gurevych I, Frank A, Biemann C (2016) A web-based tool for the integrated annotation of semantic and syntactic structures. In: Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH), pp 76–84. https://www.aclweb.org/anthology/W16-4011

  • De Mauro A, Greco M, Grimaldi M (2016) A formal definition of Big Data based on its essential features. Libr Rev 65(3):122–135

    Article  Google Scholar 

  • DeMarco T (2001) Structure analysis and system specification. In: Pioneers and Their Contributions to Software Engineering: Sd&m Conference on Software Pioneers, Bonn, June 28/29, 2001, Original Historic Contributions, pp 255–288

  • Dhani JS, Bhatt R, Ganesan B, Sirohi P, Bhatnagar V (2021) Similar cases recommendation using legal knowledge graphs. arXiv Preprint. ArXiv:2107.04771

  • Dongen B, Medeiros A, Verbeek H, Weijters A, Aalst W (2005) The ProM framework: a new era in process mining tool support. Lect Notes Comput Sci 3536:444–454. https://doi.org/10.1007/11494744_25

    Article  MathSciNet  MATH  Google Scholar 

  • Douka S, Abdine H, Vazirgiannis M, Hamdani RE, Amariles DR (2021) JuriBERT: a masked-language model adaptation for french legal text.

  • Dumas M, La Rosa M, Mendling J, Reijers HA (2013) Fundamentals of business process management. Springer, Berlin. https://doi.org/10.1007/978-3-642-33143-5

    Book  Google Scholar 

  • Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery in databases. AI Mag 17(3):37–37

    Google Scholar 

  • FCA (2015) Regulatory sandbox. https://www.fca.org.uk/firms/innovation/regulatory-sandbox. Accessed 13 Apr 2023

  • Financial Conduct Authority (2016) Call for input on supporting the development and adopters of RegTech. Feedback Statement FS16/4, London

    Google Scholar 

  • Frigo ML, Anderson RJ (2009) A strategic framework for governance, risk, and compliance. Strateg Finance 90(8):20

    Google Scholar 

  • Gane C, Sarson T (1977) Structured systems analysis: tools and techniques. McDonnell Douglas Systems Integration Company

    Google Scholar 

  • Girardi C, Speranza M, Sprugnoli R, Tonelli S (2014) CROMER: a tool for cross-document event and entity conference. In : Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pp 3204–3208. http://www.lrec-conf.org/proceedings/lrec2014/pdf/726_Paper.pdf

  • Goodfellow I, Bengio Y, Courville A (2016) Deep Learning. MIT Press

    MATH  Google Scholar 

  • Gozman D, Currie W (2015) Managing governance, risk, and compliance for post-crisis regulatory change: a model of IS capabilities for financial organizations. Hawaii Int Conf Syst Sci. https://doi.org/10.1109/HICSS.2015.555

    Article  Google Scholar 

  • Grosman JS, Furtado PHT, Rodrigues AMB, Schardong GG, Barbosa SDJ, Lopes HCV (2020) Eras: improving the quality control in the annotation process for natural language processing tasks. Inf Syst 93:101553. https://doi.org/10.1016/j.is.2020.101553

    Article  Google Scholar 

  • Haelterman H (2022) Breaking silos of legal and regulatory risks to outperform traditional compliance approaches. Eur J Crim Policy Res 28(1):19–36. https://doi.org/10.1007/s10610-020-09468-x

    Article  Google Scholar 

  • Hajek P, Henriques R (2017) Mining corporate annual reports for intelligent detection of financial statement fraud – a comparative study of machine learning methods. Knowl-Based Syst 128:139–152. https://doi.org/10.1016/j.knosys.2017.05.001

    Article  Google Scholar 

  • Hamdani RE, Mustapha M, Amariles DR, Troussel A, Meeùs S, Krasnashchok K (2021) A combined rule-based and machine learning approach for automated GDPR compliance checking. Proc Eighteenth Int Conf Artif Intel Law. https://doi.org/10.1145/3462757.3466081

    Article  Google Scholar 

  • Hasić F, Vanthienen J (2019) Complexity metrics for DMN decision models. Comput Stand Interfaces 65:15–37. https://doi.org/10.1016/j.csi.2019.01.006

    Article  Google Scholar 

  • Hayashi Y (2022) Emerging trends in deep learning for credit scoring: a review. Electronics 11(19):19. https://doi.org/10.3390/electronics11193181

    Article  Google Scholar 

  • He P, Gao J, Chen W (2021) DeBERTaV3: improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sharing.

  • Hendrycks D, Burns C, Chen A, Ball S (2021) CUAD: an expert-annotated NLP dataset for legal contract review (arXiv:2103.06268). arXiv. http://arxiv.org/abs/2103.06268

  • Hogan A, Blomqvist E, Cochez M, d’Amato C, Melo G de, Gutiérrez C, Gayo JEL, Kirrane S, Neumaier S, Polleres A, Navigli R, Ngomo A-CN, Rashid SM, Rula A, Schmelzeisen L, Sequeda JF, Staab S, Zimmermann A (2020) Knowledge graphs. CoRR, abs/2003.02320. https://arxiv.org/abs/2003.02320

  • Hong J, Voss C, Manning C (2021) Challenges for information extraction from dialogue in criminal law. Proc Workshop NLP Posit Impact. https://doi.org/10.18653/v1/2021.nlp4posimpact-1.8

    Article  Google Scholar 

  • Hu VC, Ferraiolo D, Kuhn R, Friedman AR, Lang AJ, Cogdell MM, Schnitzer A, Sandlin K, Miller R, Scarfone K et al (2013) Guide to attribute based access control (abac) definition and considerations (draft). NIST Spec Publ 800(162):1–54

    Google Scholar 

  • Ilin I, Voronova O, Pavlov D, Kochkarov A, Tick A, Khusainov B (2023) System of project management at a medical hub as an instrument for implementation of open innovation. Systems. https://doi.org/10.3390/systems11040182

    Article  Google Scholar 

  • Joe CV, Sugi SSS (2022) Comprehensive analysis of content defined de-duplication approaches for big data storage. In: 2022 Sixth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), pp 454–458

  • Joshi JBD, Aref WG, Ghafoor A, Spafford EH (2001) Security models for web-based applications. Commun ACM 44(2):38–44. https://doi.org/10.1145/359205.359224

    Article  Google Scholar 

  • Kaur R, Chana I, Bhattacharya J (2018) Data deduplication techniques for efficient cloud storage management: a systematic review. J Supercomput 74(5):2035–2085. https://doi.org/10.1007/s11227-017-2210-8

    Article  Google Scholar 

  • Khan RQ, Corney M, Clark AJ, Mohay GM (2010) Transaction mining for fraud detection in ERP Systems. Ind Eng Manag Syst. https://doi.org/10.7232/iems.2010.9.2.141

    Article  Google Scholar 

  • Khatri V, Brown CV (2010) Designing data governance. Commun ACM 53(1):148–152. https://doi.org/10.1145/1629175.1629210

    Article  Google Scholar 

  • Kiesel J, Wachsmuth H, Al-Khatib K, Stein B (2017) WAT-SL: a customizable web annotation tool for segment labeling. In: Blunsom P, Koller A, Lapata M (eds) Software demonstrations at the 15th conference of the european chapter of the association for computational linguistics (EACL 2017). Springer, pp 13–16

    Google Scholar 

  • Kim M-Y, Rabelo J, Okeke K, Goebel R (2022) Legal information retrieval and entailment based on BM25, transformer and semantic thesaurus methods. Rev Socionetwork Strateg 16(1):157–174

    Article  Google Scholar 

  • Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. CoRR, abs/1609.02907. http://arxiv.org/abs/1609.02907

  • Kummerfeld JK (2019) SLATE: a super-lightweight annotation tool for experts. Proc Annu Meet Assoc Comput Linguist. https://doi.org/10.18653/v1/P19-3002

    Article  Google Scholar 

  • Labib N, Rizka M, Shokry A (2020) Survey of machine learning approaches of anti-money laundering techniques to counter terrorism finance. Springer. https://doi.org/10.1007/978-981-15-3075-3_5

    Book  Google Scholar 

  • Leitner E, Rehm G, Moreno-Schneider J (2019) Fine-grained named entity recognition in legal documents. In: Acosta M, Cudré-Mauroux P, Maleshkova M, Pellegrini T, Sack H, Sure-Vetter Y (eds) Semantic systems. The power of AI and knowledge graphs. Springer International Publishing, pp 272–287

    Chapter  Google Scholar 

  • Leitner E, Rehm G, Moreno-Schneider J (2020) A dataset of german legal documents for named entity recognition. In: Proceedings of The 12th Language Resources and Evaluation Conference, pp 4478–4485. https://www.aclweb.org/anthology/2020.lrec-1.551

  • Li BZ, Stanovsky G, Zettlemoyer L (2020) Active learning for coreference resolution using discrete annotation. Proc Annu Meet Assoc Comput Linguist. https://doi.org/10.18653/v1/2020.acl-main.738

    Article  Google Scholar 

  • Liaw KT (2021) The Routledge handbook of FinTech. Routledge

    Google Scholar 

  • Lopes T, Guerreiro S (2023) Assessing business process models: a literature review on techniques for BPMN testing and formal verification. Bus Process Manag J 29(8):133–162

    Article  Google Scholar 

  • Lopez de Prado M (2018) Advances in financial machine learning. John Wiley

    Google Scholar 

  • Louzada F, Ara A, Fernandes GB (2016) Classification methods applied to credit scoring: systematic review and overall comparison. Surv Oper Res Manag Sci 21(2):117–134. https://doi.org/10.1016/j.sorms.2016.10.001

    Article  MathSciNet  Google Scholar 

  • Ly LT, Maggi FM, Montali M, Rinderle-Ma S, der Aalst WMP (2015) Compliance monitoring in business processes: functionalities, application, and tool-support. Inf Syst 54:209–234. https://doi.org/10.1016/j.is.2015.02.007

    Article  Google Scholar 

  • Mahajan V, Venugopal VK, Murugavel M, Mahajan H (2020) The algorithmic audit: working with vendors to validate radiology-AI algorithms—how we do it. Acad Radiol 27(1):132–135. https://doi.org/10.1016/j.acra.2019.09.009

    Article  Google Scholar 

  • Markov A, Seleznyova Z, Lapshin V (2022) Credit scoring methods: latest trends and points to consider. J Finance Data Sci 8:180–201. https://doi.org/10.1016/j.jfds.2022.07.002

    Article  Google Scholar 

  • Martínez-Plumed F, Contreras-Ochando L, Ferri C, Hernández-Orallo J, Kull M, Lachiche N, Ramírez-Quintana MJ, Flach P (2019) CRISP-DM twenty years later: From data mining processes to data science trajectories. IEEE Trans Knowl Data Eng 33(8):3048–3061

    Article  Google Scholar 

  • Mathet Y (2017) The agreement measure γcat a complement to γ focused on categorization of a continuum. Comput Linguist 43(3):661–681. https://doi.org/10.1162/COLI_a_00296

    Article  MathSciNet  Google Scholar 

  • Mathet Y, Widlöcher A, Métivier J-P (2015) The unified and holistic method gamma (γ) for inter-annotator agreement measure and alignment. Comput Linguist 41(3):437–479. https://doi.org/10.1162/COLI_a_00227

    Article  MathSciNet  Google Scholar 

  • Mayhew S, Roth D (2018) TALEN: tool for annotation of low-resource entities. Proc ACL Syst Demonstr. https://doi.org/10.18653/v1/P18-4014

    Article  Google Scholar 

  • McCabe TJ (1976) A complexity measure. IEEE Trans Softw Eng 2(4):308–320. https://doi.org/10.1109/TSE.1976.233837

    Article  MathSciNet  MATH  Google Scholar 

  • Medvedeva M, Vols M, Wieling M (2020) Using machine learning to predict decisions of the European Court of Human Rights. Artif Intell Law 28(2):237–266. https://doi.org/10.1007/s10506-019-09255-y

    Article  Google Scholar 

  • Michelberger P, Kemendi Á (2020) Data, information and it security-software support for security activities. Probl Manag Twenthy First Century 15(2):108–124

    Article  Google Scholar 

  • Micheler E, Whaley A (2020) Regulatory technology: replacing law with computer code. Eur Bus Organ Law Rev 21:349–377

    Article  Google Scholar 

  • Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781

  • Mitchell TM et al (2007) Machine learning (vol 1). McGraw-hill New York

    Google Scholar 

  • Moreno-Schneider J, Rehm G, Montiel-Ponsoda E, Rodriguez-Doncel V, Revenko A, Karampatakis S, Khvalchik M, Sageder C, Gracia J, Maganza F (2020) Orchestrating NLP services for the legal domain. ArXiv:2003.12900 [Cs]. http://arxiv.org/abs/2003.12900

  • Narouei M, Khanpour H, Takabi H, Parde N, Nielsen R (2017) Towards a top-down policy engineering framework for attribute-based access control. In: Proceedings of the 22nd ACM on Symposium on Access Control Models and Technologies, pp 103–114

  • Navas-Loro M, Rodríguez-Doncel V, Pinto D, Singh V, Perez F (2020) Annotador: A Temporal Tagger for Spanish. J Intell Fuzzy Syst 39(2):1979–1991. https://doi.org/10.3233/JIFS-179865

    Article  Google Scholar 

  • Neves M, Ševa J (2021) An extensive review of tools for manual annotation of documents. Brief Bioinform 22(1):146–163. https://doi.org/10.1093/bib/bbz130

    Article  Google Scholar 

  • Ni Q, Bertino E, Lobo J, Brodie C, Karat C-M, Karat J, Trombeta A (2010) Privacy-aware role-based access control. ACM Trans Inf Syst Secur. https://doi.org/10.1145/1805974.1805980

    Article  Google Scholar 

  • Nicho M, Khan S, Rahman MSMK (2017) Managing information security risk using integrated governance risk and compliance. Int Conf Comput Appl (ICCA) 2017:56–66. https://doi.org/10.1109/COMAPP.2017.8079741

    Article  Google Scholar 

  • Nicholls J, Kuppa A, Le-Khac N-A (2021) Financial cybercrime: a comprehensive survey of deep learning approaches to tackle the evolving financial crime landscape. IEEE Access 9:163965–163986. https://doi.org/10.1109/ACCESS.2021.3134076

    Article  Google Scholar 

  • Oberle B (2018) SACR: a drag-and-drop based tool for coreference annotation. In: Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T (eds) Proceedings of the eleventh international conference on language resources and evaluation (LREC 2018). European Language Resources Association (ELRA)

    Google Scholar 

  • Oliveira D, d’Aquin M (2019) ADOG - Annotating data with ontologies and graphs. In: Jiménez-Ruiz E, Hassanzadeh O, Srinivas K, Efthymiou V, Chen J (eds) Proceedings of the semantic web challenge on tabular data to knowledge graph matching co-located with the 18th international semantic web conference, SemTab@ISWC 2019, Auckland, New Zealand, October 30, 2019. CEUR-WS.org, pp 1–6

    Google Scholar 

  • Omg OM, Parida R, Mahapatra S (2011) Business process model and notation (bpmn) version 2.0. Object Management Group. Accessed 5 Aug 2023

  • Papantoniou AA (2022) Regtech: steering the regulatory spaceship in the right direction? J Bank Financ Technol 6(1):1–16. https://doi.org/10.1007/s42786-022-00038-9

    Article  Google Scholar 

  • Parnas DL (1972) On the criteria to be used in decomposing systems into modules. Commun ACM 15(12):1053–1058

    Article  Google Scholar 

  • Paula EL, Ladeira MB, Carvalho RN, Marzagão T (2016) Deep learning anomaly detection as support fraud investigation in brazilian exports and anti-money laundering. In: 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pp 954–960

  • Petri CA (1962) Kommunikation mit automaten. Accessed 5 Aug 2023

  • PMI, P. M. I. (2023) RegTech Market is estimated to be US$ 57.5 billion by 2032 with a CAGR of 8.2% over the forecast period (2022–2032)—By PMI. GlobeNewswire News Room. https://www.globenewswire.com/en/news-release/2023/01/12/2587883/0/en/RegTech-Market-is-estimated-to-be-US-57-5-billion-by-2032-with-a-CAGR-of-8-2-over-the-forecast-period-2022-2032-By-PMI.html

  • Ponemon Institute (2017) The true cost of compliance with data protection regulations: Benchmark study of multinational organizations. https://static.helpsystems.com/globalscape/pdfs/guides/gs-true-cost-of-compliance-data-protection-regulations-gd.pdf

  • Poudyal P, Šavelka J, Ieven A, Moens MF, Gonçalves T, Quaresma P (2020) Echr: legal corpus for argument mining. In: Proceedings of the 7th Workshop on Argument Mining, pp 67–75

  • Rabelo J, Kim M-Y, Goebel R, Yoshioka M, Kano Y, Satoh K (2020) A Summary of the COLIEE 2019 Competition. In: Sakamoto M, Okazaki N, Mineshima K, Satoh K (eds) New frontiers in artificial intelligence. Springer International Publishing, pp 34–49

    Chapter  Google Scholar 

  • Racz N, Weippl E, Seufert A (2010a) A frame of reference for research of integrated governance, risk and compliance (GRC). In: De Decker B, Schaumüller-Bichl I (eds) Communications and multimedia security. Springer, Berlin Heidelberg, pp 106–117

  • Racz N, Weippl E, Seufert A (2010b) A frame of reference for research of integrated governance, risk and compliance (GRC). In: De Decker B, Schaumüller-Bichl I (eds) Communications and multimedia security. Springer, Berlin, pp 106–117

    Chapter  Google Scholar 

  • Rashid Z (2019) Technology-enabled collaborative compliance. The RegTech Book. John Wiley & Sons, Ltd. https://doi.org/10.1002/9781119362197.part1

    Chapter  Google Scholar 

  • Rath CK, Mandal AK, Sarkar A (2023) Data quality driven design patterns for internet of things. In: Chaki R, Cortesi A, Saeed K, Chaki N (eds) Applied computing for software and smart systems. Springer Nature, Singapore, pp 285–303

    Chapter  Google Scholar 

  • Ratner A, Bach SH, Ehrenberg H, Fries J, Wu S, Ré C (2017) Snorkel: Rapid training data creation with weak supervision. Proc VLDB Endow 11(3):269–282. https://doi.org/10.14778/3157794.3157797

    Article  Google Scholar 

  • Reichert M (2011) What BPM technology can do for healthcare process support. Artif Intell Med 13:2–13

    Article  Google Scholar 

  • Reijers HA, Vanderfeesten IT (2004) Cohesion and coupling metrics for workflow process design. Bus Process Manag 2:290–305

    Google Scholar 

  • Restrepo Amariles DR, Winkler MM (2018) US economic sanctions and the corporate compliance of foreign banks. Int Lawyer 51(3):497–536

    Google Scholar 

  • Restrepo-Amariles D, Lewkowicz G (2020) Unpacking smart law: how mathematics and algorithms are reshaping the legal code in the financial sector. Lex Electr 25(3):171–185

    Google Scholar 

  • Rodríguez-Doncel V, Montiel-Ponsoda E (2020) Lynx: towards a legal knowledge graph for multilingual Europe. Law Context 37(1):175–178

    Google Scholar 

  • Rojo MG, Rolón E, Calahorra L, García FÓ, Sánchez RP, Ruiz F, Ballester N, Armenteros M, Rodríguez T, Espartero RM (2008) Implementation of the Business Process Modelling Notation (BPMN) in the modelling of anatomic pathology processes. Diagn Pathol 3(1):S22. https://doi.org/10.1186/1746-1596-3-S1-S22

    Article  Google Scholar 

  • Ruggeri F, Lagioia F, Lippi M, Torroni P (2022) Detecting and explaining unfairness in consumer contracts through memory networks. Artif Intell Law 30(1):59–92. https://doi.org/10.1007/s10506-021-09288-2

    Article  Google Scholar 

  • Sadiq S, Governatori G, Namiri K (2007) Modeling control objectives for business process compliance. In: Alonso G, Dadam P, Rosemann M (eds) Business process management. Springer, pp 149–164. https://doi.org/10.1007/978-3-540-75183-0_12

    Chapter  Google Scholar 

  • Sandhu RS, Coyne EJ, Feinstein HL, Youman CE (1996) Role-based access control models. Computer 29(2):38–47

    Article  Google Scholar 

  • Savelka J, Westermann H, Benyekhlef K, Alexander CS, Grant JC, Amariles DR, Hamdani RE, Meeùs S, Troussel A, Araszkiewicz M, Ashley KD, Ashley A, Branting K, Falduti M, Grabmair M, Harašta J, Novotná T, Tippett E, Johnson S (2021) Lex Rosetta: transfer of predictive models across languages, jurisdictions, and legal domains. Proc Eighteenth Int Conf Artif Intell Law. https://doi.org/10.1145/3462757.3466149

    Article  Google Scholar 

  • Schröer C, Kruse F, Gómez JM (2021) A systematic literature review on applying CRISP-DM process model. Proced Comput Sci 181:526–534

    Article  Google Scholar 

  • Schwabe D, Laufer C, Casanovas P (2020) Knowledge graphs: trust, privacy, and transparency from a legal governance approach. Law Context 37:24

    Google Scholar 

  • Shearer C (2000) The CRISP-DM model: the new blueprint for data mining. J Data Warehous 5(4):13–22

    Google Scholar 

  • Shindo H, Munesada Y, Matsumoto Y (2018) PDFAnno: a web-based linguistic annotation tool for PDF documents. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). LREC 2018, Miyazaki, Japan. https://www.aclweb.org/anthology/L18-1175

  • Shulayeva O, Siddharthan A, Wyner A (2017) Recognizing cited facts and principles in legal judgements. Artif Intell Law 25(1):107–126. https://doi.org/10.1007/s10506-017-9197-6

    Article  Google Scholar 

  • Sillaber C, Mussmann A, Breu R (2019) Experience: data and information quality challenges in governance, risk, and compliance management. J Data Inf Qual. https://doi.org/10.1145/3297721

    Article  Google Scholar 

  • Singhal A (2012) Introducing the knowledge graph: things, not strings. https://blog.google/products/search/introducing-knowledge-graph-things-not/

  • Spanaki K, Papazafeiropoulou A (2013) Analysing the governance, risk and compliance (GRC) implementation process: primary insights. ECIS 2013 Completed Research, pp 58

  • Studer S, Bui TB, Drescher C, Hanuschkin A, Winkler L, Peters S, Müller K-R (2021) Towards CRISP-ML(Q): a machine learning process model with quality assurance methodology. Mach Learn Knowl Extr 3(2):392–413. https://doi.org/10.3390/make3020020

    Article  Google Scholar 

  • Sutton RT, Zaiane OR, Goebel R, Baumgart DC (2022) Artificial intelligence enabled automated diagnosis and grading of ulcerative colitis endoscopy images. Sci Rep 12(1):1–10

    Article  Google Scholar 

  • Tagarelli A, Simeri A (2021) Unsupervised law article mining based on deep pre-trained language representation models with application to the Italian civil code. Artif Intell Law. https://doi.org/10.1007/s10506-021-09301-8

    Article  Google Scholar 

  • Tang M, Su C, Chen H, Qu J, Ding J (2020) SALKG: a semantic annotation system for building a high-quality legal knowledge graph. IEEE Int Conf Big Data. https://doi.org/10.1109/BigData50022.2020.9378107

    Article  Google Scholar 

  • Teichmann F, Boticiu S, Sergi BS (2023) RegTech – Potential benefits and challenges for businesses. Technol Soc 72:100. https://doi.org/10.1016/j.techsoc.2022.102150

    Article  Google Scholar 

  • Terdalkar H, Bhattacharya A (2021) Sangrahaka: a tool for annotating and querying knowledge graphs. Proc ACM Jt Meet Eur Softw Eng Conf Symp Found Softw Eng. https://doi.org/10.1145/3468264.3473113

    Article  Google Scholar 

  • The Institute of International Finance (IIF) (2016) The Institute of International Finance (IIF). Digitizing intelligence: AI, Robots and the future of finance. https://www.iif.com/portals/0/Files/private/ai_report_copy.pdf

  • Treleaven P, Barnett J, Knight A, Serrano W (2021) Real estate data marketplace. AI Ethics 1(4):445–462. https://doi.org/10.1007/s43681-021-00053-4

    Article  Google Scholar 

  • Tsipenyuk G, Crowcroft J (2017) An email attachment is worth a thousand words, or is it? CoRR, abs/1709.00362. http://arxiv.org/abs/1709.00362

  • Uren V, Cimiano P, Iria J, Handschuh S, Vargas-Vera M, Motta E, Ciravegna F (2006) Semantic annotation for knowledge management: requirements and a survey of the state of the art. J Web Semant 4(1):14–28. https://doi.org/10.1016/j.websem.2005.10.002

    Article  Google Scholar 

  • Van Der Aalst W, Van Hee KM (2004) Workflow management: Models, methods, and systems. MIT Press

    Google Scholar 

  • Van Der Aalst W (2011a) Process mining: discovery, conformance and enhancement of business processes (vol 2). Springer

  • van der Aalst WMP (2011b) Analyzing Lasagna processes. Process mining: discovery, conformance and enhancement of business processes. Springer, Berlin, pp 277–299. https://doi.org/10.1007/978-3-642-19345-3_11

  • van der Aalst WMP (2011c) Analyzing Spaghetti processes. Process mining: discovery, conformance and enhancement of business processes. Springer, Berlin, pp 301–317. https://doi.org/10.1007/978-3-642-19345-3_12

  • van der Aalst WMP (2013) Business process management: a comprehensive survey. ISRN Softw Eng 2013:1–37. https://doi.org/10.1155/2013/507984

  • van der Weide T, Papadopoulos D, Smirnov O, Zielinski M, van Kasteren T (2017) Versioning for end-to-end machine learning pipelines. Proc Workshop Data Manag End-to-End Mach Learn. https://doi.org/10.1145/3076246.3076248

  • Van Liebergen B et al (2017) Machine learning: a revolution in risk management and compliance? J Financ Transform 45:60–67

  • Vemuri A (2008) Strategic themes in risk and compliance. Finsights 2:2–5

  • Vives X (2017) The impact of FinTech on banking. Eur Econ 2:97–105

  • Waye V (2019) Regtech: a new frontier in legal scholarship. Adel L Rev 40:363

  • Wegener D, Rüping S (2010) On integrating data mining into business processes. Bus Inf Syst 13:183–194

  • Weske M (2007) Business process management: concepts, languages, architectures. Springer, Berlin

  • Westermann H, Savelka J, Walker VR, Ashley KD, Benyekhlef K (2019) Computer-assisted creation of boolean search rules for text classification in the legal domain. JURIX

  • Westermann H, Šavelka J, Walker VR, Ashley KD, Benyekhlef K (2020) Sentence embeddings and high-speed similarity search for fast computer assisted annotation of legal documents. In: Villata S, Harašta J, Křemen P (eds) Frontiers in artificial intelligence and applications. IOS Press. https://doi.org/10.3233/FAIA200860

  • Westermann H, Savelka J, Walker V, Ashley K, Benyekhlef K (2022) Data-centric machine learning in the legal domain. arXiv preprint arXiv:2201.06653

  • Xin D, Macke S, Ma L, Liu J, Song S, Parameswaran A (2018) HELIX: Holistic optimization for accelerating iterative machine learning. Proc VLDB Endow 12(4):446–460. https://doi.org/10.14778/3297753.3297763

  • Yang D, Li M (2018) Evolutionary approaches and the construction of technology-driven regulations. Emerg Mark Financ Trade 54(14):3256–3271

  • Zaharia M, Chowdhury M, Franklin MJ, Shenker S, Stoica I et al (2010) Spark: Cluster computing with working sets. HotCloud 10(10–10):95

  • Zhang N, Ryan M, Guelev DP (2005) Evaluating access control policies through model checking. Inf Secur 8:446–460

  • Zhou J, Cui G, Hu S, Zhang Z, Yang C, Liu Z, Wang L, Li C, Sun M (2020) Graph neural networks: A review of methods and applications. AI Open 1:57–81. https://doi.org/10.1016/j.aiopen.2021.01.001

  • Zur Muehlen M (2004) Workflow-based process controlling: Foundation, design, and application of workflow-driven process information systems (vol 6). Michael zur Muehlen

Funding

The authors have no funding or grants to declare.

Author information

Corresponding author

Correspondence to Raphaël Gyory.

Ethics declarations

Conflict of interest

All authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gyory, R., Restrepo Amariles, D., Lewkowicz, G. et al. Ant: a process aware annotation software for regulatory compliance. Artif Intell Law (2023). https://doi.org/10.1007/s10506-023-09372-9

  • DOI: https://doi.org/10.1007/s10506-023-09372-9
