Leveraging Anomaly Detection in Business Process with Data Stream Mining

Authors

  • Gabriel Marques Tavares, Universidade Estadual de Londrina (UEL)
  • Victor Guilherme Turrisi da Costa, Universidade Estadual de Londrina (UEL)
  • Vinicius Eiji Martins, Universidade Estadual de Londrina (UEL)
  • Paolo Ceravolo, Università degli Studi di Milano (UNIMI)
  • Sylvio Barbon Jr., Universidade Estadual de Londrina (UEL)

Keywords:

Process Mining, Business Process Modelling, Online, Fraud, Clustering

Abstract

Identifying fraudulent or anomalous business procedures is a key challenge for organisations of any size. Moreover, because business activities run continuously, the data that supports business process monitoring is also acquired continuously. In light of this, we propose a method for online anomaly detection in business processes. From a stream of events, our approach extracts case descriptors and applies a density-based clustering technique to detect outliers. We applied our method to a real-life dataset and evaluated its performance with streaming clustering measures. Exploring different combinations of parameters, we obtained promising performance metrics, showing that our method is capable of finding anomalous process instances in a wide variety of scenarios.
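
As a rough illustration of the pipeline the abstract describes (events arrive as a stream, per-case descriptors are derived, and a density-based criterion flags outliers), the following minimal Python sketch may help. It is not the authors' implementation: the two descriptors (number of events and case duration), the `eps` and `min_pts` parameters, the toy events, and all function names are assumptions chosen for illustration, and a simple batch DBSCAN-style noise test stands in for a true streaming density-based clusterer.

```python
from collections import defaultdict
from math import dist


def case_descriptors(events):
    """Group (case_id, activity, timestamp) events into per-case descriptors.

    Hypothetical descriptor: (number of events, case duration).
    """
    timestamps = defaultdict(list)
    for case_id, _activity, timestamp in events:
        timestamps[case_id].append(timestamp)
    return {
        case_id: (float(len(ts)), float(max(ts) - min(ts)))
        for case_id, ts in timestamps.items()
    }


def density_outliers(points, eps, min_pts):
    """Return the case ids that a DBSCAN-style rule labels as noise.

    A point is a core point if at least min_pts points (itself included)
    lie within eps of it. Noise is any point that is neither a core point
    nor within eps of one.
    """
    neighbours = {
        a: [b for b in points if dist(points[a], points[b]) <= eps]
        for a in points
    }
    core = {a for a, near in neighbours.items() if len(near) >= min_pts}
    return {
        a for a in points
        if a not in core and not any(b in core for b in neighbours[a])
    }


# Toy event stream (made-up data): four similar cases and one unusual case.
events = [
    ("c1", "A", 0), ("c1", "B", 5), ("c1", "C", 10),
    ("c2", "A", 1), ("c2", "B", 6), ("c2", "C", 12),
    ("c3", "A", 2), ("c3", "B", 7), ("c3", "C", 11),
    ("c4", "A", 3), ("c4", "B", 9), ("c4", "C", 14),
    ("c5", "A", 4), ("c5", "B", 5), ("c5", "B", 20),
    ("c5", "B", 35), ("c5", "C", 50), ("c5", "C", 54),
]

outliers = density_outliers(case_descriptors(events), eps=3.0, min_pts=3)
print(outliers)  # prints {'c5'}
```

In an online setting the descriptors would be updated as each event arrives and the clustering would be maintained incrementally rather than recomputed from scratch. The sketch also leaves the descriptors unscaled, which is a simplification.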

References

Aggarwal, C. C., Han, J., Wang, J., and Yu, P. S. (2003). A Framework for Clustering Evolving Data Streams. Proc. of the 29th Int. Conf. on Very Large Data Bases, pages 81–92.

Barbon, S., Tavares, G. M., da Costa, V. G. T., Ceravolo, P., and Damiani, E. (2018). A framework for human-in-the-loop monitoring of concept-drift detection in event log stream. In WWW ’18 Companion: The 2018 Web Conference Companion, April 23–27, 2018, Lyon, France. ACM.

Becker, T. and Intoyoad, W. (2017). Context aware process mining in logistics. Procedia CIRP, 63:557–562. Manufacturing Systems 4.0 – Proceedings of the 50th CIRP Conference on Manufacturing Systems.

Bifet, A., Holmes, G., Kirkby, R., and Pfahringer, B. (2010). Moa: Massive online analysis. Journal of Machine Learning Research, 11(May):1601–1604.

Böhmer, K. and Rinderle-Ma, S. (2017). Anomaly detection in business process runtime behavior – challenges and limitations. arXiv preprint arXiv:1705.06659.

Bose, R. J. C. and van der Aalst, W. M. (2010). Trace alignment in process mining: Opportunities for process diagnostics. In BPM, volume 6336, pages 227–242. Springer.

Cao, F., Ester, M., Qian, W., and Zhou, A. (2006). Density-based clustering over an evolving data stream with noise. In Proceedings of the 2006 SIAM international conference on data mining, pages 328–339. SIAM.

Carmona, J. and Cortadella, J. (2010). Process mining meets abstract interpretation. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 184–199. Springer.

Ceravolo, P., Damiani, E., Torabi, M., and Barbon, S. (2017a). Toward a new generation of log pre-processing methods for process mining. In International Conference on Business Process Management, pages 55–70. Springer.

Domingos, P. and Hulten, G. (2000). Mining high-speed data streams. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 71–80. ACM.

Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, KDD’96, pages 226–231. AAAI Press.

Frigge, M., Hoaglin, D. C., and Iglewicz, B. (1989). Some implementations of the boxplot. The American Statistician, 43(1):50–54.

Gray, G. L. and Debreceny, R. S. (2014). A taxonomy to guide research on the application of data mining to fraud detection in financial statement audits. International Journal of Accounting Information Systems, 15(4):357–380.

Jans, M., van der Werf, J. M., Lybaert, N., and Vanhoof, K. (2011). A business process mining application for internal transaction fraud mitigation. Expert Systems with Applications, 38(10):13351–13359.

Juhaňák, L., Zounek, J., and Rohlíková, L. (2017). Using process mining to analyze students’ quiz-taking behavior patterns in a learning management system. Computers in Human Behavior.

Kilpeläinen, T. and Tyrväinen, P. (2004). The degree of digitalization of the information over-flow: A case study. In ICEIS 2004, Proceedings of the 6th International Conference on Enterprise Information Systems, Porto, Portugal, April 14–17, 2004, pages 367–374.

Kremer, H., Kranen, P., Jansen, T., Seidl, T., Bifet, A., Holmes, G., and Pfahringer, B. (2011). An effective evaluation measure for clustering on evolving data streams. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 868–876. ACM.

Leemans, S. J. J., Fahland, D., and van der Aalst, W. M. P. (2014). Discovering block-structured process models from event logs containing infrequent behaviour. In Lohmann, N., Song, M., and Wohed, P., editors, Business Process Management Workshops, pages 66–78, Cham. Springer International Publishing.

Lévesque, L. (2014). Nyquist sampling theorem: understanding the illusion of a spinning wheel captured with a video camera. Physics Education, 49(6):697–705.

Mannhardt, F., de Leoni, M., Reijers, H. A., and van der Aalst, W. M. P. (2017). Data-driven process discovery - revealing conditional infrequent behavior from event logs. In Dubois, E. and Pohl, K., editors, Advanced Information Systems Engineering, pages 545–560, Cham. Springer International Publishing.

Murata, T. (1989). Petri nets: Properties, analysis and applications. Proceedings of the IEEE, 77(4):541–580.

Ngai, E., Hu, Y., Wong, Y., Chen, Y., and Sun, X. (2011). The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature. Decision Support Systems, 50(3):559–569.

Reisig, W. (1985). Petri Nets: An Introduction, volume 4 of EATCS Monographs on Theoretical Computer Science. Springer.

Rosenberg, A. and Hirschberg, J. (2007). V-measure: A conditional entropy-based external cluster evaluation measure. In EMNLP-CoNLL, volume 7, pages 410–420.

Rozinat, A. and van der Aalst, W. (2008). Conformance checking of processes based on monitoring real behavior. Information Systems, 33(1):64–95.

Tavares, G. M., da Costa, V. G. T., Martins, V. E., Ceravolo, P., and Barbon, S. (2018). Anomaly detection in business process based on data stream mining. In SBSI’18: XIV Brazilian Symposium on Information Systems, June 4–8, 2018, Caxias do Sul, Brazil. ACM.

Valle, A. M., Santos, E. A., and Loures, E. R. (2017). Applying process mining techniques in software process appraisals. Information and Software Technology, 87:19–31.

van der Aalst, W., Weijters, T., and Maruster, L. (2004). Workflow mining: discovering process models from event logs. IEEE Transactions on Knowledge and Data Engineering, 16(9):1128–1142.

van der Aalst, W. M. P. (2011). Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer Publishing Company, Incorporated, 1st edition.

Wagner, R. A. and Fischer, M. J. (1974). The string-to-string correction problem. J. ACM, 21(1):168–173.

Wang, J., Wong, R. K., Ding, J., Guo, Q., and Wen, L. (2012). On recommendation of process mining algorithms. In Web Services (ICWS), 2012 IEEE 19th International Conference on, pages 311–318. IEEE.

West, J. and Bhattacharya, M. (2016). Intelligent financial fraud detection: a comprehensive review. Computers & Security, 57:47–66.

Witten, I. H., Frank, E., Hall, M. A., and Pal, C. J. (2016). Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann.

Yang, W.-S. and Hwang, S.-Y. (2006). A process-mining framework for the detection of healthcare fraud and abuse. Expert Systems with Applications, 31(1):56–68.

Published

2019-04-17

How to Cite

Tavares, G. M., Turrisi da Costa, V. G., Martins, V. E., Ceravolo, P., & Barbon Jr., S. (2019). Leveraging Anomaly Detection in Business Process with Data Stream Mining. ISys - Brazilian Journal of Information Systems, 12(1), 54–75. Retrieved from https://seer.unirio.br/isys/article/view/7877

Section

EXTENDED VERSIONS FROM SELECTED PAPERS