Downloading event logs via API

This example shows how to download well-known process mining event logs from the 4TU.Centre for Research Data using the skpm.event_logs module.

The skpm.event_logs module provides a set of event logs, such as Sepsis and BPI 2012.

The API overview

Implementing each event log as a class is a design choice that lets us handle each log according to its specific characteristics. One of the main challenges in process mining is that datasets differ widely in nature, since each one reflects its own particular business rules.

For instance, an unbiased split of event logs was proposed in [1]. Roughly speaking, each event log is split based on specific temporal characteristics, which are hard-coded for each event log. You can check this feature in :ref:`Unbiased split <sphx_glr_auto_examples_unbiased_split.py>`. The next section shows how to download an event log.
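To give a rough feel for what a time-based split of an event log involves, here is a minimal pandas sketch. It is neither the procedure from [1] nor the skpm implementation: the function name `temporal_case_split`, the single-cutoff rule, and the toy data are assumptions made purely for illustration. Cases that end before the cutoff go to the training set, cases that start at or after it go to the test set, and cases that straddle the cutoff are dropped so that no case contributes to both sides.

```python
import pandas as pd


def temporal_case_split(df, cutoff, case_col="case:concept:name",
                        time_col="time:timestamp"):
    """Hypothetical helper: split an event log by case around one cutoff.

    The cutoff must match the timezone-awareness of the timestamp column.
    """
    cutoff = pd.Timestamp(cutoff)
    # First and last event timestamp of every case.
    bounds = df.groupby(case_col)[time_col].agg(["min", "max"])
    train_cases = bounds.index[bounds["max"] < cutoff]
    test_cases = bounds.index[bounds["min"] >= cutoff]
    # Cases that start before and end after the cutoff are in neither set.
    train = df[df[case_col].isin(train_cases)]
    test = df[df[case_col].isin(test_cases)]
    return train, test


# Toy event log (invented data, not taken from any real dataset).
toy_log = pd.DataFrame({
    "case:concept:name": ["a", "a", "b", "b", "c", "c"],
    "concept:name": ["Queued", "Accepted", "Queued", "Completed",
                     "Queued", "Accepted"],
    "time:timestamp": pd.to_datetime([
        "2021-01-01", "2021-01-05",
        "2021-01-08", "2021-01-20",
        "2021-01-15", "2021-01-18",
    ]),
})

train, test = temporal_case_split(toy_log, "2021-01-10")
```

In this toy log, case "a" ends before the cutoff, case "c" starts after it, and case "b" straddles it and is discarded. A per-log class can store its own cutoff and any further rules, which is the kind of log-specific detail the class-based design makes room for.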

Downloading the BPI 2013 event log

The BPI 2013 event log is a well-known event log that contains data about closed problems from Volvo IT Belgium. It can be downloaded as follows:

[1]:
from skpm.event_logs import BPI13ClosedProblems

bpi13 = BPI13ClosedProblems() # automatically downloads and caches the file
bpi13
[1]:
BPI13ClosedProblems Event Log
    Cases: 1,487
    Events: 6,660
    Activities: 4

Note that the __repr__ method returns a brief overview of the event log. To access the underlying dataframe, use the dataframe attribute.

[2]:
bpi13.dataframe.head()
[2]:
   org:group    resource country  organization country  org:resource  organization involved  org:role  concept:name  impact  product  lifecycle:transition  time:timestamp             case:concept:name
0  Org line A2  INDIA             se                    Minnie        J11 2nd                A2_2      Queued        High    PROD191  Awaiting Assignment   2006-01-11 14:49:42+00:00  1-109135791
1  Org line A2  Sweden            cn                    Tomas         M1 2nd                 A2_2      Accepted      Medium  PROD753  In Progress           2006-11-07 09:00:36+00:00  1-147898401
2  Org line A2  Sweden            cn                    Tomas         M1 2nd                 A2_2      Accepted      Medium  PROD753  In Progress           2006-11-07 12:05:44+00:00  1-147898401
3  Org line A2  Sweden            cn                    Tomas         M1 2nd                 A2_2      Accepted      Medium  PROD753  In Progress           2007-03-20 08:06:25+00:00  1-165554831
4  Org line A2  Sweden            cn                    Tomas         M1 2nd                 A2_2      Accepted      Low     PROD753  In Progress           2007-05-10 14:21:54+00:00  1-172473423
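The case, event, and activity counts shown in the overview can be recomputed from any dataframe that uses the column names above. The sketch below is a plain pandas illustration, not skpm code: `summarize_log` is a hypothetical helper, and whether skpm derives its overview this way is an assumption. The toy frame reuses the case identifiers and activity names from the five rows printed above.

```python
import pandas as pd


def summarize_log(df, case_col="case:concept:name",
                  activity_col="concept:name"):
    """Hypothetical helper: count cases, events, and distinct activities."""
    return {
        "cases": df[case_col].nunique(),            # distinct case identifiers
        "events": len(df),                          # one row per event
        "activities": df[activity_col].nunique(),   # distinct activity labels
    }


# The two relevant columns of the five rows shown in the output above.
head = pd.DataFrame({
    "case:concept:name": ["1-109135791", "1-147898401", "1-147898401",
                          "1-165554831", "1-172473423"],
    "concept:name": ["Queued", "Accepted", "Accepted", "Accepted", "Accepted"],
})

summary = summarize_log(head)
print(summary)
```

Applied to the full dataframe instead of these five rows, the same function can be used to cross-check the figures reported in the overview.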

In this tutorial, we showed how to use our API to automatically download event logs from the `4TU Repository <https://data.4tu.nl/>`_. We hope you find it useful for your projects.

References

[1] Hans Weytjens, Jochen De Weerdt. Creating Unbiased Public Benchmark Datasets with Data Leakage Prevention for Predictive Process Monitoring, 2021. doi: 10.1007/978-3-030-94343-1_2