View a multi-dimensional time sequence database file with the MD Sequence Database Viewer (SPMF documentation)

Multi-dimensional time sequence databases are a type of data taken as input by some algorithms offered in SPMF such as the Fournier-Viger 2008 algorithm .

Put simply, a multi-dimensional timesequence database is a set of sequences that are annotated with some attributes (called dimensions), and that also contain timestamps.

SPMF offers a tool to view the content of a multi-dimensional time sequence database. This tool is called the MD Sequence Database Viewer.

This page explains how to use this tool with an example.

How to run this example?

If you want to run this example from the graphical interface of SPMF, (1) choose the algorithm "Open_md_time_sequence_database_with_viewer", (2) choose the ContextMDSequence.txt file as input, and then (3) click "run algorithm" .

graph viewer open

What is displayed?

After running the example, the content of the file will be displayed by the tool. The picture below shows the user interface of this viewer.

The window A) show in the picture below is the main window. It displays the multi-dimensional time sequence database using a table. The table has four rows in this example. Each row is a multi-dimensional sequence that contains two parts: dimension values (called an MD-pattern) and a sequence.

Take the first row as example. It represents a multi-dimensional sequence
The cell in the first column indicates that the values for the three dimensions of this multi-dimensional sequence are 1, 1 and 1, respectively.
The cell in the second column indicates that the sequence is defined as follows. First, two items 2 and 4 were observed at time 0, followed by the item 3 at time 1, followed by the item 2 at time 2, and finally followed by the item 1 at time 3.

The other multi-dimensional sequences follow the same format.

Besides, there are buttons that provides additional features:

graph viewer database graph

What is the input?

The algorithm takes as input a multi-dimensional time sequence database, as used by algorithm such SEQ-DIM and DIM-SEQ .

The database used in this example is provided in the text file "ContextMDSequence.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

The input file format is defined as follows. It is a text file where each line represents a multi-dimensional time-extended sequence from a multi-dimensional time-extended sequence database. Each line consists of two parts.

For example, this is the content of the example file "ContextMDSequence.txt":

1 1 1 -3 <0> 2 4 -1 <1> 3 -1 <2> 2 -1 <3> 1 -1 -2
1 2 2 -3 <0> 2 6 -1 <1> 3 5 -1 <2> 6 7 -1 -2
1 2 1 -3 <0> 1 8 -1 <1> 1 -1 <2> 2 -1 <3> 6 -1 -2
* 3 3 -3 <0> 2 5 -1 <1> 3 5 -1 -2

Consider the second line. It indicates that the second multi-dimensional time-extended sequence of this database has the dimension values 1, 2 and 2. Furthermore, the first itemset is {2, 4} with a timestamp of 0. Then, the item 3 appears with a timestamp of 1. Then the item 2 appears with a timestamp of 2. Finally, the item 1 appears with a timestamp of 3. The other sequence follows the same format. Note that timestamps do not need to be consecutive integers. But they should increase for each succesive itemset within a same sequence.