The repository

This page is a repository of videos, pictures, scripts, and data that complement my PhD dissertation. The scripts are provided only as a proof of concept and are intended solely to demonstrate the ideas.

Useful Scripts

To handle routine tasks and frequent analyses, I created a toolbox of small but handy Python and Bash scripts. Take a look at them here and feel free to use them as you wish. Python/Bash script

Repetitive patterns of system logs

Looking at the system logs in 1-hour time windows reveals a strong resemblance between the windows. Frame-by-frame
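As a rough illustration of the idea, the sketch below groups log entries into 1-hour windows and compares two windows by the overlap of their messages. It is not the script used in the dissertation: the function names, the `(timestamp, message)` input shape, and the choice of Jaccard similarity are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime


def hourly_windows(entries):
    """Group (timestamp, message) pairs into sets of messages per hour.

    The input shape is an assumption for this sketch; real syslog lines
    would have to be parsed into datetime objects first.
    """
    windows = defaultdict(set)
    for timestamp, message in entries:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        windows[hour].add(message)
    return dict(windows)


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Comparing `jaccard(windows[h1], windows[h2])` for consecutive hours gives a single number per pair of windows; values close to 1 would indicate that the two hours contain nearly the same messages.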

Identifiable symptoms before failures

In many cases, there are visible symptoms before the occurrence of node failures. More examples

Extracting syslog patterns

System logs are automatically clustered according to their similarity, and the relevant syslog patterns are extracted. Try it
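To give a feel for similarity-based clustering, here is a minimal sketch that greedily groups messages using `difflib.SequenceMatcher` from the Python standard library. This is a stand-in technique chosen for brevity, not necessarily the clustering method behind the demo; the function name and the `threshold` value are assumptions.

```python
from difflib import SequenceMatcher


def cluster_messages(messages, threshold=0.8):
    """Greedy clustering: a message joins the first cluster whose
    representative (its first member) is similar enough, otherwise it
    starts a new cluster. The threshold is an assumed tuning value.
    """
    clusters = []
    for msg in messages:
        for cluster in clusters:
            if SequenceMatcher(None, cluster[0], msg).ratio() >= threshold:
                cluster.append(msg)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([msg])
    return clusters
```

Because each message is compared only with cluster representatives, the result depends on input order; a production approach would likely use a more robust clustering scheme.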

Generating regular expressions

It is also possible to generate regular expressions that represent syslog patterns with minimum errors.
Version 1: semantic-based
Version 2: fully automated
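The following sketch shows one simple way a regular expression could be derived from a group of similar messages: tokens that are identical across all messages are kept literally, and positions that vary become a wildcard. It is an illustration under strong assumptions (whitespace tokenization, equal token counts) and is not either of the two versions linked above; `pattern_to_regex` is a hypothetical name.

```python
import re


def pattern_to_regex(messages):
    """Build a regex from messages that share the same number of tokens.

    Constant token positions are escaped and kept; varying positions
    are replaced by \\S+ (one non-whitespace token).
    """
    token_rows = [m.split() for m in messages]
    if len({len(row) for row in token_rows}) != 1:
        raise ValueError("messages must have the same number of tokens")
    parts = []
    for column in zip(*token_rows):
        if len(set(column)) == 1:
            parts.append(re.escape(column[0]))
        else:
            parts.append(r"\S+")
    return "^" + r"\s+".join(parts) + "$"
```

For example, two "Accepted password for <user> from <ip>" messages with different users and addresses would yield a pattern in which only the user and address positions are wildcards.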

The golden interval

This interactive chart demonstrates a real example of golden intervals on Taurus. Demo

Turning Privacy Constraints into Syslog Analysis Advantage

Using an irreversible encoding method (hashing), the users' privacy can be guaranteed, the log size is reduced, and the data quality remains adequate for further analysis. Read more
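A minimal sketch of the idea, assuming each whitespace-separated token is replaced by a truncated SHA-256 digest. The hash function, the token granularity, and the `digest_chars` parameter are assumptions for illustration; the linked write-up describes the actual method.

```python
import hashlib


def anonymize_line(line, digest_chars=8):
    """Replace every token with the first digest_chars hex characters
    of its SHA-256 digest. Identical tokens map to identical codes, so
    repetition patterns survive while the original text does not.
    Truncation is an assumption: it shortens the log but raises the
    chance of two different tokens sharing a code.
    """
    hashed = []
    for token in line.split():
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        hashed.append(digest[:digest_chars])
    return " ".join(hashed)
```

Since every token becomes a fixed-length code, any token longer than `digest_chars` characters shrinks, which is one way the log size can go down while the structure of each line is preserved.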

Propagation of failures

The failures in one component may propagate to other parts of the system. The video below shows an example of such propagation (starting at 00:30) in Taurus. More examples

To play the videos, run the gource command followed by the name of the file.

Word statistics

These statistics show the length of all words used in syslog messages on Taurus during 2017. Raw data
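Statistics of this kind could be computed along the following lines. The sketch assumes that a "word" is any whitespace-separated token, which may differ from the definition used for the raw data; `word_length_histogram` is a hypothetical helper.

```python
from collections import Counter


def word_length_histogram(lines):
    """Count how many words of each length occur in the given lines.

    A word is assumed to be a whitespace-separated token.
    """
    counts = Counter()
    for line in lines:
        counts.update(len(word) for word in line.split())
    return counts
```

The function accepts any iterable of strings, so an open file object can be passed directly to process a log line by line without loading it into memory.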

Daily Statistics of Syslog and Daemons in 2017

These statistics show the appearance frequency of each daemon and system log on Taurus during 2017. Raw data
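As an illustration, daily daemon frequencies could be counted as sketched below. The code assumes the traditional BSD syslog line layout (`Mon DD HH:MM:SS host daemon[pid]: message`); the actual format of the Taurus logs may differ, and `daily_daemon_counts` is a hypothetical name.

```python
import re
from collections import Counter

# Assumed line layout: "Jan  1 00:00:01 node1 sshd[123]: message"
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+\d{2}:\d{2}:\d{2}\s+\S+\s+([^\s\[:]+)(?:\[\d+\])?:"
)


def daily_daemon_counts(lines):
    """Count log entries per (month, day, daemon).

    Lines that do not match the assumed layout are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            month, day, daemon = match.groups()
            counts[(month, int(day), daemon)] += 1
    return counts
```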

Pattern recognition in anonymized system logs

To detect the normal patterns in anonymized system logs, the syslog entries collected in one day are treated as a single long sentence. The re-appearances of each substring are then analyzed. Demo
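The substring analysis can be approximated with a simple n-gram count, as in the sketch below. It assumes the day's log has already been split into a flat list of (anonymized) tokens and that fixed-length token windows are an acceptable stand-in for substrings; the demo may use a different definition, and `repeated_ngrams` is a hypothetical helper.

```python
from collections import Counter


def repeated_ngrams(tokens, n=3, min_count=2):
    """Return every run of n consecutive tokens that occurs at least
    min_count times, mapped to its number of occurrences.
    """
    # zip over n shifted copies of the list yields all length-n windows.
    windows = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(windows)
    return {gram: c for gram, c in counts.items() if c >= min_count}
```

Running this for several values of `n` gives a rough picture of which sequences recur during a day; sequences that recur every day are candidates for normal patterns.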

SAM files

Among other methods, DNA-based visualization tools have also been tested. Here you can find an example of a SAM file generated from Taurus system logs. SAM file

Calculating the SG parameter

The following Python script demonstrates an easy method to calculate the SG parameter. Python script

Job failures

There was no meaningful correlation between job failures and node failures. This graph shows the status of successful and failed jobs on Taurus in 2017. PDF