In April please join us for another set of classes by Prague University of Economics and Business. Please see the schedule and information below.

Lecturers: prof. Jan Rauch (jan.rauch@vse.cz), David Chudán PhD (david.chudan@vse.cz)

Starting points:

  • Medical data are often available in simple files, e.g. in Excel; see the data for the HeartBit Discovery Challenge.
  • A first insight into the data can be gained directly in Excel, which provides basic statistics, aggregation and visualizations.
  • More detailed and attractive visualizations and insights can be obtained with widely used, simple BI (Business Intelligence) tools, e.g. Power BI.
  • Patterns often produced by Excel and BI tools: histograms, contingency tables, rules and their simple characteristics.
  • GUHA is a method of data mining based on mechanizing hypothesis formation.
  • GUHA procedures generate and verify all potentially interesting patterns. The output consists of all patterns satisfying a given condition of truthfulness.
  • The LISp-Miner system, developed at the Faculty of Informatics and Statistics of Prague University of Economics and Business, provides seven GUHA procedures dealing with histograms, contingency tables and rules.
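The "generate and verify" idea behind GUHA procedures can be illustrated with a minimal Python sketch. It is not LISp-Miner code and does not reproduce any of its seven procedures; the toy records, the attribute names, the function names and the thresholds p and base are all assumptions made for illustration. The sketch enumerates every candidate rule "antecedent => succedent" up to a given length, builds its four-fold contingency table (a, b, c, d), and keeps the rules for which a/(a+b) >= p and a >= base, a simple condition chosen here to stand in for a "condition of truthfulness".

```python
from itertools import combinations

# Toy records; attribute names and values are invented for illustration only.
ROWS = [
    {"smoker": "yes", "bmi": "high", "bp": "high"},
    {"smoker": "yes", "bmi": "high", "bp": "high"},
    {"smoker": "yes", "bmi": "normal", "bp": "high"},
    {"smoker": "no", "bmi": "high", "bp": "normal"},
    {"smoker": "no", "bmi": "normal", "bp": "normal"},
    {"smoker": "no", "bmi": "normal", "bp": "normal"},
]


def literals(rows, attrs):
    """All (attribute, value) pairs that occur in the data for the given attributes."""
    return sorted({(a, r[a]) for r in rows for a in attrs})


def holds(row, conjunction):
    """True if the row satisfies every (attribute, value) pair in the conjunction."""
    return all(row[a] == v for a, v in conjunction)


def fourfold(rows, ante, succ):
    """Four-fold table: a = ante & succ, b = ante & not succ,
    c = not ante & succ, d = neither."""
    a = b = c = d = 0
    for r in rows:
        x, y = holds(r, ante), holds(r, (succ,))
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def mine_rules(rows, ante_attrs, succ_attr, p=0.9, base=2, max_len=2):
    """Generate every candidate rule and keep those passing the assumed test
    a / (a + b) >= p and a >= base."""
    found = []
    ante_lits = literals(rows, ante_attrs)
    succ_lits = literals(rows, [succ_attr])
    for k in range(1, max_len + 1):
        for ante in combinations(ante_lits, k):
            # Skip conjunctions that use the same attribute twice.
            if len({attr for attr, _ in ante}) < k:
                continue
            for succ in succ_lits:
                a, b, _, _ = fourfold(rows, ante, succ)
                if a >= base and a + b > 0 and a / (a + b) >= p:
                    found.append((ante, succ, a, b))
    return found


RESULTS = mine_rules(ROWS, ["smoker", "bmi"], "bp")
for ante, succ, a, b in RESULTS:
    print(ante, "=>", succ, f"a={a}, b={b}")
```

The exhaustive loop is the point of the sketch: unlike a single pivot table built by hand in Excel or a BI tool, every candidate pattern in the defined space is tested, and only those meeting the condition are reported.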

Goals of lectures

  • To introduce Excel and Power BI as simple tools of data analytics.
  • To present the differences between Excel and self-service BI tools on one side and data mining on the other.
  • To introduce several GUHA procedures as tools for automated search in a space of all possibly interesting histograms, rules and contingency tables.
  • To provide enough examples of applications of the introduced methods.
  • To provide an installation of the LISp-Miner system that makes it possible to work with the STULONG medical data set concerning atherosclerosis as well as with students’ own data.
  • To present an introduction to dealing with domain knowledge in data mining. This topic will continue in October.

Date and time

Topic

Expected students’ home activities

Thursday, April 15

9:00 – 12:00

including a break

Lecture   

o    Introduction

o    Excel and Power BI as tools of data analytics

o    The GUHA procedures as tools for automated search in a space of all possibly interesting histograms, rules and contingency tables

Computer class – demonstrations of

·        Excel as a data analytics tool

·        Power BI as a data analytics tool

·        Experiments with own data

·        Sending questions

·        Consultations

Thursday, April 22

9:00 – 12:00

including a break

Lecture  

o   Answering sent questions

o   Introduction to the LISp-Miner system

o   Data mining with conditional histograms

o   Data mining with conditional histograms and exceptions

Computer class – demonstrations of

·        LISp-Miner installation

·        Data transformation

·        Data mining with conditional histograms and exceptions

·        Experiments with installed STULONG data

·        Experiments with own data

·        Sending questions

·        Consultations

Monday, April 26

9:00 – 12:00

including a break

Lecture  

o   Answering sent questions

o   Applying domain knowledge

o   Data mining with association rules

o   Data mining with couples of association rules

o   Data mining with association rules and exceptions

Computer class – demonstrations of

·        Data mining with association rules

·        Data mining with couples of association rules

·        Data mining with association rules and exceptions

·        Experiments with installed STULONG data

·        Experiments with own data

·        Sending questions

·        Consultations

Thursday, April 29

9:00 – 12:00

including a break

Lecture

o   Answering sent questions

o   Conclusions from students’ experiments

o   Recapitulation of typical tasks

o   Invitation to a more sophisticated treatment of domain knowledge – examples

Computer class  

·        Demonstration of dealing with domain knowledge 

·        Discussion