In April please join us for another set of classes by Prague University of Economics and Business. Please see the schedule and information below.
Lectures by Prof. Jan Rauch (jan.rauch@vse.cz) and David Chudán, PhD (david.chudan@vse.cz)
Starting points:
- Medical data are often available in simple files, e.g. in Excel, see data for HeartBit Discovery Challenge.
- A first look at the data can be taken directly in Excel, which provides basic statistics, aggregation and visualizations.
- More detailed and attractive visualizations and insights can be obtained with simple, widely used Business Intelligence (BI) tools, e.g. Power BI.
- Patterns often produced by Excel and BI: histograms, contingency tables, rules and their simple characteristics
- GUHA is a method of data mining based on mechanizing hypothesis formation.
- GUHA procedures generate and verify all potentially interesting patterns. The output consists of all patterns satisfying a given condition of truthfulness.
- The LISp-Miner system, developed at the Faculty of Informatics and Statistics of Prague University of Economics and Business, provides seven GUHA procedures dealing with histograms, contingency tables and rules.
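The "generate and verify" idea behind GUHA procedures can be illustrated with a minimal Python sketch. It is not the LISp-Miner implementation: the toy records, the function names and the thresholds `min_conf` and `min_base` are assumptions made for illustration only. The sketch enumerates every candidate rule "antecedent => target" over a small table and keeps those that pass a simple truth condition on the counts a (rows satisfying both antecedent and target) and b (rows satisfying the antecedent but not the target).

```python
from itertools import combinations, product

# Hypothetical toy records; not taken from STULONG or any real data set.
ROWS = [
    {"smoker": "yes", "bmi": "high", "bp": "high"},
    {"smoker": "yes", "bmi": "high", "bp": "high"},
    {"smoker": "yes", "bmi": "normal", "bp": "high"},
    {"smoker": "no", "bmi": "high", "bp": "normal"},
    {"smoker": "no", "bmi": "normal", "bp": "normal"},
    {"smoker": "no", "bmi": "normal", "bp": "normal"},
]


def candidate_antecedents(rows, attributes, max_len):
    """Yield every conjunction of attribute=value conditions up to max_len."""
    for k in range(1, max_len + 1):
        for attrs in combinations(attributes, k):
            domains = [sorted({r[a] for r in rows}) for a in attrs]
            for values in product(*domains):
                yield tuple(zip(attrs, values))


def holds(row, conditions):
    """Return True if the row satisfies all attribute=value conditions."""
    return all(row[a] == v for a, v in conditions)


def mine_rules(rows, antecedent_attrs, target, min_conf, min_base, max_len=2):
    """Generate all candidate rules and keep those passing the truth condition."""
    results = []
    for target_value in sorted({r[target] for r in rows}):
        for ante in candidate_antecedents(rows, antecedent_attrs, max_len):
            a = sum(1 for r in rows if holds(r, ante) and r[target] == target_value)
            b = sum(1 for r in rows if holds(r, ante) and r[target] != target_value)
            if a + b == 0:
                continue  # antecedent matches no row; nothing to verify
            # Assumed truth condition: enough supporting rows and high confidence.
            if a >= min_base and a / (a + b) >= min_conf:
                results.append((ante, (target, target_value), a, b))
    return results


RULES = mine_rules(ROWS, ["smoker", "bmi"], "bp", min_conf=0.9, min_base=2)
for ante, succ, a, b in RULES:
    lhs = " & ".join(f"{k}={v}" for k, v in ante)
    print(f"{lhs} => {succ[0]}={succ[1]}  (a={a}, b={b})")
```

Unlike a manual look in Excel or a BI tool, the analyst here only states the space of candidate patterns and the condition; the search itself is exhaustive. Real GUHA procedures offer much richer pattern definitions and conditions than this sketch.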
Goals of the lectures:
- To introduce Excel and Power BI as simple tools of data analytics.
- To present the differences between Excel and self-service BI tools on the one hand and data mining on the other.
- To introduce several GUHA procedures as tools for automated search in a space of all possibly interesting histograms, rules and contingency tables.
- To provide enough examples of applications of the introduced methods.
- To provide an installation of the LISp-Miner system that makes it possible to work with the medical data set STULONG, concerning atherosclerosis, as well as with students' own data.
- To present an introduction to dealing with domain knowledge in data mining. This topic will continue in October.
Date and time | Topic | Expected student home activities |
Thursday April 15, 9:00 – 12:00 including break | Lecture: Introduction; Excel and Power BI as tools of data analytics; the GUHA procedures as tools for automated search in a space of all possibly interesting histograms, rules and contingency tables. Computer class, demonstrations of: Excel as a data analytics tool; Power BI as a data analytics tool | Experiments with own data; sending questions; consultations |
Thursday April 22, 9:00 – 12:00 including break | Lecture: Answering sent questions; introduction to the LISp-Miner system; data mining with conditional histograms; data mining with conditional histograms and exceptions. Computer class, demonstrations of: LISp-Miner installation; data transformation; data mining with conditional histograms and exceptions | Experiments with installed STULONG data; experiments with own data; sending questions; consultations |
Monday April 26, 9:00 – 12:00 including break | Lecture: Answering sent questions; applying domain knowledge; data mining with association rules; data mining with couples of association rules; data mining with association rules and exceptions. Computer class, demonstrations of: data mining with association rules; data mining with couples of association rules; data mining with association rules and exceptions | Experiments with installed STULONG data; experiments with own data; sending questions; consultations |
Thursday April 29, 9:00 – 12:00 including break | Lecture: Answering sent questions; conclusions from students' experiments; recapitulation of typical tasks; invitation to more sophisticated dealing with domain knowledge, with examples. Computer class: demonstration of dealing with domain knowledge; discussion | |