Knowledge Discovery from Data Bases
Code: | CC5015 | Acronym: | KDD | Level: | 500 |
Keywords
Classification | Keyword |
---|---|
Official | Informatics |
Instance: 2022/2023 - 1S
Active? | Yes |
Responsible unit: | Department of Computer Science |
Course/CS Responsible: | Doctoral Program in Computer Science - MAP joint programme |
Cycles of Study/Courses
Acronym | No. of Students | Study Plan | Curricular Years | Credits UCN | Credits ECTS | Contact hours | Total Time |
---|---|---|---|---|---|---|---|
PDMAPI | 8 | Official Study Plan since 2020/2021 | 1 | - | 6 | 30 | 162 |
Teaching language
English
Objectives
At the end of the semester the students should be able to:
- Formulate a decision problem as a data mining problem;
- Identify the basic tasks in knowledge discovery from data bases;
- Identify and use the main methods in solving data mining problems;
- Apply the main methods and algorithms for each mining task;
- Apply the main methods and algorithms to real-world problems and adapt them to new contexts.
Learning outcomes and competences
Knowledge of how to formulate a problem as a knowledge extraction problem. Ability to apply methods and algorithms to a new data analysis problem, evaluate the results, and understand how the methods studied work.
Working method
In person
Program
- Introductory Concepts
– Introduction to Knowledge Discovery in Data Bases
∗ From OLAP to On-Line Analytical Mining;
∗ Data Mining tasks;
– Cluster Analysis
∗ Cluster Analysis: concepts and methods;
∗ Partitioning and Hierarchical Methods;
– Association Analysis
∗ Frequent pattern mining;
∗ Frequent Sequence mining;
– Predictive Data Mining: Classification and Regression.
∗ Optimization Methods: Artificial Neural Networks; Support Vector Machines.
∗ Probabilistic Methods: Bayesian Classifiers;
∗ Search-based Methods: Decision Trees and Rules.
– Evaluation in Predictive Data Mining.
∗ Evaluation: goals and perspectives;
∗ Loss Functions and Cost-benefit analysis;
∗ Bias-Variance analysis;
– Ensembles and Multiple Models
∗ Concepts and methods;
∗ Combining Homogeneous Models;
∗ Combining Heterogeneous models;
- Advanced Topics
– Social Network Analysis
∗ Concepts and methods;
∗ Evolution of Networks;
– Text Mining
∗ Concepts and methods;
∗ Information retrieval;
∗ Document classification;
– Web Mining and Link Analysis
∗ Concepts and methods;
∗ Web and Structure mining;
∗ Link analysis;
– Big Data and Data Stream Mining
∗ Big Data: Applications and tools;
∗ Concepts and methods;
∗ Summarizing data streams;
∗ Knowledge discovery from data streams;
– Data Mining Standards and Processes
Mandatory literature
J. Gama, A. Carvalho, K. Faceli, A. Lorena, M. Oliveira; Extração de Conhecimento de Dados - Data Mining, Sílabo, 2012
Jiawei Han and Micheline Kamber; Data Mining, Concepts and Techniques, Morgan Kaufmann, 2006
J. Gama; Knowledge Discovery from Data Streams, CRC Press, 2010
Teaching methods and learning activities
The teaching method consists of theoretical-practical classes.
Evaluation Type
Distributed evaluation without final exam
Assessment Components
Designation | Weight (%) |
---|---|
In-person participation | 10.00 |
Practical or project work | 90.00 |
Total: | 100.00 |
Amount of time allocated to each course unit
Designation | Time (hours) |
---|---|
Project development | 28.00 |
Independent study | 28.00 |
Class attendance | 28.00 |
Total: | 84.00 |
Eligibility for exams
Submit assignment
Calculation formula of final grade
The evaluation consists of homework assignments.