|
Module/Course Title: Data Mining
|
|
Module course code
KOMS120504
|
Student Workload
119 hours
|
Credits
3 / 4.5 ECTS
|
Semester
5
|
Frequency
Odd Semester
|
Duration
16
|
|
1
|
Type of course
Field of Study Courses
|
Contact hours
- 37.50 hours of face-to-face (theoretical) class activity
- 8.50 hours of lab activities
|
Independent Study
- 45 hours of independent activity
- 45 hours of structured activities
|
Class Size
30
|
|
2
|
Prerequisites for participation (if applicable)
-
|
|
3
|
Learning Outcomes
- Students can demonstrate systematic thinking in analyzing and designing intelligent system solutions
- Students can apply effective methods in developing intelligent systems
- Students can create and evaluate intelligent systems
- Students can explain major issues in data mining
- Students can apply machine learning, pattern recognition, statistics, visualization, algorithms, database technology, and high-performance computing in data mining applications
- Students can apply data mining techniques on datasets of realistic sizes using modern data analysis frameworks
|
|
4
|
Subject aims/Content
The Data Mining course discusses the data mining process, which includes data selection and cleaning, machine learning techniques to "learn" knowledge that is "hidden" in data, and the reporting and visualization of the resulting knowledge. The course covers these issues and illustrates the whole process with examples of practical applications from the life sciences, computer science, and commerce. Several machine learning topics, including classification, prediction, and clustering, are covered.
Study Material
Introduction to Data Mining
- Definition of data mining
- Purpose of data mining
- Data mining stages
Data
- Data Type and Quality
- Preprocessing
- Data measurement technique
Data Exploration
- Data Statistics
- Data Visualization
- Multi-dimensional data analysis & OLAP
Classification Method:
- Basic concepts of classification
- Decision Tree and Model Overfitting
Classification Technique:
- K-Nearest Neighbor
- Comparison with Decision Tree
Classification Technique:
- Naive Bayes
- Comparison with Decision Tree and K-Nearest Neighbor
Association Method:
- Association Analysis
- FP-Growth algorithm
- Techniques for evaluating association patterns
- Frequent itemset generation
- Rule generation, compact representation of frequent itemset
Association Technique
- Handling categorical attributes and continuous attributes in association analysis
- Sequential, subgraph and infrequent patterns
Clustering
- Definition and basic concepts of clustering
- K-Means algorithm
Clustering:
- Hierarchical Clustering
- DBSCAN algorithm
Data anomaly
- Definition of data anomalies and statistical approaches to address data anomalies
- Proximity-based outlier detection, density-based outlier detection, and clustering-based techniques
Data Mining Applications and Trends
- Spatial & Multimedia Data Mining
- Text & Web Mining
Data Mining Applications and Trends
- Application of data mining in finance, the retail industry, telecommunications, biology, and science
- Data mining system products
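As an illustration of the K-Means algorithm listed under Clustering, the following is a minimal sketch in plain Python. It is not part of the official course material; the function name `kmeans`, its parameters, and the sample points are hypothetical and chosen only for this example. It alternates the two standard steps (assign each point to its nearest centroid, then move each centroid to the mean of its cluster) until the centroids stop changing.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster 2-D points into k groups (illustrative sketch, not course code)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points chosen at random.
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        # (squared Euclidean distance is enough for comparison).
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                ))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical sample data: two visibly separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

In practice a library implementation with better initialization would normally be preferred; the sketch only makes the assignment and update steps explicit.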
|
|
5
|
Teaching methods
Synchronous:
Face-to-face meetings/online meetings
|
|
6
|
Assessment Methods
Attendance and participation
|
|
7
|
This module/course is used in the following study programme/s as well
Computer Science Study Programme
|
|
8
|
Responsibility for module/course
- I Nyoman Saputra Wahyu Wijaya, S.Kom., M.Cs
- NIDN : 0826108901
|
|
9
|
Other Information
- Introduction to Data Mining, 2nd Edition, Tan, Pang-Ning; Steinbach, Michael; Kumar, Vipin, Pearson Education, Inc., 2015
- Data Mining: Concepts and Techniques, 3rd Edition, Han, Jiawei; Kamber, Micheline; Pei, Jian, Morgan Kaufmann, 2011
- Data Mining and Knowledge Discovery Handbook, 2nd Edition, Maimon, Oded; Rokach, Lior, Springer, 2010
- I. N. S. W. Wijaya, K. A. Seputra, and W. G. S. Parwita, “Comparison of the BM25 and rabinkarp algorithm for plagiarism detection,” J. Phys. Conf. Ser., vol. 1810, no. 1, 2021, doi: 10.1088/1742-6596/1810/1/012032.
- S. V. Pandey and A. V. Deorankar, “A Study of Sentiment Analysis Task and It’s Challenges,” Proc. 2019 3rd IEEE Int. Conf. Electr. Comput. Commun. Technol. ICECCT 2019, pp. 1–5, 2019, doi: 10.1109/ICECCT.2019.8869160.
- J. Oyelade et al., “Data Clustering: Algorithms and Its Applications,” Proc. - 2019 19th Int. Conf. Comput. Sci. Its Appl. ICCSA 2019, no. ii, pp. 71–81, 2019, doi: 10.1109/ICCSA.2019.000-1.
- N. Besimi, B. Çiço, and A. Besimi, “Overview of data mining classification techniques: Traditional vs. parallel/distributed programming models,” 2017 6th Mediterr. Conf. Embed. Comput. MECO 2017 - Incl. ECYPS 2017, Proc., no. June, pp. 2–5, 2017, doi: 10.1109/MECO.2017.7977126.
|