Module/Course Title: Data Science

Module/Course Code

KOMS120407

Student Workload
119 hours

Credits

3 / 4.5 ECTS

Semester

6

Frequency

Even Semester

Duration

16

1

Type of course

Core Study Courses

Contact hours


40 hours of face-to-face (theoretical) class activity

Independent Study


48 hours of independent activity
48 hours of structured activities

Class Size

30

2

Prerequisites for participation (if applicable)

-

3

Learning Outcomes

  1. Students can demonstrate systematic thinking in analyzing and designing intelligent system solutions
  2. Students can apply effective methods in developing intelligent systems
  3. Students can create and evaluate intelligent systems
  4. Students can explain the definition and scope of data science
  5. Students can explain the role of data science in solving real-world problems
  6. Students can explain the common methodology used in data science solutions
  7. Students can identify the stages of CRISP-DM in related scientific articles
  8. Students can formulate business objectives in order to determine the technical goals of data science
  9. Students can use tools to carry out the data mining stages in order to provide a data science solution

4

Subject aims/Content

This course introduces data science in general and presents its real-world applications. The material covers the stages of data science with examples of their application, an introduction to data sources, big data, the data mining stages, and data visualization using tools. It also covers data preprocessing techniques (accompanied by a tutorial) such as handling missing values, correlation analysis for feature selection, sampling, and normalization; descriptive analysis using statistics, simple visualization (accompanied by a tutorial), and clustering, with examples of their application; and predictive analysis techniques such as pattern mining, regression, and classification, with examples of their application.
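As an illustration of three of the preprocessing techniques named above (handling missing values, normalization, and correlation analysis for feature selection), the following is a minimal sketch in plain Python. It is not taken from the course tutorials: the function names, the choice of mean imputation and min-max scaling, and the toy data are all assumptions made for this example.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: study hours (one value missing) and exam scores.
hours = [2.0, None, 6.0, 8.0]
score = [50.0, 60.0, 70.0, 80.0]

hours_filled = impute_mean(hours)            # fill the gap with the column mean
hours_scaled = min_max_normalize(hours_filled)
r = pearson(hours_filled, score)             # high |r| suggests a useful feature
print(hours_scaled, round(r, 3))
```

In practice these steps are usually done with library tools rather than by hand; the sketch only makes the arithmetic behind each technique visible.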

Study Material

Introduction to Data Science

Data Science Methodology

Data Science Project Tools

Business Understanding

Data Understanding

Data Visualization

Data Preparation: Data cleaning

-

Data Preparation: Data transformation

Data Preparation: Feature

Classification

Clustering

Regression

Modelling: Build the model

Model Deployment

-

5

Teaching methods

Synchronous:

Face-to-face meetings/online meetings

6

Assessment Methods

Attendance and participation

7

This module/course is used in the following study programme/s as well

Computer Science Study Programme

8

Responsibility for module/course

  • Ni Putu Novita Puspa Dewi, S.Kom., M.Cs.
  • NIDN : 0003109401

9

Other Information

The following general reading list may be useful:

  1. Han, Jiawei, et al. 2011. Data Mining: Concepts and Techniques. Morgan Kaufmann.
  2. pythonprogramming (video), 2016, “Intro to Machine Learning with Scikit Learn and Python”, pythonprogramming.net
  3. Ahmad, N., et al., 2016, “Using Fisher information to track stability in multivariate systems”, Royal Society Open Science, 3:160582
  4. McKinney, W. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media.
  5. Grus, J. (2019). Data Science from Scratch: First Principles with Python. O'Reilly Media.
  6. Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. O'Reilly Media.
  7. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). Introduction to Statistical Learning. Springer.
  8. Dewi, N. P. N., & Purwanta, I. P. B. D. (2021). Big Data for Indonesian Marine Fisheries. In Proceedings of the 4th International Conference on Innovative Research Across Disciplines (ICIRAD 2021), Atlantis Press.
  9. Wibowo, B., & Rahman, F. (2016). Data Mining for Fraud Detection in Indonesian Banking Sector. Journal of Financial Analytics, 12(2), 67-78.
  10. K. Y. E. Aryanto, K. A. Seputra, I. N. S. W. Wijaya, I. W. Abyong, G. A. Pradnyana and A. A. G. Y. Paramartha, "Towards Healthcare Data Sharing: An e-Health Integration Effort in Indonesian District," 2021 International Seminar on Application for Technology of Information and Communication (iSemantic), Semarang, Indonesia, 2021, pp. 280-284, doi: 10.1109/iSemantic52711.2021.9573250.

Students should have access to at least the most recent issues of journals from IEEE, Springer, or any other publisher covering Artificial Intelligence, Data Science, Machine Learning, or Business Intelligence. Many relevant publications can be downloaded free of charge from mdpi.com.

Students are highly recommended to access blogs and websites such as:

  • Towards Data Science (https://towardsdatascience.com/)
  • DataCamp (https://www.datacamp.com/)
  • Kaggle (https://www.kaggle.com/)
  • Analytics Vidhya (https://www.analyticsvidhya.com/)
  • machinelearningmastery.com
  • aitopics.org
  • https://scikit-learn.org/dev/modules/neural_networks_supervised.html
  • http://networksciencebook.com/chapter/1#scientific-impact

These references cover a wide range of topics related to data science, from the basics of Python programming and statistics to more advanced machine learning techniques and applications.