Module/Course Title: Information Retrieval

Module/Course Code

KOMS120503

Student Workload
119 hours

Credits

3 / 4.5 ECTS

Semester

5

Frequency

Odd Semester

Duration

16

1

Type of course

Field of Study Courses

Contact hours


40 hours of face-to-face (theoretical) class activity

Independent Study


48 hours of independent activity
48 hours of structured activities

Class Size

30

2

Prerequisites for participation (if applicable)

-

3

Learning Outcomes

  1. Students can demonstrate systematic thinking in analyzing and designing intelligent system solutions
  2. Students can apply effective methods in developing intelligent systems
  3. Students can create and evaluate intelligent systems
  4. Students can describe information retrieval concepts such as indexing, searching, and evaluation of search engines
  5. Students can apply text classification, clustering, and summarization approaches
  6. Students can design and create a text-based search engine
  7. Students can evaluate the performance of information retrieval with various evaluation techniques

4

Subject aims/Content

This module provides a comprehensive introduction to information retrieval (IR), beginning with its core principles. It covers the fundamental concepts of indexing, searching, and search engine evaluation. The indexing topics include Boolean retrieval, index construction, scoring and weighting, and the vector space model. The module also addresses web crawling for efficient data collection and web search, demonstrating the techniques that search engines use to index and retrieve information from the World Wide Web. It emphasizes the evaluation of search engines, giving students insight into assessing the effectiveness and performance of information retrieval systems, and covers further topics such as relevance feedback and query expansion.

Study Material

Database vs IR (simple IR example: Boolean query)

Text processing - text statistics

Basic Concepts of Information Retrieval Systems

  • Basic Concepts of Information Retrieval System
  • Information Retrieval System Components
  • Differences between Information Retrieval Systems and Other Systems

Inverted index

  • Inverted index construction
  • Indexing (manual and automatic): tokenization, stopwords, stemming, weighting
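
As an illustration of the indexing topics above, the following is a minimal Python sketch of automatic indexing with tokenization and stopword removal, followed by a simple Boolean AND query. The stopword list, the sample documents, and the function names `tokenize`, `build_inverted_index`, and `boolean_and` are assumptions made for this example; stemming and weighting are omitted for brevity.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "to"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_inverted_index(docs):
    """Map each term to the sorted list of IDs of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def boolean_and(index, *terms):
    """Return the IDs of documents that contain every query term."""
    result = None
    for term in terms:
        ids = set(index.get(term.lower(), []))
        result = ids if result is None else result & ids
    return sorted(result or [])


# Hypothetical sample collection.
docs = {
    1: "The index of a search engine",
    2: "Search and retrieval of information",
    3: "Information retrieval is the study of search",
}
index = build_inverted_index(docs)
print(index["retrieval"])                             # posting list for one term
print(boolean_and(index, "information", "retrieval"))  # Boolean AND query
```

Storing each posting list in sorted order is a common design choice because it allows AND queries to be answered by merging lists; the set intersection used here is a simplification of that merge.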

IR Modeling

  • Boolean Model
  • Vector Space Model
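
To make the vector space model concrete, the sketch below ranks documents by the cosine similarity between TF-IDF vectors. It assumes raw term frequency and an unsmoothed logarithmic IDF, which is only one of several common weighting schemes; the sample documents and the function names `tf_idf_vectors`, `cosine`, and `rank` are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(tokenized_docs):
    """Build one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    n = len(tokenized_docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in tokenized_docs
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_tokens, vectors, idf):
    """Return document positions ordered by decreasing similarity to the query."""
    query = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query_tokens).items()}
    scores = [(cosine(query, vec), i) for i, vec in enumerate(vectors)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]


# Hypothetical, already tokenized collection.
tokenized_docs = [
    ["boolean", "retrieval", "model"],
    ["vector", "space", "model"],
    ["vector", "space", "retrieval", "ranking"],
]
vectors, idf = tf_idf_vectors(tokenized_docs)
print(rank(["vector", "space"], vectors, idf))
```

Unlike the Boolean model, which only returns the set of matching documents, this approach orders every document by a graded score.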

IR Evaluation

  • Evaluation Benchmarks
  • Recall and Precision
  • Interpolation
  • Other evaluation measures
  • Relevance
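
The evaluation measures listed above can be illustrated with a short Python sketch that computes set-based precision and recall, together with average precision for a ranked result list. The document IDs and relevance judgments are made up for the example, and the function names `precision_recall` and `average_precision` are assumptions.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant document appears.

    Relevant documents that are never retrieved contribute zero.
    """
    relevant_set = set(relevant)
    hits = 0
    total = 0.0
    for position, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant_set:
            hits += 1
            total += hits / position
    return total / len(relevant_set) if relevant_set else 0.0


# Hypothetical ranked result list and relevance judgments.
ranking = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d5"}

precision, recall = precision_recall(ranking, relevant)
ap = average_precision(ranking, relevant)
print(precision, recall, ap)
```

Precision and recall ignore the order of results, whereas average precision rewards rankings that place relevant documents near the top.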

Relevance Feedback

  • Probabilistic Relevance Feedback
  • Pseudo relevance feedback
  • Query Expansion

Probability ranking

  • Binary independence model
  • Language model for IR

Text Classification

  • Document classification
  • Probabilistic classification
  • Vector space classification

Clustering

  • Clustering in IR
  • Flat clustering: K-means, model-based
  • Hierarchical clustering: dendrogram, single-link, complete-link, average-link
  • Labeling

Text Summarization

  • Document summary
  • Summary type
  • Approaches: traditional, statistical


XML

  • Basic concepts
  • Model for XML IR
  • XML IR model evaluation


MIRS Model

  • Pattern Recognition for Multimedia Content Analysis
  • Image Processing for Feature Extraction

Question Answering System and CLIR

  • QA vs IR
  • QAS method and evaluation
  • CLIR
  • Translation method

5

Teaching methods

Lectures, discussions, and question-and-answer sessions

6

Assessment Methods

Attendance and Participation

7

This module/course is used in the following study programme/s as well

Computer Science Study Programme

8

Responsibility for module/course

  • A.A. Gede Yudhi Paramartha, S.Kom., M.Kom.
  • NIDN : 0022068803

9

Other Information

Books:

  1. Baeza-Yates, R., Ribeiro-Neto, B., 2009, Modern Information Retrieval, ACM Press New York, Addison Wesley.
  2. Manning, C. D., Raghavan, P., and Schütze, H., 2008, Introduction to Information Retrieval, Cambridge University Press.

Publications:

  1. Paramartha, et al. Ontology-based Learning Object Searching Technique with Granular Feature Extraction, in Proceedings of the 16th International Conference on Information Integration and Web-based Applications & Services (iiWAS '14)
  2. Liu, et al. Data Mining and Information Retrieval in the 21st century: A bibliographic review, in Computer Science Review Volume 34, November 2019
  3. Lin, et al. Supporting Interoperability Between Open-Source Search Engines with the Common Index File Format, in SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Websites:

  1. What is Information Retrieval? (https://www.geeksforgeeks.org/what-is-information-retrieval/)

  2. NLP - Information Retrieval (https://www.tutorialspoint.com/natural_language_processing/natural_language_processing_information_retrieval.htm)