CS 412: Introduction to Data Mining

Fall, 2021


Course Objective

Provide a comprehensive overview of the fundamental concepts and techniques of data mining.

·      Be able to understand the key concepts of data mining techniques, including data preprocessing, data warehousing and data cubes, frequent pattern mining, classification, and clustering.

·      Be able to apply the key data mining techniques in realistic settings, and to evaluate and analyze the mining results.


Basic Information

Classes: Tue, Thu 09:30 am – 10:45 am

Location: 3039 Campus Instructional Facility

Instructor: Arindam Banerjee, arindamb@illinois.edu


Teaching assistants:

·      Yikun Ban, yikunb2@illinois.edu

·      Dongqi Fu, dongqif2@illinois.edu

·      Zhe Xu, zhexu3@illinois.edu

·      Hang Zhang, hangz2@illinois.edu

Office hours:

·      Arindam Banerjee: Tue, Thu 6:00 – 7:00 pm (zoom)

·      Yikun Ban:

·      Dongqi Fu:

·      Zhe Xu:

·      Hang Zhang:

Online resources:

·      Canvas

·      Slack


Schedule (Tentative, subject to mild adjustments)

·      Course Outline / Chapter 1: Introduction (week 1)

·      Chapter 2: Data and Measurements (weeks 1, 2)

·      Chapter 3: Data Preparation (weeks 2, 3)

·      Chapter 4: Data Warehousing and OLAP (week 3)

·      Chapter 5: Mining Frequent Patterns, Associations, and Correlations (weeks 4, 5)

·      Chapter 6: Advanced Pattern Mining (weeks 5, 6, 7)

·      Chapter 7: Classification: Basic Concepts (weeks 8, 9, 10)

·      Chapter 8: Classification: Advanced Concepts (weeks 11, 12)

·      Chapter 11: Deep Learning (week 13)

·      Chapter 10: Cluster Analysis: Basic Concepts (weeks 14, 15)

Coursework and Grading

·      Assignments, Programming Assignments, and Exams

o   Written assignments: 30% (three homework assignments expected)

o   Programming assignments: 30% (two programming assignments expected)

o   Midterm exam: 20%

o   Final exam: 20%

·      For students taking the 4th credit

o   Group project (2-3 members): 25%. The overall scores will be scaled proportionally.


Key Dates

·      Assignments

o   A1: Tue, Sept 7 out, Thu, Sept 23 due

o   A2: Thu, Sept 23 out, Mon, Oct 11 due

o   A3 (programming): Mon, Oct 11 out, Mon, Nov 08 due

o   A4 (programming): Mon, Nov 08 out, Fri, Dec 03 due

o   A5: Mon, Nov 15 out, Mon, Dec 06 due

·      Exams

o   Mid-term: Thu, Oct 14, posted 6 pm, 24 hours

o   Final: Fri, Dec 10, posted 6 pm, 24 hours


·      Project (for students taking 4th credit)

o   Project proposal due: Mon, Oct 4

o   Mid-term report due: Wed, Nov 3

o   Final report due: Wed, Dec 8




·      The following cutoffs represent what will likely be used to generate letter grades:


     A+   >= 98%               A    >= 94% & < 98%      A-   >= 90% & < 94%
     B+   >= 85% & < 90%       B    >= 80% & < 85%      B-   >= 77% & < 80%
     C+   >= 74% & < 77%       C    >= 70% & < 74%      C-   >= 67% & < 70%
     D    >= 60% & < 67%       F    < 60%


·      The above cutoffs are tentative and may be adjusted slightly; any adjustment will NOT lower your letter grade.

·      However, there will be no general curve-fitting in assigning the final grades.
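
As a rough illustration of how the component weights and the tentative cutoffs fit together, the arithmetic can be sketched in Python. This is not an official grade calculator. The function names are hypothetical, the "D" label for the 60-67% band is an assumption, and so is the reading of "scaled proportionally" for 4-credit students (project at 25%, the remaining components scaled to share the other 75%).

```python
# Illustrative sketch only; not an official grade calculator.
# Weights (in percent) are taken from "Coursework and Grading".
WEIGHTS = {"written": 30, "programming": 30, "midterm": 20, "final": 20}

# Tentative cutoffs as (lower bound in percent, letter), highest first.
# Assumption: the 60-67% band is labeled "D".
CUTOFFS = [
    (98, "A+"), (94, "A"), (90, "A-"),
    (85, "B+"), (80, "B"), (77, "B-"),
    (74, "C+"), (70, "C"), (67, "C-"),
    (60, "D"),
]


def overall_score(scores, project=None, project_weight=25):
    """Return the weighted overall percentage.

    scores maps each key of WEIGHTS to a percentage in [0, 100].
    If a project score is given (4-credit students), this sketch ASSUMES
    the project counts for project_weight percent and the other
    components are scaled to share the remainder.
    """
    base = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
    if project is None:
        return base
    return ((100 - project_weight) * base + project_weight * project) / 100


def letter_grade(percent):
    """Map an overall percentage to a letter using the tentative cutoffs."""
    for lower_bound, letter in CUTOFFS:
        if percent >= lower_bound:
            return letter
    return "F"


if __name__ == "__main__":
    example = {"written": 90, "programming": 80, "midterm": 70, "final": 100}
    total = overall_score(example)
    print(total, letter_grade(total))
```

For the example scores, the weighted sum is 27 + 24 + 14 + 20 = 85, which falls in the tentative B+ band. Because the cutoffs may be adjusted, the result is only an estimate.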




Required: Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques (4th ed.), Morgan Kaufmann, 2021


Other references:

·      Charu C. Aggarwal, Data Mining: The Textbook, Springer, 2015

·      P.-N. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Addison-Wesley, 2005 (2nd ed., Pearson, 2018)

·      Mohammed J. Zaki and Wagner Meira Jr., Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2014






