Course Meeting Times
Lectures: 3 sessions / week, 1 hour / session
Course Summary
Data relevant to managerial decisions is accumulating at an incredible rate due to a host of technological advances. Electronic data capture has become inexpensive and ubiquitous as a by-product of innovations such as the internet, e-commerce, electronic banking, point-of-sale devices, bar-code readers, and intelligent machines. Such data is often stored in data warehouses and data marts specifically intended for management decision support. Data mining is a rapidly growing field concerned with developing techniques that help managers make intelligent use of these repositories. A number of successful applications have been reported in areas such as credit rating, fraud detection, database marketing, customer relationship management, and stock market investments. The field of data mining has evolved from the disciplines of statistics and artificial intelligence.
This course examines, from an applications perspective, methods that have emerged from both fields and proven to be of value in recognizing patterns and making predictions. We will survey applications and provide an opportunity for hands-on experimentation with data mining algorithms using easy-to-use software and cases.
Course Objective
To develop an understanding of the strengths and limitations of popular data mining techniques and to be able to identify promising business applications of data mining. Students will be able to actively manage and participate in data mining projects executed by consultants or specialists in data mining. A useful takeaway from the course will be the ability to perform powerful data analysis in Excel.
Lecture Notes
Lecture notes and homework assignments will be available on the class website in SloanSpace. You will be responsible for downloading them to prepare for class and for submitting your homework there.
Supplementary Readings
The following books are available as supplementary materials. Occasionally, readings from these books will be suggested to augment the lecture notes.
Hand, Mannila, and Smyth. Principles of Data Mining. Cambridge, MA: MIT Press, 2001. ISBN: 026208290X.
Berry and Linoff. Mastering Data Mining. New York, NY: Wiley, 2000. ISBN: 0471331236.
Delmater and Hancock. Data Mining Explained. New York, NY: Digital Press, 2001. ISBN: 1555582311.
Software
We will be using XLMiner, an Excel add-in, for homework assignments. To download a free version, go to http://www.xlminer.com.
The free version is limited. For your homework and case assignments you will need a more powerful version, which will be provided by Resampling Stats at http://www.solver.com/xlminer-data-mining ("Updated Software Version").
SAS Enterprise Miner will be available for projects that require handling large amounts of data. Instructions on using the software will be provided in recitations.
Grading
Your course grade will be based on case write-ups, homework, a team project, and a mid-term exam. The weights given to these components are:
- Case write-ups and Homework (30%)
- Mid-term Exam (30%)
- Project (40%)
Class participation will be subjectively evaluated and will be used in borderline cases to determine the final grade.