Class Notes

Notes:
1. Please visit this page frequently, as it will be updated regularly during the term.
2. Items under Required Material are considered mandatory reading and will be tested in the exams.
3. Items under Additional Material should be useful in helping you understand the Required Material.
4. Items in the third column (Online Resources and Reference Books) are provided for reference and to help you explore the different topics further.
5. Links to O'Reilly's Safari Online Bookshelf work only from within pitt.edu, or from outside pitt.edu through the University's VPN service (https://sremote.pitt.edu). Please remember to sign out once you have finished reading -- the University has a very limited number of concurrent user licenses.

05: Data Mining (Sep 14/19, 2017)
Required Reading
(coming soon)
Additional Material
Online Resources
How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did, Forbes, Feb 16, 2012

The parable of the beer and diapers, The Register, August 15, 2006

Reference Books
Data Mining: Concepts and Techniques (3rd Edition), 2012

Mining of Massive Datasets (Sec 6.1, 6.2, 7.1.1, 7.1.2, 7.2.1, 7.3.1, 7.3.2)

04: Intro to Python (Sep 12, 2017)
Required Reading
(coming soon)
Additional Material
Online Resources
Python for Beginners
How to Think Like a Computer Scientist
The Hitchhiker’s Guide To Python
Google Python Course
Codecademy's Python Course
Reference Books
Think Python, 2nd Ed
Learning Python, 5th Ed
Introducing Python
Head First Python

03: Web Information Retrieval (Sep 7, 2017)
Required Reading
In TopHat
Additional Material
Online Resources
The Google Pagerank Algorithm and How It Works (click Cancel on login prompt)

PageRank Calculator

PageRank explained

Reference Books

02: Information Retrieval (Aug 31/Sep 5, 2017)
Required Reading
In TopHat
Additional Material
Online Resources
Online Log Base 2 Calculator
Reference Books

01: Introduction (Aug 29, 2017)
Required Reading
In TopHat
Additional Material
Online Resources
Big Data and Its Technical Challenges in Communications of the ACM (July 2014)

10-minute introduction to GitHub
GitHub intro for students

Reference Books