Course description and objective
This course consists of three parts: 1) regression and linear models, 2) nonparametric statistics, and 3) machine learning. Part 1) discusses various aspects of linear models, such as model assumptions and fitting, hypothesis testing, regression diagnostics, and model selection. Part 2) deals with several classes of nonparametric models, including tree-based models, and covers topics such as model fitting, algorithms, and practical issues. Part 3) discusses several popular machine learning algorithms, such as logistic regression, neural networks (deep learning), decision trees (Random Forests), and support vector machines (SVMs).
Learning outcomes
At the conclusion of this course, students are expected to have a good understanding of linear models, to be familiar with the concepts and modeling techniques of nonparametric statistics and with the basic concepts of machine learning, and to be able to use machine learning software packages and develop simple learning algorithms. Students will also gain hands-on experience in statistical data analysis using a modern programming language such as R.
To facilitate learning by doing and learning from real-world problems, students are encouraged to get involved in a consulting project. Depending on availability, this can be a current or past consulting project, or part of one (see the course website for a list). Once a student enrolls in a consulting project, they are responsible for completing it. A consulting project may take more time than a usual project, but the experience is generally highly rewarding. The consulting project can be used as one of the two course projects.