class: center, middle, title-slide

# Statistical Machine Learning
## AU STAT-427/627
### Emil Hvitfeldt
### 2021-5-17

---

<div style = "position:fixed; visibility: hidden">
`$$\require{color}\definecolor{orange}{rgb}{1, 0.603921568627451, 0.301960784313725}$$`
`$$\require{color}\definecolor{blue}{rgb}{0.301960784313725, 0.580392156862745, 1}$$`
`$$\require{color}\definecolor{pink}{rgb}{0.976470588235294, 0.301960784313725, 1}$$`
</div>

<script type="text/x-mathjax-config">
MathJax.Hub.Config({
  TeX: {
    Macros: {
      orange: ["{\\color{orange}{#1}}", 1],
      blue: ["{\\color{blue}{#1}}", 1],
      pink: ["{\\color{pink}{#1}}", 1]
    },
    loader: {load: ['[tex]/color']},
    tex: {packages: {'[+]': ['color']}}
  }
});
</script>

<style>
.orange {color: #FF9A4D;}
.blue {color: #4D94FF;}
.pink {color: #F94DFF;}
</style>

# Welcome

--

- Introductions
- Syllabus
- Material
- Questions

---

# Questions

You are encouraged to ask questions as they come up rather than waiting for me to ask for them.

If you have a question, chances are that someone else has the same question.

---

# Attendance and Camera

Both are highly encouraged

Due to COVID-19, both should be practiced according to what keeps you safe

---

# About Me

- Data Analyst at Teladoc Health
- R package developer, about 10 packages on CRAN (textrecipes, themis, paletteer, prismatic, textdata)
- Co-author of "Supervised Machine Learning for Text Analysis in R", available for preorder

---

![](images/website.png)

???
https://emilhvitfeldt.github.io/AU-2021spring-627/index.html

---

# Syllabus

---

## An Introduction to Statistical Learning with Applications in R

This will be our main textbook

We will cover most of the material and follow it mostly in chapter order

---

## Tidymodels labs for ISLR

https://emilhvitfeldt.github.io/ISLR-tidymodels-labs/index.html

---

## The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Supplementary textbook

I will sometimes refer to this book when more detail is needed

---

## Tidy Modeling with R

New (work in progress)

I will sometimes refer to this book when more detail is needed

---

## Syllabus

Come to me before it is too late

I'm here to help; my main goal for this course is to help you succeed

---

# Late assignments

There are some (limited) late penalties

It is more important to me that you turn something in than that you give up. You will always get points (sometimes reduced) for late assignments

Contact me if you are having a hard time or need to turn something in late

---

# Lecture

40 + 15 + 40 + 15 + 40

Focused on intuition, concepts, and statistics

---

# Labs

A hands-on section where we work together on the implementation side in R

These should be turned in WITH explanatory text.

---

# Assignments

There will be 5 assignments

They will contain a mix of conceptual questions and practical coding exercises about the weekly topic

---

# Final Project

We end the class with a final machine learning project

You will find a data set and analyze it with the tools you have learned in the class

The project will be a document (25%) and a presentation given to the class (5%)

Your data set will have to be approved by me

---

![](images/tidymodels.png)

---

# tidymodels feedback

Any and all feedback regarding the use of {tidymodels} is appreciated

Both on how I am teaching it and on what it is like to use

---

# Material 1/2

1. Introduction, motivation, and examples. Understanding large and complex data sets. Statistical learning. First steps in R. [Chap. 1-2].
2. 
Review of regression modeling and analysis; implementation in R. [Chap. 3].
3. Classification problems and classification tools. Logistic regression and review of linear discriminant analysis. [Chap. 4].
4. Resampling methods; bootstrap. [Chap. 5 and lecture notes].

---

# Material 2/2

5. High-dimensional data and shrinkage. Ridge regression. LASSO. Model selection methods and dimension reduction. Principal components. Partial least squares. [Chap. 6].
6. Nonlinear trends and splines. [Chap. 7; 7.4-7.5].
7. Regression trees and decision trees. [Chap. 8].
8. Introduction to support vector machines. [Chap. 9].
9. Clustering methods. [Chap. 10].
10. Additional topics and applications, if time permits.

---

.pull-left[
## What you will learn

- How the foundational machine learning models work
- The intuition behind them
- How to use them
- Using {tidymodels}
]

.pull-right[
## What you won't

- How they are coded
- State of the art (SOTA) methods
]

---

# Questions?