Day 09 - Logistic Regression
Oct. 6, 2020
Administrative
Homework 3 will be assigned Friday 10/9 and due Friday 10/23
Midterm will be given Thursday 10/29 in class
From Pre-Class Assignment
Useful Stuff
Videos from Google were helpful to understand the scope of Machine Learning
I have a better understanding of train/test split
Challenging bits
I am still a little confused about why we split the data
I am not sure what make_classification is doing
What are redundant and informative features? How do we see them in the plots?
We will be doing classification tasks for a few weeks, so we will get lots of practice
Machine Learning
Classification
Classification Algorithms
Logistic Regression: The most traditional technique; was developed and used prior to ML; fits data to a "sigmoidal" (s-shaped) curve; fit coefficients are interpretable
K Nearest Neighbors (KNN): A more intuitive method; nearby points are part of the same class; fits can have complex shapes
Support Vector Machines (SVM): Developed for linear separation (i.e., find the optimal "line" to separate classes); can be extended to curved boundaries through different "kernels"
Decision Trees: Uses binary (yes/no) questions about the features to fit classes; can be used with numerical and categorical input
Random Forest: A collection of randomized decision trees; less prone to overfitting than decision trees; can rank the importance of features for prediction
Gradient Boosted Trees: An even more robust tree-based algorithm
We will learn Logistic Regression, KNN, and SVM, but sklearn provides access to the other three methods as well (see the import sketch below).
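If you want to explore the other methods on your own, the six algorithms above map onto sklearn estimators roughly as follows. This is a minimal sketch; the class names are sklearn's, but picking SVC as the SVM variant to start with is our assumption:

# sklearn estimators corresponding to the algorithms listed above
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

Each of these follows the same fit/predict interface, which is why swapping algorithms in sklearn is usually a one-line change.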
Generate some data
make_classification lets us make fake data and control the kind of data we get.
n_features - the total number of features that can be used in the model
n_informative - the number of features that provide unique information about the classes; say 2, so $y_0$ and $y_1$
n_redundant - the number of features that are built as linear combinations of the informative features (i.e., carry redundant information); say 1, so $y_2 = d_0 y_0 + d_1 y_1$
n_classes - the number of class labels (default 2: 0/1)
n_clusters_per_class - the number of clusters per class

In [63]:
import matplotlib.pyplot as plt
plt.style.use('seaborn-colorblind')

from sklearn.datasets import make_classification

features, class_labels = make_classification(n_samples = 1000,
                                              n_features = 3,
                                              n_informative = 2,
                                              n_redundant = 1,
                                              n_clusters_per_class = 1,
                                              random_state = 201)
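To see what "redundant" means concretely, we can check that the three columns are linearly dependent: because one feature is built as a linear combination of the others, any one column can be reconstructed from the remaining two by least squares. This is a minimal sketch, not part of the original notebook:

import numpy as np

# regress feature 2 on features 0 and 1; the residual is ~0 because the
# redundant feature makes the three columns linearly dependent
coeffs, residuals, rank, _ = np.linalg.lstsq(features[:, :2], features[:, 2], rcond=None)
print("coefficients:", coeffs)
print("residual sum of squares:", residuals)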
In [64]:
## Let's look at these 3D data
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(8,8))
ax = Axes3D(fig, rect=[0, 0, .95, 1], elev=30, azim=135)

xs = features[:, 0]
ys = features[:, 1]
zs = features[:, 2]
ax.scatter3D(xs, ys, zs, c=class_labels, ec='k')

ax.set_xlabel('feature 0')
ax.set_ylabel('feature 1')
ax.set_zlabel('feature 2')

Out[64]: Text(0.5, 0, 'feature 2')
[3D scatter of the three features, colored by class label]
In [65]:
## From a different angle, we see the 2D nature of the data
fig = plt.figure(figsize=(8,8))
ax = Axes3D(fig, rect=[0, 0, .95, 1], elev=15, azim=90)

xs = features[:, 0]
ys = features[:, 1]
zs = features[:, 2]
ax.scatter3D(xs, ys, zs, c=class_labels, ec='k')

ax.set_xlabel('feature 0')
ax.set_ylabel('feature 1')
ax.set_zlabel('feature 2')

Out[65]: Text(0.5, 0, 'feature 2')
[Same 3D scatter viewed nearly edge-on, showing the effectively 2D structure of the data]
Feature Subspaces
For higher dimensions, we have to take 2D slices of the data (called "projections" or "subspaces").
In [66]:
f, axs = plt.subplots(1, 3, figsize=(15,4))

plt.subplot(131)
plt.scatter(features[:, 0], features[:, 1], marker = 'o', c = class_labels, ec = 'k')
plt.xlabel('feature 0')
plt.ylabel('feature 1')

plt.subplot(132)
plt.scatter(features[:, 0], features[:, 2], marker = 'o', c = class_labels, ec = 'k')
plt.xlabel('feature 0')
plt.ylabel('feature 2')

plt.subplot(133)
plt.scatter(features[:, 1], features[:, 2], marker = 'o', c = class_labels, ec = 'k')
plt.xlabel('feature 1')
plt.ylabel('feature 2')

plt.tight_layout()

[Three scatter plots: feature 0 vs 1, feature 0 vs 2, and feature 1 vs 2, colored by class label]
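With more features, writing out every pairwise plot by hand gets tedious. As a minimal sketch (assuming pandas is available; not part of the original notebook), a scatter matrix generates all of the 2D projections at once:

import pandas as pd
from pandas.plotting import scatter_matrix

# one row/column per feature; off-diagonal panels are the pairwise projections
df = pd.DataFrame(features, columns=['feature 0', 'feature 1', 'feature 2'])
scatter_matrix(df, c=class_labels, figsize=(10, 10), diagonal='hist')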
What about Logistic Regression?
Logistic Regression attempts to fit a sigmoid (S-shaped) function to your data. This shape assumes that the probability of belonging to class 1 (versus class 0) rises smoothly from near 0 to near 1 as the feature value changes.
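For a single feature, a standard way to write this (our notation, not from the original slides) is

$$p(\mathrm{class}=1 \mid y_0) = \frac{1}{1 + e^{-(w_0 + w_1 y_0)}}$$

where $w_0$ and $w_1$ are the fit coefficients, and the curve crosses $p = 0.5$ where $w_0 + w_1 y_0 = 0$.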
In [70]:
f, axs = plt.subplots(1, 3, figsize=(15,4))

plt.subplot(131)
plt.scatter(features[:, 0], class_labels, c=class_labels, ec='k')
plt.xlabel('feature 0')
plt.ylabel('class label')

plt.subplot(132)
plt.scatter(features[:, 1], class_labels, c=class_labels, ec='k')
plt.xlabel('feature 1')
plt.ylabel('class label')

plt.subplot(133)
plt.scatter(features[:, 2], class_labels, c=class_labels, ec='k')
plt.xlabel('feature 2')
plt.ylabel('class label')

plt.tight_layout()

[Class label (0/1) plotted against each of the three features]
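To connect these scatter plots to the sigmoid picture, here is a minimal sketch (not from the original notebook) that fits sklearn's LogisticRegression to feature 0 alone and overlays the predicted probability curve on the scatter above:

import numpy as np
from sklearn.linear_model import LogisticRegression

# fit on a single feature so the sigmoid is easy to visualize
X0 = features[:, [0]]
model = LogisticRegression()
model.fit(X0, class_labels)

# predicted probability of class 1 across the range of feature 0
grid = np.linspace(X0.min(), X0.max(), 200).reshape(-1, 1)
prob_class1 = model.predict_proba(grid)[:, 1]

plt.scatter(features[:, 0], class_labels, c=class_labels, ec='k')
plt.plot(grid, prob_class1, label='fitted sigmoid')
plt.xlabel('feature 0')
plt.ylabel('class label / probability of class 1')
plt.legend()

If the classes overlap along a feature, the fitted curve rises gradually; if they are well separated, it looks like a sharp step.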
Questions, Comments, Concerns?