R is a flexible and powerful open-source implementation of the language S (for statistics). R has eclipsed S and the commercially available S-Plus program for many reasons. R is free, and has a variety (nearly 4,000 at last count) of contributed packages, most of which are also free. R works on Macs, PCs, and Linux systems. In this book, you will see screens of R 2.15.1 running in a Windows 7 environment, but you will be able to use everything you learn with other systems, too.
Why R programming?
- R is not just a statistics package; it’s a language.
- R is designed to work the way people think about problems.
- R is both flexible and powerful.
What Is Logistic Regression?
Logistic regression enables us to predict a dichotomous (0, 1) outcome while overcoming the problems with linear regression and discriminant analysis.
When we have a binary outcome measure, we can treat it as a binomial process, with p being the proportion of 1s and q = 1 – p being the proportion of 0s. We call a 1 a “success” and a 0 a “failure.” As you may recall, which outcome is labeled a success is often a matter of choice, but sometimes the 1s really do represent “success,” such as with student retention in the examples discussed below.
The logistic curve has the very nice property of being asymptotic to 0 and 1 and always lying between 0 and 1. Thus, logistic regression avoids the problems of both linear regression and discriminant function analysis for predicting binary outcomes.
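To make the bounded shape of the curve concrete, here is a minimal R sketch. It assumes the standard logistic function 1 / (1 + e^(-z)), which the text does not write out, and the helper name logistic is hypothetical. Base R provides the same function as plogis().

```r
# Standard logistic function: maps any real number z into the interval (0, 1).
# The name `logistic` is a hypothetical helper used for illustration only.
logistic <- function(z) 1 / (1 + exp(-z))

# A few values on the linear (logit) scale, from strongly negative to strongly positive.
z <- c(-10, -2, 0, 2, 10)
p <- logistic(z)

print(round(p, 4))         # approaches 0 and 1 at the extremes, never reaching them
print(all(p > 0 & p < 1))  # every value lies strictly between 0 and 1

# Base R's plogis() computes the same curve.
print(isTRUE(all.equal(p, plogis(z))))
```

A straight line fitted to 0/1 data can predict values below 0 or above 1, which is why the bounded curve is preferred here.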
We can convert a probability to odds, and we use odds in logistic regression. If p is the probability of success, the odds in favor of success, o, are:
o = p / (1 – p) = p / q
Odds can also be converted to probabilities quite easily. If the odds in favor of an event are 5:1, then the probability of the event is 5/6. Note that odds, unlike probabilities, can exceed 1. For example, if the probability of rain is .25, the odds in favor of rain are .25 / .75 = .33, but the odds against rain are .75 / .25 = 3.00.
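The conversions between probabilities and odds described above can be sketched in R as follows. The function names prob_to_odds and odds_to_prob are hypothetical helpers introduced for illustration; they are not part of base R.

```r
# Convert a probability of success into the odds in favor of success.
prob_to_odds <- function(p) p / (1 - p)

# Convert odds in favor of an event back into a probability.
odds_to_prob <- function(o) o / (1 + o)

odds_for_rain     <- prob_to_odds(0.25)  # 0.25 / 0.75
odds_against_rain <- prob_to_odds(0.75)  # 0.75 / 0.25
prob_from_5_to_1  <- odds_to_prob(5)     # odds of 5:1, i.e. 5 / (1 + 5)

print(c(odds_for_rain, odds_against_rain, prob_from_5_to_1))
```

The two functions are inverses of each other, so odds_to_prob(prob_to_odds(p)) returns p for any p strictly between 0 and 1.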
In logistic regression we work with a term called the logit, which is the natural logarithm of the odds. We develop a linear combination of predictors and an intercept term to predict the logit:
ln(odds) = b0 + b1x1 + b2x2 + … + bkxk
Examining this model carefully, we see that we have simply described a linear regression on the logit transform of y, where y is the proportion of successes at each value of x. Remember that a 1 is considered a success and a 0 is considered a failure. Logistic regression allows us to model the probability of “success” as a function of the logistic curve, which is never less than 0 and never greater than 1. Because we are not accustomed to thinking in terms of logarithms, we typically convert the logit to odds, which make more intuitive sense to us.
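As a rough illustration of a linear model on the logit scale, the sketch below fits a logistic regression with base R's glm() function. The data are simulated, and the sample size, the seed, and the "true" coefficients are arbitrary assumptions made only so the example is self-contained. They do not come from the examples in this tutorial.

```r
set.seed(42)  # arbitrary seed so the simulated data are reproducible

# Hypothetical data: one predictor x and a binary outcome y (1 = success, 0 = failure).
n <- 200
x <- rnorm(n)
true_logit <- -0.5 + 1.2 * x  # assumed b0 and b1, used only to simulate y
y <- rbinom(n, size = 1, prob = plogis(true_logit))

# Fit the model: the logit of P(y = 1) is modeled as a linear function of x.
fit <- glm(y ~ x, family = binomial(link = "logit"))

print(coef(fit))  # estimated b0 and b1, expressed on the logit scale

pred_logit <- predict(fit, type = "link")      # predicted logits (unbounded)
pred_prob  <- predict(fit, type = "response")  # predicted probabilities

print(range(pred_prob))  # probabilities stay between 0 and 1
```

The type = "link" predictions are on the logit scale, while type = "response" applies the logistic curve to return probabilities.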
We can convert the logit to odds as follows:
odds = e^(b0 + b1x1 + … + bkxk)
Remember that e is the base of the natural logarithm.
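The conversion from logit to odds, and from odds on to a probability, can be worked through in a few lines of R. The coefficient values b0 and b1 and the predictor value x1 below are hypothetical numbers chosen purely for illustration.

```r
# Hypothetical intercept, slope, and predictor value (illustrative only).
b0 <- -1.0
b1 <- 0.8
x1 <- 2

logit <- b0 + b1 * x1       # linear predictor: the natural log of the odds
odds  <- exp(logit)         # odds = e^(b0 + b1*x1)
prob  <- odds / (1 + odds)  # convert the odds back to a probability

print(c(logit = logit, odds = odds, prob = prob))
```

Taking log(odds) recovers the logit, and plogis(logit) gives the same probability in one step.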
This tutorial covers the concepts you need to implement logistic regression in R.