In this one-hour, project-based course, you will learn to build a logistic regression model using PySpark MLlib to classify patients as either diabetic or non-diabetic. We will use the popular Pima Indian Diabetes data set. Our goal is to use a simple logistic regression classifier from the PySpark machine learning library for diabetes classification. We will carry out the entire project in the Google Colab environment, with PySpark installed there. You will need a free Gmail account to complete this project. Please be aware that the dataset and the model in this project cannot be used in real life. We are using this data for educational purposes only.
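For readers new to the technique, here is a minimal sketch of what a logistic regression classifier computes once it has been trained: a weighted sum of the features passed through the sigmoid function, then thresholded. It uses only the Python standard library rather than PySpark, and the weights, intercept, and patient values are hypothetical numbers chosen for illustration, not values fitted to the course data.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, intercept):
    """Probability of the positive class for one row of features."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical, hand-picked numbers (NOT fitted to any data set).
weights = [0.03, 0.09]      # assumed order: glucose, BMI
intercept = -6.0
patient = [150.0, 33.0]     # assumed glucose and BMI readings

probability = predict_proba(patient, weights, intercept)
label = 1 if probability >= 0.5 else 0   # 1 = diabetic, 0 = non-diabetic
print(f"p(diabetic) = {probability:.3f}, predicted label = {label}")
```

Training is the step that finds the weights and intercept from labelled data; a library classifier handles that part.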



(22 reviews)
What you'll learn
- Learn to build and train a logistic regression classifier using PySpark MLlib
- Learn to set up PySpark in the Google Colab environment
- Learn to work with PySpark DataFrames
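
As a rough sketch of the setup and DataFrame objectives above, the snippet below starts a Spark session and loads a CSV file into a DataFrame. The file name `diabetes.csv` and the helper names `create_session` and `load_csv` are assumptions made for illustration; the course's own notebook may differ. In Colab, PySpark is typically installed first with `!pip install pyspark` in a notebook cell.

```python
# Hypothetical file name; point this at wherever the data set is stored.
CSV_PATH = "diabetes.csv"


def create_session(app_name="diabetes-classification"):
    """Start (or reuse) a local Spark session."""
    # Imported here so the helpers can be defined before PySpark is installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).getOrCreate()


def load_csv(spark, path=CSV_PATH):
    """Read a CSV with a header row, letting Spark infer column types."""
    return spark.read.csv(path, header=True, inferSchema=True)


# Typical usage in a notebook cell, once PySpark and the CSV are available:
#   spark = create_session()
#   df = load_csv(spark)
#   df.printSchema()   # column names and inferred types
#   df.show(5)         # first five rows
#   print(df.count())  # number of rows
```

`inferSchema=True` costs an extra pass over the file but spares you from declaring column types by hand, which is a reasonable trade for a small data set.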
About this Guided Project
Learn step-by-step
In a video that plays in a split-screen with your work area, your instructor will walk you through these steps:
- Introduction and Install Dependencies
- Clone and Explore the Dataset
- Data Cleaning and Preparation
- Correlation Analysis and Feature Selection
- Split the Dataset and Build the Logistic Regression Model
- Evaluate and Save the Model
- Model Prediction on a New Set of Unlabelled Data
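
The steps above could be sketched roughly as follows. This is an outline under stated assumptions, not the instructor's notebook: it assumes the DataFrame-based `pyspark.ml` API, the column names used in the commonly distributed CSV of this data set (`Glucose`, `BMI`, `Outcome`, and so on), and that zeros in some measurement columns stand for missing values. All helper names are hypothetical.

```python
# Assumed column names; adjust them to match the actual CSV header.
FEATURE_COLUMNS = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
                   "Insulin", "BMI", "DiabetesPedigreeFunction", "Age"]
LABEL_COLUMN = "Outcome"
# Assumption: a zero in these columns means "not recorded".
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def replace_zeros_with_mean(df, columns=ZERO_AS_MISSING):
    """Data cleaning: swap zeros for the mean of the column's non-zero values."""
    from pyspark.sql import functions as F

    for name in columns:
        mean_value = df.filter(F.col(name) != 0).agg(F.mean(name)).first()[0]
        df = df.withColumn(
            name, F.when(F.col(name) == 0, mean_value).otherwise(F.col(name)))
    return df


def correlation_with_label(df):
    """Correlation analysis: Pearson correlation of each feature with the label."""
    return {name: df.stat.corr(name, LABEL_COLUMN) for name in FEATURE_COLUMNS}


def train_and_evaluate(df, model_path):
    """Split the data, fit logistic regression, evaluate it, and save the model."""
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import VectorAssembler

    assembler = VectorAssembler(inputCols=FEATURE_COLUMNS, outputCol="features")
    prepared = assembler.transform(df).select("features", LABEL_COLUMN)
    train, test = prepared.randomSplit([0.7, 0.3], seed=42)

    model = LogisticRegression(featuresCol="features",
                               labelCol=LABEL_COLUMN).fit(train)
    predictions = model.transform(test)
    auc = BinaryClassificationEvaluator(
        labelCol=LABEL_COLUMN, metricName="areaUnderROC").evaluate(predictions)

    model.write().overwrite().save(model_path)
    return model, auc


def predict_unlabelled(model_path, new_df):
    """Load the saved model and score rows that have no label column."""
    from pyspark.ml.classification import LogisticRegressionModel
    from pyspark.ml.feature import VectorAssembler

    model = LogisticRegressionModel.load(model_path)
    assembler = VectorAssembler(inputCols=FEATURE_COLUMNS, outputCol="features")
    scored = model.transform(assembler.transform(new_df))
    return scored.select("features", "probability", "prediction")
```

The 70/30 split, the fixed seed, and area under the ROC curve as the metric are illustrative choices; features with weak correlation to the label could be dropped from `FEATURE_COLUMNS` before training.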
How you'll learn
- Skill-based, hands-on learning - Practice new skills by completing job-related tasks. 
- Expert guidance - Follow along with pre-recorded videos from experts using a unique side-by-side interface. 
- No downloads or installation required - Access the tools and resources you need in a pre-configured cloud workspace. 
- Available only on desktop - This Guided Project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices. 
Learner reviews
22 reviews
- 5 stars: 72.72%
- 4 stars: 13.63%
- 3 stars: 13.63%
- 2 stars: 0%
- 1 star: 0%
Reviewed on Aug 22, 2024
Understand the concept easily and practice it at the same time.
Reviewed on Oct 17, 2021
Thank you for making the course so simple, and for showing how to develop a prediction model.
Reviewed on Nov 3, 2022
Solid introduction to PySpark MLlib, but it left much to be desired: I would have liked to see more model evaluation and a comparison to at least one other model.
Frequently asked questions
By purchasing a Guided Project, you'll get everything you need to complete the Guided Project including access to a cloud desktop workspace through your web browser that contains the files and software you need to get started, plus step-by-step video instruction from a subject matter expert.
Because your workspace contains a cloud desktop that is sized for a laptop or desktop computer, Guided Projects are not available on your mobile device.
Guided Project instructors are subject matter experts who have experience in the skill, tool or domain of their project and are passionate about sharing their knowledge to impact millions of learners around the world.


