
Explore my diverse collection of projects showcasing machine learning, data analysis, and more. Organized by project, each directory contains code, datasets, documentation, and resources. Dive in to discover insights and techniques in data science. Reach out for collaborations and feedback.


SUKHMAN-SINGH-1612/Data-Science-Projects


Data Science Projects

Welcome to my Data Science Projects Repository! This repository contains a collection of my data science projects, showcasing my skills and expertise in the field. Each project demonstrates different aspects of data analysis, machine learning, and visualization.


GitHub Page

Data-Science-Projects

Projects

  1. Breast Cancer Prediction
    • Description: The project predicts the breast cancer diagnosis (M = malignant, B = benign)
    • Technologies Used: The notebook uses Decision Tree Classification and Logistic Regression
    • Results: Logistic regression gave 97% accuracy and the decision tree gave 93.5% accuracy
  2. Red Wine Quality Prediction
    • Description: The project predicts wine quality as a binary value: 1 for good quality and 0 for bad quality
    • Technologies Used: The notebook uses logistic regression, support vector machine, decision tree and KNN
    • Results: The logistic regression model performs best, with an accuracy of 86.67%
  3. Heart Stroke Prediction
    • Description: The project predicts the risk of heart stroke from a person's demographics and medical information
    • Technologies Used: The notebook uses logistic regression, support vector machine, decision tree and KNN
    • Results: Logistic regression, SVM and KNN perform best, each with 93.8% accuracy
  4. House Price Prediction
    • Description: The project predicts the house price from variables such as location, area, and bedroom and bathroom counts, among others.
    • Technologies Used: The notebook uses Linear Regression, Ridge Regression and Random Forest Regressor
    • Results: The Random Forest Regressor performed best with accuracy of 87.89%
  5. Titanic Survival Prediction
    • Description: The project predicts survival during the Titanic disaster based on socio-economic measures
    • Technologies Used: The notebook uses Decision Tree Classifier
    • Results: The Decision Tree Classifier performed well on the test dataset with an accuracy of 89.5%
  6. Diamond Price Prediction
    • Description: The project predicts the price (in US dollars) of the diamonds based on their features
    • Technologies Used: The notebook uses Decision Tree Regressor and Random Forest Regressor
    • Results: The Decision Tree Regressor performed well on the test dataset with an accuracy of 96%
  7. Medical Cost Prediction
    • Description: The project predicts the medical treatment cost by analysing the patient's age, gender, BMI, smoking habits, etc.
    • Technologies Used: The notebook uses Linear and Polynomial Regression, Decision Tree and Random Forest Regressor
    • Results: The Decision Tree Regressor and Random Forest Regressor performed well
  8. Room Occupancy Detection
    • Description: The project predicts room occupancy by analyzing sensor data such as temperature, light and CO2 level.
    • Technologies Used: The notebook uses Random Forest Classifier
    • Results: The Random Forest Classifier performed well with an accuracy of 98%
  9. Sleep Disorder Prediction
    • Description: The project aims to predict sleep disorders and their types by analyzing lifestyle and medical variables, such as age, BMI, sleep duration, blood pressure, and more
    • Technologies Used: The notebook uses Random Forest Classifier and Decision Tree Classifier
    • Results: The Random Forest Classifier performed well with an accuracy of 89%
  10. Pima Indians Diabetes Prediction
    • Description: The primary objective of the Pima Indians Diabetes Prediction project is to analyze various medical factors of female patients to predict whether they have diabetes or not.
    • Technologies Used: The notebook uses Logistic Regression, Random Forest Classifier and Support Vector Machine
    • Results: The Logistic Regression performed with an accuracy of 78%.
  11. Bank Customer Churn Prediction
    • Description: The main objective of the Bank Customer Churn Prediction project is to analyze the demographics in order to predict whether a customer will leave the bank or not.
    • Technologies Used: The notebook uses Random Forest Classifier and Decision Tree Classifier
    • Results: The Random Forest Classifier and Decision Tree Classifier performed equally well with an accuracy of 87%
  12. Salary Prediction
    • Description: The main objective of the Salary Prediction project is to analyze employees' demographics, such as age, experience, job title, country and race, to predict salary.
    • Technologies Used: The notebook uses Decision Tree Regressor and Random Forest Regressor
    • Results: The Random Forest Regressor performed best with 94.6% accuracy
  13. Delhi House Price Prediction
    • Description: The primary objective is to develop a predictive model that can accurately estimate house prices based on several key features present in the dataset.
    • Technologies Used: The notebook uses Decision Tree Regressor and Random Forest Regressor
    • Results: The Random Forest Regressor performed best with 84.98% accuracy
  14. Loan Approval Prediction
    • Description: The Loan Approval Prediction project aims to predict whether a loan application will be approved by a bank.
    • Technologies Used: The notebook uses Random Forest Classifier and Decision Tree Classifier
    • Results: The Decision Tree Classifier performed well with an accuracy of 91.4%
  15. Cardiovascular Disease Prediction
    • Description: The Cardiovascular Disease Prediction project aims to predict the occurrence of cardiovascular disease in patients based on their medical records and history.
    • Technologies Used: The notebook uses Random Forest Classifier, Decision Tree Classifier and Logistic Regression
    • Results: The Logistic Regression performed well with an accuracy of 91.4%
  16. Belarus Car Price Prediction
    • Description: The Belarus Car Price Prediction project aims to predict the price of cars in Belarus based on car features.
    • Technologies Used: The notebook uses Decision Tree Regressor
    • Results: The Decision Tree Regressor gave an accuracy of 86.29%
  17. Warranty Claims Fraud Prediction
    • Description: The aim of this data science project is to predict the authenticity of warranty claims by analyzing various factors such as region, product category, claim value, and more.
    • Technologies Used: The notebook uses Decision Tree Classifier, Random Forest Classifier and Logistic Regression
    • Results: All three models gave an accuracy of 91-92%
  18. E-Commerce Product Delivery Prediction
    • Description: The aim of this project is to predict whether products from an international e-commerce company will reach customers on time or not.
    • Technologies Used: The notebook uses Decision Tree Classifier, Random Forest Classifier, Logistic Regression and KNN Classifier
    • Results: The decision tree classifier model performed best with 69% accuracy
  19. Hotel Reservations Cancellation Prediction
    • Description: The aim of this project is to predict which reservations are likely to be cancelled by customers by analyzing various features and variables associated with the reservation.
    • Technologies Used: The notebook uses Decision Tree Classifier, Random Forest Classifier and Logistic Regression.
    • Results: The decision tree classifier model performed best with 85% accuracy
  20. Telecom Customer Churn Prediction
    • Description: The aim of this project is to analyze customer demographics, services, tenure and other variables to predict whether a particular customer will churn or not.
    • Technologies Used: The notebook uses Decision Tree Classifier, Random Forest Classifier and K Nearest Neighbor Classifier.
    • Results: The random forest classifier model performed best with 82% accuracy
  21. SFR Analysis
    • Description: The objective of this project is to analyze the SFR (SpaceFund Reality) rating of aerospace companies and their missions to help investors make better decisions.
    • Technologies Used: The notebook uses Decision Tree Classifier and Random Forest Classifier.
    • Results: The random forest classifier and decision tree classifier gave 87% accuracy.
  22. Indian Used Car Price Prediction
    • Description: The aim of this data science project is to predict the price of used cars in major Indian metro cities.
    • Technologies Used: The notebook uses Decision Tree Regressor and Random Forest Regressor.
    • Results: The random forest regressor gave 87.8% accuracy
  23. Crop Yield Prediction
    • Description: The aim of this data science project is to predict crop yield using the dataset provided from Crop Yield Prediction.
    • Technologies Used: The notebook uses Decision Tree Regressor and Random Forest Regressor.
    • Results: The random forest regressor gave 80.2% accuracy
  24. Osteoporosis Risk Prediction
    • Description: The aim of this project is to predict the risk of osteoporosis in patients using a dataset of patients' medical records.
    • Technologies Used: The notebook uses Logistic Regression, Random Tree, Decision Tree and Support Vector Classifier.
    • Results: The Decision Tree Classifier gave 87% accuracy
  25. Calgary Crime Data Analysis and Neural Network Prediction
    • Description: The aim of this project is to analyze crime reports from 2018 to April 2024 in the city of Calgary and predict the crime count using a neural network.
    • Technologies Used: The notebook uses an LSTM neural network, trained with the Adam optimizer, to predict the crime count.
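
Most of the classification projects above follow the same pattern: fit several models on a training split and compare their accuracy on held-out data. The sketch below illustrates that pattern with scikit-learn. It is not taken from any of the notebooks: the function name `compare_classifiers`, the synthetic data, the 80/20 split, and the hyperparameters are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, test_size=0.2, seed=42):
    """Fit each model on a train split and return its test accuracy.

    Hypothetical helper; the split size and model settings are assumptions.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in data (an assumption; the notebooks load their own CSV files).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
scores = compare_classifiers(X, y)
for name, acc in scores.items():
    print(f"{name}: {acc:.1%}")
```

To try this on one of the repository's datasets, replace the synthetic `X` and `y` with the feature columns and target column of the project's CSV file. The scores will depend on the preprocessing each notebook applies, so they may not match the figures listed above.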

Directory Structure

└── SUKHMAN-SINGH-1612-Data-Science-Projects/
    ├── Loan Approval Prediction/
    │   ├── description.md
    │   ├── loan_approval_dataset.csv
    │   └── Loan Approval Prediction.ipynb
    ├── Hotel Reservations Cancellation Prediction/
    │   ├── description.md
    │   ├── Hotel Reservations Cancelation Prediction.ipynb
    │   └── Hotel Reservations.csv
    ├── Crop Yield Prediction/
    │   ├── description.md
    │   ├── crop yield prediction.ipynb
    │   └── crop yield data sheet.xlsx
    ├── Belarus Car Price Prediction/
    │   ├── cars.csv
    │   ├── description.md
    │   └── Belarus Car Price Prediction.ipynb
    ├── Heart Stroke Prediction/
    │   ├── description.md
    │   ├── Stroke detection.ipynb
    │   └── healthcare-dataset-stroke-data.csv
    ├── Delhi House Price Prediction/
    │   ├── description.md
    │   ├── MagicBricks.csv
    │   └── Delhi House Price Prediction.ipynb
    ├── Pima Indians Diabetes Prediction/
    │   ├── diabetes.csv
    │   ├── Diabetes Prediction.ipynb
    │   └── description.md
    ├── SFR Analysis/
    │   ├── description.md
    │   ├── Launch SFR.csv
    │   └── SFR Analysis.ipynb
    ├── Breast Cancer Prediction/
    │   ├── Description.md
    │   ├── data.csv
    │   └── Breast Cancer Prediction.ipynb
    ├── Indian Used Car Price Prediction/
    │   ├── Indian Used Car Price Prediction.ipynb
    │   ├── usedCars.csv
    │   └── description.md
    ├── Customer Churn Prediction/
    │   ├── description.md
    │   ├── Customer Churn Prediction.ipynb
    │   └── churn.csv
    ├── Room Occupancy Detection/
    │   ├── Room Occupancy Detection.ipynb
    │   ├── datatraining.csv
    │   ├── datatest2.csv
    │   ├── description.md
    │   └── datatest.csv
    ├── Calgary Crime Data Analysis and Neural Network Model/
    │   ├── Calgary_Crime_Data_Analysis_and_Neural_Network_Prediction.ipynb
    │   ├── description.md
    │   └── Community_Crime_Statistics_20240522.csv
    ├── Salary Prediction/
    │   ├── Salary_Data_Based_country_and_race.csv
    │   ├── description.md
    │   └── Salary Prediction.ipynb
    ├── Osteoporosis Risk Prediction/
    │   ├── description.md
    │   ├── osteoporosis.csv
    │   └── Osteoporosis Risk Prediction.ipynb
    ├── House Price Prediction/
    │   ├── description.md
    │   ├── house price.ipynb
    │   └── home_data.csv
    ├── Medical Cost Prediction/
    │   ├── description.md
    │   ├── Medical Cost Prediction.ipynb
    │   └── insurance.csv
    ├── Traffic-Flow-Prediction/
    │   ├── TrafficDataset.csv
    │   ├── description.md
    │   └── Traffic_flow_prediction.ipynb
    ├── Cardiovascular Disease Prediction/
    │   ├── description.md
    │   └── Cadivascular Disease Prediction.ipynb
    ├── E-Commerce Product Delivery Prediction/
    │   ├── E_Commerce.csv
    │   ├── description.md
    │   └── E-Commerce Product Delivery Prediction.ipynb
    ├── Telecom Customer Churn Prediction/
    │   ├── description.md
    │   ├── WA_Fn-UseC_-Telco-Customer-Churn.csv
    │   └── Telecom Customer Churn Prediction.ipynb
    ├── Warranty Claims Fraud Prediction/
    │   ├── description.md
    │   ├── df_Clean.csv
    │   └── Warranty Claims Fraud Prediction.ipynb
    ├── LICENSE
    ├── Diamond Price Prediction/
    │   ├── Diamond Price Prediction.html
    │   ├── description.md
    │   ├── diamonds.csv
    │   └── Diamond Price Prediction.ipynb
    ├── README.md
    ├── Titanic Survival Prediction/
    │   ├── description.md
    │   ├── titanic_train.csv
    │   ├── titanic_test.csv
    │   └── Titantic Prediction.ipynb
    ├── Red Wine Quality/
    │   ├── Description.md
    │   ├── winequality-red.csv
    │   └── Wine-Quality.ipynb
    ├── Sleep Disorder Prediction/
    │   ├── Sleep_health_and_lifestyle_dataset.csv
    │   ├── description.md
    │   └── Sleep Disorder Prediction.ipynb
    └── CONTRIBUTING.md
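
Because each project folder follows the same layout (a notebook, one or more data files, and a description), the tree above can be indexed with a few lines of Python. This is a minimal sketch using only the standard library. The helper name `index_projects` and the set of data file extensions are assumptions for illustration, not part of the repository.

```python
from pathlib import Path

# File extensions treated as datasets (an assumption based on the tree above).
DATA_SUFFIXES = {".csv", ".xlsx"}


def index_projects(repo_root):
    """Map each project folder name to its notebooks and data files."""
    index = {}
    for folder in sorted(Path(repo_root).iterdir()):
        # Skip top-level files (LICENSE, README.md) and hidden folders (.git).
        if not folder.is_dir() or folder.name.startswith("."):
            continue
        files = sorted(p.name for p in folder.iterdir() if p.is_file())
        index[folder.name] = {
            "notebooks": [f for f in files if f.endswith(".ipynb")],
            "data": [f for f in files if Path(f).suffix.lower() in DATA_SUFFIXES],
        }
    return index
```

Calling `index_projects(".")` from the root of a local clone returns one entry per project folder, which can be handy for spotting folders that lack a dataset or a notebook.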

License

This project is licensed under the MIT License. You are free to use the code and resources for educational or personal purposes with citation or reference to the original code and resources used.

Contributing

Contributions are welcome! If you would like to contribute to this repository, please follow the guidelines outlined in CONTRIBUTING.md. Any improvements, bug fixes, or additional projects are greatly appreciated.


👋 Join the Discussion!

We believe in the power of community and collaboration. Head over to our Discussion Page to engage with fellow data enthusiasts, share your ideas, ask questions, and contribute to our vibrant community. Whether you're a seasoned data scientist or just starting out, your voice matters! Let's learn, grow, and innovate together. See you there! 🚀

Feedback and Contact

I welcome any feedback, suggestions, or questions you may have about the projects or any kind of sponsorships for the repository. Feel free to reach out to me via email at [email protected]

Enjoy exploring my data science projects!
