Sleep Quality Prediction
AIM
To predict sleep quality based on lifestyle and health factors.
DATASET LINK
Sleep Health and Lifestyle Dataset
DESCRIPTION
What is the requirement of the project?
- This project aims to predict the quality of sleep using various health and lifestyle metrics. Predicting sleep quality helps individuals and healthcare professionals address potential sleep-related health issues early.
Why is it necessary?
- Sleep quality significantly impacts physical and mental health. Early prediction can prompt interventions that lower the risk of chronic conditions linked to poor sleep, such as obesity, heart disease, and cognitive impairment.
How is it beneficial and used?
- Individuals: Assess their sleep health and make lifestyle changes to improve sleep quality.
- Healthcare Professionals: Use the model as an auxiliary diagnostic tool to recommend personalized interventions.
How did you start approaching this project? (Initial thoughts and planning)
- Researching sleep health factors and existing literature.
- Exploring and analyzing the dataset to understand feature distributions.
- Preprocessing data for effective feature representation.
- Evaluating candidate machine learning models to balance accuracy against interpretability.
Mention any additional resources used
- Research Paper: Analyzing Sleep Patterns Using AI
- Public Notebook: Sleep Quality Prediction with 96% Accuracy
LIBRARIES USED
- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
- joblib
- flask
EXPLANATION
DETAILS OF THE DIFFERENT FEATURES
Feature Name | Description | Type | Values/Range |
---|---|---|---|
Gender | Respondent's gender | Categorical | [Male, Female] |
Age | Respondent's age | Numerical | Years |
Sleep Duration (hours) | Hours of sleep per day | Numerical | Hours per day |
Physical Activity Level | Daily physical activity in minutes | Numerical | Minutes per day |
Stress Level | Stress level on a scale | Numerical | 1 to 5 (low to high) |
BMI Category | Body Mass Index category | Categorical | [Underweight, Normal, Overweight, Obese] |
Systolic Blood Pressure | Systolic blood pressure | Numerical | mmHg |
Diastolic Blood Pressure | Diastolic blood pressure | Numerical | mmHg |
Heart Rate (bpm) | Resting heart rate | Numerical | Beats per minute |
Daily Steps | Average number of steps per day | Numerical | Steps per day |
Sleep Disorder | Reported sleep disorder | Categorical | [Yes, No] |
WHAT I HAVE DONE
Step 1: Exploratory Data Analysis
- Summary statistics
- Data visualization for numerical feature distributions
- Target distribution broken down by each categorical feature
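The EDA steps above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual notebook: the six-row sample and the binary `Good Sleep` target column are assumptions made up for the example.

```python
import pandas as pd

# Tiny hand-made sample standing in for the real dataset (illustrative values).
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female", "Male"],
    "Sleep Duration (hours)": [6.1, 7.8, 7.5, 5.9, 8.0, 6.4],
    "Stress Level": [4, 2, 2, 5, 1, 3],
    # Hypothetical binary target: 1 = good sleep quality, 0 = poor.
    "Good Sleep": [0, 1, 1, 0, 1, 0],
})

# Summary statistics (count, mean, std, quartiles) for the numerical features.
summary = df[["Sleep Duration (hours)", "Stress Level"]].describe()
print(summary)

# Target split for a categorical feature: share of each target class per gender.
target_split = pd.crosstab(df["Gender"], df["Good Sleep"], normalize="index")
print(target_split)
```

The same `crosstab` call can be repeated for the other categorical columns, such as BMI Category or Sleep Disorder.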
Step 2: Data Cleaning and Preprocessing
- Handling missing values
- Label encoding categorical features
- Standardizing numerical features
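A minimal sketch of these three preprocessing steps with pandas and scikit-learn is shown below. The four-row sample is invented for illustration, and median imputation is an assumption: the write-up does not say how missing values were actually handled.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Illustrative sample with two gaps in the numerical columns.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "BMI Category": ["Normal", "Overweight", "Normal", "Obese"],
    "Age": [29, 41, None, 52],
    "Heart Rate (bpm)": [72, 68, 75, None],
})

categorical = ["Gender", "BMI Category"]
numerical = ["Age", "Heart Rate (bpm)"]

# Missing values: fill numerical gaps with the column median (assumed strategy).
df[numerical] = df[numerical].fillna(df[numerical].median())

# Label encoding: map each category to an integer code, one encoder per column
# so the mapping can be reused later on new inputs.
encoders = {}
for col in categorical:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

# Standardization: rescale numerical columns to zero mean and unit variance.
scaler = StandardScaler()
df[numerical] = scaler.fit_transform(df[numerical])

print(df)
```

In a real pipeline the encoders and scaler would be fitted on the training split only and then applied to the test split, to avoid leaking test-set statistics.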
Step 3: Feature Engineering and Selection
- Merging features based on domain knowledge
- Creating derived features such as "Activity-to-Sleep Ratio"
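As an illustration of a derived feature, the sketch below computes an "Activity-to-Sleep Ratio". The exact formula is not given in this write-up, so the definition used here (minutes of physical activity per hour of sleep) is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "Physical Activity Level": [30, 60, 45],    # minutes per day
    "Sleep Duration (hours)": [6.0, 8.0, 7.5],  # hours per day
})

# Assumed definition: minutes of activity per hour of sleep.
df["Activity-to-Sleep Ratio"] = (
    df["Physical Activity Level"] / df["Sleep Duration (hours)"]
)

print(df)
```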
Step 4: Modeling
- Model trained: Decision Tree
- Class imbalance handled using SMOTE
- Metric for optimization: F1-score
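The modeling step can be sketched as below. The write-up does not say which SMOTE implementation was used (the imbalanced-learn package provides one), so `smote_oversample` is a hypothetical, simplified binary-class version written here only to show the idea: each synthetic point is interpolated between a minority sample and one of its nearest minority neighbors. The data is synthetic, and `max_depth=3` is an arbitrary choice.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier


def smote_oversample(X, y, minority_label, k=3, seed=0):
    """Simplified SMOTE: add synthetic minority points until classes balance."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbors: the query point is normally its own nearest one.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbors = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    # Each synthetic point lies on the segment between a minority sample
    # and one of its k nearest minority neighbors.
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority_label)])
    return X_out, y_out


# Synthetic imbalanced data: 80 majority samples, 20 minority samples.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (80, 3)), rng.normal(2.0, 1.0, (20, 3))])
y = np.array([0] * 80 + [1] * 20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Oversample the training split only, so the test split stays untouched.
X_bal, y_bal = smote_oversample(X_train, y_train, minority_label=1)

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_bal, y_bal)

score = f1_score(y_test, model.predict(X_test))
print("F1 on held-out data:", score)
```

Resampling after the train/test split matters: synthetic points derived from test samples would otherwise inflate the reported F1-score.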
Step 5: Result Analysis
- Visualized results using confusion matrices and classification reports
- Interpreted feature importance for tree-based models
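A sketch of this analysis with scikit-learn might look like the following. The data is randomly generated and the three feature names are borrowed from the feature table purely as labels, so the printed numbers say nothing about the real dataset.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.tree import DecisionTreeClassifier

feature_names = ["Sleep Duration (hours)", "Stress Level", "Daily Steps"]

# Synthetic data: the target depends on the first two columns only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] - 0.5 * X[:, 1] > 0).astype(int)

# Train on the first 150 rows, evaluate on the remaining 50.
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:150], y[:150])
y_pred = model.predict(X[150:])

# Confusion matrix: rows are true classes, columns are predicted classes.
cm = confusion_matrix(y[150:], y_pred)
print(cm)

# Per-class precision, recall, and F1-score in one text table.
print(classification_report(y[150:], y_pred))

# Feature importances sum to 1; a larger value means the feature
# contributed more to the impurity reduction of the tree's splits.
importances = model.feature_importances_
for name, value in sorted(zip(feature_names, importances), key=lambda p: -p[1]):
    print(f"{name}: {value:.3f}")
```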
MODELS USED AND THEIR ACCURACIES
Model | Accuracy (%) | F1-Score (%) | Precision (%) | Recall (%) |
---|---|---|---|---|
Decision Tree | 74.50 | 75.20 | 73.00 | 77.50 |
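As a quick sanity check on how these metrics relate, the F1-score is the harmonic mean of precision and recall. The short calculation below recomputes it from the precision and recall in the table; it is only an arithmetic illustration, and it assumes the table reports metrics for a single positive class rather than an average over classes.

```python
precision = 0.7300  # from the table above
recall = 0.7750     # from the table above

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 75.18%"
```

The result, about 75.18%, is close to the 75.20% reported in the table.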
CONCLUSION
WHAT YOU HAVE LEARNED
Insights gained from the data
- Sleep Duration, Stress Level, and Physical Activity Level were the most informative features for predicting sleep quality.
Improvements in understanding machine learning concepts
- Learned and implemented preprocessing techniques like encoding categorical variables and handling imbalanced datasets.
- Gained insights into deploying a machine learning model using Flask for real-world use cases.
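For reference, a Flask prediction endpoint can be sketched as below. This is not the project's deployment code: the `/predict` route, the JSON field names, and the two-feature toy model are all assumptions made for the example. A real service would load the trained model from disk, for instance with `joblib.load`, instead of fitting one inline.

```python
from flask import Flask, jsonify, request
from sklearn.tree import DecisionTreeClassifier

# Stand-in model fitted on toy data (assumption for this sketch).
# Toy features: [sleep duration in hours, stress level]; label 1 = good sleep.
model = DecisionTreeClassifier(random_state=0).fit(
    [[5.5, 5], [6.0, 4], [7.5, 2], [8.0, 1]], [0, 0, 1, 1]
)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    # Hypothetical request body: {"sleep_duration": 8.0, "stress_level": 1}
    payload = request.get_json()
    features = [[payload["sleep_duration"], payload["stress_level"]]]
    prediction = int(model.predict(features)[0])
    return jsonify({"good_sleep": prediction})


# Exercise the endpoint in-process with Flask's test client (no server needed).
with app.test_client() as client:
    response = client.post(
        "/predict", json={"sleep_duration": 8.0, "stress_level": 1}
    )
    result = response.get_json()
    print(result)
```

To serve requests over HTTP, the app would instead be started with `flask run` or a production WSGI server.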
Challenges faced and how they were overcome
- Managing imbalanced classes: Addressed by oversampling the minority class with SMOTE.
- Choosing a simple yet effective model: Selected Decision Tree for its interpretability and ease of deployment.
USE CASES OF THIS MODEL
Application 1
A health tracker app can integrate this model to assess and suggest improvements in sleep quality based on user inputs.
Application 2
Healthcare providers can use this tool to make preliminary assessments of patients' sleep health, enabling timely interventions.
FEATURES PLANNED BUT NOT IMPLEMENTED
Feature 1
Advanced models such as Random Forest, AdaBoost, and Gradient Boosting were not implemented due to the project's focus on simplicity and interpretability.
Feature 2
Integration with wearable device data for real-time predictions was not explored but remains a potential enhancement for future work.