Music Genre Classification Model
AIM
The aim of this project is to develop a precise and effective music genre classification model using Convolutional Neural Networks (CNN), Support Vector Machines (SVM), Random Forest and XGBoost Classifier algorithms for the Kaggle GTZAN Dataset Music Genre Classification.
DATASET LINK
https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification/data
MY NOTEBOOK LINK
Music Genre Classification Model
DESCRIPTION
- What is the requirement of the project?
  - The objective of this project is to develop a precise and effective music genre classification model using Convolutional Neural Networks (CNN), Support Vector Machines (SVM), Random Forest, and XGBoost algorithms for the Kaggle GTZAN Dataset Music Genre Classification.
- Why is it necessary?
  - Music genre classification has several real-world applications, including music recommendation, content-based music retrieval, and personalized music services. However, the task of music genre classification is challenging due to the subjective nature of music and the complexity of audio signals.
- How is it beneficial and used?
  - For Users: provides more personalised music recommendations.
  - For Developers: a recommendation system for songs that match the user's interests.
  - For Businesses: the more personalised recommendation service can be offered at a premium.
- How did you start approaching this project? (Initial thoughts and planning)
  - First studied how different sounds are structured.
  - Learned how to represent a sound signal in 2D on graphs using the librosa library.
- Came to know about the various features of sound, listed below (a minimal extraction sketch follows this list):
  - Mel-frequency cepstral coefficients (MFCC)
  - Chromagram
  - Spectral Centroid
  - Zero-crossing rate
  - BPM - Beats Per Minute
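A minimal sketch of reading these features from a single clip with librosa; the file path is a placeholder and parameter choices (e.g. `n_mfcc=20`) are my own assumptions, not the project's exact settings:

```python
import librosa

# Load ~30 seconds of one GTZAN clip (path is a placeholder).
y, sr = librosa.load("genres_original/blues/blues.00000.wav", duration=30)

# Frame-wise features listed above, all computed by librosa.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)        # shape (20, n_frames)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)          # shape (12, n_frames)
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)  # shape (1, n_frames)
zcr = librosa.feature.zero_crossing_rate(y)               # shape (1, n_frames)

# Global tempo (BPM) estimate for the clip.
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)

print("MFCC shape:", mfcc.shape, "| estimated tempo (BPM):", tempo)
```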
- Mention any additional resources used (blogs, books, chapters, articles, research papers, etc.):
- https://scholarworks.calstate.edu/downloads/73666b68n
- https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification/data
- https://towardsdatascience.com/music-genre-classification-with-python-c714d032f0d8
EXPLANATION
DETAILS OF THE DIFFERENT FEATURES
The dataset has four components:
- genres_original
- images_original
- features_3_sec.csv
- features_30_sec.csv
The genres in genres_original:
['blues', 'classical', 'country', 'disco', 'hiphop', 'jazz', 'metal', 'pop', 'reggae', 'rock']
Each genre has 100 WAV files.
The genres in images_original:
['blues', 'classical', 'country', 'disco', 'hiphop', 'jazz', 'metal', 'pop', 'reggae', 'rock']
Each genre has 100 PNG files (spectrogram images of the corresponding audio clips).
There are 60 features in features_3_sec.csv
There are 60 features in features_30_sec.csv
WHAT I HAVE DONE
- Created visual representations of the data to help understand it.
- Found strong relationships between independent features and the dependent feature using correlation.
- Performed Exploratory Data Analysis (EDA) on the data.
- Trained different classification models: KNN, SVM, Random Forest, and XGBoost (compared in the sketch below).
- Compared the models and used the best-performing one to make predictions.
- Used accuracy score to evaluate each model's performance.
- Visualized the best model's performance using the matplotlib and seaborn libraries.
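A minimal sketch of that comparison on the 30-second tabular features, assuming the Kaggle CSV layout (filename, length, and label columns with numeric features in between); hyperparameters here are library defaults, not the project's tuned values:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC
from xgboost import XGBClassifier

df = pd.read_csv("features_30_sec.csv")
X = df.drop(columns=["filename", "length", "label"]).values
y = LabelEncoder().fit_transform(df["label"])

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
scaler = StandardScaler().fit(X_tr)  # fit on the train split only to avoid leakage
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

models = {
    "KNN": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=42),
    "SVM": SVC(),
    "XGBoost": XGBClassifier(),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(f"{name}: {accuracy_score(y_te, model.predict(X_te)):.5f}")
```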
PROJECT TRADE-OFFS AND SOLUTIONS
- Trade-off 1: How to visualize an audio signal.
  - Solution:
    - librosa: the mother of all audio analysis libraries in Python.
    - Plotting graphs: with the necessary libraries in place, I started plotting the audio signals.
    - Spectrogram: a visual representation of the spectrum of frequencies of a signal as it varies with time. When applied to an audio signal, spectrograms are sometimes called sonographs, voiceprints, or voicegrams. Here the frequency axis is converted to a logarithmic scale (see the sketch below).
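A minimal sketch of plotting such a log-frequency spectrogram (the file path is a placeholder):

```python
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

y, sr = librosa.load("genres_original/jazz/jazz.00000.wav", duration=30)

# Short-time Fourier transform, then amplitude -> dB for display.
D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)

fig, ax = plt.subplots(figsize=(10, 4))
img = librosa.display.specshow(D, sr=sr, x_axis="time", y_axis="log", ax=ax)
fig.colorbar(img, ax=ax, format="%+2.0f dB")
ax.set_title("Log-frequency spectrogram")
plt.show()
```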
- Trade-off 2: Finding features that help classify the data.
  - Solution:
    - Feature engineering: identifying which features are present in audio signals.
    - Spectral Centroid: indicates where the "centre of mass" of a sound is located, calculated as the weighted mean of the frequencies present in the sound.
    - Mel-Frequency Cepstral Coefficients (MFCCs): a small set of features (usually about 10-20) that concisely describe the overall shape of a spectral envelope; they model the characteristics of the human voice.
    - Chroma Frequencies: an interesting and powerful representation for music audio in which the entire spectrum is projected onto 12 bins representing the 12 distinct semitones (or chroma) of the musical octave.
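These features are frame-wise, so they are usually aggregated into per-clip statistics before feeding a classical classifier; the provided CSVs follow the same mean/variance pattern. A sketch of turning one clip into a single feature row, with column names of my own choosing:

```python
import numpy as np
import librosa

def feature_row(path: str) -> dict:
    """Aggregate frame-wise features of one clip into scalar statistics."""
    y, sr = librosa.load(path, duration=30)
    row = {}
    # Spectral centroid and chroma: one mean/variance pair each.
    for name, values in {
        "spectral_centroid": librosa.feature.spectral_centroid(y=y, sr=sr),
        "chroma_stft": librosa.feature.chroma_stft(y=y, sr=sr),
    }.items():
        row[f"{name}_mean"] = float(np.mean(values))
        row[f"{name}_var"] = float(np.var(values))
    # Each of the 20 MFCCs gets its own mean/variance pair.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    for i, coeff in enumerate(mfcc, start=1):
        row[f"mfcc{i}_mean"] = float(np.mean(coeff))
        row[f"mfcc{i}_var"] = float(np.var(coeff))
    return row
```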
- Trade-off 3: Performing EDA on the CSV files.
  - Solution:
    - Tool selection: used a correlation matrix on the features_30_sec.csv dataset to identify the most strongly correlated features (sketch below).
    - Visualization best practices: followed best practices such as using appropriate chart types (e.g., box plots for BPM data, PCA plots for correlations), adding labels and titles, and ensuring readability.
    - Iterative refinement: iteratively refined visualizations based on feedback and self-review to enhance clarity and informativeness.
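A minimal sketch of that correlation matrix, assuming the Kaggle column naming where aggregated feature columns end in `_mean`:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("features_30_sec.csv")

# Correlate only the mean columns to keep the heatmap readable.
mean_cols = [c for c in df.columns if c.endswith("_mean")]
corr = df[mean_cols].corr()

plt.figure(figsize=(12, 10))
sns.heatmap(corr, cmap="coolwarm", center=0)
plt.title("Correlation between mean features (features_30_sec.csv)")
plt.tight_layout()
plt.show()
```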
- Trade-off 4: Implementing machine learning models.
  - Solution:
    - Cross-validation: used cross-validation to ensure the reliability and accuracy of the results (see the sketch below).
    - Collaboration with experts: engaged with music experts and enthusiasts to validate the findings and gain additional perspectives.
    - Contextual understanding: interpreted results within the context of the music, considering factors such as the user's mood, surroundings, and specific events, to provide meaningful and actionable insights.
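A minimal cross-validation sketch with scikit-learn, reusing the unscaled `X` and `y` arrays from the model-comparison snippet above (XGBoost chosen here because it scored best):

```python
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# 5-fold CV; scikit-learn stratifies the folds automatically for classifiers.
scores = cross_val_score(XGBClassifier(), X, y, cv=5)
print(f"Accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```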
LIBRARIES NEEDED
- librosa
- matplotlib
- pandas
- sklearn
- seaborn
- numpy
- scipy
- xgboost
SCREENSHOTS
MODELS USED AND THEIR ACCURACIES
| Model | Accuracy |
|---|---|
| KNN | 0.80581 |
| Random Forest | 0.81415 |
| Cross Gradient Booster (XGBoost) | 0.90123 |
| SVM | 0.75409 |
MODELS COMPARISON GRAPHS
CONCLUSION
- The accuracy plots above compare the performance of the different models.
- The XGB Classifier makes the most accurate predictions of a music clip's genre.
WHAT YOU HAVE LEARNED
- Insights gained from the data:
  - Discovered a new library (librosa) that helps visualize audio signals.
  - Discovered new audio features such as STFT, MFCC, Spectral Centroid, and Spectral Rolloff.
  - Gained a deeper understanding of the features of different genres of music.
- Improvements in understanding machine learning concepts:
  - Enhanced knowledge of data cleaning and preprocessing techniques to handle real-world datasets.
  - Improved skills in exploratory data analysis (EDA) to extract meaningful insights from raw data.
  - Learned how to use visualization tools to effectively communicate data-driven findings.
USE CASES OF THIS MODEL
- Application 1: User Personalisation
  - Explanation: the model can be used to provide more personalised music recommendations based on each user's taste and the genres they listen to. This personalised experience can be used to develop 'Premium' business models.
- Application 2: Compatibility Between Users
  - Explanation: based on users' musical taste and the genres they listen to, we can identify behaviour patterns and match similar users who could become friends. This increases social interaction within the app.
HOW TO INTEGRATE THIS MODEL IN REAL WORLD
- Use an API to collect user information.
- Deploy the model using appropriate tools (e.g., Flask, Docker); a minimal Flask sketch is shown below.
- Monitor and maintain the model in production.
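A minimal deployment sketch with Flask; the model.pkl artifact, endpoint name, and JSON payload format are my own assumptions, not part of the project:

```python
import pickle

import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical artifact: the best model trained on the CSV features, saved earlier.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects {"features": [...]} with values in the training column order.
    features = np.array(request.json["features"]).reshape(1, -1)
    genre = model.predict(features)[0]
    return jsonify({"genre": str(genre)})

if __name__ == "__main__":
    app.run(port=5000)
```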
FEATURES PLANNED BUT NOT IMPLEMENTED
- Feature 1: Real-time Compatibility Tracking
  - Description: implementing a real-time tracking system to view compatibility between users.
  - Reason it couldn't be implemented: lack of access to live data streams and the complexity of integrating real-time data processing.
- Feature 2: Predictive Analytics
  - Description: using advanced machine learning algorithms to predict the next song a user is likely to listen to.
  - Reason it couldn't be implemented: constraints in computational resources and the need for more sophisticated modeling techniques beyond the current scope of the project.
YOUR NAME
Filbert Shawn