Data Scientist | Data Analyst | Business Analyst
Projects
Credit Risk Assessment
Technologies:
- Python: Used for data preprocessing, feature engineering, and model building with libraries like Pandas, NumPy, and Scikit-learn.
- Streamlit: Built an interactive web application for real-time predictions and explainability.
- SHAP: Added model interpretability through feature importance visualisations.
Features:
- Data Preprocessing: Handled missing values, scaled numerical data, and encoded categorical variables.
- Machine Learning: Trained a Gradient Boosting Classifier to predict credit risk and evaluated it using ROC-AUC and accuracy.
- Interactive Application: Real-time predictions with visual risk scores and SHAP-based insights.
Data Source:
The dataset includes applicant demographics, loan details, and credit history, enabling a comprehensive credit risk prediction model.
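The preprocessing and modelling steps listed above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the data is synthetic, and the column names (`income`, `loan_amount`, `home_ownership`, `default`) are hypothetical stand-ins for the applicant, loan, and credit-history fields.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the applicant data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "income": np.clip(rng.normal(50_000, 15_000, n), 20_000, None),
    "loan_amount": np.clip(rng.normal(10_000, 4_000, n), 1_000, None),
    "home_ownership": rng.choice(["RENT", "OWN", "MORTGAGE"], n),
})
# Illustrative label: default risk rises with the loan-to-income ratio.
ratio = df["loan_amount"] / df["income"]
df["default"] = (ratio + rng.normal(0, 0.05, n) > 0.22).astype(int)
# Inject some missing values so the imputer has work to do.
df.loc[rng.choice(n, 25, replace=False), "income"] = np.nan

numeric = ["income", "loan_amount"]
categorical = ["home_ownership"]

# Impute and scale numeric columns; one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([("prep", preprocess),
                  ("gbc", GradientBoostingClassifier(random_state=0))])

X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["default"],
    test_size=0.25, stratify=df["default"], random_state=0)
model.fit(X_train, y_train)

# Evaluate with the two metrics named above.
proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
acc = accuracy_score(y_test, model.predict(X_test))
print(f"ROC-AUC: {auc:.3f}  accuracy: {acc:.3f}")
```

Keeping the preprocessing inside the pipeline means the same transformations are applied at prediction time, which is convenient when the model is served from an interactive app.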
Sales Forecasting using Machine Learning and Time Series Analysis
Technologies Used:
- Python, Pandas, NumPy, Matplotlib
- Scikit-learn, XGBoost, TensorFlow
- Keras, LSTM, RandomForestRegressor, LinearRegression
Key Features:
- Data Preprocessing: Handles missing values, converts date formats, and creates lag features for time series analysis.
- Time Series Analysis: Analyzes monthly sales trends and applies differencing to make the series stationary.
- Multiple Model Implementations: Uses Linear Regression, Random Forest, XGBoost, and LSTM models to predict future sales.
- Evaluation Metrics: Assesses model performance using MSE, MAE, and R-squared.
- Visualization: Compares actual vs. predicted sales through visualizations.
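As a rough sketch of the differencing, lag-feature, and evaluation steps above, the example below uses only the Linear Regression model on synthetic monthly data. The series, the choice of 12 lags, and the 12-month hold-out are assumptions made for illustration, not details taken from the project.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Synthetic monthly sales with a trend and yearly seasonality (illustrative only).
rng = np.random.default_rng(42)
months = pd.date_range("2015-01-01", periods=60, freq="MS")
t = np.arange(60)
values = 1000 + 20 * t + 150 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 30, 60)
sales = pd.Series(values, index=months, name="sales")

# First-order differencing removes the linear trend.
diff = sales.diff().dropna()

# Lag features: the previous 12 monthly differences predict the next one.
n_lags = 12
frame = pd.DataFrame({f"lag_{k}": diff.shift(k) for k in range(1, n_lags + 1)})
frame["target"] = diff
frame = frame.dropna()

# Chronological split: the last 12 months are held out for testing.
train, test = frame.iloc[:-12], frame.iloc[-12:]
features = [c for c in frame.columns if c != "target"]
model = LinearRegression().fit(train[features], train["target"])
pred_diff = model.predict(test[features])

# Undo the differencing: add each predicted change to the previous actual value.
prev_actual = sales.shift(1).loc[test.index]
pred_sales = prev_actual + pred_diff
actual = sales.loc[test.index]

mse = mean_squared_error(actual, pred_sales)
mae = mean_absolute_error(actual, pred_sales)
r2 = r2_score(actual, pred_sales)
print(f"MSE={mse:.1f}  MAE={mae:.1f}  R2={r2:.3f}")
```

The same `frame` could be passed to a Random Forest or XGBoost regressor in place of `LinearRegression`; an LSTM would need the lag columns reshaped into sequences first.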
Medicine Recommendation System
The Medicine Recommendation System provides personalized medication suggestions based on user input. It uses vectorization techniques to match user profiles against the most suitable treatments. The web application is built with Flask and is designed to be simple to use. Whether you're a healthcare professional or a patient looking for tailored suggestions, the system aims to return relevant recommendations quickly.
Please note that this project is intended solely for educational purposes and should not be used as a substitute for professional medical advice.
Movie Recommendation System
This project is a Movie Recommendation System that suggests movies tailored to individual preferences. It uses vectorization to represent movie data in a high-dimensional space, capturing the relationships between attributes so that similar movies can be found and recommended.
The web interface is built with Streamlit. Users enter their preferences and receive instant recommendations through a responsive interface. Whether you’re a casual viewer or a cinephile, the Movie Recommendation System aims to bring relevant movie suggestions to your screen.
Amazon Prime Video Dashboard using Tableau
The Amazon Prime Video dashboard, built with Tableau, offers a comprehensive visualisation of streaming data, giving insight into user engagement and content performance. It helps stakeholders make data-driven decisions, optimise content strategy, improve the overall user experience on Amazon Prime Video, and increase sales.
Flight Punctuality Data Warehouse using SQL, OLAP and Tableau
The Flight Punctuality Data Warehouse project aims to develop a robust system for storing, analysing, and visualising flight punctuality data. This system will help airlines, airports, and regulatory authorities to monitor and improve flight punctuality, understand delay patterns, and make data-driven decisions. The project will leverage SQL for data storage and management, OLAP for multi-dimensional analysis, and Tableau for dynamic data visualisation.
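To illustrate the kind of multi-dimensional analysis described above, here is a small sketch of a star schema with an OLAP-style roll-up, using Python's built-in `sqlite3` module. The table names, columns, sample rows, and the 15-minute on-time threshold are all hypothetical; the project's actual schema is not shown in this description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe who and where; the fact table holds measurements.
CREATE TABLE dim_airline (airline_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_airport (airport_id INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE fact_flight (
    flight_id     INTEGER PRIMARY KEY,
    airline_id    INTEGER REFERENCES dim_airline(airline_id),
    airport_id    INTEGER REFERENCES dim_airport(airport_id),
    flight_month  TEXT,
    delay_minutes INTEGER
);
INSERT INTO dim_airline VALUES (1, 'Airline A'), (2, 'Airline B');
INSERT INTO dim_airport VALUES (1, 'LHR'), (2, 'MAN');
INSERT INTO fact_flight VALUES
    (1, 1, 1, '2024-01', 5),
    (2, 1, 1, '2024-01', 40),
    (3, 1, 2, '2024-02', 0),
    (4, 2, 1, '2024-01', 20),
    (5, 2, 2, '2024-02', 10),
    (6, 2, 2, '2024-02', 60);
""")

# Roll-up by airline and month: flight count, average delay, and on-time share
# (on time is assumed here to mean a delay of 15 minutes or less).
rollup = conn.execute("""
    SELECT a.name, f.flight_month,
           COUNT(*)                   AS flights,
           AVG(f.delay_minutes)       AS avg_delay,
           AVG(f.delay_minutes <= 15) AS on_time_share
    FROM fact_flight f
    JOIN dim_airline a USING (airline_id)
    GROUP BY a.name, f.flight_month
    ORDER BY a.name, f.flight_month
""").fetchall()

for row in rollup:
    print(row)
```

Drilling down to airport level would mean joining `dim_airport` and adding its `code` column to the `GROUP BY`; a result set like this is also the shape a Tableau worksheet would typically consume.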
Intrusion Detection System
This project focuses on the design and deployment of an Intrusion Detection System (IDS) aimed at improving network security. Key features include advanced data analysis, machine learning algorithms for pattern recognition, and a user-friendly interface for monitoring and managing alerts.
British Airways Reviews using Tableau
This dashboard lets users explore British Airways review data through filters such as date, metric type, continent/region, aircraft type, and traveller type. It can help the company improve its services and draw data-driven conclusions to increase sales.
London bike ride Tableau Dashboard
Technologies:
- Python: Data cleaning, merging, and model building using libraries like Pandas and NumPy.
- Tableau: Created an interactive dashboard for data visualization.
Features:
- Data Preprocessing: Handled missing values, converted date formats, and created lag features for time series analysis.
- Time Series Analysis: Analyzed monthly sales trends and applied differencing to ensure stationarity.
- Multiple Model Implementations: Utilized various models for predicting future sales with comprehensive evaluation metrics like MSE, MAE, and R-squared.
- Visualization: Compared actual vs. predicted sales through engaging visualisations.
Data Source:
Data was sourced from three main platforms:
- TfL Cycling Data - Contains OS data © Crown copyright and database rights 2016.
- freemeteo.com - Weather data.
- GOV.UK - Bank holidays data.
The dataset spans from 1st January 2015 to 31st December 2016.
Real time face, age, gender, expression and gesture recognition
Technologies used: Streamlit, OpenCV, DeepFace, MediaPipe
Key Features:
- Face Detection: Detects faces and analyses attributes like age, gender, and emotion.
- Gesture Recognition: Recognises gestures such as "Thumbs up" and "Thumbs down".
- Animations: Triggers fun animations (e.g., balloons for thumbs up, snow for thumbs down).
- Real-Time Feedback: Provides smooth feedback with adjustable detection intervals.
- Webcam Support: Uses a webcam feed for real-time detection.
Planned Enhancements:
- Add more gesture types using MediaPipe.
- Implement face recognition for identifying known individuals.
- Improve gesture animation effects.
- Optimize performance for low-end devices.
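One simple way the thumbs up and thumbs down gestures could be told apart from hand landmarks is sketched below. This is a hypothetical heuristic, not necessarily the rule the project uses: it assumes the 21-point hand layout that MediaPipe Hands reports (index 4 is the thumb tip, indices 5 to 20 cover the other four fingers) with normalised coordinates where y grows downwards. The function name `classify_thumb` and the `margin` parameter are invented for this example.

```python
from typing import List, Tuple

# Landmark indices in the 21-point MediaPipe Hands layout.
THUMB_TIP = 4
OTHER_FINGER_POINTS = list(range(5, 21))  # index, middle, ring, and pinky joints


def classify_thumb(landmarks: List[Tuple[float, float]], margin: float = 0.05) -> str:
    """Classify a hand as 'Thumbs up', 'Thumbs down', or 'Unknown'.

    landmarks: 21 (x, y) pairs in normalised image coordinates (y grows downwards).
    margin: how far the thumb tip must clear the other fingers (hypothetical default).
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    thumb_y = landmarks[THUMB_TIP][1]
    other_ys = [landmarks[i][1] for i in OTHER_FINGER_POINTS]
    # Thumb tip clearly above every other finger point: thumbs up.
    if thumb_y < min(other_ys) - margin:
        return "Thumbs up"
    # Thumb tip clearly below every other finger point: thumbs down.
    if thumb_y > max(other_ys) + margin:
        return "Thumbs down"
    return "Unknown"


# Usage with made-up landmarks: a closed fist with the thumb raised.
fist = [(0.5, 0.5)] * 21
thumb_raised = list(fist)
thumb_raised[THUMB_TIP] = (0.5, 0.2)
print(classify_thumb(thumb_raised))
```

In a Streamlit app, the returned label could then decide whether to call `st.balloons()` or `st.snow()`.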
Music Recommendation System
This web-based application provides personalised music recommendations using advanced techniques like vectorisation and cosine similarity.
Features:
- User-Friendly Interface: Built with Streamlit for an intuitive and interactive experience.
- Personalised Recommendations: Uses song attributes to suggest music tailored to your preferences.
- Real-Time Processing: Delivers instant recommendations based on user input.
How it works:
- Data Collection & Cleaning: Utilizes and cleans a dataset of songs.
- Feature Extraction & Vectorization: Transforms song attributes into numerical vectors.
- Cosine Similarity Calculation: Measures and ranks song similarity to user preferences.
- Real-Time Recommendations: Quickly computes and displays top-ranked songs via Streamlit.
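The vectorisation and cosine similarity steps above can be sketched with NumPy alone. The song names and attribute columns below are invented for illustration, and standardising each column before comparing is an assumption of this sketch rather than a documented step of the project.

```python
import numpy as np

# Hypothetical song attributes: [danceability, energy, acousticness, tempo (BPM)].
songs = ["Song A", "Song B", "Song C", "Song D"]
features = np.array([
    [0.80, 0.90, 0.10, 128.0],
    [0.75, 0.85, 0.15, 124.0],
    [0.20, 0.30, 0.90, 70.0],
    [0.50, 0.60, 0.40, 100.0],
])

# Standardise each column so that tempo does not dominate the similarity.
scaled = (features - features.mean(axis=0)) / features.std(axis=0)


def recommend(seed: str, top_n: int = 2) -> list:
    """Return the top_n songs most similar to `seed` by cosine similarity."""
    idx = songs.index(seed)
    norms = np.linalg.norm(scaled, axis=1)
    # Cosine similarity between the seed vector and every song vector.
    sims = scaled @ scaled[idx] / (norms * norms[idx])
    # Rank from most to least similar, skipping the seed itself.
    order = [i for i in np.argsort(-sims) if i != idx]
    return [songs[i] for i in order[:top_n]]


print(recommend("Song A"))
```

Cosine similarity compares the direction of two attribute vectors rather than their magnitude, which is why scaling the columns first matters when attributes use very different units.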
Discover new music effortlessly with the Music Recommendation System.
Stock Market Price Predictor
This project predicts stock market prices using machine learning and provides an interactive web interface via Streamlit. By leveraging historical stock data, it aims to offer insights into market behaviour and assist users in making informed decisions.
Implementation:
- Data Collection: Uses yfinance to fetch historical stock data based on user-input symbols.
- Data Preprocessing: Splits data into training/testing sets and normalizes values with MinMaxScaler.
- Model Loading: Utilizes a pre-trained Keras model for predictions.
- Visualization: Plots stock prices and moving averages with Matplotlib.
- User Interaction: Streamlit interface allows users to input stock symbols and view predictions.
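The preprocessing steps above might look roughly like the sketch below. To keep it self-contained, synthetic closing prices stand in for data fetched with yfinance, and the 80/20 split, the 50- and 100-day moving averages, and the 60-step input window are assumptions for illustration rather than the project's actual settings.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Synthetic closing prices stand in for historical data fetched with yfinance.
rng = np.random.default_rng(1)
close = pd.Series(100 + np.cumsum(rng.normal(0, 1, 300)), name="Close")

# Moving averages of the kind usually plotted alongside the price.
ma_50 = close.rolling(50).mean()
ma_100 = close.rolling(100).mean()

# Chronological 80/20 split; the scaler is fitted on the training part only.
split = int(len(close) * 0.8)
train, test = close.iloc[:split], close.iloc[split:]
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train.to_frame())
test_scaled = scaler.transform(test.to_frame())


def make_windows(values: np.ndarray, window: int = 60):
    """Turn an (n, 1) array into (samples, window, 1) inputs and next-step targets."""
    x = np.stack([values[i - window:i] for i in range(window, len(values))])
    y = values[window:, 0]
    return x, y


x_train, y_train = make_windows(train_scaled)
print(x_train.shape, y_train.shape)
```

Inputs shaped as `(samples, window, 1)` are what a sequence model in Keras typically expects; predictions made on scaled data can be mapped back to prices with `scaler.inverse_transform`.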
Pneumonia detection using deep learning (TensorFlow, Keras, Matplotlib)
This project uses deep learning to detect pneumonia from chest X-ray images. The model is trained on a large dataset of labelled X-ray scans to identify patterns and anomalies indicative of pneumonia. The automated detection system aims to assist medical professionals by providing a fast and efficient diagnostic aid, with the goal of improving patient care and clinical outcomes. The project showcases expertise in medical image analysis, deep learning model development, and the application of AI in healthcare.
Data design and Database management of a retail shop
This project focuses on designing and managing a database for a retail shop. It includes the creation of an Entity-Relationship (ER) diagram to map out the data structure, the implementation of tables based on the diagram, and the insertion of sample data. Additionally, six queries were developed to facilitate key shop operations and data retrieval, enhancing overall efficiency and decision-making.
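As an illustration of turning an ER diagram into tables and queries, the sketch below uses Python's built-in `sqlite3` module. The three entities, their columns, the sample rows, and the revenue query are all hypothetical; they are not the project's actual schema or any of its six queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the relationships below
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price REAL NOT NULL CHECK (unit_price >= 0)
);
-- Each sale links one customer to one product (two one-to-many relationships).
CREATE TABLE sale (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    quantity    INTEGER NOT NULL CHECK (quantity > 0)
);
INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO product VALUES (1, 'Notebook', 2.5), (2, 'Pen', 1.0);
INSERT INTO sale VALUES (1, 1, 1, 4), (2, 1, 2, 10), (3, 2, 2, 3);
""")

# Example operational query: revenue per customer, highest first.
revenue = conn.execute("""
    SELECT c.name, SUM(s.quantity * p.unit_price) AS revenue
    FROM sale s
    JOIN customer c ON c.customer_id = s.customer_id
    JOIN product  p ON p.product_id  = s.product_id
    GROUP BY c.customer_id
    ORDER BY revenue DESC
""").fetchall()
print(revenue)
```

Declaring the foreign keys and `CHECK` constraints in the table definitions lets the database reject inconsistent sample data at insert time instead of leaving that to application code.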
Cleaning small dataset using Pandas
Mastering data cleaning is key for accurate analysis and insights. In this project, I used Pandas to clean small datasets efficiently. From handling missing values to removing duplicates, Pandas offers robust solutions for data wrangling. Stay tuned for more tips on how to streamline your data preprocessing tasks.
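A minimal sketch of the two cleaning steps mentioned above, handling missing values and removing duplicates, could look like this. The dataset is made up, and the specific choices (median for missing ages, "Unknown" for missing cities) are assumptions for the example.

```python
import numpy as np
import pandas as pd

# A small, deliberately messy dataset (values are made up for illustration).
raw = pd.DataFrame({
    "name": [" Alice", "Bob", "Bob", "carol ", None],
    "age": [30, np.nan, np.nan, 41, 25],
    "city": ["London", "Leeds", "Leeds", None, "York"],
})

clean = (
    raw.dropna(subset=["name"])  # drop rows that have no name at all
       .assign(name=lambda d: d["name"].str.strip().str.title())  # tidy text
       .drop_duplicates()  # remove exact duplicate rows
       .assign(
           age=lambda d: d["age"].fillna(d["age"].median()),  # fill missing ages
           city=lambda d: d["city"].fillna("Unknown"),        # flag missing cities
       )
       .reset_index(drop=True)
)
print(clean)
```

Normalising the text before calling `drop_duplicates` matters: rows that differ only by stray whitespace or capitalisation would otherwise survive as separate records.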
Car Sales Dashboard using Tableau
The car sales dashboard in Tableau provides a comprehensive, visual way to monitor and analyze car sales performance, enabling stakeholders to make data-driven decisions that optimize sales strategies and maximize profitability.