About
I am a Computer Science undergraduate at VIT, Vellore. I enjoy building algorithmic solutions to real-world problems. As hobbies, I tweet, build projects and write a technical blog. I am also a passionate public speaker (Toastmaster).
Education
Bachelor of Technology, Computer Science and Engineering
Vellore Institute of Technology, 2018 - 2022. CGPA (through 5th semester): 8.54
Skills
Technical Skills
Languages
- Java (Data Structures and Algorithms)
- Python (Data Science)
Course Work
- Web Mining
- Data Visualisation
- Machine Learning
- Natural Language Processing
- Image Processing
Other skills
- Public speaking
Personal Projects
Project 1 : COVID Query using NLP
Automatic question-answering system that answers COVID-19 queries from reliable sources using Natural Language Processing.
Exposure: BeautifulSoup, NumPy, Pandas, TextBlob, Wordcloud, Bag of Words, Word2Vec, GloVe, TF-IDF, BERT.
Project Overview
- Data was collected from the FAQ sections of various government websites using a custom-built Python scraper. The dataset has two columns: questions and answers.
- Word clouds of both columns were plotted, and the polarity and subjectivity of both columns were checked using TextBlob.
- All rows were pre-processed and tokenized.
- Phrase embeddings of each row were computed using five different NLP models: Bag of Words, Word2Vec, GloVe, TF-IDF and BERT.
- The cosine similarity between the query embedding and each question embedding in the dataset was computed to retrieve the closest question and its answer.
- The accuracy of the answers retrieved by the five models was compared across various test cases to find which model gave the best answer.
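A minimal sketch of the retrieval step, using only the Bag of Words representation and the Python standard library. The FAQ rows, the function names (`tokenize`, `bag_of_words`, `cosine_similarity`, `answer`) and the tokenization rule are illustrative assumptions, not the project's actual code or data.

```python
import math
import re
from collections import Counter

# Hypothetical FAQ rows; the real dataset was scraped from government websites.
FAQ = [
    ("What are the symptoms of COVID-19?",
     "Common symptoms include fever, cough and tiredness."),
    ("How does COVID-19 spread?",
     "It spreads mainly through respiratory droplets."),
    ("Should I wear a mask in public?",
     "Masks are recommended in crowded or indoor public places."),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(text):
    """Represent a text as a token -> count mapping."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def answer(query, faq=FAQ):
    """Return the (question, answer) row whose question is closest to the query."""
    query_vec = bag_of_words(query)
    return max(faq, key=lambda row: cosine_similarity(query_vec, bag_of_words(row[0])))


if __name__ == "__main__":
    question, reply = answer("How does the virus spread?")
    print(question)  # How does COVID-19 spread?
    print(reply)     # It spreads mainly through respiratory droplets.
```

Swapping `bag_of_words` for a TF-IDF, Word2Vec, GloVe or BERT encoder would leave the retrieval loop unchanged, which is what makes a side-by-side comparison of the five representations straightforward.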
Project 2 : Netflix Wrapped (Ongoing)
Streamlit app that analyses your Netflix viewing activity, inspired by Spotify Wrapped.
Project Overview
- My viewing history was downloaded after requesting a download of my personal information from my account. The request was processed within 10 hours.
- Data was pre-processed and analysed using Seaborn and Matplotlib.
Project 3 : VITian LinkedIn data Analysis
Pre-processing, classification and geo-spatial clustering of VITian LinkedIn profile data.
Exposure: NumPy, Pandas, scikit-learn, Matplotlib, Classifiers, BERT, GeoPandas, Silhouette score, Wordcloud.
Project Overview
- 5000 profiles of VITians were scraped from LinkedIn and labelled into 7 different domains.
- Columns were pre-processed using various techniques to reduce null values and handle inconsistencies.
- EDA was done using Tree Maps and Sunburst Graphs to understand the hierarchy in the dataset.
- Profiles were classified into the different domains using the profile summary and profile headline.
- BERT encodings of all the columns were computed, and PCA was applied during feature selection.
- Geo-spatial clustering of the data was done using DBSCAN and plotted on a world map using the location in the profiles.
- The silhouette score was computed, and Wordcloud analysis of the profiles in each cluster was done to understand more about the people in each cluster.
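A minimal sketch of the geo-spatial clustering idea: DBSCAN over (latitude, longitude) pairs with great-circle (haversine) distance, written with the standard library only. The project itself most likely relied on a library implementation; the coordinates, the `eps_km` and `min_samples` values, and the function names here are assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def dbscan(points, eps_km, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    noise = -1
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps_km of point i (including i itself).
        return [j for j in range(len(points))
                if haversine_km(points[i], points[j]) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = noise  # may later be re-labelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point, keep expanding
    return labels


if __name__ == "__main__":
    # Hypothetical profile locations: Vellore, Chennai, Bengaluru,
    # San Francisco, San Jose, London.
    locations = [
        (12.9165, 79.1325), (13.0827, 80.2707), (12.9716, 77.5946),
        (37.7749, -122.4194), (37.3382, -121.8863), (51.5074, -0.1278),
    ]
    print(dbscan(locations, eps_km=250, min_samples=2))  # [0, 0, 0, 1, 1, -1]
```

Density-based clustering suits location data because the number of clusters does not have to be fixed in advance and isolated profiles are reported as noise instead of being forced into a cluster.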