Raksha Kannusami

CSE undergraduate learning DSA and trying my hand at data science.


About

I am a Computer Science undergraduate at VIT, Vellore. I enjoy building algorithmic solutions to real-world problems. As a hobby, I tweet, build projects and write a technical blog. I am also a passionate public speaker (Toastmaster).


Education

Bachelor of Technology, Computer Science and Engineering

Vellore Institute of Technology, 2018 - 2022. CGPA (up to the 5th semester): 8.54


Skills

Technical Skills
Languages
Course Work
Other skills

Personal Projects

Project 1 : COVID Query using NLP

Automatic question-answering system that answers COVID-19 queries from reliable sources using Natural Language Processing.

Exposure: BeautifulSoup, NumPy, pandas, TextBlob, WordCloud, Bag of Words, Word2Vec, GloVe, TF-IDF, BERT.
Project Overview
- Data was collected from the FAQ sections of various government websites using a custom-built Python scraper. The dataset has two columns: questions and answers.
- Word clouds of both columns were plotted, and the polarity and subjectivity of both columns were checked using TextBlob.
- All rows were pre-processed and tokenized.
- Phrase embeddings of each row were computed using five different NLP models: Bag of Words, Word2Vec, GloVe, TF-IDF and BERT.
- The cosine similarity between the query embedding and each question embedding in the dataset was computed to retrieve the closest question and its answer.
- The accuracy of the answers retrieved by the five models was compared across various test cases to find which model gave the best answer.
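The retrieval step can be illustrated with a minimal sketch. It is not the project's code: it assumes the simplest of the five models (Bag of Words) and uses only the standard library. The function names and the three-row `faq` list are hypothetical, and the answers are placeholders rather than scraped data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(text):
    """Represent a text as a sparse term-count vector."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def retrieve(query, faq):
    """Return the (question, answer, score) triple closest to the query."""
    query_vec = bag_of_words(query)
    scored = [
        (question, answer, cosine_similarity(query_vec, bag_of_words(question)))
        for question, answer in faq
    ]
    return max(scored, key=lambda item: item[2])


# Hypothetical two-column dataset (placeholders, not the scraped data).
faq = [
    ("What are the symptoms of COVID-19?", "placeholder answer A"),
    ("How does COVID-19 spread?", "placeholder answer B"),
    ("Where can I get tested?", "placeholder answer C"),
]

question, answer, score = retrieve("what symptoms does covid-19 have", faq)
print(question, "->", answer, f"(similarity {score:.3f})")
```

Swapping `bag_of_words` for a different embedding function (TF-IDF weights, averaged word vectors, or a BERT sentence vector) would leave the retrieval logic unchanged, which is what makes a like-for-like comparison of the models possible.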


Project 2 : Netflix Wrapped (Ongoing)

Streamlit app that analyses your Netflix viewing activity, inspired by Spotify Wrapped.

Project Overview
- My viewing history was obtained by requesting a download of my personal information from my account. The request was processed within 10 hours.
- The data was pre-processed and analysed using Seaborn and Matplotlib.


Project 3 : VITian LinkedIn data Analysis

Pre-processing, classification and geo-spatial clustering of VITian LinkedIn profile data.

Exposure: NumPy, pandas, scikit-learn, Matplotlib, Classifiers, BERT, GeoPandas, Silhouette score, WordCloud.
Project Overview
- 5000 VITian profiles were scraped from LinkedIn and labelled into 7 different domains.
- Columns were pre-processed using various techniques to reduce null values and handle inconsistencies.
- EDA was done using tree maps and sunburst graphs to understand the hierarchy in the dataset.
- Profiles were classified into different domains using the profile summary and profile headline.
- BERT encodings were computed for all the columns, and PCA was applied during feature selection.
- Geo-spatial clustering was done using DBSCAN and plotted on a world map using the location in each profile.
- The silhouette score was computed, and word cloud analysis of the profiles in each cluster was done to understand more about the people in each cluster.
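The geo-spatial clustering step can be sketched as follows. This is an illustrative, standard-library implementation of DBSCAN and not the project's code, which may well have used a library implementation. The haversine distance metric, the `eps_km` and `min_samples` values, and the handful of approximate city coordinates are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def region_query(points, i, eps_km):
    """Indices of all points within eps_km of point i (including i itself)."""
    return [j for j, q in enumerate(points) if haversine_km(points[i], q) <= eps_km]


def dbscan(points, eps_km, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps_km)
        if len(neighbours) < min_samples:  # min_samples counts the point itself
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps_km)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Approximate (lat, lon) pairs, used only as hypothetical profile locations.
locations = [
    (12.92, 79.13),    # Vellore
    (13.08, 80.27),    # Chennai
    (12.97, 77.59),    # Bengaluru
    (51.51, -0.13),    # London
    (52.21, 0.12),     # Cambridge
    (51.75, -1.26),    # Oxford
    (-33.87, 151.21),  # Sydney
]

labels = dbscan(locations, eps_km=250, min_samples=2)
print(labels)
```

With these assumed parameters the three Indian cities and the three English cities each form a cluster, while the isolated point is labelled as noise (`-1`). A density-based method suits this kind of data because the number of clusters does not have to be chosen in advance.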

Get in touch

LinkedIn

Twitter

Blog