My Projects

PUBLICATIONS

Crystallizing Semantics: Mapping the Journey of Word Meaning in Language Models

This paper explores the temporal formation of semantic structure in language models using longitudinal evaluation across multiple similarity benchmarks and correlation metrics.
Springer LNCS

Temporal and Regulatory Reasoning in Parking Sign Interpretation using Machine Learning and LoRA-Tuned LLaMA 3

This paper proposes a hybrid pipeline combining classical ML and fine-tuned LLaMA-3 for interpreting urban parking regulations, achieving high accuracy on temporally complex rule classification.
Springer ISEM

Cryptocurrency Sentiment Analysis using Bidirectional Transformation

This paper predicts the sentiment of cryptocurrency news articles using a fine-tuned BERT model on 29,076 samples, reaching 89% accuracy.
IEEE Xplore

AI ENGINEERING & DATA SCIENCE

Enterprise Agentic RAG platform

Enterprise Agentic RAG platform built with FastAPI, LangGraph, OpenAI, PostgreSQL, Qdrant Cloud, Railway, and Streamlit. Features multi-agent orchestration, hybrid retrieval (Vector Search + BM25), CrossEncoder reranking, conversational memory, tool-calling, web-search augmentation, observability, and cloud deployment.
Technologies: LangGraph, LangSmith, Qdrant, PostgreSQL, Railway, and Streamlit
https://github.com/digit987/enterprise-agentic-rag-platform
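
The hybrid retrieval step (Vector Search + BM25) needs some way to merge two ranked result lists. A minimal sketch of one common option, reciprocal rank fusion, is shown here. This is an assumption for illustration: the repository may combine the lists differently, and the function name, document ids, and the constant k=60 are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for the same query.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

In a pipeline like the one described, the fused list would then go to the CrossEncoder reranker, which rescores only the top few candidates.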

Realtime Voice AI Support Platform

Realtime Voice AI platform featuring streaming speech-to-text, multi-agent conversational orchestration, Redis Streams event processing, WebSockets, OpenAI Whisper/TTS, Prometheus-Grafana observability, session-aware memory management, and cloud deployment.
Technologies: Redis, WebSockets, OpenAI Whisper/TTS, Prometheus, and Grafana
https://github.com/digit987/realtime-voice-ai-platform

Fine Tuning LLM

Fine-tuned Llama with QLoRA to generate Python code, legal document summaries, and crossword solutions.
Technologies: Transformers
https://github.com/digit987/finetuning_llama

Healthcare RAG LangChain

Used the Weaviate vector database and LangChain to augment prompts with research paper context and generate responses to healthcare queries.
Technologies: LangChain, Weaviate
https://github.com/digit987/healthcare_rag_langchain

Oral Disease Classification

Developed a deep learning model using Transfer Learning (EfficientNetB0) to classify oral diseases (Caries vs. Gingivitis) with a validation accuracy of 92.65% and test accuracy of 91.42%. Implemented dynamic learning rate adjustment, fine-tuning, and data preprocessing techniques to enhance performance and generalization.
https://github.com/digit987/oral_disease_classification

BERT Comment Classifier

Built a BERT-based toxic comment classifier on ~65,000 samples with 91% accuracy.
https://github.com/digit987/bert_comment_classifier_kaggle

Resume Parser

Built a resume parser using LangChain, hosted on Streamlit.
Technologies: LangChain, Streamlit
https://resume-parsing-langchain.streamlit.app https://github.com/digit987/resume_parser_langchain_streamlit

Web Traffic Analysis

Analyzed web traffic data using Python (Pandas, SciPy) to derive insights on pageview events, geographical traffic sources, and click-through rates (CTR). Performed statistical analysis to evaluate CTR variations and identify correlations between clicks and pageviews (engagement metrics).
https://github.com/digit987/web_traffic_analysis
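
The two quantities named above, CTR and the click/pageview correlation, can be sketched in a few lines. This is a standard-library illustration only: the project itself used Pandas and SciPy, and the function names and the daily figures here are hypothetical.

```python
from math import sqrt


def click_through_rate(clicks, pageviews):
    """CTR = clicks / pageviews, defined as 0.0 when there are no pageviews."""
    return clicks / pageviews if pageviews else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily totals, not taken from the project's dataset.
daily_pageviews = [120, 150, 90, 200]
daily_clicks = [12, 18, 7, 26]

daily_ctr = [click_through_rate(c, p) for c, p in zip(daily_clicks, daily_pageviews)]
print(daily_ctr)
print(pearson(daily_clicks, daily_pageviews))
```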

Text Speech Deepgram Elevenlabs

Takes a video URL as input and extracts the transcript using Deepgram, then converts the transcript back to speech using ElevenLabs. Deployed using Streamlit. Live at: https://textspeechdeepgramelevenlabsapp.streamlit.app/
https://github.com/digit987/text_speech_deepgram_elevenlabs_streamlit

GPT Lyrics Augmentation

Implemented text augmentation using GPT-3.5 Turbo to generate synthetic lyrics for 10 genre classes.
Technologies: GPT-3.5 Turbo
https://github.com/digit987/gpt_lyrics_augmentation

GPT News Summary

Used the GPT-3.5 Turbo model to generate news summaries for the BBC news dataset and compared them with reference summaries using ROUGE scores.
Technologies: GPT-3.5 Turbo, evaluate
https://github.com/digit987/gpt_news_summary
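
To make the comparison metric concrete, here is a minimal sketch of ROUGE-1 as clipped unigram overlap between a generated and a reference summary. It is a simplified stand-in, assuming lowercase whitespace tokenization with no stemming; the project computed its scores with the evaluate library, whose results may differ. The function name and example sentences are hypothetical.

```python
from collections import Counter


def rouge1(candidate, reference):
    """Return ROUGE-1 precision, recall and F1 from clipped unigram overlap."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical generated and reference summaries.
scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```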

GPT Speech Conversation

Used GPT-4 for general question answering from audio prompts. OpenAI's Whisper-1 model converted the speech to text, the text was fed as a prompt to GPT-4, and the text responses were converted to audio using the pyttsx3 library.
Technologies: GPT-4, Whisper-1, pyttsx3
https://github.com/digit987/gpt_speech_conversation

GPT Powered JavaScript App for CS Question Answering

A JavaScript app powered by GPT to answer CS questions. Helpful for GATE exam preparation. Live at: https://gatecsdoubts.netlify.app
Technologies: GPT-3.5 Turbo, JavaScript
https://github.com/digit987/gatecsdoubts_gpt_app

Dummy Demographic Data Generation

Generated dummy demographic data for fields such as Aadhaar No., Mobile No., House No., Pincode, Employer Details, and Working Conditions.
Technologies: pandas
https://github.com/digit987/dummy_demographic_data_generation

DATA ENGINEERING

Data Ingestion using ElasticSearch and Node.js

Fetched earthquake data using Node.js API. Stored and ingested it using ElasticSearch pipeline, processors and indexing.
Technologies: ElasticSearch, Node.js, Express.js
https://github.com/digit987/earthquake_nodejs_elastic

Stock Scraper

Scraped a stock website every 60 seconds and stored the data in a pandas DataFrame.
Technologies: BeautifulSoup4, pandas, numpy
https://github.com/digit987/stocks_scraper_analysis

Music Mood Prediction from Song Lyrics

Scraped Indian song lyrics, used LDA to extract topics from them to use as input features and built supervised and unsupervised models to classify the song mood.
Technologies: BeautifulSoup4, scikit-learn, pandas, matplotlib
https://github.com/digit987/music_mood_prediction

Google News Scraper

Scraped Google News articles using BeautifulSoup, pandas, and Newspaper3k.
Technologies: BeautifulSoup4, Newspaper3k, pandas
https://github.com/digit987/google_news_scraper

EDGAR Financial Report Analysis

Scraped SEC EDGAR financial reports and performed text analysis to calculate various metrics for various sections.
Technologies: BeautifulSoup4, pandas, nltk
https://github.com/digit987/edgar_financial_reports_scraping_text_analysis

Blog Scraping and Text Analysis

Scraped a list of blogs, pre-processed and analysed them, calculated metrics, and stored the results as CSV.
Technologies: Python, BeautifulSoup4, openpyxl, pandas, nltk
https://github.com/digit987/blog_scraping_text_analysis

Google Maps Scraping using Playwright and Selenium

Scraped details of listings on Google Maps using Playwright (JavaScript) and Selenium (Python).
Technologies: JavaScript, Playwright, Python, Selenium
https://github.com/digit987/google_map_listings_scraper

Job Listings Scraper

Scraped job listings from remote.co using Selenium and Python.
Technologies: Selenium
https://github.com/digit987/job_listings_scraper

BigQuery PySpark ETL

Retrieved the Hacker News dataset from BigQuery and ran data quality checks and analysis using PySpark and SQL.
Technologies: PySpark, Seaborn
https://github.com/digit987/hackernews_bigquery

Stock Dashboard using Power BI

Reliance Share Dashboard using Power BI.
Technologies: Power BI
https://github.com/digit987/stock_dashboard_powerbi

BACK-END DEVELOPMENT

Stock Technical Analysis using FastAPI

Built an async, scalable, and modular FastAPI and PostgreSQL app for technical analysis of Reliance stock.
Technologies: FastAPI, PostgreSQL
https://github.com/digit987/stock_analysis_fastapi_postgres

Music REST API

Used Node.js and Mongoose to build a Music REST API with OAuth, User Favourites, Artists, Albums, Songs and Playlists. Consumed the API using Axios client.
Technologies: Node.js, Express.js, Mongoose (MongoDB), axios, bcrypt
https://github.com/digit987/node_music_api

PICSTA REST API

Used Django and Djongo (for DB schema) to build a Photo Sharing REST API with Posts, Comments, Tags, Followers and Following.
Technologies: Django, Djongo (MongoDB)
https://github.com/digit987/django_picsta_api

Online Shopping System

Built an online shopping portal using PHP, MySQL, and HTML with cart management, ratings, and shop-by-category browsing.
Technologies: PHP, JavaScript, HTML
https://github.com/digit987/online_shopping_system

CSV Data Extraction and Email Automation using Django

Extracted data from CSV files and automated email sending using Django.
Technologies: Django
https://github.com/digit987/extract_csv_send_email_django

URL Shortener using FastAPI with caching

URL Shortener using FastAPI with caching.
Technologies: FastAPI
https://github.com/digit987/url_shortener_fastapi
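
The core idea of a cached URL shortener can be sketched without the web framework. This is a minimal illustration under stated assumptions: the repository's actual code scheme and cache backend are not described here, so the base62 encoding, the in-memory UrlStore class, and the use of functools.lru_cache are all hypothetical stand-ins.

```python
import string
from functools import lru_cache

# 62 characters: digits, then lowercase, then uppercase letters.
ALPHABET = string.digits + string.ascii_letters


def encode_base62(number):
    """Encode a non-negative integer id as a short base62 string."""
    if number < 0:
        raise ValueError("id must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


class UrlStore:
    """In-memory stand-in for the database table of shortened URLs."""

    def __init__(self):
        self._by_code = {}
        self._next_id = 1

    def shorten(self, url):
        code = encode_base62(self._next_id)
        self._next_id += 1
        self._by_code[code] = url
        return code

    def lookup(self, code):
        return self._by_code[code]


store = UrlStore()


@lru_cache(maxsize=1024)
def resolve(code):
    """Resolve a short code; repeated lookups are served from the cache."""
    return store.lookup(code)


short_code = store.shorten("https://example.com/some/long/path")
print(short_code, resolve(short_code))
```

An in-process cache like this only suits a single worker and never expires entries, so a deployed service would more likely use a shared cache with a time-to-live.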

Fashion Recommendation System

Fashion products recommendation app in Django & Cython.
Technologies: Django, Cython, HTML
https://github.com/digit987/fashion_recommendation_system

Credit Approval System

Credit Approval System with initial dataset population.
Technologies: Django, HTML
https://github.com/digit987/credit_approval_system

Flight Booking System

Django App for flight booking by customers.
Technologies: Django, HTML
https://github.com/digit987/flight_booking_system

FRONT-END DEVELOPMENT

Codeforces Web App

Built a React.js SPA to consume Codeforces API and expose its features like Contests, Problems, Blogs, and Comments.
Technologies: ReactJS, JavaScript, CSS, HTML
https://github.com/digit987/codeforces_api_react

Educational Website Homepage

Built a home page for an educational website.
Technologies: JavaScript, CSS, HTML
https://github.com/digit987/educational_website

DEPLOYMENT

AWS Glue with PySpark

Loaded CSV data from an S3 bucket into the Glue Data Catalog and processed it using PySpark.

AWS Glue ETL Job

Implemented a Glue ETL job with an AWS S3 bucket.

AWS Athena SQL

Fetched data from S3 using a crawler and ran SQL queries using Athena.

Dockerised a Node.js App

https://github.com/digit987/node_docker

PROBLEM SOLVING

Data Structures

Solved data structure problems.
Technologies: Python
https://github.com/digit987/data_structures

Dynamic Programming

Solved problems related to the application of dynamic programming.
Technologies: Python, C
https://github.com/digit987/dynamic_programming

Graphs and Trees

Solved problems related to the application of graphs and trees.
Technologies: Python
https://github.com/digit987/graphs_and_trees