Hey! I am

Chelsi Jain


About

About Me

As a Master's student in Computer Science at Oregon State University with a 3.9 GPA, I have a robust background in Machine Learning, Data Science, MLOps, and Full-Stack Web Development. My expertise is backed by internships at TenderFix in Germany and research projects at IIT Madras and the University of Utah. Proficient in Python, AWS, TensorFlow, and PyTorch, I have led teams to deploy scalable AI models and fine-tuned state-of-the-art LLMs. Additionally, I possess strong skills in full-stack web development, enabling me to seamlessly integrate and deploy AI models into web applications. I am actively seeking a 2024 internship or full-time opportunity in Data Science, MLOps, or Full-Stack Web Development. Driven by curiosity, diligence, and a passion for solving real-world problems, I am eager to apply my skills to contribute to technological innovation and advancements.

Education

2023-2025

Master of Science in Computer Science

Oregon State University

Pursuing a Master of Science in Computer Science with a GPA of 3.9/4.0, I specialize in Machine Learning, Data Science, and MLOps. My coursework includes advanced studies in algorithms, deep learning, and natural language processing, preparing me for cutting-edge technological innovation and research.

2019-2023

Bachelor of Science in Computer Science

College of Technology and Engineering, MPUAT

Graduated with a Bachelor of Science in Computer Science, achieving a GPA of 8.6/10.0. My education here provided a strong foundation in data structures, machine learning, and data science, and included hands-on projects in full-stack development and AI model deployment.

Experience

Feb 2023 – Aug 2023

SDE Internship, ML Engineer

TenderFix, Germany
  • Led text extraction from images and PDFs (tenders) using Amazon Textract and Tesseract, and created a benchmark dataset of 35k PDFs using rule-based methods
  • Developed and deployed dual-layered Deep Learning models on AWS EC2 instances, employing DNN for text classification and CNN for robust image (PDF) processing, enhancing scalability
  • Fine-tuned Layout Parser and state-of-the-art LLMs (FlanT5, Llama, GPT-3.5, GPT Neo, GPT-J) for PDF annotation and text classification, achieving 89% accuracy on the text classification task using Layout Parser
  • Employed sentence transformers (BERT, SBERT) to calculate the text similarity between PDFs and product catalogs, and evaluated the outcomes after translation using mBART and Google Translate
  • Skills: Amazon Textract, Tesseract, AWS EC2, Deep Learning, DNN, CNN, Layout Parser, LLMs (FlanT5, Llama, GPT-3.5, GPT Neo, GPT-J), BERT, SBERT
Aug 2022 – Apr 2023

Research Internship

SuperBloom Studios, RBCDSAI IIT Madras
  • Played a significant role in Hidden Voices, an open-source initiative to automate adding women's biography drafts to Wikipedia, and co-authored a research paper submitted to Wiki Workshop 2023
  • Implemented a pipeline that scrapes content from the top 5 web pages for each individual and extracts summaries by fine-tuning GPT-3.5 and GPT Neo
  • Experimented with NLP techniques, including extractive and abstractive summarization and triplet extraction, to structure unstructured text into a table format
  • Skills: NLP, GPT-3.5, GPT Neo, Extractive Summarization, Abstractive Summarization, Triplet Extraction, Web Scraping
Nov 2021 – Jan 2023

Undergraduate Researcher

University of Utah
  • Co-authored a research manuscript accepted to ACL Findings 2023 that showed significant improvements in information synchronization across multilingual semi-structured tables (co-author: Siddharth Khincha)
  • Devised a two-step approach, comprising the Information Alignment algorithm and the rule-based Information Update system, achieving an F1 score of 87.91 for English to non-English alignment
  • Created the INFOSYNC dataset with 100K entity-centric tables across 14 languages, including a subset of 3.5K annotated tables, to train and evaluate the alignment and update processes
  • Demonstrated strong accuracy in information alignment, surpassing an F1 score of 50 for all language pairs and gaining 16 points with strict bidirectional mapping constraints
  • Conducted error analysis using metadata and live updates, resulting in a remarkable 77.28% acceptance rate for human-assisted Wikipedia edits on Infoboxes, showcasing the efficacy of the methodology
  • Skills: Information Alignment, Data Synchronization, NLP, Multilingual Data Processing, Dataset Creation, Rule-Based Systems, Error Analysis

Skills

Data Structure and Algorithms

HTML, CSS, JavaScript

Python

SQL

AWS

Docker

MLflow, GitHub Actions

Web Development

Flask

React

Angular

NodeJS

MongoDB

Google Firebase

Extracurricular

Machine Learning Lead

Google Developer Student Club

Led the Machine Learning group at the in-house Google Developer Student Club in Udaipur, India; mentored over 30 students, designed roadmaps, and curated resources for Machine Learning and AI enthusiasts.

Technical Executive

Entrepreneurship Cell, CTAE

Orchestrated entrepreneurship programs as Technical Executive of the E-Cell, achieving 5th place in the finals of the National Entrepreneurship Challenge 2020, organized by E-Cell, IIT Bombay.

Coordinator

Robotics Club, CTAE

Served as the coordinator of the Robotics Club, driving innovation and collaboration among members. Mentored 100+ students in the club.

Projects


Check out more amazing projects on my GitHub

Virtual Sync

May 2023

Developed an AI-powered smart classroom application with functionalities including Air Canvas for real-time screen annotation using computer vision techniques. Implemented color detection and tracking methods for enhanced accuracy and efficiency. Integrated voice recognition for downloading captions and saving notes, along with a tracker to monitor student engagement during live lectures.

OpenCV ReactJS NodeJS WebSpeech API CanvasJS MongoDB

Classroom-X

Sept 2021

Created an e-classroom platform to address internet access gaps by regenerating screens with minimal data streaming. Delivered a client-side AI model that monitors student engagement via webcam feeds and alerts teachers in real time. Ran at 8 KB/s, consuming only 28.8 MB of data per hour, far below Google Meet's requirements, improving the remote learning experience.

Flask Socket.io HTML/CSS/JS jQuery Bootstrap TensorFlow.js MongoDB Heroku

I'm available for freelancing

I am also available for freelancing, offering expertise in Machine Learning, Data Science, MLOps, and Full-Stack Web Development to help solve complex problems and drive technological advancements.

Hire me

Contact

Contact Me

If you want to know more about my experiences and journey, or just talk in general, get in touch! ✌️