About Me

Welcome to my professional journey! I'm Sancharika Debnath, a dynamic Data Scientist with a passion for leveraging cutting-edge technology to craft innovative solutions, backed by a solid foundation in data science and backend development.
A Bachelor of Technology in Information Technology from Kalinga Institute of Industrial Technology has given me a strong academic background to complement my practical skills. I'm eager to connect with professionals and organizations committed to driving meaningful change.
Bringing value that resonates: WHY I stand out!!
- Dynamic Data Scientist with a passion for leveraging cutting-edge technology to craft innovative solutions.
- Skilled in translating complex data into actionable insights and robust systems.
- Led the development of transformative event systems and custom public APIs, including a facial-recognition-based attendance pipeline that reached 87.4% accuracy.
- Experienced in backend development using Python, Django, and AWS to optimize website performance and enhance user experiences.
- Pioneered a regression-based invoice prediction system and conducted research on Hyperspectral Image Compression and Classification, achieving 99.8% accuracy.
Skills
- Python
- R
- JavaScript
- Java
- HTML
- C++
- SQL / query languages
- Machine Learning
- Natural Language Processing (NLP)
- Deep Learning
- Computer Vision
- Generative AI
- Data Analysis
- Autoencoder
- Google Cloud Platform
- Amazon Web Services
- Microsoft Azure
- Neo4j Graph Database
- Tableau
- GitHub
- Ollama
- Keras
- TensorFlow
- PyTorch
- LangChain
- Llama
- Django
- PySpark
- Azure ML
- MLFlow
- Databricks
- Pandas
- NumPy
- Scikit-Learn
- OpenCV
- spaCy
- Transformers
- Hugging Face
- Transfer Learning
- Pinecone
- Cassandra
Experience
Defence Research and Development Organisation (DRDO)
January 2025 - August 2025
Graduate Apprentice (CSE)
- Secure Offline AI Deployment: Deployed and configured Large Language Models (LLMs) in a fully offline, air-gapped environment for national security applications. Ensured complete compliance with DRDO's defense protocols and stringent data privacy regulations.
- Manual Application-Level Setup: Installed, configured, and optimized LLMs manually at the application-folder level, bypassing cloud services and automated pipelines. Managed installation paths, dependencies, and environment configurations to ensure seamless offline operation.
- Automated Tender Document Generation: Built a pipeline to automatically generate tender evaluation documents in LaTeX using locally hosted AI models from Ollama. Integrated OCR tools (PyTesseract, EasyOCR, pdfplumber) to extract text from PDFs and scanned copies, processed data through LangChain, and generated well-formatted LaTeX outputs (a minimal sketch of this pipeline follows this list).
- Optimized AI for Restricted Hardware: Tuned memory allocation, inference parameters, and resource usage to run Llama 3.2 and Mistral 7B models efficiently within limited hardware constraints, without compromising accuracy or output quality.
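The tender-generation workflow above can be summarized by a minimal sketch, assuming pdfplumber for text extraction and LangChain's Ollama wrapper for a locally hosted model; the file name, prompt, and model tag are illustrative assumptions, not the actual DRDO configuration.

```python
# Minimal sketch (assumed file name, prompt, and model tag), not the production pipeline.
import pdfplumber
from langchain_community.llms import Ollama  # talks to a locally hosted Ollama model, fully offline

def extract_text(pdf_path: str) -> str:
    """Pull raw text out of a digitally generated tender PDF."""
    with pdfplumber.open(pdf_path) as pdf:
        return "\n".join(page.extract_text() or "" for page in pdf.pages)

def draft_latex(raw_text: str, model: str = "llama3.2") -> str:
    """Ask the local model to reformat extracted tender details as LaTeX."""
    llm = Ollama(model=model)
    prompt = (
        "Convert the following tender details into a well-formed LaTeX section "
        "with a table of evaluation criteria:\n\n" + raw_text
    )
    return llm.invoke(prompt)

if __name__ == "__main__":
    text = extract_text("tender.pdf")  # hypothetical input file
    print(draft_latex(text))
```

Scanned pages with no extractable text layer would pass through PyTesseract or EasyOCR first; that branch is omitted here for brevity.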
Leapon
February 2024 - Present
Product Developer
- Vision-Driven Product Development: Started with the vision of enhancing the end-customer experience in the service industry. Our insights showed that empowering professionals is the cornerstone of that mission, which led to a productivity tool that helps professionals build and nurture strong business relationships.
- System Architecture & Optimization: Designed and implemented a robust, modular system architecture that can scale effortlessly with growing user demand. Utilized advanced caching techniques and database optimization to enhance application performance and reduce latency.
- User-Centric Development: Continuously gathered feedback through user testing and implemented iterative improvements to enhance usability. Prioritized the user experience by integrating intuitive design principles into the platform's functionality.
Giggr Technologies
June 2023 - January 2024
AI ML Engineer
- Event System: Orchestrated the development of a dynamic event management system incorporating cutting-edge technologies like facial recognition for attendance tracking, real-time object identification via camera data, and GPS integration. Achieved 87.4% accuracy leveraging OpenCV, YOLO, DeepFace, a geolocation API, and JavaScript (a sketch of the attendance-matching step appears after this list).
- Neo4j Architecture Development: Collaborated with a cross-functional team to architect and implement a robust Neo4j database structure, facilitating efficient data mining operations. Leveraged Neo4j's graph database capabilities alongside REST API integration and GitHub Actions for streamlined collaboration and deployment.
- Custom Public API Development: Spearheaded the creation of a bespoke public API leveraging Elasticsearch to aggregate data on over 1.3 million Indian educational institutions gathered through web scraping. Used Amazon Web Services (AWS), Elasticsearch, and shell scripting to ensure scalability and performance.
- Generative AI Stack Bot Design: Conceptualized and designed a sophisticated Generative AI stack bot tailored to specific business requirements, enhancing user engagement by 90%. Leveraged Dialogflow, Firebase, and OpenAI alongside NLP and GPT for advanced conversational capabilities and Text-to-Speech integration.
- Technological Expertise: Demonstrated proficiency across a diverse range of technologies including Amazon Web Services, Elasticsearch, shell scripting, Dialogflow, Firebase, OpenAI, and NLP, showcasing adaptability and versatility in tackling complex project requirements.
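A minimal sketch of the attendance-matching step referenced above, assuming DeepFace's reference-database lookup against a folder of employee photos; the directory layout, camera handling, and detection settings are illustrative assumptions, not the production system.

```python
# Minimal sketch: match a captured frame against stored reference photos with DeepFace.
# Directory names and camera handling are assumptions for illustration.
import cv2
from deepface import DeepFace

def mark_attendance(frame_path: str, reference_db: str) -> list:
    """Return the reference photos that match the face(s) detected in the frame."""
    # DeepFace.find compares the frame against every image under reference_db
    results = DeepFace.find(img_path=frame_path, db_path=reference_db, enforce_detection=False)
    present = []
    for df in results:                              # one dataframe per detected face
        if not df.empty:
            present.append(df.iloc[0]["identity"])  # path of the best-matching photo
    return present

if __name__ == "__main__":
    cam = cv2.VideoCapture(0)  # grab a single frame from the default camera
    ok, frame = cam.read()
    cam.release()
    if ok:
        cv2.imwrite("frame.jpg", frame)
        print(mark_attendance("frame.jpg", "employee_photos/"))
```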
Leapon
February 2023 - June 2023
Back-end Developer
- Customized Back-end Development: Led the development of a tailored back-end solution using Python and the Django framework, ensuring seamless functionality and performance for a website serving over 50 advisors.
- Django REST API Integration: Implemented a comprehensive RESTful API with Django, enabling efficient communication between the website's front-end and back-end systems. Followed industry-standard practices for API design to ensure interoperability, scalability, and security (a minimal sketch appears after this list).
- Continuous Integration and Continuous Deployment (CI/CD): Implemented automated testing and deployment pipelines utilizing CI/CD tools to streamline the development process. Conducted unit testing and bug fixing to optimize features, ensuring the highest level of reliability and stability throughout the development lifecycle.
- Agile Scrum Methodology: Embraced Agile Scrum practices, including sprint planning, daily stand-ups, and iterative development cycles, to foster collaboration and adaptability within the development team. Leveraged tools like Git for version control and JIRA for project management to facilitate efficient tracking and prioritization of tasks.
- Feature-rich Functionality: Developed various features to enhance user experience and functionality, including:
  - Creation of digital cards for advisors, providing a personalized and professional touch to their profiles.
  - Implementation of appointment scheduling functionality similar to Google Calendar, enabling seamless booking and management of appointments.
  - Integration of mass communication/marketing email sending features, facilitating targeted outreach and engagement with clients.
  - Utilization of data analytics capabilities using Google Cloud Platform (GCP) for insightful business intelligence and decision-making.
  - Hosting the back-end infrastructure on Heroku, ensuring scalability, reliability, and security of the system.
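A minimal sketch of the kind of REST endpoint described above, assuming Django REST Framework inside an existing Django app and condensed into one module for readability; the model, field names, and route are illustrative assumptions rather than the actual Leapon codebase.

```python
# Minimal sketch of an advisor endpoint with Django REST Framework.
# Model, fields, and route are assumptions; in a real project these live in separate app files.
from django.db import models
from rest_framework import serializers, viewsets
from rest_framework.routers import DefaultRouter

class Advisor(models.Model):
    """Advisor profile backing the shareable digital card."""
    name = models.CharField(max_length=120)
    title = models.CharField(max_length=120)
    card_slug = models.SlugField(unique=True)

class AdvisorSerializer(serializers.ModelSerializer):
    class Meta:
        model = Advisor
        fields = ["id", "name", "title", "card_slug"]

class AdvisorViewSet(viewsets.ModelViewSet):
    queryset = Advisor.objects.all()
    serializer_class = AdvisorSerializer

# urls.py would register the viewset so CRUD routes appear under /advisors/
router = DefaultRouter()
router.register(r"advisors", AdvisorViewSet)
urlpatterns = router.urls
```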
Maersk Global Service Centres India Pvt. Ltd.
July 2022 - February 2023
Data Science Intern
- Agile Office Seat Pre-booking Web App: Spearheaded the development of an agile web application for office seat pre-booking, leveraging Spring Boot and Next.js technologies. This innovative solution led to an impressive 80% boost in user booking efficiency, enhancing workplace productivity and organization. Technologies utilized include Spring-Boot for backend development, GitHub Actions for continuous integration and deployment, GitHub for version control, Anchor UI for frontend design, and PostgreSQL for database management.
- Clustering and Estimation Models for Ports: Implemented clustering and estimation models for ports within the realm of Big Data analytics, employing techniques such as Gaussian Mixture models for cost estimation to enhance analytical capabilities for port management and optimization. Leveraged PySpark and Databricks for large-scale data processing, alongside Microsoft Azure for cloud infrastructure and deployment (a minimal sketch follows this list).
- Predictive Analytics for Distribution Logistics: Developed a solution approach to predict container turn time and forecast attachment ratio in distribution logistics scenarios. Leveraged quantitative metrics and machine learning techniques, with FLAML (Fast, Lightweight AutoML) identified as the optimal AutoML tool for the task. Utilized ETL (Extract, Transform, Load) processes for data preprocessing, Databricks for scalable data processing, and Azure ML for machine learning model development and deployment.
- Technological Expertise: Demonstrated proficiency across a wide array of technologies including Spring Boot, Next.js, GitHub Actions, GitHub, Anchor UI, PostgreSQL, PySpark, Databricks, Microsoft Azure, and FLAML. This comprehensive skill set enabled the successful implementation of complex solutions spanning web development, big data analytics, and machine learning.
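A minimal sketch of Gaussian Mixture clustering over a port-level table in PySpark, as referenced above; the input path, feature columns, and cluster count are assumptions, since the actual Maersk feature set is not described here.

```python
# Minimal sketch: cluster ports with a Gaussian Mixture model in PySpark.
# Input path, feature columns, and k are assumptions for illustration.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import GaussianMixture

spark = SparkSession.builder.appName("port-clustering").getOrCreate()
ports = spark.read.parquet("ports.parquet")  # hypothetical port-level table

# Assemble assumed numeric features into the single vector column Spark ML expects
assembler = VectorAssembler(
    inputCols=["avg_turn_time", "annual_teu", "handling_cost"],
    outputCol="features",
)
features = assembler.transform(ports)

gmm = GaussianMixture(k=5, featuresCol="features", seed=42)
model = gmm.fit(features)

clustered = model.transform(features)  # adds 'prediction' (cluster id) and 'probability'
clustered.select("port_id", "prediction").show()
```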
HighRadius Corporation
January 2022 - April 2022
Data Science Winter Intern
- Regression Invoice Prediction System: Developed a robust regression-based invoice prediction system employing machine learning models and statistical concepts to streamline financial operations. Leveraging advanced techniques, the system accurately forecasts invoice amounts, aiding in budget planning and resource allocation.
- Model Utilization and Accuracy: Implemented XGBoost, a gradient boosting algorithm, to train the predictive model, achieving 76.15% accuracy. Additionally, utilized frameworks such as Keras to enhance the system's capabilities, ensuring accurate predictions and optimization of financial processes (a minimal sketch follows below).
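A minimal sketch of a regression setup for invoice amounts with XGBoost, in the spirit of the system described above; the CSV path, feature columns, and target are illustrative assumptions, not the HighRadius dataset.

```python
# Minimal sketch: regression on invoice amounts with XGBoost.
# The CSV path, feature columns, and target are assumptions for illustration.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from xgboost import XGBRegressor

df = pd.read_csv("invoices.csv")                              # hypothetical input
X = df[["prior_invoice_mean", "order_count", "credit_days"]]  # assumed numeric features
y = df["invoice_amount"]                                      # assumed target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.05)
model.fit(X_train, y_train)

print("R^2 on held-out invoices:", r2_score(y_test, model.predict(X_test)))
```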
Projects
Featured Projects
Genie File is a revolutionary AI-powered document assistant, a "magic lamp" for unlocking the secrets within your files. It seamlessly extracts and analyzes information from diverse file formats, enabling intuitive question answering and knowledge discovery. By integrating advanced retrieval-augmented generation techniques with vector databases and customizable knowledge graphs, it transforms raw data into actionable insights.
An AI-powered job hunting companion revolutionizing career advancement through resume optimization, cover letter generation, interview preparation, and personalized job recommendations.
Streamlit-based conversational RAG Q&A chatbot providing real-time financial analysis of IPOs for startup companies, backed by a vector database.
A robust computer-aided detection system for gastrointestinal (GI) diseases utilizing a multi-model approach. Leveraging the Kvasir dataset, the system employs deep learning models for accurate detection. Additionally, the RGB images are converted to HSV and YUV color spaces to enhance performance.
Furniture classifier leveraging deep learning and transfer learning models, deployed on AWS SageMaker, achieving 83.16% accuracy in image classification.
Other Projects
A tool for generating blog content through speech recognition or text input, employing Google's generative AI models and customizable parameters for tone, style, and language.
An advanced AI-driven summarization tool tailored for bloggers, offering effortless extraction and concise summarization of diverse document formats into engaging blog content.
Fine-Tuning large language models using state-of-the-art techniques like LoRA for text generation tasks, enabling the models to produce coherent and contextually relevant text responses based on given prompts or instructions.
This project, part of Udacity's AI Programming with Python Nanodegree program, involves developing an image classifier using PyTorch and converting it into a command-line application that predicts the class of an input image along with its probability.
Operationalizing machine learning models on Amazon SageMaker, including setting up the initial environment, configuring S3 buckets, utilizing different SageMaker instance types for tuning, training, and hosting, employing EC2 instances for training, implementing Lambda functions for testing trained endpoints, managing permissions and IAM roles, and configuring concurrency and auto-scaling.
Image classification using AWS SageMaker, where a pretrained CNN model (ResNet18) is fine-tuned to classify dog breeds into 133 categories based on a Dog Breeds dataset.
A basic voice assistant using speech recognition, text-to-speech conversion, and various functionalities like playing songs on YouTube, opening applications, fetching current time, and providing information about a person from Wikipedia.
This Python project implements various neural network models, including CNN, ANN, RBF, and a few more, from scratch to understand the underlying intuition behind each neural network architecture.
VisionaryNet is a web application powered by ResNet50 that lets users upload images and receive real-time identification of the objects they contain. Built with Keras and Django, the application uses ImageNet-pretrained models to provide accurate object recognition.
A machine learning project classifying images of cats and dogs using Convolutional Neural Networks (CNNs) implemented with the Keras Sequential API, achieving an impressive 98.7% accuracy.
Publication
A research paper published in IJCISIM proposes a novel lossy compression technique for hyperspectral images using deep convolutional networks and autoencoders, achieving superior results in compression ratio and Peak Signal-to-Noise Ratio (PSNR) compared to existing methods.