Kintan Saha
I am a third-year undergraduate (rising junior) pursuing a B.Tech in Mathematics and Computing at the Indian Institute of Science (IISc), Bengaluru. My GPA through the end of semester 4 is 9.3/10.0.
My research interests lie in Reinforcement Learning, with a focus on Stochastic Approximation methods and establishing theoretical guarantees for learning algorithms, as well as in theoretical aspects of generative models, particularly Diffusion Models. I am also interested in 3D Computer Vision, especially 3D Scene Reconstruction and Novel View Synthesis.
In addition, I have contributed to peer-reviewed publications submitted to top-tier venues and bring a strong theoretical grounding, supported by advanced coursework and rigorous, application-driven research projects.
You can view or download my full resume here: View / Download Resume
skills
My core technical expertise spans:
Languages and Frameworks
research experience
|
May 2025 - Ongoing |
Reliable Critics: Monotonic Improvement and Convergence Guarantees for Reinforcement Learning Advisor
Lab
Description
This project extends the Reliable Policy Iteration (RPI) framework to augment multiple SOTA algorithms such as PPO, TD3, and DDPG, and tests the augmented algorithms on diverse environments such as Atari, MuJoCo, and MiniGrid. The goal is to establish new SOTA results on these environments with the RPI-augmented algorithms.
Role
Designed a novel loss function incorporating the RPI framework, usable as a plug-and-play substitute in SOTA Deep RL algorithms such as PPO, TD3, and DDPG. Also ran extensive experiments and ablation studies on the extremely sparse-reward MiniGrid environment to establish new baselines.
Technical Stack
PyTorch, Stable-Baselines3, WandB (for logging and hyperparameter sweeps), Matplotlib, Seaborn
|
|---|---|
|
May 2025 - Ongoing |
Feed Forward Deblurring in 3DGS Advisor
Description
This project aims to create a generalisable, scene-agnostic deblurring framework that can be integrated into 3DGS foundation models.
Role
Developing a deblurring framework that can be readily plugged into SOTA 3DGS foundation models such as NoPoSplat and DUSt3R to deblur scenes in a feed-forward fashion. Current methods that tackle deblurring within the 3DGS framework are scene-specific; we aim to develop a scene-agnostic framework.
Technical Stack
PyTorch, Hydra (config management), Blender, PyTorch Lightning, Weights and Biases
|
|
Jan 2025 - Apr 2025 |
Towards Uncertainty-aware Alignment Advisor
Description
Developed an alignment framework with uncertainty quantification for Preference-based RL. This framework was extended to LLM alignment by modifying PPO (Proximal Policy Optimization) to account for uncertainty in the reward estimates of the reward models used in the RLHF pipeline.
Role
Experimentally verified the LLM alignment framework by modifying the RLHF pipeline to include our novel uncertainty estimation framework. The framework was tested on LLMs of multiple sizes (GPT-2, Qwen2.5, Mistral-7B) and with multiple reward models, including custom ensemble reward models and prompted reward models such as Gemini 2.0 and DeepSeek-V3.
Technical Stack
Hugging Face Transformers, Hugging Face TRL (Transformer Reinforcement Learning), Weights and Biases
Status
Submitted for review at NeurIPS 2025
|
|
Jan 2024 - Jun 2024 |
HinglishEval: Evaluating the Effectiveness of Code-generation Models on Hinglish Prompts Advisor
Description
This project evaluated code-generation LLMs on Hinglish prompts obtained by translating the HumanEval dataset. The end goal is to assess how effective such code-generation LLMs are for CS101 courses in the Indian context.
Role
Helped translate the HumanEval dataset to Hinglish and evaluate multiple code-generation LLMs such as GPT-4, Gemma, Phi-3, PolyCoder, and StarCoder. The evaluation criteria were pass@k and Item Response Theory (IRT).
Technical Stack
Hugging Face Transformers, OpenAI API, Matplotlib
Source Code
|
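For readers unfamiliar with the pass@k metric mentioned in the HinglishEval entry, here is a minimal sketch of the unbiased estimator introduced alongside the original HumanEval benchmark: given `n` sampled completions for a problem, of which `c` pass the unit tests, pass@k is estimated as 1 - C(n-c, k) / C(n, k). The function name and argument checks below are illustrative assumptions and are not taken from the project's source code.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of completions sampled for the problem
    c: number of those completions that pass all unit tests
    k: sample budget being evaluated (1 <= k <= n)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("expected 0 <= c <= n and 1 <= k <= n")
    # If fewer than k completions fail, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains at least one pass.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(pass_at_k(n=10, c=3, k=1))
```

A benchmark-level score is typically the mean of this per-problem estimate over all problems in the dataset.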
research publications
presentations
| Jun 01, 2025 | Presentation on Structure From Motion |
|---|---|
| May 23, 2025 | Presentation on Diffusion and Flow based models |
| May 19, 2025 | Presentation on 3D Scene Representation |
| Apr 14, 2025 | Presentation of course project for UMC203 - AI and ML |
| Apr 14, 2025 | Presentation of course project for E1 277 - Reinforcement Learning |
| Dec 04, 2024 | Presentation at ACM COMPUTE 2024 of my paper |
| Oct 20, 2024 | Presentation on Spectral Clustering |
volunteering activity
Notable volunteering activities:
- I am a senior core committee member of Databased, the IISc UG computer science club.
- I am a co-convener of Rhythmica – the IISc music club. For details on my musical journey, please refer to the music section.