Kintan Saha | Reinforcement Learning & Computer Vision Research
Semester 5 computer science courses at IISc

course name	course details	course grade	course instructor
Topics in Stochastic Approximation Algorithms (E1 396)	Introduction to stochastic approximation algorithms, ordinary differential equation based convergence analysis, stability of iterates, multi-timescale stochastic approximation, asynchronous update algorithms, gradient search based techniques, topics in stochastic control, infinite horizon discounted and long run average cost criteria, algorithms for reinforcement learning. We are following the textbook 'Stochastic Approximation: A Dynamical Systems Viewpoint' by Vivek Borkar.	A+	Prof Shalabh Bhatnagar
Computational Methods of Optimisation (E0 230)	Need for unconstrained methods in solving constrained problems. Necessary conditions of unconstrained optimization. Methods of line search, Goldstein and Wolfe conditions for partial line search. Global convergence theorem, steepest descent method. Quasi-Newton methods: DFP, BFGS, Broyden family. Conjugate-direction methods: Fletcher-Reeves, Polak-Ribière. KKT conditions, convex programming, duality, linear programming, simplex method, gradient projection, penalty methods. We are following the textbooks 'Practical Methods of Optimization' by R. Fletcher and 'Numerical Optimization' by Nocedal and Wright.	A+	Prof Chiranjib Bhattacharyya
Concentration Inequalities (E2 207)	Introduction and motivation: limit results and concentration bounds. Chernoff bounds: Hoeffding’s inequality, Bennett’s inequality, Bernstein’s inequality. Variance bounds: Efron-Stein inequality, Poincaré inequality. The entropy method and bounded difference inequalities. Log-Sobolev inequalities and hypercontractivity. Isoperimetric inequalities (Talagrand's convex distance inequality). The transportation method. Influences and threshold phenomena. We are following the textbook 'Concentration Inequalities: A Nonasymptotic Theory of Independence' by Gábor Lugosi, Pascal Massart, and Stéphane Boucheron.	A+	Prof Aditya Gopalan, Prof Chandra R. Murthy
Theory of Multi-armed Bandits (E1 240)	This course introduces the theory and algorithms underlying the multi-armed bandit (MAB) problem in various settings, along with the fundamental limits of the framework (lower bounds). We are following the textbook 'Bandit Algorithms' by Tor Lattimore and Csaba Szepesvári.	A	Prof Shubhada Agrawal