
Speaker "Sneha Muppalla" Details


Topic

Paper Reading on Flow Matching Policy Gradients

Abstract

This presentation reviews the research paper "Flow Matching Policy Gradients," which introduces the Flow Policy Optimization (FPO) algorithm. FPO is an on-policy reinforcement learning approach that integrates flow-based generative models (including diffusion models) into the standard policy gradient framework, and it remains compatible with popular surrogate objectives such as PPO-clip.
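As a rough illustration of the core idea (a sketch under stated assumptions, not code from the paper), the snippet below shows one way a PPO-clip surrogate can be formed when exact policy likelihoods of a flow-based policy are unavailable: the likelihood ratio is approximated by a ratio of exponentiated per-sample conditional flow matching (CFM) losses. The function names (cfm_loss_estimate, fpo_clip_loss), the model call signature, and the straight-line interpolation path are illustrative assumptions, not the paper's exact implementation.

import torch

def cfm_loss_estimate(model, state, action, num_samples=4):
    # Per-sample Monte Carlo estimate of the conditional flow matching loss:
    # regress the predicted velocity field onto (action - noise) along a
    # straight-line path between Gaussian noise and the sampled action.
    losses = []
    for _ in range(num_samples):
        t = torch.rand(action.shape[0], 1)        # random flow times in [0, 1)
        noise = torch.randn_like(action)
        x_t = (1 - t) * noise + t * action        # linear interpolation path
        target_v = action - noise                 # target velocity along the path
        pred_v = model(x_t, t, state)             # assumed signature: (x_t, t, condition)
        losses.append(((pred_v - target_v) ** 2).mean(dim=-1))
    return torch.stack(losses).mean(dim=0)        # shape: (batch,)

def fpo_clip_loss(cfm_new, cfm_old, advantages, clip_eps=0.2):
    # PPO-clip surrogate where exp(L_cfm(theta_old) - L_cfm(theta))
    # stands in for the likelihood ratio pi_theta(a|s) / pi_theta_old(a|s),
    # avoiding exact likelihood computation for the flow-based policy.
    ratio = torch.exp(cfm_old.detach() - cfm_new)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Pessimistic (clipped) bound, negated so an optimizer can minimize it.
    return -torch.mean(torch.min(unclipped, clipped))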

Profile

Sneha Muppalla is pursuing a dual degree in Electrical Engineering and Computer Science (EECS) and Business under UC Berkeley’s MET Program. Her technical interests center on computer vision and on building generative vision systems that move beyond pattern recognition toward physical and causal understanding. She is especially interested in the intersection of visual intelligence, generative models, and real-world perception, with experience spanning CS coursework, systems, and research.