Date of Award


Document Type


Degree Name

Master of Science (MS)


Computer Science

Committee Chair/Advisor

Dr. Nina Christine Hubig

Committee Member

Dr. Vidya (Seyedehzahra) Samadi

Committee Member

Dr. Feng Luo


Changes in demand, varying hydrological inputs, and environmental stressors are among the issues that water managers and policymakers face on a regular basis. These concerns have sparked interest in applying different techniques to determine reservoir operation policy and improve reservoir release decisions. As the resolution of the analysis rises, it becomes more difficult to represent a real-world system effectively using traditional approaches for determining the best reservoir operation policy. One of the challenges is the “curse of dimensionality,” which arises when the discretization of the state and action spaces becomes finer or when more state or action variables are taken into account. Because of this curse, the number of state-action variables that can be handled is limited, rendering Dynamic Programming (DP) and Stochastic Dynamic Programming (SDP) ineffective for complex reservoir optimization problems. Deep Reinforcement Learning (DRL) is an intelligent approach for overcoming this curse in the stochastic optimization of reservoir system planning. This study examined several novel DRL continuous-action policy gradient methods (PGMs), including Deep Deterministic Policy Gradients (DDPG), Twin Delayed DDPG (TD3), and two different versions of Soft Actor-Critic (SAC18 and SAC19), to identify the optimal reservoir operation policy for the Folsom Reservoir in California, US. The Folsom Reservoir supplies agricultural and municipal water, hydropower, environmental flows, and flood protection to the City of Sacramento. We evaluated the release decisions of the DRL methods with respect to these demands and compared the results to the standard operating policy (SOP) and base conditions using different performance criteria and sustainability indices. The TD3 and SAC methods showed promising performance in providing an optimal operation policy. Experiments on the continuous action spaces of reservoir operation decisions demonstrated that the DRL techniques can efficiently learn strategic policies in spaces subject to the curses of dimensionality and modeling.
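
To make the curse of dimensionality concrete, the following is a minimal sketch (not taken from the thesis) of how the number of entries in a tabular DP or SDP formulation grows as the discretization becomes finer. The function name dp_table_size, the choice of three state variables and one action variable, and the discretization levels are all hypothetical and chosen only for illustration.

```python
def dp_table_size(n_state_vars: int, n_action_vars: int, levels: int) -> int:
    """Count state-action pairs when every variable is split into `levels` bins.

    Assumes a uniform discretization across all variables, which is a
    simplification made for this illustration.
    """
    return levels ** (n_state_vars + n_action_vars)


# Hypothetical setup: three state variables (for example storage, inflow,
# and time of year) and one action variable (release).
sizes = {levels: dp_table_size(3, 1, levels) for levels in (10, 50, 100)}

for levels, size in sizes.items():
    print(f"{levels:>4} levels per variable -> {size:,} state-action pairs")
```

The count grows exponentially with the number of variables and polynomially with the number of levels, which is why continuous-action methods that avoid an explicit table are attractive for this kind of problem.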

Author ORCID Identifier 0000-0001-9157-3397


