Date of Award


Document Type


Degree Name

Doctor of Philosophy (PhD)


Automotive Engineering

Committee Chair/Advisor

Dr. Robert Prucka

Committee Member

Dr. Qilun Zhu

Committee Member

Dr. Benjamin Lawler

Committee Member

Dr. Zoran Filipi


The push for improved fuel economy and reduced tailpipe emissions has significantly increased automotive powertrain complexity, and with it the time and money needed to develop powertrains. Powertrain performance depends heavily on the quality of the controller and its calibration, and modern powertrains have reached a level of complexity where developing controllers with traditional design-of-experiments methodologies can take years. Recently, reinforcement learning (RL), a machine learning technique, has emerged as a method for rapidly creating optimal controllers directly for systems of arbitrary complexity, creating an opportunity to use RL to reduce the overall time and monetary cost of control development. These reductions are possible because RL optimizes control policies by establishing a direct relationship between sensor signals, control actions, and objective-function rewards. This bypasses the need to model or understand the underlying system dynamics, removing the need for expert-level calibration and the derivation of control-oriented models during controller development. RL can also optimize system performance over long-term horizons while scheduling control actions at high frequency, making it an appealing candidate for controlling systems with dynamics of mixed timescales, a task many other optimal control techniques struggle with. This is particularly valuable for hybrid powertrain control, which combines fast internal combustion engine dynamics with slow battery/electrical system dynamics. As RL is a relatively new control technique, research must be conducted to investigate how best to use it for powertrain control development. First, prior research is examined to identify challenges in powertrain control development.
Next, an experiment demonstrates how RL can learn from and control an organic Rankine cycle waste heat recovery (ORC-WHR) system that is too complex to control with other real-time optimization techniques, validating RL's ability to bypass a control-oriented modeling and/or calibration step. An RL-based powertrain control development methodology is then proposed to address shortcomings found in the ORC-WHR study and additional control development challenges identified earlier. Maximum a posteriori policy optimization (MPO) is identified as an RL algorithm well suited to powertrain control because it can natively control a combined continuous-discrete action space. To prove its viability, MPO is compared to an optimal rule-based controller and other standard RL algorithms; MPO achieves superior performance and is the only RL algorithm examined that learns safe engine-switching operation. To optimize RL-based control performance over non-terminating scenarios using a "direct" reward function, it is found that a dual replay buffer must be utilized. The direct reward function is similar to those used by finite-horizon global optimization techniques; critically, RL can optimize this formulation while maintaining online functionality. The direct reward function achieves superior performance compared to reward functions common to receding-horizon optimizers, and the proposed formulation avoids artificially restricting control authority, allowing full exploitation of the operational range and performance potential of the evaluated hybrid electric vehicle. With the methodology's choices validated, a case study is conducted to demonstrate how the methodology can be used: rapidly assessing the fuel consumption of a tracked series hybrid electric vehicle with various battery capacities. Finally, a summary of the contributions of this work is provided.

Author ORCID Identifier


Available for download on Friday, May 31, 2024