Full Text Available

End-to-End Autonomous Quadcopter using Reinforcement Learning

This thesis investigates the potential of Reinforcement Learning (RL) for achieving robust and adaptable quadcopter control, focusing on trajectory and attitude stabilization. We compare state-of-the-art RL algorithms, specifically Proximal Policy Optimization (PPO), against traditional Proportional...

Bibliographic Details
Main Author:	Chawa, Mohamed Marwan
Format:	Thesis
Published:	AUC Knowledge Fountain 2024
Subjects:	Quadcopter Quadrotor UAV Reinforcement Learning Simulation-To-Real Acoustics, Dynamics, and Controls Robotics

_version_	1867613424193634304
access_status_str	Open Access
author	Chawa, Mohamed Marwan
author_browse	Chawa, Mohamed Marwan
author_facet	Chawa, Mohamed Marwan
author_sort	Chawa, Mohamed Marwan
collection	Thesis
description	This thesis investigates the potential of Reinforcement Learning (RL) for achieving robust and adaptable quadcopter control, focusing on trajectory and attitude stabilization. We compare state-of-the-art RL algorithms, specifically Proximal Policy Optimization (PPO), against traditional Proportional-Integral-Derivative (PID) controllers across three tasks: hovering, slow trajectory following, and fast trajectory following. To enhance realism, we employ a modified PyFlyt simulation environment with a high-fidelity Crazyflie 2.x model, accounting for motor dynamics, noise, wind disturbances, and aerodynamic drag. The challenge of operating a quadcopter can be divided into two distinct parts: planning a flight path and actually following that path. Our focus is on the latter, training a reinforcement learning controller within a highly realistic simulated setting. This environment directly links simulated sensor data to motor commands, mimicking the real-world operation of the quadcopter. This end-to-end approach allows us to train the controller on the complete process, from sensing the environment to executing actions. Our results show that end-to-end RL-based controllers consistently outperform both gain-scheduling and cascaded PID controllers in maintaining stable hover, particularly under strong wind conditions, where the RL controller achieved 70% of the average reward criterion while cascaded PID and gain-scheduling PID scored 45% and 24%, respectively. Traditional PID controllers, while widely used, often struggle to maintain stability in the face of external disturbances and changing system dynamics. This is particularly evident in challenging scenarios such as strong winds, where their fixed gains may not be sufficient to counteract the destabilizing forces. RL, on the other hand, learns to adapt its control strategy based on the environment, making it more robust to such disturbances. 
In slow trajectory following, RL demonstrates a wider range of motion and smoother angular control, achieving performance comparable to cascaded PID. Specifically, RL exhibited a 15% increase in the range of motion along the x and y axes compared to gain-scheduling PID, and a 10% reduction in angular velocity fluctuations compared to both PID controllers. This improvement can be attributed to RL’s ability to learn complex control policies that optimize for both position and attitude objectives simultaneously. For fast trajectory following, RL enables significantly faster navigation compared to slower scenarios. In our experiments, RL achieved more than a 60% reduction in the time taken to complete the trajectory compared to the slow trajectory-following task. This highlights RL’s ability to exploit the full capabilities of the quadcopter’s actuators when the task demands it. These findings highlight the limitations of traditional PID controllers in dynamic environments and demonstrate the superior adaptability and robustness of RL-based controllers. The ability of RL to learn and adapt to changing conditions makes it a promising approach for quadcopter control in real-world applications where the environment is often unpredictable and disturbances are common. Future research will focus on generalizing these results across different platforms, establishing formal stability guarantees, and exploring real-world applications in complex tasks such as obstacle avoidance and collaborative flight. One of the key challenges in generalizing RL to different quadcopter platforms is the variation in physical parameters and dynamics. Addressing this challenge will require developing RL algorithms that can either adapt to these variations online or learn from a diverse set of training environments. Additionally, while RL has shown promising results in simulation, ensuring stability in real-world deployments remains an open research question. 
Future work will explore methods for providing formal stability guarantees for RL-based controllers. Finally, extending RL to complex tasks such as obstacle avoidance and collaborative flight will require addressing issues such as sensor limitations, partial observability, and multi-agent coordination. By addressing these challenges, RL has the potential to revolutionize quadcopter control, enabling the deployment of autonomous aerial vehicles in a wide range of applications that were previously considered too complex or dangerous for traditional control methods.
format	Thesis
id	oai:fount.aucegypt.edu:etds-3431
institution	American University in Cairo (Egypt)
last_indexed	2026-06-10T12:35:55.364Z
license_str	Not specified — see source repository
provenance_str_mv	Harvested via OAI-PMH from AUC Knowledge Fountain — bepress
publishDate	2024
publishDateRange	2024
publishDateSort	2024
publisher	AUC Knowledge Fountain
publisherStr	AUC Knowledge Fountain
record_format	dspace
source_str	AUC Knowledge Fountain — bepress
spelling	oai:fount.aucegypt.edu:etds-3431 End-to-End Autonomous Quadcopter using Reinforcement Learning Chawa, Mohamed Marwan 2024-12-25T08:00:00Z thesis application/pdf https://fount.aucegypt.edu/etds/2421 https://fount.aucegypt.edu/context/etds/article/3431/viewcontent/main.pdf Theses and Dissertations AUC Knowledge Fountain Quadcopter Quadrotor UAV Reinforcement Learning Simulation-To-Real Acoustics, Dynamics, and Controls Robotics
spellingShingle	Quadcopter Quadrotor UAV Reinforcement Learning Simulation-To-Real Acoustics, Dynamics, and Controls Robotics Chawa, Mohamed Marwan End-to-End Autonomous Quadcopter using Reinforcement Learning
title	End-to-End Autonomous Quadcopter using Reinforcement Learning
title_full	End-to-End Autonomous Quadcopter using Reinforcement Learning
title_fullStr	End-to-End Autonomous Quadcopter using Reinforcement Learning
title_full_unstemmed	End-to-End Autonomous Quadcopter using Reinforcement Learning
title_short	End-to-End Autonomous Quadcopter using Reinforcement Learning
title_sort	end to end autonomous quadcopter using reinforcement learning
topic	Quadcopter Quadrotor UAV Reinforcement Learning Simulation-To-Real Acoustics, Dynamics, and Controls Robotics
url	https://fount.aucegypt.edu/etds/2421 https://fount.aucegypt.edu/context/etds/article/3431/viewcontent/main.pdf
work_keys_str_mv	AT chawamohamedmarwan endtoendautonomousquadcopterusingreinforcementlearning

Similar Items