Thesis etd-10032023-145821
Thesis type
PhD
Author
NAZEER, MUHAMMAD SUNNY
Email address
sunnynazeer511@gmail.com
URN
etd-10032023-145821
Title
Imitative and Adaptive Control of Soft Continuum Arms via Learning Strategies
Scientific disciplinary sector
ING-IND/34
Course of study
Istituto di Biorobotica - PHD IN BIOROBOTICA
Committee
Supervisor Dr. FALOTICO, EGIDIO
Member Prof. MONJE MICHARET, CONCEPCIÓN ALICIA
Member Dr. THURUTHEL, THOMAS
Member Prof. CIANCHETTI, MATTEO
Chair Prof. LASCHI, CECILIA
Keywords
- Imitation Learning
- Reinforcement Learning
- Adaptive Control
- Real-time control
- Soft robotics
- Self-healing soft robots
- Self-healing soft sensors
Defense session start date
19/03/2024
Availability
Partial
Abstract
The thesis explores the application of various Machine Learning (ML) algorithms to the control of soft robotic arms that deform in a continuum manner. Traditional ML approaches require a significant number of samples for effective learning. The aim of this study is to learn control solutions online, using as few samples as possible, while remaining robust to the diverse behaviors exhibited by soft robots.
The first class of ML chosen was Imitation Learning (IL), which is proficient at learning sophisticated tasks within limited time and samples. However, it relies on expert demonstrations, leading to sample-efficient solutions with limited generalization capabilities. The difficulty of generalizing well is exacerbated for soft robots, in particular, by their unpredictable behavior. Moreover, if IL includes human kinesthetic demonstrations, the demonstrations may vary substantially because of the flexible morphology of soft robots, which further reduces the reliability of trained solutions. We proposed a new algorithm, called Soft DAgger, designed to deal with these challenges by using Transfer Learning (TL) and a sample-efficient synthetic agent, while addressing the challenges of the vanilla DAgger algorithm reported in the literature. Three case studies were conducted under IL. The first case study tests a simpler variant of Soft DAgger, with TL and a kinematic behavioral map of a plant-inspired tendon-driven soft arm, in the kinematic domain, for learning from plant gravitropic movements. Subsequently, Soft DAgger was employed in a more advanced task in the dynamic domain of a pneumatically actuated soft arm, where the task is learned from unconstrained human kinesthetic demonstrations only. The algorithm exhibited improved generalization capability despite the stochasticity of the soft robot and the variability in expert demonstrations. An additional case study was undertaken that involves learning by observation, where the learner and the observer have completely different morphologies.
It was concluded from the case studies on IL that these approaches are more applicable when qualitative task learning is required; precision in task execution requires additional exploratory agents, such as Reinforcement Learning (RL)-inspired agents. RL solutions, though inherently robust, suffer from sample inefficiency, making them impractical for direct online learning on complex platforms like soft robots. This problem has been mitigated in the literature by modelling soft robot behavior using data-driven recurrent architectures. However, these models do not account for the stochasticity of soft robots and accumulate error over time. As a result, RL solutions trained on such models still exhibit a significant performance disparity between the training environment or data-driven model and the real platform (referred to here as the training-to-reality gap). In light of this, a neuroscience-inspired recurrent cerebellar architecture, in open-loop and closed-loop configurations, and an IL-by-coaching approach are employed in conjunction with offline RL training algorithms to overcome the training-to-reality gap. These approaches were tested on two different types of pneumatic soft arms for adaptive and precise task execution, with a special focus on online and sample-efficient learning. Two case studies were undertaken under RL. The first addresses overcoming stochasticity, the training-to-reality gap, and constant external stress on a two-module soft arm for trajectory tracking with $\leq1\%$ tracking error. The second addresses overcoming stochasticity, the training-to-reality gap, and recovery from a variety of deliberately inflicted damage incidents on a three-module soft arm for obstacle avoidance and dynamic reaching with $\leq1\%$ reaching error.
The strategies build on the strengths of IL and RL approaches while addressing existing challenges in the state of the art. These solutions introduce imitative and adaptive behaviors in the control of soft robots. We wish to deploy these algorithms in a class of soft robotics involving functional materials such as self-healing materials, contributing to a new adaptive modality in soft continuum robotics. As a first step towards this aim, an additional area was targeted in this thesis: the design and fabrication of three independent self-healing soft robots, two pneumatically actuated and one tendon-driven. The capabilities of these robots are characterized and outlined. Finally, by combining the adaptive learning capability with behavior-recovering self-healing soft robots, we also aspire to instill self-sufficiency in these platforms by simultaneously introducing local deformation sensing capability. Towards that goal, a study on two different designs of self-healing strain sensors is presented at the end of the thesis. The two designs were fabricated with three different sensing modalities, i.e., liquid metal (Galinstan), laser-induced graphene, and conductive hydrogel, resulting in six different self-healing strain sensors.
File
File name | Size |
---|---|
There is 1 file restricted at the author's request. |