Robots that are supposed to perform human-like tasks must possess appropriate skills
to carry them out. In unstructured environments and for complex tasks, these skills are
difficult to pre-program due to the complexity of the real world. It is therefore advantageous
if robots have the ability to acquire the necessary skills by learning.
Dynamic movement primitives (DMPs) have proven to be an effective movement representation
for motor skill learning. In this thesis, we propose a new approach for training
deep neural networks to synthesize DMPs. The distinguishing property of our approach is
a novel loss function that measures the physical distance between movement trajectories
rather than the distance between DMP parameters, which have no physical meaning.
This was made possible by deriving differential equations that can be used to
compute the gradients of the proposed loss function, thus enabling backpropagation to
optimize the parameters of the underlying deep neural network.
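The difference between a parameter-space loss and a trajectory-space loss can be illustrated with a minimal sketch. It assumes a standard one-dimensional discrete DMP integrated by the explicit Euler method; the function names, gain values, and basis-width heuristic are hypothetical choices for illustration and are not taken from the thesis. The sketch only evaluates the two losses in the forward direction and does not implement the gradient equations derived in the thesis.

```python
import numpy as np

def integrate_dmp(w, y0=0.0, g=1.0, tau=1.0, dt=0.01,
                  alpha_z=48.0, beta_z=12.0, alpha_x=2.0):
    """Euler-integrate a 1-D discrete DMP and return the position trajectory.

    All gains and the basis-function layout are illustrative assumptions.
    """
    w = np.asarray(w, dtype=float)
    n = w.size
    # Basis centres spread along the exponentially decaying phase variable x.
    c = np.exp(-alpha_x * np.linspace(0.0, 1.0, n))
    h = np.full(n, float(n) ** 2)  # equal widths (simple heuristic, assumed)
    x, y, z = 1.0, y0, 0.0
    steps = int(round(tau / dt))
    traj = np.empty(steps + 1)
    traj[0] = y
    for k in range(steps):
        psi = np.exp(-h * (x - c) ** 2)
        # Forcing term: normalized weighted sum of basis functions, gated by x.
        f = x * (psi @ w) / (psi.sum() + 1e-10)
        dz = (alpha_z * (beta_z * (g - y) - z) + f) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        traj[k + 1] = y
    return traj

def parameter_loss(w_a, w_b):
    """Mean squared distance between DMP weight vectors."""
    diff = np.asarray(w_a, dtype=float) - np.asarray(w_b, dtype=float)
    return float(np.mean(diff ** 2))

def trajectory_loss(w_a, w_b):
    """Mean squared distance between the trajectories the weights generate."""
    return float(np.mean((integrate_dmp(w_a) - integrate_dmp(w_b)) ** 2))

# Two perturbations of equal size applied to different basis functions.
w_ref = np.zeros(10)
w_early = w_ref.copy()
w_early[0] += 50.0
w_late = w_ref.copy()
w_late[-1] += 50.0

print("parameter loss :", parameter_loss(w_ref, w_early), parameter_loss(w_ref, w_late))
print("trajectory loss:", trajectory_loss(w_ref, w_early), trajectory_loss(w_ref, w_late))
```

Both perturbations are equally far from the reference in parameter space, yet they act at different phases of the movement and therefore generally change the resulting trajectory by different amounts. This is the kind of discrepancy a trajectory-level loss is meant to capture.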
The choice of an appropriate representation is important when reconstructing motion.
A version of DMPs called arc-length dynamic movement primitive (AL-DMP) can separate
the spatial from temporal aspects of motion and is more suitable for processing data that do
not contain temporal information. We therefore extended our approach to neural networks
that can synthesize spatial paths represented by AL-DMPs.
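The idea of separating the spatial path from its timing can be sketched by re-parameterizing a sampled path by arc length. This is not the AL-DMP formulation itself, only the underlying re-parameterization step; the function name `resample_by_arc_length` and the example data are hypothetical.

```python
import numpy as np

def resample_by_arc_length(points, n_samples):
    """Resample a sampled path at equal arc-length increments.

    points: array of shape (N, D), ordered along the path.
    Returns the arc-length values and the resampled points.
    """
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate(([0.0], np.cumsum(seg)))  # cumulative arc length
    s_uniform = np.linspace(0.0, s[-1], n_samples)
    resampled = np.column_stack(
        [np.interp(s_uniform, s, points[:, d]) for d in range(points.shape[1])]
    )
    return s_uniform, resampled

# The same straight line from (0, 0) to (3, 4), traversed with two speed profiles.
t = np.linspace(0.0, 1.0, 50)
direction = np.array([3.0, 4.0])
path_constant_speed = np.outer(t, direction)
path_accelerating = np.outer(t ** 2, direction)

s_a, res_a = resample_by_arc_length(path_constant_speed, 11)
s_b, res_b = resample_by_arc_length(path_accelerating, 11)
print(np.allclose(res_a, res_b))
```

After resampling, the two recordings coincide even though they were traversed with different timing, which is why an arc-length representation suits data such as images of handwriting that carry no temporal information.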
While the developed approaches are applicable to any neural network architecture,
they were evaluated on two different architectures based on encoder–decoder networks and
convolutional neural networks. Moreover, we propose deep neural network architectures
that support the processing of variable-size images and images with cluttered backgrounds.
The developed approaches were applied to the reproduction of handwritten digits from
single images. Our results show that minimizing the proposed loss functions leads
to better results than minimizing more conventional loss functions. Our experiments
also show that the network can be applied to input images whose size differs from
the size of the training images. Finally, the proposed approaches were successfully applied to
reproduce real handwritten digits with a humanoid robot.
Just like humans, robots can improve their performance by practicing, i.e., by performing
the desired behavior many times and updating the underlying skill representation
using the newly gathered data. In this thesis, we propose to implement robot practicing
by applying statistical learning and reinforcement learning (RL) in a latent space of the selected
skill representation. The latent space is computed by a deep autoencoder neural network
trained on data generated in simulation. Nevertheless, we show that the
resulting latent space representation is also useful for learning on a real robot.
Our simulation and real-world results demonstrate that exploiting the latent space
of the underlying motor skill representation significantly reduces the amount of data
needed for effective learning by Gaussian process regression (GPR). Similarly,
the number of RL epochs can be significantly reduced. Finally, our
results show that an autoencoder-based latent space is more effective for these purposes than a
latent space computed by principal component analysis (PCA).
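For orientation, the PCA baseline mentioned above can be sketched as a linear encoder and decoder obtained from a singular value decomposition. The sketch uses synthetic data in place of real skill parameters, and the names `fit_pca`, `encode`, and `decode` are hypothetical. It covers only the linear baseline; the autoencoder used in the thesis replaces these linear maps with learned nonlinear ones.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the data mean and the leading principal directions (as rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(X, mean, components):
    """Project skill parameters into the low-dimensional latent space."""
    return (X - mean) @ components.T

def decode(Z, mean, components):
    """Map latent coordinates back to the full parameter space."""
    return Z @ components + mean

# Synthetic stand-in for skill parameters: 20-D vectors that lie on a 2-D subspace.
rng = np.random.default_rng(0)
Z_true = rng.normal(size=(200, 2))
A = rng.normal(size=(2, 20))
X = Z_true @ A + 0.5

mean, components = fit_pca(X, n_components=2)
Z = encode(X, mean, components)
X_rec = decode(Z, mean, components)
print(Z.shape, np.allclose(X_rec, X))
```

Learning methods such as GPR or RL would then operate on the low-dimensional `Z` instead of the full parameter vectors. PCA recovers the data exactly here only because the synthetic data are linear by construction; for data lying on a curved manifold, a nonlinear autoencoder can represent the same data with fewer latent dimensions.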