We aim to combine the strengths of modal-based algorithmic reasoning and model-free deep learning for robust robot decision-making under uncertainty. Key to our approach is a unified neural network policy representation, encoding both a learned system model and an algorithm that solves the model. The network is fully differentiable and can be trained end-to-end, circumventing the difficulties of direct model learning in a partially observable setting. In contrast with conventional deep neural networks, our network representation imposes the model and algorithmic priors on the neural network architecture for improved generalization of the learned policy.
P. Karkus, D. Hsu, and W. S. Lee. QMDP-Net: Deep learning for planning under partial observability. In Advances in Neural Information Processing Systems, 2017.
QMDP-net employs algorithm priors on a neural network for planning under partial observability. The network encodes a learned POMDP model together with QMDP, a simple, approximate POMDP planner, thus embedding the solution structure of planning in a network learning architecture. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set, and “transfer” to other similar tasks beyond the set. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.
Navigation Network (Nav-Net)
P. Karkus, D. Hsu, and W. S. Lee. Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation. In MLPC Workshop, International Conference on Robotics and Automation, 2018.
This work extends the QMDP-net, and encodes all components of a larger robotic system in a single neural network: state estimation, planning, and control. We apply the idea to a challenging partially observable navigation task: a robot must navigate to a goal in a previously unseen 3-D environment without knowing its initial location, and instead relying on a 2-D floor map and visual observations from an onboard camera.
Particle Filter Network (PF-Net)
P. Karkus, D. Hsu, and W. S. Lee. Particle Filter Networks with Application to Visual Localization. arXiv preprint arXiv:1805.08975, 2018.
In this work, we encode the Particle Filter algorithm in a differentiable neural network. PF-net enables end-to-end model learning, which trains the model in the context of a specific algorithm, resulting in improved performance, compared with conventional model-learning methods. We apply PF-net to visual robot localization. The robot must localize in rich 3-D environments, using only a schematic 2-D floor map. PF-net learns effective models that generalize to new, unseen environments. It can also incorporate semantic labels on the floor map.