Watch all our videos in full size on our YouTube channel.


Learning to swing up and balance a cart-pole. No prior information is given; the system learns only by interacting with the environment.


Learning ball-dribbling behavior on a real robot through reinforcement learning.


Evolution of the Brainstormers Tribots. Follow the development from a student project to an award-winning autonomous robot soccer team. See the project page for more information or have a look at the gallery.


Best-of scenes from the final match of the RoboCup 2007 world championship tournament in Atlanta (USA). Brainstormers (Germany) play against Wright Eagle (China).


Machine learning meets a good old space shooter: This video shows the results of applying several reinforcement learning approaches (batch-mode RL and, in particular, the Neural Fitted Q-Iteration algorithm) to the game Star Ships, within the Star Ships Learning Framework (SSLF). These results were obtained as part of the Bachelor thesis by Christian Lutz.


Video from the programming workshop with apprentices from our sponsor Harting. The participants programmed a robot to autonomously collect tennis balls. See the project page for more information.


The NeuroRacer slot car racer uses reinforcement learning to learn the maximum speed it can drive on any part of the track without coming off. The system has been demonstrated several times at the Hannover fair. See here for images.


Visual servoing using a Katana robot arm. Reinforcement learning is used to train the robot to grasp a colored die on the table.


The controller learns to swing up and balance an under-motorized pole system using only raw visual information, with features being extracted directly through deep learning.


In the context of the BrainLinks-BrainTools cluster we use electroencephalography to communicate higher-level intentions to an autonomous controller learned via Neural Fitted Q-Iteration.


Learning a controller for the Ms. Pac-Man arcade game by trial and error. The learning algorithm is TD(λ), using a neural network to represent the value function and advanced features as input.
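To illustrate the TD(λ) update named above, here is a minimal sketch on a toy problem. It is not the Ms. Pac-Man controller: it assumes a small random-walk chain as the environment and replaces the neural network with a plain value table, and the function name and all parameter values are hypothetical choices made for this example.

```python
import random


def td_lambda_random_walk(n_states=5, episodes=5000, alpha=0.01,
                          gamma=1.0, lam=0.8, seed=0):
    """Estimate state values of a random walk with tabular TD(lambda).

    Assumed toy task: the agent starts in the middle state and moves left
    or right with equal probability. Leaving the chain on the right gives
    reward 1, leaving it on the left gives reward 0.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states  # value table (stands in for the network)
    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces, reset per episode
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # Temporal-difference error for the observed transition.
            td_error = reward + gamma * next_value - values[state]

            # Accumulating trace: mark the state that was just left.
            traces[state] += 1.0

            # Every state is updated in proportion to its trace,
            # then all traces decay by gamma * lambda.
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam

            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td_lambda_random_walk()])
```

For this chain the true value of state i (counting from 0) is (i + 1) / 6, so the estimates should rise from left to right. With a neural network, the table update would become a gradient step on the network weights.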


Showcase of the prototype system developed during the first stage of the NeuroBots project (https://www.brainlinks-braintools.uni…) in collaboration with the iEEG Lab. Imagined motor commands are used for high-level remote control of an autonomous, reinforcement-learning-based robotic reaching and grasping system located several kilometers away.


In this project we present a fully automated system for (approximately) optimal control of non-linear systems. Our approach jointly learns a non-parametric model of the system dynamics, based on Gaussian Process Regression (GPR), and performs receding horizon control using an adapted iterative LQR formulation. This results in an extremely data-efficient learning algorithm that can operate under real-time constraints. When combined with an effective exploration strategy based on GPR variance, our algorithm successfully learns to control two benchmark problems in simulation (two-link manipulator, cart-pole) as well as a real cart-pole system. Learning succeeds from scratch (without prior knowledge provided by an expert) in fewer than 10 episodes of interaction with the system.
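The role of the GPR variance in exploration can be sketched as follows. This is only an illustration under strong assumptions, not the system described above: it uses one-dimensional inputs, a fixed RBF kernel, and the simplest possible rule (pick the candidate input where the model is least certain). All function names and parameter values are hypothetical.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel for scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_posterior(train_x, train_y, query, noise=1e-4, length_scale=1.0):
    """Return the GP posterior mean and variance at a single query input."""
    gram = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
             for j, b in enumerate(train_x)]
            for i, a in enumerate(train_x)]
    k_star = [rbf(query, a, length_scale) for a in train_x]
    weights = solve(gram, train_y)   # (K + noise*I)^-1 y
    v = solve(gram, k_star)          # (K + noise*I)^-1 k_star
    mean = sum(k * w for k, w in zip(k_star, weights))
    var = rbf(query, query, length_scale) - sum(
        k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)       # clip tiny negative rounding errors


def most_uncertain(train_x, train_y, candidates):
    """Pick the candidate input with the largest posterior variance."""
    return max(candidates, key=lambda q: gp_posterior(train_x, train_y, q)[1])


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0]
    ys = [math.sin(x) for x in xs]
    print(most_uncertain(xs, ys, [0.5, 1.5, 5.0]))
```

The posterior variance is small near observed inputs and approaches the prior variance far from them, so a variance-driven rule steers data collection toward regions the dynamics model has not yet seen.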