Top 5 open-source reinforcement learning frameworks

The reward-and-penalty approach of reinforcement learning has come a long way from its early days. While the technique took a while to mature and is not the simplest to apply, it is behind some of the most important advances in AI, from the driving software in autonomous vehicles to game-playing agents that win at poker. Reinforcement learning systems like AlphaGo and AlphaZero excelled at a game as complex as Go simply by playing it against themselves. Far from being at odds with how people learn, reinforcement learning is arguably the method that most closely resembles human cognitive learning. Fortunately, beyond the competitive, cutting-edge gaming domain, a growing number of reinforcement learning frameworks are now publicly available.

DeepMind’s OpenSpiel

DeepMind is one of the most active contributors to open-source deep learning stacks. Back in 2019, Alphabet's DeepMind launched a game-oriented reinforcement learning framework called OpenSpiel. The framework contains a collection of environments and algorithms for research in general reinforcement learning, especially in the context of games. OpenSpiel provides tools for search and planning in games, as well as for analyzing learning dynamics and computing other common evaluation metrics.

The framework supports more than 20 single- and multi-agent game types, including collaborative games, zero-sum games, one-shot games and sequential games; strictly turn-taking as well as simultaneous-move games; auction games and matrix games; and both perfect-information games (where players are fully informed of all events that have occurred when making a decision) and imperfect-information games (where decisions are made simultaneously).

The developers kept simplicity and minimalism as the main ethos while building OpenSpiel, which is why the code is well optimized and high-performing. The framework also has minimal dependencies and keeps its footprint small, cutting down on the possibility of compatibility issues. It is also easy to install, understand and extend.

Game implementations in OpenSpiel (source: research paper)

Games in OpenSpiel are written in C++, while some custom RL environments are available in Python.
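
For a sense of how the framework is used, here is a minimal sketch of stepping through a game with OpenSpiel's Python bindings (pyspiel); the random-move loop is purely illustrative, not part of OpenSpiel itself:

```python
# Play a game of tic-tac-toe with uniformly random moves
# using OpenSpiel's Python API.
import random
import pyspiel

game = pyspiel.load_game("tic_tac_toe")
state = game.new_initial_state()

while not state.is_terminal():
    # Pick uniformly among the legal actions in the current state.
    action = random.choice(state.legal_actions())
    state.apply_action(action)

# returns() reports each player's final reward (zero-sum here).
print(state.returns())
```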

OpenAI's Gym

Sample of a game in a Gym environment (source: OpenAI Gym)

OpenAI created Gym as a toolkit for developing and comparing reinforcement learning algorithms. It is a Python library containing a large number of test environments, all exposed through a shared interface so that users can write general algorithms and test them across Gym's environments. Gym is organized in an agent-environment style: the framework gives the user an agent that can perform certain actions in an environment, and once the agent performs an action, the environment returns an observation and a reward.
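
That loop looks roughly like the following sketch, written against the classic Gym API (newer Gym and Gymnasium releases split `done` into `terminated` and `truncated`):

```python
# The agent-environment loop: act, observe, collect reward.
import gym

env = gym.make("CartPole-v1")
obs = env.reset()

for _ in range(200):
    action = env.action_space.sample()          # random agent, for illustration
    obs, reward, done, info = env.step(action)  # environment returns the reward
    if done:
        obs = env.reset()

env.close()
```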

The environments Gym offers include algorithmic tasks, Atari games, classic control and toy text problems, and 2D and 3D robots. Gym was created to fill a gap: the field lacked standardized environments, and a small tweak to the definition of a problem, such as its rewards or actions, can drastically change the difficulty level. There was also a need for better benchmarks, since the pre-existing open-source RL frameworks were not diverse enough.

TensorFlow’s TF-Agents

TensorFlow's TF-Agents was built as an open-source infrastructure paradigm for building parallel RL algorithms on TensorFlow. The framework provides components that correspond to the main parts of an RL problem, helping users design and implement algorithms easily.

Instead of stepping through single observations, the platform simulates multiple environments in parallel and performs the neural network computation on a batch of observations. This removes the need for manual synchronization and lets the TensorFlow engine parallelize the computation. The environments themselves are run in separate Python processes.
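
As a rough sketch of how those components fit together, the following assembles a DQN agent in the style of the official TF-Agents tutorials (the environment name and layer sizes are arbitrary choices for illustration):

```python
# Build a TensorFlow-wrapped environment, a Q-network and a DQN agent
# from TF-Agents' modular components.
import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.environments import suite_gym, tf_py_environment
from tf_agents.networks import q_network

# Wrap a Gym environment so it can be stepped in batch inside TensorFlow.
env = tf_py_environment.TFPyEnvironment(suite_gym.load("CartPole-v0"))

# A small Q-network mapping observations to action values.
q_net = q_network.QNetwork(
    env.observation_spec(), env.action_spec(), fc_layer_params=(100,))

agent = dqn_agent.DqnAgent(
    env.time_step_spec(),
    env.action_spec(),
    q_network=q_net,
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3))
agent.initialize()
```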

Meta AI’s ReAgent

ReAgent's serving platform (source: Meta AI)

Meta AI released ReAgent in 2019 as a toolkit for building models that guide decision-making in real-life situations. Its name combines the terms 'reasoning' and 'agents,' and the framework is currently used by social media platforms to make decisions every day.

ReAgent consists of three major resources: models that make decisions based on feedback, an offline evaluator module that estimates how models will perform before they go into production, and a serving platform that deploys models at scale, collects feedback and iterates on the models quickly.

ReAgent was built on Horizon, the first open-source end-to-end RL platform. While Horizon could only be employed within existing models, ReAgent was created as a small C++ library that can be embedded into any application.

Uber AI’s Fiber

How a computer cluster on Fiber works (source: Uber Engineering)

As machine learning tasks have multiplied, so has the need for computing power. To help address this, Uber AI released Fiber, a Python-based library for working with computer clusters. Fiber was developed at Uber to power its own large-scale parallel computing projects.

Fiber is comparable to ipyparallel (the IPython equivalent for parallel computing), Spark, and the standard Python multiprocessing library. Research conducted by Uber AI showed that Fiber outperformed these alternatives when tasks were short. To run on different kinds of cluster management systems, Fiber is divided into three layers: an API layer, a backend layer and a cluster layer.
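
Because Fiber mirrors the multiprocessing API, a pool-based program needs little more than a changed import; here is a minimal sketch, assuming Fiber's documented drop-in compatibility with multiprocessing's Pool:

```python
# A multiprocessing-style Pool, except each worker can run
# as a process (or container) on the cluster rather than locally.
from fiber import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    pool = Pool(processes=4)
    print(pool.map(square, range(10)))  # [0, 1, 4, ..., 81]
    pool.close()
    pool.join()
```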

Fiber is also adept at error handling. When a new pool is created, an associated task queue, result queue and pending table are created with it. New tasks are added to the task queue, which is shared between the master process and the worker processes. A worker grabs a task from the queue and runs the task function within that process. Each time a task is removed from the task queue, an entry is added to the pending table.
