reagent.gym.envs.dynamics package


reagent.gym.envs.dynamics.linear_dynamics module

A simple linear dynamic system

class reagent.gym.envs.dynamics.linear_dynamics.LinDynaEnv

Bases: gym.core.Env

A linear dynamical system characterized by A, B, Q, and R.

Suppose x_t is the current state and u_t is the current action. Then:

x_{t+1} = A x_t + B u_t

Reward_t = x_t' Q x_t + u_t' R u_t
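The state update and reward defined above can be sketched in a few lines. This is only an illustration, not the code of LinDynaEnv: the helper names (mat_vec, quad_form, linear_step) and the example matrices are assumptions made up for this sketch, and plain Python lists are used to keep it dependency-free.

```python
def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by vector v (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def quad_form(v, M):
    """Compute the quadratic form v' M v."""
    return sum(a * b for a, b in zip(v, mat_vec(M, v)))


def linear_step(A, B, Q, R, x, u):
    """Return (x_{t+1}, Reward_t) for state x and action u."""
    next_x = [ax + bu for ax, bu in zip(mat_vec(A, x), mat_vec(B, u))]
    reward = quad_form(x, Q) + quad_form(u, R)
    return next_x, reward


# Hypothetical 2-state, 1-action system chosen only for this example.
A = [[1.0, 0.5], [0.0, 1.0]]
B = [[0.0], [1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R = [[2.0]]

next_x, reward = linear_step(A, B, Q, R, x=[1.0, 2.0], u=[3.0])
print(next_x, reward)  # prints: [2.0, 5.0] 23.0
```

The reward is computed from the state and action at time t, before the state is advanced, which matches the indexing in the formulas.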

static is_pos_def(x)
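This page gives no description for is_pos_def. Its name suggests a positive-definiteness check on a matrix x, but that is an inference, not something stated here. For orientation only, one common way to test a symmetric matrix for positive definiteness is to attempt a Cholesky factorization. The function below is a hypothetical sketch of that technique under the assumption that the input is symmetric; it is not ReAgent's implementation.

```python
import math


def cholesky_succeeds(M, tol=1e-12):
    """Return True if the symmetric matrix M (a list of rows) is positive definite.

    Attempts a Cholesky factorization M = L L'. The factorization exists with
    strictly positive pivots exactly when M is positive definite.
    """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:  # non-positive pivot: not positive definite
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True
```

For example, cholesky_succeeds([[2.0, 1.0], [1.0, 2.0]]) is True, while [[1.0, 2.0], [2.0, 1.0]] fails because it has a negative eigenvalue.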

reset()

Resets the environment to an initial state and returns an initial observation.

Note that this function should not reset the environment’s random number generator(s); random variables in the environment’s state should be sampled independently between multiple calls to reset(). In other words, each call of reset() should yield an environment suitable for a new episode, independent of previous episodes.


Returns: the initial observation.

Return type: observation (object)


step(action)

Run one timestep of the environment’s dynamics. When the end of an episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).


Parameters: action (object) – an action provided by the agent


Returns:

observation (object): agent’s observation of the current environment

reward (float): amount of reward returned after the previous action

done (bool): whether the episode has ended, in which case further step() calls will return undefined results

info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
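The reset() and step() contract described above is typically driven by a loop like the following. To keep the sketch self-contained it uses a hypothetical stand-in class, ToyScalarEnv, with made-up scalar dynamics and an assumed fixed episode length; it is not LinDynaEnv, and run_episode is likewise an illustrative helper.

```python
class ToyScalarEnv:
    """Hypothetical stand-in exposing reset() and step(); not LinDynaEnv."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps  # assumed episode length
        self.state = 0.0
        self.step_count = 0

    def reset(self):
        # Start a fresh episode and return the initial observation.
        self.state = 1.0
        self.step_count = 0
        return self.state

    def step(self, action):
        # Reward uses the current state and action, then the state advances.
        reward = self.state ** 2 + action ** 2
        self.state = 0.5 * self.state + action
        self.step_count += 1
        done = self.step_count >= self.max_steps
        return self.state, reward, done, {}


def run_episode(env, policy):
    """Run one episode and return (final observation, summed reward)."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
    return obs, total_reward


# A policy that always returns the zero action.
final_obs, total = run_episode(ToyScalarEnv(), lambda obs: 0.0)
print(final_obs, total)  # prints: 0.125 1.3125
```

Once done is True the loop stops; calling reset() again is what makes the environment usable for the next episode.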

