Environment

AECEnv

class maenvs4vrp.core.env.AECEnv(instance_generator_object: InstanceBuilder, obs_builder_object: ObservationBuilder, agent_selector_object: BaseSelector, reward_evaluator: RewardFn, seed: int = None, device: str | None = None, batch_size: Size | None = None)

Environment base class.

maenvs4vrp.core.env.AECEnv.__init__(self, instance_generator_object: InstanceBuilder, obs_builder_object: ObservationBuilder, agent_selector_object: BaseSelector, reward_evaluator: RewardFn, seed: int = None, device: str | None = None, batch_size: Size | None = None)

Constructor

Parameters:
  • instance_generator_object (InstanceBuilder) – Instance generator object.

  • obs_builder_object (ObservationBuilder) – Observation builder object.

  • agent_selector_object (BaseSelector) – Agent selector object.

  • reward_evaluator (RewardFn) – Reward evaluator object.

  • seed (int, optional) – Random number generator seed. Defaults to None.

  • device (str, optional) – Device on which computation runs. It can be “cpu” or “gpu”. Defaults to None.

  • batch_size (torch.Size, optional) – Batch size. Defaults to None.
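The constructor receives four collaborating objects rather than building them itself. The following is a minimal sketch of that wiring pattern only. Every `Toy*` class and every method name on them (`sample_instance`, `build`, `select`, `evaluate`, `summary`) is a hypothetical stand-in invented for illustration, not part of maenvs4vrp, and plain Python dicts replace tensors.

```python
import random


class ToyGenerator:
    """Hypothetical instance generator: produces a tiny problem instance."""
    def sample_instance(self, rng):
        return {"demands": [rng.randint(1, 9) for _ in range(3)]}


class ToyObservations:
    """Hypothetical observation builder: derives features from an instance."""
    def build(self, instance):
        return {"n_customers": len(instance["demands"])}


class ToySelector:
    """Hypothetical agent selector: always picks the first agent."""
    def select(self, n_agents):
        return 0


class ToyReward:
    """Hypothetical reward evaluator: negative total demand."""
    def evaluate(self, instance):
        return -float(sum(instance["demands"]))


class ToyEnv:
    """Stand-in environment that only stores and delegates to its components."""
    def __init__(self, instance_generator_object, obs_builder_object,
                 agent_selector_object, reward_evaluator, seed=None):
        self.inst_generator = instance_generator_object
        self.obs_builder = obs_builder_object
        self.agent_selector = agent_selector_object
        self.reward_evaluator = reward_evaluator
        self.rng = random.Random(seed)  # private generator, seeded once

    def summary(self):
        instance = self.inst_generator.sample_instance(self.rng)
        return {
            "observation": self.obs_builder.build(instance),
            "agent": self.agent_selector.select(n_agents=2),
            "reward": self.reward_evaluator.evaluate(instance),
        }


env = ToyEnv(ToyGenerator(), ToyObservations(), ToySelector(), ToyReward(), seed=0)
summary = env.summary()
```

Passing the components in as arguments means any one of them, for example the reward evaluator, can be swapped without changing the environment class.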

maenvs4vrp.core.env.AECEnv._set_seed(self, seed: int | None)

Set the random seed used by the environment.

Parameters:

seed (int, optional) – Seed to be set.

Returns:

None.
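To illustrate why a seed setter is useful, here is a minimal sketch of the general idea using only the standard library. `SeededSampler` and its `draw` method are hypothetical; how `AECEnv._set_seed` stores or uses the seed internally is not shown in this reference and is not assumed here.

```python
import random


class SeededSampler:
    """Hypothetical stand-in, not part of maenvs4vrp."""

    def __init__(self, seed=None):
        self._set_seed(seed)

    def _set_seed(self, seed):
        # A private generator keeps seeding from touching global random state.
        self.rng = random.Random(seed)

    def draw(self, n):
        return [self.rng.randrange(100) for _ in range(n)]


a = SeededSampler(seed=123)
b = SeededSampler(seed=123)
first = a.draw(5)
same_across_objects = first == b.draw(5)

# Re-seeding restarts the sequence from the beginning.
a._set_seed(123)
same_after_reseed = a.draw(5) == first
```

Two objects built with the same seed produce the same draws, and re-seeding an existing object replays them.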

maenvs4vrp.core.env.AECEnv.observe(self, is_reset=False) → TensorDict

Compute the current agent's observations and masks.

Parameters:

is_reset (bool) – Whether the environment is being reset. Defaults to False.

Returns:

TensorDict: Dictionary of the current agent's observations and masks.

maenvs4vrp.core.env.AECEnv.sample_action(self, td: TensorDict) → TensorDict

Sample a random action from the actions available to the current agent.

Parameters:

td (TensorDict) – Environment instance TensorDict.

Returns:

Environment instance TensorDict with the sampled action added.

Return type:

TensorDict
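Random sampling restricted to available actions can be sketched as follows. This is a stdlib-only illustration under stated assumptions: a plain dict stands in for the TensorDict, nested lists of booleans stand in for a batched mask tensor, and the key names `"action_mask"` and `"action"` as well as the helper `sample_masked_actions` are hypothetical, not confirmed by this reference.

```python
import random


def sample_masked_actions(action_mask, rng):
    """Pick one feasible action index per batch row.

    action_mask: list of rows, each a list of bools (True = action allowed).
    """
    actions = []
    for row in action_mask:
        feasible = [i for i, allowed in enumerate(row) if allowed]
        if not feasible:
            raise ValueError("row has no feasible action")
        actions.append(rng.choice(feasible))
    return actions


rng = random.Random(0)
td = {"action_mask": [[True, False, True], [False, False, True]]}

# Mirror the documented contract: take the container, return it with an action.
td["action"] = sample_masked_actions(td["action_mask"], rng)
```

Masked-out indices are never chosen, so the second row, which allows only index 2, always yields action 2.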

maenvs4vrp.core.env.AECEnv.reset(self) → TensorDict

Reset the environment to its starting state and return the environment information.

Parameters:

n/a.

Returns:

Environment information.

Return type:

TensorDict

maenvs4vrp.core.env.AECEnv.step(self, td: TensorDict) → TensorDict

Perform an environment step for the active agent.

Parameters:

td (TensorDict) – Environment instance TensorDict.

Returns:

Updated environment instance TensorDict.

Return type:

TensorDict
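The methods above suggest a reset, sample, step interaction loop. The sketch below shows that loop shape on a toy single-agent problem, assuming only what the signatures imply. `ToyAECEnv` is a hypothetical stand-in, plain dicts replace TensorDict, and the keys `"action_mask"`, `"action"`, and `"done"` are assumptions made for illustration.

```python
import random


class ToyAECEnv:
    """Hypothetical stand-in mirroring reset / sample_action / step."""

    def __init__(self, n_nodes=4, seed=None):
        self.n_nodes = n_nodes
        self.rng = random.Random(seed)

    def _state(self):
        # A node may be chosen only if it has not been visited yet.
        mask = [not visited for visited in self.visited]
        return {"action_mask": mask, "done": not any(mask)}

    def reset(self):
        self.visited = [False] * self.n_nodes
        return self._state()

    def sample_action(self, td):
        feasible = [i for i, allowed in enumerate(td["action_mask"]) if allowed]
        td["action"] = self.rng.choice(feasible)
        return td

    def step(self, td):
        self.visited[td["action"]] = True
        return self._state()


env = ToyAECEnv(n_nodes=4, seed=0)
td = env.reset()
route = []
while not td["done"]:
    td = env.sample_action(td)  # writes the chosen action into td
    route.append(td["action"])
    td = env.step(td)           # applies it and returns the updated state
```

Each call takes the state container and returns the next one, so the loop body stays the same whether the action comes from random sampling or from a learned policy.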