Environment
AECEnv
- class maenvs4vrp.core.env.AECEnv(instance_generator_object: InstanceBuilder, obs_builder_object: ObservationBuilder, agent_selector_object: BaseSelector, reward_evaluator: RewardFn, seed: int = None, device: str | None = None, batch_size: Size | None = None)
Environment base class.
- maenvs4vrp.core.env.AECEnv.__init__(self, instance_generator_object: InstanceBuilder, obs_builder_object: ObservationBuilder, agent_selector_object: BaseSelector, reward_evaluator: RewardFn, seed: int = None, device: str | None = None, batch_size: Size | None = None)
Constructor
- Parameters:
instance_generator_object (InstanceBuilder) – Generator instance.
obs_builder_object (ObservationBuilder) – Observations instance.
agent_selector_object (BaseSelector) – Agent selector instance.
reward_evaluator (RewardFn) – Reward evaluator instance.
seed (int, optional) – Random number generator seed. Defaults to None.
device (str, optional) – Device used for computation. It can be “cpu” or “gpu”. Defaults to None.
batch_size (torch.Size, optional) – Batch size. Defaults to None.
- maenvs4vrp.core.env.AECEnv._set_seed(self, seed: int | None)
Set the random seed used by the environment.
- Parameters:
seed (int, optional) – Seed to be set.
- Returns:
None.
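Seeding matters for reproducibility: two environments created with the same seed should draw the same random numbers. The sketch below illustrates that idea only. `ToyEnv` and its `draw` method are hypothetical, and Python's standard `random` module stands in for whatever generator the library actually seeds.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for an environment; not part of maenvs4vrp."""

    def __init__(self, seed=None):
        self._set_seed(seed)

    def _set_seed(self, seed):
        # Keep a private generator so seeding does not touch global state.
        # With seed=None, random.Random seeds itself from OS entropy.
        self.seed = seed
        self.rng = random.Random(seed)

    def draw(self, n=5):
        # Draw n pseudo-random integers in [0, 99].
        return [self.rng.randrange(100) for _ in range(n)]


env_a = ToyEnv(seed=42)
env_b = ToyEnv(seed=42)
same_seed_matches = env_a.draw() == env_b.draw()
print(same_seed_matches)  # True: identical seeds give identical draws
```

Keeping the generator on the instance, rather than seeding a global one, means that several environments can coexist without disturbing each other's random streams.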
- maenvs4vrp.core.env.AECEnv.observe(self, is_reset=False) → TensorDict
Compute the observations and action masks for the current agent.
- Parameters:
is_reset (bool) – Whether the environment is being reset. Defaults to False.
- Returns:
TensorDict: Dictionary of the current agent's observations and masks.
- maenvs4vrp.core.env.AECEnv.sample_action(self, td: TensorDict) → TensorDict
Sample a random action from the actions available to the current agent.
- Parameters:
td (TensorDict) – TensorDict holding the environment instance.
- Returns:
Environment TensorDict updated with the sampled action.
- Return type:
TensorDict
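Sampling from the available actions usually means drawing uniformly among the entries that an action mask marks as feasible. The following is a minimal sketch of that idea, not the library's implementation: a plain dict stands in for the TensorDict, and the keys `"action_mask"` and `"action"` are assumed names chosen for illustration.

```python
import random


def sample_action(td, rng):
    """Pick, per batch row, a uniformly random index whose mask entry is True.

    `td` is a plain dict standing in for a TensorDict; the keys
    "action_mask" and "action" are assumed names, chosen for illustration.
    """
    actions = []
    for row in td["action_mask"]:
        feasible = [i for i, allowed in enumerate(row) if allowed]
        if not feasible:
            raise ValueError("no feasible action for this batch row")
        actions.append(rng.choice(feasible))
    # Return a new dict so the caller's input is left untouched.
    return {**td, "action": actions}


rng = random.Random(0)
td = {"action_mask": [[False, True, True], [True, False, False]]}
td = sample_action(td, rng)
print(td["action"])
```

The second row has a single feasible entry, so its sampled action is always index 0; the first row yields either 1 or 2 depending on the generator state.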
- maenvs4vrp.core.env.AECEnv.reset(self) → TensorDict
Reset the environment to a starting state and return the environment information.
- Parameters:
n/a.
- Returns:
Environment information.
- Return type:
TensorDict
- maenvs4vrp.core.env.AECEnv.step(self, td: TensorDict) → TensorDict
Perform an environment step for the active agent.
- Parameters:
td (TensorDict) – TensorDict holding the environment instance.
- Returns:
Updated environment TensorDict.
- Return type:
TensorDict
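Taken together, reset, sample_action and step suggest a rollout loop in which one agent acts per step. The sketch below mimics that cycle with a toy environment so that it runs without the library. `ToyAECEnv`, its constructor arguments, and the keys `"cur_agent"`, `"action_mask"`, `"action"` and `"done"` are all hypothetical, and plain dicts stand in for TensorDict.

```python
import random


class ToyAECEnv:
    """Hypothetical single-instance environment; not the maenvs4vrp class.

    Agents take turns visiting unvisited nodes until none remain.
    Plain dicts stand in for TensorDict, and all keys are assumed names.
    """

    def __init__(self, num_agents=2, num_nodes=5, seed=None):
        self.num_agents = num_agents
        self.num_nodes = num_nodes
        self.rng = random.Random(seed)

    def _state(self):
        # The mask marks nodes the current agent may still visit.
        mask = [node not in self.visited for node in range(self.num_nodes)]
        return {"cur_agent": self.cur_agent, "action_mask": mask,
                "done": not any(mask)}

    def reset(self):
        self.visited = set()
        self.cur_agent = 0
        return self._state()

    def sample_action(self, td):
        feasible = [i for i, ok in enumerate(td["action_mask"]) if ok]
        return {**td, "action": self.rng.choice(feasible)}

    def step(self, td):
        # Apply the chosen action, then hand the turn to the next agent.
        self.visited.add(td["action"])
        self.cur_agent = (self.cur_agent + 1) % self.num_agents
        return self._state()


env = ToyAECEnv(num_agents=2, num_nodes=5, seed=0)
td = env.reset()
steps = 0
while not td["done"]:
    td = env.sample_action(td)
    td = env.step(td)
    steps += 1
print(steps)  # 5: each step visits exactly one new node
```

In a learned policy setting, the call to `sample_action` would be replaced by a policy that reads the observations and masks and writes the chosen action before `step` is called.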