Environment

CVRPTW environment operations.

Environment settings are defined in the file env.py.

Environment

class maenvs4vrp.environments.cvrptw.env.Environment(instance_generator_object: InstanceBuilder, obs_builder_object: ObservationBuilder, agent_selector_object: BaseSelector, reward_evaluator: RewardFn, seed=None, device: str | None = None, batch_size: Size | None = None)

CVRPTW environment generator class.

__init__(instance_generator_object: InstanceBuilder, obs_builder_object: ObservationBuilder, agent_selector_object: BaseSelector, reward_evaluator: RewardFn, seed=None, device: str | None = None, batch_size: Size | None = None)

Constructor.

Parameters:
  • instance_generator_object (InstanceBuilder) – Generator instance.

  • obs_builder_object (ObservationBuilder) – Observations instance.

  • agent_selector_object (BaseSelector) – Agent selector instance.

  • reward_evaluator (RewardFn) – Reward evaluator instance.

  • seed (int, optional) – Random number generator seed. Defaults to None.

  • device (str, optional) – Device used for processing. It can be “cpu” or “gpu”. Defaults to None.

  • batch_size (torch.Size, optional) – Batch size. Defaults to None.

check_solution_validity()

Check whether the solution is valid according to the CVRPTW constraints.

Parameters:

None.

Returns:

None. Raises AssertionError if the solution is invalid.
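To make the assertion-based contract concrete, here is a minimal sketch of a validity check of this kind. It is not the library's implementation: the Node dataclass, the check_routes function, and the exact constraint set (each customer served exactly once, route load within capacity, service starting inside each time window, waiting allowed on early arrival) are assumptions based on the textbook CVRPTW definition.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Node:  # hypothetical container, not a maenvs4vrp type
    x: float
    y: float
    demand: float = 0.0
    tw_open: float = 0.0             # earliest time service may start
    tw_close: float = float("inf")   # latest time service may start
    service: float = 0.0             # service duration


def check_routes(nodes, routes, capacity):
    """Raise AssertionError if the routes break a CVRPTW constraint.

    nodes[0] is the depot; each route is a list of customer indices.
    """
    visited = sorted(i for route in routes for i in route)
    assert visited == list(range(1, len(nodes))), "customers not served exactly once"

    for route in routes:
        load = sum(nodes[i].demand for i in route)
        assert load <= capacity, "capacity exceeded"

        t, prev = nodes[0].tw_open, 0
        for i in route + [0]:  # the route ends back at the depot
            t += hypot(nodes[i].x - nodes[prev].x, nodes[i].y - nodes[prev].y)
            t = max(t, nodes[i].tw_open)  # wait if the agent arrives early
            assert t <= nodes[i].tw_close, f"time window violated at node {i}"
            t += nodes[i].service
            prev = i


nodes = [
    Node(0, 0, tw_close=100),
    Node(3, 4, demand=2, tw_close=10, service=1),
    Node(6, 8, demand=3, tw_close=20, service=1),
]
check_routes(nodes, [[1, 2]], capacity=5)  # passes silently
```

As with the documented method, a valid solution produces no return value, and the first violated constraint surfaces as an AssertionError.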

observe(is_reset=False) → TensorDict

Retrieve agent environment observations.

Parameters:

is_reset (bool) – Whether the environment is being reset. Defaults to False.

Returns:

td_observations (TensorDict) – Dictionary of the current agent's observations and masks.

reset(num_agents: int | None = None, num_nodes: int | None = None, capacity: int | None = None, service_times: float | None = None, instance_name: str | None = None, sample_type: str = 'random', instance_dict: Dict = None, force_visit: bool = False, batch_size: Size | None = None, n_augment: int | None = None, seed: int | None = None, device: str | None = 'cpu') → TensorDict

Reset the environment.

Parameters:
  • num_agents (int, optional) – Total number of agents. Defaults to None.

  • num_nodes (int, optional) – Total number of nodes. Defaults to None.

  • capacity (int, optional) – Total capacity for each agent. Defaults to None.

  • service_times (float, optional) – Service time in the nodes. Defaults to None.

  • instance_name (str, optional) – Instance name. Defaults to None.

  • sample_type (str) – Sample type. It can be “random”, “augment” or “saved”. Defaults to “random”.

  • instance_dict (Dict, optional) – Instance dictionary. Defaults to None.

  • force_visit (bool) – Whether to force the agent to visit all feasible nodes before returning to the depot. Defaults to False.

  • batch_size (torch.Size, optional) – Batch size. Defaults to None.

  • n_augment (int, optional) – Data augmentation. Defaults to None.

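  • seed (int, optional) – Random number generator seed. Defaults to None.

  • device (str, optional) – Device used for processing. It can be “cpu” or “gpu”. Defaults to “cpu”.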

Returns:

Environment information dictionary.

Return type:

TensorDict
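The effect of force_visit is easiest to see on an action mask. The sketch below is an illustration only: apply_force_visit is a hypothetical helper, and it assumes a boolean mask over nodes with the depot at index 0. The library's internal masking may be organised differently.

```python
def apply_force_visit(mask, force_visit, depot=0):
    """Return a copy of a boolean action mask with the depot optionally blocked.

    mask[i] is True when node i is a feasible next visit (assumed layout).
    """
    out = list(mask)
    customers_feasible = any(ok for i, ok in enumerate(out) if i != depot)
    if force_visit and customers_feasible:
        out[depot] = False  # the agent may not return while customers remain feasible
    return out


mask = [True, False, True]  # depot and node 2 are feasible
print(apply_force_visit(mask, force_visit=True))
print(apply_force_visit(mask, force_visit=False))
```

Under these assumptions the depot only becomes selectable again once no customer node is feasible, which matches the description of the parameter above.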

sample_action(td: TensorDict) → TensorDict

Sample a random action from the actions available to the current agent.

Parameters:

td (TensorDict) – Environment instance tensor.

Returns:

Environment instance tensor with updated action.

Return type:

TensorDict
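Random sampling restricted to available actions can be sketched in a few lines. This is a plain-Python illustration, not the library's code: sample_feasible_action is a hypothetical name, the mask is assumed to be a list of booleans, and uniform sampling over feasible actions is an assumption, since the reference does not state the distribution.

```python
import random


def sample_feasible_action(mask, rng=random):
    """Pick one index uniformly at random among the entries where mask is True."""
    feasible = [i for i, ok in enumerate(mask) if ok]
    if not feasible:
        raise ValueError("no feasible action available")
    return rng.choice(feasible)


rng = random.Random(0)  # seeded generator for reproducible sampling
action = sample_feasible_action([False, True, False, True], rng)
```

Passing a seeded random.Random instance keeps the sketch reproducible, in the same spirit as the seed parameters documented above.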

step(td: TensorDict) → TensorDict

Perform an environment step for the active agent.

Parameters:

td (TensorDict) – Environment tensor instance.

Returns:

Updated environment tensor instance.

Return type:

TensorDict
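Because reset, sample_action and step each return the updated state, they compose into a simple rollout loop. The sketch below shows that pattern with a toy stand-in so that it runs without the library. ToyEnv is hypothetical and is not the maenvs4vrp Environment: its state is a plain dict rather than a TensorDict, there is a single agent, and the keys "mask", "action", "route" and "done" are assumptions made for illustration.

```python
import random


class ToyEnv:
    """Hypothetical stand-in mimicking the reset / sample_action / step cycle."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def reset(self, num_nodes=5):
        # Node 0 is the depot; customers 1..num_nodes-1 start unvisited.
        return {"mask": [False] + [True] * (num_nodes - 1), "route": [0], "done": False}

    def sample_action(self, td):
        feasible = [i for i, ok in enumerate(td["mask"]) if ok]
        td["action"] = self.rng.choice(feasible)
        return td

    def step(self, td):
        node = td["action"]
        td["route"].append(node)
        td["mask"][node] = False
        if not any(td["mask"]):
            td["route"].append(0)  # all customers served: return to the depot
            td["done"] = True
        return td


env = ToyEnv(seed=0)
td = env.reset(num_nodes=5)
while not td["done"]:
    td = env.sample_action(td)  # a learned policy would choose the action here
    td = env.step(td)
```

With the real environment the same loop shape would apply to batched TensorDict state, so the termination test would need to account for every instance in the batch.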