Comparison between the following works:

## 1 Problem Encoding

### Network Planning

Node-link transformation

- Transforms links into nodes.
- We care about edges instead of nodes, but GNNs are most mature for handling node features and performing node tasks.
- Directly using GNNs cannot handle parallel links.

Generates graph representations directly from network topologies after node-link transformation:

- Topologies (edges and nodes) as is.
- Node feature is the current IP capacity of the node.

### Mixed Integer Program

- Encodes MIP as bipartite graph:

## 2 ILP Structure

### Network Planning

Problem formulation

- Eq. 2: Flow conservation constraint
- Eq. 3: Link capacity constraint
- Eq. 4: Spectrum consumption constraint
- Eq. 5: Existing topology constraint

Corresponding network

## 3 Workflow

### Network Planning

- The input of NeuroPlan consists of five components, i.e., traffic demand, network topology, failure scenarios, reliability policy and cost model. The RL agent only needs to encode the network topologies. The other four components are handled by the plan evaluator.
- The RL agent interacts with the plan evaluator to learn to generate network plans that minimize the network cost while satisfying the traffic demand under the reliability policy. The plan evaluator generates rewards to the RL agent. It receives the network plan from the RL agent, checks whether the traffic demand is satisfied under different failure scenarios, and uses the cost model to compute a cost of the plan.
- The reliability policy specifies the demand of flows with which Classes of Service (CoS) has to be satisfied under which subset of failure scenarios. A reward is calculated based on a combination of whether the demand is satisfied and the cost of the plan.
- After the learning process is completed, the RL agent outputs an initial plan for the first stage. At the second stage, the ILP solver uses ILP to find the final plan in the pruned search space near the initial plan defined by the relax factor $\alpha$.