The Stochastic Steiner Tree Problem

This page contains further informations and benchmark data for the Stochastic Steiner Tree Problem and our work on this topic as described in

Solving Two-Stage Stochastic Steiner Tree Problems by Two-Stage Branch-and-Cut
Immanuel Bomze, Markus Chimani, Michael Jünger, Ivana Ljubic, Petra Mutzel, and Bernd Zey
in: 21st International Symposium on Algorithms and Computation (ISAAC 2010)
Lecture Notes in Computer Science 6506, Springer-Verlag, 2010, pp. 427-439

or the related technical report

TR 2010–03: Solving Two-Stage Stochastic Steiner Tree Problems by Two-Stage Branch-and-Cut
Immanuel Bomze, Markus Chimani, Michael Jünger, Ivana Ljubic, Petra Mutzel, Bernd Zey


Problem Description

For an undirected graph G=(V,E) with edge costs c_e and a set of terminal nodes R ⊆ V, the Steiner Tree Problem (STP) aks for a minimum-cost subgraph connecting all terminals.

Here, we consider the Two-Stage Stochastic Steiner Tree Problem (SSTP) with fixed recourse and finitely many scenarios in which the terminal set and the edge costs are subject to uncertainty.

This means there are two points in times (also called stages): In the first stage (now), we only have probabilistic information in terms of possible outcomes (scenarios) in the future (second stage).

A scenario thereby specifies a terminal set, edge costs, and a probability for it to be realized. Edge costs are higher in the second stage and based on these informations, we may buy some edges at a lower price now and in the second stage, when one of the given scenarios is realized, we have to purchase additional edges in order to interconnect the (now known) terminal nodes.

The goal is to make a decision about edges to be purchased in the first stage, while minimizing the expected cost of the full solution (i.e., after the second stage). More formally, let $E^0$ be the selected first-stage edges and $E^k$ the edges of the $k$th scenario. Then the objective is $\min \sum_{e \in E^0} c_e^0 + \sum_k p^k \sum_{e \in E^k} c_e^k$.


Instances

Deterministic Steiner tree input graphs G=(V,E) with edge costs c_e are transformed into the SSTP instances as follows:

  • For each instance we generate several stochastic instances with K scenarios: K ∈ {5,10,20,50,100,200,500,1000}. Update: The new instances are closely related: the smaller instances depend on the bigger ones: For each deterministic STP instance we generated 1000 scenarios and an SSTP instance with $k\le 1000$ scenarios simply contains the first $k$ scenarios with probabilities scaled appropriately. Therefore the scenario set of a, e.g., 10-scenario instance is a superset of the 5-scenario instance.
  • To obtain scenario probabilities p_k, we distribute 10000 points (corresponding to the probability of 1/10000, each) among these scenarios randomly (ensuring that each scenario has at least probability 1/10000).
  • For each scenario k, we construct the set of terminal nodes R_k by independently picking each terminal or Steiner node with probability 0.3 or 0.05, respectively.
  • Each second stage edge cost is randomly (independent, uniform) drawn from the interval [1.1c_e, 1.3c_e].

(Deterministic) (Prize-Collecting) Steiner Tree Instances

The deterministic instances can be found here:

Stochastic Steiner Tree Instances - SSTPLib

The benchmark instances used in the computational comparison can be found here:

  • K: 77 instances total, 6.2 MB (.zip)
  • P: 35 instances total, 5.3 MB (.zip)
  • lin01-lin06: 42 instances total, 2 MB (.zip)

New instances:

  • K100: 154 instances total (based on 11 deterministic instances), 10.7 MB (.zip)
  • P100: 70 instances total (based on 5 deterministic instances), 9.1 MB (.zip)
  • Lin01-10: 140 instances total (based on 10 deterministic instances), 9.3 MB (.zip)
  • wrp: 196 instances total (based on 14 deterministic instances), 7.4 MB (.zip)
  • These instances are “better” for computational studies because the smaller instances depend on the bigger ones: For each deterministic STP instance we generated 1000 scenarios and an SSTP instance with $k\le 1000$ scenarios simply contains the first $k$ scenarios with probabilities scaled appropriately. Therefore the scenario set of a 10-scenario instance is a superset of the 5-scenario instance.

The .zip files contain several text files of type .sstp; each file is a single instance. The number in the filename before the suffix “.sstp” describes the number of scenarios. For example K100.8.con.red-100s.sstp is a stochastic instance with 100 scenarios derived from the deterministic steiner tree instance K100.8. If the filename contains the string “red” then the steiner tree instance is already preprocessed.

Each file has the same structure and consists of 4 important parts

  • general: number of scenarios and id of the root node
  • probabilities: the probabilities for each scenario
  • node: each line represents a node with the following informations:
    • identifier (starts at 1)
    • x- and y-coordinate (not important)
    • a (0,1)-vector indicating whether the vertex is a terminal in each scenario
  • link: each line represents an edge with the following informations:
    • identifier (starts at 1)
    • id of the source and target node
    • a vector with the weights for this edge in the first stage and each scenario

All values are separated by a tabulator (“\t”).

Lines starting with a “#” are comments. In general, each file starts with some informations about the deterministic instance, the probability distribution, etc.

Example:

# file generated autmatically from file lin01.stp 
# 53 nodes, 80 edges 
# nr Scenarios=5 
# <some more informations> 
general 
# nr scenarios | root node 
5	1 
probabilities 
# probability for each scenario (p_k) 
0.19720	0.17320	0.12910	0.11940	0.38110 
node  
#id | x | y | terminals Scenario 0 | terminals Scenario 1 | ... 
1	0	0	1	1	1	1	1 
2	0	0	0	0	0	0	0 
<etc.> 
link 
# id	| source | target | weight 1st stage | weight Scenario 1 | weight Scenario 2 | ... 
1	1	32	46.000	63.000	68.000	65.000	63.000	64.000 
2	1	25	26.000	37.000	38.000	38.000	37.000	38.000 
<etc.> 

Solutions

Solutions for the K- and P-instances can be found in this Excel-table.

Results for the lin-instances are summarized in the following table:

EF2-Stage B&C (BC*)
Instance K R_avg OPT* t[s] b&b gap #iter  t[s] b&b gap #iter
lin01 5 4.6 797.0 0.2 1 34 2.2 1 61
lin01 10 4.2 633.2 0.7 3 59 2.5 3 50
lin01 20 4.6 753.9 5.7 3 63 6.9 3 52
lin01 50 4.7 768.9 33.4 3 70 10.4 3 36
lin02 5 4.6 476.2 0.1 1 24 1.1 1 45
lin02 10 5.3 739.1 1.0 1 33 3.0 1 47
lin02 20 5.3 752.2 4.9 1 69 4.3 1 37
lin02 50 5.1 732.6 31.2 1 70 10.7 1 35
lin03 5 4.4 653.0 0.5 1 80 1.9 1 55
lin03 10 5.2 834.7 3.8 7 90 8.7 7 91
lin03 20 5.8 854.9 10.8 1 92 7.3 1 41
lin03 50 5.5 895.7 103.1 3 106 21.3 3 43
lin04 5 10.4 1922.1 140.4 3 315 959.2 47 567
lin04 10 9.8 1959.1 415.8 7 244 989.2 7 339
lin04 20 9.3 1954.9 5498.7 11 833 3016.7 13 575
lin04 50 9.8 2097.7 (2h) 1 19.5 185 5330.2 11 269
lin05 5 10.2 2215.5 282.0 53 722 2681.2 35 1558
lin05 10 11.4 2210.2 1866.7 5 1130 4096.0 35 1502
lin05 20 11.1 2412.2 (2h) 11 5.6 1060 (2h) 17 4.7 890
lin05 50 11.6 2297.0 (2h) 1 21.3 210 3627.4 1 159
lin06 5 11.0 1975.8 212.8 53 797 760.9 19 834
lin06 10 10.6 1918.7 501.7 5 260 808.4 3 306
lin06 20 14.0 2457.6 (2h) 11 1099 3222.9 11 459
lin06 50 12.6 2186.8 (2h) 1 22.5 221 2795.5 11 215

description:

  • K: number of scenarios
  • R_avg: average number of terminals in each scenario
  • OPT*: the best found solution (optimum, if gap is 0)
  • t[s]: running time in seconds, (2h) means that the algorithm didn't find the optimum solution within the timelimit of two hours
  • b&b: number of branch&bound nodes
  • gap: gap between lower and upper bound. “—” means that there is no gap
  • #iter: number of iterations

Example

An optimum solution (1st and 3rd scenario) for an instance with 50 vertices and 5 scenarios is presented in the following pictures. The green vertices are the terminals in the scenario, the red vertex is the root node, the solid lines are the edges bought in the first stage, and the dashed lines the edges of the scenario.

As one can easily see, the first stage is disconnected and furthermore, in scenario 2 there are Steiner leaves.

Scenario 1 of the optimum solution with n=50 Scenario 3 of the optimum solution with n=50

The images are generated by storing the solution as .gml file, loading it with yEd, and exporting it as a .png.

 
Last modified: 2015-09-11 10:08 (external edit)
DokuWikiRSS-Feed