pgRouting Manual (2.3)

Traveling Sales Person

«  pgr_ksp   ::   Contents   ::   pgr_TSP  »


Traveling Sales Person

Note

These signatures are being deprecated

-- (1)
pgr_costResult[] pgr_tsp(sql text, start_id integer)
pgr_costResult[] pgr_tsp(sql text, start_id integer, end_id integer)

-- (2)
record[] pgr_tsp(matrix float[][], start integer)
record[] pgr_tsp(matrix float[][], start integer, end integer)

Use pgr_eucledianTSP insteadi of (1). Use pgr_TSP instead of (2).

General Information

Origin

The traveling sales person problem was studied in the 18th century by mathematicians
Sir William Rowam Hamilton and Thomas Penyngton Kirkman.

A discussion about the work of Hamilton & Kirkman can be found in the book Graph Theory (Biggs et al. 1976).

  • ISBN-13: 978-0198539162
  • ISBN-10: 0198539169

It is believed that the general form of the TSP have been first studied by Kalr Menger in Vienna and Harvard. The problem was later promoted by Hassler, Whitney & Merrill at Princeton. A detailed description about the connection between Menger & Whitney, and the development of the TSP can be found in On the history of combinatorial optimization (till 1960)

Problem Definition

Given a collection of cities and travel cost between each pair, find the cheapest way for visiting all of the cities and returning to the starting point.

Characteristics

  • The travel costs are symmetric:
    • traveling costs from city A to city B are just as much as traveling from B to A.
  • This problem is an NP-hard optimization problem.
  • To calculate the number of different tours through \(n\) cities:
    • Given a starting city,
    • There are \(n-1\) choices for the second city,
    • And \(n-2\) choices for the third city, etc.
    • Multiplying these together we get \((n-1)! = (n-1) (n-2) . . 1\).
    • Now since our travel costs do not depend on the direction we take around the tour:
      • this number by 2
      • \((n-1)!/2\).

TSP & Simulated Annealing

The simulated annealing algorithm was originally inspired from the process of annealing in metal work.

Annealing involves heating and cooling a material to alter its physical properties due to the changes in its internal structure. As the metal cools its new structure becomes fixed, consequently causing the metal to retain its newly obtained properties.

Pseudocode

Given an initial solution, the simulated annealing process, will start with a high temperature and gradually cool down until the desired temperature is reached.

For each temperature, a neighbouring new solution snew is calculated. The higher the temperature the higher the probability of accepting the new solution as a possible bester solution.

Once the desired temperature is reached, the best solution found is returned

Solution ← initial_solution;

temperature ← initial_temperature;
while (temperature > final_temperature) {

    do tries_per_temperature times {
        snew ← neighbour(solution);
        If P(E(solution), E(snew), T) ≥ random(0, 1)
            solution ← snew;
    }

    temperature ← temperature * cooling factor;
}

Output: the best solution

pgRouting Implementation

pgRouting’s implementation adds some extra parameters to allow some exit controls within the simulated annealing process.

To cool down faster to the next temperature:

  • max_changes_per_temperature: limits the number of changes in the solution per temperature
  • max_consecutive_non_changes: limits the number of consecutive non changes per temperature

This is done by doing some book keeping on the times solution ← snew; is executed.

  • max_changes_per_temperature: Increases by one when solution changes
  • max_consecutive_non_changes: Reset to 0 when solution changes, and increased each try

Additionally to stop the algorithm at a higher temperature than the desired one:

  • max_processing_time: limits the time the simulated annealing is performed.
  • book keeping is done to see if there was a change in solution on the last temperature

Note that, if no change was found in the first max_consecutive_non_changes tries, then the simulated annealing will stop.

Solution ← initial_solution;

temperature ← initial_temperature;
while (temperature > final_temperature) {

    do tries_per_temperature times {
        snew ← neighbour(solution);
        If P(E(solution), E(snew), T) ≥ random(0, 1)
            solution ← snew;

        when max_changes_per_temperature is reached
            or max_consecutive_non_changes is reached
            BREAK;
    }

    temperature ← temperature * cooling factor;
    when no changes were done in the current temperature
        or max_processing_time has being reached
        BREAK;
}

Output: the best solution

Choosing parameters

There is no exact rule on how the parameters have to be chose, it will depend on the special characteristics of the problem.

  • Your computational time is crucial, then put your time limit to max_processing_time.
  • Make the tries_per_temperture depending on the number of cities, for example:
    • Useful to estimate the time it takes to do one cycle: use 1
      • this will help to set a reasonable max_processing_time
    • \(n * (n-1)\)
    • \(500 * n\)
  • For a faster decreasing the temperature set cooling_factor to a smaller number, and set to a higher number for a slower decrease.
  • When for the same given data the same results are needed, set randomize to false.
    • When estimating how long it takes to do one cycle: use false

A recommendation is to play with the values and see what fits to the particular data.

Description Of the Control parameters

The control parameters are optional, and have a default value.

Parameter Type Default Description
start_vid BIGINT 0 The greedy part of the implementation will use this identifier.
end_vid BIGINT 0 Last visiting vertex before returning to start_vid.
max_processing_time FLOAT +infinity Stop the annealing processing when the value is reached.
tries_per_temperature INTEGER 500 Maximum number of times a neighbor(s) is searched in each temperature.
max_changes_per_temperature INTEGER 60 Maximum number of times the solution is changed in each temperature.
max_consecutive_non_changes INTEGER 100 Maximum number of consecutive times the solution is not changed in each temperature.
initial_temperature FLOAT 100 Starting temperature.
final_temperature FLOAT 0.1 Ending temperature.
cooling_factor FLOAT 0.9 Value between between 0 and 1 (not including) used to calculate the next temperature.
randomize BOOLEAN true

Choose the random seed

  • true: Use current time as seed
  • false: Use 1 as seed. Using this value will get the same results with the same data in each execution.

Deprecated functionality

The old functionality is deprecated:

  • User can not control the execution.
  • Not all valuable information is returned.
  • Some returned column don not have meaningful names.
Example:

Using the old functionality, for example

  • id can not be of type BIGINT.
  • id1 and id2 are meningless column names.
  • Needs an index as parameter for the starting node.
SELECT * FROM pgr_TSP(
    $$
    SELECT id::INTEGER, st_X(the_geom) AS x, st_Y(the_geom)AS y  FROM edge_table_vertices_pgr
    $$
    , 1);
NOTICE:  Deprecated Signature pgr_tsp(sql, integer, integer)
 seq | id1 | id2 |       cost        
-----+-----+-----+-------------------
   0 |   1 |   1 |                 1
   1 |   2 |   2 |                 1
   2 |   5 |   5 |                 1
   3 |   8 |   8 |                 1
   4 |   7 |   7 |  1.58113883008419
   5 |  14 |  14 |  1.58113883008419
   6 |  13 |  13 |               0.5
   7 |  15 |  15 |               0.5
   8 |  10 |  10 |                 1
   9 |  11 |  11 |  1.11803398874989
  10 |  17 |  17 |  1.11803398874989
  11 |  12 |  12 | 0.860232526704263
  12 |  16 |  16 |  0.58309518948453
  13 |   6 |   6 |                 1
  14 |   9 |   9 |                 1
  15 |   4 |   4 |                 1
  16 |   3 |   3 |   1.4142135623731
(17 rows)

With the new functionality:

  • id can be of type BIGINT .
  • There is an aggregate cost column.
  • Instead of an index it uses the node identifier for the starting node.
SELECT * FROM pgr_eucledianTSP(
    $$
    SELECT id, st_X(the_geom) AS x, st_Y(the_geom)AS y  FROM edge_table_vertices_pgr
    $$,
    1,
    randomize := false
);
 seq | node |       cost        |     agg_cost     
-----+------+-------------------+------------------
   1 |    1 |   1.4142135623731 |                0
   2 |    3 |                 1 |  1.4142135623731
   3 |    4 |                 1 | 2.41421356237309
   4 |    9 |                 1 | 3.41421356237309
   5 |    6 |  0.58309518948453 | 4.41421356237309
   6 |   16 | 0.860232526704263 | 4.99730875185763
   7 |   12 |  1.11803398874989 | 5.85754127856189
   8 |   17 |  1.11803398874989 | 6.97557526731178
   9 |   11 |                 1 | 8.09360925606168
  10 |   10 |               0.5 | 9.09360925606168
  11 |   15 |               0.5 | 9.59360925606168
  12 |   13 |  1.58113883008419 | 10.0936092560617
  13 |   14 |  1.58113883008419 | 11.6747480861459
  14 |    7 |                 1 | 13.2558869162301
  15 |    8 |                 1 | 14.2558869162301
  16 |    5 |                 1 | 15.2558869162301
  17 |    2 |                 1 | 16.2558869162301
  18 |    1 |                 0 | 17.2558869162301
(18 rows)
Example:

Using the old functionality, for example

  • id, source, target can not be of type BIGINT.
  • It does not return the cost column.
  • Needs an index as parameter for the starting node.
  • The identifiers in the result does not correspond to the indentifiers given as input.
SELECT * FROM pgr_TSP(
    (SELECT * FROM pgr_vidsToDMatrix(
            'SELECT id::INTEGER, source::INTEGER, target::INTEGER, cost, reverse_cost FROM edge_table',
            (SELECT array_agg(id) from edge_table_vertices_pgr WHERE id < 14)::INTEGER[], false , true, true)
    ),
    1
);
 seq | id 
-----+----
   0 |  1
   1 |  2
   2 |  3
   3 |  8
   4 | 11
   5 |  5
   6 | 10
   7 | 12
   8 |  9
   9 |  6
  10 |  7
  11 |  4
  12 |  0
(13 rows)

With the new functionality:

  • id, source, target can be of type BIGINT,
  • There is an aggregate cost column and a cost column in the results.
  • Instead of an index it uses the node identifier for the starting node.
SELECT * FROM pgr_TSP(
    $$
    SELECT * FROM pgr_dijkstraCostMatrix(
        'SELECT id, source, target, cost, reverse_cost FROM edge_table',
        (SELECT array_agg(id) from edge_table_vertices_pgr WHERE id < 14), false)
    $$,
    1,
    randomize := false
);
 seq | node | cost | agg_cost 
-----+------+------+----------
   1 |    1 |    3 |        0
   2 |    4 |    1 |        3
   3 |    9 |    1 |        4
   4 |   12 |    1 |        5
   5 |   11 |    2 |        6
   6 |   13 |    1 |        8
   7 |   10 |    1 |        9
   8 |    5 |    2 |       10
   9 |    7 |    1 |       12
  10 |    8 |    2 |       13
  11 |    6 |    1 |       15
  12 |    3 |    1 |       16
  13 |    2 |    1 |       17
  14 |    1 |    0 |       18
(14 rows)

«  pgr_ksp   ::   Contents   ::   pgr_TSP  »