A modified version of the stochastic traveling salesperson
problem is described and used to prove a rate
of convergence result.
1. Introduction. Consider a special finite collection C of graphs. (Here vertices will be points in Euclidean space, and an edge between two vertices will be the line segment between the two points.) Associated with each graph in C is a number which is the sum of its edge lengths. A common optimization problem is to find the minimum such sum. Well-known examples include the traveling salesperson problem, which asks for the length of a minimal cycle through a collection of vertices, and the minimal spanning tree problem, which is self-explanatory.
Over the last 15 years at least, there has been considerable interest in an interface between problems such as these and probability. Vertices are considered to be realizations of some random process, and what is of interest is the behavior (both average and probability one behavior) of such minimal sums as the number of vertices increases.
The remarks below are not intended to be a survey, nor are they
intended to be completely rigorous. Our goal is to give the reader
a small taste of the appealing nature of some of the problems
and solutions often encountered in this area. In particular,
we wish to describe how a new boundary rooting technique can be
used to prove a rate of convergence result. For the sake of a
simple exposition, we will limit our comments to the traveling
salesperson problem in dimension 2 where the vertices have the
uniform distribution on the unit square. We will only address
expected, or average, behavior and leave the statements of probability
one results in full generalization (other dimensions, problems,
and distributions) to section 5.
2. Two Questions.Let
be a sequence of points distributed uniformly and independently
in the unit square. Let
be the length
of a minimal cycle through
. (See illustration
1.)
is a random variable, and it is
natural to inquire about its expectation,
,
which we will abbreviate
One question
which immediately comes to mind is "How fast does
grow as
?" It is not difficult the
show that
(2.1)
where are positive constants which do
not depend upon n. The above display can be rewritten
as
(2.2) .
This raises the further problem of whether or not the sequence
converges, which we will call question
#1.
Question #1: Does
converge?
This question was answered in 1959 by Beardwood, Halton, and
Hammersley.
Theorem #1. There exists a positive constant
such that .
Once a limit theorem such as this has been proved, it makes sense
to inquire into the rate of such convergence.
Question #2: How fast does
approach zero?
Beardwood, Halton, and Hammersley conjectured that this convergence
was at least as fast as (From this point
on, C will represent a constant not depending on n,
the value of which may change from line to line.) The truth of
this conjecture was not established until 1994 when Alexander
proved it.
Theorem #2: There exists a constant
C not depending on n such that
Essential to the proof of the rate theorem #2 is a property of
called subadditivity. The proof requires
establishing upper and lower bounds for the quantity
.
Subadditivity yields the upper bound easily. Most of the work
is in establishing the lower bound.
What we intend for the remainder of this note is to give an argument
for the lower bound which is different from that of Alexander's.
The argument uses a technique called "boundary rooting"
which was introduced by Redmond and Yukich (1994). However, it
is important for the reader to first understand a bit about subadditivity.
3. Subadditivity. Note that if n points are distributed
uniformly and independently in a square of edge length 1/m,
then the average length of a minimal cycle through these points
is Now, if n points are distributed
uniformly and independently in the unit square which has been
subdivided into
subsquares of edge length
1/m, then "about"
points
fall in a subsquare. (n here is divisible by
.)
Therefore, the average length of a minimal cycle through the
points falling in a subsquare is "about"
(See illustration 2.)
With some work it is possible to show that
the average length of a minimal cycle through all n points
is approximately less than the sum of the average lengths of the
minimal cycles through the points in each subsquare. Precisely
we can say the following.
(3.1)
In the display above, is the number of
subsquares,
is the approximate average
length of a minimal cycle through the points in a subsquare, and
Cm is an error term accounted for by the fact that there
are not exactly
points in each subsquare
and the fact that the minimal cycles in the subsquares must be
patched together before (3.1) can be established.
Now, in (3.1), replace n with
The result is the following.
(3.2) .
This holds for any positive integers n and m. Dividing
by we obtain
(3.3) .
Notice that the right side of this inequality does not depend
upon m. Now, if we let by theorem
#1 we obtain
which, rewritten, is
(3.4) .
Thus, subadditivity yields easily the upper bound necessary
for establishing theorem #2. We now take a different approach
to that of Alexander's and show how the boundary rooting technique,
in an appealing way, furnishes the necessary lower bound.
4. The Boundary Rooted Traveling Salesperson Tour. Imagine
a salesperson traveling along a path in the unit square. Let
us define the "cost" of such a path to be the length
of that portion of the path which does not intersect the boundary.
Hence, whenever the salesperson travels along the boundary, he/she
is incurring no cost. Now the salesperson wishes to visit each
vertex exactly once and return to the starting point while minimizing
the cost of the trip. It may occasionally be more beneficial
to go from one vertex to another by exiting to the boundary, traveling
the boundary for free, and then reentering to the second vertex.
What we now have is a modification of the original traveling
salesperson tour, the "boundary rooted" version, where
free travel on the boundary is permitted. In this case, let us
denote the average length of a minimal boundary rooted cycle through
n uniformly and independently distributed points by
What is nice about is that it is superadditive.
Imagine n points distributed uniformly and independently
in the unit square. Put a minimal boundary rooted cycle through
these points. Now subdivide the unit square into
subsquares of edge length 1/m. One sees that within each
subsquare there is a boundary rooted cycle through the points
in that subsquare. (See illustrations 4 and 5.) Replace the
existing boundary rooted cycle in each subsquare with a minimal
one. This only reduces the cost of the cycle within each subsquare,
yielding
(4.1)
Again, the error term is due to the fact that there are not exactly
points in each subsquare.
The argument in section 3 can now be followed exactly to obtain
(4.2) .
(One must first show that , which is not
hard to do.) Thus, superadditivity yields the lower bound in
the boundary rooted case. What is left to do is to show that
are sufficiently close to one another.
5. The Relationship between the Original and Boundary Rooted
Versions. It is possible to show that
(5.1)
From this, (3.4), and (4.2) it follows that
(5.2)
thus finishing the proof of theorem #2.
The key to the proof of (5.1) is to show that in a minimal boundary
rooted cycle, the average number of points rooted to the boundary
is not more that This is the only part
of the argument that we will show.
Let Q denote the unit square, and let
be the square centered within Q with edge length
Between the boundaries of Q and
we have constructed a "mote" of width
.
The average number of points falling within the mote is not more
than
. Therefore, the average number
of points connected to the boundary from within the mote in a
minimal boundary rooted cycle is not more than
.
Now subdivide into horizontal columns,
each of width not more than
, so that
there are not more than
of these columns.
Let F be one of the vertical sides of Q. Now suppose
that a and b are vertices within a column. Note
that it is impossible for the minimal boundary rooted cycle to
go from a to F, proceed along F, and then
enter back to b. Because the mote would be crossed twice,
the cost of this part of the cycle would be at least
more than the distance between a and b. This argument
shows that there can be at most two points connected to F
from within each column. Therefore, the total number of
points from within
connected to F
is not more than 2
points. Repeating this
argument for each of the 4 sides of Q we obtain that the
average total number of points connected to the boundary is not
more than
6. Some Generalizations. Theorem #1 can be extended and
generalized in many different directions. The work of Beardwood,
Halton, and Hammersley (1959) along with that of Steele (1981)
shows that if is a sequence of points
independently distributed in the d-dimensional cube (
),
each with the same distribution, then
(6.1) .
where f is the density of the absolutely continuous part
of the probability law of , a.s. means
"almost surely" or "with probability one",
and is a positive constant depending on d and L.
L here is a functional defined on finite sets of points
in the d-dimensional cube that satisfies certain conditions.
L could be the length of a minimal cycle, but it could
also be the length of a minimal spanning tree, a minimal matching,
etc.
Theorem #2 can also be extended and generalized. Alexander (1994)
and Redmond and Yukich (1994) show that if
is a sequence of points independently and uniformly distributed
in the d-dimensional cube, then
(6.2)
where L again is a functional satisfying certain conditions.
6. Concluding Remarks. Not only has the boundary rooting technique been useful in providing another interesting solution to the Beardwood, Halton, and Hammersley conjecture, but it has helped to unify much of the theory in a natural and simple way. For more on the boundary rooting technique we refer the reader to the Redmond-Yukich and Yukich references. Unfortunately, a recent, general survey of the interface between probability and combinatorial optimization is not yet available. Several are expected in the near future.
Below, in the references, we mention only those articles to which
we have referred directly. We are leaving out an immense number
of important works in this area and many authors to which this
subject owes its present state of development. We hope that what
we have listed is enough to point the interested reader in the
right direction.
ALEXANDER, K. (1994). Rates of convergence of means for distance-minimizing subadditive Euclidean functionals. Ann. Appl. Probab. 4 902-922.
BEARDWOOD, J., HALTON, J. H. and HAMMERSLEY, J. M. (1959). The shortest path through many points. Proc. Cambridge Philos. Soc. 55 299-327.
REDMOND, C. and YUKICH, J. E. (1994). Limit theorems and rates of convergence for Euclidean functionals. Ann. Appl. Probab. 4 1057-1073.
REDMOND, C. and YUKICH, J.E. (1996). Asymptotics for Euclidean functionals with power-weighted edges. Stochastic Processes and Their Applications. To appear.
STEELE, J. M. (1981). Subadditive Euclidean functionals and non-linear growth in geometric probability. Ann. Probab. 9 365-376.
YUKICH, J. E. (1995). Asymptotics for the stochastic TSP with power-weighted edges. Prob. Theory and Related Fields. 102 203-220.
YUKICH, J. E. (1995). Quasi-additive Euclidean functionals, in: D. Aldous, P. Diaconis, J. Spencer and J. M. Steele, eds., Discrete Probability and Algorithms (Springer, Berlin). 149-158.
YUKICH, J. E. (1995). Minimal triangulations resemble minimal tours. Preprint.
YUKICH, J. E. (1995). Worst case growth rates for some classical optimization problems. Combinatorica. To appear.
YUKICH, J. E. (1995). Ergodic theorems for some
classical optimization problems. Ann. Appl. Probab.
To appear.
Department of Mathematics and Computer Systems
Mercyhurst College
Erie, PA 16546
chad@paradise.mercy.edu