Dr. Dobb's Journal October 1999
Dear DDJ,
Like the two readers whose letters appeared in the August 1999 issue, I enjoy Jon Bentley's articles very much. I am sure his TSP algorithm in the April 1999 "Algorithm Alley" could be improved, but both letter writers got it wrong. The self-swap is not really a problem compared with their extra test, which will put a bubble in the processor's pipeline; besides, the self-swap happens only once per loop. The other problem is that eliminating the sqrt function does not preserve the ordering of the total distances. I am sure Jon is well aware of these issues. They show, once again, that optimization is not as easy as it looks.
WG
wgao7611@my-deja.com
Dear DDJ,
The optimization to the Traveling Salesman Problem (TSP) proposed by Peter Roth in his letter to the editor in the August 1999 issue can, unfortunately, result in erroneous results. In particular, his optimization relies on the assertion that:
IF:   n0^2 + n1^2 + ... + ni^2  >  m0^2 + m1^2 + ... + mi^2
THEN: n0 + n1 + ... + ni  >  m0 + m1 + ... + mi
This is, however, not true. As an example, pick the pair of n's 1 and 5 and the pair of m's 3 and 4. The sum of the squares for the n's works out to 26; for the m's we get 25. On the other hand, the sum of the n's is 6, which is less than the sum of the m's, which is 7.
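Everett's counterexample is easy to verify numerically. A minimal check (in Python rather than the C of the original column):

```python
# Show that ordering by sum of squares does not imply ordering by sum.
ns = [1, 5]
ms = [3, 4]

sum_sq_n = sum(x * x for x in ns)  # 1 + 25 = 26
sum_sq_m = sum(x * x for x in ms)  # 9 + 16 = 25

sum_n = sum(ns)  # 1 + 5 = 6
sum_m = sum(ms)  # 3 + 4 = 7

# The squared sums order one way...
assert sum_sq_n > sum_sq_m
# ...but the plain sums order the other way.
assert sum_n < sum_m
```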
Toby Everett
tua@tua.arctic.net
Dear DDJ,
Peter Roth wrote an interesting letter that you published in the August, 1999 issue of DDJ, discussing how to speed up the distance calculations for the Traveling Salesman Problem. However, there is a much greater speedup available, which could be applied to either the squares of the distances or to the distances themselves: that is to not calculate the distances at all during the search.
How can this be done? By precalculating all the intercity distances once and for all and saving them in a table. Since the set of distances between cities is fixed, it need only be calculated once; and since the number of distances grows only as the square of the number of cities, instead of factorially, doing it once at the beginning of the program is a trivial exercise compared to the main algorithm. Then, when executing the main algorithm, you need only retrieve the value in the (i,j) entry of the table to get the distance between cities i and j.
In fact, even that calculation can be cut by more than half, since distance(i,i) = 0 and distance(i,j) = distance(j,i); less than half the table need be calculated, and the rest filled in from known values.
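Mark's table idea can be sketched in a few lines. This is my own illustration (in Python, with hypothetical names), not code from the column; it computes only the i < j half of the table and fills the rest in by symmetry, as the letter suggests:

```python
import math

def build_distance_table(cities):
    """Precompute all intercity distances once.

    cities is a list of (x, y) coordinate pairs.  Only the i < j
    half of the table is computed; distance(i, i) = 0 and
    distance(j, i) is filled in from distance(i, j) by symmetry.
    """
    n = len(cities)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        xi, yi = cities[i]
        for j in range(i + 1, n):
            xj, yj = cities[j]
            d = math.hypot(xi - xj, yi - yj)  # Euclidean distance
            dist[i][j] = d
            dist[j][i] = d  # the other half comes for free
    return dist
```

During the search itself, looking up the distance between cities i and j is then just an array index, dist[i][j], with no arithmetic at all.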
Howard Mark
hlmark@j51.com
Peter responds: Yes indeed! The more folks look at the code, the faster it can be made. Trading storage for computation is another classic speedup. Thanks for your note, Howard.
Dear DDJ,
I found Jon Bentley's April 1999 "Analyzing Algorithms" column interesting. I have found that trying to come up with a good algorithm is when programming gets exciting for me. Jon piqued my curiosity about whether he had fully optimized his algorithm. The following is a technique I came up with for determining the minimum number of calculations for the TSP. I am not a mathematician nor do I study algorithms, so I would not be surprised if the following is old, useless information.
Identify a tour first by its points, then all possible links; I will use a four-stop tour as example. Given four points (1,2,3, and 4), there are, of course, six possible links: A=1-2, B=1-3, C=1-4, D=2-3, E=2-4, and F=3-4. That is, for N points, there are (N×(N-1))/2 links. These link lengths must all be calculated, so there are (N×(N-1))/2 distance calculations.
Next, list all of the possible unique circuits, ignoring starting point and direction. That is, consider a circuit that goes 1 to 2 to 3 to 4 and back to 1 the same as one that goes 3 to 2 to 1 to 4 and back to 3. Except, when listing the circuit, list the links instead of the points. For example, the tour just given would be ADFC (1-2, 2-3, 3-4, 4-1). For four points, this would result in three unique circuits: ADFC, AEFB, and BDCE. I have found there to be ((N-1)×(N-2)×(N-3))/2 possible circuits for a given tour. I am pretty sure that a simple program can be worked out to list the possible circuits.
Last, sum the segments of each circuit and take the lowest; of course there may be more than one lowest. This means that there would be (N-1)×((N-1)×(N-2)×(N-3))/2 sums and ((N-1)×(N-2)×(N-3))/2 comparisons.
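For what it is worth, the count of unique circuits (ignoring starting point and direction) is (N-1)!/2 in general, which agrees with Verbruggen's formula at N = 4 and N = 5. The enumeration is easy to brute-force; a sketch of my own (in Python, not Bentley's code) that fixes the starting city to remove the choice of start, and discards the reversed duplicate of each circuit:

```python
from itertools import permutations

def unique_circuits(n):
    """All tours of cities 0..n-1, ignoring start point and direction.

    Fixing city 0 as the first stop removes the choice of starting
    point; keeping only tours whose second city is smaller than their
    last removes the reversed duplicate of each circuit.
    """
    return [(0,) + rest
            for rest in permutations(range(1, n))
            if rest[0] < rest[-1]]

def tour_length(tour, dist):
    """Sum the link lengths around a circuit, closing back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
```

For N = 4 this yields exactly the three circuits in the letter; taking min(tour_length(t, dist) for t in unique_circuits(n)) then gives the shortest tour.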
I have not had time to compare these results to Jon's but thought he might be interested. My apologies if this is just common knowledge reinvented by an old hack. Then again, I may not have necessarily hit on the minimum either.
Ron Verbruggen
aesoft@sprintmail.com
Dear DDJ,
I very much enjoyed Aspi Havewala's article "The Version Control Process" (DDJ, May 1999). Most of it was really excellent advice. Aspi mentioned the situation of needing to make experimental changes to a set of files, where the changes may or may not be kept and the number of files is not predetermined. I have found that the best way to do this is with branch and merge. If you branch just the files you need (as you need them) into an experimental subproject, you can easily merge the changes back if you decide to keep them. Meanwhile, you gain all the benefits of version control in the experimental subproject (including change history and backup) without worrying about your changes affecting the mainstream development of the project. This approach allows daily check-ins, a habit that seems to me to be a good one.
Indeed, for vertical corporate apps that are not so version-driven as shrink-wrapped or widely distributed apps, the paradigm of branching and merging a subset of files for a particular project can be more effective than branching a whole source tree.
Robert Patterson
rpatterson@PROMUS.com
Aspi responds: Robert, thanks for the feedback. I certainly see the benefits of what you are proposing. My only concern with this approach is that not all developers are created equal. This is a euphemism for saying that quite a few developers are likely to bungle the management of a private branch. This was the primary reason I refrained from suggesting this kind of a scheme. Although productive in the hands of an experienced and conscientious developer, it can be problematic in the hands of a novice.
Dear DDJ,
I enjoyed Aspi Havewala's excellent exposition of the major issues relating to source code version control (DDJ, May 1999). However, there are a couple of additional points which I feel deserve mention, since in my experience they significantly ease the problems involved in team software development.
The first relates to the scenario described by Aspi in the "Monitoring and Refining Usage Patterns" section, where a developer forgets to refresh and lock a file before beginning work on it. Our team deals with this in a fool-proof way by using the feature of the version control system whereby all files are read-only until checked out. If, additionally, the programmer's editor or IDE is set to disallow editing (not just saving) of read-only files, there is no possibility of inadvertently losing changes.
Second, we handle the inconvenience of contention for files by setting the version control system to allow files to be checked out by multiple programmers simultaneously. While this may sound dangerous, in practice, with version-control systems that are able to perform automatic three-way merges at check-in time (comparing the changes made by each programmer and merging them together if they don't conflict), it is quite safe and painless.
Allowing multiple check-outs introduces one extra scenario which again is handled painlessly by the automatic merge feature. Programmer A checks out files 1 and 2 and begins work on them. Programmer B then checks out file 1 simultaneously and begins his modifications. Programmer A then checks in his changes to files 1 and 2. Programmer B then realizes he needs to change file 2, so he checks it out and gets Programmer A's changes. Programmer B's build is now broken because his version of file 1 contains his changes, but not A's changes. B cannot simply refresh file 1 to get A's changes because he would lose his own mods. We solve this by setting the version control system to automatically merge A's changes into B's version of file 1 during the refresh process. Because the merge occurs to the file on B's hard disk rather than the database version, there is no risk of B breaking the main build with incomplete changes.
Some years ago, when I evaluated several version control systems for potential use by our development team, only Visual SourceSafe had the automatic merge feature, and it was the clincher. I assume that by now most systems will have incorporated it.
Kit Adams
kit@adi.co.nz
Dear DDJ,
In his column "Jukebox: Covering the Basses" (DDJ, June 1999), Al Stevens does a great job in the first paragraph explaining why someone would prefer acoustic music over electronic music. I just wanted to turn him on to some electronic music that might change his mind somewhat: http://www.asphodel.com/releases/0966/index.shtml.
This is one of my favorites for purely electronic ambient music, with industrial undertones. There is more to electronic music than "artificial fireplaces, plastic houseplants, and disposable appliances." Electronic music can be about synthesizing new audio soundscapes for our minds and emotions to explore. This freedom of expression assists freedom of the human spirit.
There is some very interesting stuff that has been going on in the field of electronic music synthesis. For instance, I've been playing with a public domain package called CSound, originating out of Leeds University in the UK, and plan on developing some stuff in Java for it.
Ed Remmell
eremmell@evolving.com
DDJ