ACO: Ant System vs Ant Colony System

ACO is the umbrella method. AS and ACS are two ACO variants. This card covers pheromone memory, elitism, conceptual extremes, exploration/exploitation, plot behavior, and HW#6 wording.

Core exam idea: ACO is constructive search with memory. Under ACO, the two variants we care about are AS and ACS: AS is democratic and more explorative; ACS is elitist and more exploitative. The doctor is more likely to ask what happens to behavior than to ask for exact parameter values.

1. The One-Line Model

ACO = GRASP + pheromone memory

Like GRASP, ants build a solution component by component using a greedy/stochastic rule. Unlike GRASP, ACO remembers good components using pheromone.

ACO is adaptation, not GA-style replacement. The component set stays the same; what changes is the pheromone value attached to each component. Components do not die and reproduce like GA individuals; their historical attractiveness adapts.

| Name | What it is | Exam meaning |
| --- | --- | --- |
| ACO | Umbrella family: Ant Colony Optimization. | Constructive search with pheromone memory. |
| AS | Variant of ACO: Ant System. | Democratic pheromone update; more exploration. |
| ACS | Variant of ACO: Ant Colony System. | Elitist best-trail update + q0 rule + modified evaporation; more exploitation. |

| Concept | Meaning in Bin Packing |
| --- | --- |
| Component | Item-bin assignment, or item-pair co-location such as "items A and D are in the same bin." |
| Pheromone | Memory that this component appeared in good past solutions. This is indirect communication between ants. |
| Heuristic | Immediate greedy desirability, such as fit quality, remaining capacity, or item size priority. |
| Solution construction | Build a complete packing by repeatedly choosing feasible components. |
| Schema | A repeated good component pattern stored in pheromone, such as a TSP edge or BPP item-pair. |

Doctor-style contrast: GA, DE, and PSO usually manipulate complete encoded solutions. ACO constructs a solution from pieces. The reusable learning is outside the ant, in the pheromone table. The ant's trail disappears, but the components it used become more or less attractive for future ants.
First-cycle behavior: If all pheromones start equal, early ants are guided mostly by the heuristic and stochastic selection. After completed trails are evaluated, pheromone begins to bias later cycles.

2. AS vs ACS: The Big Difference

Ant System (AS)

More exploration

  • All ants construct solutions.
  • All ants deposit pheromone after evaporation.
  • Better ants deposit more; worse ants deposit less.
  • Unused components only decay; components used by many good trails get multiple deposits.
  • No best-so-far ant dominates the pheromone field alone.

Concept: democratic update. Multiple basins remain alive.

Ant Colony System (ACS)

More exploitation

  • Uses q0 greedy/stochastic transition rule.
  • Uses modified evaporation to keep weak components alive.
  • Best-trail update is elitist: only best-so-far or best ant reinforces pheromone.
  • Converges faster, but may lock into an early trail.

Concept: elitist memory. One best trail can become a bully.

3. The Three Differences You Must Memorize

| Difference | AS | ACS | Behavioral Effect |
| --- | --- | --- | --- |
| Transition rule | Pure proportional probability. | With probability q0, choose the best component greedily; otherwise use proportional probability. | ACS has an explicit exploitation knob. |
| Evaporation / decay | Evaporate pheromone from components. | Use a modified decay/lifeline term: p_i = (1 - beta)p_i + beta*gamma. | ACS gives weak components a chance instead of letting them die instantly. |
| Deposit / reinforcement | All ants deposit. | Only the best-so-far / best ant reinforces its components. | AS is democratic; ACS is elitist. |

Mini Example 1: Transition Rule in AS vs ACS

Assume an ant is currently at city A. It can go next to B, C, or D. The algorithm gives each edge a desirability score based on pheromone and heuristic value.

score(edge) = pheromone^epsilon * heuristic^delta
| Candidate Edge | Score | Meaning |
| --- | --- | --- |
| A-B | 60 | Most attractive edge |
| A-C | 30 | Second best |
| A-D | 10 | Weakest |

AS: Always Proportional Selection

AS converts the scores into probabilities. Total score is 60 + 30 + 10 = 100.

| Edge | Probability in AS | Interpretation |
| --- | --- | --- |
| A-B | 60 / 100 = 60% | Most likely, but not guaranteed. |
| A-C | 30 / 100 = 30% | Still has a real chance. |
| A-D | 10 / 100 = 10% | Weak, but still possible. |

Behavior: AS is stochastic every time. Strong edges are favored, but weaker edges can still be tried.

ACS: Greedy With Probability q0

Assume q0 = 0.8. ACS first draws a random number q in [0, 1].

| Random Draw | ACS Action | Chosen Edge | Behavior |
| --- | --- | --- | --- |
| q = 0.35 | q <= q0, so choose the best edge greedily. | A-B | Exploitation |
| q = 0.92 | q > q0, so use proportional selection like AS. | A-B 60%, A-C 30%, A-D 10% | Exploration |

Exam takeaway: AS always uses proportional stochastic selection. ACS usually chooses the best component greedily when q <= q0, and only sometimes falls back to proportional selection. Therefore, higher q0 means more exploitation and higher lock-in risk.
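The two transition rules can be sketched in a few lines of Python. This is an illustrative sketch, not course code: the scores come from the example table, and the function names (`as_select`, `acs_select`) are my own.

```python
import random

# Edge scores from the example: score = pheromone^epsilon * heuristic^delta.
scores = {"A-B": 60.0, "A-C": 30.0, "A-D": 10.0}

def as_select(scores, rng):
    """Ant System: always proportional (roulette-wheel) selection."""
    total = sum(scores.values())
    r = rng.random() * total
    acc = 0.0
    for edge, s in scores.items():
        acc += s
        if r <= acc:
            return edge
    return edge  # fallback for floating-point rounding

def acs_select(scores, q0, rng):
    """Ant Colony System: greedy with probability q0, else proportional like AS."""
    if rng.random() <= q0:
        return max(scores, key=scores.get)  # exploitation branch
    return as_select(scores, rng)           # exploration branch

rng = random.Random(0)
draws = [as_select(scores, rng) for _ in range(10_000)]
freq_ab = draws.count("A-B") / len(draws)   # should land near 0.60
```

With q0 = 1.0 the ACS rule collapses to always picking A-B; with q0 = 0.0 it behaves like the AS rule at the transition step only.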

Selection Detail the Doctor Emphasized

At each construction step, the ant does not search all complete solutions. It forms a feasible candidate set and selects one component from it.

C' = feasible components not yet in the partial solution
Choose one component from C' using pheromone + heuristic desirability

| Case | Safe Exam Explanation |
| --- | --- |
| Maximization problem | Higher value can be used directly in proportional selection. |
| Minimization problem | Convert cost into desirability, such as 1 / cost, or use a selection method where lower cost wins. |
| Too greedy selection | If the fittest component always wins, this behaves like an elitist bully and hurts exploration. |
| Softer selection | Roulette or a less aggressive tournament still favors good components but gives weaker feasible components a chance. |
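The minimization row can be made concrete with a toy 1/cost conversion (the component names and costs below are made up for illustration):

```python
# Convert minimization costs into desirabilities for proportional selection.
costs = {"comp_A": 2.0, "comp_B": 4.0, "comp_C": 8.0}

desirability = {c: 1.0 / cost for c, cost in costs.items()}
total = sum(desirability.values())
probs = {c: d / total for c, d in desirability.items()}
# The cheapest component gets the largest share, but every feasible
# component keeps a nonzero chance, which preserves exploration.
```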

Mini Example 2: What ACS Does to Pheromone

Use the numbers only to understand direction. The likely exam point is not the symbol names; it is "forget old pheromone, then reinforce the best trail."

Assume two TSP edges are competing: A-B and A-C. Initial pheromone is gamma = 10. Before the update:

| Edge | Current Pheromone | Meaning |
| --- | --- | --- |
| A-B | 20 | Popular / strong edge |
| A-C | 5 | Weak edge |

Step 1: Modified Evaporation for All Edges

p_i = (1 - beta)p_i + beta*gamma, with beta = 0.3 and gamma = 10

| Edge | Calculation | After Decay | Interpretation |
| --- | --- | --- | --- |
| A-B | 0.7*20 + 0.3*10 | 17 | Strong edge is reduced. |
| A-C | 0.7*5 + 0.3*10 | 6.5 | Weak edge gets a lifeline. |

Step 2: Only Best Trail Gets Reinforced

Now suppose the best trail used edge A-B, but did not use A-C. Let alpha = 0.4 and Fitness(Best) = 30.

If edge is in Best: p_i = (1 - alpha)p_i + alpha*Fitness(Best)
| Edge | Calculation | Final Pheromone | Interpretation |
| --- | --- | --- | --- |
| A-B | 0.6*17 + 0.4*30 | 22.2 | Winner edge becomes stronger. |
| A-C | No best-trail reinforcement | 6.5 | Still alive, but weaker. |

Exam takeaway: ACS first applies modified evaporation to all components, then reinforces only the best trail. So non-best components do not disappear immediately, but the best components become much more attractive. This is why ACS is exploitative and can converge quickly.
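The two-step update can be checked with a short Python sketch using the same toy numbers; the function names are mine, not the course's.

```python
def modified_evaporation(p, beta, gamma):
    """Forget a little and pull pheromone toward the initial value gamma."""
    return (1 - beta) * p + beta * gamma

def best_trail_reinforce(p, alpha, fitness_best):
    """Elitist step: only components on the best trail get this."""
    return (1 - alpha) * p + alpha * fitness_best

beta, gamma, alpha, f_best = 0.3, 10.0, 0.4, 30.0

ab = modified_evaporation(20.0, beta, gamma)  # 0.7*20 + 0.3*10 = 17.0
ac = modified_evaporation(5.0, beta, gamma)   # 0.7*5  + 0.3*10 = 6.5
ab = best_trail_reinforce(ab, alpha, f_best)  # A-B is on the best trail
# A-C is not on the best trail: it keeps only its post-decay value.
```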
Scale warning: The reward added to pheromone must be normalized. If pheromone values are small and the fitness reward is huge, one update can overpower all future choices. That becomes premature convergence by scaling mistake, not intelligent learning.
Course-specific correction: Some textbook ACS descriptions include a local update during construction. In the doctor's slide/transcript, the emphasized ACS updates are different: after trails are built, decrease all pheromones a bit using the modified evaporation rule, then increase only the components used in the best trail.

4. ACS Pheromone Updates in This Course

Transcript correction: Incremental construction is an ACO idea, not an ACS-only idea. AS and ACS both build solutions component by component. ACS is different because it makes this construction more exploitative through greedy selection, modified evaporation, and best-trail reinforcement.

Think of ACS as repeated cycles. In one cycle, many ants build complete solutions one component at a time. If there are 100 ants, then all 100 ants construct complete tours or packings during that cycle. After those complete solutions exist, pheromones are updated.

| Update | When It Happens | Applies To | Effect | Behavior |
| --- | --- | --- | --- | --- |
| Modified evaporation | After the trails are built and evaluated. | All components. | Applies p_i = (1 - beta)p_i + beta*gamma. High pheromone is pulled down; very weak pheromone gets a small lifeline toward the initial value. | Forgetting / anti-death |
| Best-trail reinforcement | After all ants finish their complete solutions. | Only the best ant's components, usually best-so-far in ACS. | Increases pheromone on the best solution's components using the elitist learning rate. | Exploitation / elitism |

One ACS cycle with 100 ants

  1. Start with a pheromone table over components. In TSP, components are edges. In bin packing, components can be item-bin assignments or item co-locations.
  2. Each ant builds a complete solution incrementally by selecting feasible components.
  3. After all 100 ants finish, ACS evaluates the complete solutions and identifies the best trail.
  4. ACS applies modified evaporation to all components: old pheromone is partly forgotten, but weak components are pulled back slightly toward the initial pheromone gamma.
  5. ACS then reinforces only the components used in the best trail. The winner gets the pheromone increase.
Same-iteration question: If 100 ants are in the same cycle, Ant 1 does not usually change the pheromone table for Ant 2 immediately in the doctor's version. The colony learns after the cycle, when all complete trails have been evaluated. That updated pheromone guides the next cycle.
Exam answer: In the doctor's ACS explanation, pheromone is updated after complete trails are built. First, all pheromones are decayed with a lifeline toward the initial value. Then only the best trail's components are reinforced. This is why ACS is more exploitative than AS, while still giving non-best components a small chance not to disappear.
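The cycle order (build all trails, find the best, evaporate everything, reinforce only the best) can be sketched with toy data. The components, trails, and fitness values below are invented for illustration; only the ordering of steps is the point.

```python
# One ACS update phase, assuming all ants have already built complete trails.
beta, gamma, alpha = 0.3, 10.0, 0.4

pheromone = {"c1": 12.0, "c2": 8.0, "c3": 3.0}
trails = [
    (["c1", "c2"], 25.0),   # (components used, fitness; higher is better here)
    (["c2", "c3"], 18.0),
    (["c1", "c3"], 30.0),   # best trail of this cycle
]

# Step 3: identify the best trail only AFTER all ants have finished.
best_components, best_fitness = max(trails, key=lambda t: t[1])

# Step 4: modified evaporation on every component (forgetting + lifeline).
for c in pheromone:
    pheromone[c] = (1 - beta) * pheromone[c] + beta * gamma

# Step 5: elitist reinforcement, only for the best trail's components.
for c in best_components:
    pheromone[c] = (1 - alpha) * pheromone[c] + alpha * best_fitness
```

Note that `c2`, which is not on the best trail, only decays toward gamma; it is weakened but not killed.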

5. Why ACO Is Incremental

ACO, including both AS and ACS, is different from many metaheuristics because it does not start with a complete solution and then tweak it. It constructs the solution gradually.

Start: S = empty partial solution
Repeat: find feasible components C' that can be added to S, then choose one component from C'
Stop: when S is a complete feasible solution
| Algorithm Family | Usual Search Object | How It Moves |
| --- | --- | --- |
| HC / SA / TS | Complete solution | Tweak a complete solution using a neighbor move. |
| GA / DE / PSO | Complete individual or particle | Modify, combine, or move complete encoded solutions. |
| ACO | Partial solution being built | Choose one feasible component at a time until the solution is complete. |

In TSP, an ant starts at one city, chooses the next city, then the next, until a full tour is built. In bin packing, an ant can build a packing by choosing item placements or item groupings step by step while respecting capacity.

Optional exploitation step: After a complete trail is constructed, the algorithm may fine-tune it with a local search or hill-climbing step. That is not the core ACO idea. The core idea is still incremental construction plus pheromone feedback.
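The generic construction loop can be sketched for a tour. The uniform `rng.choice` below is a stand-in for the real pheromone + heuristic scoring; everything else (city names, starting city) is illustrative.

```python
import random

cities = ["A", "B", "C", "D"]

def construct_tour(rng):
    """Build a complete tour one feasible component at a time."""
    tour = ["A"]  # S starts as a partial solution
    while len(tour) < len(cities):
        feasible = [c for c in cities if c not in tour]  # the candidate set C'
        # A real ant would weight this choice by pheromone^eps * heuristic^delta;
        # here every feasible city is equally likely.
        tour.append(rng.choice(feasible))
    return tour

tour = construct_tour(random.Random(0))
```

The key property: at no point does the ant hold an infeasible or complete-then-repaired solution; S grows only through feasible components.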

BPP Example: Building a Partial Solution

Capacity is 10. Items are placed one by one: A=6, B=4, C=5, D=3, E=2.

| Step | Current Partial Packing | Next Item | Feasible Choices | Ant's Construction Decision |
| --- | --- | --- | --- | --- |
| 1 | No bins yet | A=6 | Open a new bin | Bin 1: A, load 6 |
| 2 | Bin 1: A, load 6 | B=4 | Put in Bin 1 because 6+4 <= 10, or open Bin 2 | Choose Bin 1 if pheromone/heuristic says A with B is good. |
| 3 | Bin 1: A,B, load 10 | C=5 | Bin 1 is infeasible because 10+5 > 10; must open Bin 2 | Bin 2: C, load 5 |
| 4 | Bin 1: A,B, load 10; Bin 2: C, load 5 | D=3 | Bin 2 feasible because 5+3 <= 10; Bin 1 infeasible; or open Bin 3 | Likely choose Bin 2 if the fit heuristic favors filling space. |
| 5 | Bin 1: A,B, load 10; Bin 2: C,D, load 8 | E=2 | Bin 2 feasible because 8+2 <= 10; Bin 1 infeasible; or open Bin 3 | Choose Bin 2, giving Bin 2: C,D,E, load 10 |

The complete solution after construction is:

Bin 1: A(6), B(4) = 10
Bin 2: C(5), D(3), E(2) = 10
Total bins = 2
Exam takeaway: In BPP, an ACO ant does not start with a complete packing and then repair it. It builds a partial packing step by step. At each step, infeasible placements are excluded, and the ant chooses among feasible placements using pheromone memory plus a greedy heuristic such as remaining capacity or item fit.
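The five construction steps can be replayed in code. First-fit stands in for the ant's stochastic pheromone/heuristic choice; with these items it happens to reproduce the same trace.

```python
# Replaying the construction table with a first-fit stand-in for the ant's choice.
capacity = 10
items = [("A", 6), ("B", 4), ("C", 5), ("D", 3), ("E", 2)]

bins = []  # each bin: {"items": [...], "load": int}
for name, size in items:
    # Feasible choices: existing bins with enough remaining capacity.
    feasible = [b for b in bins if b["load"] + size <= capacity]
    if feasible:
        chosen = feasible[0]  # an ant would score each feasible bin instead
    else:
        chosen = {"items": [], "load": 0}
        bins.append(chosen)   # infeasible everywhere: open a new bin
    chosen["items"].append(name)
    chosen["load"] += size
```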

BPP Example: Best-So-Far Configuration and Pheromone Reward

Assume the pheromone is stored on item pairs that share a bin. This avoids depending on arbitrary bin labels like "Bin 1" or "Bin 2".

When placing a new item, the ant can score a candidate bin by combining a fit heuristic with the pheromone between the new item and items already in that bin.

desirability(item i into bin B) = fit_heuristic(i, B) * sum pheromone(i, item j already in B)

This is why item-pair pheromone is useful in BPP: it remembers which items worked well together, not the arbitrary number of a bin.
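The desirability formula can be sketched directly. The pheromone values below are invented; only the structure (sum pair pheromone between the new item and the bin's current items, times a fit heuristic) follows the formula above.

```python
# Score a candidate bin for a new item using item-pair pheromone.
pheromone = {("E", "C"): 2.0, ("E", "D"): 1.5, ("E", "A"): 0.5, ("E", "B"): 0.5}

def pair_pheromone(i, j):
    """Item-pair pheromone is symmetric: look up (i, j) or (j, i)."""
    return pheromone.get((i, j), pheromone.get((j, i), 0.0))

def desirability(item, bin_items, fit_heuristic):
    return fit_heuristic * sum(pair_pheromone(item, j) for j in bin_items)

# Item E considers Bin 2 = {C, D}: desirability = fit * (2.0 + 1.5).
score = desirability("E", ["C", "D"], fit_heuristic=1.0)
```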

Capacity is still 10. Suppose three ants build these complete packings:

| Ant | Complete Packing | Bins Used | Status |
| --- | --- | --- | --- |
| Ant 1 | {A,B}, {C,D,E} | 2 | Best-so-far |
| Ant 2 | {A,D}, {C,E}, {B} | 3 | Not best |
| Ant 3 | {A,E}, {B,C}, {D} | 3 | Not best |

If Ant 1 is treated as the best-so-far solution, ACS rewards the item-pair components inside Ant 1's bins:

| Best-So-Far Bin | Rewarded Item-Pair Components | Why Rewarded? |
| --- | --- | --- |
| {A,B} | AB | A and B co-occurred in a good bin. |
| {C,D,E} | CD, CE, DE | Every pair inside this good bin gets reinforced. |

Pairs that are not together in the best-so-far packing do not get the best-trail reward:

Rewarded: AB, CD, CE, DE
Not rewarded: AC, AD, AE, BC, BD, BE
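Enumerating the rewarded pairs is one `combinations` loop: every pair that shares a bin in the best-so-far packing gets the reward, and nothing else does.

```python
from itertools import combinations

# Which item-pair components does the best-so-far packing reward?
best_packing = [{"A", "B"}, {"C", "D", "E"}]

rewarded = set()
for bin_items in best_packing:
    # Every unordered pair inside the same bin is a rewarded component.
    for i, j in combinations(sorted(bin_items), 2):
        rewarded.add(i + j)
```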

For a numeric support example, assume beta = 0.3, gamma = 10, alpha = 0.4, and the converted reward for the best packing is Fitness(Best) = 20. Do not memorize these symbols or values; the exam idea is which components get reinforced.

| Item Pair | Before Update | After Decay | Best Reward? | After Reward | Final Effect |
| --- | --- | --- | --- | --- | --- |
| AB | 12 | 0.7*12 + 0.3*10 = 11.4 | Yes | 0.6*11.4 + 0.4*20 = 14.84 | A with B becomes more likely. |
| CD | 9 | 0.7*9 + 0.3*10 = 9.3 | Yes | 0.6*9.3 + 0.4*20 = 13.58 | C with D becomes more likely. |
| CE | 8 | 0.7*8 + 0.3*10 = 8.6 | Yes | 0.6*8.6 + 0.4*20 = 13.16 | C with E becomes more likely. |
| DE | 7 | 0.7*7 + 0.3*10 = 7.9 | Yes | 0.6*7.9 + 0.4*20 = 12.74 | D with E becomes more likely. |
| AD | 15 | 0.7*15 + 0.3*10 = 13.5 | No | 13.5 | Decays only; no best-trail reward. |

Exam takeaway: In BPP with item-pair pheromone, the best-so-far configuration is a complete packing. ACS rewards the item pairs that appear inside the bins of that best packing. This means future ants are biased to reconstruct useful item groupings, not to copy a bin label.
Doctor-style wording: ACO searches incrementally because the ant is making a sequence of construction decisions. Pheromone does not describe a whole solution directly; it guides which component should be added next.
Do not say: "ACS is the one that is incremental." The safer answer is: "ACO is incremental. ACS is a more exploitative ACO variant."

6. Elitism: Does ACS Use It?

Yes, but not like GA.

GA elitism copies the best individual into the next generation. ACS elitism happens through memory:

Only the best-so-far ant deposits pheromone globally.

So the best solution's components become more likely to be chosen by future ants. This is elitist because one best trail dominates the search direction.

Exam answer: ACS uses elitism in the pheromone update. Only the best-so-far ant deposits pheromone globally, so its components are reinforced and future ants are pulled toward that trail. This increases exploitation but can cause premature lock-in.

7. Conceptual Extremes the Doctor Is Likely to Ask

Exam framing: Do not treat this as hyperparameter tuning. The doctor is more likely to ask what pheromone evaporation, pheromone reward, greedy selection, and elitism do to exploration/exploitation.
| Concept | Extreme Scenario | Behavior | Explore / Exploit |
| --- | --- | --- | --- |
| Greedy transition | ACS always chooses the currently best component. | Ants keep following the strongest pheromone/heuristic edge or item grouping. | Maximum exploitation |
| Greedy transition removed | ACS always uses proportional stochastic selection. | Strong components are still favored, but weaker feasible components still have a chance. | More exploration |
| Evaporation / forgetting | No evaporation, or pheromone does not evaporate enough. | Early paths stay strong for too long. If the early best is bad, the colony keeps repeating it. | Exploitation / lock-in risk |
| Evaporation / forgetting | Full evaporation, or pheromone evaporates too much. | The algorithm forgets what it learned. It behaves closer to greedy/random construction with weak memory. | Exploration / weak learning |
| Pheromone vs heuristic | Pheromone has no influence. | The method loses historical memory and behaves closer to GRASP: constructive, greedy/stochastic, but not really learning from past trails. | No learned exploitation |
| Pheromone vs heuristic | Pheromone dominates everything. | The colony follows historically strong components even if the immediate heuristic says they are poor. This can become rich-get-richer lock-in. | Over-exploitation |
| Pheromone reward | All ants reward their components, as in AS. | Many trails remain alive. Good trails get more influence, but weaker trails are not erased immediately. | Broader exploration |
| Pheromone reward | Only the best-so-far trail rewards its components, as in ACS. | The best trail becomes a strong attractor. Future ants are pulled toward the same component set. | Fast exploitation |
| Best-trail reward strength | Reward is too strong. | One trail becomes a bully. Fast early improvement, then a premature plateau if it was not truly best. | Over-exploitation |
| Best-trail reward strength | Reward is too weak. | The best trail barely affects future construction. Ants rely mostly on heuristic/stochastic choice. | Weak exploitation |

Doctor-style answer: Pheromone is memory. Evaporation is forgetting. Rewarding is reinforcement. If pheromone never fades, early bad trails can dominate and cause premature convergence. If pheromone fades too fast, the algorithm forgets what it learned. AS rewards many ants, so it stays broader. ACS rewards only the best trail, so it exploits faster but can lock in.

8. Plot Behavior

AS Plot

Broader search

AS usually improves more gradually and keeps more run-to-run variance. That is expected because all ants deposit pheromone and multiple basins remain reinforced.

ACS Plot

Fast lock-in risk

ACS often improves quickly, then becomes flat. BSF-only deposit creates a strong attractor; if the early trail is suboptimal, the algorithm locks in.

Plot exam answer: AS should show slower, broader improvement because pheromone is updated by all ants. ACS should show faster early improvement and a flatter later BSF because best-so-far deposit concentrates pheromone around one trail.

CB vs BSF in ACO Plots

| Curve | What It Means | Expected Behavior |
| --- | --- | --- |
| Current best / iteration best | The best trail constructed in the current cycle. | Can fluctuate, because each cycle rebuilds new trails from pheromone and stochastic choices. |
| Best-so-far | The best trail ever seen up to this cycle. | Must be monotonic in the correct direction. For BPP as bins-used minimization, it should step down or stay flat. |
| Gap between CB and BSF | How far the current colony behavior is from the best memory. | AS may keep a larger gap because it explores more. ACS often shrinks the gap after lock-in because ants keep reconstructing the BSF basin. |

Doctor-style plot sentence: If ACS is over-exploiting, CB quickly approaches BSF and both plateau. If AS is still exploring, CB may stay noisier while BSF improves more slowly.

9. Why AS Can Beat ACS

ACS can lose when the early best trail is not globally good.

| ACS | AS |
| --- | --- |
| Early best trail receives all global deposit. | All ants deposit, so alternatives survive. |
| Pheromone concentrates fast. | The pheromone field stays broader. |
| Good for easy instances where the greedy basin is near-optimal. | Better for hard instances with distant basins. |
| Risk: premature lock-in. | Risk: slower convergence. |

Exam answer: AS can beat ACS because AS is less elitist. ACS reinforces only the best-so-far trail, so if the early best trail is suboptimal, future ants keep reconstructing the same basin. AS lets all ants deposit, keeping several basins alive and giving the search more exploration.

10. Toy TSP Example: Where Pheromone Goes

Use a tiny TSP with four cities: A, B, C, D. The component is an edge, such as AB or CD. Pheromone lives on edges.

Initial pheromone on every edge = 1.00
Evaporation = 20%, so old pheromone becomes 0.80 * old
Deposit amount = 1 / tour_length

One iteration: three ants build three tours

| Ant | Tour | Edges Used | Tour Length | Deposit = 1/length |
| --- | --- | --- | --- | --- |
| Ant 1 | A-B-C-D-A | AB, BC, CD, AD | 20 | 0.050 |
| Ant 2 | A-B-D-C-A | AB, BD, CD, AC | 22 | 0.045 |
| Ant 3 | A-C-B-D-A | AC, BC, BD, AD | 28 | 0.036 |

AS update: all ants deposit

After evaporation, every edge starts the update at 0.80. Then every ant deposits on the edges it used.

| Edge | Used By | New Pheromone in AS | Interpretation |
| --- | --- | --- | --- |
| AB | Ant 1 + Ant 2 | 0.80 + 0.050 + 0.045 = 0.895 | Strong, but not alone. |
| BC | Ant 1 + Ant 3 | 0.80 + 0.050 + 0.036 = 0.886 | Alternative edge survives. |
| CD | Ant 1 + Ant 2 | 0.80 + 0.050 + 0.045 = 0.895 | Strong, reinforced by two ants. |
| AD | Ant 1 + Ant 3 | 0.80 + 0.050 + 0.036 = 0.886 | Still competitive. |
| BD | Ant 2 + Ant 3 | 0.80 + 0.045 + 0.036 = 0.881 | Not discarded. |
| AC | Ant 2 + Ant 3 | 0.80 + 0.045 + 0.036 = 0.881 | Not discarded. |

AS effect: best edges get slightly more pheromone, but edges from weaker tours also remain alive. The pheromone field stays broad.

ACS update: only best ant deposits

Ant 1 is the best tour because length 20 is shortest. In ACS, only Ant 1 deposits globally.

| Edge | In Best Tour? | New Pheromone in ACS | Interpretation |
| --- | --- | --- | --- |
| AB | Yes | 0.80 + 0.050 = 0.850 | Best trail reinforced. |
| BC | Yes | 0.80 + 0.050 = 0.850 | Best trail reinforced. |
| CD | Yes | 0.80 + 0.050 = 0.850 | Best trail reinforced. |
| AD | Yes | 0.80 + 0.050 = 0.850 | Best trail reinforced. |
| BD | No | 0.800 | No deposit. |
| AC | No | 0.800 | No deposit. |

ACS effect: pheromone becomes concentrated on the best tour's edges. This is exploitation. If Ant 1 is truly good, ACS converges fast. If Ant 1 was just an early lucky local optimum, ACS may lock in too early.
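Both updates can be checked in a few lines. This follows the toy example's additive deposit (0.80 * old + 1/length), not the convex-combination rule from Section 3, and "ACS" here means the course's best-only global deposit.

```python
# Toy TSP update: 20% evaporation, deposit = 1/tour_length.
rho = 0.20
tours = {
    "ant1": ({"AB", "BC", "CD", "AD"}, 20),
    "ant2": ({"AB", "BD", "CD", "AC"}, 22),
    "ant3": ({"AC", "BC", "BD", "AD"}, 28),
}
edges = ["AB", "BC", "CD", "AD", "BD", "AC"]

# AS: evaporate every edge, then EVERY ant deposits on the edges it used.
as_pher = {e: 1.00 * (1 - rho) for e in edges}
for used, length in tours.values():
    for e in used:
        as_pher[e] += 1 / length

# ACS-style global update: only the best (shortest) tour deposits.
best_used, best_len = min(tours.values(), key=lambda t: t[1])
acs_pher = {e: 1.00 * (1 - rho) for e in edges}
for e in best_used:
    acs_pher[e] += 1 / best_len
```

In `as_pher` every edge ends up above 0.88, so the field stays broad; in `acs_pher` only the four best-tour edges rise above the 0.80 baseline.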

Next iteration: what changes?

| Question | AS | ACS |
| --- | --- | --- |
| Which edges are more likely? | Several edges: AB/CD strongest, but BC/AD/BD/AC still close. | Mainly best-tour edges: AB, BC, CD, AD. |
| Search behavior | More exploration; several tours remain plausible. | More exploitation; ants are pulled toward A-B-C-D-A. |
| Plot expectation | Slower improvement, more variance. | Fast early improvement, then possible flat lock-in. |

Exam takeaway: In TSP, pheromone is placed on edges. AS deposits on every edge used by every ant, weighted by tour quality, so many possible paths remain alive. ACS deposits only on the best tour's edges, so it reinforces one path strongly. That is why AS is more explorative and ACS is more exploitative.

11. HW#6 Story

Your HW#6 story should not be just "ACS got good numbers." Say why the behavior happened.

| Observation | Exam Meaning |
| --- | --- |
| ACS matched or beat FFD on most HW#6 instances. | BPP works well with constructive search because the algorithm can avoid infeasible placements while building the solution. |
| Pheromone helped most clearly on the instance where useful co-location patterns were not already captured by FFD. | ACO helps when memory adds information beyond the greedy heuristic. |
| On some instances AS could beat ACS. | ACS can lock into a good-looking early grouping, while AS keeps alternative groupings alive longer. |
| ACS can be effective but not necessarily efficient. | The exam distinction is effectiveness vs efficiency: good final quality does not mean low runtime. |

HW-style answer: In HW#6, ACS performed well because it directly constructed packings using a size-based heuristic and pheromone memory over item co-location. This matched the structure of bin packing better than random-key DE/PSO. However, AS sometimes beat ACS on harder cases because AS deposited pheromone democratically, preserving alternative groupings instead of locking into the first good ACS trail.

12. Common Exam Prompts

Q: Did ACS use elitism?

Yes. ACS uses elitism through BSF-only pheromone deposit. The best trail's components dominate future construction.

Q: What does q0 = 1 do?

It makes ACS choose the best component greedily every time. Maximum exploitation, high premature lock-in risk.

Q: What does q0 = 0 do?

It removes the greedy branch and uses proportional selection. More exploration. But it is only AS-like in transition, not fully AS if ACS still uses modified evaporation and BSF-only reinforcement.

Q: Why do pheromones evaporate?

Evaporation is forgetting. Without it, early trails can stay strong forever and cause premature convergence. With too much evaporation, ACO forgets what it learned.

Q: What does rewarding do?

Rewarding increases pheromone on components found in good solutions. In AS, many ants reward, so search stays broader. In ACS, only the best rewards, so search becomes more exploitative.

Q: Which is more explorative?

AS, because all ants deposit and the pheromone field stays broad.

Q: Which is more exploitative?

ACS, because q0 enables greedy selection and BSF-only deposit reinforces one best trail.

Q: Is ACO generational?

Not in the GA replacement sense. Ants construct new trails each cycle, but the stable adaptive object is the pheromone table over components.

Q: What is death in ACO?

A component is not physically removed. Its pheromone can decay so much that it is rarely selected. That is the ACO version of practical death.

Q: What is the schema in ACO?

The schema is a useful repeated component pattern stored in pheromone, such as a TSP edge or BPP item-pair that appears in good solutions.

Q: More ants or more cycles?

For ACO, more cycles are usually more important than one huge ant batch, because pheromone needs repeated feedback to become useful memory.

13. Final One-Paragraph Answer

ACO is constructive search with indirect memory: ants build a complete solution one component at a time, then pheromone adapts the attractiveness of the components used in good trails. AS is democratic: all ants deposit pheromone after evaporation, so the search remains broader and more explorative. ACS is elitist: it uses a q0 greedy transition rule, modified evaporation, and best-so-far reinforcement. This makes ACS faster and more exploitative, but it can prematurely lock into an early suboptimal trail. In HW#6, ACS worked well because bin packing benefits from constructive item-pair memory and a size heuristic, but AS sometimes beat ACS because democratic deposit preserved alternative item groupings.

← Hub