ACO: Ant System vs Ant Colony System
ACO is the umbrella method. AS and ACS are two ACO variants. This card covers pheromone memory, elitism, conceptual extremes, exploration/exploitation, plot behavior, and HW#6 wording.
1. The One-Line Model
Like GRASP, ants build a solution component by component using a greedy/stochastic rule. Unlike GRASP, ACO remembers good components using pheromone.
ACO is adaptation, not GA-style replacement. The component set stays the same; what changes is the pheromone value attached to each component. Components do not die and reproduce like GA individuals; their historical attractiveness adapts.
| Name | What it is | Exam meaning |
|---|---|---|
| ACO | Umbrella family: Ant Colony Optimization. | Constructive search with pheromone memory. |
| AS | Variant of ACO: Ant System. | Democratic pheromone update; more exploration. |
| ACS | Variant of ACO: Ant Colony System. | Elitist best-trail update + q0 rule + modified evaporation; more exploitation. |
| Concept | Meaning in Bin Packing |
|---|---|
| Component | Item-bin assignment, or item-pair co-location such as "items A and D are in the same bin." |
| Pheromone | Memory that this component appeared in good past solutions. This is indirect communication between ants. |
| Heuristic | Immediate greedy desirability, such as fit quality, remaining capacity, or item size priority. |
| Solution construction | Build a complete packing by repeatedly choosing feasible components. |
| Schema | A repeated good component pattern stored in pheromone, such as a TSP edge or BPP item-pair. |
2. AS vs ACS: The Big Difference
Ant System (AS)
More exploration
- All ants construct solutions.
- All ants deposit pheromone after evaporation.
- Better ants deposit more; worse ants deposit less.
- Unused components only decay; components used by many good trails get multiple deposits.
- No best-so-far ant dominates the pheromone field alone.
Concept: democratic update. Multiple basins remain alive.
Ant Colony System (ACS)
More exploitation
- Uses
q0greedy/stochastic transition rule. - Uses modified evaporation to keep weak components alive.
- Best-trail update is elitist: only best-so-far or best ant reinforces pheromone.
- Converges faster, but may lock into an early trail.
Concept: elitist memory. One best trail can become a bully.
3. The Three Differences You Must Memorize
| Difference | AS | ACS | Behavioral Effect |
|---|---|---|---|
| Transition rule | Pure proportional probability. | With probability q0, choose the best component greedily; otherwise use proportional probability. |
ACS has an explicit exploitation knob. |
| Evaporation / decay | Evaporate pheromone from components. | Use a modified decay/lifeline term: p_i = (1 - beta)p_i + beta*gamma. |
ACS gives weak components a chance instead of letting them die instantly. |
| Deposit / reinforcement | All ants deposit. | Only best-so-far / best ant reinforces its components. | AS is democratic; ACS is elitist. |
Mini Example 1: Transition Rule in AS vs ACS
Assume an ant is currently at city A. It can go next to B, C, or D. The algorithm gives each edge a desirability score based on pheromone and heuristic value.
| Candidate Edge | Score | Meaning |
|---|---|---|
A-B | 60 | Most attractive edge |
A-C | 30 | Second best |
A-D | 10 | Weakest |
AS: Always Proportional Selection
AS converts the scores into probabilities. Total score is 60 + 30 + 10 = 100.
| Edge | Probability in AS | Interpretation |
|---|---|---|
A-B | 60 / 100 = 60% | Most likely, but not guaranteed. |
A-C | 30 / 100 = 30% | Still has a real chance. |
A-D | 10 / 100 = 10% | Weak, but still possible. |
Behavior: AS is stochastic every time. Strong edges are favored, but weaker edges can still be tried.
ACS: Greedy With Probability q0
Assume q0 = 0.8. ACS first draws a random number q in [0, 1].
| Random Draw | ACS Action | Chosen Edge | Behavior |
|---|---|---|---|
q = 0.35 | q <= q0, choose the best edge greedily. | A-B | Exploitation |
q = 0.92 | q > q0, use proportional selection like AS. | A-B 60%, A-C 30%, A-D 10% | Exploration |
q <= q0, and only sometimes falls back to proportional selection. Therefore, higher q0 means more exploitation and higher lock-in risk.
Selection Detail the Doctor Emphasized
At each construction step, the ant does not search all complete solutions. It forms a feasible candidate set and selects one component from it.
| Case | Safe Exam Explanation |
|---|---|
| Maximization problem | Higher value can be used directly in proportional selection. |
| Minimization problem | Convert cost into desirability, such as 1 / cost, or use a selection method where lower cost wins. |
| Too greedy selection | If the fittest component always wins, this behaves like an elitist bully and hurts exploration. |
| Softer selection | Roulette or a less aggressive tournament still favors good components but gives weaker feasible components a chance. |
Mini Example 2: What ACS Does to Pheromone
Use the numbers only to understand direction. The likely exam point is not the symbol names; it is "forget old pheromone, then reinforce the best trail."
Assume two TSP edges are competing: A-B and A-C. Initial pheromone is gamma = 10. Before the update:
| Edge | Current Pheromone | Meaning |
|---|---|---|
A-B | 20 | Popular / strong edge |
A-C | 5 | Weak edge |
Step 1: Modified Evaporation for All Edges
| Edge | Calculation | After Decay | Interpretation |
|---|---|---|---|
A-B | 0.7*20 + 0.3*10 | 17 | Strong edge is reduced. |
A-C | 0.7*5 + 0.3*10 | 6.5 | Weak edge gets a lifeline. |
Step 2: Only Best Trail Gets Reinforced
Now suppose the best trail used edge A-B, but did not use A-C. Let alpha = 0.4 and Fitness(Best) = 30.
| Edge | Calculation | Final Pheromone | Interpretation |
|---|---|---|---|
A-B | 0.6*17 + 0.4*30 | 22.2 | Winner edge becomes stronger. |
A-C | No best-trail reinforcement | 6.5 | Still alive, but weaker. |
4. ACS Pheromone Updates in This Course
Think of ACS as repeated cycles. In one cycle, many ants build complete solutions one component at a time. If there are 100 ants, then all 100 ants construct complete tours or packings during that cycle. After those complete solutions exist, pheromones are updated.
| Update | When It Happens | Who Causes It? | Effect | Behavior |
|---|---|---|---|---|
| Modified evaporation | After the trails are built and evaluated. | All components. | Applies p_i = (1 - beta)p_i + beta*gamma. High pheromone is pulled down; very weak pheromone gets a small lifeline toward the initial value. |
Forgetting / anti-death |
| Best-trail reinforcement | After all ants finish their complete solutions. | Only the best ant, usually best-so-far in ACS. | Increases pheromone on the best solution's components using the elitist learning rate. | Exploitation / elitism |
One ACS cycle with 100 ants
- Start with a pheromone table over components. In TSP, components are edges. In bin packing, components can be item-bin assignments or item co-locations.
- Each ant builds a complete solution incrementally by selecting feasible components.
- After all
100ants finish, ACS evaluates the complete solutions and identifies the best trail. - ACS applies modified evaporation to all components: old pheromone is partly forgotten, but weak components are pulled back slightly toward the initial pheromone
gamma. - ACS then reinforces only the components used in the best trail. The winner gets the pheromone increase.
100 ants are in the same cycle, Ant 1 does not usually change the pheromone table for Ant 2 immediately in the doctor's version. The colony learns after the cycle, when all complete trails have been evaluated. That updated pheromone guides the next cycle.
5. Why ACO Is Incremental
ACO, including both AS and ACS, is different from many metaheuristics because it does not start with a complete solution and then tweak it. It constructs the solution gradually.
| Algorithm Family | Usual Search Object | How It Moves |
|---|---|---|
| HC / SA / TS | Complete solution | Tweak a complete solution using a neighbor move. |
| GA / DE / PSO | Complete individual or particle | Modify, combine, or move complete encoded solutions. |
| ACO | Partial solution being built | Choose one feasible component at a time until the solution is complete. |
In TSP, an ant starts at one city, chooses the next city, then the next, until a full tour is built. In bin packing, an ant can build a packing by choosing item placements or item groupings step by step while respecting capacity.
BPP Example: Building a Partial Solution
Capacity is 10. Items are placed one by one: A=6, B=4, C=5, D=3, E=2.
| Step | Current Partial Packing | Next Item | Feasible Choices | Ant's Construction Decision |
|---|---|---|---|---|
| 1 | No bins yet | A=6 |
Open a new bin | Bin 1: A, load 6 |
| 2 | Bin 1: A, load 6 |
B=4 |
Put in Bin 1 because 6+4 <= 10, or open Bin 2 |
Choose Bin 1 if pheromone/heuristic says A with B is good. |
| 3 | Bin 1: A,B, load 10 |
C=5 |
Bin 1 is infeasible because 10+5 > 10; must open Bin 2 |
Bin 2: C, load 5 |
| 4 | Bin 1: A,B, load 10; Bin 2: C, load 5 |
D=3 |
Bin 2 feasible because 5+3 <= 10; Bin 1 infeasible; or open Bin 3 |
Likely choose Bin 2 if fit heuristic favors filling space. |
| 5 | Bin 1: A,B, load 10; Bin 2: C,D, load 8 |
E=2 |
Bin 2 feasible because 8+2 <= 10; Bin 1 infeasible; or open Bin 3 |
Choose Bin 2, giving Bin 2: C,D,E, load 10 |
The complete solution after construction is:
BPP Example: Best-So-Far Configuration and Pheromone Reward
Assume the pheromone is stored on item pairs that share a bin. This avoids depending on arbitrary bin labels like "Bin 1" or "Bin 2".
When placing a new item, the ant can score a candidate bin by combining a fit heuristic with the pheromone between the new item and items already in that bin.
This is why item-pair pheromone is useful in BPP: it remembers which items worked well together, not the arbitrary number of a bin.
Capacity is still 10. Suppose three ants build these complete packings:
| Ant | Complete Packing | Bins Used | Status |
|---|---|---|---|
| Ant 1 | {A,B}, {C,D,E} |
2 |
Best-so-far |
| Ant 2 | {A,D}, {C,E}, {B} |
3 |
Not best |
| Ant 3 | {A,E}, {B,C,D} |
2 |
Could tie, but assume worse fill/tie-break than Ant 1 |
If Ant 1 is treated as the best-so-far solution, ACS rewards the item-pair components inside Ant 1's bins:
| Best-So-Far Bin | Rewarded Item-Pair Components | Why Rewarded? |
|---|---|---|
{A,B} |
AB |
A and B co-occurred in a good bin. |
{C,D,E} |
CD, CE, DE |
Every pair inside this good bin gets reinforced. |
Pairs that are not together in the best-so-far packing do not get the best-trail reward:
For a numeric support example, assume beta = 0.3, gamma = 10, alpha = 0.4, and the converted reward for the best packing is Fitness(Best) = 20. Do not memorize these symbols or values; the exam idea is which components get reinforced.
| Item Pair | Before Update | After Decay | Best Reward? | After Reward | Final Effect |
|---|---|---|---|---|---|
AB | 12 | 0.7*12 + 0.3*10 = 11.4 | Yes | 0.6*11.4 + 0.4*20 = 14.84 | A with B becomes more likely. |
CD | 9 | 0.7*9 + 0.3*10 = 9.3 | Yes | 0.6*9.3 + 0.4*20 = 13.58 | C with D becomes more likely. |
CE | 8 | 0.7*8 + 0.3*10 = 8.6 | Yes | 0.6*8.6 + 0.4*20 = 13.16 | C with E becomes more likely. |
DE | 7 | 0.7*7 + 0.3*10 = 7.9 | Yes | 0.6*7.9 + 0.4*20 = 12.74 | D with E becomes more likely. |
AD | 15 | 0.7*15 + 0.3*10 = 13.5 | No | 13.5 | Decays only; no best-trail reward. |
6. Elitism: Does ACS Use It?
Yes, but not like GA.
GA elitism copies the best individual into the next generation. ACS elitism happens through memory:
So the best solution's components become more likely to be chosen by future ants. This is elitist because one best trail dominates the search direction.
7. Conceptual Extremes the Doctor Is Likely to Ask
| Concept | Extreme Scenario | Behavior | Explore / Exploit |
|---|---|---|---|
| Greedy transition | ACS always chooses the currently best component. | Ants keep following the strongest pheromone/heuristic edge or item grouping. | Maximum exploitation |
| Greedy transition removed | ACS always uses proportional stochastic selection. | Strong components are still favored, but weaker feasible components still have a chance. | More exploration |
| Evaporation / forgetting | No evaporation, or pheromone does not evaporate enough. | Early paths stay strong for too long. If the early best is bad, the colony keeps repeating it. | Exploitation / lock-in risk |
| Evaporation / forgetting | Full evaporation, or pheromone evaporates too much. | The algorithm forgets what it learned. It behaves closer to greedy/random construction with weak memory. | Exploration / weak learning |
| Pheromone vs heuristic | Pheromone has no influence. | The method loses historical memory and behaves closer to GRASP: constructive, greedy/stochastic, but not really learning from past trails. | No learned exploitation |
| Pheromone vs heuristic | Pheromone dominates everything. | The colony follows historically strong components even if the immediate heuristic says they are poor. This can become rich-get-richer lock-in. | Over-exploitation |
| Pheromone reward | All ants reward their components, as in AS. | Many trails remain alive. Good trails get more influence, but weaker trails are not erased immediately. | Broader exploration |
| Pheromone reward | Only the best-so-far trail rewards its components, as in ACS. | The best trail becomes a strong attractor. Future ants are pulled toward the same component set. | Fast exploitation |
| Best-trail reward strength | Reward is too strong. | One trail becomes a bully. Fast early improvement, then premature plateau if it was not truly best. | Over-exploitation |
| Best-trail reward strength | Reward is too weak. | The best trail barely affects future construction. Ants rely mostly on heuristic/stochastic choice. | Weak exploitation |
8. Plot Behavior
AS Plot
Broader search
AS usually improves more gradually and keeps more run-to-run variance. That is expected because all ants deposit pheromone and multiple basins remain reinforced.
ACS Plot
Fast lock-in risk
ACS often improves quickly, then becomes flat. BSF-only deposit creates a strong attractor; if the early trail is suboptimal, the algorithm locks in.
CB vs BSF in ACO Plots
| Curve | What It Means | Expected Behavior |
|---|---|---|
| Current best / iteration best | The best trail constructed in the current cycle. | Can fluctuate, because each cycle rebuilds new trails from pheromone and stochastic choices. |
| Best-so-far | The best trail ever seen up to this cycle. | Must be monotonic in the correct direction. For BPP as bins-used minimization, it should step down or stay flat. |
| Gap between CB and BSF | How far the current colony behavior is from the best memory. | AS may keep a larger gap because it explores more. ACS often shrinks the gap after lock-in because ants keep reconstructing the BSF basin. |
9. Why AS Can Beat ACS
ACS can lose when the early best trail is not globally good.
| ACS | AS |
|---|---|
| Early best trail receives all global deposit. | All ants deposit, so alternatives survive. |
| Pheromone concentrates fast. | Pheromone field stays broader. |
| Good for easy instances where greedy basin is near-optimal. | Better for hard instances with distant basins. |
| Risk: premature lock-in. | Risk: slower convergence. |
10. Toy TSP Example: Where Pheromone Goes
Use a tiny TSP with four cities: A, B, C, D. The component is an edge, such as AB or CD. Pheromone lives on edges.
One iteration: three ants build three tours
| Ant | Tour | Edges Used | Tour Length | Deposit = 1/length |
|---|---|---|---|---|
| Ant 1 | A-B-C-D-A | AB, BC, CD, AD | 20 | 0.050 |
| Ant 2 | A-B-D-C-A | AB, BD, CD, AC | 22 | 0.045 |
| Ant 3 | A-C-B-D-A | AC, BC, BD, AD | 28 | 0.036 |
AS update: all ants deposit
After evaporation, every edge starts the update at 0.80. Then every ant deposits on the edges it used.
| Edge | Used By | New Pheromone in AS | Interpretation |
|---|---|---|---|
| AB | Ant 1 + Ant 2 | 0.80 + 0.050 + 0.045 = 0.895 | Strong, but not alone. |
| BC | Ant 1 + Ant 3 | 0.80 + 0.050 + 0.036 = 0.886 | Alternative edge survives. |
| CD | Ant 1 + Ant 2 | 0.80 + 0.050 + 0.045 = 0.895 | Strong, reinforced by two ants. |
| AD | Ant 1 + Ant 3 | 0.80 + 0.050 + 0.036 = 0.886 | Still competitive. |
| BD | Ant 2 + Ant 3 | 0.80 + 0.045 + 0.036 = 0.881 | Not discarded. |
| AC | Ant 2 + Ant 3 | 0.80 + 0.045 + 0.036 = 0.881 | Not discarded. |
AS effect: best edges get slightly more pheromone, but edges from weaker tours also remain alive. The pheromone field stays broad.
ACS update: only best ant deposits
Ant 1 is the best tour because length 20 is shortest. In ACS, only Ant 1 deposits globally.
| Edge | In Best Tour? | New Pheromone in ACS | Interpretation |
|---|---|---|---|
| AB | Yes | 0.80 + 0.050 = 0.850 | Best trail reinforced. |
| BC | Yes | 0.80 + 0.050 = 0.850 | Best trail reinforced. |
| CD | Yes | 0.80 + 0.050 = 0.850 | Best trail reinforced. |
| AD | Yes | 0.80 + 0.050 = 0.850 | Best trail reinforced. |
| BD | No | 0.800 | No deposit. |
| AC | No | 0.800 | No deposit. |
ACS effect: pheromone becomes concentrated on the best tour's edges. This is exploitation. If Ant 1 is truly good, ACS converges fast. If Ant 1 was just an early lucky local optimum, ACS may lock in too early.
Next iteration: what changes?
| Question | AS | ACS |
|---|---|---|
| Which edges are more likely? | Several edges: AB/CD strongest, but BC/AD/BD/AC still close. | Mainly best-tour edges: AB, BC, CD, AD. |
| Search behavior | More exploration; several tours remain plausible. | More exploitation; ants are pulled toward A-B-C-D-A. |
| Plot expectation | Slower improvement, more variance. | Fast early improvement, then possible flat lock-in. |
11. HW#6 Story
Your HW#6 story should not be just "ACS got good numbers." Say why the behavior happened.
- ACS was strong because bin packing fits constructive search.
- The size heuristic acted like an FFD prior.
- Pheromone reinforced useful co-location patterns.
- ACS exploited quickly and often matched the greedy basin.
- AS sometimes beat ACS because democratic deposit prevented early lock-in.
- More iterations matter because pheromone needs feedback cycles. Many ants in one cycle do not replace multiple generations of learning.
| Observation | Exam Meaning |
|---|---|
| ACS matched or beat FFD on most HW#6 instances. | BPP works well with constructive search because the algorithm can avoid infeasible placements while building the solution. |
| Pheromone helped most clearly on the instance where useful co-location patterns were not already captured by FFD. | ACO helps when memory adds information beyond the greedy heuristic. |
| On some instances AS could beat ACS. | ACS can lock into a good-looking early grouping, while AS keeps alternative groupings alive longer. |
| ACS can be effective but not necessarily efficient. | The exam distinction is effectiveness vs efficiency: good final quality does not mean low runtime. |
12. Common Exam Prompts
Q: Did ACS use elitism?
Yes. ACS uses elitism through BSF-only pheromone deposit. The best trail's components dominate future construction.
Q: What does q0 = 1 do?
It makes ACS choose the best component greedily every time. Maximum exploitation, high premature lock-in risk.
Q: What does q0 = 0 do?
It removes the greedy branch and uses proportional selection. More exploration. But it is only AS-like in transition, not fully AS if ACS still uses modified evaporation and BSF-only reinforcement.
Q: Why do pheromones evaporate?
Evaporation is forgetting. Without it, early trails can stay strong forever and cause premature convergence. With too much evaporation, ACO forgets what it learned.
Q: What does rewarding do?
Rewarding increases pheromone on components found in good solutions. In AS, many ants reward, so search stays broader. In ACS, only the best rewards, so search becomes more exploitative.
Q: Which is more explorative?
AS, because all ants deposit and the pheromone field stays broad.
Q: Which is more exploitative?
ACS, because q0 enables greedy selection and BSF-only deposit reinforces one best trail.
Q: Is ACO generational?
Not in the GA replacement sense. Ants construct new trails each cycle, but the stable adaptive object is the pheromone table over components.
Q: What is death in ACO?
A component is not physically removed. Its pheromone can decay so much that it is rarely selected. That is the ACO version of practical death.
Q: What is the schema in ACO?
The schema is a useful repeated component pattern stored in pheromone, such as a TSP edge or BPP item-pair that appears in good solutions.
Q: More ants or more cycles?
For ACO, more cycles are usually more important than one huge ant batch, because pheromone needs repeated feedback to become useful memory.
13. Final One-Paragraph Answer
ACO is constructive search with indirect memory: ants build a complete solution one component at a time, then pheromone adapts the attractiveness of the components used in good trails. AS is democratic: all ants deposit pheromone after evaporation, so the search remains broader and more explorative. ACS is elitist: it uses a q0 greedy transition rule, modified evaporation, and best-so-far reinforcement. This makes ACS faster and more exploitative, but it can prematurely lock into an early suboptimal trail. In HW#6, ACS worked well because bin packing benefits from constructive item-pair memory and a size heuristic, but AS sometimes beat ACS because democratic deposit preserved alternative item groupings.