Transcript-first notes for DE: adaptive mutation, parent-child competition, alpha extremes, BPP adaptation, HW#5 observations, and plot behavior.
d = a + alpha*(b - c). The population's spread controls the move size: scattered population means large exploratory moves; converged population means small exploitative moves.
For each target parent Q_i, DE chooses three other individuals a, b, and c, builds a mutant, crosses it with the parent, then keeps whichever is better.
Unlike GA, DE is not mainly "generation replaces generation." It is closer to the same parent adapting its location: each parent gets one trial child, and the better one survives.
| Symbol | Meaning | Exam interpretation |
|---|---|---|
Q_i | The target parent being updated. | The owner of the child; it competes at the end. |
a | Base vector / starting point. | The area DE is exploiting around. |
b - c | Difference vector from two other individuals. | Direction and distance of the move. |
alpha | Mutation factor. | Scales exploration/exploitation. |
| Crossover | Mix mutant with parent. | Preserves some parent characteristics. |
| Selection | Trial vs parent duel. | Elitist local replacement. |
Exploration
The population is scattered. Randomly selected b and c are likely far apart, so b-c is large. Mutants can jump across regions.
Exploitation
The population is clustered. Randomly selected b and c are close, so b-c is small. Mutants fine-tune locally.
The doctor spent time on this because it is easy to misunderstand. The parent is "competing" in DE, but not at the beginning of mutant construction.
| Stage | What happens | Why it matters |
|---|---|---|
| Mutant construction | Use a, b, c from other individuals. |
Gives freedom to cross areas of the space instead of staying centered on the parent. |
| Crossover | Mix d with a copy of Q_i. |
The child keeps some parent traits. |
| Final duel | Compare trial child against Q_i. |
If the child is worse, the parent survives. Good information is not lost. |
a?":
If a = Q_i, the mutant is built around the parent, then crossed with the parent, then compared with the parent. That over-centers the search on the parent and becomes closer to hill climbing. The standard version keeps the parent out of the base vector to preserve exploration.
| Exam wording | Mutation meaning | Behavior | Explore / Exploit |
|---|---|---|---|
No difference-vector influence (alpha = 0) |
d = a |
The difference direction disappears. Trial only comes from crossover between a and parent. |
Weak movement |
Small difference-vector influence (alpha small but not zero) |
A tiny step in the direction of b-c. |
DE still uses population information, but the move is cautious and close to the base vector. | Local / conservative movement |
Moderate difference-vector influence (alpha around 0.5) |
Half of the difference vector. | Moderate movement. This is often a balanced region. | Balanced |
Full difference-vector influence (alpha around 1) |
Full difference vector. | More aggressive movement within the common range. | More exploration |
Very strong difference-vector influence (alpha very large) |
Huge scaled difference. | Mutants jump far away, may leave useful regions or require repair/clamping. | Excessive exploration |
| No population spread / all individuals similar | b-c = 0 |
No differential movement. DE is frozen unless diversity is injected. | Stagnation |
| Small population spread / individuals nearly similar | b-c is small. |
The adaptive step automatically shrinks. DE becomes more local even if the scaling factor is not small. | Late-stage exploitation |
b-c; "scale of that difference" means alpha; "small scaling" means cautious movement even if the population is diverse; "small population spread" means b-c is tiny because individuals are already similar.
a, b, and cThe default algorithm chooses the three helper individuals randomly. The transcript makes clear that this is a design lever, not a sacred rule.
| Choice rule | Behavior | Doctor-style explanation |
|---|---|---|
| Pick randomly from the full population. | Most exploration | Any area can influence the next move. |
Pick a by roulette/rank fitness. |
Biased but not greedy | Good individuals are favored, but weaker ones still have a chance. |
| Pick from top-P% then random, GRASP-style. | Controlled exploitation | Greedy because we restrict to good individuals; random because we still sample within them. |
| Pick only from the best few. | Strong exploitation | Fast convergence but high premature convergence risk. Top 3% is almost hill climbing around elites. |
a from better individuals exploits better regions. Selecting b and c from diverse individuals preserves large useful directions. If all three are chosen only from elites, DE may converge faster but lose exploration.
After the mutant d is built, DE does crossover with the parent. This matters because the mutant may be useful in some dimensions and bad in others.
| Without crossover | With crossover |
|---|---|
The child is just d. It may lose the parent's useful structure entirely. |
The child can keep some parent dimensions and take some mutant dimensions. |
More like pure mutation from a. |
Communication between the parent and the population-generated direction. |
| Less continuity. | Parent traits survive even before the final duel. |
Each item gets a real key. Sort keys to get an item order. Decode the order with First Fit or First Fit Decreasing.
Strength: easy, valid DE arithmetic.
Weakness: small key changes can cause large order changes, and many different key vectors decode to the same packing.
Define difference over item-pair co-location, bin groupings, or moves. Then apply a fraction of those features to a base packing and repair if needed.
Strength: closer to the real BPP structure.
Weakness: must handle duplicates, missing items, and capacity repair.
Suppose two good packings repeatedly place items 2 and 5 together. That co-location is a schema. A discrete DE operator can preserve or transplant that item-pair schema into another base packing.
| Continuous DE idea | BPP adaptation |
|---|---|
b-c | Features where two packings differ: item pairs, bin groupings, item moves. |
alpha | How many of those differences/features to apply. |
a + ... | Start from base packing a, then insert selected features. |
| Crossover | Mix trial packing with target parent, then repair duplicates/missing items/capacity violations. |
| Plot signature | Likely DE diagnosis | Reason |
|---|---|---|
| Large noisy current-best changes early. | Population is spread out, or alpha is high. | b-c is large, so candidate moves are exploratory. |
| Smoother improvement later. | Population converged into a promising region. | b-c shrinks, so DE fine-tunes. |
| Best-so-far flat for a long time. | No successful trial beats its parent. | Could mean alpha too high, alpha too low, or population already collapsed. |
| Population best cannot get worse after replacement. | Elitist parent-child selection is working. | If the trial is worse, the parent remains. |
| Premature plateau after selecting helpers only from elites. | Too much exploitation. | All moves are built around the same elite region. |
For DE, population size and number of iterations affect different things.
| Setup | Behavior | Risk |
|---|---|---|
| Large population, few iterations | Good initial coverage and large b-c vectors. |
Not enough time for successful trials to refine the population. |
| Small population, many iterations | More repeated refinement around found regions. | Population can collapse early; then b-c becomes tiny and DE stagnates. |
| Balanced | Enough spread for useful directions, enough iterations for refinement. | Usually safest unless the problem strongly demands exploration or exploitation. |
Your HW#5 answer should not be "DE was good" or "PSO was good." The important observation is that the encoding dominated both algorithms.
DE is adaptive mutation in real-valued vector space: d = a + alpha*(b-c), then crossover with the parent, then child-parent competition.
The mutation step comes from b-c. If the population is spread out, the step is large. If the population is clustered, the step is small.
The difference vector disappears. The mutant becomes the base vector a, so movement is weak and less directed by population spread.
The differential term dominates. Mutants jump too far and behave like excessive exploration or near-random search.
So a bad trial does not replace a good parent. The parent still contributes through crossover, but it is not used as the base vector because that would over-exploit the parent region.
Use random keys and decode with First Fit, or define a discrete difference over item-pair schemas/moves and repair infeasible packings.
DE and PSO were close because random-key encoding dominated. The representation limited both methods more than the optimizer separated them.
Yes, locally. Each trial must beat its immediate parent; otherwise the parent survives.
If individuals are identical or nearly identical, b-c becomes zero or tiny. DE loses its mutation power and stagnates.
No. It is a convenient encoding trick. A more native BPP adaptation defines difference over item groupings, schemas, or moves.
Differential Evolution is a population method for real-valued vectors. For each parent, it builds a mutant d = a + alpha*(b-c), crosses it with the parent, and keeps the better of the trial and parent. The key idea is adaptive mutation: when the population is scattered, b-c is large and DE explores; when the population converges, b-c is small and DE exploits. For bin packing, direct subtraction is meaningless, so we either use random-key encoding or define a custom discrete difference over item groupings and repair the result. In HW#5, DE and PSO were close because both were limited by random-key encoding.