From 045d4b9a0cf918283bc00564f3cf3814a21d6c8d Mon Sep 17 00:00:00 2001
From: Johann Dreo <johann.dreo@thalesgroup.com>
Date: Mon, 6 Jan 2020 12:07:25 +0100
Subject: [PATCH] Lesson summary/syllabus.

---
 LESSON.md | 208 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 208 insertions(+)
 create mode 100644 LESSON.md

diff --git a/LESSON.md b/LESSON.md
new file mode 100644
index 0000000..b1cde3c
--- /dev/null
+++ b/LESSON.md
@@ -0,0 +1,208 @@
+Metaheuristics (IA-308)
+=======================
+
+Introduction
+------------
+
+Metaheuristics are mathematical optimization algorithms solving `$\argmin_{x \in X} f(x)$` (or argmax).
+
+Synonyms:
+- search heuristics,
+- evolutionary algorithms,
+- stochastic local search.
+
+The general approach is to only look at the solutions, by trial and error, without further information on its structure.
+Hence the problem is often labelled as "black-box".
+
+Link to NP-hardness/curse of dimensionality: easy to evaluate, hard to solve.
+Easy to evaluate = fast, but not as fast as the algorithm itself.
+Hard to solve, but not impossible.
+
+
+Algorithmics
+------------
+
+Those algorithms are randomized and iteratives (hence stochastics) and manipulates a sample (synonym population)
+of solutions (s. individual) to the problem, each one being associated with its quality (s. cost, fitness).
+
+Thus, algorithms have a main loop, and articulate functions that manipulates the sample (called "operators").
+
+Main design problem: exploitation/exploration compromise (s. intensification/diversification).
+Main design goal: raise the abstraction level.
+Main design tools: learning (s. memory) + heuristics (s. bias).
+
+Forget metaphors and use mathematical descriptions.
+
+Seek a compromise between complexity, performances and explainability.
+
+The is no better "method".
+Difference between model and instance, for problem and algorithm.
+No Free Lunch Theorem.
+But there is a "better algorithm instances on a given problem instances set".
+
+The better you understand it, the better the algorithm will be.
+
+
+Problem modelization
+--------------------
+
+Way to assess the quality: fitness function.
+Way to model a solution: encoding.
+
+
+### Main models
+
+Encoding:
+- continuous (s. numeric),
+- discrete metric (integers),
+- combinatorial (graph, permutation).
+
+Fitness:
+- mono-objective,
+- multi-modal,
+- multi-objectives.
+
+
+### Constraints management
+
+Main constraints management tools for operators:
+- penalization,
+- reparation,
+- generation.
+
+
+Performance evaluation
+----------------------
+
+### What is performance
+
+Main performances axis:
+- time,
+- quality,
+- probability.
+
+Additional performance axis:
+- robustness,
+- stability.
+
+Golden rule: the output of a metaheuristic is a distribution, not a solution.
+
+
+### Empirical evaluation
+
+Proof-reality gap is huge, thus empirical performance evaluation is gold standard.
+
+Empirical evaluation = scientific method.
+
+Basic rules of thumb:
+- randomized algorithms => repetition of runs,
+- sensitivity to parameters => design of experiments,
+- use statistical tools,
+- design experiments to answer a single question,
+- test one thing at a time.
+
+### Useful statistical tools
+
+Statistical tests.
+- classical null hypothesis: test equality of distributions.
+- beware of p-value.
+
+How many runs?
+- not always "as many as possible",
+- maybe "as many as needed",
+- generally: 15 (min for non-parametric tests) -- 20 (min for parametric-gaussian tests).
+
+Use robust estimators: median instead of mean, Inter Quartile Range instead of standard deviation.
+
+
+### Expected Empirical Cumulative Distribution Functions
+
+On Run Time: ERT-ECDF.
+```
+$ERTECDF(\{X_0,\dots,X_i,\dots,X_r\}, \delta, f, t) := \#\{x_t \in X_t | f(x_t^*)>=\delta \}$
+$\delta \in [0, max_{x \in \mathcal{X}}(f(x))]$
+$X_i := \{\{ x_0^0, \dots, x_i^j, \dots, x_p^u | p\in[1,\infty[ \} | u \in [0,\infty[ \} \in \mathcal{X}$
+```
+with $p$ the sample size, $r$ the number of runs, $u$ the nubmer of iterations, $t$ the number of calls to the objective
+function.
+
+The number of calls to the objective function is a good estimator of time because it dominates all other times.
+
+The dual of the ERT-ECDF can be easily computed for quality (EQT-ECDF).
+
+3D ERT/EQT-ECDF may be useful for terminal comparison.
+
+
+### Other tools
+
+Convergence curves: do not forget the golden rule and show distributions:
+- quantile boxes,
+- violin plots,
+- histograms.
+
+
+Algorithm Design
+----------------
+
+### Neighborhood
+
+Convergence definition(s).
+- strong,
+- weak.
+
+Neighborhood: subset of solutions atteinable after an atomic transformation:
+- ergodicity,
+- quasi-ergodicity.
+
+
+### Structure of problem/algorithms
+
+Structure of problems to exploit:
+- locality (basin of attraction),
+- separability,
+- gradient,
+- funnels.
+
+Structure with which to capture those structures:
+- implicit,
+- explicit,
+- direct.
+
+Silver rule: choose the algorithmic template that adhere the most to the problem model.
+- taking constraints into account,
+- iterate between problem/algorithm models.
+
+
+### Grammar of algorithms
+
+Parameter setting < tuning < control.
+
+Portfolio approaches.
+Example: numeric low dimensions => Nelder-Mead Search is sufficient.
+
+Algorithm selection.
+
+Algorithms are templates in which operators are interchangeable.
+
+Most generic way of thinking about algorithms: grammar-based algorithm selection with parameters.
+Example: modular CMA-ES.
+
+Parameter setting tools:
+- ParamILS,
+- SPO,
+- i-race.
+
+Design tools:
+- ParadisEO.
+
+
+### Landscape-aware algorithms
+
+Fitness landscapes: structure of problems as seen by an algorithm.
+Features: tool that measure one aspect of a fitness landscape.
+
+We can observe landscapes, and learn which algorithm instance solves it better.
+Examples: SAT, TSP, BB.
+
+Toward automated solver design.
+