Optimization theory, control theory
This is more an overview for my own use than material for teaching or exercises.
Other data analysis, data summarization, learning
- 1 State observers / state estimation; filters
- 2 Optimization theory, control theory
- 2.1 Glossary
- 2.2 Some controllers
- 2.3 Notes on
- 2.4 See also
- 2.5 Log-Linear and Maximum Entropy
- 2.6 Reinforcement learning (RL)
State observers / state estimation; filters
Bayes estimator, Bayes filter
An alpha beta filter (a.k.a. f-g filter, g-h filter)
Multi hypothesis tracking
Optimization theory, control theory
In terms of the (near) future:
- greedy control doesn't really look ahead.
- PID can be tuned for some basic tendencies
- MPC tries to minimize mistakes in predicted future
- For example, take an HVAC system that actively heats but only passively cools. An overshoot can then only be corrected slowly, so you have to be very careful about overshooting. You would typically make the system sluggish, which also reduces performance because it lengthens response and settling time.
Greedy control doesn't look ahead; it just minimizes the error for the current step.
For example, basic proportional adjustment.
It tends not to be stable.
It can be stable enough for certain cases, in particular very slow systems where slow control is fine and accuracy is not so important.
For example, water boilers have such a large volume that even a bang-bang controller (turn the heater element fully on or off according to a temperature threshold) will keep the water within a few degrees of that threshold, simply because the water's heat capacity is large in relation to the heating element you'd probably use.
But this does not hold more widely: that same boiler with a small volume, or a powerful heater, means such control causes unproductive feedback, e.g. oscillations when actuation acts about as fast as, or faster than, measurement.
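The boiler argument above can be checked with a small simulation. This is a minimal sketch, not a model of any real boiler: the function names, the 2 kW heater, the 5 W/K loss coefficient, the 10-second sample period and the 200 L versus 1 L volumes are all assumptions picked for illustration.

```python
WATER_J_PER_KG_K = 4186.0  # specific heat of water; 1 L is taken as 1 kg


def simulate_bang_bang(heat_capacity, steps=2000, dt=10.0, threshold=60.0,
                       heater_power=2000.0, loss_coeff=5.0, ambient=20.0,
                       start_temp=59.0):
    """Simulate a heater switched fully on/off by a single threshold.

    heat_capacity is in J/K, powers in W, dt in seconds (all assumed values).
    The controller samples the temperature once per step.
    """
    temp = start_temp
    history = []
    for _ in range(steps):
        heater_on = temp < threshold            # the whole "controller"
        power_in = heater_power if heater_on else 0.0
        power_out = loss_coeff * (temp - ambient)
        temp += dt * (power_in - power_out) / heat_capacity
        history.append(temp)
    return history


def swing(history, tail=500):
    """Peak-to-peak temperature over the last `tail` steps."""
    settled = history[-tail:]
    return max(settled) - min(settled)


swing_large = swing(simulate_bang_bang(200 * WATER_J_PER_KG_K))  # 200 L tank
swing_small = swing(simulate_bang_bang(1 * WATER_J_PER_KG_K))    # 1 L tank
print(f"large tank swing: {swing_large:.3f} K")
print(f"small tank swing: {swing_small:.3f} K")
```

With these assumed numbers the large tank barely moves between samples, while the small tank jumps several degrees in a single heater step, which is the oscillation described above.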
Hysteresis control (type)
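A common way to tame the rapid switching of a single-threshold bang-bang controller is to use two thresholds with a dead band between them. The sketch below is illustrative only; the class name and the 58/62 degree thresholds are assumptions.

```python
class HysteresisController:
    """On/off control with separate switch-on and switch-off thresholds."""

    def __init__(self, low, high):
        if low >= high:
            raise ValueError("low must be below high")
        self.low = low
        self.high = high
        self.on = False

    def update(self, measurement):
        """Return the new on/off state for this measurement."""
        if measurement <= self.low:
            self.on = True       # too cold: switch on
        elif measurement >= self.high:
            self.on = False      # warm enough: switch off
        # inside the band: keep whatever state we were in
        return self.on


controller = HysteresisController(low=58.0, high=62.0)
readings = [57.0, 59.0, 61.0, 62.5, 61.0, 59.0, 57.5]
states = [controller.update(t) for t in readings]
print(states)
```

Inside the band the output depends on where the measurement came from, so small fluctuations around one value no longer toggle the actuator.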
Map-based controller (type)
PID is a fairly generic control-loop system, still widely used in industrial control systems.
It is useful in systems that have to deal with delays between and/or in actuation and sensing, where it can typically be tuned to work better than a greedy controller (and can also be tuned to work worse), because unlike greedy control, you can try to tune out overshoots as well as oscillations.
PID is computationally very cheap (a few adds and multiplies per step), compared to some other cleverer methods.
- There are no simple guarantees of optimality or stability,
- you have to tune them,
- and learn how to tune them.
- tuning is complex in that it depends on
- how fast the actuation works
- how fast you sample
- how fast the system changes/settles
- doesn't deal well with long time delays
- derivative component is sensitive to noise, so filtering may be a good idea
- has trouble controlling complex systems
- more complex systems should probably look to MPC or similar.
- linear at heart (assumes measurement and actuation are relatively linear)
- so doesn't perform so well in non-linear systems
- symmetric at heart, so not necessarily well-suited to non-symmetric actuation
- consider e.g. an HVAC system, which under symmetric control would oscillate around its target by alternately heating and cooling.
- It is much more power efficient to do one of the two passively, e.g. active heating and passive cooling (if it's cold outside), or active cooling and passive heating (if it's warmer outside)
- this means it's easier to overshoot, and the system is more likely to linger off-setpoint where only the passive side can correct it, so on average it sits on one side of the setpoint
- You could make the system sluggish -- in this case it reduces the speed at which it reaches the setpoint, but that is probably acceptable to you.
- in other words: a sluggish system and/or a bias to one side
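The point about the derivative component being sensitive to noise can be illustrated with a first-order low-pass filter on the derivative estimate. This is a minimal sketch; the function name, the smoothing factor `alpha` and the synthetic alternating "noise" are assumptions, and real controllers may filter differently.

```python
def filtered_derivative(samples, dt, alpha=0.2):
    """Finite-difference derivative, smoothed by an exponential moving average.

    alpha is in (0, 1]; alpha=1.0 means no filtering at all.
    """
    estimates = []
    smoothed = 0.0
    for previous, current in zip(samples, samples[1:]):
        raw = (current - previous) / dt
        smoothed += alpha * (raw - smoothed)   # low-pass the raw derivative
        estimates.append(smoothed)
    return estimates


def spread(values, tail=50):
    """Peak-to-peak range over the last `tail` values."""
    settled = values[-tail:]
    return max(settled) - min(settled)


# A ramp with slope 1 per second, plus alternating +/-0.05 measurement noise.
dt = 0.1
samples = [i * dt + (0.05 if i % 2 else -0.05) for i in range(200)]

raw = filtered_derivative(samples, dt, alpha=1.0)
smooth = filtered_derivative(samples, dt, alpha=0.1)
print(f"unfiltered spread: {spread(raw):.3f}")
print(f"filtered spread:   {spread(smooth):.3f}")
```

The trade-off is that heavier filtering (smaller `alpha`) also delays the derivative term, which eats into the benefit it was supposed to give.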
The idea is to adjust the control output based on some function of the error. A Proportional–Integral–Derivative (PID) controller combines the three components it names, each scaled by its own weight (gain).
The very short version is that
- P adjusts in proportion to the current error
- I adjusts according to the integral of the error (the error accumulated over time)
- D adjusts according to the derivative of the error (how fast the error is changing)
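It can be summarized as:
u(t) = P \, e(t) + I \int_0^t e(\tau) \, d\tau + D \, \frac{d e(t)}{dt}
- u(t) is the control output
- e(t) is the error (setpoint minus measurement)
- P, I, and D are scalar weights controlling how much effect each component has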
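The three terms described above can be sketched in discrete time as follows. This is a minimal illustration, not a production controller: the class and parameter names (`kp`, `ki`, `kd` for the P, I and D weights), the gains, and the first-order test plant are all assumptions, and it leaves out practical details such as output clamping and integral anti-windup.

```python
class PID:
    """Textbook discrete PID: rectangle-rule integral, backward-difference derivative."""

    def __init__(self, kp, ki, kd, dt):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.dt = dt
        self.integral = 0.0
        self.previous_error = None

    def update(self, setpoint, measurement):
        """Return the control output for one sample."""
        error = setpoint - measurement
        self.integral += error * self.dt
        if self.previous_error is None:
            derivative = 0.0     # no history yet; avoids a spike on the first sample
        else:
            derivative = (error - self.previous_error) / self.dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps=2000, dt=0.01, setpoint=1.0):
    """Drive an assumed first-order plant dx/dt = -x + u toward the setpoint."""
    pid = PID(kp=2.0, ki=1.0, kd=0.1, dt=dt)
    x = 0.0
    history = []
    for _ in range(steps):
        u = pid.update(setpoint, x)
        x += dt * (-x + u)       # explicit Euler step of the plant
        history.append(x)
    return history


history = simulate()
print(f"final value after 20 s: {history[-1]:.4f}")
```

Without the integral term this plant would settle short of the setpoint, since a purely proportional output needs a nonzero error to hold any nonzero actuation.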
So how do you tune it?
MPC (Model Predictive Control)
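The description above, that MPC tries to minimize mistakes in the predicted future, can be sketched as a brute-force receding-horizon controller for the heat-only case. Everything here is an assumption made for illustration: the one-line thermal model, its coefficients, the on/off actuator, the five-step horizon, and the cost that penalizes overshoot ten times more than undershoot. Practical MPC uses a proper optimizer rather than enumerating every plan.

```python
from itertools import product


def predict(temp, heater_on, outside=10.0, loss=0.1, heat=2.0):
    """Assumed one-step model: heater adds `heat`, the room leaks toward `outside`."""
    return temp + heat * heater_on - loss * (temp - outside)


def cost(temp, setpoint, overshoot_weight=10.0):
    """Squared error, weighted more heavily above the setpoint (cooling is passive)."""
    error = temp - setpoint
    weight = overshoot_weight if error > 0 else 1.0
    return weight * error * error


def mpc_action(temp, setpoint, horizon=5):
    """Try every on/off plan over the horizon; return the first move of the cheapest."""
    best_cost = float("inf")
    best_first = 0
    for plan in product((0, 1), repeat=horizon):
        predicted = temp
        total = 0.0
        for heater_on in plan:
            predicted = predict(predicted, heater_on)
            total += cost(predicted, setpoint)
        if total < best_cost:
            best_cost = total
            best_first = plan[0]
    return best_first


def simulate(start=15.0, setpoint=20.0, steps=40):
    """Apply only the first planned move each step, then re-plan."""
    temp = start
    history = []
    for _ in range(steps):
        temp = predict(temp, mpc_action(temp, setpoint))
        history.append(temp)
    return history


history = simulate()
print([round(t, 2) for t in history[:10]])
```

Re-planning at every step is what makes this a receding horizon: only the first move of each plan is ever applied, so prediction errors get corrected at the next sample.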
FLC (Fuzzy Logic Control)
Log-Linear and Maximum Entropy
Reinforcement learning (RL)
- LP Kaelbling, ML Littman, AW Moore. (1996) Reinforcement learning: A survey.
- RS Sutton, AG Barto (1998) Reinforcement Learning: An Introduction