Approx Knwl

by Daniel V Fryer

Quotes

Quotes can lose their intended meaning when taken out of context. Follow links to the source.


“Since all models are wrong the scientist cannot obtain a ‘correct’ one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity.”
- George Box, Science and Statistics, JASA (1976)

Much of machine learning kicks off with an overparametrised model, without explicitly aiming to learn any simple representation. Exempting the trained model from any parsimony constraint may be justified when the desired products are black-box predictions or control systems. But if the learning is to be interpreted as an act of scientific exploration, then the human trainer is practising, at best, a science of mediocrity. The initial overparametrisation may be justified as providing (or resembling) cognitive flexibility, but good parsimony constraints will be a hallmark of any scientific hypothesis-generating machine.


“Whenever there is a simple error that most laymen fall for, there is always a slightly more sophisticated version of the same problem that experts fall for.”
- Amos Tversky


“To lift an autumn hair is no sign of great strength; to see the sun and moon is no sign of sharp sight; to hear the noise of thunder is no sign of a quick ear. What the ancients called a clever fighter is one who not only wins, but excels in winning with ease. Hence his victories bring him neither reputation for wisdom nor credit for courage: he wins his battles by making no mistakes.”
- Sun Tzu, The Art of War

The above quote, by Sun Tzu, is the ancient Chinese warfare equivalent of the quote below, by Spivak.


“There are good reasons why the theorems should all be easy and the definitions hard. As the evolution of Stokes’ Theorem revealed, a single simple principle can masquerade as several difficult results; the proofs of many theorems involve merely stripping away the disguise. The definitions, on the other hand, serve a twofold purpose: they are rigorous replacements for vague notions, and machinery for elegant proofs.”
- Michael Spivak, Calculus on Manifolds (1965)


“Mathematics is the study of things that come out a certain way because there is no other way they could possibly be.” - Jordan Ellenberg, How not to be Wrong.

This quote reminds me of a feeling I often get when writing or reading a good proof. It’s as if the universe wants to maximise the number of possibilities available to it, while I methodically reduce its options with a deliberate and steady advance of reason. The struggling universe finds itself backed into a corner where its options reduce to one, and that is a proof.


“Mathematics is the art of giving the same name to different things.” - Henri Poincaré

Clearly the quote from Jordan Ellenberg is better.


“Practicing diplomats and politicians have often found their margin of success depends on their ability to seize upon unique twists in the political or psychological environment, twists which allow the individual to escape the seemingly uncontrollable tide of events that one would expect a mathematical model to extrapolate… one achieves a greater ‘benefit’ within a given ‘cost-constraint,’ not by being tight and precise about budgeting one’s resources, but rather by preserving the detachment and freedom one will need in order to seize upon whole new options.”
- Paul J Werbos, Chapter V of his PhD thesis (1974)

Paul Werbos was the first (in the US) to propose using the Backpropagation algorithm for neural nets. He did so in his 1974 PhD thesis titled ‘Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences.’ The above quote was part of an argument for why political scientists may be skeptical of mathematical models. Werbos goes on to critique the ability of regression models to predict the future, and to discuss the nature of ‘truth’ in statistics.


“There is an old saying ‘If all a man has is a hammer, then every problem looks like a nail.’ The trouble for statisticians is that recently some of the problems have stopped looking like nails. I conjecture that the result of hitting this wall is that more complicated data models are appearing in current published applications. Bayesian methods combined with Markov Chain Monte Carlo are cropping up all over. This may signify that as data becomes more complex, the data models become more cumbersome and are losing the advantage of presenting a simple and clear picture of nature’s mechanism.”
- Leo Breiman, Statistical Modeling: The Two Cultures, Statistical Science (2001)


“The gap between the automatic differentiation community and the neural network community reminds me of a split I once saw between some people making aircraft engines and people making aircraft bodies. When the engine people work on their own, without integrating their work with the airframes, they will find only limited markets for their product. The same goes for airframe people working alone. Only when the engine and the airframe are combined together, into an integrated product, can we obtain a real airplane – a product of great power and general interest.”
- Paul J Werbos, Backwards Differentiation in AD and Neural Nets: Past Links and New Opportunities (2004)


“Hamilton, who discovered quaternions, invented the word associative to disassociate himself from octonions, which are not… Closer to home — most operations on floating-point numbers are not associative on modern computers because of rounding errors.”
- Bartosz Milewski, Monoids on Steroids, Programming Cafe Blog (2017)
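Milewski’s floating-point remark is easy to verify: on any machine using IEEE-754 doubles, each addition rounds its result, so the grouping of operations changes the answer. A minimal demonstration (Python shown, but the same holds in any language backed by hardware doubles):

```python
# Floating-point addition is not associative: each operation rounds,
# so different groupings accumulate different rounding errors.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.1 + 0.2 rounds up to 0.30000000000000004
right = a + (b + c)  # 0.2 + 0.3 rounds to exactly 0.5

print(left == right)  # False
print(left, right)    # 0.6000000000000001 0.6
```

This is exactly why floats fail to form a monoid under addition, despite having an identity element.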


“1936 - Alan Turing invents every programming language that will ever be but is shanghaied by British Intelligence to be 007 before he can patent them. 1936 - Alonzo Church also invents every language that will ever be but does it better. His lambda calculus is ignored because it is insufficiently C-like. This criticism occurs in spite of the fact that C has not yet been invented.”
- James Iry, A Brief, Incomplete, and Mostly Wrong History of Programming Languages (2009)


“Many engineers are willing to try out very complex designs requiring fast derivatives, when using neural networks but not when using other kinds of nonlinear models, simply because backpropagation for neural networks is available ‘off the shelf’ with no work required on their part. A more general kind of integrated software system, allowing a wide variety of user-specified modeling modules, and compiling dual subroutines for each module type and collections of modules, could overcome these barriers.”
- Paul Werbos, Backwards Differentiation in AD and Neural Nets: Past Links and New Opportunities (2004)