## Thursday, August 21, 2014

### Calling all statisticians and probabilists

The following is a question I posted at the Mathematics Stack Exchange. Folks who see it here are likely to understand that my "Making Less of No Free Lunch" campaign is intended as much to hose down a creationist cloud of dust as to rain on the NFL parade. Please contribute a bit of your expertise to a worthy cause.

I hope to replace a proof of my own, in a paper explaining that the "no free lunch" theorems for optimization actually address sampling and statistics, with a reference to an existing result on sampling. The random selection process $$X_1, X_2, \dotsc, X_n$$ over the domain of the random objective function $F$ is statistically independent of the selected values, $$F(X_1), F(X_2), \dotsc, F(X_n),$$ despite the functional dependence $$X_i \equiv X(F(X_1), F(X_2), \dotsc, F(X_{i-1})),$$ $1 < i \leq n,$ where $X$ is statistically independent of $F.$ (See Section 0.1 here [or my last post] for more detail.)

To put this in terms of sampling, $F$ associates members of a statistical population with their unknown responses. The sampler $X$ processes data already in the sample to extend the selection. This typically biases the selection. But the selection process is statistically independent of the sequence of responses. That is the justification for referring to the latter as a sample.
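
A toy enumeration may make the distinction concrete. This is only a sketch under assumptions the question does not state: $F$ is taken to be uniform over all functions from a three-point domain to $\{0, 1\}$, the samplers never revisit a point, and every name here (`fixed_sampler`, `adaptive_sampler`, `run`, `distributions`) is hypothetical. It tallies, over all equally likely functions, which points each sampler selects and which response sequences it records.

```python
from collections import Counter
from itertools import product

DOMAIN = (0, 1, 2)   # hypothetical population members
VALUES = (0, 1)      # hypothetical responses
N = 2                # sample size

def fixed_sampler(observed):
    """Ignores the data: selects points 0, 1, ... in order."""
    return len(observed)

def adaptive_sampler(observed):
    """Starts at point 0, then lets the first response choose the next point."""
    if not observed:
        return 0
    return 1 if observed[0] == 0 else 2

def run(sampler, f, n=N):
    """Return (selected points, selected values) for one objective function f."""
    points, values = [], []
    for _ in range(n):
        x = sampler(tuple(values))  # next selection depends only on prior values
        points.append(x)
        values.append(f[x])
    return tuple(points), tuple(values)

def distributions(sampler):
    """Enumerate every f (assumed equally likely); tally selections and responses."""
    selections, responses = Counter(), Counter()
    for f in product(VALUES, repeat=len(DOMAIN)):
        points, values = run(sampler, f)
        selections[points] += 1
        responses[values] += 1
    return selections, responses

fixed_sel, fixed_resp = distributions(fixed_sampler)
adapt_sel, adapt_resp = distributions(adaptive_sampler)

print("fixed selections:   ", dict(fixed_sel))
print("adaptive selections:", dict(adapt_sel))
print("fixed responses:    ", dict(fixed_resp))
print("adaptive responses: ", dict(adapt_resp))
```

Under these assumptions the adaptive sampler's selections vary with the data while the fixed sampler's do not, yet the two tallies of response sequences coincide. That is the sense in which the responses still behave like a sample, at least in this small uniform case.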

This looks like textbook stuff to me: "Data processing biases the selection." But I am awash in irrelevant hits when I Google the terms you see here and others like them.