Sequential Testing

Nick Arnosti

2021/03/01

Suppose that we wish to learn a state ZZ=0,1n. At each time step t, we can choose XtX={0,1}n, and observe (1)Yt=1(XtZ>0)

(??)

A history is a sequence of (Xt,Yt) pairs. A policy π is a map from histories to X. Given history H, let F(H) be the set of Z that are consistent with H. Define T(π,Z) to be the number of tests run under policy π when the state is Z.

Extensions