
title: Work log 20160713

We could learn a compact representation of a generative model; this is already done with e.g. graphs.
For each symbol S(i) assign probabilities P(i, 0), P(i, 1), etc.
Initial model: P(S(i)) = P(i, 0)
Expand:

    λ v(i) →
      case ... of
        ... → ...
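As a rough sketch of the initial model, assuming symbols are indexed by integers and the P(i, j) values live in a lookup table (`SymbolProbs` and `initialProb` are hypothetical names, not anything defined above):

```haskell
import qualified Data.Map as Map

-- Hypothetical table: symbol index i maps to [P(i, 0), P(i, 1), ...].
type SymbolProbs = Map.Map Int [Double]

-- Initial model: P(S(i)) = P(i, 0); unknown symbols get probability 0.
initialProb :: SymbolProbs -> Int -> Double
initialProb table i = case Map.lookup i table of
                        Just (p:_) -> p
                        _          -> 0
```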
Not the most brilliant; how about a Church encoding?
~~ `λ s0 s1 s2 ... →` ~~
For n symbols:

    term n = λ s1 s2 ... sn → expr n
    expr n = vi  (0 <= i < n)
           | λ ... → ...
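Haskell can't abstract over n directly, but fixing n = 3 shows the shape: a Church-style term takes its n symbols as arguments, and the simplest expression form just selects one of them (the `vi` case). A sketch, with `Term3` as a made-up name:

```haskell
-- Church-style term over n symbols, fixed at n = 3 for illustration:
-- a term abstracts over its symbols s0, s1, s2...
type Term3 a = a -> a -> a -> a

-- ...and the variable case vi just picks the i-th symbol (v1 here).
v1 :: Term3 a
v1 = \s0 s1 s2 -> s1
```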


What can we compare recurrent clustering to?
 Non-recurrent clustering
 Simpler feature extraction/distance measures, e.g. n-grams, bag-of-words, tree kernels (see the bag-of-words sketch below)
Ranking would be good too; select the N closest matches.
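A minimal sketch of the simplest baseline, assuming inputs are plain token lists: bag-of-words features, an L1 distance between bags (my choice here, not fixed above), and N-closest ranking on top.

```haskell
import           Data.List (sortOn)
import qualified Data.Map  as Map

-- Bag-of-words: count token occurrences, ignoring order and structure.
bagOfWords :: Ord tok => [tok] -> Map.Map tok Int
bagOfWords = foldr (\t m -> Map.insertWith (+) t 1 m) Map.empty

-- L1 distance between bags; tokens missing from one side count as 0.
l1 :: Ord tok => Map.Map tok Int -> Map.Map tok Int -> Int
l1 a b = sum (Map.elems (Map.unionWith (\x y -> abs (x - y)) a b))

-- Ranking: keep the N candidates closest to the query.
closestN :: Ord tok => Int -> [tok] -> [[tok]] -> [[tok]]
closestN n query = take n . sortOn (l1 (bagOfWords query) . bagOfWords)
```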
We want to learn a rule 𝒫(Def) → 𝒫(Def) (in general). Maybe more practical: 𝒫(Def) → Def → [0, 1].
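The practical version is essentially a scoring function: given a set of known definitions, score how relevant one definition is. A hypothetical Haskell rendering, with `Def` left as a placeholder:

```haskell
import qualified Data.Set as Set

-- Hypothetical placeholder for whatever a definition is (name + AST, say).
data Def = Def deriving (Eq, Ord)

-- General form: a rule maps a set of definitions to a set of definitions.
type Rule = Set.Set Def -> Set.Set Def

-- Practical form: score each definition in [0, 1] against a context set.
type Scorer = Set.Set Def -> Def -> Double

-- One way back from Scorer to Rule: keep definitions scoring above t
-- (restricted here to selecting from the input set).
threshold :: Double -> Scorer -> Rule
threshold t score ctx = Set.filter (\d -> score ctx d > t) ctx
```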