Mixture models

From Church Wiki
Jump to: navigation, search

UNDER CONSTRUCTION!!

Contents

Mixture model

We start with a simple dirichlet-multinomial mixture model.

First we write the model in a "direct" style, which exposes the hierarchical structure of the generative process directly via a series of maps:

(define objects '(bob jane mary steve))
(define object-indices (iota (length objects)))

(define observed-features (list true true false false))

(define categories '(a b c))
(define cat-indices (iota (length categories)))

(define psuedo-counts (make-list (length categories) 1.0))

(mh-query
 (define cats-dist (dirichlet psuedo-counts))
 (define cat-weights (map (lambda (cat) (beta 1 1)) categories))
 (define types (map (lambda (obj) (multinomial cat-indices cats-dist)) objects))
 (define observations (map (lambda (obj-index) 
                             (flip (list-ref cat-weights (list-ref types obj-index))))
                           object-indices))
 (map (lambda (cat-index) (list-ref categories cat-index)) types)
 (equal? observations observed-features))

The direct style uses a known set of objects to structure the generative process (via map). Next, we give a "random-world" style program representing the same model. This idiom builds latent values lazily (only when needed to evaluate the query or conditioning expressions) by using mem to implicitly structure the model:

(define objects '(bob jane mary steve))
(define observed-features (list true true false false))

(define categories '(a b c))

(define psuedo-counts (make-list (length categories) 1.0))

(mh-query
 (define cats-dist (dirichlet psuedo-counts))
 (define cat-weight (mem (cat) (beta 1 1)))
 (define type (mem (lambda (obj) (multinomial categories cats-dist))))
 (define observe (mem (lambda (obj) (flip (cat-weight (type obj))))))
 (map type objects)
 (equal? (map observe objects) observed-features))


Exploiting conjugacy

So far we have written programs that expose all random choices in the generative model to the inference engine, we can ease inference by collapsing conjugate pairs (and using the CRP representation of the Dirichlet process). Collapsed (exchangeable) distributions are represented in Church with XRP objects (eXchangeable Random Procedures).

An XRP (returned from a make-XRP library procedure) acts like an ordinary church thunk (procedure of no arguments). (In the code below we use the operator sample, which is an alias for the identity, to emphasize the places where we sample from an XRP).

For instance, if we collapse the dirichlet-multinomial and beta-binomial pairs, the simple finite mixture model above becomes:

(define objects '(bob jane mary steve))
(define observed-features (list true true false false))

(define categories '(a b c))

(define psuedo-counts (make-list (length categories) 1.0))

(mh-query
 (define cat-dist (make-dirichlet-multinomial categories psuedo-counts))
 (define cat-obs-fn (mem (cat) (make-beta-binomial 1.0 1.0)))
 (define type (mem (lambda (obj) (sample cat-dist))))
 (define observe (mem (lambda (obj) (sample (cat-obs-fn (type obj))))))
 (map type objects)
 (equal? (map observe objects) observed-features))

Infinite mixture model

(define objects '(bob jane mary steve))
(define observed-features (list true true false false))
  
(mh-query
 (define get-cat (DPmem 1.0 gensym))
 (define type (mem (lambda (obj) (get-cat))))
 (define cat-obs-fn (mem (lambda (cat) (make-beta-binomial 1.0 1.0))))
 (define observe (mem (lambda (obj) (sample (cat-obs-fn (type obj))))))
 (map type objects)
 (equal? (map observe objects) observed-features))


Infinite Gaussian mixture model

If the observed features are continuous, we can switch the observation model to normal with normal-gamma (conjugate) prior:

(define objects '(bob jane mary steve))
(define observed-features (list 0.3 0.2 0.9 1.2))
  
(mh-query
 (define get-cat (DPmem 1.0 gensym))
 (define type (mem (lambda (obj) (get-cat))))
 (define cat-obs-fn (mem (lambda (cat) (make-normal-normal-gamma 1.0 0.0 1.0))))
 (define observe (mem (lambda (obj) (sample (cat-obs-fn (type obj))))))
 (map type objects)
 (equal? (map observe objects) observed-features))


Stochastic block model

Infinite relational model

If we observe relations between objects (rather than features of single objects), we need only add an argument to the observe function, making it a relation:

(define objects '(bob jane mary steve))
  
(mh-query
 (define get-cat (DPmem 1.0 gensym))
 (define type (mem (lambda (obj) (get-cat))))
 (define reln-obs-fn (mem (lambda (cat1 cat2) (make-beta-binomial 1.0 1.0))))
 (define observe (mem (lambda (obj1 obj2) (sample (reln-obs-fn (type obj1) (type obj2))))))
 (map type objects)
 (and (observe 'bob 'jane)
           (not (observe 'jane 'steve))
           (observe 'mary 'steve)))

Latent Dirichlet allocation

Topic models are simple hierarchical clustering models. The (collapsed) LDA model is:

 (define num-topics 3)
 (define num-words 20)

 (define corpus-word-ind (iota (length corpus-words)))
  
 (define topic-hypers (make-list num-topics 1.0))
 (define word-hypers (make-list num-words 1.0))
     
 (mh-query
   (define word-from-topic (mem (lambda (topic) (make-dirichlet-discrete word-hypers))))
   (define topic-from-doc  (mem (lambda (doc) (make-dirichlet-discrete topic-hypers))))
   (define word-topic (mem (lambda (doc word-index) (sample (topic-from-doc doc)))))
   (define word (mem (lambda (doc word-index) (sample (word-from-topic (word-topic doc word-index))))))
   (map word-topic corpus-docs corpus-word-ind)
   (equal? (map word corpus-docs corpus-word-ind) corpus-observed-words))


HDP-topic model

A topic model with an unknown number of topics comes from simply replacing the dirichlet-multinomial with a Dirichlet process, and adding a DP prior over topics:

 (define num-words 20)

 (define corpus-word-ind (iota (length corpus-words)))
  
 (define word-hypers (make-list num-words 1.0))
     
 (mh-query
   (define topic (DPmem 1.0 gensym))
   (define word-from-topic (mem (lambda (topic) (make-dirichlet-discrete word-hypers))))
   (define topic-from-doc  (mem (lambda (doc) (DPmem 1.0 topic))))
   (define word-topic (mem (lambda (doc word-index) (sample (topic-from-doc doc)))))
   (define word (mem (lambda (doc word-index) (sample (word-from-topic (word-topic doc word-index))))))
   (map word-topic corpus-docs corpus-word-ind)
   (equal? (map word corpus-docs corpus-word-ind) corpus-observed-words))

Personal tools