The Ensemble Sampler

Standard usage of emcee involves instantiating an EnsembleSampler.

class emcee.EnsembleSampler(nwalkers, ndim, log_prob_fn, pool=None, moves=None, args=None, kwargs=None, backend=None, vectorize=False, blobs_dtype=None, parameter_names: Optional[Union[Dict[str, int], List[str]]] = None, a=None, postargs=None, threads=None, live_dangerously=None, runtime_sortingfn=None)

An ensemble MCMC sampler

If you are upgrading from an earlier version of emcee, you might notice that some arguments are now deprecated. The parameters that controlled the proposals (a and live_dangerously) have been moved to the Moves interface, and parallelization is now controlled via the pool argument (see Parallelization).

Parameters
  • nwalkers (int) – The number of walkers in the ensemble.

  • ndim (int) – Number of dimensions in the parameter space.

  • log_prob_fn (callable) – A function that takes a vector in the parameter space as input and returns the natural logarithm of the posterior probability (up to an additive constant) for that position.

  • moves (Optional) – This can be a single move object, a list of moves, or a “weighted” list of the form [(emcee.moves.StretchMove(), 0.1), ...]. When running, the sampler will randomly select a move from this list (optionally with weights) for each proposal. (default: StretchMove)

  • args (Optional) – A list of extra positional arguments for log_prob_fn. log_prob_fn will be called with the sequence log_prob_fn(p, *args, **kwargs).

  • kwargs (Optional) – A dict of extra keyword arguments for log_prob_fn. log_prob_fn will be called with the sequence log_prob_fn(p, *args, **kwargs).

  • pool (Optional) – An object with a map method that follows the same calling sequence as the built-in map function. This is generally used to compute the log-probabilities for the ensemble in parallel.

  • backend (Optional) – Either a backends.Backend or a subclass (like backends.HDFBackend) that is used to store and serialize the state of the chain. By default, the chain is stored as a set of numpy arrays in memory, but new backends can be written to support other mediums.

  • vectorize (Optional[bool]) – If True, log_prob_fn is expected to accept a list of position vectors instead of just one. Note that pool will be ignored if this is True. (default: False)

  • parameter_names (Optional[Union[List[str], Dict[str, List[int]]]]) – Names of individual parameters or groups of parameters. If specified, log_prob_fn will receive a dictionary of parameters rather than a np.ndarray.

property acceptance_fraction

The fraction of proposed steps that were accepted

compute_log_prob(coords)

Calculate the vector of log-probabilities for the walkers

Parameters

coords – (ndarray[…, ndim]) The position vector in parameter space where the probability should be calculated.

This method returns:

  • log_prob: A vector of log-probabilities with one entry for each walker in this sub-ensemble.

  • blob: The list of metadata returned by log_prob_fn at this position, or None if nothing was returned.

get_autocorr_time(**kwargs)

Compute an estimate of the autocorrelation time for each parameter

Parameters
  • thin (Optional[int]) – Use only every thin steps from the chain. The returned estimate is multiplied by thin so the estimated time is in units of steps, not thinned steps. (default: 1)

  • discard (Optional[int]) – Discard the first discard steps in the chain as burn-in. (default: 0)

Other arguments are passed directly to emcee.autocorr.integrated_time().

Returns

The integrated autocorrelation time estimate for the chain for each parameter.

Return type

array[ndim]

get_blobs(**kwargs)

Get the chain of blobs for each sample in the chain

Parameters
  • flat (Optional[bool]) – Flatten the chain across the ensemble. (default: False)

  • thin (Optional[int]) – Take only every thin steps from the chain. (default: 1)

  • discard (Optional[int]) – Discard the first discard steps in the chain as burn-in. (default: 0)

Returns

The chain of blobs.

Return type

array[.., nwalkers]

get_chain(**kwargs)

Get the stored chain of MCMC samples

Parameters
  • flat (Optional[bool]) – Flatten the chain across the ensemble. (default: False)

  • thin (Optional[int]) – Take only every thin steps from the chain. (default: 1)

  • discard (Optional[int]) – Discard the first discard steps in the chain as burn-in. (default: 0)

Returns

The MCMC samples.

Return type

array[.., nwalkers, ndim]

get_last_sample(**kwargs)

Access the most recent sample in the chain

get_log_prob(**kwargs)

Get the chain of log probabilities evaluated at the MCMC samples

Parameters
  • flat (Optional[bool]) – Flatten the chain across the ensemble. (default: False)

  • thin (Optional[int]) – Take only every thin steps from the chain. (default: 1)

  • discard (Optional[int]) – Discard the first discard steps in the chain as burn-in. (default: 0)

Returns

The chain of log probabilities.

Return type

array[.., nwalkers]

property random_state

The state of the internal random number generator. In practice, it’s the result of calling get_state() on a numpy.random.mtrand.RandomState object. You can try to set this property but be warned that if you do this and it fails, it will do so silently.

reset()

Reset the bookkeeping parameters

run_mcmc(initial_state, nsteps, **kwargs)

Iterate sample() for nsteps iterations and return the result

Parameters
  • initial_state – The initial state or position vector. Can also be None to resume from where run_mcmc() left off the last time it executed.

  • nsteps – The number of steps to run.

Other parameters are directly passed to sample().

This method returns the most recent result from sample().

sample(initial_state, log_prob0=None, rstate0=None, blobs0=None, iterations=1, tune=False, skip_initial_state_check=False, thin_by=1, thin=None, store=True, progress=False)

Advance the chain as a generator

Parameters
  • initial_state (State or ndarray[nwalkers, ndim]) – The initial State or positions of the walkers in the parameter space.

  • iterations (Optional[int or NoneType]) – The number of steps to generate. None generates an infinite stream (requires store=False).

  • tune (Optional[bool]) – If True, the parameters of some moves will be automatically tuned.

  • thin_by (Optional[int]) – If you only want to store and yield every thin_by samples in the chain, set thin_by to an integer greater than 1. When this is set, iterations * thin_by proposals will be made.

  • store (Optional[bool]) – By default, the sampler stores (in memory) the positions and log-probabilities of the samples in the chain. If you are using another method to store the samples to a file or if you don’t need to analyze the samples after the fact (for burn-in for example) set store to False.

  • progress (Optional[bool or str]) – If True, a progress bar will be shown as the sampler progresses. If a string, will select a specific tqdm progress bar - most notable is 'notebook', which shows a progress bar suitable for Jupyter notebooks. If False, no progress bar will be shown.

  • skip_initial_state_check (Optional[bool]) – If True, a check that the initial_state can fully explore the space will be skipped. (default: False)

Every thin_by steps, this generator yields the State of the ensemble.

Note that several of the EnsembleSampler methods return or consume State objects:

class emcee.State(coords, log_prob=None, blobs=None, random_state=None, copy=False)

The state of the ensemble during an MCMC run

For backwards compatibility, this will unpack into coords, log_prob, (blobs), random_state when iterated over (where blobs will only be included if it exists and is not None).

Parameters
  • coords (ndarray[nwalkers, ndim]) – The current positions of the walkers in the parameter space.

  • log_prob (ndarray[nwalkers], Optional) – Log posterior probabilities for the walkers at positions given by coords.

  • blobs (Optional) – The metadata “blobs” associated with the current position. The value is only returned if log_prob_fn returns blobs too.

  • random_state (Optional) – The current state of the random number generator.