hyperspy.external.astroML package¶
Submodules¶
hyperspy.external.astroML.bayesian_blocks module¶
Bayesian Block implementation¶
Dynamic programming algorithm for finding the optimal adaptivewidth histogram.
Based on Scargle et al 2012 [1]_
References
[1]  http://adsabs.harvard.edu/abs/2012arXiv1207.5578S 

class
hyperspy.external.astroML.bayesian_blocks.
Events
(p0=0.05, gamma=None)¶ Bases:
hyperspy.external.astroML.bayesian_blocks.FitnessFunc
Fitness for binned or unbinned events
Parameters:  p0 (float) – False alarm probability, used to compute the prior on N (see eq. 21 of Scargle 2012). Default prior is for p0 = 0.
 gamma (float or None) – If specified, then use this gamma to compute the general prior form, p ~ gamma^N. If gamma is specified, p0 is ignored.

fitness
(N_k, T_k)¶

prior
(N, Ntot)¶

class
hyperspy.external.astroML.bayesian_blocks.
FitnessFunc
(p0=0.05, gamma=None)¶ Bases:
object
Base class for fitness functions
Each fitness function class has the following:  fitness(...) : compute fitness function.
Arguments accepted by fitness must be among [T_k, N_k, a_k, b_k, c_k] prior(N, Ntot) : compute prior on N given a total number of points Ntot

args
¶

fitness
(**kwargs)¶

gamma_prior
(N, Ntot)¶ Basic prior, parametrized by gamma (eq. 3 in Scargle 2012)

p0_prior
(N, Ntot)¶

prior
(N, Ntot)¶

validate_input
(t, x, sigma)¶ Check that input is valid

class
hyperspy.external.astroML.bayesian_blocks.
PointMeasures
(p0=None, gamma=None)¶ Bases:
hyperspy.external.astroML.bayesian_blocks.FitnessFunc
Fitness for point measures
Parameters: gamma (float) – specifies the prior on the number of bins: p ~ gamma^N if gamma is not specified, then a prior based on simulations will be used (see sec 3.3 of Scargle 2012) 
fitness
(a_k, b_k)¶

prior
(N, Ntot)¶


class
hyperspy.external.astroML.bayesian_blocks.
RegularEvents
(dt, p0=0.05, gamma=None)¶ Bases:
hyperspy.external.astroML.bayesian_blocks.FitnessFunc
Fitness for regular events
This is for data which has a fundamental “tick” length, so that all measured values are multiples of this tick length. In each tick, there are either zero or one counts.
Parameters:  dt (float) – tick rate for data
 gamma (float) – specifies the prior on the number of bins: p ~ gamma^N

fitness
(T_k, N_k)¶

validate_input
(t, x, sigma)¶

hyperspy.external.astroML.bayesian_blocks.
bayesian_blocks
(t, x=None, sigma=None, fitness='events', **kwargs)¶ Bayesian Blocks Implementation
This is a flexible implementation of the Bayesian Blocks algorithm described in Scargle 2012 [1]_
Parameters:  t (array_like) – data times (one dimensional, length N)
 x (array_like (optional)) – data values
 sigma (array_like or float (optional)) – data errors
 fitness (str or object) –
the fitness function to use. If a string, the following options are supported:
 ‘events’ : binned or unbinned event data
 extra arguments are p0, which gives the false alarm probability to compute the prior, or gamma which gives the slope of the prior on the number of bins.
 ‘regular_events’ : nonoverlapping events measured at multiples
 of a fundamental tick rate, dt, which must be specified as an additional argument. The prior can be specified through gamma, which gives the slope of the prior on the number of bins.
 ‘measures’ : fitness for a measured sequence with Gaussian errors
 The prior can be specified using gamma, which gives the slope of the prior on the number of bins. If gamma is not specified, then a simulationderived prior will be used.
Alternatively, the fitness can be a userspecified object of type derived from the FitnessFunc class.
Returns: edges – array containing the (N+1) bin edges
Return type: ndarray
Examples
Event data:
>>> t = np.random.normal(size=100) >>> bins = bayesian_blocks(t, fitness='events', p0=0.01)
Event data with repeats:
>>> t = np.random.normal(size=100) >>> t[80:] = t[:20] >>> bins = bayesian_blocks(t, fitness='events', p0=0.01)
Regular event data:
>>> dt = 0.01 >>> t = dt * np.arange(1000) >>> x = np.zeros(len(t)) >>> x[np.random.randint(0, len(t), len(t) / 10)] = 1 >>> bins = bayesian_blocks(t, fitness='regular_events', dt=dt, gamma=0.9)
Measured point data with errors:
>>> t = 100 * np.random.random(100) >>> x = np.exp(0.5 * (t  50) ** 2) >>> sigma = 0.1 >>> x_obs = np.random.normal(x, sigma) >>> bins = bayesian_blocks(t, fitness='measures')
References
[1] Scargle, J et al. (2012) http://adsabs.harvard.edu/abs/2012arXiv1207.5578S See also
astroML.plotting.hist()
 histogram plotting function which can make use of bayesian blocks.
hyperspy.external.astroML.histtools module¶
Tools for working with distributions

class
hyperspy.external.astroML.histtools.
KnuthF
(data)¶ Bases:
object
Class which implements the function minimized by knuth_bin_width
Parameters: data (arraylike, one dimension) – data to be histogrammed Notes
the function F is given by
where is the Gamma function, is the number of data points, is the number of measurements in bin .
See also
knuth_bin_width
,astroML.plotting.hist

bins
(M)¶ Return the bin edges given a width dx

eval
(M)¶ Evaluate the Knuth function
Parameters: dx (float) – Width of bins Returns: F – evaluation of the negative Knuth likelihood function: smaller values indicate a better fit. Return type: float


hyperspy.external.astroML.histtools.
freedman_bin_width
(data, return_bins=False)¶ Return the optimal histogram bin width using the FreedmanDiaconis rule
Parameters:  data (arraylike, ndim=1) – observed (onedimensional) data
 return_bins (bool (optional)) – if True, then return the bin edges
Returns:  width (float) – optimal bin width using Scott’s rule
 bins (ndarray) – bin edges: returned if return_bins is True
Notes
The optimal bin width is
where is the percent quartile of the data, and is the number of data points.
See also
knuth_bin_width()
,scotts_bin_width()
,astroML.plotting.hist()

hyperspy.external.astroML.histtools.
histogram
(a, bins=10, range=None, **kwargs)¶ Enhanced histogram
This is a histogram function that enables the use of more sophisticated algorithms for determining bins. Aside from the bins argument allowing a string specified how bins are computed, the parameters are the same as numpy.histogram().
Parameters:  a (array_like) – array of data to be histogrammed
 bins (int or list or str (optional)) – If bins is a string, then it must be one of: ‘blocks’ : use bayesian blocks for dynamic bin widths ‘knuth’ : use Knuth’s rule to determine bins ‘scotts’ : use Scott’s rule to determine bins ‘freedman’ : use the Freedmandiaconis rule to determine bins
 range (tuple or None (optional)) – the minimum and maximum range for the histogram. If not specified, it will be (x.min(), x.max())
 keyword arguments are described in numpy.hist(). (other) –
Returns:  hist (array) – The values of the histogram. See normed and weights for a description of the possible semantics.
 bin_edges (array of dtype float) – Return the bin edges
(length(hist)+1)
.
See also
numpy.histogram()
,astroML.plotting.hist()

hyperspy.external.astroML.histtools.
knuth_bin_width
(data, return_bins=False)¶ Return the optimal histogram bin width using Knuth’s rule [1]_
Parameters:  data (arraylike, ndim=1) – observed (onedimensional) data
 return_bins (bool (optional)) – if True, then return the bin edges
Returns:  dx (float) – optimal bin width. Bins are measured starting at the first data point.
 bins (ndarray) – bin edges: returned if return_bins is True
Notes
The optimal number of bins is the value M which maximizes the function
where is the Gamma function, is the number of data points, is the number of measurements in bin .
References
[1] Knuth, K.H. “Optimal DataBased Binning for Histograms”. arXiv:0605197, 2006 See also

hyperspy.external.astroML.histtools.
scotts_bin_width
(data, return_bins=False)¶ Return the optimal histogram bin width using Scott’s rule:
Parameters:  data (arraylike, ndim=1) – observed (onedimensional) data
 return_bins (bool (optional)) – if True, then return the bin edges
Returns:  width (float) – optimal bin width using Scott’s rule
 bins (ndarray) – bin edges: returned if return_bins is True
Notes
The optimal bin width is
where is the standard deviation of the data, and is the number of data points.
See also
knuth_bin_width()
,freedman_bin_width()
,astroML.plotting.hist()