Documentation#

Data Analysis Functions#

H2MM_C.EM_H2MM_C(h2mm_model h_mod, indexes, times, max_iter=None, print_func=u'iter', print_args=None, bounds_func=None, bounds=None, max_time=np.inf, converged_min=None, num_cores=None, reset_niter=True, gamma=False, opt_array=False)#

Calculate the most likely model that explains the given set of data. The input model is used as the start of the optimization.

Parameters:
  • h_model (h2mm_model) – An initial guess for the H2MM model, just give a general guess, the algorithm will optimize, and generally the algorithm will converge even when the initial guess is very far off

  • indexes (list of NUMPY 1D int arrays) – A list of the arrival indexes for each photon in each burst. Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The indexes list must maintain 1-to-1 correspondence to the times list

  • times (list of NUMPY 1D int arrays) – A list of the arrival times for each photon in each burst Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The times list must maintain 1-to-1 correspondence to the indexes list

  • print_func (None, str or callable, optional) –

    Specifies how the results of each iteration will be displayed, several strings specify built-in functions. Default is ‘iter’ Acceptable inputs: str, Callable or None

    None:

    causes no printout anywhere of the results of each iteration

    Str: ‘console’, ‘all’, ‘diff’, ‘comp’, ‘iter’

    ’console’:

    the results will be printed to the terminal/console window this is useful to still be able to see the results, but not clutter up the output of your Jupyter Notebook, this is the default

    ’all’:

    prints out the full h2mm_model just evaluated, this is very verbose, and not generally recommended unless you really want to know every step of the optimization

    ’diff’:

    prints the loglik of the iteration, and the difference between the current and old loglik, this is the same format that the ‘console’ option used, the difference is whether the print function used is the C printf, or Cython print, which changes the destination from the console to the Jupyter notebook

    ’diff_time’:

    same as ‘diff’ but adds the time of the last iteration and the total optimization time

    ’comp’:

    prints out the current and old loglik

    ’comp_time’:

    same as ‘comp’ but with times, like in ‘diff_time’

    ’iter’:

    prints out only the current iteration

    Callable: user defined function

    A python function for printing out a custom output, the function must accept the input of (int, h2mm_model, h2mm_model, h2mm_model, float, float) as the function will be handed (niter, new, current, old, t_iter, t_total) from a special Cython wrapper. t_iter and t_total are the times of the iteration and total time respectively, in seconds, based on the C level clock function, which is notably inaccurate, often reporting larger than actual values.

  • print_args (2-tuple/list (int, bool) or None, optional) – Arguments to further customize the printing options. The format is (int bool) where int is how many iterations before updating the display and the bool is True if the printout will concatenate, and False if the display will be kept to one line, The default is None, which is changed into (1, False). If only an int or a bool is specified, then the default value for the other will be used. If a custom printing function is given then this argument will be passed to the function as *args. Default is None

  • bounds_func (str, callable or None, optional) –

    function to be evaluated after every iteration of the H2MM algorithm its primary function is to place bounds on h2mm_model

    Note

    bounding the h2mm_model causes the guarantee of improvement on each iteration until convergence to no longer apply, therefore the results are no longer guaranteed to be the optimal model within bounds.

    Default is None

    Acceptable inputs:

    C level limits: ‘minmax’ ‘revert’ ‘revert_old’

    prevents the model from having values outside of those defined by h2mm_limits class given to bounds, if an iteration produces a new model for loglik calculation in the next iteration with values that are out of those bounds, the 3 function differ in how they correct the model when a new model has a value out of bounds, they are as follows:

    ’minmax’:

    sets the out of bounds value to the min or max value closer to the original value

    ’revert’:

    sets the out of bounds value to the value from the last model for which the loglik was calculated

    ’revert_old’:

    similar to revert, but instead reverts to the value from the model one before the last calculated model

    Callable: python function that takes 4 inputs and returns h2mm_model object

    Warning

    The user takes full responsibility for errors in the results.

    must be a python function that takes 4 arguments, the first 3 are the h2mm_model objects, in this order: new, current, old, the fourth is the argument supplied to the bounds keyword argument, the function must return a single h2mm_limits object, the prior, trans and obs fields of which will be optimized next, other values are ignored.

  • bounds (h2mm_limits, str, 4th input to callable, or None, optional) – The argument to be passed to the bounds_func function. If bounds_func is None, then bounds will be ignored. If bounds_func is ‘minmax’, ‘revert’ or ‘revert_old’ (calling C-level bounding functions), then bounds must be a h2mm_limits object, if bounds_func is a callable, bounds must be an acceptable input as the fourth argument to the function passed to bounds_func Default is None

  • max_iter (int or None, optional) – the maximum number of iterations to conduct before returning the current h2mm_model, if None (default) use value from optimization_limits. Default is None

  • max_time (float or None, optional) – The maximum time (in seconds) before returning current model NOTE this uses the C clock, which has issues, often the time assessed by C, which is usually longer than the actual time. If None (default) use value from optimization_limits. Default is None

  • converged_min (float or None, optional) – The difference between new and current h2mm_model to consider the model converged, the default setting is close to floating point error, it is recommended to only increase the size. If None (default) use value from optimization_limits. Default is None

  • num_cores (int or None, optional) –

    the number of C threads (which ignore the gil, thus functioning more like python processes), to use when calculating iterations.

    Note

    optimization_limtis sets this to be os.cpu_count() // 2, as most machines are multithreaded. Because os.cpu_count() returns the number of threads, not necessarily the number of cpus. For most machines, being multithreaded, this actually returns twice the number of physical cores. Hence the default to set at os.cpu_count() // 2. If your machine is not multithreaded, it is best to set optimization_limits.num_cores = os.cpu_count()

    If None (default) use value from optimization_limits. Default is None

  • reset_niter (bool, optional) – Tells the algorithm whether or not to reset the iteration counter of the model, True means that even if the model was previously optimized, the iteration counter will start at 0, so calling EM_H2MM_C() on a model that reached its maximum iteration number will proceed another max_iter iterations. On the other hand, if set to False, then the new iterations will be added to the old. If set to False, you likely will have to increase max_iter, as the optimization will count the previous iterations towards the max_iter threshold. Default is False

  • gamma (bool, optional) –

    Whether or not to return the gamma array, which gives the probabilities of each photon being in a given state.

    Note

    If opt_array is True, then only the gamma arrays for the ideal model are returned.

    Default is False

  • opt_array (bool) –

    Defines how the result is returned, if False (default) then return only the ideal model, if True, then a numpy array is returned containing all models during the optimization.

    Note

    The last model in the array is the model after the convergence criterion have been met. Therefore check the h2mm_model.conv_code, if it is 7, then the second to last model is the ideal model. Still, usually the last model will be so close to ideal that it will not make any significant difference

    Default is False

Returns:

  • out (h2mm_model) – The optimized h2mm_model. will return after one of the following conditions are met: model has converged (according to converged_min, default 1e-14), maximum iterations reached, maximum time has passed, or an error has occurred (usually the result of a nan from a floating point precision error)

  • gamma (list, tuple or np.ndarray (optional)) – If gamma = True, this variable is returned, which contains (as numpy arrays) the probabilities of each photon to be in each state, for each burst. The arrays are returned in the same order and type as the input indexes, individual data points organized as [photon, state] within each data sequence.

H2MM_C.factory_h2mm_model(nstate, ndet, bounds=None, trans_scale=1e-5, prior_dist='equal', trans_dist='equal', obs_dist='equal')#

Function for generating a model of an arbitrary number of states/streams

Parameters:
  • nstate (int) – Number of states in the return model

  • ndet (int, optional) – Number of photons streams in the return model

  • bounds (h2mm_limits or None, optional) – The limits expected to be used when optimizing the model. For limits are given, the model will always be within those limits, distribution of states within the model will depend on how The default is None.

  • trans_scale (float) – Only used when no h2mm_limits object is supplied to bounds, sets the minimum (slowest) transition rates in the trans array

  • prior_dist ('even' or 'random', optional) – Define how to distribute prior states. options are ‘equal’ and ‘random’ The default is ‘equal’.

  • trans_dist ('even' or 'random', optional) – Define how to distribute trans states. options are ‘equal’ and ‘random’ The default is ‘equal’.

  • obs_dist ('even' or 'random', optional) – Define how to distribute obs states. options are ‘equal’ and ‘random’ The default is ‘equal’.

Returns:

model – An h2mm_model object ready for optimization

Return type:

h2MM_model

H2MM_C.H2MM_arr(h_mod, indexes, times, num_cores=None, gamma=False)#

Calculate the logliklihood of every model in a list/array given a set of data

Parameters:
  • h_model (list, tuple, or numpy.ndarray of h2mm_model objects) – All of the h2mm_model object for which the loglik will be calculated

  • indexes (list or tuple of NUMPY 1D int arrays) – A list of the arrival indexes for each photon in each burst. Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The indexes list must maintain 1-to-1 correspondence to the times list

  • times (list or tuple of NUMPY 1D int arrays) – A list of the arrival times for each photon in each burst Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The times list must maintain 1-to-1 correspondence to the indexes list

  • gamma (bool, optional) –

    Whether or not to return the gamma array, which gives the probabilities of each photon being in a given state. the number of C threads (which ignore the gil, thus functioning more like python processes), to use when calculating iterations.

    Note

    optimization_limtis sets this to be os.cpu_count() // 2, as most machines are multithreaded. Because os.cpu_count() returns the number of threads, not necessarily the number of cpus. For most machines, being multithreaded, this actually returns twice the number of physical cores. Hence the default to set at os.cpu_count() // 2. If your machine is not multithreaded, it is best to set optimization_limits.num_cores = os.cpu_count()

    If None (default) use value from optimization_limits. Default is None

Returns:

  • out (list tuple, or numpy.ndarray of h2mm_model) – A list, tuple, or numpy array of the h2mm_model with the loglik computed The converged state is automatically set to 0, and nphot is set in accordance with the number of photons in the data set. The datatype returned is the same as the datatype of h_model

  • gamma_arr (list, tuple or np.ndarray (optional)) – If gamma = True, this variable is returned, which contains (as numpy arrays) the probabilities of each photon to be in each state, for each burst and model. The arrays are returned in the same order and type as the input models and indexes, individual data points organized as [photon, state] within each data sequence.

H2MM_C.viterbi_path(h2mm_model h_mod, indexes, times, num_cores=None)#

Calculate the most likely state path through a set of data given a H2MM model

Parameters:
  • h_model (h2mm_model) – An h2mm_model, should be optimized for the given data set (result of EM_H2MM_C() or h2mm_model.optimize()) to ensure results correspond to give the most likely path

  • indexes (list or tuple of NUMPY 1D int arrays) – A list of the arrival indexes for each photon in each burst. Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The indexes list must maintain 1-to-1 correspondence to the times list

  • times (list or tuple of NUMPY 1D int arrays) – A list of the arrival times for each photon in each burst Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The times list must maintain 1-to-1 correspondence to the indexes list

  • num_cores (int or None, optional) – the number of C threads (which ignore the gil, thus functioning more like python processes), to use when calculating iterations. Note that os.cpu_count() returns number of threads, but it is ideal to take the number of physical cores. Therefore, unless reset by user, optimization_limtis sets this to be os.cpu_count() // 2, as most machines are multithreaded. If None (default) use value from optimization_limits. Default is None

Returns:

  • path (list of NUMPY 1D int arrays) – The most likely state path for each photon

  • scale (list of NUMPY 1D float arrays) – The posterior probability for each photon

  • ll (NUMPY 1D float array) – loglikelihood of each burst

  • icl (float) – Integrated complete likelihood, essentially the BIC for the Viterbi path

H2MM_C.viterbi_sort(h2mm_model hmod, indexes, times, num_cores=None)#

An all inclusive viterbi processing algorithm. Returns the ICL, the most likely state path, posterior probabilities, and a host of information sorted by bursts and dwells

Parameters:
  • h_model (h2mm_model) – An h2mm_model, should be optimized for the given data set (result of EM_H2MM_C() or h2mm_model.optimize()) to ensure results correspond to give the most likely path

  • indexes (list or tuple of NUMPY 1D int arrays) – A list of the arrival indexes for each photon in each burst. Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The indexes list must maintain 1-to-1 correspondence to the times list

  • times (list or tuple of NUMPY 1D int arrays) – A list of the arrival times for each photon in each burst Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The times list must maintain 1-to-1 correspondence to the indexes list

  • num_cores (int or None, optional) –

    the number of C threads (which ignore the gil, thus functioning more like python processes), to use when calculating iterations.

    Note

    optimization_limtis sets this to be os.cpu_count() // 2, as most machines are multithreaded. Because os.cpu_count() returns the number of threads, not necessarily the number of cpus. For most machines, being multithreaded, this actually returns twice the number of physical cores. Hence the default to set at os.cpu_count() // 2. If your machine is not multithreaded, it is best to set optimization_limits.num_cores = os.cpu_count()

    If None (default) use value from optimization_limits. Default is None

Returns:

  • icl (float) – Integrated complete likelihood, essentially the BIC for the Viterbi path

  • path (list of NUMPY 1D int arrays) – The most likely state path for each photon

  • scale (list of NUMPY 1D float arrays) – The posterior probability for each photon

  • ll (NUMPY 1D float array) – loglikelihood of each burst

  • burst_type (NUMPY 1D int array) – Identifies which states are present in the burst, identified in a binary format, ie if states 0 and 2 are present in a burst, that element will be 0b101 = 5

  • dwell_mid (square list of lists of NUMPY 1D int arrays) – Gives the dwell times of all dwells with full residence time in the data set, sorted by the state of the dwell (Top level list), and the successor state (Lower level list), each element of the numpy array is the duration of the dwell in the clock rate of the data set.

  • dwell_beg (square list of lists of NUMPY 1D int arrays) – Gives the dwell times of all dwells that start at the beginning of a burst, sorted by the state of the dwell (Top level list), and the successor state (Lower level list), each element of the numpy array is the duration of the dwell in the clock rate of the data set.

  • dwell_end (square list of lists of NUMPY 1D int arrays) – Gives the dwell times of all dwells that start at the end of a burst, sorted by the state of the dwell (Top level list), and the preceding state (Lower level list), each element of the numpy array is the duration of the dwell in the clock rate of the data set.

  • dwell_burst (list of NUMPY 1D int arrays) – List of bursts that were only in one state, giving their durations.

  • ph_counts (list of NUMPY 2D int arrays) – Counts of photons in each dwell, sorted by index and state of the dwell The index of the list identifies the state of the burst, the 1 index identifies the index of photon, and the 0 index are individual dwells.

  • ph_mid (square list of NUMPY 2D arrays) – Gives the photon counts of dwells in the middle of bursts, sorted by state of the dwell (top level of the list) and the state of the next dwell in the burst (lower level burst), each row in an array is a single dwell, the columns correspond to the counts of each photon stream.

  • ph_beg (square list of NUMPY 2D arrays) – Gives the photon counts of dwells at the beginning of bursts, sorted by state of the dwell (top level of the list) and the state of the next dwell in the burst (lower level list), each row in an array is a single dwell, the columns correspond to the counts of each photon stream.

  • ph_end (square list of NUMPY 2D arrays) – Gives the photon counts of dwells at the ends of bursts, sorted by state of the dwell (top level of the list) and the state of the previous dwell in the burst, each row in an array is a single dwell, the columns correspond to the counts of each photon stream.

  • ph_burst (list of NUMPY 2D arrays) – Gives the counts of photons in bursts with only a single state. Sorted according to the state of the burst. Each row in an array is a single dwell, the columns correspond to the counts of each photon stream.

H2MM_C.path_loglik(h2mm_model model, indexes, times, state_path, num_cores=None, BIC=True, total_loglik=False, loglikarray=False)#

Function for calculating the statistical parameters of a specific state path and data set given a model (ln(P(XY|lambda))). By default returns the BIC of the entire data set, can also return the base logliklihood in total and/or per burst. Which return values selected are defined in keyword arguments.

Parameters:
  • model (h2mm_model) – DESCRIPTION.

  • indexes (list[numpy.ndarray], tuple[numpy.ndarray] or numpy.ndarray[numpy.ndarray]) – DESCRIPTION.

  • times (list, tuple, or numpy.ndarray of integer numpy.ndarray) – DESCRIPTION.

  • state_path (list, tuple, or numpy.ndarray of integer numpy.ndarray) – DESCRIPTION.

  • num_cores (int, optional) – Number of cores to use in calculation, if None, then use value in optimization_limits. The default is None.

  • BIC (bool, optional) – Whether to return the Bayes Information Criterion of the calculation. The default is True.

  • total_loglik (bool, optional) – Whether to return the loglikelihood of the entire data set as a single number. The default is False.

  • loglikarray (bool, optional) – Whether to return the loglikelihoods of each burst as a numpy array. The default is False.

Raises:
  • TypeError – One or more inputs did not contain the proper data types.

  • ValueError – One or more values out of the acceptable ranges given the input model

  • RuntimeError – Deeper issue, raise issue on github.

Returns:

  • bic (float) – Bayes information criterion of entire data set and path.

  • loglik (float) – Log likelihood of entire dataset

  • loglik_array (np.ndarray) – Loglikelihood of each burst separately

Simulation Functions#

H2MM_C.sim_phtraj_from_times(h2mm_model hmod, ndarray times, seed=None)#

Generate a state path and photon trajectory for a given set of times

Parameters:
  • hmod (h2mm_model) – An h2mm_model to build the path and stream trajectories from

  • times (numpy.ndarray 1D int) – An unsigned integer of monotonically increasing times for the burst

  • seed (positive int, optional) – The initial random seed, use if you want to be able to replicate results. If None, then uses the current time as the seed. The default is None.

Raises:
  • TypeError – tiems array must be 1D

  • ValueError – times were not monotonically increasing

  • RuntimeError – Unknown error, raise issue on github

Returns:

  • path (numpy.ndarray 1D int) – State trajectory generated for the input times

  • dets (numpy.ndarray 1D int) – stream trajectory generate for the input times, derived from path

H2MM_C.sim_phtraj_from_state(h2mm_model hmod, ndarray states, seed=None)#

Generate a photon trajectory from a h2mm model and state trajectory

Parameters:
  • hmod (h2mm_model) – An h2mm_model to build the stream trajectory from

  • states (numpy ndarray 1D int) –

  • seed (positive int, optional) – The initial random seed, use if you want to be able to replicate results. If None, then uses the current time as the seed. The default is None.

Raises:
  • TypeError – The state array was not 1D

  • RuntimeError – Unknown error, please raise issue on github

Returns:

dets – The random photon stream indexes based on model and statepath

Return type:

numpy ndarray 1D int

H2MM_C.sim_sparsestatepath(h2mm_model hmod, ndarray times, seed=None)#

Generate a state path from a model and a sparse set of arrival times

Parameters:
  • hmod (h2mm_model) – An h2mm_model to build the state path from

  • times (numpy ndarray 1D int) – An unsigned integer of monotonically increasing times for the burst

  • seed (positive int, optional) – The initial random seed, use if you want to be able to replicate results. If None, then uses the current time as the seed. The default is None.

Raises:
  • TypeError – Your array was not 1D

  • ValueError – Your time were not monotonically increasing

  • RuntimeError – Unknown error, please raise issue on github

Returns:

path – A randomly generated statepath based on the input model and times

Return type:

numpy ndarray 1D long int

H2MM_C.sim_statepath(h2mm_model hmod, int lenp, seed=None)#

A function to generate a dense statepath (HMM as opposed to H2MM) of length lenp

Parameters:
  • hmod (h2mm_model) – An h2mm_model to build the state path from

  • lenp (int) – The length of the path you want to generate

  • seed (positive int, optional) – The initial random seed, use if you want to be able to replicate results. If None, then uses the current time as the seed. The default is None.

Raises:

RuntimeError – A problem with the C code, this is for future proofing

Returns:

path_ret – The random dense state-path

Return type:

numpy ndarray, positive integer

Classes#

class H2MM_C.h2mm_model(prior, trans, obs, loglik=- np.inf, niter=0, nphot=0, is_conv=False)#

The base class for storing objects representing the H2MM model, stores the relevant matrices, loglikelihood and information about status of optimization.

Parameters:
  • prior (numpy.ndarray) – The initial probability matrix, 1D, size nstate

  • trans (numpy.ndarray) – The transition probability matrix, 2D, shape [nstate, nstate]

  • obs (numpy.ndarray) – The emission probability matrix, 2D, shape [nstate, ndet]

  • Parameters (Optional) –

  • -------------------

  • note:: (..) – These are generally only used when trying to re-create a model from, for instance a previous optimization of a closed notebook.

  • loglik (float (optional)) – The loglikelihood, always set with nphot otherwise BIC will not be calculated correctly. Usually should be set by evaluation against data. Default is -inf

  • niter (int) – The number of iterations, used for recreating previously optimized model. Usually should be set by evaluation against data. Default is 0

  • nphot (int (optional)) – The number of photons in data set used for optimization, should only be set when recreating model, Usually should be set by evaluation against data. Default is 0

  • is_conv (bool (optional)) – Whether or not the model has met a convergence criterion, should only be set when recreating model, otherwise allow evaluation functions to handle this operation. Default is False

bic#

Bayes Information Criterion of model

conv_code#

The convergence code of model, an int

conv_str#

String description of how model optimization/calculation ended

copy(self)#

Make a duplicate model in new memory

evaluate(self, indexes, times, gamma=False, num_cores=None, inplace=True)#

Calculate the loglikelihood of the model given a set of data. The loglikelihood is stored with the model. NOTE: this method calls the H2MM_arr function, and has essentially the same syntax

Parameters:
  • indexes (list or tuple of NUMPY 1D int arrays) – A list of the arrival indexes for each photon in each burst. Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The indexes list must maintain 1-to11 correspondence to the times list

  • times (list or tuple of NUMPY 1D int arrays) – A list of the arrival times for each photon in each burst Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The times list must maintain 1-to-1 correspondence to the indexes list

  • gamma (bool (optional)) – Whether or not to return the gamma array. (The default is False)

  • num_cores (int or None, optional) –

    the number of C threads (which ignore the gil, thus functioning more like python processes), to use when calculating iterations.

    Note

    optimization_limtis sets this to be os.cpu_count() // 2, as most machines are multithreaded. Because os.cpu_count() returns the number of threads, not necessarily the number of cpus. For most machines, being multithreaded, this actually returns twice the number of physical cores. Hence the default to set at os.cpu_count() // 2. If your machine is not multithreaded, it is best to set optimization_limits.num_cores = os.cpu_count()

    If None (default) use value from optimization_limits. Default is None

  • inplace (bool, optional) – Whether or not to store the evaluated model in the current model object. Default is True

Returns:

  • out (h2mm_model) – The evaluated model

  • gamma_arr (list, tuple or np.ndarray (optional)) – If gamma = True, this variable is returned, which contains (as numpy arrays) the probabilities of each photon to be in each state, for each burst and model. The arrays are returned in the same order and type as the indexes, individual data points organized as [photon, state] within each data sequence.

is_calc#

If model has been optimized/eveluated, vs being an initial model

is_conv#

Whether or not the optimization reached convergence rather than exceeding the maximum iteration/time of optimization

is_opt#

Whether or not the model has undergone optimization, as opposed to evaluation or being an initial model

k#

Number of free parameters in model

loglik#

Loglikelihood of model

ndet#

Nubmer of detectors/photon streams in model

niter#

Number of iterations optimization took

normalize(self)#

For internal use, ensures all model arrays are row stochastic

nphot#

Number of photons in data set used to optimize model

nstate#

Number of states in model

obs#

Emission probability matrix, 2D shape nstate x ndet

optimize(self, indexes, times, max_iter=None, print_func='iter', print_args=None, bounds=None, bounds_func=None, max_time=None, converged_min=None, num_cores=None, reset_niter=False, gamma=False, opt_array=False, inplace=True)#

Optimize the H2MM model for the given set of data.

Note

This method calls the EM_H2MM_C() function.

Parameters:
  • indexes (list of NUMPY 1D int arrays) – A list of the arrival indexes for each photon in each burst. Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The indexes list must maintain 1-to-1 correspondence to the times list

  • times (list of NUMPY 1D int arrays) – A list of the arrival times for each photon in each burst Each element of the list (a numpy array) corresponds to a burst, and each element of the array is a singular photon. The times list must maintain 1-to-1 correspondence to the indexes list

  • print_func (None, str or callable, optional) –

    Specifies how the results of each iteration will be displayed, several strings specify built-in functions. Default is ‘iter’ Acceptable inputs: str, Callable or None

    None:

    causes no printout anywhere of the results of each iteration

    Str: ‘console’, ‘all’, ‘diff’, ‘comp’, ‘iter’

    ’console’:

    the results will be printed to the terminal/console window this is useful to still be able to see the results, but not clutter up the output of your Jupyter Notebook, this is the default

    ’all’:

    prints out the full h2mm_model just evaluated, this is very verbose, and not generally recommended unless you really want to know every step of the optimization

    ’diff’:

    prints the loglik of the iteration, and the difference between the current and old loglik, this is the same format that the ‘console’ option used, the difference is whether the print function used is the C printf, or Cython print, which changes the destination from the console to the Jupyter notebook

    ’diff_time’:

    same as ‘diff’ but adds the time of the last iteration and the total optimization time

    ’comp’:

    prints out the current and old loglik

    ’comp_time’:

    same as ‘comp’ but with times, like in ‘diff_time’

    ’iter’:

    prints out only the current iteration

    Callable: user defined function

    A python function for printing out a custom output, the function must accept the input of (int, h2mm_model, h2mm_model, h2mm_model,float, float) as the function will be handed (niter, new, current, old, t_iter, t_total) from a special Cython wrapper. t_iter and t_total are the times of the iteration and total time respectively, in seconds, based on the C level clock function, which is notably inaccurate, often reporting larger than actual values.

  • print_args (2-tuple/list (int, bool) or None, optional) – Arguments to further customize the printing options. The format is (int bool) where int is how many iterations before updating the display and the bool is True if the printout will concatenate, and False if the display will be kept to one line, The default is None, which is changed into (1, False). If only an int or a bool is specified, then the default value for the other will be used. If a custom printing function is given then this argument will be passed to the function as *args. Default is None

  • bounds_func (str, callable or None, optional) –

    function to be evaluated after every iteration of the H2MM algorithm its primary function is to place bounds on h2mm_model

    Note

    bounding the h2mm_model causes the guarantee of improvement on each iteration until convergence to no longer apply, therefore the results are no longer guaranteed to be the optimal model within bounds.

    Default is None Acceptable inputs:

    C level limits: ‘minmax’ ‘revert’ ‘revert_old’

    prevents the model from having values outside of those defined by h2mm_limits class given to bounds, if an iteration produces a new model for loglik calculation in the next iteration with values that are out of those bounds, the 3 function differ in how they correct the model when a new model has a value out of bounds, they are as follows:

    ’minmax’:

    sets the out of bounds value to the min or max value closer to the original value

    ’revert’:

    sets the out of bounds value to the value from the last model for which the loglik was calculated

    ’revert_old’:

    similar to revert, but instead reverts to the value from the model one before the last calculated model

    Callable: python function that takes 4 inputs and returns h2mm_model object

    Warning

    The user takes full responsibility for errors in the results.

    must be a python function that takes 4 arguments, the first 3 are the h2mm_model objects, in this order: new, current, old, the fourth is the argument supplied to the bounds keyword argument, the function must return a single h2mm_limits object, the prior, trans and obs fields of which will be optimized next, other values are ignored.

  • bounds (h2mm_limits, str, 4th input to callable, or None, optional) – The argument to be passed to the bounds_func function. If bounds_func is None, then bounds will be ignored. If bounds_func is ‘minmax’, ‘revert’ or ‘revert_old’ (calling C-level bounding functions), then bounds must be a h2mm_limits object, if bounds_func is a callable, bounds must be an acceptable input as the fourth argument to the function passed to bounds_func Default is None

  • max_iter (int or None, optional) – the maximum number of iterations to conduct before returning the current h2mm_model, if None (default) use value from optimization_limits. Default is None

  • max_time (float or None, optional) –

    The maximum time (in seconds) before returning current model

    Note

    this uses the C clock, which has issues, often the time assessed by C, which is usually longer than the actual time.

    If None (default) use value from optimization_limits. Default is None

  • converged_min (float or None, optional) – The difference between new and current h2mm_model to consider the model converged, the default setting is close to floating point error, it is recommended to only increase the size. If None (default) use value from optimization_limits. Default is None

  • num_cores (int or None, optional) –

    the number of C threads (which ignore the gil, thus functioning more like python processes), to use when calculating iterations.

    Note

    optimization_limtis sets this to be os.cpu_count() // 2, as most machines are multithreaded. Because os.cpu_count() returns the number of threads, not necessarily the number of cpus. For most machines, being multithreaded, this actually returns twice the number of physical cores. Hence the default to set at os.cpu_count() // 2. If your machine is not multithreaded, it is best to set optimization_limits.num_cores = os.cpu_count()

    If None (default) use value from optimization_limits. Default is None

  • reset_niter (bool, optional) – Tells the algorithm whether or not to reset the iteration counter of the model, True means that even if the model was previously optimized, the iteration counter will start at 0, so calling EM_H2MM_C() on a model that reached its maximum iteration number will proceed another max_iter iterations. On the other hand, if set to False, then the new iterations will be added to the old. If set to False, you likely will have to increase max_iter, as the optimization will count the previous iterations towards the max_iter threshold. Default is False

  • gamma (bool, optional) –

    Whether or not to return the gamma array, which gives the probabilities of each photon being in a given state.

    Note

    If opt_array is True, then only the gamma arrays for the ideal model are returned.

    Default is False

  • opt_array (bool) –

    Defines how the result is returned, if False (default) then return only the ideal model, if True, then a numpy array is returned containing all models during the optimization.

    Note

    The last model in the array is the model after the convergence criterion have been met. Therefore check the h2mm_model.conv_code, if it is 7, then the second to last model is the ideal model. Still, usually the last model will be so close to ideal that it will not make any significant difference.

    Default is False

  • inplace (bool, optional) –

    Whether or not to store the optimized model in the current model object.

    Note

    When opt_array is True, then the ideal model is copied into the current model object, while the return value contains all models

    Default is True

Returns:

  • out (h2mm_model) – The optimized h2mm_model. will return after one of the following conditions are met: model has converged (according to converged_min, default 1e-14), maximum iterations reached, maximum time has passed, or an error has occurred (usually the result of a nan from a floating point precision error)

  • gamma (list, tuple or np.ndarray (optional)) – If gamma = True, this variable is returned, which contains (as numpy arrays) the probabilities of each photon to be in each state, for each burst. The arrays are returned in the same order and type as the input indexes, individual data points organized as [photon, state] within each data sequence.

prior#

Prior probability matrix, 1D, size nstate

set_converged(self, converged)#
trans#

Transition probability matrix, square, dimenstions nstate x nstate

class H2MM_C.h2mm_limits(model=None, min_prior=None, max_prior=None, min_trans=None, max_trans=None, min_obs=None, max_obs=None, nstate=0, ndet=0)#

Special class for setting limits on the h2mm_model, as min and max values

Parameters:
  • model (h2mm_model, optional) – An h2mm_model to base the limits off of, if None. The main purpose is to allow the user to check that the limits are valid for the model. Specifying this model will also lock the states/streams of the model, while if None, the limits is more flexible If None, none of these checks will be in place The default is None.

  • min_prior (float or 1D numpy array, optional) – The minimum value(s) of the prior array. If float, all values are set the same, but unless fixed elsewhere, the number of states is flexible. If None, no limits are set (all values min is 0.0) The default is None.

  • max_prior (float or 1D numpy array, optional) – The maximum value(s) of the prior array. If float, all values are set the same, but unless fixed elsewhere, the number of states is flexible. If None, no limits are set (all values min is 1.0) The default is None.

  • min_trans (float or 2D square numpy array, optional) – The minimum value(s) of the trans array. If float, all values are set the same, but unless fixed elsewhere, the number of states is flexible. Values on diagonal set to 0.0 If None, no limits are set (all values min is 0.0) The default is None.

  • max_trans (float or 2D square numpy array, optional) – The maximum value(s) of the trans array. If float, all values are set the same, but unless fixed elsewhere, the number of states is flexible. Values on diagonal set to 1.0 If None, no limits are set (all values min is 1.0) The default is None.

  • min_obs (float or 2D numpy array, optional) – The minimum value(s) of the obs array. If float, all values are set the same, but unless fixed elsewhere, the number of states is flexible. If None, no limits are set (all values min is 0.0) The default is None.

  • max_obs (float or 2D numpy array, optional) – The maximum value(s) of the obs array. If float, all values are set the same, but unless fixed elsewhere, the number of states is flexible. If None, no limits are set (all values min is 1.0) The default is None.

  • nstate (int, optional) – Number of states in the model to be optimized, if set to 0, and not specified elsewhere, number of states is flexible. The default is 0.

  • ndet (int, optional) – Number of streams in the model to be optimized, if set to 0, and not specified elsewhere, number of states is flexible. The default is 0.

make_model(self, model, warning=True)#

Method for checking the limits arrays generated from the input model

Parameters:
  • model (h2mm_model) – h2mm_model for which the limits are to be specified. Also, values are checked and a warning is raised if the model has values that are out of the range of the limits

  • warning (bool optional) – Whether or not to check of the initial model is within the bounds generated by make_model Default is True

Raises:
  • Exception – When user has reset a specific field so the field is no longer valid.

  • ValueError – Limits cannot be made for the given model.

Returns:

  • model (h2mm_model) – Model to be optimized with specified limits.

  • min_prior (1D numpy float array) – Minimum values for each element of the prior array

  • max_prior (1D numpy float array) – Maximum values for each element of the prior array

  • min_trans (2D square numpy float array) – Minimum values of each element of the trans array

  • max_trans (2D square numpy float array) – Maximum values of each element of the trans array

  • min_obs (2D numpy float array) – Minimum values of each element of the obs array

  • max_obs (2D numpy float array) – Maximum values of each element of the obs array