smoofit.model.ChannelContrib

class smoofit.model.ChannelContrib(channel_name, yields, sumw2=None, sub_procs=None)

Specify how a process contributes to a channel.

Channels are any orthogonal selections of events: they can be data-taking eras, signal/control regions, lepton flavour channels, bins, etc.

ChannelContrib objects specify how processes contribute to a given channel, and also implicitly define which channels are to be considered in the fit. Indeed, channels are never declared explicitly, but are collected through all the ChannelContrib objects of all processes registered with a model. The name of a ChannelContrib is used to collect all contributions to a given channel.

If a process does not contribute to a given channel (i.e. there is no corresponding ChannelContrib), templates with zero yields are automatically inserted.
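The implicit channel collection and zero-filling described above can be pictured with plain dictionaries. This is only a sketch of the idea, not smoofit code: the `contribs` mapping, the process and channel names, and the 1D templates are all hypothetical stand-ins.

```python
import numpy as np

# Hypothetical stand-in: process name -> {channel name -> nominal yields}.
contribs = {
    "signal": {"SR": np.array([1.0, 2.0, 3.0])},
    "background": {"SR": np.array([10.0, 8.0, 6.0]), "CR": np.array([50.0, 40.0])},
}

# Channels are never declared: they are the union of all contribution names.
channels = {}
for proc_contribs in contribs.values():
    for name, yields in proc_contribs.items():
        channels.setdefault(name, yields.shape)

# A process with no contribution to a channel gets an all-zero template there.
templates = {
    proc: {ch: proc_contribs.get(ch, np.zeros(shape)) for ch, shape in channels.items()}
    for proc, proc_contribs in contribs.items()
}
```

Here `signal` has no `CR` contribution, so `templates["signal"]["CR"]` is a two-bin array of zeros.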

Systematic uncertainties are also specified through ChannelContrib objects, and so are the statistical uncertainties on the templates.

Once a ChannelContrib object has been defined, it has to be registered with the process it belongs to, which is done using the Process.add_contrib() method.

Methods

`__init__`(channel_name, yields[, sumw2, ...])	Constructor
`add_lnN`(var, lnN[, sub_vars, sub_procs])	Add a log-normal systematic uncertainty in this channel
`add_shape_syst`(var, up, down[, sub_vars, ...])	Add a shape systematic uncertainty in this channel
`register_sumw2`(sumw2)	Register the sums of squared weights for bin-by-bin statistical uncertainties
`scale_bins_by`(var)	Register a `Variable` as a linear yield modifier across bins
`scale_by`(var)	Register a `Variable` as a linear yield modifier (e.g. signal strength modifier)
`scale_by_fn`(fn, variables)	Scale the yields in this channel by an arbitrary function

__init__(channel_name, yields, sumw2=None, sub_procs=None)

Constructor

Parameters

channel_name (str) – the name of the channel - used to collect the contributions of various processes to a given channel
yields (Union[List[float], ndarray]) – the nominal yields of the considered process in this channel, as a 1D or 2D numpy array
sumw2 (Union[List[float], ndarray, None]) – the sums of squared weights of the considered process in this channel, used to evaluate the template statistical uncertainties. Should have the same shape as the yields. Defaults to zero, i.e. infinite statistics.
sub_procs (Optional[List[str]]) – list of sub-processes of the considered process which contribute to this channel. The length of this list should match the length of the first axis of yields. This list should be a sub-list of the sub-processes of the Process that this contribution will be attached to. You might use the Process.sub_procs attribute if all of the sub-processes contribute.
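The shape constraints on the constructor arguments can be summarised in a small check. The helper `check_contrib_inputs` below is hypothetical (it is not part of smoofit) and only mirrors the rules stated in the parameter descriptions; the sub-process names are made up.

```python
import numpy as np

def check_contrib_inputs(yields, sumw2=None, sub_procs=None):
    """Hypothetical helper mirroring the documented shape constraints."""
    yields = np.asarray(yields, dtype=float)
    if yields.ndim not in (1, 2):
        raise ValueError("yields must be a 1D or 2D array")
    # A missing sumw2 is treated as zero, i.e. infinite template statistics.
    sumw2 = np.zeros_like(yields) if sumw2 is None else np.asarray(sumw2, dtype=float)
    if sumw2.shape != yields.shape:
        raise ValueError("sumw2 must have the same shape as yields")
    if sub_procs is not None and len(sub_procs) != yields.shape[0]:
        raise ValueError("len(sub_procs) must match the first axis of yields")
    return yields, sumw2

# Two sub-processes, three bins each.
yields, sumw2 = check_contrib_inputs(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], sub_procs=["ggH", "VBF"]
)
```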

add_lnN(var, lnN, sub_vars=None, sub_procs=None)

Add a log-normal systematic uncertainty in this channel

The log-normal uncertainties can be specified in several ways, depending on the shape of the lnN argument:

scalar: single uncertainty source (i.e. either var is scalar or sub_vars has length one), same uncertainty for all sub-processes (if relevant).
(n_sub_procs,): single uncertainty source (i.e. either var is scalar or sub_vars has length one), different uncertainty for each sub-process. The length should match the number of sub-processes of the process or the length of specified sub_procs.
(n_var,): separate uncertainty sources, same uncertainties for all sub-processes. The length should match the dimensionality of var or the length of specified sub_vars.
(n_var, n_sub_procs): separate uncertainty sources, different uncertainty for each sub-process (same rules as above apply).

Any of the above options can be passed directly as the argument, in which case the uncertainty will be a symmetric log-normal, or as a tuple (up, down), in which case the uncertainty will be an asymmetric log-normal.
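One way to picture the accepted shapes is to expand every specification to a common `(n_var, n_sub_procs)` pair of arrays. The helper `expand_lnN` below is a hypothetical sketch, not the library's implementation. In particular, telling a 1D input apart by checking whether there is a single source is an assumption made for this illustration.

```python
import numpy as np

def expand_lnN(lnN, n_var=1, n_sub_procs=1):
    """Expand an lnN specification to (K_up, K_down), each of shape (n_var, n_sub_procs).

    Hypothetical helper: a tuple is read as (up, down), anything else as symmetric.
    """
    if isinstance(lnN, tuple):
        up, down = lnN
    else:
        up = down = lnN

    def expand(k):
        k = np.asarray(k, dtype=float)
        if k.ndim == 0:
            if n_var != 1:
                raise ValueError("a scalar lnN describes a single uncertainty source")
            k = k.reshape(1, 1)   # scalar: one source, same for all sub-processes
        elif k.ndim == 1 and n_var == 1:
            k = k[None, :]        # (n_sub_procs,): one source, per sub-process
        elif k.ndim == 1:
            k = k[:, None]        # (n_var,): several sources, same for all sub-processes
        # 2D input must already be (n_var, n_sub_procs); mismatches raise here.
        return np.broadcast_to(k, (n_var, n_sub_procs)).copy()

    return expand(up), expand(down)

# Symmetric 10% uncertainty shared by three sub-processes.
up_sym, down_sym = expand_lnN(1.1, n_sub_procs=3)
# Two asymmetric sources, each identical for all three sub-processes.
up_asym, down_asym = expand_lnN(([1.03, 1.02], [1.05, 1.04]), n_var=2, n_sub_procs=3)
```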

Given the (Gaussian-constrained) nuisance parameter \(\alpha\) for a particular uncertainty source, the process yields are multiplied by a factor \(K(\alpha)^{\alpha}\) where \(K(\alpha)\) is defined as:

\[\begin{split}& K_{\text{up}} & \text{ if } \alpha > 1 \\ & \left( \frac{\alpha}{4} (3 - \alpha^2 ) (K_{\text{up}}-K_{\text{down}}) + \frac12 (K_{\text{up}}+K_{\text{down}}) \right) & \text{ if } |\alpha| \leq 1 \\ & K_{\text{down}} & \text{ if } \alpha < -1\end{split}\]

This inter-/extrapolation is such that \(K(1) = K_{\text{up}}\) and \(K(-1) = K_{\text{down}}\), and its second derivative is continuous.

Note that this assumes that \(K_{\text{up}} = \text{up}/\text{nominal}\) and \(K_{\text{down}} = \text{nominal}/\text{down}\), so that a 3% up and 5% down asymmetric uncertainty is given as \((K_{\text{up}}, K_{\text{down}}) = (1.03, 1.05)\). Conversely, a -3% “up” and -5% “down” uncertainty is given as \((K_{\text{up}}, K_{\text{down}}) = (0.97, 0.95)\). Specifying e.g. \((K_{\text{up}}, K_{\text{down}}) = (1.03, 0.95)\) will however result in a one-sided variation, which is typically not what is desired.
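The factor \(K(\alpha)^{\alpha}\) is straightforward to evaluate numerically. The function below is a minimal NumPy sketch of the formula as written here, not the library's code; the name `lnN_factor` is hypothetical. Clipping \(\alpha\) to \([-1, 1]\) before evaluating the cubic reproduces the two constant branches, since the cubic equals \(K_{\text{up}}\) at \(\alpha = 1\) and \(K_{\text{down}}\) at \(\alpha = -1\).

```python
import numpy as np

def lnN_factor(alpha, k_up, k_down):
    """Yield multiplier K(alpha)**alpha for an asymmetric log-normal uncertainty."""
    alpha = np.asarray(alpha, dtype=float)
    # Clipped alpha: the cubic gives K_up for alpha >= 1 and K_down for alpha <= -1.
    a = np.clip(alpha, -1.0, 1.0)
    k = 0.25 * a * (3.0 - a**2) * (k_up - k_down) + 0.5 * (k_up + k_down)
    return k**alpha

# (K_up, K_down) = (1.03, 1.05): yields go up at alpha = +1 and down at alpha = -1.
two_sided = lnN_factor(np.array([-1.0, 0.0, 1.0]), 1.03, 1.05)
# (K_up, K_down) = (1.03, 0.95): yields go up at both alpha = +1 and alpha = -1.
one_sided = lnN_factor(np.array([-1.0, 0.0, 1.0]), 1.03, 0.95)
```

Because the factor at \(\alpha = -1\) is \(1/K_{\text{down}}\), a value \(K_{\text{down}} < 1\) raises the yields there, which is why the last combination is one-sided.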

Parameters

var (Variable) – the Variable nuisance parameter corresponding to this uncertainty
lnN (Union[float, Tuple[float, float], ndarray, Tuple[ndarray, ndarray]]) – specification of the log-normal uncertainties
sub_vars (Union[List[str], List[int], None]) – list of sub-variable names in case var is vector-valued but only a subset of its components is to be used as nuisance parameters
sub_procs (Union[List[str], List[int], None]) – list of sub-process names in case this uncertainty only affects a subset of the sub-processes in this channel

add_shape_syst(var, up, down, sub_vars=None, sub_procs=None)

Add a shape systematic uncertainty in this channel

The uncertainty is specified through a pair of arrays corresponding to the up and down variations of the nuisance parameter (by \(\pm 1 \sigma\)).

The shape of the arrays should be ([n_sources,][n_sub_procs,]n_bins), where:

n_sources is the number of uncertainty sources, matching either the dimension of var or the number of entries in sub_vars. This axis might not be present, if only a single source is considered.
n_sub_procs is the number of sub-processes of the considered process, or the length of the sub_procs argument. This axis might not be present, if the process only has a single component.
n_bins is the only mandatory axis and should match the number of bins in this channel.

In summary, the up and down arrays might be 1D, 2D or 3D.

Writing \(H^{\text{nom}}_i = I^{\text{nom}} N^{\text{nom}}_i\) for the nominal template and bin i, with \(I\) the integral of the template yields, and similarly for the up/down variations \(H^{\text{up/down}}_i = I^{\text{up/down}} N^{\text{up/down}}_i\) (for any given uncertainty source), the templates are interpolated vertically using the following scheme:

\[H_i(\alpha) = I(\alpha) N_i(\alpha)\]

where \(\alpha\) is the (Gaussian-constrained) nuisance parameter, \(I(\alpha)\) is an inter/extrapolation of the template normalization using an asymmetric log-normal uncertainty (with \(K^{\text{up}} = I^{\text{up}} / I^{\text{nom}}\) and \(K^{\text{down}} = I^{\text{nom}} / I^{\text{down}}\), so that \(I(1) = I^{\text{up}}\) and \(I(-1) = I^{\text{down}}\)), and \(N_i(\alpha)\) is a morphing of the normalized template shapes defined as:

\[\begin{split}& N^{\text{up}}_i + (\alpha - 1) ( N^{\text{up}}_i - N^{\text{nom}}_i ) & \text{ if } \alpha > 1 \\ & N^{\text{nom}}_i + \frac{\alpha}{2} \left( N_{i}^{\text{up}} - N_{i}^{\text{down}} \right) + \frac{1}{16} ( 3 \alpha^6 - 10 \alpha^4 + 15 \alpha^2 ) \left( N_{i}^{\text{up}} + N_{i}^{\text{down}} - 2 N_{i}^{\text{nom}} \right) & \text{ if } |\alpha| \leq 1 \\ & N^{\text{down}}_i - (\alpha + 1) ( N^{\text{down}}_i - N^{\text{nom}}_i ) & \text{ if } \alpha < -1 \\\end{split}\]

The resulting interpolation satisfies \(H_i(1) = H^{\text{up}}_i\), \(H_i(-1) = H^{\text{down}}_i\) and \(H_i(0) = H^{\text{nom}}_i\), and its second derivative is continuous.

Note that what is actually interpolated is a factor \(F_i(\alpha) = H_i(\alpha) / H^{\text{nom}}_i\). Those factors for all the systematic sources of the model are multiplied together and with the (fixed) nominal yields \(H^{\text{nom}}_i\) to define the “predicted” yields.
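The interpolation scheme can be checked numerically with a short NumPy sketch. The function names are hypothetical and this is not the library's implementation. It handles a scalar \(\alpha\), a single source and a single 1D template. The down normalization factor is taken as \(I^{\text{nom}} / I^{\text{down}}\), an assumption chosen so that \(H_i(-1) = H^{\text{down}}_i\) holds with the \(K(\alpha)^{\alpha}\) form used for log-normal uncertainties.

```python
import numpy as np

def lnN_factor(alpha, k_up, k_down):
    """K(alpha)**alpha, with the cubic clipped to K_up / K_down outside [-1, 1]."""
    a = np.clip(alpha, -1.0, 1.0)
    k = 0.25 * a * (3.0 - a**2) * (k_up - k_down) + 0.5 * (k_up + k_down)
    return k**alpha

def morph_shape(alpha, n_nom, n_up, n_down):
    """Normalized shape N_i(alpha): polynomial inside [-1, 1], linear outside."""
    if alpha > 1.0:
        return n_up + (alpha - 1.0) * (n_up - n_nom)
    if alpha < -1.0:
        return n_down - (alpha + 1.0) * (n_down - n_nom)
    poly = (3.0 * alpha**6 - 10.0 * alpha**4 + 15.0 * alpha**2) / 16.0
    return n_nom + 0.5 * alpha * (n_up - n_down) + poly * (n_up + n_down - 2.0 * n_nom)

def morph_template(alpha, h_nom, h_up, h_down):
    """H_i(alpha) = I(alpha) * N_i(alpha) for one source and a scalar alpha."""
    h_nom, h_up, h_down = (np.asarray(h, dtype=float) for h in (h_nom, h_up, h_down))
    i_nom, i_up, i_down = h_nom.sum(), h_up.sum(), h_down.sum()
    # Assumption: K_up = I_up / I_nom and K_down = I_nom / I_down.
    norm = i_nom * lnN_factor(alpha, i_up / i_nom, i_nom / i_down)
    shape = morph_shape(alpha, h_nom / i_nom, h_up / i_up, h_down / i_down)
    return norm * shape

h_nom = np.array([10.0, 20.0, 30.0])
h_up = np.array([12.0, 22.0, 31.0])
h_down = np.array([9.0, 18.0, 29.0])

# Per-bin factor F_i(alpha) = H_i(alpha) / H_i^nom at alpha = 0.5.
factor = morph_template(0.5, h_nom, h_up, h_down) / h_nom
```

Evaluating `morph_template` at \(\alpha = 1, -1, 0\) returns `h_up`, `h_down` and `h_nom` respectively, up to floating-point rounding.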

Parameters

var (Variable) – the Variable nuisance parameter corresponding to this uncertainty
up (Union[List[float], ndarray]) – the “up” variation (\(+1\sigma\))
down (Union[List[float], ndarray]) – the “down” variation (\(-1\sigma\))
sub_vars (Union[List[str], List[int], None]) – list of sub-variable names in case var is vector-valued but only a subset of its components is to be used as nuisance parameters
sub_procs (Union[List[str], List[int], None]) – list of sub-process names in case this uncertainty only affects a subset of the sub-processes in this channel

register_sumw2(sumw2)

Register the sums of squared weights for bin-by-bin statistical uncertainties

Parameters: sumw2 (Union[List[float], ndarray]) – 1D or 2D numpy array with the sums of squared weights for each bin, should have the same shape as the yields in this channel
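As a reminder of what the sums of squared weights encode, the per-bin statistical uncertainty of a weighted template is the square root of `sumw2`. The snippet below only illustrates that relation with made-up numbers. How smoofit turns these values into bin-by-bin nuisance parameters is not shown here.

```python
import numpy as np

# Made-up templates for two sub-processes and three bins.
yields = np.array([[100.0, 50.0, 0.0], [20.0, 10.0, 5.0]])
sumw2 = np.array([[4.0, 1.0, 0.0], [2.0, 1.0, 0.5]])
assert sumw2.shape == yields.shape  # same shape as the yields, as required

# Absolute per-bin uncertainty: square root of the sum of squared weights.
abs_unc = np.sqrt(sumw2)
# Relative uncertainty; empty bins are set to zero instead of dividing by zero.
rel_unc = np.divide(abs_unc, yields, out=np.zeros_like(yields), where=yields > 0)
```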

scale_bins_by(var)

Register a Variable as a linear yield modifier across bins

If the parameter is a scalar, the yields in this channel, for all sub-processes registered with this channel, will be scaled linearly by the parameter.

If the parameter is vector-valued, the dimensionality of the variable should match the number of bins of this channel. The yields in each bin will then be scaled linearly by a single component of the variable.

Parameters: var (Variable) – a Variable scaling the yields of this process

scale_by(var)

Register a Variable as a linear yield modifier (e.g. signal strength modifier)

If the parameter is a scalar, the yields in this channel, for all sub-processes registered with this channel, will be scaled linearly by the parameter.

If the parameter is vector-valued, the dimensionality of the variable should match the number of sub-processes registered with this channel. Each of those sub-processes will then be scaled linearly by a single component of the variable (restricted to this channel).

Parameters: var (Variable) – a Variable scaling the yields of this process
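The difference between scale_by() and scale_bins_by() for vector-valued parameters comes down to which axis of the yields the parameter runs along. The NumPy sketch below uses plain arrays in place of Variable objects and made-up numbers, assuming yields of shape `(n_sub_procs, n_bins)`.

```python
import numpy as np

yields = np.array([[10.0, 20.0, 30.0], [1.0, 2.0, 3.0]])  # (n_sub_procs, n_bins)

# Scalar parameter: every bin of every sub-process gets the same factor.
scaled_scalar = yields * 2.0

# scale_by-like behaviour: one component per sub-process (first axis).
per_proc = np.array([1.5, 0.0])
scaled_by_proc = yields * per_proc[:, None]

# scale_bins_by-like behaviour: one component per bin (last axis).
per_bin = np.array([1.0, 0.5, 2.0])
scaled_by_bin = yields * per_bin[None, :]
```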

scale_by_fn(fn, variables)

Scale the yields in this channel by an arbitrary function

The rescaling is restricted to this channel and to the registered sub-processes, i.e. the length of the first axis of the return value of fn should match the number of sub-processes of this channel.

See Process.scale_by_fn() for more information about how fn is expected to behave.

Parameters

fn (Callable[[DeviceArrayBase], DeviceArrayBase]) – a callable returning the factors by which the yields should be scaled
variables (Union[Variable, Tuple[Variable]]) – a Variable or a tuple of Variable objects whose values will be passed to fn as positional arguments in the form of 1D jnp.DeviceArray arrays
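As an illustration of the expected call pattern, the sketch below defines a function of two variables returning one factor per sub-process. Everything in it is hypothetical: the function, the parameter names `mu` and `kappa`, and the way the factors are applied to the yields. NumPy is used as a stand-in for `jax.numpy`, and Process.scale_by_fn() remains the reference for the actual contract of fn.

```python
import numpy as np  # stand-in for jax.numpy in this sketch

def fn(mu, kappa):
    """Hypothetical scaling function; each argument is a 1D array of variable values."""
    # First sub-process scales as mu * kappa**2, second as mu only.
    return np.array([mu[0] * kappa[0] ** 2, mu[0]])

yields = np.array([[10.0, 20.0], [1.0, 2.0]])  # (n_sub_procs, n_bins)

# The variable values are passed as positional 1D arrays.
factors = fn(np.array([2.0]), np.array([3.0]))
# First axis of the result matches the number of sub-processes in this channel.
assert factors.shape[0] == yields.shape[0]

# Assumption for this sketch: one factor per sub-process, applied to all bins.
scaled = yields * factors[:, None]
```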