ZfitWrapper
This is an easy-to-use wrapper around the zfit modules I commonly use for simple binned and unbinned fits of distributions. zfitwrapper aims to provide a JSON-based IO format for quick fits with zfit, without most of the lines of code and the handling of zfit classes that are usually required.
Install options:
- Install from ETP GitLab:
pip3 install git+https://gitlab.etp.kit.edu/aheidelbach/zfitwrapper
- Install from DESY GitLab:
pip3 install git+https://gitlab.desy.de/alexander.heidelbach/zfitwrapper
- Flexible install: clone the repository at the location where you want to install the package, then install it in editable mode:
cd your_repository
git clone https://gitlab.desy.de/alexander.heidelbach/zfitwrapper.git
cd zfitwrapper
pip3 install -e ./ --user
Lightweight Example (6 lines of code)
import numpy as np
from zfitwrapper import *
# Define overall fit space
obs = utilities.set_obs("Example", (-3, 3))
# Define model properties (or read from JSON)
modelparameter = {
    "Model": {
        "pdf": "Gauss",
        "parameter": {
            "mu": {
                "name": "mu",
                "value": 0
            },
            "sigma": {
                "name": "sigma",
                "value": 1
            }
        }
    }
}
# Initialize model
model = Models.Model(obs=obs, modelstring="Model", models=modelparameter)
# Simulate some data
arr = np.random.randn(10000)
# Initialize and execute fit
fit = Fitter.Fitter(arr, model)
fit.fit()
# Output
name    value (rounded)     hesse        errors               minuit_minos         at limit
------  ------------------  -----------  -------------------  -------------------  ----------
mu      -0.00270004         +/- 0.01     - 0.01 + 0.01        - 0.01 + 0.01        False
sigma   0.989004            +/- 0.0075   - 0.0074 + 0.0075    - 0.0074 + 0.0075    False
The Main Idea
A general fit consists of three core aspects:
- The model which should be fitted to the provided data,
- the loss function which provides a parameterization of the quality of the fit, and
- the minimisation process of the loss function.
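To make these three ingredients concrete, here is a minimal, dependency-free sketch. It is not how zfit or zfitwrapper work internally: the names gauss_logpdf, nll and minimise are hypothetical, and the naive coordinate search only stands in for a real minimiser.

```python
import math
import random


def gauss_logpdf(x, mu, sigma):
    """Model: logarithm of a normalised Gaussian density."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def nll(data, mu, sigma):
    """Loss: negative log-likelihood, infinite for an unphysical width."""
    if sigma <= 0:
        return math.inf
    return -sum(gauss_logpdf(x, mu, sigma) for x in data)


def minimise(loss, start, step=0.5, tol=1e-4):
    """Minimisation: naive coordinate search that halves the step when stuck."""
    params = list(start)
    best = loss(params)
    while step > tol:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                value = loss(trial)
                if value < best:
                    params, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return params


# Simulated data: standard normal, as in the lightweight example
random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(2000)]

mu_fit, sigma_fit = minimise(lambda p: nll(data, p[0], p[1]), start=[0.5, 2.0])
print(f"mu = {mu_fit:.3f}, sigma = {sigma_fit:.3f}")
```

Even in this toy version, the choice of model, start values, and stopping criteria has to be made by hand for every fit.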
Luckily, zfit takes care of most of this with the help of several containers and modules. Working with zfit, you create a zfit.Space in which all subsequent containers live and whose limits are the boundaries for the minimization process. Next, you define the (global) zfit.Parameter objects that serve as input to the zfit.pdf that defines the model to be fitted. You then define a zfit.Data container and combine it with your model to build the zfit.loss, which you eventually minimize. Up to this point you have already made a lot of decisions, and while running many different fits I noticed that much of this work is repetitive, even with the help of the zfit modules. In about 80% of cases you follow the same structure every time, only swapping a few lines of code here and there. This package takes care of these small tweaks and provides access to the commonly used methods. You can write down the whole fit procedure in a single JSON file and focus on the choice of the correct PDF model, starting parameters, and so on.
Since zfit offers many more options (such as constraints, toy sampling, limit setting, ...) than I have implemented so far, contributions that help to implement them are very welcome.
Building Blocks
Fit Model
The general IO structure is based on the final model you want to fit to your data. An example input consists of the fit range (or the respective zfit.Space), the model definition, and the model description, e.g.:
{
    "limits": [-1, 1],
    "modelname": "Latex Name",
    "modelstring": "model1 + model2",
    "modelparameter": {
        "model1": {
            "pdf": "Gauss",
            "parameter": {
                "mu": {},
                "sigma": {}
            }
        },
        "model2": {}
    }
}
Here, limits and modelname are optional settings. limits can be used to build the zfit.Space with the set_obs() utility function. modelname can be passed to the Model object and is later used as its printed label.
modelstring and modelparameter are mandatory to successfully build the Model. Every model name in the modelstring should appear as a key in modelparameter, where it is used to build the individual PDF. Currently, the supported operators are:
- + (zfit.pdf.SumPDF)
- * (zfit.pdf.ProductPDF)
- ** (zfit.pdf.FFTConvPDFV1)
Furthermore, each PDF description should include a pdf key that defines the PDF to use. Currently, the supported PDFs are: "Gauss", "Cauchy", "CrystalBall" and "DoubleCB". Another required key for a single model is parameter. This dictionary contains all the parameters that the corresponding zfit.pdf requires; the keys should be the same as the ones used by zfit. The possible parameter options are explained in the next section. Currently, each model can also have a limits key to specify a different space in which the model is evaluated.
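The rules above can be checked before any fit is attempted. The following is a rough sketch using only the Python standard library; check_config is a hypothetical helper and not part of zfitwrapper, and the model names signal and background are made up for illustration.

```python
import json
import re

config_text = """
{
    "limits": [-1, 1],
    "modelstring": "signal + background",
    "modelparameter": {
        "signal": {
            "pdf": "Gauss",
            "parameter": {
                "mu": {"name": "mu_sig", "value": 0},
                "sigma": {"name": "sigma_sig", "value": 0.1}
            }
        },
        "background": {
            "pdf": "Gauss",
            "parameter": {
                "mu": {"name": "mu_bkg", "value": 0},
                "sigma": {"name": "sigma_bkg", "value": 1}
            }
        }
    }
}
"""

MANDATORY_KEYS = ("modelstring", "modelparameter")


def check_config(config):
    """Return a list of problems found in a fit configuration (hypothetical helper)."""
    problems = [f"missing mandatory key '{key}'" for key in MANDATORY_KEYS if key not in config]
    if problems:
        return problems
    # Everything in the modelstring that is not an operator is treated as a model name
    for name in re.findall(r"[A-Za-z_]\w*", config["modelstring"]):
        entry = config["modelparameter"].get(name)
        if entry is None:
            problems.append(f"model '{name}' is used in modelstring but not defined in modelparameter")
            continue
        for key in ("pdf", "parameter"):
            if key not in entry:
                problems.append(f"model '{name}' has no '{key}' key")
    return problems


config = json.loads(config_text)
print(check_config(config))
```

An empty list means that the structure is consistent; it does not verify that the PDF name or its parameter keys are accepted by zfit.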
Fit Parameter
The Fitparameter class is mainly built upon the zfit.Parameter container. Its task is to improve the communication between the user's parameter handling and the global parameter handling of zfit. Each parameter requires:
- name (as in zfit), and
- value (as in zfit).
Additional properties are:
- lower (lower bound, as in zfit)
- upper (upper bound, as in zfit)
- step_size (granularity, as in zfit)
- floating (if true, the parameter is fitted; as in zfit)
- latex_name (nice LaTeX expression)
- unit (nice LaTeX expression)
- shuffle (if true, this parameter gets a random value inside its bounds in the case of a failed fit)
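As an illustration, a fully specified parameter entry could look like the dictionary below. The check_parameter helper is hypothetical and only sketches the rules listed above (required keys, known optional keys, start value inside the bounds); the actual checks performed by Fitparameter may differ.

```python
REQUIRED_KEYS = ("name", "value")
OPTIONAL_KEYS = ("lower", "upper", "step_size", "floating", "latex_name", "unit", "shuffle")


def check_parameter(par):
    """Validate a single parameter description (hypothetical helper)."""
    missing = [key for key in REQUIRED_KEYS if key not in par]
    if missing:
        raise KeyError(f"missing required key(s): {missing}")
    unknown = sorted(set(par) - set(REQUIRED_KEYS) - set(OPTIONAL_KEYS))
    if unknown:
        raise KeyError(f"unknown key(s): {unknown}")
    # A missing bound is treated as unbounded in this sketch
    lower = par.get("lower", float("-inf"))
    upper = par.get("upper", float("inf"))
    if not lower <= par["value"] <= upper:
        raise ValueError(f"start value of '{par['name']}' lies outside [{lower}, {upper}]")
    return par


sigma = check_parameter({
    "name": "sigma",
    "value": 1.0,
    "lower": 0.1,
    "upper": 5.0,
    "step_size": 0.01,
    "floating": True,
    "latex_name": r"$\sigma$",
    "unit": "GeV",
    "shuffle": True,
})
print(sigma["name"], sigma["value"])
```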
Fit
This package provides Fitter and HistFitter as fit methods for unbinned and binned fits, respectively. Both require the Model to fit. Fitter takes the unbinned data array, while HistFitter takes the histogrammed data counts in a flat binning in the given space. Currently, the general fit strategy is to minimize the resulting likelihood and, if that fails, to automatically retry the fit (with shuffled parameters). After a successful fit, the fitted parameters are updated in the Model object and the results are written as a list in LaTeX format to the result_text property of the fitting method used.
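The retry strategy can be pictured roughly as follows. This is only a sketch under assumptions: fit_with_retries, shuffle_parameters, toy_fit and the max_tries limit are hypothetical names that are not taken from the package, and the toy fit merely imitates a minimiser that converges only from a good starting point.

```python
import random


def shuffle_parameters(parameters, rng):
    """Give each parameter flagged with shuffle a random start value inside its bounds."""
    for par in parameters.values():
        if par.get("shuffle") and "lower" in par and "upper" in par:
            par["value"] = rng.uniform(par["lower"], par["upper"])


def fit_with_retries(run_fit, parameters, max_tries=5, seed=0):
    """Call run_fit until it reports success; return the attempt number or None."""
    rng = random.Random(seed)
    for attempt in range(1, max_tries + 1):
        if run_fit(parameters):
            return attempt
        # Failed fit: draw new start values and try again
        shuffle_parameters(parameters, rng)
    return None


def toy_fit(parameters):
    """Stand-in for a real fit: succeeds only from a start value near zero."""
    return abs(parameters["mu"]["value"]) < 0.5


parameters = {
    "mu": {"name": "mu", "value": 2.9, "lower": -3.0, "upper": 3.0, "shuffle": True}
}
tries = fit_with_retries(toy_fit, parameters, max_tries=50)
print(f"attempts needed: {tries}")
```

Parameters without the shuffle flag or without both bounds keep their start value in this sketch, so a fit that fails for other reasons simply exhausts max_tries.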