ZfitWrapper

This is an easy-to-use wrapper around the zfit modules I commonly use for simple binned and unbinned fits of distributions. zfitwrapper aims to provide a JSON-based IO format for quick fits with zfit, without most of the code and zfit class handling that is usually required.

Lightweight Example (6 lines of code)

Install options:

  • Install from ETP GitLab: pip3 install git+https://gitlab.etp.kit.edu/aheidelbach/zfitwrapper
  • Install from DESY GitLab: pip3 install git+https://gitlab.desy.de/alexander.heidelbach/zfitwrapper
  • Flexible install: go to the location where you want to install the package and clone the repository
    cd your_repository
    git clone https://gitlab.desy.de/alexander.heidelbach/zfitwrapper.git
    then change into the cloned directory and run the installation
    cd zfitwrapper
    pip3 install -e ./ --user

import numpy as np  # used below to simulate data

from zfitwrapper import *

# Define overall fit space
obs = utilities.set_obs("Example", (-3, 3))

# Define model properties (or read from JSON)
modelparameter = {
    "Model" : {
        "pdf": "Gauss",
        "parameter": {
            "mu" : {
                "name": "mu",
                "value": 0
            },
            "sigma": {
                "name": "sigma",
                "value": 1
            }
        }
    }
}

# Initialize model
model = Models.Model(obs=obs, modelstring="Model", models=modelparameter)

# Simulate some data
arr = np.random.randn(10000)

# Initialize and execute fit
fit = Fitter.Fitter(arr, model)
fit.fit()
# Output
name      value  (rounded)        hesse               errors         minuit_minos    at limit
------  ------------------  -----------  -------------------  -------------------  ----------
mu             -0.00270004  +/-    0.01  -   0.01   +   0.01  -   0.01   +   0.01       False
sigma             0.989004  +/-  0.0075  - 0.0074   + 0.0075  - 0.0074   + 0.0075       False

The Main Idea

A general fit consists of three core aspects:

  1. The model which should be fitted to the provided data,
  2. the loss function which provides a parameterization of the quality of the fit, and
  3. the minimisation process of the loss function.
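
To make the three aspects concrete, here is a minimal sketch that uses only the Python standard library and no zfit at all: a Gaussian model, its negative log-likelihood as the loss, and a naive coordinate search as the minimiser. The function names (gauss_logpdf, nll, minimise) are hypothetical and chosen for illustration only; zfit and zfitwrapper rely on far more capable implementations.

```python
import math
import random
import statistics


def gauss_logpdf(x, mu, sigma):
    """1. Model: log of the normalised Gaussian density at x."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def nll(params, data):
    """2. Loss: negative log-likelihood of the data under the model."""
    mu, sigma = params
    if sigma <= 0:
        return math.inf  # unphysical width, reject this trial point
    return -sum(gauss_logpdf(x, mu, sigma) for x in data)


def minimise(loss, start, step=0.5, tol=1e-6):
    """3. Minimisation: naive coordinate search that halves the step when stuck."""
    params, best = list(start), loss(start)
    while step > tol:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                value = loss(trial)
                if value < best:
                    params, best, improved = trial, value, True
        if not improved:
            step /= 2
    return params


random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(2000)]
mu_hat, sigma_hat = minimise(lambda p: nll(p, data), start=[1.0, 2.0])
print(f"fit:         mu = {mu_hat:.4f}, sigma = {sigma_hat:.4f}")

# Cross-check: for a Gaussian the maximum-likelihood estimates are known in closed form.
print(f"closed form: mu = {statistics.fmean(data):.4f}, sigma = {statistics.pstdev(data):.4f}")
```

The coordinate search is deliberately simple and only suitable for a smooth two-parameter toy problem like this one.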

Luckily, zfit takes care of most of this with the help of several containers and modules. Working with zfit, you create a zfit.Space in which all the following containers live and whose limits are the boundaries for the minimization process. You then define (global) zfit.Parameter objects, which serve as inputs to the zfit.pdf that defines the model to be minimized. Next, you define a zfit.Data container and combine it with your model to build the zfit.loss, which you eventually minimize.

By this point you have made a lot of decisions. While repeating many different fits, I noticed that much of the work with the zfit modules is repetitive: in about 80% of the cases you follow the same structure every time, only swapping a few lines of code here and there. This package takes care of these small tweaks and provides access to the commonly used methods. You can write down the fit procedure in a single JSON file and focus on choosing the correct PDF model, starting parameters, and so on.

Since zfit offers many more options (such as constraints, toy sampling, limit setting, ...) than I have implemented so far, contributions that help implement them are very welcome.

Building Blocks

Fit Model

The general IO structure is based on the final model you want to fit to your data. An example input requires the fit range (or the respective zfit.Space), the model definition and the model description. E.g.:

{
    "limits": [-1, 1],
    "modelname": "Latex Name",
    "modelstring": "model1 + model2",
    "modelparameter": {
        "model1" : {
            "pdf": "Gauss",
            "parameter": {
                "mu": {},
                "sigma": {}
            }
        },
        "model2" : {}
    }
}

Here, limits and modelname are optional. limits can be used to build the zfit.Space with the set_obs() utility function. modelname can be passed to the Model object and later used as its label when printing.

modelstring and modelparameter are mandatory to build the Model. Every model name in the modelstring must appear as a key in modelparameter, where it is used to build the individual PDF. Currently, the supported operators are

Furthermore, each PDF description must include a pdf keyword that selects the PDF to use. Currently, the supported PDFs are "Gauss", "Cauchy", "CrystalBall" and "DoubleCB". Another required key for a single model is parameter. This dictionary contains all the parameters that the corresponding zfit.pdf requires; the keys should match the parameter names used by zfit. The possible parameter options are explained in the next section. Each model can currently also have a limits keyword to specify a different space in which the model is evaluated.
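
The rules above lend themselves to a quick pre-flight check before a configuration is handed to the package. The sketch below is not part of zfitwrapper: check_config is a hypothetical helper, and it assumes that the model names in modelstring are plain identifier-like tokens separated by operators.

```python
import json
import re


def check_config(config):
    """Return a list of problems found in a zfitwrapper-style config dict."""
    problems = []
    for key in ("modelstring", "modelparameter"):
        if key not in config:
            problems.append(f"missing mandatory key '{key}'")
    if problems:
        return problems

    # Assumption: model names are identifier-like tokens in the modelstring.
    names = re.findall(r"[A-Za-z_]\w*", config["modelstring"])
    for name in names:
        model = config["modelparameter"].get(name)
        if model is None:
            problems.append(f"'{name}' is used in modelstring but not defined")
            continue
        for key in ("pdf", "parameter"):
            if key not in model:
                problems.append(f"model '{name}' lacks the '{key}' key")
    return problems


config = json.loads("""
{
    "limits": [-1, 1],
    "modelstring": "model1 + model2",
    "modelparameter": {
        "model1": {
            "pdf": "Gauss",
            "parameter": {
                "mu": {"name": "mu", "value": 0},
                "sigma": {"name": "sigma", "value": 1}
            }
        },
        "model2": {}
    }
}
""")
for problem in check_config(config):
    print(problem)
```

With the empty model2 entry from the example, the check reports that model2 lacks both the pdf and the parameter key.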

Fit Parameter

The Fitparameter class is built mainly on the zfit.Parameter container. Its task is to bridge the user's parameter handling and zfit's global parameter handling. Each parameter requires:

  • name (as in zfit), and
  • value (as in zfit).

Additional properties are:

  • lower (lower bound) (as in zfit)
  • upper (upper bound) (as in zfit)
  • step_size (granularity) (as in zfit)
  • floating (if true, parameter is fitted) (as in zfit)
  • latex_name (nice LaTeX expression)
  • unit (nice LaTeX expression)
  • shuffle (if true, this parameter will get a random value inside bounds in the case of a failed fit)
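
As an illustration, a fully specified parameter entry might look like the dictionary below. The accompanying check is a hypothetical helper, not part of zfitwrapper; it encodes only the two requirements listed above (name and value) plus the assumption that value must lie within [lower, upper] when bounds are given. The numbers and the unit are made up.

```python
def check_parameter(param):
    """Return a list of problems found in one parameter entry."""
    problems = [f"missing required key '{key}'" for key in ("name", "value") if key not in param]
    if problems:
        return problems
    # Assumption: a missing bound means the parameter is unbounded on that side.
    lower = param.get("lower", float("-inf"))
    upper = param.get("upper", float("inf"))
    if not lower <= param["value"] <= upper:
        problems.append(f"{param['name']}: value {param['value']} outside [{lower}, {upper}]")
    return problems


# Illustrative entry using every property from the lists above.
mu = {
    "name": "mu",
    "value": 0.0,
    "lower": -1.0,
    "upper": 1.0,
    "step_size": 0.01,
    "floating": True,
    "latex_name": r"\mu",
    "unit": r"\mathrm{GeV}",
    "shuffle": True,
}
print(check_parameter(mu))
```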

Fit

This package provides Fitter and HistFitter for unbinned and binned fits, respectively. Both require the Model to fit. Fitter takes the unbinned data array, while HistFitter takes the histogrammed data counts in a flat binning over the given space. At present, the general fit strategy is to minimize the resulting likelihood and, if that fails, to retry the fit automatically (with shuffled parameters). After a successful fit, the fitted parameters are updated in the Models class and the results are written as a list in LaTeX format to the result_text property of the fitter used.
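
The retry strategy can be sketched as follows. This is not the package's actual implementation: fit_with_retries, run_fit and the max_tries limit are hypothetical names, and run_fit stands in for whatever performs the minimisation and reports success. The sketch also assumes that only parameters flagged with shuffle and carrying both bounds are re-drawn.

```python
import random


def fit_with_retries(run_fit, params, max_tries=5, rng=None):
    """Call run_fit(params); on failure, shuffle flagged parameters and retry."""
    rng = rng or random.Random()
    for attempt in range(1, max_tries + 1):
        if run_fit(params):
            return attempt  # number of attempts that were needed
        for par in params.values():
            if par.get("shuffle") and "lower" in par and "upper" in par:
                par["value"] = rng.uniform(par["lower"], par["upper"])
    raise RuntimeError(f"fit did not converge after {max_tries} attempts")


# Toy stand-in for a real fit: "converges" only if mu starts close to 0.
def toy_fit(params):
    return abs(params["mu"]["value"]) < 0.5


params = {
    "mu": {"name": "mu", "value": 2.5, "lower": -3.0, "upper": 3.0, "shuffle": True},
    "sigma": {"name": "sigma", "value": 1.0},
}
attempts = fit_with_retries(toy_fit, params, max_tries=100, rng=random.Random(0))
print(f"converged after {attempts} attempt(s), mu start value {params['mu']['value']:.2f}")
```

Capping the number of attempts keeps a model that can never converge from looping forever; whether zfitwrapper applies such a cap is not stated here.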