page merging3

Basic yodamerge tutorial

Note that example files file1.yodafile2.yoda are included in this directory.

The basic syntax of yodamerge is:

yodamerge -o newyoda.yoda file1.yoda file2.yoda # file3.yoda... etc

This creates a new file newyoda.yoda in which any Histo*D, Profile*D, and Counter objects are correctly statistically combined, by undoing any scaling which has been applied, then summing the objects and re-applying the required scaling.

yodastack -o newyoda.yoda file1.yoda file2.yoda # file3.yoda... etc

This does the same as the previous example, but forces simple stacking of histogram objects and other types, without applying the unscaling-rescaling procedure described (briefly) above. This will literally just add the numbers in each bin of histogram objects, and not worry that one run may be larger than the other (normally a weighted average is taken).

yodastack -o newyoda.yoda file1.yoda:1.23 file2.yoda:4.56 # file3.yoda:7.89... etc

Another useful feature shown above is that one can specify a custom scaling of the individual files, using the <file name>:<scaling float> convention.

In all these cases, assumption have been made here for Scatter*D objects. See below for details of how to modify those assumptions.

Options for merging Scatter*D objects

You may have noticed the warning:

WARNING: Scatter2D /REF/ATLAS_2019_I1725190/d01-x01-y01 merge assumes asymptotic statistics and equal run sizes
WARNING: Scatter2D /REF/ATLAS_2019_I1725190/d02-x01-y01 merge assumes asymptotic statistics and equal run sizes

Which is printed for every Scatter*D object in the files.

This is because the default behaviour for merging scatters is that the average of each point of a Scatter object is taken. This seemingly straightforward behaviour actually assumes, technically, that the yoda files which are being merged had the same number of events processed which they were filled. This default behaviour is equivalent to using the assume_mean argument for the –s2d-mode option (for 2D Scatter objects). For this and all that follows, similar options exist for 1D and 3D objects (–s1d-mode and –s3d-mode).

yodamerge file1.yoda file2.yoda -o newyoda.yoda --s2d-mode assume_mean

Other choices are possible… one may want to simply add the points from the Scatter*D objects together without averaging. This can be achieved using –s2d-mode add:

yodamerge file1.yoda file2.yoda -o newyoda.yoda --s2d-mode add 

Another behaviour might be that the points should not be added or averaged at all, but instead that the list of points from the various Scatter*D objects should be concatenated. So two 6-point Scatters would be merged into a single 12-point Scatter. This would be achieved using the combine argument :

yodamerge file1.yoda file2.yoda -o newyoda.yoda --s2d-mode combine 

Or finally, it may simply be that it does not make much sense to merge Scatter objects and that the safest thing to do is pick one of them, the fist to be encountered, and write that out to the final yoda file. In this case, use –s2d-mode first

Other options

There are a few other options which may be of use:

--type-mismatch-mode MODE # 'first'or 'scatter'

This option may be of use if you encounter objects with the same name but different types. Either you can just pick the first one and skip the rest, or convert everything to a Scatter object and treat it that way.

--no-veto-empty       

By default, empty (sumW=0) data objects are not written out to the final yoda file. This option writes out such objects.

Do it yourself!

If you need features which are not implemented here, or have your own specific mergign requirements, then you could also make a local copy of yodamerge and use the very powerful Python API to implement your own merging prescriptions.


Updated on 2022-08-21 at 16:46:27 +0100