Trim that Graph...

There is a nice Service in the CMSSW framework that lets the user locally save, in dot format, the full dependency graph of the workflow to which the Service is attached. Activating it is as easy as adding the following line to the configuration:

process.DependencyGraph = cms.Service('DependencyGraph')

There are a few configuration parameters that you can change. The full list of them is available here. The most interesting one is highlightModules, which lets the user single out specific modules (the ones supplied via this option) by filling them with a different colour in the final image.
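As an illustration, highlighting a module could look like the fragment below. This is only a sketch: it assumes that highlightModules is an untracked list of module labels (check the parameter list linked above), that `process` is the cms.Process object already defined in your configuration, and that `cms` is imported as usual from FWCore.ParameterSet.Config. The label HGCalRecHit is used purely as an example.

```python
# Configuration fragment, meant to be pasted into an existing CMSSW config.
# Assumption: highlightModules is an untracked vstring of module labels.
process.DependencyGraph = cms.Service(
    'DependencyGraph',
    highlightModules=cms.untracked.vstring('HGCalRecHit'),
)
```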

The Service works great, but it produces a single, rather huge graph that takes quite some time to render with dot. Sometimes you are not interested in the big picture, but only in a specific part of the reconstruction chain.

For that purpose, I created a simple python script that will be able to read the output generated by the DependencyGraph Service and trim it according to your needs. The script is called dependencies_StandAlone.py and you can find it here. In order to run it you need to have the pythonds package installed, e.g., via:

pip install pythonds

The pythonds package contains definitions for Graph and graph-related methods (like breadth-first search, bfs) that are needed to manipulate the original, full graph. The command-line help is your friend for discovering the available options:

python dependencies_StandAlone.py --help
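To give an idea of what a bfs exploration from a root node does, here is a minimal sketch. It deliberately uses only the standard library instead of pythonds, and it is not the code of dependencies_StandAlone.py; the function name and the toy graph are made up for the illustration.

```python
from collections import deque


def bfs_order(adjacency, root):
    """Return the nodes reachable from root, in breadth-first order."""
    visited = [root]
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                visited.append(neighbour)
                queue.append(neighbour)
    return visited


# Hypothetical toy graph: each key points to the nodes it has edges to.
toy_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
    "E": ["A"],  # "E" is not reachable from "A", so it never shows up
}

print(bfs_order(toy_graph, "A"))  # ['A', 'B', 'C', 'D']
```

Only the nodes that can be reached from the chosen root survive, which is what makes the choice of the root label the main handle for trimming.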

The most important options are:

-f, to specify the input, full graph produced by the DependencyGraph Service that has to be trimmed
-l, to specify the python label of the module that will be used as the root node in exploring the original graph
--exclude_from_nodes, to specify, via the command line, the list of modules that have to be trimmed from the graph
--exclude_from_files, to specify the list of modules that have to be trimmed from the graph, using an ASCII file, specifying one module per line
-m, to limit the number of nodes in the final, trimmed graph to M
-o, to specify the format of the output image (e.g., pdf, png, svg, etc.). The available choices depend on your local installation of dot.
-O, to specify a label to be used while saving the output files.
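The exclusion file is plain ASCII with one module label per line. A minimal sketch of how such a file could be read is shown below; the helper name is hypothetical, the labels are placeholders, and skipping blank lines is an assumption of this sketch rather than documented behaviour of the script.

```python
import os
import tempfile


def read_exclusion_list(path):
    """Read module labels from a text file, one per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


# Demonstration with a throwaway file; the labels are placeholders.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("moduleA\n\nmoduleB\n")
try:
    labels = read_exclusion_list(tmp.name)
finally:
    os.remove(tmp.name)

print(labels)  # ['moduleA', 'moduleB']
```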

The graph is trimmed in such a way that the modules to be excluded are still included in the final graph, highlighted in red, while all their edges, both incoming and outgoing, are removed. This will, de facto, remove from the output graph the sub-graphs that have the excluded modules as root nodes. If nodes that belong to these excluded sub-graphs are also reachable via valid modules, though, they will still be part of the final graph.
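The trimming rule can be sketched in a few lines of standard-library Python. This is an illustration of the behaviour described above, not the script's actual implementation: the function name and the toy graph are made up, and the handling of the node limit (stop adding new nodes once the cap is reached) is an assumption.

```python
from collections import deque


def trim(adjacency, root, excluded, max_nodes=None):
    """Explore from root without following edges into or out of excluded nodes.

    Returns (kept_nodes, kept_edges). Excluded nodes are never traversed; a
    caller can still draw them as isolated, highlighted nodes.
    """
    excluded = set(excluded)
    kept_nodes = [root]
    kept_edges = []
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour in excluded:
                continue  # the edge into an excluded node is dropped
            if neighbour not in seen:
                if max_nodes is not None and len(kept_nodes) >= max_nodes:
                    continue  # assumed meaning of the node cap
                seen.add(neighbour)
                kept_nodes.append(neighbour)
                queue.append(neighbour)
            kept_edges.append((node, neighbour))
    return kept_nodes, kept_edges


# Hypothetical toy graph: "shared" is reachable both via "bad" and via "good".
toy = {
    "root": ["good", "bad"],
    "bad": ["onlyViaBad", "shared"],
    "good": ["shared"],
}

nodes, edges = trim(toy, "root", excluded=["bad"])
print(nodes)  # ['root', 'good', 'shared']
print(edges)  # [('root', 'good'), ('good', 'shared')]
```

In the toy graph, onlyViaBad disappears because its only access path goes through the excluded node, while shared survives because it is also reachable through good.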

Two graphs are produced as output: one that considers the is_consumed_by relation among modules, and the other that considers the consumes relation. The graphs are produced both in dot format and in the final format specified by the -o option.
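The two relations are each other's mirror image: if A is_consumed_by B, then B consumes A. A minimal sketch of going from one to the other by flipping every edge follows; the function name and the labels are hypothetical, and the script may well build the two graphs differently.

```python
def reverse_edges(adjacency):
    """Flip every edge: if A -> B in the input, then B -> A in the output."""
    reversed_adjacency = {node: [] for node in adjacency}
    for source, targets in adjacency.items():
        for target in targets:
            reversed_adjacency.setdefault(target, []).append(source)
    return reversed_adjacency


# Hypothetical toy relation, read as "key is_consumed_by each value".
is_consumed_by = {
    "recHits": ["clusters", "tracksters"],
    "clusters": ["tracksters"],
    "tracksters": [],
}

consumes = reverse_edges(is_consumed_by)
print(consumes)
# {'recHits': [], 'clusters': ['recHits'], 'tracksters': ['recHits', 'clusters']}
```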

Examples

You can find a complete dependency graph produced using the DependencyGraph Service run on a typical Phase2 scenario here. All the files needed to run the following examples are available here.

Examples of the HGCAL reconstruction starting from the HGCalRecHit module can be produced using the following commands:

python dependencies_StandAlone.py -f dependency.gv -l HGCalRecHit -m 200 -o png --exclude_from_file SimAssisted.txt -O SimAssisted
python dependencies_StandAlone.py -f dependency.gv -l HGCalRecHit -m 170 -o png --exclude_from_file TDR_Reco.txt -O TDR_Reco
python dependencies_StandAlone.py -f dependency.gv -l HGCalRecHit -m 100 -o png --exclude_from_file ticl.txt -O TICL

The output plots produced should be similar to these:

Sim-Assisted PDF Version

TDR Reco PDF Version

TICL PDF Version