6.1 Thirty Seconds Introduction to Generation and Simulation
Goals of this page
This page provides an entry point to the physics event generation and detector simulation in CMSSW.
It should serve as a jump-start for people who, as representatives of their Physics groups, need to prepare configuration applications for central production of Monte Carlo samples.
It is also aimed at those who need to produce their own Monte Carlo samples privately, i.e. outside of central production.
Introduction
Physics event generation and detector simulation are the earliest steps in the event processing chain that leads to Monte Carlo samples suitable for physics analysis.
A wide variety of Monte Carlo samples for various types of physics analysis are produced and distributed centrally. They can be found via the DBS/DLS discovery page. You may also consult the more detailed material on Locating Data Samples.
However, if you cannot find the event sample you would like to use (or one compatible with the CMSSW version that you are using), you may want to produce your own Monte Carlo samples ("private samples").
In this section we present several How-To's on the most often used features and utilities, show several examples, and provide links to more detailed documents.
In general, a user should assume the CMSSW_3_3_X release cycle, although many of the tips are valid starting from CMSSW_3_1_0 and throughout the whole CMSSW_3_X_X family. Where necessary, we provide specific tips, especially for the most recent releases.
Generation of high-energy-physics events, i.e. sets of outgoing particles produced in the interactions between two incoming particles, must be the first step in the Monte Carlo event processing chain.
In CMSSW we have interfaces to many physics event generators that are of interest to the collaboration. You may find detailed information in the sections of this WorkBook dedicated to Generation. In this writeup we present an example that uses the Pythia6 event generator, as it has been the most heavily used in production so far.
The following step(s) can be either:
- Simulation and Digitization, whereby the newly generated particles are run through a detailed, Geant4-based simulation of the CMS detector, and detector electronics response is modeled
- Fast Simulation, which uses a parametric approach to simulate and reconstruct events in the CMS detector; the aim of FastSim is to reduce the CPU time overhead while still benefiting from an accurate simulation of the detector effects, for use in physics analysis, development and tuning of reconstruction algorithms, design of detector upgrades, etc.
CMSSW offers a large collection of software tools, utilities, scripts, and pre-fabricated configuration fragments. Thus, in most cases a user only needs to know how to find the right components, compose them together, and execute the result; this is the focus of this section.
The tips we offer here have been tested on the LXPLUS cluster at CERN. We have also tested them on remote sites, and provide remarks and additional guidance as needed.
Running generation and simulation in CMSSW
Running the event generation and simulation in CMSSW means running a framework job with the usual syntax:
cmsRun <My configuration file>
Here cmsRun is the principal CMSSW executable, which is configured to do one or another type of job by the input <My configuration file>, where a user specifies the desired CMSSW software components and the order of their execution.
In order to "find" cmsRun and to be able to run it, you need to have it in your PATH, which is done by setting up your runtime environment. A quick sketch of doing so is given below:
scram p CMSSW CMSSW_X_Y_Z
cd CMSSW_X_Y_Z/src
cmsenv
(In this sketch, X_Y_Z is a generic placeholder that stands for a specific CMSSW release.)
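After setting up the environment, it can be useful to confirm that the CMSSW executables are actually visible before launching a job. The following is a minimal sketch using only the Python standard library; the helper name find_executable_on_path is our own illustration and is not part of CMSSW.

```python
import os


def find_executable_on_path(name):
    """Return the full path of `name` if it is found in PATH, else None.

    Hypothetical helper for illustration; it is not a CMSSW utility.
    """
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, name)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    for tool in ("cmsRun", "cmsDriver.py"):
        location = find_executable_on_path(tool)
        if location is None:
            print("%s: not found; was cmsenv run in CMSSW_X_Y_Z/src?" % tool)
        else:
            print("%s: %s" % (tool, location))
```

If either tool is reported as not found, the most likely cause is that the runtime environment has not been set up in the current shell.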
When it comes to the configuration input file, one needs to remember that a step in the event processing chain may create event data (edm::Event) that serve as input to the subsequent steps, so it is important to order the sequence of steps properly for execution. One also needs to know that some software components may require other "helper" software, so those services and modules must also be present in the configuration file.
However, most users will not need to worry about such details, as CMSSW provides utilities that ensure that all service software is included and that the sequence of steps in the event processing chain is correct.
We would like to offer an example configuration, which is similar to those used in central production and can be directly executed with cmsRun. It configures cmsRun to generate Higgs(200) decaying to ZZ, and further to 4 leptons in the final state. Then it processes the Higgs events through Geant4-based detector simulation, digitization, and several other steps. At the end of the job, it writes out an output file.
You can copy this configuration file into your work area and execute cmsRun with it as input, as schematically shown above.
Below we provide tips on how you can compose your own configuration application, for both Full and Fast Simulation.
Composing Full Simulation Configuration Application
Later in this document we offer a quick walk-through of the example configuration, and point out the details that are specific to event generation and simulation.
Here we would like to stress that this configuration application has been composed via the standard CMSSW utility cmsDriver.py. This utility will appear in your PATH once you set up your runtime CMSSW environment (just like cmsRun).
More detailed information on cmsDriver.py and the preparation of production-quality configuration applications can be found in the cmsDriver documentation.
Please note that cmsDriver.py has undergone substantial changes, starting from the very early CMSSW_3_1_X series and also at the beginning of the CMSSW_3_4_X cycle.
In this document we offer quick tips on cmsDriver.py usage as of CMSSW_3_4_X and earlier (in reverse chronological order).
When composing this example application with cmsDriver.py we used a pre-fabricated configuration fragment, already available within CMSSW.
You can re-create this configuration application by executing cmsDriver.py with the following input arguments:
cmsDriver.py Configuration/Generator/python/H200ZZ4L_cfi.py -s GEN:ProductionFilterSequence,SIM,DIGI,L1,DIGI2RAW,HLT --conditions FrontierConditions_GlobalTag,MC_31X_V5::All --datatier 'GEN-SIM-RAW' --eventcontent RAWSIM -n 10 --no_exec
Here we have to make two comments:
- GlobalTag MC_31X_V5 is used here as an example. It works well with many of the 3_X_X releases, but not all of them. To choose the GlobalTag that is right for your release, please consult the Software Guide on the FrontierConditions directly.
- If you are working with CMSSW release 3_4_0_pre1 or later, and using pre-fabricated generator-level configuration fragment(s) that correspond to that release, you need to use GEN:ProductionFilterSequence as the first parameter in the -s field of cmsDriver.py ONLY if your application involves one or several special-purpose GenFilter's; otherwise you should use -s GEN. More details about ProductionFilterSequence are given in WorkBookGeneration and its daughter materials.
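The rule for choosing the first step can be sketched as a small Python helper that assembles the argument list of the full simulation command. This is only an illustration: the function build_full_sim_command and its parameter has_gen_filters are hypothetical names introduced here, and the flags are exactly those of the command shown above, for release 3_4_0_pre1 or later.

```python
def build_full_sim_command(fragment, global_tag, n_events, has_gen_filters):
    """Assemble the cmsDriver.py argument list for the step-1 chain.

    Hypothetical helper for illustration; not part of CMSSW.
    """
    # In 3_4_0_pre1 or later, GEN:ProductionFilterSequence is needed only
    # when special-purpose GenFilters are involved; otherwise plain GEN.
    gen_step = "GEN:ProductionFilterSequence" if has_gen_filters else "GEN"
    steps = ",".join([gen_step, "SIM", "DIGI", "L1", "DIGI2RAW", "HLT"])
    return [
        "cmsDriver.py", fragment,
        "-s", steps,
        "--conditions", "FrontierConditions_GlobalTag,%s::All" % global_tag,
        "--datatier", "GEN-SIM-RAW",
        "--eventcontent", "RAWSIM",
        "-n", str(n_events),
        "--no_exec",
    ]


command = build_full_sim_command(
    "Configuration/Generator/python/H200ZZ4L_cfi.py", "MC_31X_V5", 10, True)
print(" ".join(command))
```

Keeping the command as a list of arguments, rather than one string, avoids shell quoting issues if it is later passed to a process launcher inside a CMSSW environment.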
Now let us briefly review the input arguments to the cmsDriver.py utility.
The first, and mandatory, input argument to cmsDriver.py is the configuration fragment that determines which physics event generator you wish to use and what topology you intend to generate. In this example we use the pre-fabricated H200ZZ4L_cfi.py configuration fragment, which resides in the python subdirectory of the Configuration/Generator package of CMSSW.
For the majority of applications for producing Monte Carlo samples, the only difference will be this generator-level configuration fragment, while the other conditions and steps will be standard. This is a great benefit of using cmsDriver.py for composing applications for Monte Carlo production, as it ensures that the most current setups and conditions are employed.
We would like to point out that in the Configuration/Generator/python area one can find a large number of pre-fabricated generator-level configuration fragments for various event topologies. One can use any of these fragments to compose an event generation and simulation application. However, be advised that these fragments employ either the Pythia6 event generator or simple "particle gun" generators. If you wish to learn more about those generators, or about other important generators interfaced to CMSSW, please proceed to the Generation section of the WorkBook.
Now we briefly go through the remaining input arguments to the cmsDriver.py utility.
The -s field contains the sequence of event processing steps. In the above example the chain starts with GEN(eration), including where applicable the filters needed to select events of specific interest to the user, followed by SIM(ulation), DIGI(tization), L1 trigger emulation, conversion of the simulated digis to the raw data format (DIGI2RAW), and H(igh) L(evel) T(rigger) simulation. The last 3 steps are not part of the event generation and simulation domain, but this is how the "step-1" event processing chain is composed in production of the Monte Carlo samples. If you wish to terminate your chain at the level of simulated digis, you may use "GEN:ProductionFilterSequence,SIM,DIGI". However, if you intend to run "private production" we suggest that you compose the chain up to the HLT step, as it will facilitate your "step-2" (reconstruction and further).
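To make the structure of the -s value concrete, here is a minimal Python sketch that splits it into steps and cuts the chain at a chosen step. The functions parse_steps and truncate_after are hypothetical helpers written for this illustration; reading the part after a colon as a named sequence attached to the step is our interpretation of the GEN:ProductionFilterSequence form.

```python
def parse_steps(s_field):
    """Split an -s value into (step, sequence) pairs; sequence may be None."""
    parsed = []
    for item in s_field.split(","):
        step, _, sequence = item.partition(":")
        parsed.append((step, sequence or None))
    return parsed


def truncate_after(s_field, last_step):
    """Return the -s value cut so that `last_step` is the final step."""
    items = s_field.split(",")
    # Compare on the step name only, ignoring any ":sequence" suffix.
    names = [item.split(":")[0] for item in items]
    return ",".join(items[: names.index(last_step) + 1])


full_chain = "GEN:ProductionFilterSequence,SIM,DIGI,L1,DIGI2RAW,HLT"
print(parse_steps(full_chain))
print(truncate_after(full_chain, "DIGI"))
```

Matching on the exact step name matters here, because DIGI is also a prefix of DIGI2RAW.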
Details of the --conditions field can be found in the Software Guide on the FrontierConditions.
To select the content of the --datatier field, please see the documentation on the allowed Data Tiers.
The content of the --eventcontent field is described in great detail in the SWGuideDataFormatTable.
In the -n field you specify how many events you want to generate, simulate, etc.
The --no_exec argument tells cmsDriver.py to write out the configuration file without running it. If you do not specify this argument, cmsDriver.py will proceed to executing cmsRun.
Composing Fast Simulation Configuration Application
From the generator point of view, Fast Simulation is identical to Full Simulation: both single-particle guns and the multi-purpose event generators are used as described above.
However, when using the cmsDriver.py command for Fast Simulation there are some differences compared to the Full Simulation case described above:
- First, a Fast Simulation job runs both Simulation and Reconstruction in one go.
- Second, the syntax of the cmsDriver.py command is somewhat different.
If we consider a particular generator fragment available in Configuration/Generator, TTbar_Tauola_cfi.py, then the cmsDriver.py command for making a production Fast Simulation configuration is as follows:
cmsDriver.py TTbar_Tauola_cfi.py -s GEN:ProductionFilterSequence,FASTSIM --pileup=NoPileUp --conditions=FrontierConditions_GlobalTag,STARTUP31X_V4::All --eventcontent=AODSIM --beamspot=Early10TeVCollision --datatier GEN-SIM-DIGI-RECO -n 10
As already mentioned, in releases CMSSW_3_4_0_pre1 or later one needs to use GEN:ProductionFilterSequence as the first parameter in the -s field of cmsDriver.py only if one or several special-purpose GenFilter's are used; otherwise one should use -s GEN.
Walk-through Example Full Simulation Configuration File
If you wish to learn details about the CMSSW configuration language in general, please visit the corresponding section in this WorkBook.
In this document we provide a partial walk-through of the example configuration, concentrating on details that are specific to event generation and detector simulation.
First of all, please notice this line:
process.source = cms.Source("EmptySource")
You will need to use EmptySource if you wish to generate events "from scratch", using one of the multi-purpose event generators. There are several types of "Sources" in CMSSW that are useful for other types of event generation, or for processing pre-generated events through further steps. You may learn more in the WorkBookGeneration and its daughter materials.
The next block in the configuration file that will be of interest to you is the one that starts with
process.generator = cms.EDFilter("Pythia6GeneratorFilter",
followed by a long string of configuration cards.
This is the CMSSW software module that interfaces to the Pythia6 multi-purpose event generator. If you wish to learn how to configure Pythia6GeneratorFilter to generate physics events of your choice, please refer directly to the Pythia6Interface document. If you wish to learn more about other generators, please go to the WorkBookGeneration.
Please note that in the event processing chain this module is labeled "generator". This single label applies to the event generation step no matter which event generator you choose to employ. It is by this particular label that subsequent steps in the event processing chain find the generator-level particles for further processing.
You may also notice this line:
process.ProductionFilterSequence = cms.Sequence(process.generator)
Additional technicalities will be covered elsewhere.
Please note that starting with CMSSW_3_4_0_pre1 ProductionFilterSequence is used only if post-generation event filtering with one or more GenFilter's is employed.
Towards the beginning of the configuration file you will see other lines of interest:
process.load('Configuration/StandardSequences/Sim_cff')
process.load('Configuration/StandardSequences/Digi_cff')
The first one is a "master" configuration fragment for Geant4-based detector simulation, and the second one brings together several software components that model the electronics response in all parts of the CMS detector.
The step that simulates the passage of particles through the CMS detector is labeled "psim". The sequence of steps that performs digitization is collectively labeled "pdigi".
Towards the end of the configuration file you will notice how these labels are used to include these software components in the processing chain:
process.simulation_step = cms.Path(process.psim)
process.digitisation_step = cms.Path(process.pdigi)
and ordered for execution:
process.schedule = cms.Schedule(...,process.simulation_step,process.digitisation_step,...)
Please be advised that other software components needed to properly run the CMS detector simulation and digitization are also included in the example configuration file. We cover more details in WorkBookSimDigi and its daughter materials.
Please note that these are pre-fabricated "building blocks" that are available as part of the Configuration/StandardSequences package of CMSSW. In turn, each configuration fragment uses other standard, pre-fabricated configuration includes and fragments, in order to configure the CMSSW simulation and digitization software with the CMS-approved settings. If you wish to learn more about the underlying software components, or how to reconfigure one or another module for your specific needs, please visit WorkBookSimDigi and its daughter materials.
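As a quick self-check after generating a configuration file, one could scan its text for the pieces discussed in this walk-through. The sketch below is plain Python and does not need CMSSW; the list EXPECTED_SNIPPETS and the function missing_snippets are hypothetical names introduced for illustration, and the snippets are copied from the lines quoted above. A simple substring search like this is only a rough sanity check, not a validation of the configuration.

```python
# Snippets quoted in the walk-through above; the list is illustrative only.
EXPECTED_SNIPPETS = [
    'cms.Source("EmptySource")',
    'cms.EDFilter("Pythia6GeneratorFilter"',
    "Configuration/StandardSequences/Sim_cff",
    "Configuration/StandardSequences/Digi_cff",
    "cms.Path(process.psim)",
    "cms.Path(process.pdigi)",
    "cms.Schedule(",
]


def missing_snippets(config_text):
    """Return the expected snippets that do not occur in config_text."""
    return [s for s in EXPECTED_SNIPPETS if s not in config_text]


# Tiny stand-in for a real configuration file, for demonstration only.
sample = """
process.load('Configuration/StandardSequences/Sim_cff')
process.source = cms.Source("EmptySource")
process.simulation_step = cms.Path(process.psim)
"""
for snippet in missing_snippets(sample):
    print("not found: %s" % snippet)
```

In practice one would read the text of the file written by cmsDriver.py instead of the stand-in string used here.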
Walk-through Example Fast Simulation Configuration File
Generator-wise, the content of the Fast Simulation configuration file is very similar to what is described in the previous section for the Full Simulation configuration.
It is worth mentioning that the choice of the Global Conditions tag, such as STARTUP31X_V1::All (the cmsDriver Fast Simulation example above uses the similar STARTUP31X_V4::All), automatically defines the corresponding HLT table (low-luminosity "8E29" in this case) used in the configuration file, so that a corresponding HLT line appears in the configuration:
process.load('FastSimulation/Configuration/HLT_8E29_cff')
among other Fast Simulation-specific configuration commands ('FastSimulation/Configuration...').
Another Global Conditions option is MC_31X_V1::All, which is paired with the higher-luminosity HLT table "1E31".
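The pairing between Global Conditions tags and HLT tables can be written down as a small lookup, sketched here in plain Python. The dictionary and the function hlt_fragment_for are hypothetical and purely illustrative; only the two pairings named in the text are included, and the path for the 1E31 table is an assumption that it follows the same naming pattern as the 8E29 one.

```python
# Pairings taken from the text above; the dictionary itself is illustrative.
HLT_TABLE_FOR_GLOBAL_TAG = {
    "STARTUP31X_V1::All": "8E29",  # low-luminosity table
    "MC_31X_V1::All": "1E31",      # higher-luminosity table
}


def hlt_fragment_for(global_tag):
    """Return the FastSimulation HLT fragment path expected for a tag."""
    table = HLT_TABLE_FOR_GLOBAL_TAG[global_tag]
    # Only the 8E29 path appears in the text; the 1E31 path is assumed
    # to follow the same naming pattern.
    return "FastSimulation/Configuration/HLT_%s_cff" % table


print(hlt_fragment_for("STARTUP31X_V1::All"))
```

A tag that is not in the table raises a KeyError, which is deliberate: the sketch does not try to guess the HLT table for tags the text does not mention.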
There are also several explicit replacements of the default parameters, which set the number of in-time pileup events, the misalignment options, switch particular subdetector simulations on or off, etc.
process.famosPileup.PileUpSimulator.averageNumber = 0
process.famosSimHits.SimulateCalorimetry = True
...
At the end of the configuration file, the order of module execution is defined with the job schedule commands (process.schedule...), so that the generation and simulation parts are executed first, followed by the trigger part, and finally the reconstruction and output parts.
Review status
Responsible: JuliaYarba
Last reviewed by: DanielElvira - 15 Nov 2006