Answers to
referee comments concerning
the CMS analysis note AN 2006/003
The updated note (vers.2), taking into
account referee
comments:
trilap.ps
trilap.tex
Referee Comments for AN 2006/003
Title:
"Trilepton final state from neutralino-chargino
production in mSUGRA"
Authors:
W. de Boer, M. Niegel, C. Sander, V. Zhukov (Karlsruhe),
K. Mazumdar (TIFR)
Referees: R.
Cavanaugh, A. Tricomi
1. The note
clearly represents a
significant undertaking and the
referees would
like to recognize
and commend the authors on the
significant
amount of work that
this note represents.
2. The current
analysis is clearly
a model dependent one, as evidenced
by the fact
that the di-lepton
mass is required to be less than 75
GeV.
This model
dependency (which restricts the analysis to low mass
SUSY) should
be stated explicitly
in the note (and preferably also in
the
title).
Included in the Introduction:
"In this paper a study of the
CMS discovery potential for the mSUGRA
pure trilepton final state at low
value of $m_{1/2}$..."
3. It
is understood that
the low mass region is phenomenologically
interesting to
dark matter
searches and that the trilepton
cross-section
rapidly drops with
the di-lepton mass. Nevertheless,
because the
trilepton final states
are also interesting outside of the
dark matter
motivation, the
referees would like the authors to also
investigate if
it is possible to
include di-lepton masses above the Z0
peak (assuming
Lint = 30
fb-1). If this is not possible, then it
would be nice
to know how much
integrated luminosity is required to
include higher
mass SUSY.
We consider the whole mSUGRA parameter plane. Since the
neutralino-chargino
production cross section scales with the neutralino mass m
as
1/m^4,
it drops fast with m1/2 (m1/2=2.5*m) and depends very weakly on m0.
The m1/2 is related to the gaugino and gluino masses and the m0 to the
scalars. Therefore a measurable signal is expected over a wide
range of scalar masses (i.e. it would be misleading to say that we
consider the
low mass SUSY region, which in fact is the bulk region at low m1/2 and
m0)
and at relatively low m1/2 (which is naturally constrained by the
decrease of the cross section). The Z peak from the SM backgrounds
disturbs the event selection in the m1/2 range 230-300 GeV. At
m1/2>300 GeV
the cross section is below 1 fb, i.e. Lint>1000 fb^-1 would be
needed to see
the signal.
Included in the text:
" For larger m1/2 the trilepton production cross section is below 1 fb-1
and would require Lint>1000 fb-1"
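The 1/m^4 argument above can be illustrated with a short sketch. It assumes
pure 1/m^4 scaling and the relation m1/2 = 2.5*m quoted in the answer, and it
ignores every other dependence of the cross section; the function names are
illustrative only and are not taken from the note.

```python
def neutralino_mass(m12, ratio=2.5):
    """Neutralino mass from m1/2, using the relation m1/2 = 2.5*m from the answer."""
    return m12 / ratio


def relative_cross_section(m12, m12_ref):
    """sigma(m12) / sigma(m12_ref), assuming only the 1/m^4 scaling."""
    return (neutralino_mass(m12_ref) / neutralino_mass(m12)) ** 4


# Doubling m1/2 (150 -> 300 GeV) reduces the cross section by a factor of 16.
print(relative_cross_section(300.0, 150.0))  # 0.0625
```

Under this assumption the ratio depends only on the ratio of the two m1/2
values, which is why the reach is limited in m1/2 but not in m0.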
4. In section
2.2, the note states
that "the QCD and W+Jets
backgrounds
have been considered
at generator level [only] and were
found to have
a negligible
contribution to the trilepton final state.
After
reconstruction, the fake
leptons can appear which can increase
number of
background
channels. The detailed study of the fakes is out
of the scope
of the current paper,
here only estimations are
presented."
The referees
understand that by
requiring three isolated leptons, the
QCD and W+jets
backgrounds are
significantly reduced. However because
of the
enormous QCD and high
W+Jets cross-sections as well as the soft
lepton
requirements (this work
considers muons above 5 GeV and
electrons
above 10 GeV) used in
this analysis, the referees consider
this to be a
very serious issue
which must be addressed: fake leptons
will arise due
to many effects
such as, jets faking electrons,
pi-zeros
faking electrons, charged
pions and kaons decaying in the
detector
volume producing (real!)
muons, high pT jets faking muons via
punch through,
etc, etc. The
lack of three leptons at generator level
does not imply
that one will not
reconstruct three (low pT!) isolated
leptons at
reconstruction
level. The referees request that the
authors either
(1) include QCD and
W+jets as reconstructed backgrounds
in the
analysis, or (2) establish
a reliable estimate for the
systematic
uncertainty for
ignoring QCD and W+jets in this work.
Covering the W+jets and QCD cross sections would require the simulation of >10^8
events,
which is, unfortunately, not feasible.
However, we simulated some samples and included them in the analysis.
An upper limit is set on the possible contribution from
these channels.
Very large data samples are also required for a systematic study
of the reconstruction fakes. The fake rates are relevant to
many CMS analyses
and would require a dedicated study, which we have just started.
The analysis of fakes is included in the updated note as requested.
This analysis is based on limited statistics and should be considered
an estimate. The fake rates are evaluated individually for each background
channel,
and
estimates of the number of fake events are presented.
The obtained fake rates, ~10^-5 for muons and ~10^-4 for electrons, are
compatible with the CMSIN 2005/028 results obtained for W+jets.
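One common way to set such an upper limit from a limited sample is sketched
below. This is an assumption for illustration, not necessarily the procedure
used in the note: if no simulated event passes the selection, the Poisson
upper limit at a given confidence level is scaled by the luminosity weight of
the sample. All input numbers are hypothetical placeholders.

```python
import math


def poisson_upper_limit_zero(cl=0.90):
    """Upper limit on a Poisson mean when zero events are observed."""
    return -math.log(1.0 - cl)


def yield_upper_limit(sigma_fb, lumi_fb_inv, n_generated, cl=0.90):
    """Upper limit on the expected yield when no simulated event passes.

    The weight sigma*L/N converts simulated events into expected events.
    """
    weight = sigma_fb * lumi_fb_inv / n_generated
    return poisson_upper_limit_zero(cl) * weight


# Hypothetical sample: 1000 fb cross section, 30 fb^-1, 30000 simulated events.
print(yield_upper_limit(1000.0, 30.0, 30000))
```

The limit improves only linearly with the number of simulated events, which
is why very large samples would be needed for a tight bound.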
5. In section
2.2, the note states
that "For all backgrounds the Z and
W bosons were
forced to decay
leptonically..."
The referees
have the same comment
as above: a lack of leptons
produced at
generator level, does
not mean that fake leptons will not
be
reconstructed at reconstruction
level (particularly for low pT
leptons).
This effect is
mitigated partly by requiring three isolated
leptons in the
however, by
ignoring the hadronic decays of the W and Z
bosons, the
current analysis is
ignoring possible significant
background
contributions arising
from fake leptons. The referees
request that
the authors either
(1) include hadronic decays of the W
and Z bosons
in the reconstructed
part of the analysis, or (2)
establish a
reliable estimate for
the systematic uncertainty for
ignoring such
contributions.
This depends on the channel; for example, it can be relevant for ZZ with
one Z->2l and the second Z->2j, etc.
For DY or Z+jets the two OSSF leptons are important for passing the selection, and
the effect of more than one fake in one event is quite small.
However, we now consider the full samples as suggested, and
the contribution of the fakes is evaluated in the note.
6. In section
2.2, the note states
that "...for the Z+jets and DY,
trileptons
were preselected at
generator level to have Pt > 5 GeV/c
and |eta| <
2.4. This
preselection can lead to underestimation of the
fake [rate]."
Because
reconstructed muons are
considered above 5 GeV and
reconstructed
electrons above 10
GeV (OSSF muons are actually required
to be above 10
GeV and OSSF
electrons above 17 GeV), this generator
preselection
cut is the same as
the reconstructed cut and therefore
presents a
very serious
difficulty, not to the fake rate as suggested
in the note,
but rather in
properly estimating the background arising
from
mis-measurement of low pT
muons which were generated below 5 GeV,
but
reconstructed above 5
GeV. The referees request that the authors
either (1)
remove this
pre-selection requirement from the analysis, or
(2) reliably
estimate the
systematic uncertainty to the background
estimation due
to this
preselection cut.
The cross section for the considered backgrounds is large, and
without the preselection they would be almost impossible to simulate.
The generator-level preselection cut is lower than the cut used at
reconstruction
by ~5 GeV: in the preselection PT>5(10) GeV/c for muons(electrons),
and
in the reconstruction selection PT>10 GeV/c for all leptons (and
>17 GeV for
the two OSSF electrons, in order to pass the trigger).
Given that the Pt resolution of muons(electrons)
at these energies is below 2(5)%, we assume the contribution from
mismeasured leptons is negligible.
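The size of this effect can be estimated with a minimal sketch, assuming
purely Gaussian smearing with the quoted relative resolutions. Real detector
resolution has non-Gaussian tails, so the numbers are only indicative; the
function name is illustrative.

```python
import math


def migration_probability(pt_gen, pt_cut, rel_resolution):
    """Probability that a lepton generated at pt_gen is reconstructed above
    pt_cut, assuming Gaussian smearing with sigma = rel_resolution * pt_gen."""
    sigma = rel_resolution * pt_gen
    n_sigma = (pt_cut - pt_gen) / sigma
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


# Muon at the 5 GeV preselection edge, 10 GeV reconstruction cut, 2% resolution:
# the cut sits 50 sigma away from the generated value.
print(migration_probability(5.0, 10.0, 0.02))

# OSSF electron at the 10 GeV preselection edge, 17 GeV cut, 5% resolution:
# the cut sits 14 sigma away.
print(migration_probability(10.0, 17.0, 0.05))
```

Under the Gaussian assumption both probabilities are vanishingly small, so
any residual migration would have to come from non-Gaussian tails.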
7. In section
2.2, Zbb is not
considered as a possible background.
The referees
feel that this is
potentially a significant background
and that the
authors should either
(1) include Zbb in this analysis,
or (2) justify
why Zbb is not a
source of tri-lepton background.
Zbb is now included in the analysis and contributes <70 events
out of 1800 background events.
8. In section
2.2, SUSY (LM9) is
listed as a background source. The
referees agree
that SUSY itself
does indeed represent a source of
background for
this
analysis. Nevertheless, the referees request that
the authors
kindly describe the
details of how SUSY is used as a
background.
For example, how
is the tri-lepton signal separated from
non-tri-lepton
SUSY events, etc.
The inclusive LM9 events, excluding the direct neutralino-chargino
production (PYTHIA process
230), were used as the SUSY background.
This is clarified in the updated text.
9. In section
2.2, the CMS dataset
names are not given for the DSTs
used.
The referees request
that the authors list (in an appendix) the
dataset names
for the DSTs used in
this analysis.
The dataset names are included in the updated text.
10. In section
4, the note states
that "the correct pairing was
analyzed with
MC tagged leptons
and is almost the same for high or low
pT
combinations."
The referees
would like to request
that the authors kindly include
more details
on how the
reconstructed leptons are tagged using MC
information.
The pairing was studied at generator level
using RawHepEvent, where the
origin of the leptons is known.
This is clarified in the updated
text.
11. In section
3.1, the note
states that "The triggers (L1+HLT)
efficiency in
the m0, m1/2 plane
is presented in Figure 4. The scan
was produced
with FAMOS and the
efficiencies were tuned to the
efficiency at
LM9 from the full
simulation."
The referees
are confused by this
statement and would like the authors
to clarify
what is meant by
"tuning" the trigger efficiencies to LM9
and how those
"tuned" efficiencies
are used in the FAMOS scan of the
m0, m1/2 plane.
Since we used FAMOS for the scan, where the trigger is not yet
well
implemented, the trigger selection cuts were
applied to the offline reconstructed objects (muons and electrons).
Therefore the trigger efficiencies differ from (are higher than) those of the
full
reconstruction. The FAMOS efficiency was normalized
to the ORCA trigger
efficiency at the LM9 point, where the DST was simulated.
Of course this can introduce some small uncertainties
due to
the dependence of the reconstruction efficiency on Pt. However,
the scan plot is presented for illustration purposes and shows
the
correct tendency.
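The normalization described above amounts to one scale factor fixed at the
reference point and applied across the scan. A minimal sketch follows; the
point labels and efficiency values are hypothetical placeholders, not numbers
from the note.

```python
def normalise_to_reference(fast_scan, fast_at_ref, full_at_ref):
    """Rescale fast-simulation efficiencies so that they agree with the
    full-simulation efficiency at the reference point."""
    scale = full_at_ref / fast_at_ref
    # Efficiencies are capped at 1 after rescaling.
    return {point: min(1.0, eff * scale) for point, eff in fast_scan.items()}


# Hypothetical efficiencies: the fast simulation gives 0.90 at the reference
# point, while the full simulation gives 0.81 there.
scan = normalise_to_reference({"ref": 0.90, "other": 0.80}, 0.90, 0.81)
print(scan)
```

A single scale factor cannot follow a Pt-dependent difference between the two
simulations, which is the source of the small uncertainty mentioned above.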
12. In section
3.2, the note
states that muons can be "contaminated"
from jets not
vetoed by the jet
veto and that this contamination is
estimated by
matching the
reconstructed muon in FAMOS with a generated
muon from
PYTHIA. This
appears to have only been done for ttbar,
Z+jets, DY
samples. The
contamination is estimated to be 3 x 10^-6.
The referees
are concerned that
this estimate may not be accurate.
While FAMOS
does decay pions and
kaons, it does not simulate
punch-through
and may not simulate
other possible effects contributing
to fake
muons. More
importantly, however, the referees are more
concerned that
other important
backgrounds are not included in
estimation of
the muon fake
rate. Hence the referees request that
(1) the
authors justify the use of
FAMOS for estimating the muon
fake rate and
(2) that the authors
estimate the muon fake rates for all
considered
backgrounds.
The most important backgrounds are simulated in FAMOS, and therefore
FAMOS has been used for the evaluation of the fake rates.
FAMOS and ORCA may differ somewhat, but such a study
is certainly beyond the scope of the current analysis since it would require
large data samples. However, the numbers we obtained are close to those from the
full simulation (CMSIN2005/028), although a direct
comparison
is difficult since we used somewhat tighter cuts on the leptons.
In addition, our fake rate estimates
do not correct for real leptons reconstructed outside the
matching cone, which are counted as fakes, i.e. we overestimate the fakes.
The fakes section is re-evaluated in the note (see section 4),
and the fake rates are estimated for all important channels.
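The matching-cone criterion mentioned above can be sketched as follows. A
reconstructed lepton counts as fake when no generated lepton lies within a
cone in (eta, phi). The cone size of 0.1 and the function names are
assumptions made for illustration, not values taken from the note.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane, with phi wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def is_fake(reco, generated, cone=0.1):
    """True if no generated lepton lies within the matching cone."""
    return all(delta_r(reco[0], reco[1], g[0], g[1]) > cone for g in generated)


def fake_fraction(reco_leptons, generated, cone=0.1):
    """Fraction of reconstructed leptons without a generator-level match."""
    n_fake = sum(is_fake(r, generated, cone) for r in reco_leptons)
    return n_fake / len(reco_leptons)


# Two reconstructed leptons (eta, phi); only the first has a generated partner.
reco = [(0.50, 1.00), (2.00, -1.00)]
gen = [(0.52, 1.01)]
print(fake_fraction(reco, gen))  # 0.5
```

A real lepton reconstructed just outside the cone is counted as fake by this
criterion, so the resulting rate is an overestimate, as noted above.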
13. In section
3.2, the note
indicates that the contamination of
reconstructed
electrons comes
mainly from pi-zeros or photon conversions.
The
contamination is estimated
from ttbar, Wt, and Z+jets to be
7 x 10^-5.
The referees
are concerned that
not all backgrounds have been used to
estimate the
electron fake
rate. Further the note does not consider
the electron
fake rate
specifically from jets. The referees request
that the
authors (1) include all
backgrounds to estimate the electron
fake rate, not
only ttbar, Wt, and
Z+jets, including, in particular,
QCD, W+jets,
and Zbb, etc and (2)
include the possibility of jets
faking an
electron when estimating
the contamination rate.
The fake rates are re-evaluated in the note (see section 4)
for all channels.
14. In section
3.4, the note
states that MET "is almost zero from DY,
Z+jets, ZZ
backgrounds."
The referees
would kindly like to
suggest that the authors change the
wording
slightly to account for
the fact that the Z0 can decay to
neutrinos,
which would create
significant MET.
Corrected in the text:
" is almost zero for the leptonic (e,mu) decays.. "
15. All plots
in the note are
normalised to equal areas. The referees
would like to
suggest that all
distributions be normalized to
luminosity
weighted
cross-sections, where possible and appropriate.
This provides
the reader with a
better feel for the importance of one
distribution
compared with another
distribution within the same plot.
Three plots are normalized: Fig.6 (Pt distributions of the leptons),
Fig.7 (MET and sumET distributions) and Fig.8 (Et of jets and Njets).
Comparable quantities, which can be shown in one plot on a common
scale,
are obtained only after the selection (see Table 6),
i.e. the distributions would have to be normalized to the number of events
after selection, which is discussed in the next
section and is not known a priori. This prevents the use of such
scaling. On the other hand, the efficiency of each selection cut is
presented
in the Table, and the contribution of each background can be easily
understood.
16. In section
4, the note
describes "In ~27% of the signal events
another OSSF
pair can be
constructed, where one of this pair would be
a fake, since
one lepton
originates from the chargino decay."
The referees
are confused by the
term "fake" here. Presumably the
lepton
originating from the
chargino decay is a "real" lepton (that
is, it was
generated and
reconstructed). Do the authors really mean
that using
such a lepton in the
invariant mass calculation gives the
wrong
combination? The
referees would like the authors to define what
is meant by
the term "fake" in
this section. If it corresponds to a
"real" lepton,
but simply the
wrong combination in the invariant mass
calculation,
the referees kindly
ask the authors to use a different
term so as not
to confuse the
reader with "fake reconstructed leptons"
which have no
correspondence with
a generated lepton.
The word 'fake' here is misleading; 'wrong combination'
should be used instead. The text is corrected and this part is
clarified.
17. In section
4, the note
indicates that the "fake" invariant mass
(above) stays
in signal region.
The referees
are concerned by this
fact, which is manifest in Figures
9, 10, and 12,
in which the shape
of the signal looks very similar to
the shape of
the background.
As a result, a Gaussian fit to the
line-shape
(performed later in the
analysis) is not appropriate
without taking
proper account of
the background (either via background
subtraction or
via a simultaneous
fit to the background).
Indeed, with such a small significance the fit does not make sense.
The Gaussian fit is removed from the analysis.
18. In section
4, the note states
that a low mass peak (arising from
DFOS
combinations) from Z+jets and
DY backgrounds is suppressed by
generator
preselection cuts but
can also be suppressed with a cut on
the invariant
di-lepton mass >
15 GeV.
Clearly, one
is not allowed to
suppress backgrounds via generator
level
cuts. The referees are
thus confused and request that the
authors kindly
clarify what is
meant by the above, in case the
referees have
misunderstood the
intent and what was actually performed
in the
analysis. If indeed a
generator level cut was used to remove
the low mass
peak in the Z+jets
and DY backgrounds, then the referees
request that
that cut be removed
from the analysis. If a cut on the
invariant
di-lepton mass is
performed, is that cut performed at
generator
level or reconstruction
level? If the cut is performed at
generator
level, then the referees
request that the authors perform
the cut at
reconstruction
level. If the cut is done at reconstruction
level, then
the referees kindly
ask the authors to clarify this in the
text.
This statement only suggests the possibility of constraining
the invariant mass from below, in addition to the upper limit (75 GeV).
The text is corrected.
19. In section
4, the note
describes a Gaussian fit to the di-lepton
invariant mass.
The referees
request that the
authors clarify the purpose of the
Gaussian
fit. If the
background is not subtracted (or fitted
simultaneously)
before the fit is
performed, then mean of Gaussian
will depend on
the background
shape, as the authors state in the note.
It is
unclear what value the
Gaussian fit adds to the results.
Indeed, the end point is hardly visible and there is no reason to keep
the fit. The fit is removed from the analysis.
20. In section
4, the note states
that "A fake event is defined as an
event where at
least one lepton is
used in the selection does not come
from the main
interaction.
In almost 90% of cases, this is found to
be a fake
electron."
The referees
request that the
authors define what is meant by "main
interaction."
The referees
are concerned that the note confuses real
reconstructed
electrons
(corresponding to generated electrons, but
which may not
originate from the
"main interaction") with fake
reconstructed
electrons (which do
not correspond to a generated
electron).
If an electron is
generated and reconstructed, it is
normally
referred to as a "real"
electron, even if it does not
originate from
the "main
interaction." The referees kindly request
the authors to
also precisely
define what is meant by "fake
electron" in
the above passage.
We agree with this terminology: a real particle is one that was
simulated at MC level, independent of its source.
Note, however, that in the analysis the particles produced in
pileup events were not tagged as MC (the RawHepEvent has been used
for the MC event).
This means the fake rate can be slightly overestimated
because of leptons from pileup events; in reality the situation
can be better.
This part is corrected in the text as suggested.
21. In section
5, the note
describes the training of the Neural
Network.
The training
samples are said to "have been preselected with
somewhat
looser selection cuts."
The referees
are concerned by this
statement. The NN training sample
should be
configured identically
to the sample which is used to make
the
significance estimation.
The referees request the authors to
clarify why
the training and data
samples have different
configurations
and what systematic
effect this has on the
significance
estimation.
The looser cuts were used for the training in order to increase the statistics
of the training samples. When applied, the network was used
as an additional discriminant which combines different variables.
The contribution of the different parts of the training sample
to this value is not straightforward. In our case it was
found that, for
the part of the statistics related to the looser cuts,
the NN selection is less efficient than simple cuts.
This does not introduce any systematic effect but simply compensates
the inefficiency of the NN in the low Pt region.
This is clarified in the updated text.
22. The
referees appreciate the
motivation of applying the NN (which
accounts for
correlations between
the discriminating variables)
followed by
the Genetic Algorithms
(which can efficiently maximize a
function in a
multi-dimensional
space). However the referees note
that, by far,
the most powerful
selection cuts of the analysis comes
from neither
the NN nor the GA,
but rather from the very simple cuts
which are
applied before the
NN+GA. Indeed, the NN+GA gains are only
very modest
when compared with the
extraordinarily added complication.
Applying
such complexity
reduces one's ability to understand the
systematic
behaviour of this
analysis (which the referees note as
being very
systematically
challenging in its current form).
The NN selection does bring an improvement. The fact that the improvement is
not dramatic only confirms that the cut-based selection
is close to optimal
(which was not obvious in advance).
It is also clear that the NN (as well as GA or likelihood methods) has a stronger
dependence
on the simulation model and the reconstruction algorithms, which in turn
increases the influence of systematic uncertainties. This
is true even for
simple cuts if they are tuned to maximize the significance of a particular
signal.
On the other hand, each method has its own advantages. Ideally one would use
the
most model-independent method to see a signature and then improve (and
cross-check)
the results with a more complex analysis, which is what we have tried to do.
It is possible to make the NN more important simply by loosening the cuts
and
optimizing the NN better; that would probably give an even better
significance,
but, as you pointed out, the results would be more model dependent.
We agree that the systematics related to fake leptons are the
most difficult part of the analysis. The problem is related to the
large data samples needed to optimize the fake suppression.
On the other hand, such further optimization might be very
model (GEANT) dependent and far from the real detector performance,
which will be studied in the first year of running.
Therefore the presented analysis of the CMS potential for trileptons
can only be a guideline for the real data analysis (although this
does not exclude further improvement of the algorithms).
23. In section
6, there is no
systematic uncertainty quoted for lepton
fake rates.
Due to the
aggressive
generator-level pre-selection cuts imposed
in this
analysis and due to the
low pT nature of the accepted
reconstructed
leptons, the
referees request that the authors provide
a reasonable
estimate for the
systematic uncertainty on the
significance
to observe the signal
due to fake leptons.
The uncertainties of the fake rates are included in the significance
calculation as suggested.
These uncertainties can be treated as a statistical
uncertainty
(sigma) on the background estimate and included in Sc12 via
the factor
S_c12t=S_c12t*sqrt(Nb+dNb)/sqrt(sigma**2+Nb+dNb).
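The correction factor in the expression above can be evaluated as in the
sketch below. It takes the significance as an input rather than assuming a
definition of Sc12, and it uses Nb, dNb and sigma exactly as they appear in
the formula; the function names and example numbers are illustrative only.

```python
import math


def fake_uncertainty_factor(nb, dnb, sigma):
    """Factor sqrt(Nb+dNb)/sqrt(sigma**2+Nb+dNb) from the expression above."""
    return math.sqrt(nb + dnb) / math.sqrt(sigma ** 2 + nb + dnb)


def corrected_significance(significance, nb, dnb, sigma):
    """Scale a given significance by the fake-rate uncertainty factor."""
    return significance * fake_uncertainty_factor(nb, dnb, sigma)


# Hypothetical numbers: Nb = 64, dNb = 0 and sigma = 6 give a factor of 0.8.
print(corrected_significance(5.0, 64.0, 0.0, 6.0))
```

The factor is 1 when sigma is zero and decreases as the fake-rate uncertainty
grows, so the correction can only reduce the quoted significance.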
24. In section
6, the note
describes a variation of cuts as a way to
estimate
reconstruction
uncertainties.
The referees
do not feel that
varying the cuts is an adequate method
for
determining the systematic
effects due to uncertainties in the
reconstruction.
The authors
are requested to consult the different
PRS detector
groups for their
recommendation for applying systematic
uncertainties
on reconstructed
physics quantities.
In this analysis the reconstruction uncertainties
are mostly related to the jet energy scale and the lepton Pt
errors,
which enter the cut-based selection. There is no correlation between them, and
varying the energy scale and the lepton Pt is
the simplest method to account for these reconstruction
uncertainties.
In the NN analysis more variables are involved, and we did not
consider the influence of the reconstruction uncertainties for all of
them.
However, the most significant variables are still related to the
leptons
and jets and can be taken into account via the calculated
factors (~1%), added in quadrature.
This part is changed in the text.
25. In section
6, the note states
"The dimuons, dielectrons, trimuons
and all OSSF
pairs final states
can be treated as different
experiments
and the total
significance can be evaluated."
The referees
request that the
authors clarify the exact trigger
streams for
each of the dimuons,
dielectrons, trimuons and all OSSF
final
states. If an event
corresponds to multiple streams, then it
will possibly
enter into multiple
of the above "experiments." If an
event is
selected in one
"experiment" is it also explicitly vetoed in
all other
"experiments." If
the "OR" of all specified triggers is
taken, then
the different final
states can not be treated as different
independent
experiments.
The 2m and 2e trigger streams (the main streams for the trilepton study)
are separate and can be considered as independent
experiments. However, the electron contribution adds very little
and may be irrelevant. This part is updated in the text.
V.Zhukov
24.03.2006