Lecture 1.1 What is PAT and How to Use It?
Content
● A short reminder of the CMS EDM and Analysis Workflow
● The answer to the question: What is PAT?
● An introduction to the PAT DataFormats
● Configuration of the PAT DataFormats
● An introduction to the PAT Workflow
● Support and Documentation
PAT Tutorial June 2010
Reminder of the Event Data Model
● Configurable edm::Modules communicate with/via the EventContent
● Same file structure (i.e. ROOT) for: Gen-Sim-Digi-Reco-Analysis
● Single framework for Reconstruction (POGs) and Analysis (PAGs)
Typical CMS Analysis Workflow
● Prompt reconstruction at Tier-0.
● Central skims at Tier-1s.
● Users run cmsRun at Tier-2s:
    ● Perform high-level analysis steps.
    ● Preselect events.
    ● Write their own user-defined EventContent to private T2/T3 space.
● The latter step might be iterated.
● Copy reduced datasets to your favorite machine.
● Run your final analysis/produce plots.
PAT helps you to create a user-defined EventContent.
What is the Physics Analysis Toolkit?
PAT is a toolkit that is part of the CMSSW framework.
● It serves as a well-tested and supported common ground for group and user analyses.
● It facilitates reproducibility and comprehensibility of analyses.
● It is an interface between the sometimes complicated EDM and the simple mind of the common user.
● You can view it as a common language between CMS analysts:
    ● If another CMS analyst describes a PAT analysis to you, you can easily understand what he/she is talking about.
Three Aspects of PAT
Common Tool
● approved algorithms & sensible defaults
● synergy (everybody can profit from recent developments)
● canalizes expertise (via POG & PAG contacts)
Interface
● b/w RECO expertise & Analysis Level
● simplifies access via DataFormats
● quick start into analysis for beginners
● crossing point between POGs & PAGs ('vertical integration')
Common Format
● facilitates transfer & comparisons
● PAG common configurations
● sustained provenance
Facilitated Access to Event Information
● Do you know how to access this event information within the EDM?
  (slide diagram: items arranged around a central reco::Candidate)
    ● Correction Factors
    ● Object Resolutions
    ● JetFlavor
    ● Object Id
    ● Cluster shapes
    ● Generator Match
    ● Trigger Match
    ● Isolation (different from defaults)
    ● Associated Tracks
    ● BTag Algorithms
    ● JetCharge
    ● TagInfos
    ● More, ...
● With PAT Candidates you get this just by calling member functions!
● Note: Each PAT Candidate IS a corresponding reco::RecoCandidate (and more)
The PAT Data Formats ● All pat::Objects inherit from their corresponding reco::RecoCandidates ● A PAT Candidate is a reco::RecoCandidate PLUS more.
PAT Candidate Member Functions Check the Documentation: SWGuidePATDataFormats
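As a minimal sketch of what "just calling member functions" can look like, the helper below collects a few quantities from a single pat::Jet as seen from Python (for example inside an FWLite event loop). The helper name summarize_jet and the default discriminator label are illustrative choices and not part of PAT. The member functions pt(), eta(), partonFlavour(), jetCharge() and bDiscriminator() are assumed to be available in your release, and the jet charge and b-tag values are only meaningful if they were added when the jets were produced. Check SWGuidePATDataFormats for the authoritative list.

```python
def summarize_jet(jet, btag_label="trackCountingHighEffBJetTags"):
    """Collect a few quantities from one pat::Jet via its member functions.

    `jet` is expected to behave like a pat::Jet seen from Python.
    The helper name and the default b-tag label are illustrative assumptions.
    """
    return {
        # kinematics inherited from the reco::Candidate interface
        "pt": jet.pt(),
        "eta": jet.eta(),
        # extra information that PAT attaches to the jet
        "flavour": jet.partonFlavour(),
        "charge": jet.jetCharge(),
        "btag": jet.bDiscriminator(btag_label),
    }
```

In an FWLite loop the jets would typically come from a handle for std::vector<pat::Jet> filled by label; which collection label to use depends on your PAT configuration.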
Combine Flexibility and User Friendliness
● You can choose for yourself whether you really need all the extra information that the PAT Candidates provide.
● Still, you don't need to know how EDM/PAT manages this access for you under the hood.
(slide diagram labels: Flexibility, User Friendliness, Maximal Configuration)
● The key is: configuration of DataFormats by cfi file (e.g. for pat::Jets)!
Configuration of PAT DataFormats
You can configure the content of the DataFormats yourself (example: pat::Jet)!
Size: 14 kB/event (for ttbar)

    import FWCore.ParameterSet.Config as cms

    patJets = cms.EDProducer("PATJetProducer",
        ...
        # embedding of AOD items
        embedCaloTowers = cms.bool(False),
        embedPFCandidates = cms.bool(False),
        # jet energy corrections
        addJetCorrFactors = cms.bool(True),
        jetCorrFactorsSource = cms.VInputTag("patJetCorrFactors"),
        # btag information
        addBTagInfo = cms.bool(True),
        addDiscriminators = cms.bool(True),
        discriminatorSources = cms.VInputTag( ... ),
        # clone tag infos
        # ATTENTION: these take lots of space!
        # usually the discriminators from the default algos are sufficient
        addTagInfos = cms.bool(True),
        tagInfoSources = cms.VInputTag( ... ),
        # track association
        addAssociatedTracks = cms.bool(True),
        trackAssociationSource = cms.InputTag("ak5JetTracksAssociatorAtVertex"),
        # jet charge
        addJetCharge = cms.bool(True),
        jetChargeSource = cms.InputTag("patJetCharge"),
        # add jet ID
        addJetID = cms.bool(True),
        jetIDMap = cms.InputTag("ak5JetID"),
        ...
    )
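The switches in the cfi excerpt above can also be changed from your own configuration file instead of editing the cfi. The following is a sketch only, assuming that your cmsRun process already contains a module labelled patJets with the parameters shown above. The function name slim_pat_jets is hypothetical.

```python
def slim_pat_jets(process):
    """Reduce the pat::Jet content of an existing PAT configuration.

    Assumes `process` already holds a module labelled `patJets`
    (the label used in the cfi excerpt); the function name is hypothetical.
    """
    jets = process.patJets
    jets.addTagInfos = False       # drop the space-hungry TagInfo clones
    jets.addDiscriminators = True  # keep the b-tag discriminator values
    jets.embedCaloTowers = False   # do not embed the AOD calo towers
    return process
```

In a configuration file this would be called once after the PAT sequences are loaded, e.g. process = slim_pat_jets(process). Wrapping the changes in a function keeps them in one place and makes them easy to reuse across configurations.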
The PAT Workflow
Have a look at: SWGuidePATWorkflow
● Pre-production: steps before PAT Candidate creation
● PAT Candidate creation
● Main collection (w/o cleaning)
● Main collection (with cleaning)
This workflow is mirrored by the structure of the python directory in the PatAlgos package (don't be shy, check it out!).
EventContent of the default PAT Tuple
● Have a look at patEventContent_cff.py. Size: 20 kB/event (for ttbar)
● Have a look at patTemplate_cfg.py.
● But decide for yourself what your PAT Tuple should look like (add reco::Tracks or reco::GenParticles to the EventContent, or BTag information to the jets, etc.).
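Adding collections such as reco::Tracks or reco::GenParticles to the EventContent amounts to appending keep statements to the output module. The sketch below assumes an output module labelled out, as in patTemplate_cfg.py, and the usual module labels generalTracks and genParticles. Both labels and the function name add_to_event_content are assumptions to check against your own setup.

```python
# Example keep statements of the form 'keep <type>_<label>_<instance>_<process>'.
# The labels 'generalTracks' and 'genParticles' are assumptions.
EXTRA_KEEPS = [
    "keep recoTracks_generalTracks_*_*",
    "keep recoGenParticles_genParticles_*_*",
]


def add_to_event_content(process, extra_keeps=EXTRA_KEEPS):
    """Append keep statements to the output module of a PAT configuration.

    Assumes `process.out` exists and has an `outputCommands` list.
    Statements that are already present are not added a second time.
    """
    commands = process.out.outputCommands
    for statement in extra_keeps:
        if statement not in commands:
            commands.append(statement)
    return process
```

Every additional collection increases the size per event, so it is worth re-checking the event size after extending the EventContent.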
The concept of Maximal Configuration
● Configure your own DataFormats via embedding (see Lecture 2.2/Exercise 06).
● Add any extra info you need to the EventContent.
● Configure your workflow via the tools that PAT provides (see Lecture 2.1/Exercise 05).
● Apply selections via the StringCutParser.
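A selection via the StringCutParser is just a string that is evaluated on each candidate. The sketch below builds such a string and assigns it to a jet selector module. The module label selectedPatJets and its cut parameter are assumed to exist in your PAT configuration, and the function name and threshold values are purely illustrative.

```python
def tighten_jet_selection(process, min_pt=30.0, max_abs_eta=2.4):
    """Set a pt/eta string cut on the jet selector module.

    Assumes `process.selectedPatJets` exists and has a `cut` parameter;
    the function name and default thresholds are illustrative only.
    """
    cut = "pt > %.1f & abs(eta) < %.1f" % (min_pt, max_abs_eta)
    process.selectedPatJets.cut = cut
    return cut
```

The expression is written in terms of the candidate's member functions (here pt and eta), so the selection can be changed from the configuration file without recompiling any code.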
The Code Location
DataFormats/PatCandidates
● Definition of all PAT Candidates.
● pat::Photon, pat::Electron, pat::Muon, pat::Tau, pat::Jet, pat::MET, ...
PhysicsTools/PatAlgos
● Implementation and filling of all data formats.
● Definition of common workflow and PAT tools.
PhysicsTools/PatUtils
● Definition of common tools and helper functions used in PatAlgos.
PhysicsTools/PatExamples
● Location of many examples, e.g. all non-trivial examples used during this Tutorial.
Development
PAT is part of every CMSSW release. We recommend using it from the release!
Have a look at: SWGuidePATRecipes
Development (cont'd)
If you already want to use features/fixes that will go into the next release, follow the PAT release notes in the corresponding development branch.
Support
Check the main entry page of PAT in the software guide: SWGuidePAT
A short list of available support:
● Lecturers & Tutors
● Hypernews
● Community
● POG/PAG contacts
● Developers
● The well-developed PAT Documentation!
Documentation
● SWGuidePAT/WorkBookPAT: Main documentation pages.
● WorkBookPATDataFormats: Description of all PAT Candidates.
● WorkBookPATWorkflow: Description of the PAT workflow.
● WorkBookPATConfiguration: Description of the configuration of PAT.
● SWGuidePATTools: Description of all PAT tools.
● WorkBookPATTutorial: Tutorials and examples to get started.
● SWGuidePATRecipes: Installation recipes.
● SWGuidePATEventSize: Tools for event size estimation.
And last but not least: This Tutorial and/or former Tutorials...
Exercises
By now you should be prepared to do the following Exercises on WorkBookPATTutorial:
● Exercise 1 ( WorkBookPATDocNavigationExercise ): The PAT Documentation is one of the most looked after parts of the WorkBook. Knowing your documentation and how to use it can speed up your learning enormously. Learn more about the PAT Documentation and how to make effective use of it.
● Exercise 2 ( WorkBookTupleCreationExercise ): Learn how the default PAT tuple is produced, so that you are prepared to produce your own PAT tuples.
● Exercise 3 ( WorkBookTupleCrapExercise ): This is part of the CRAB tutorial. Once you are doing large-scale analyses you will need CRAB.
Have Fun!