H a c k i n g P o s t g r e S Q L PGCon 2013 Ottawa, Canada Stephen Frost sfrost@snowman.net Resonate, Inc. • Digital Media • PostgreSQL • Hadoop • techjobs@resonateinsights.com • http://www.resonateinsights.com
S t e p h e n F r o s t •PostgreSQL •Major Contributor •Implemented Roles in 8.3 •Column-Level Privileges in 8.4 •Contributions to PL/pgSQL, PostGIS •Resonate, Inc. •Principal Database Engineer •Online Digital Media Company •We're Hiring! * techjobs@resonateinsights.com
D o y o u r e a d . . . •planet.postgresql.org
P o s t g r e S Q L S o u r c e •Overall PG source tree structure c o n t r i b - c o n t r i b m o d u l e s ( M i g h t b e c o m e e x t e n s i o n s , o n e d a y . . ) d o c - D o c u m e n t a t i o n ( S G M L ) s r c - P o s t g r e S Q L " c o r e " ( C c o d e , m o s t l y ) . . . s r c / b a c k e n d - P o s t g r e S Q L s e r v e r ( " B a c k - E n d " ) s r c / b i n - p s q l , p g _ d u m p , i n i t d b , e t c ( " F r o n t - E n d " ) s r c / c o m m o n - C o d e c o m m o n t o f r o n t & b a c k s r c / i n c l u d e - . h f i l e s , a n d f r i e n d s s r c / i n t e r f a c e s - l i b p q , e c p g s r c / p l - C o r e p r o c e d u r a l l a n g u a g e s ( p l p g s q l , p l p e r l , t c l , e t c ) s r c / p o r t - P l a t f o r m - s p e c i f i c h a c k s s r c / t o o l s - D e v e l o p e r t o o l s ( p g i n d e n t , e t c )
D o w n t h e R a b b i t H o l e . . •Components of the backend (src/backend/...) a c c e s s - M e t h o d s f o r a c c e s s i n g d i f f e r e n t t y p e s o f d a t a ( h e a p , b t r e e i n d e x e s , g i s t / g i n , e t c ) . c a t a l o g - D e f i n i t i o n o f t h e P G t a b l e s ( p g _ c a t a l o g . * ) c o m m a n d s - U s e r - l e v e l S Q L c o m m a n d s ( A L T E R , C R E A T E T A B L E , V A C U U M , e t c ) e x e c u t o r - D u h , t h e E x e c u t o r - r u n s t h e q u e r i e s a f t e r p l a n n i n g / o p t i m i z a t i o n f o r e i g n - H a n d l e s F o r e i g n D a t a W r a p p e r s , u s e r m a p p i n g s , e t c l i b - " G e n e r a l P u r p o s e " / " M i s c " f u n c t i o n s ( b u t t h e y a r e e l s e w h e r e t o o . . ) l i b p q - B a c k e n d i n t e r f a c e t o t a l k t o l i b p q , a k a t h e w i r e l i n e p r o t o c o l m a i n - m a i n ( ) , d e t e r m i n e s h o w t h e b a c k e n d P G p r o c e s s i s s t a r t i n g a n d h a n d s o f f t o t h e r i g h t s u b s y s t e m n o d e s - G e n e r a l i z e d " N o d e " s t r u c t u r e i n P G a n d f u n c t i o n s t o c o p y , c o m p a r e , e t c o p t i m i z e r - Q u e r y o p t i m i z e r , i m p l e m e n t s t h e c o s t i n g s y s t e m a n d g e n e r a t e s a p l a n f o r t h e e x e c u t o r p a r s e r - L e x e r a n d G r a m m a r , h o w P G u n d e r s t a n d s t h e q u e r i e s y o u s e n d i t p o r t - B a c k e n d - s p e c i f i c p l a t f o r m - s p e c i f i c h a c k s p o s t m a s t e r - T h e " m a i n " P G p r o c e s s t h a t a l w a y s r u n s , a n s w e r s r e q u e s t s , h a n d s o f f c o n n e c t i o n s r e g e x - H e n r y S p e n c e r ' s r e g e x l i b r a r y , a l s o u s e d b y T C L , m a i n t a i n e d m o r e - o r - l e s s b y P G n o w r e p l i c a t i o n - B a c k e n d c o m p o n e n t s t o s u p p o r t r e p l i c a t i o n , s h i p p i n g W A L l o g s , r e a d i n g t h e m i n , e t c r e w r i t e - Q u e r y r e w r i t e e n g i n e , u s e d w i t h R U L E s s n o w b a l l - S n o w b a l l s t e m m i n g , u s e d w i t h f u l l - t e x t s e a r c h s t o r a g e - S t o r a g e l a y e r , h a n d l e s m o s t d i r e c t f i l e i / o , s u p p o r t f o r l a r g e o b j e c t s , e t c t c o p - " T r a f f i c C o p " - t h i s i s w h a t g e t s t h e a c t u a l q u e r i e s , r u n s t h e m , e t c t s e a r c h - F u l l - T e x t S e a r c h e n g i n e u t i l s - V a r i o u s b a c k - e n d u t i l i t y c o m p o n e n t s , c a c h e i n g s y s t e m , m e m o r y m a n a g e r , e t c
S o y o u h a v e a n i d e a . . •Where to begin? •Depends on your idea, but I prefer the parser •Grammar drives a lot of things •Also one of the hardest items to get agreement on •The grammar is in src/backend/parser/ •scan.l - lexer, handles tokenization •gram.y - actual grammar •Built with flex (lexer) and bison (parser) •Rarely have to change the lexer
M o d i f y i n g t h e g r a m m a r •Grammar is a set of productions •"main" is the 'stmt' production •Lists all the top-level commands •Each is its own production then s t m t : A l t e r E v e n t T r i g S t m t | A l t e r D a t a b a s e S t m t | A l t e r D a t a b a s e S e t S t m t . . . | C o p y S t m t C o p y S t m t : C O P Y o p t _ b i n a r y q u a l i f i e d _ n a m e o p t _ c o l u m n _ l i s t o p t _ o i d s c o p y _ f r o m o p t _ p r o g r a m c o p y _ f i l e _ n a m e c o p y _ d e l i m i t e r o p t _ w i t h c o p y _ o p t i o n s { C o p y S t m t * n = m a k e N o d e ( C o p y S t m t ) ; n - > r e l a t i o n = $ 3 ;
M o d i f y i n g C o p y S t m t •Add it into the COPY production •Modify the C template code as needed •C code is extracted by bison •Run through a set of changes (eg: changes "$3") •Compiled as part of the overall parser (gram.c) •Remember to update the keywords list (kwlist.h) •Also remember to add to unreserved_keywords •Try to avoid creating new reserved keywords
A d d i n g a n o p t i o n t o C O P Y - - - a / s r c / b a c k e n d / p a r s e r / g r a m . y + + + b / s r c / b a c k e n d / p a r s e r / g r a m . y @ @ - 5 2 1 , 8 + 5 2 1 , 8 @ @ s t a t i c v o i d p r o c e s s C A S b i t s ( i n t c a s _ b i t s , i n t l o c a t i o n , c o n s t c h a r * c o n s t r T y p e , - C O M M I T T E D C O N C U R R E N T L Y C O N F I G U R A T I O N C O N N E C T I O N C O N S T R A I N T C O N S T R A I N T S - C O N T E N T _ P C O N T I N U E _ P C O N V E R S I O N _ P C O P Y C O S T C R E A T E + C O M M I T T E D C O M P R E S S E D C O N C U R R E N T L Y C O N F I G U R A T I O N C O N N E C T I O N C O N S T R A I N T + C O N S T R A I N T S C O N T E N T _ P C O N T I N U E _ P C O N V E R S I O N _ P C O P Y C O S T C R E A T E @ @ - 2 4 0 3 , 6 + 2 4 0 3 , 1 0 @ @ c o p y _ o p t _ i t e m : { $ $ = m a k e D e f E l e m ( " h e a d e r " , ( N o d e * ) m a k e I n t e g e r ( T R U E ) ) ; } + | C O M P R E S S E D + { + $ $ = m a k e D e f E l e m ( " c o m p r e s s e d " , ( N o d e * ) m a k e I n t e g e r ( T R U E ) ) ; + } | Q U O T E o p t _ a s S c o n s t { $ $ = m a k e D e f E l e m ( " q u o t e " , ( N o d e * ) m a k e S t r i n g ( $ 3 ) ) ; @ @ - 1 2 4 7 1 , 6 + 1 2 4 7 5 , 7 @ @ u n r e s e r v e d _ k e y w o r d : | C O M M I T T E D + | C O M P R E S S E D | C O N F I G U R A T I O N
W h a t a b o u t t h e c o d e ? •COPY has a function to process options •Surprise, it's called "ProcessCopyOptions" •COPY is defined in backend/commands/copy.c •COPY state info •Local state structure CopyStateData also in copy.c •Not in a .h because only COPY needs it •Define structures in .c files near the top
O p t i o n h a n d l i n g i n c o p y . c @ @ - 1 0 9 , 6 + 1 1 9 , 7 @ @ t y p e d e f s t r u c t C o p y S t a t e D a t a b o o l b i n a r y ; / * b i n a r y f o r m a t ? * / + b o o l c o m p r e s s e d ; / * c o m p r e s s e d f i l e ? * / b o o l o i d s ; / * i n c l u d e O I D s ? * / @ @ - 8 8 9 , 6 + 1 1 8 6 , 2 0 @ @ P r o c e s s C o p y O p t i o n s ( C o p y S t a t e c s t a t e , } + e l s e i f ( s t r c m p ( d e f e l - > d e f n a m e , " c o m p r e s s e d " ) = = 0 ) + { + # i f d e f H A V E _ L I B Z + i f ( c s t a t e - > c o m p r e s s e d ) + e r e p o r t ( E R R O R , + ( e r r c o d e ( E R R C O D E _ S Y N T A X _ E R R O R ) , + e r r m s g ( " c o n f l i c t i n g o r r e d u n d a n t o p t i o n s " ) ) ) ; + c s t a t e - > c o m p r e s s e d = d e f G e t B o o l e a n ( d e f e l ) ; + # e l s e + e r e p o r t ( E R R O R , + ( e r r c o d e ( E R R C O D E _ S Y N T A X _ E R R O R ) , + e r r m s g ( " N o t c o m p i l e d w i t h z l i b s u p p o r t . " ) ) ) ; + # e n d i f + } e l s e i f ( s t r c m p ( d e f e l - > d e f n a m e , " o i d s " ) = = 0 )
Recommend
More recommend