Background Spatial Dependence JI Models Identifiable JI Finding Siblings SLTD2 Class Size Finite Data Conc

TOPOLOGY IDENTIFIABILITY

EXAMPLE 1 LESSONS
• Example 1 gave two models with the same $f_R(M)$ but different $T(M)$.
• So in that case, $T$ is not identifiable.
• We must restrict $\mathcal{M}$ if we hope to identify $T$ for each $M \in \mathcal{M}$.

TOPOLOGICAL DETERMINISM
• A class $\mathcal{M}$ is Topologically Determinate if $\nexists\, M_1, M_2 \in \mathcal{M}$ with $f_R(M_1) = f_R(M_2)$ and $T(M_1) \neq T(M_2)$.
• That is, models with the same $f_R$ have the same $T$.
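The definition of Topological Determinism can be illustrated on a finite collection of models. The sketch below is only a toy check under an assumed representation: each model is a pair `(f_R, T)` of hashable stand-ins for the receiver distribution and the topology, and `is_topologically_determinate` is a hypothetical helper name, not something defined in the slides.

```python
def is_topologically_determinate(models):
    """Return True if no two models share f_R while differing in T.

    `models` is an iterable of (f_R, T) pairs; both entries are
    hashable stand-ins (an assumed representation for illustration).
    """
    topology_for = {}
    for f_R, T in models:
        # setdefault stores T the first time f_R is seen,
        # otherwise it returns the topology stored earlier.
        if topology_for.setdefault(f_R, T) != T:
            return False
    return True


# Two toy models with the same receiver distribution but different trees,
# mirroring the situation described for Example 1.
example_1 = [("f_a", "tree_1"), ("f_a", "tree_2")]
# A restricted class in which f_R determines T.
restricted = [("f_a", "tree_1"), ("f_b", "tree_2")]
```

On `example_1` the check fails, while on `restricted` it succeeds, which is the sense in which restricting the class can restore identifiability of $T$.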
GOALS (INFINITE DATA CASE)
• Find "large", natural Topologically Determinate class(es) $\mathcal{M}$.
• Find an algorithm guaranteed to recover $T(M)$ for all $M \in \mathcal{M}$.

EXAMPLE: CLASSICAL MODELS $\mathcal{M}_C$
• Classical models are Topologically Determinate.
• SLTD works for them.
• In fact, there is one model per $f_R(M)$, so one model per $T$.
NEW CLASSES
[Figure: diagram of the model classes $\mathcal{M}_{AJIE}$, $\mathcal{M}_{CE}$, $\mathcal{M}_{AJI}$, $\mathcal{M}_{JI}$ and $\mathcal{M}_C$.]
DIMENSIONS OF NEW CLASSES

$T$ (one column per example tree)
$\dim(\mathcal{M}_C, T)$         4    6     9     14         29
$\dim(\mathcal{M}_{CE}, T)$     12   54   489  14350  536805405
$\dim(\mathcal{M}_{JI}, T)$     15   56   478  14133  536613988
$\dim(\mathcal{M}_{AJI}, T)$    15   56   478  14133  536613988
$\dim(\mathcal{M}_{AJIE}, T)$   15   57   489  14395  536805415
$\dim(\mathcal{M}_T)$           15   63   511  16383  536870911

TABLE: Examples of model class dimensions.
CLASSICALLY EQUIVALENT MODELS: $\mathcal{M}_{CE}$

DEFINITION
$M_1 \in \mathcal{M}_{CE}$ iff $\exists\, M_2 \in \mathcal{M}_C$ with $f_R(M_1) = f_R(M_2)$ and $T(M_1) = T(M_2)$.
These are models that appear classical.

SLTD STILL WORKS!
• SLTD returns $T(M)$ correctly for every $M \in \mathcal{M}_{CE}$.
• It returns the topology as though $M$ were classical.
• $\therefore$ it returns the correct topology.
• So $\mathcal{M}_{CE}$ is Topologically Determinate.

EXTENSION TRICK WORKS IN GENERAL
• The same extension can be applied to any algorithm and any class it works on.
COMMENTS ON $\mathcal{M}_{CE}$

STRENGTHS
• $\mathcal{M}_C \subset \mathcal{M}_{CE}$.
• Much larger than $\mathcal{M}_C$.
• Can contain complex spatial dependencies.

DRAWBACKS
• Not constructive.
• Depends on receiver positions.
• Need a model class that:
  • is not based on receiver positions;
  • reflects properties of real networks.
PAINLESS GENERALITY

RECALL
• $X_k = \prod_{i \in (0 \to k)} Z_i$.

DEPENDENCY OF HIDDEN $Z$
• If $X_i = 0$ then $X_k = 0$ for all $k$ below $i$.
• If $X_{f(i)} = 0$ then changing the value of $Z_i$ won't change the output.
• This suggests a way of adding dependency without affecting $f_R(M)$.
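The observation that $Z_i$ is irrelevant once $X_{f(i)} = 0$ can be checked on a small example. The sketch below assumes a toy tree given as a `parent` map (root 0, link $i$ being the link into node $i$) and binary link states `Z`; the helper names `path_to_root` and `node_states` are hypothetical.

```python
def path_to_root(parent, k):
    """Nodes on the path 0 -> k, excluding the root 0 itself."""
    path = []
    while k != 0:
        path.append(k)
        k = parent[k]
    return path


def node_states(parent, Z):
    """X_k = product of Z_i over the path 0 -> k, for every non-root node k."""
    X = {}
    for k in parent:
        x = 1
        for i in path_to_root(parent, k):
            x *= Z[i]
        X[k] = x
    return X


# Assumed toy tree: 0 -> 1, 1 -> {2, 3}.
parent = {1: 0, 2: 1, 3: 1}
X_a = node_states(parent, {1: 0, 2: 1, 3: 1})
X_b = node_states(parent, {1: 0, 2: 0, 3: 1})  # flip Z_2 while X_1 = 0
```

Here `X_a` and `X_b` coincide, since the loss on link 1 masks whatever happens on link 2. This is the freedom that allows dependency to be added among the hidden $Z$ without changing the receiver distribution.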
MODEL PRINCIPLES

HOW DOES DEPENDENCY ARISE?
• Links touch at routers and are influenced by router traffic and dynamics.
  – Suggests dependencies between siblings.
• Distant links are unlikely to affect each other except via the tree.
  – Suggests ruling out 'action at a distance'.

TRANSLATION TO MODEL PRINCIPLES
• Locally: the most general possible dependency between adjacent links.
• Globally: only the necessary dependency over non-adjacent links.
JUMP INDEPENDENCE

DEFINITION (JUMP INDEPENDENT MODELS)
A model with links $L$ and receivers $R$ is Jump Independent if, $\forall k \in V \setminus R$ and $\forall J \subset V$ with $J \cap d(k) = \emptyset$, $X_{c(k)}$ is conditionally independent of $X_J$ given $X_k = 1$.
DEFINITIONS

DEFINITION (SUBTREE INDUCED BY $U$)
Let $M(T, f_Z) \in \mathcal{M}_{JI}$ with $T = (V, L)$, and let $U \subset V$. Define the subtree induced by $U$ as $T(U) = \bigcup_{i \in U} \{0 \to i\}$, and $R(U)$ as the leaves of $T(U)$.

DEFINITION ($\rho$-VALUES)
Define the sibling passage probabilities $\rho_D = \Pr\big(\bigcap_{j \in D} \{X_j = 1\} \mid X_{f(D)} = 1\big)$ for each set of siblings $D$.
FUNDAMENTAL PROPERTY OF JI MODELS

LEMMA (FUNDAMENTAL PROPERTY OF JI MODELS)
Let $M(T, f_Z) \in \mathcal{M}_{JI}$. Then
$$\Pr\Big(\bigcap_{k \in U} \{X_k = 1\}\Big) = \prod_{i \in T(U) \setminus R(U)} \rho_{c(i) \cap T(U)}$$
for every $U \subset V$.
Example: $U = \{2, 5, 6\}$
$$\Pr(X_2 = 1, X_5 = 1, X_6 = 1) = \rho_1 \cdot \rho_{2,3} \cdot \rho_{4,5} \cdot \rho_6$$
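The index sets appearing in the product of the lemma can be enumerated mechanically. The sketch below is an illustration under an assumption: the tree drawing for the example is not available here, so the `parent` map (0 → 1, 1 → {2, 3}, 3 → {4, 5}, 4 → 6) was chosen to be consistent with the stated product. The helper names `induced_subtree` and `rho_factors` are hypothetical.

```python
def induced_subtree(parent, U):
    """Nodes of T(U): the union of the paths 0 -> i for i in U (root included)."""
    nodes = {0}
    for i in U:
        while i != 0:
            nodes.add(i)
            i = parent[i]
    return nodes


def rho_factors(parent, U):
    """Sibling sets c(i) ∩ T(U), one per non-leaf node i of T(U)."""
    nodes = induced_subtree(parent, U)
    children = {}
    for k in nodes - {0}:
        children.setdefault(parent[k], set()).add(k)
    # Nodes of T(U) without a child in T(U) are its leaves R(U); they get no entry.
    return [sorted(children[i]) for i in sorted(children)]


# Assumed tree, consistent with the example's product (not taken from a figure).
parent = {1: 0, 2: 1, 3: 1, 4: 3, 5: 3, 6: 4}
factors = rho_factors(parent, {2, 5, 6})
```

For $U = \{2, 5, 6\}$ the factors come out as `[[1], [2, 3], [4, 5], [6]]`, matching $\rho_1 \cdot \rho_{2,3} \cdot \rho_{4,5} \cdot \rho_6$.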
SHARED TRANSMISSION IN JI MODELS
• For $i, j \in V$, with $b$ the branch point of $i$ and $j$ and 1, 2 denoting its children on the paths to $i$ and $j$:
$$S_{i,j} = \Pr(X_b = 1) \cdot \frac{\rho_1 \rho_2}{\rho_{1,2}} = \Big(\prod_{k \in 0 \to b} \rho_k\Big) \cdot \frac{\rho_1 \rho_2}{\rho_{1,2}}$$
• Shared Transmission is a function of the shared path and the two children at the branch point.
BINARY JI MODELS

MEASUREMENT EQUIVALENCE
• Assume $M_1 \in \mathcal{M}_{JI}$ and $M_2 \in \mathcal{M}_C$ with $T(M_1) = T(M_2)$.
• Solve for $l_i$ from $M_2$ in terms of $\rho_J$ from $M_1$:
$$l_i = \begin{cases}
\dfrac{\rho_{i,s(i)}}{\rho_{s(i)}} & \text{if } i \in R, \\[2ex]
\rho_i \cdot \dfrac{\rho_{c_1(i)} \rho_{c_2(i)}}{\rho_{c_1(i),c_2(i)}} & \text{if } i = 1, \\[2ex]
\dfrac{\rho_{i,s(i)}}{\rho_{s(i)}} \cdot \dfrac{\rho_{c_1(i)} \rho_{c_2(i)}}{\rho_{c_1(i),c_2(i)}} & \text{otherwise.}
\end{cases}$$

OBTAIN (BINARY) EXAMPLES OF MODELS IN CE
• If $l_i < 1$, it must be the marginal link passage parameter of the CE model.
• Insight: sibling dependencies are compensated by a change in transmission on the path to the father.
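A small numerical check can make the measurement equivalence concrete. The sketch below assumes a binary tree in which the root 0 has the single child 1, and the data layout (`children`, `rho`, `rho_pair`) and function name `equivalent_link_params` are hypothetical choices made for illustration. The $\rho$ values are invented toy numbers.

```python
def equivalent_link_params(children, rho, rho_pair):
    """Classical link passage values l_i matching a binary JI model.

    children : dict, internal node -> (c1, c2); the root 0 is omitted
    rho      : dict, node -> single-node passage probability rho_i
    rho_pair : dict, frozenset({a, b}) -> joint sibling passage probability
    """
    sibling = {}
    for a, b in children.values():
        sibling[a], sibling[b] = b, a

    l = {}
    for i in rho:
        if i == 1:
            factor = rho[1]  # node 1 is assumed to hang alone below the root
        else:
            s = sibling[i]
            factor = rho_pair[frozenset((i, s))] / rho[s]
        if i in children:  # internal node: include the children's correction
            c1, c2 = children[i]
            factor *= rho[c1] * rho[c2] / rho_pair[frozenset((c1, c2))]
        l[i] = factor
    return l


# Toy model (assumed numbers): 0 -> 1, 1 -> {2, 3}, receivers 2 and 3.
children = {1: (2, 3)}
rho = {1: 0.9, 2: 0.8, 3: 0.7}
rho_pair = {frozenset((2, 3)): 0.6}
l = equivalent_link_params(children, rho, rho_pair)
```

With these numbers the classical products reproduce the JI receiver probabilities: `l[1] * l[2]` equals $\rho_1 \rho_2$ and `l[1] * l[2] * l[3]` equals $\rho_1 \rho_{2,3}$, up to floating-point error.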
IDENTIFIABILITY FAILURE: INVISIBLE PATHS

LEMMA
Let $i, j, k$ be three distinct receivers in a Jump Independent model such that $b(i,k)$ is below $b(i,j)$. Then $S(i,k) = S(j,k)$ if and only if $b(i,j) \to b(i,k)$ is invisible.
AUGMENTED PATH
• An augmented path $g(g_1, g_2) \to h(h_1, h_2)$ is a path $g \to h$ together with $g_1, g_2 \in c(g)$ and $h_1, h_2 \in c(h)$ such that $g_1 \in g \to h$.
INVISIBLE PATH
• An augmented path is invisible if
$$\frac{\rho_{g_1} \rho_{g_2}}{\rho_{g_1,g_2}} = \Big(\prod_{i \in g \to h} \rho_i\Big) \frac{\rho_{h_1} \rho_{h_2}}{\rho_{h_1,h_2}}.$$
• For Binary models this reduces to
$$\prod_{i \in g \to h} l_i = 1.$$
• Analogue of the condition $l_k \neq 1$ from the classical case.
IDENTIFIABILITY FAILURE: LOCAL STRUCTURE

LOCAL LIMITATIONS ON ANY SIBLING SET $J$
• Internally agreeing if $S_{i,j} = S_{k,l}$ $\forall i, j, k, l \in J$ with $i \neq j$, $k \neq l$.
• Internally disagreeing if $S_{i,j} \neq S_{k,l}$ $\forall i, j, k, l \in J$ with $\{i,j\} \neq \{k,l\}$.

ROLES
• Disagreeing is the generic/general case.
• Agreeing includes the classical case.
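The two local conditions can be tested directly from the pairwise shared transmissions of a sibling set. The following sketch assumes `S` is a dictionary keyed by unordered pairs; the function name `classify_sibling_set`, the label `"mixed"` and the tolerance `tol` are hypothetical additions for illustration.

```python
from itertools import combinations


def classify_sibling_set(S, J, tol=0.0):
    """Label J as 'agreeing', 'disagreeing' or 'mixed' from its pairwise S values.

    S maps frozenset({i, j}) -> shared transmission S_{i,j}.
    tol is an assumed tolerance for treating two values as equal.
    """
    values = [S[frozenset(pair)] for pair in combinations(sorted(J), 2)]
    equal = [abs(a - b) <= tol for a, b in combinations(values, 2)]
    if all(equal):
        return "agreeing"  # also returned when |J| = 2, where both conditions hold vacuously
    if not any(equal):
        return "disagreeing"
    return "mixed"


pairs = [frozenset(p) for p in ((1, 2), (1, 3), (2, 3))]
S_same = dict(zip(pairs, (0.5, 0.5, 0.5)))
S_distinct = dict(zip(pairs, (0.5, 0.6, 0.7)))
S_partial = dict(zip(pairs, (0.5, 0.5, 0.7)))
```

A set such as `S_partial`, where some pairs agree and others do not, is neither internally agreeing nor internally disagreeing.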
AGREEABLE JI MODELS

DEFINITION (AGREEABLE JI MODELS ($\mathcal{M}_{AJI}$))
An AJI model is a model $M \in \mathcal{M}_{JI}$ which satisfies:
i) (internally consistent) Each sibling set $J$ is agreeing or disagreeing.
ii) (no invisible paths) No augmented paths in $M$ are invisible.

ROLE OF RESTRICTIONS
• Condition (i) prevents sibling sets from looking like they aren't.
• Condition (ii) prevents groups of non-siblings from looking like they are siblings.

Including 'agreeing' in (i) is a big headache, but important!
A PROPERTY OF SIBLINGS IN JI MODELS

LEMMA (SIBLINGS AGREE EXTERNALLY)
Let $M \in \mathcal{M}_{JI}$. If two nodes $i, j$ are members of a sibling set $J$, and $k \in R$ is such that $(0 \to k) \cap J = \emptyset$, then $S_{i,k} = S_{j,k}$.
SEEKING CERTAIN PATERNITY

TRY TO INVERT SIBLING PROPERTY
• Define the agreement set of $i, j \in V$: $A_{i,j} = \{k \in R : S(i,k) = S(j,k),\ k \neq i, j\}$.
• Agreement sets are used to compare the 'world view' of candidate siblings.
FINDING COMPLETE SIBLING SETS

DEFINITION (EXTERNALLY-AGREEING SETS)
Call $D \subset R$ an externally-agreeing set (EAS) if $|D| \geq 3$ and $A_{i,j} = R \setminus D$ for all distinct $i, j \in D$.

DEFINITION (ALL-AGREEING SETS)
Call $D \subset R$ with $|D| \geq 2$ an all-agreeing set (AAS) if $A_{i,j} = R \setminus \{i,j\}$ for all distinct $i, j \in D$. Subsets of an all-agreeing set are also all-agreeing. Call an all-agreeing set $D$ a maximal all-agreeing set (MAAS) if it is not a proper subset of another one.
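Agreement sets and the EAS and AAS conditions translate almost line by line into code. The sketch below works in the infinite-data setting, so it compares exact values of $S$; the dictionary layout and the function names `agreement_set`, `is_eas` and `is_aas` are hypothetical, and the numbers are invented toy values for three sibling receivers 1, 2, 3 and one outside receiver 4.

```python
from itertools import combinations


def agreement_set(S, R, i, j):
    """A_{i,j}: receivers k other than i, j with S(i,k) == S(j,k)."""
    return {k for k in R if k not in (i, j)
            and S[frozenset((i, k))] == S[frozenset((j, k))]}


def is_eas(S, R, D):
    """Externally-agreeing set: |D| >= 3 and A_{i,j} equals R minus D for all pairs."""
    return len(D) >= 3 and all(
        agreement_set(S, R, i, j) == R - D
        for i, j in combinations(sorted(D), 2))


def is_aas(S, R, D):
    """All-agreeing set: |D| >= 2 and A_{i,j} equals R minus {i, j} for all pairs."""
    return len(D) >= 2 and all(
        agreement_set(S, R, i, j) == R - {i, j}
        for i, j in combinations(sorted(D), 2))


R = {1, 2, 3, 4}
external = {frozenset((i, 4)): 0.9 for i in (1, 2, 3)}
S_agree = {**external, frozenset((1, 2)): 0.8,
           frozenset((1, 3)): 0.8, frozenset((2, 3)): 0.8}
S_disagree = {**external, frozenset((1, 2)): 0.8,
              frozenset((1, 3)): 0.7, frozenset((2, 3)): 0.6}
```

In this toy setup `{1, 2, 3}` is an AAS but not an EAS under `S_agree`, and an EAS but not an AAS under `S_disagree`, in line with the agreeing and disagreeing cases.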
LEMMA (FINDING DISAGREEING SIBLING SETS)
Consider $M \in \mathcal{M}_{AJI}$ with receiver nodes $R$. A set $D \subset R$ with $|D| \geq 3$ is a disagreeing sibling set if and only if it is an EAS.

LEMMA (FINDING AGREEING SIBLING SUBSETS)
Consider $M \in \mathcal{M}_{AJI}$ with receiver nodes $R$. A set $D \subset R$ with $|D| \geq 2$ is a subset of an agreeing sibling set if and only if it is an AAS.
• The MAAS are the maximal agreeing sibling subsets.
• Some or all of these may still have hidden siblings.
PROPOSITION (CERTAIN PATERNITY II)
Assume a model $M \in \mathcal{M}_{AJI}$. Then at least one available sibling set can be identified without error.

PROOF
• Find all the EAS and AAS.
• Case 1: at least one EAS exists. Select any of them.
• Case 2: no EAS exists. Select a MAAS which is a sibling set (can test if one is below another).
SLTD2
Similar to SLTD, but agreement set based.

THEOREM (CORRECTNESS OF SLTD2 ON $\mathcal{M}_{AJI}$)
Let $M = (T, f_Z) \in \mathcal{M}_{AJI}$. Then SLTD2 returns $T$.

PROOF
• Find a sibling set using Certain Paternity.
• $S(i,j) = \tilde{S}(i,j)$ for $M \in \mathcal{M}_{JI}$.
• So each iteration will be correct.
• Hence recover $T$ at termination.
AJIE MODELS
• Defined analogously to $\mathcal{M}_{CE}$, but starting with $\mathcal{M}_{AJI}$ instead of $\mathcal{M}_C$.
• $\mathcal{M}_{CE} \subset \mathcal{M}_{AJIE}$, since $\mathcal{M}_C \subset \mathcal{M}_{AJI}$.
• SLTD2 succeeds on all topologies in $\mathcal{M}_{AJIE}$.
RELATIONSHIPS BETWEEN CLASSES
[Figure: diagram of the relationships between the model classes $\mathcal{M}_{AJIE}$, $\mathcal{M}_{CE}$, $\mathcal{M}_{AJI}$, $\mathcal{M}_{JI}$ and $\mathcal{M}_C$.]
DIMENSIONS OF CLASSES

$T$ (one column per example tree)
$\dim(\mathcal{M}_C, T)$         4    6     9     14         29
$\dim(\mathcal{M}_{CE}, T)$     12   54   489  14350  536805405
$\dim(\mathcal{M}_{JI}, T)$     15   56   478  14133  536613988
$\dim(\mathcal{M}_{AJI}, T)$    15   56   478  14133  536613988
$\dim(\mathcal{M}_{AJIE}, T)$   15   57   489  14395  536805415
$\dim(\mathcal{M}_T)$           15   63   511  16383  536870911

TABLE: Examples of model class dimensions.
INFINITE DATA SUMMARY

PREVIOUS WORK
• Classical model: full spatial independence of the tree loss process.
• Algorithm SLTD to recover the topology in this case.

OUR WORK
• Break spatial independence assumptions.
• Define a more general class $\mathcal{M}_{CE}$ such that SLTD still works.
• General result for extending a class while keeping the algorithm.
• Define class $\mathcal{M}_{JI}$ with physically motivated structure.
• Find TD class $\mathcal{M}_{AJI}$ with $\dim(\mathcal{M}_{AJI}) = \dim(\mathcal{M}_{JI})$.
• New algorithm SLTD2 recovers the topology for all $M \in \mathcal{M}_{AJI}$.
• Also recovers the topology for all $M \in \mathcal{M}_{AJIE}$.
CHALLENGES FOR FINITE DATA
• The underlying $S_{ij}$ are not known, only estimated.
• The exact $S_{ij}$ equality underlying the agreement set definition fails.
• Random topology selection in $\mathcal{M}_{AJI}$, with degree constraints.
• Random model selection, with loss constraints.
• A sensible error metric on trees.
A SLTD BASED ALGORITHM

MODIFIED ITERATION
• Estimate shared transmission over all pairs:
$$\hat{S}_{ij} = \frac{\big(\sum X_i / n_p\big)\big(\sum X_j / n_p\big)}{\sum X_i X_j / n_p}.$$
• Merge $i, j$ into $J^* = (ij)$ with minimal $\hat{S}_{ij}$.
• Merge additional receivers $k$ into $J^*$ obeying (we use $\beta = 0.002$)
$$\hat{S}_{(ij)k} \leq (1 + \beta)\, \hat{S}^*.$$
Straightforward because the key steps are based on inequality of $\hat{S}_{ij}$.
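The shared transmission estimator is simple to compute from per-probe receipt indicators. The sketch below is a minimal illustration assuming each receiver's observations are a list of 0/1 values of common length $n_p$; the function name `shared_transmission_estimate` and the guard against an empty joint count are hypothetical additions.

```python
def shared_transmission_estimate(x_i, x_j):
    """S_hat_ij = (sum X_i / n_p)(sum X_j / n_p) / (sum X_i X_j / n_p).

    x_i, x_j : equal-length lists of 0/1 receipt indicators, one per probe.
    """
    n_p = len(x_i)
    both = sum(a * b for a, b in zip(x_i, x_j))
    if both == 0:
        # Assumed handling: the ratio is undefined without joint receipts.
        raise ValueError("no probe reached both receivers")
    return (sum(x_i) / n_p) * (sum(x_j) / n_p) / (both / n_p)


# Toy data (invented): four probes.
x_1 = [1, 1, 1, 0]
x_2 = [1, 1, 0, 0]
s_hat = shared_transmission_estimate(x_1, x_2)
```

For the toy data the marginal rates are 0.75 and 0.5 and the joint rate is 0.5, so the estimate is 0.75.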
MEASURING APPROXIMATE AGREEMENT

THREE STEPS TO MEASURE AGREEMENT OF $J$ TO $A$
(i) shared passage measure $p_{k;ij}$ ($|J| = 2$ and $|A| = 1$);
(ii) agreement set measure $g_{ij}(A)$ ($|J| = 2$ and $|A| \geq 1$);
(iii) sibling set measure $r_A(J)$ ($|J| \geq 2$ and $|A| \geq 1$).
STEP (I): SHARED PASSAGE MEASURE $p_{k;ij}$ ($|J| = 2$ AND $|A| = 1$)
Let $p_{k|i} = \Pr(X_k = 1 \mid X_i = 1)$. From the definition, $S_{ik} = S_{jk}$ is equivalent to $p_{k|i} = p_{k|j}$.
Estimate $p_{k|i}$ by $\hat{p}_{k|i} = \sum (X_k X_i) / n_i$.
Null hypothesis: $p_{k|i} = p_{k|j}$. Under H0, $\hat{p}_{k|} = (n_i \hat{p}_{k|i} + n_j \hat{p}_{k|j}) / (n_i + n_j)$.
Test statistic:
$$T_{ij}(k) = \frac{\hat{p}_{k|i} - \hat{p}_{k|j}}{\sqrt{\frac{n_i + n_j}{n_i n_j}\, \hat{p}_{k|} (1 - \hat{p}_{k|})}}$$
with corresponding (Gaussian based) p-value $p_{ij} \in [0, 1]$.
Higher $p_{ij}$ $\implies$ higher agreement.
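Step (i) is a two-proportion test, which can be sketched with the standard library alone. Several details below are assumptions rather than statements from the slides: $n_i$ is taken to be the number of probes received at $i$ (assumed positive), the p-value is taken to be two-sided, and a zero pooled variance is mapped to a p-value of 1. The function name `agreement_p_value` is hypothetical.

```python
import math


def agreement_p_value(x_i, x_j, x_k):
    """Gaussian p-value for H0: p_{k|i} = p_{k|j}, from 0/1 receipt lists."""
    n_i, n_j = sum(x_i), sum(x_j)  # assumed meaning of n_i, n_j; must be > 0
    p_ki = sum(a * b for a, b in zip(x_k, x_i)) / n_i
    p_kj = sum(a * b for a, b in zip(x_k, x_j)) / n_j
    pooled = (n_i * p_ki + n_j * p_kj) / (n_i + n_j)
    var = (n_i + n_j) / (n_i * n_j) * pooled * (1 - pooled)
    if var == 0:
        # pooled is 0 or 1, so both estimates coincide (assumed convention).
        return 1.0
    t = (p_ki - p_kj) / math.sqrt(var)
    # Two-sided tail of the standard normal: 2 * (1 - Phi(|t|)).
    return math.erfc(abs(t) / math.sqrt(2))


# Toy data (invented): i and j see k identically in the first case.
same = agreement_p_value([1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 0, 0], [1, 1, 0, 0, 1, 0])
# Here k is always received with i and never with j.
far = agreement_p_value([1] * 10 + [0] * 10, [0] * 10 + [1] * 10, [1] * 10 + [0] * 10)
```

Identical conditional estimates give a p-value of 1, while strongly different ones push it towards 0, matching the reading that a higher p-value means higher agreement.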
STEP (II): AGREEMENT SET MEASURE $g_{ij}(A)$ ($|J| = 2$ AND $|A| \geq 1$)
Let $A \subset B \setminus \{i,j\}$, and select a significance level $\alpha$.
Note the good proportion, $g_p$, of the $p(k)$ obeying $p(k) > \alpha$, $k \in A$. (Avoids using the p-value as a weight, which is a bad idea.)
Note the worst agreement: $g_w = \min_{k \in A} p(k)$.
(For $g_p$ and $g_w$, higher values $\implies$ closer agreement.)
Define $g_{ij}(A) = g_p$, using $g_w$ to break ties. In other words, agreement follows the worst case in $A$.
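One way to realise "use $g_p$, break ties with $g_w$" is to return the pair $(g_p, g_w)$ and rely on lexicographic tuple ordering. This is only a sketch of that idea; the function name `agreement_set_measure` and the default `alpha=0.05` are assumptions, since no value of $\alpha$ is given above.

```python
def agreement_set_measure(p_values, alpha=0.05):
    """Return (g_p, g_w) for the p-values p(k), k in A.

    Tuples compare lexicographically, so ranking candidates by this pair
    ranks by g_p first and uses g_w only to break ties.
    """
    if not p_values:
        raise ValueError("A must contain at least one node")
    g_p = sum(p > alpha for p in p_values) / len(p_values)  # good proportion
    g_w = min(p_values)                                     # worst agreement
    return (g_p, g_w)


# Toy p-values (invented) for three nodes k in A.
g = agreement_set_measure([0.5, 0.2, 0.01], alpha=0.05)
```

Two candidate pairs with the same good proportion are then ordered by their worst p-value, for example `(1.0, 0.1) > (1.0, 0.06)`.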
STEP (III): SIBLING SET MEASURE $r_A(J)$ ($|J| \geq 2$ AND $|A| \geq 1$)
Assume $A \subset B \setminus J$. To define $r_A(J)$, we must combine the values of $g_{ij}(A)$ for all pairs $\{i,j\} \subset J$.
Per-leaf noise reduction: for each $k \in J$, average the $g(A)$ values involving $k$.
Define $r_A(J) \in [0, 1]$ as the smallest such average. (The signature of bad leaves won't be diluted.)
Notes:
– $r_A(J) = g(A)$ whenever $|J| = 2$, such as in binary trees.
– Typically $A = B \setminus J$, in which case we write simply $r(J)$.
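The per-leaf averaging followed by a minimum can be sketched as follows. As a simplifying assumption, each $g_{ij}(A)$ is treated as a single number in $[0, 1]$ (the tie-breaking component is ignored), stored in a dictionary keyed by unordered pairs; the function name `sibling_set_measure` is hypothetical.

```python
def sibling_set_measure(g, J):
    """r_A(J): smallest per-leaf average of the pairwise measures g_ij(A).

    g maps frozenset({i, j}) -> scalar g_ij(A) for a fixed A.
    """
    J = sorted(J)
    if len(J) < 2:
        raise ValueError("J needs at least two nodes")
    averages = []
    for k in J:
        # Average the pairwise values that involve leaf k.
        vals = [g[frozenset((k, m))] for m in J if m != k]
        averages.append(sum(vals) / len(vals))
    return min(averages)


# Toy values (invented): leaf 3 agrees poorly with both 1 and 2.
g = {frozenset((1, 2)): 1.0, frozenset((1, 3)): 0.5, frozenset((2, 3)): 0.5}
r = sibling_set_measure(g, {1, 2, 3})
```

Here the averages are 0.75, 0.75 and 0.5, so `r` is 0.5: the weak leaf 3 determines the score rather than being averaged away. For $|J| = 2$ the function returns the single pairwise value.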
DEFINING TRUE TREE
Inspired by SLTD2; tries to use $r(J)$ to identify the MAAS and EAS.

LOCATING AN EAS
It is infeasible to search for the highest $r(J)$ at each iteration: there are too many $J$.