
Finding Periodicities in Astronomical Light Curves using Information Theoretic Learning



  1. Finding Periodicities in Astronomical Light Curves using Information Theoretic Learning. Pablo Huijse H., Department of Electrical Engineering, Universidad de Chile. Joint work with: Pavlos Protopapas, Harvard University; Jose Príncipe, University of Florida; Pablo Estévez, Universidad de Chile (PhD advisor); Pablo Zegers, Universidad de los Andes. December 13, 2011.

  2. Sections: Introduction, Methods, Results, Conclusions. Introduction: statement of the problem. To find periodic light curves automatically in large astronomical databases: find the period of a light curve, discriminate whether it is truly periodic, and do both in reasonable computational time. Relevance: the fundamental period of a light curve can be used for stellar classification, stellar parameter estimation, and extrasolar planet detection. Pablo Huijse, Universidad de Chile. Finding periodicities in astronomical light curves using ITL.

  3. Statement of the problem. Challenges: light curves are unevenly sampled and noisy, and astronomical databases are huge. Current situation: period detection schemes rely too much on visual inspection. Goal: to develop a fully automated, efficient and robust method for period detection and estimation based on information theoretic learning.

  4. ITL and Renyi's quadratic entropy. Information theoretic learning (ITL): the information theoretic concepts of entropy and mutual information applied to machine learning. Conventional second-order metrics (variance, correlation) are replaced with information theoretic metrics estimated directly from samples. Renyi's quadratic entropy (RQE): entropy quantifies the uncertainty of a system. Using Parzen windows, the RQE (and the PDF) is estimated directly from the sample data:

     $$H_{R_2}(X) = -\log\left(\int_{-\infty}^{+\infty} \hat{p}^2(x)\,dx\right) = -\log\left(IP(X)\right), \qquad IP(X) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} G_\sigma(x_i - x_j),$$

     where $IP(X)$ is the information potential.
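The estimator on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `gaussian_kernel`, `information_potential` and `renyi_quadratic_entropy` are hypothetical, and the kernel size is used exactly as written in the formula above.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated at the differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def information_potential(x, sigma):
    """IP(X) = (1/N^2) * sum_i sum_j G_sigma(x_i - x_j)."""
    x = np.asarray(x, dtype=float)
    diffs = x[:, None] - x[None, :]          # all pairwise differences x_i - x_j
    return gaussian_kernel(diffs, sigma).mean()

def renyi_quadratic_entropy(x, sigma):
    """H_R2(X) = -log(IP(X))."""
    return -np.log(information_potential(x, sigma))

# Hypothetical toy data: tightly clustered samples versus widely spread samples.
h_tight = renyi_quadratic_entropy([0.0, 0.1, 0.2], sigma=1.0)
h_spread = renyi_quadratic_entropy([0.0, 5.0, 10.0], sigma=1.0)
print(h_tight, h_spread)
```

Widely spread samples give a smaller information potential and therefore a larger entropy estimate than tightly clustered ones.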

  5. Correntropy. Correntropy is an ITL metric that takes into account the time structure of random processes. It is a generalization of correlation: it measures similarities in a kernel space between samples separated by a time lag $\tau$ in the input space. The autocorrentropy function:

     $$V(\tau) = \frac{1}{N - \tau + 1}\sum_{n=\tau}^{N-1} G_\sigma(x_n - x_{n-\tau})$$

     Translation-invariant Gaussian kernel with kernel size $\sigma$:

     $$G_\sigma(x - y) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right)$$

     $\sigma$ controls the width of the kernel and is usually selected with respect to the properties of the data.
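For an evenly sampled series, the autocorrentropy sum above can be sketched as follows. The function names and the synthetic sine wave are assumptions for illustration; the normalization follows the slide's formula as written.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated at the differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def autocorrentropy(x, tau, sigma):
    """V(tau) = 1/(N - tau + 1) * sum_{n=tau}^{N-1} G_sigma(x_n - x_{n-tau})."""
    x = np.asarray(x, dtype=float)
    n_samples = len(x)
    if not 0 <= tau < n_samples:
        raise ValueError("tau must satisfy 0 <= tau < len(x)")
    diffs = x[tau:] - x[:n_samples - tau]    # x_n - x_{n-tau} for n = tau .. N-1
    return gaussian_kernel(diffs, sigma).sum() / (n_samples - tau + 1)

# Hypothetical test signal: a sine wave with a period of 20 samples.
n = np.arange(200)
signal = np.sin(2.0 * np.pi * n / 20.0)
v_full_period = autocorrentropy(signal, 20, sigma=0.5)
v_half_period = autocorrentropy(signal, 10, sigma=0.5)
print(v_full_period, v_half_period)
```

At a lag of one full period the paired samples nearly coincide, so the kernel values are close to their maximum; at half a period they are far apart and the autocorrentropy drops.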

  6. Period estimator: slotted correntropy. A correntropy estimator for unevenly sampled time series using the slotting technique (Edelson & Krolik, Mayo). The slot for time lag $k$ is defined as

     $$k\Delta\tau = \left[(k - 0.5)\Delta\tau,\ (k + 0.5)\Delta\tau\right],$$

     and the slotted correntropy is

     $$V(k\Delta\tau) = \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} G_\sigma(x_i - x_j)\cdot B_{k\Delta\tau}(t_i, t_j)}{\sum_{i=1}^{N}\sum_{j=1}^{N} B_{k\Delta\tau}(t_i, t_j)},$$

     where $B_{k\Delta\tau}(t_i, t_j) = 1$ if $(t_i - t_j)$ falls in the slot for lag $k$, and 0 otherwise. The bin size $\Delta\tau$ has to be set carefully to avoid undefined (empty) slots. The Fourier transform of the slotted correntropy gives the slotted correntropy spectrum.
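A direct, unoptimized sketch of the slotted estimator above is given below. The function name `slotted_correntropy`, the half-open treatment of slot boundaries, and the synthetic data are assumptions made for illustration; empty slots are returned as NaN to make the "undefined slot" case visible.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated at the differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def slotted_correntropy(t, x, dtau, n_lags, sigma):
    """Slotted correntropy V(k * dtau) for k = 0 .. n_lags - 1."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    dt = t[:, None] - t[None, :]                      # t_i - t_j for every pair
    kx = gaussian_kernel(x[:, None] - x[None, :], sigma)
    # Slot index of each pair: k such that (k - 0.5) * dtau <= dt < (k + 0.5) * dtau.
    slot = np.floor(dt / dtau + 0.5).astype(int)
    v = np.full(n_lags, np.nan)                       # NaN marks an empty slot
    for k in range(n_lags):
        in_slot = slot == k                           # indicator B_k(t_i, t_j)
        if in_slot.any():
            v[k] = kx[in_slot].mean()
    return v

# Hypothetical unevenly sampled sine wave with a period of 1.0 time units.
rng = np.random.default_rng(1)
times = np.sort(rng.uniform(0.0, 30.0, 300))
values = np.sin(2.0 * np.pi * times) + 0.05 * rng.normal(size=times.size)
v = slotted_correntropy(times, values, dtau=0.05, n_lags=40, sigma=0.3)
print(v[20], v[10])   # slots centred on lags 1.0 and 0.5
```

The slot centred on one full period collects pairs with similar magnitudes, so its value is larger than that of the slot centred on half a period.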

  7. Previous work. The results of this investigation were published in IEEE SPL. Period estimation in light curves from the MACHO survey, with the gold standard provided by the Harvard TSC. Slotted correntropy was compared with the LS periodogram, AoV, string length and slotted correlation. The slotted correntropy outperformed the other methods on EB period estimation, and performed equally well on RRL/Cepheid period estimation.

  8. New ITL-based metric for period detection. Include the time structure in the kernel function by using a spatio-temporal kernel: a Gaussian kernel to evaluate $\Delta x$ and a periodic kernel to evaluate $\Delta t$, with no folding required. The product of Mercer kernels is also a Mercer kernel. A periodogram based on the centered correntropy with the spatio-temporal kernel function:

     $$H(P_t) = \sum_{i=1}^{N}\sum_{j=1}^{N}\left[G_{\sigma_m}(\Delta x_{ij}) - IP\right]\cdot G_{\sigma_t; P_t}(\Delta t_{ij})$$

     By maximizing $H$ with respect to $P_t$ we obtain the period associated with the most similar set of sample pairs. The $H$ periodogram has two free parameters, $\sigma_m$ and $\sigma_t$. The kernel sizes control the observation window in which similarity is assessed.
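The $H$ periodogram can be sketched as a brute-force double sum over sample pairs. The slide does not give the form of the periodic kernel, so the expression used in `periodic_kernel` below (a Gaussian of $\sin(\pi\,\Delta t / P_t)$) is an assumption, as are the function names, the kernel sizes and the synthetic light curve.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated at the differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def periodic_kernel(dt, period, sigma_t):
    """ASSUMED periodic kernel: close to its maximum when dt is a multiple of period."""
    arg = np.sin(np.pi * dt / period)
    return np.exp(-2.0 * arg**2 / sigma_t**2) / (np.sqrt(2.0 * np.pi) * sigma_t)

def h_periodogram(t, x, trial_periods, sigma_m, sigma_t):
    """H(P) = sum_ij [G_sigma_m(dx_ij) - IP] * G_{sigma_t; P}(dt_ij) for each trial P."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    dt = t[:, None] - t[None, :]
    gm = gaussian_kernel(x[:, None] - x[None, :], sigma_m)
    centered = gm - gm.mean()                # gm.mean() is the information potential
    return np.array([np.sum(centered * periodic_kernel(dt, p, sigma_t))
                     for p in trial_periods])

# Hypothetical unevenly sampled light curve with a true period of 2.5 days.
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 50.0, 200))
mags = np.sin(2.0 * np.pi * times / 2.5) + 0.1 * rng.normal(size=times.size)
trial_periods = np.linspace(1.0, 4.0, 301)
h = h_periodogram(times, mags, trial_periods, sigma_m=0.3, sigma_t=0.3)
best_period = trial_periods[np.argmax(h)]
print(best_period)
```

Each trial period costs one pass over all $N^2$ sample pairs, which is why the later slide on scalability turns to GPUs.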

  9. Results on the periodic versus non-periodic discriminator. Automatic periodic light curve discrimination based on $H$, tested on light curves from the MACHO and EROS surveys. A training dataset is needed for EROS, and we have to build one: choose a field of the survey; obtain sets of trial periods using the H, LS and AoV periodograms, etc.; visually check the folded light curves; come up with a clean training set (future generations will be grateful). Then we can run on a bigger dataset. False positive rate: below 0.1%. Be careful with spurious periodicities: sidereal day, moon phase, ... Computational efficiency: 0.1 s per light curve.
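The visual check mentioned on this slide relies on folding a light curve at a trial period. A minimal sketch of the folding step, with a hypothetical function name `fold` and toy data, could look like this:

```python
import numpy as np

def fold(t, x, period):
    """Return phases in [0, 1) and the magnitudes reordered by phase."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    phase = np.mod(t, period) / period       # fraction of a cycle elapsed at each time
    order = np.argsort(phase)
    return phase[order], x[order]

# Toy example: four observations folded at a trial period of 2.0 days.
times = np.array([0.0, 1.5, 3.0, 4.5])
mags = np.array([10.0, 11.0, 12.0, 13.0])
phase, folded = fold(times, mags, 2.0)
print(phase, folded)
```

If the trial period is correct, plotting `folded` against `phase` shows a single coherent cycle; a wrong period produces a scatter.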

  10. ROC curve on MACHO subset. Figure: periodic light curve discrimination using the H metric on a MACHO subset. ROC curve; 966 periodic light curves, 775 non-periodic light curves, 510 non-variables; α: significance of the periodicity test. False positives: spurious day and moon phase periods, and mislabeled light curves.

  11. ROC curve on EROS subset. Figure: periodic light curve discrimination using the H metric. Preliminary results on an EROS subset: 819 periodic light curves (from a field of 72k light curves), 4000 non-periodic light curves; θ: periodogram threshold. Training dataset: False False Negatives + False False Positives.

  12. Efficient computation and scalability. EROS: 20 million light curves. Training field: 71937 light curves, 600 samples per light curve.

     | Description | Time |
     |---|---|
     | One light curve (CPU) | 36 s |
     | Using desktop GPU (480 cores) | 0.76 s |
     | Full training dataset (with GPU) | 17 h |
     | On full EROS (with GPU) | 176 days! |
     | Full EROS on GPU cluster (32) | 5.5 days |

     χ² variability filter: even with a very low threshold (100% TPR and a large FPR), times would be reduced by half. Trial period selection: LS, AoV periodogram, correntropy, etc. Code optimizations: maximum GPU occupancy. Reduce complexity: FGT, Cholesky decomposition.
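The full-survey figures in the table follow from the per-curve GPU time by simple scaling. The sketch below reproduces that arithmetic under the assumption of perfectly linear scaling with the number of light curves and GPUs; the function name `total_seconds` is hypothetical.

```python
SECONDS_PER_HOUR = 3600.0
SECONDS_PER_DAY = 86400.0

def total_seconds(n_curves, seconds_per_curve, n_devices=1):
    """Wall-clock time assuming ideal linear scaling over curves and devices."""
    return n_curves * seconds_per_curve / n_devices

gpu_time = 0.76  # seconds per light curve on the desktop GPU, from the table
training_hours = total_seconds(71_937, gpu_time) / SECONDS_PER_HOUR
full_days = total_seconds(20_000_000, gpu_time) / SECONDS_PER_DAY
cluster_days = total_seconds(20_000_000, gpu_time, n_devices=32) / SECONDS_PER_DAY
print(training_hours, full_days, cluster_days)
```

The full-survey estimates come out at about 176 days on one GPU and about 5.5 days on 32 GPUs, matching the table. The training field comes out at roughly 15 hours against the 17 h quoted, so the quoted figure presumably includes overhead that the per-curve time does not capture.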

  13. Conclusions. From a signal processing/machine learning viewpoint: an interesting, relevant and challenging problem. Contribution: new information theoretic criteria for periodicity detection, not used in the astronomy field before. Working on fully automated and efficient analysis of large databases. Preliminary results are promising. Questions?

  14. There is always a period, but most of the time it is something like this: [Figure]

  15. Preliminary results. Eclipsing binary star MACHO 1.3449.948, P = 14.0055 days. [Figure: two panels, Correntropy Spectral Density (CSD) and Power Spectral Density (PSD), each plotted against frequency [1/days] from 0 to 0.5, with the true period marked. Peak periods listed in the panel legends, in days: (1) 14.0056, (2) 7.0024, (3) 3.5012 in one panel and (1) 7.0024, (2) 3.5012 in the other.]

  16. Preliminary results. Influence of the higher-order moments included in the slotted correntropy estimated through the Gaussian kernel:

     $$G_\sigma(x - y) = \frac{1}{\sqrt{2\pi}\,\sigma}\sum_{k=0}^{\infty}\frac{(-1)^k}{2^k\sigma^{2k}k!}\,E\left[\|x - y\|^{2k}\right]$$

     | Even moments included | Hits | Multiples | Misses |
     |---|---|---|---|
     | 0 to 2 | 49.22% | 48.70% | 2.07% |
     | 0 to 4 | 61.66% | 36.27% | 2.07% |
     | 0 to 6 | 62.18% | 35.75% | 2.07% |
     | 0 to 8 | 64.25% | 34.72% | 1.04% |
     | 0 to 10 | 67.36% | 31.61% | 1.04% |
     | 0 to ∞ | 73.06% | 26.42% | 0.52% |
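The series on this slide can be checked numerically by truncating it at a maximum even order and comparing against the full Gaussian kernel. In the sketch below the expectation is replaced by a sample mean over a few hypothetical differences; the function name `truncated_kernel_mean` is an assumption, and the example only illustrates the truncation, not the hit-rate experiment in the table.

```python
import numpy as np
from math import factorial

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated at the differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def truncated_kernel_mean(d, sigma, max_order):
    """Series truncated to even moments 0 .. max_order, moments taken as sample means."""
    d = np.asarray(d, dtype=float)
    total = 0.0
    for k in range(max_order // 2 + 1):
        moment = np.mean(np.abs(d) ** (2 * k))        # estimate of E[|x - y|^(2k)]
        total += (-1) ** k * moment / (2 ** k * sigma ** (2 * k) * factorial(k))
    return total / (np.sqrt(2.0 * np.pi) * sigma)

# Hypothetical differences x - y between sample pairs.
diffs = np.array([0.2, -0.5, 0.9])
sigma = 1.0
exact = gaussian_kernel(diffs, sigma).mean()
approx_2 = truncated_kernel_mean(diffs, sigma, 2)     # even moments 0 to 2
approx_20 = truncated_kernel_mean(diffs, sigma, 20)   # even moments 0 to 20
print(exact, approx_2, approx_20)
```

Keeping only moments 0 to 2 leaves a second-order (correlation-like) quantity, while adding higher even moments brings the truncated sum closer to the full kernel average.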
