Segmental Semi-Markov Models for Endpoint Detection in Plasma Etching
Xianping Ge and Padhraic Smyth
Information and Computer Science, University of California, Irvine
www.ics.uci.edu/~datalab
Acknowledgements: Thanks to Wenli Collison, Tom Ni, and David Hemker of LAM Research for providing the data.
Ge and Smyth, AEC/APC XII: 1
Outline
• Problem Statement
  • Two techniques for endpoint detection in plasma etching
    • Change-point detection
    • Pattern matching
• Segmental Semi-Markov Model
  • Standard hidden Markov model (HMM)
  • Semi-Markov model
  • Segmental Markov model
• Algorithms and Experimental Results
Change-Point Detection Problem
[Figure: sensor signal versus time (seconds), with the best "visual" estimate of the change point marked.]
• Single-wavelength interferometry data from LAM 9400 Plasma Etch
• Problem: can one automate online detection of the change-point?
Fitting Two Quadratic Segments
[Figure: Y versus Time, with two quadratic segments fitted to the data.]
Segmental Semi-Markov Model for Change-point Detection
• Each segment corresponds to one state in the model.
[Figure: the two segments of the data mapped to states S=1 and S=2.]
• Change-point = boundary between the two states.
• If only we could infer the (hidden) states from the data!
Pattern-Based End-Point Detection
[Figure: sensor output versus time, showing the example pattern and the end-point of the main etch.]
Pattern-Based End-Point Detection
[Figure, two panels: sensor output versus time (seconds). Top: example pattern. Bottom: new pattern.]
Example Pattern vs. New Pattern
[Figure: the example pattern and the new pattern.]
The two patterns differ in:
• Dynamic range
• Mean amplitude
• Duration
Sketch of the proposed method
1 Represent the example pattern as piecewise linear (or quadratic, polynomial, …)
2 Build a probabilistic template model from the piecewise linear representation
• Each segment of the piecewise linear representation corresponds to a state in the model
[Figure: segments mapped to states S=1, S=2, …, S=M.]
3 Given new data (candidate pattern)
• Are the (hidden) states the same as in the model?
• If yes, the new data is similar to the example pattern.
[Figure: candidate pattern segments aligned with model states S=1, S=2, …, S=M.]
Problem Statement: a summary
• Data are represented as segments
  • Change-point detection: two quadratic segments
  • Pattern matching: piecewise linear representation
• The states in the model correspond to the segments in the data.
• The problems will be solved if we can infer the hidden states in the data!
Next ...
• Problem Statement
• Segmental Semi-Markov Model
• Algorithms and Experimental Results
  • Change-point detection
  • Pattern matching
Segmental Semi-Markov Model
[Figure: data over time t and the corresponding states S=1, S=2, …, S=M, annotated with the three model components: state transitions, semi-Markov state duration, and regression in each segment.]
Markov Model
• M states
• The states correspond to segments of the data
• At time $t = 0$, the system is in state $i$ with probability $P(S_0 = i)$
• Transition probability matrix $A$
• $A(i, j) = P(S_{t+1} = j \mid S_t = i)$
• I.e., $A(i, j)$ is the probability of switching from state $i$ to state $j$
Hidden Markov Model
• The states $S_t$ are not directly observable (hidden)
• The observed data $Y_t$ depends on the state $S_t$
• $P(Y_t = y \mid S_t = i)$
• From $Y_1 Y_2 \ldots Y_t \ldots Y_T$, the most likely state sequence $S_1 S_2 \ldots S_t \ldots S_T$ can be computed by the Viterbi algorithm in time linear in $T$.
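The Viterbi recursion mentioned on this slide can be sketched as follows. This is a minimal illustration for a standard HMM, not the authors' implementation: the function name `viterbi`, the log-domain inputs, and the two-state toy data (unit-variance Gaussian observations with means 0 and 5) are all assumptions made for the example. States are numbered from 0 in the code, so slide state 1 is index 0.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B):
    """Most likely state path of a standard HMM (log domain).

    log_pi: (M,)   log P(S_0 = i)
    log_A:  (M, M) log_A[i, j] = log P(S_{t+1} = j | S_t = i)
    log_B:  (T, M) log_B[t, i] = log P(Y_t = y_t | S_t = i)
    """
    T, M = log_B.shape
    delta = np.empty((T, M))            # best log score of a path ending in each state
    back = np.zeros((T, M), dtype=int)  # best predecessor state
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A      # scores[i, j]: come from i, go to j
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(M)] + log_B[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# Hypothetical toy problem: left-to-right model with two states.
with np.errstate(divide="ignore"):      # log(0) = -inf is intended here
    log_pi = np.log(np.array([1.0, 0.0]))
    log_A = np.log(np.array([[0.9, 0.1], [0.0, 1.0]]))
means = np.array([0.0, 5.0])
y = np.array([0.1, -0.2, 0.0, 5.1, 4.9, 5.0])
log_B = -0.5 * (y[:, None] - means[None, :]) ** 2   # Gaussian log-likelihood up to a constant
demo_path = viterbi(log_pi, log_A, log_B)
print(demo_path)
```

Each time step does a fixed amount of work for a given number of states M, which is where the linear dependence on T comes from.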
Limitation of standard Markov model
• A Markov model imposes a geometric distribution over the state duration:
• The probability of staying in state $i$ for $n$ units of time is $A(i,i)^{\,n-1}\,[1 - A(i,i)]$
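A short numerical check of this formula may help. The sketch below evaluates the geometric duration distribution for an assumed self-transition probability of 0.9 (an illustrative value, not one from the slides); the function name is hypothetical.

```python
import numpy as np

def geometric_duration_pmf(a_ii, n_max):
    """P(stay in state i for exactly n steps) = a_ii**(n-1) * (1 - a_ii), n = 1..n_max."""
    n = np.arange(1, n_max + 1)
    return a_ii ** (n - 1) * (1.0 - a_ii)

a_ii = 0.9                               # assumed self-transition probability A(i, i)
pmf = geometric_duration_pmf(a_ii, 200)
n = np.arange(1, 201)
mean_duration = float(np.sum(n * pmf))   # close to 1 / (1 - a_ii) = 10
most_likely_duration = int(n[np.argmax(pmf)])
print(mean_duration, most_likely_duration)
```

Even when the mean duration is about 10 steps, the single most probable duration under a geometric distribution is always one step, which is a poor fit for segments that have a typical length.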
Semi-Markov Model: Explicit state duration modeling
• Can specify a non-geometric distribution for the state duration (Gamma, normal, etc.)
• E.g., "The system will stay in state i for about 10 seconds"
Limitation of standard Markov model
• Given the current state $S_t = i$, the distribution of the observed data $Y_t$ does not depend on the time $t$: $P(Y_t = y \mid S_t = i)$
• While the system stays in state $i$, the observed data $Y_t$ have a constant distribution:
• Cannot model the shape of the linear or quadratic segments!
[Figure: state sequence $S_1, S_2, \ldots, S_T$.]
Segmental Markov Model: Modeling the shape of the segments
• Each segment corresponds to a regression function, e.g., linear, quadratic, polynomial
• For example, the two quadratic segments in the change-point detection problem:
[Figure: two quadratic segments mapped to states S=1 and S=2.]
From Standard Markov Models to Segmental Semi-Markov Models
• The length of the segments can be directly modeled.
• The shape of the segments can be linear, quadratic, polynomial, …
• From $Y_1 Y_2 \ldots Y_t \ldots Y_T$, find the most likely state sequence $S_1 S_2 \ldots S_t \ldots S_T$
  • Generalization of the Viterbi algorithm
  • Online, efficient
Next ...
• Problem Statement
• Segmental Semi-Markov Model
• Algorithms and Experimental Results
  • Change-point detection
  • Pattern matching
Change-point Detection
• Given a segmental semi-Markov model, compute the most likely state sequence $S_1 S_2 \ldots S_t \ldots S_T$ from the observed data $Y_1 Y_2 \ldots Y_t \ldots Y_T$. The change-point is the smallest $t$ such that $S_t = 2$ (i.e., the time of the switch from state 1 to state 2).
• The parameters of the model can be estimated from training data or, if no training data are available, from real-time data using the Expectation-Maximization (EM) algorithm.
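For a two-state model, the most likely state sequence is fully determined by where the switch happens, so it can be found by scoring every candidate switch time. The sketch below does this offline under several assumptions that are not taken from the slides: the regression coefficients of each segment are fitted by least squares inside the candidate segment, the noise standard deviation `sigma` is known, and the duration of state 1 has a Gaussian log-prior supplied by the caller. It is a batch illustration of the idea, not the authors' online algorithm, and all names are hypothetical.

```python
import numpy as np

def gaussian_segment_loglik(t, y, degree, sigma):
    """Gaussian log-likelihood of one segment around its least-squares polynomial fit."""
    coeffs = np.polyfit(t, y, degree)
    resid = y - np.polyval(coeffs, t)
    return (-0.5 * float(resid @ resid) / sigma**2
            - len(y) * np.log(sigma * np.sqrt(2.0 * np.pi)))

def map_change_point(t, y, log_duration_prior, degree=2, sigma=1.0, min_len=4):
    """Return (k, states): states[:k] are 1, states[k:] are 2, with k maximizing the score."""
    T = len(y)
    best_k, best_score = None, -np.inf
    for k in range(min_len, T - min_len + 1):
        score = (log_duration_prior(k)                                   # duration of state 1
                 + gaussian_segment_loglik(t[:k], y[:k], degree, sigma)  # shape of segment 1
                 + gaussian_segment_loglik(t[k:], y[k:], degree, sigma)) # shape of segment 2
        if score > best_score:
            best_k, best_score = k, score
    states = np.where(np.arange(T) < best_k, 1, 2)
    return best_k, states

# Hypothetical simulated signal: two quadratic segments joined at index 40.
rng = np.random.default_rng(0)
t = np.arange(60.0)
clean = np.where(t < 40, 0.5 * (t - 20.0) ** 2, 200.0 - 3.0 * (t - 40.0) ** 2)
y = clean + rng.normal(0.0, 1.0, size=t.size)
prior = lambda k: -0.5 * ((k - 38.0) / 5.0) ** 2   # assumed prior: about 38 samples in state 1
k_hat, states = map_change_point(t, y, prior)
print(k_hat, t[k_hat])                              # index and time of the first sample in state 2
```

The duration prior is what distinguishes this score from a pure sum-of-squared-errors fit: it lets prior knowledge about how long the first state usually lasts break ties between otherwise similar fits.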
On-line estimation of model parameters using EM algorithm
• Guess at some initial parameters θ
• Calculate the state probabilities given θ
• Now re-estimate the parameters θ given the state probabilities
  • Use weighted least-squares regression
• Repeat the cycle until convergence
[Figure: cycle alternating between the parameters θ and the state probabilities.]
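The re-estimation step can be illustrated with a weighted least-squares fit in which each data point is weighted by the probability that it belongs to the state being fitted. The sketch below shows only that one step, with made-up state probabilities. It is not the full EM loop and not the authors' code, and the function name and toy data are assumptions.

```python
import numpy as np

def weighted_polyfit(t, y, weights, degree=1):
    """Weighted least squares: minimize sum_n weights[n] * (y[n] - poly(t[n]))**2."""
    X = np.vander(t, degree + 1)          # columns: t**degree, ..., t, 1
    sw = np.sqrt(weights)                 # scaling rows by sqrt(w) gives the weighted problem
    coeffs, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    resid = y - X @ coeffs
    sigma2 = float(np.sum(weights * resid**2) / np.sum(weights))  # weighted noise variance
    return coeffs, sigma2

# Hypothetical toy data: a rising line followed by a falling line.
t = np.arange(10.0)
y = np.where(t < 5, 2.0 * t + 1.0, 50.0 - 3.0 * t)
# Made-up state probabilities for state 1 (hard 0/1 here; EM would give soft values).
p_state1 = np.where(t < 5, 1.0, 0.0)
coeffs1, sigma2_1 = weighted_polyfit(t, y, p_state1)
coeffs2, sigma2_2 = weighted_polyfit(t, y, 1.0 - p_state1)
print(coeffs1, coeffs2)
```

In a full EM cycle the weights would come from the state probabilities computed under the current θ, and the two steps would alternate until the parameters stop changing.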
Change-point Detection Experimental Results: Plasma Etching
[Figure: IB4 interferometry sensor signal versus time (seconds), marking the estimated change point and the time at which the online Markov algorithm detected the change.]
Comparison with Classic SSE Method: Simulated Data
[Figure: simulated signal versus time, marking the true time of change, the change point detected by the classical method, and the change point detected by the Markov method.]
• SSE (Sum of Squared Errors) method: minimizes the sum of squared errors when fitting the two segments.
• 2 linear segments
• Gaussian noise
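The classical SSE baseline can be sketched as an exhaustive search over split points, fitting one polynomial per side and keeping the split with the smallest total squared error. The code below is a generic batch version written for illustration. The function names, the minimum segment length, and the simulated data (two linear segments with Gaussian noise, σ = 0.5) are assumptions rather than the settings used in the experiments.

```python
import numpy as np

def segment_sse(t, y, degree=1):
    """Sum of squared residuals of a least-squares polynomial fit to one segment."""
    coeffs = np.polyfit(t, y, degree)
    resid = y - np.polyval(coeffs, t)
    return float(resid @ resid)

def sse_change_point(t, y, degree=1, min_len=3):
    """Split index k minimizing SSE(y[:k]) + SSE(y[k:])."""
    best_k, best_sse = None, np.inf
    for k in range(min_len, len(y) - min_len + 1):
        total = segment_sse(t[:k], y[:k], degree) + segment_sse(t[k:], y[k:], degree)
        if total < best_sse:
            best_k, best_sse = k, total
    return best_k, best_sse

# Hypothetical simulated data: flat segment, then a linear decrease starting at t = 30.
rng = np.random.default_rng(0)
t = np.arange(50.0)
clean = np.where(t < 30, 100.0, 100.0 - 4.0 * (t - 30.0))
y = clean + rng.normal(0.0, 0.5, size=t.size)
k_sse, total_sse = sse_change_point(t, y)
print(k_sse, total_sse)
```

This version needs the whole signal before it can choose a split, and it uses no information about how long the first segment is expected to last.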
Histograms of Detection Errors: Simulated Data, σ = 5
[Figure, two panels: histograms of detected time - true time over the range -10 to 10. Top: SSE method. Bottom: new method.]
• Detection error = detected time - true time
• New method has smaller errors.
Detection Errors as a Function of Noise
[Figure: RMSE of detection time versus sigma of the additive noise model, for the SSE method and the new method.]
Next ...
• Problem Statement
• Segmental Semi-Markov Model
• Algorithms and Experimental Results
  • Change-point detection
  • Pattern matching
Build a model from the example pattern
[Figure: segments mapped to states S=1, S=2, …, S=M.]
• Each segment of the piecewise linear representation corresponds to a state in the model
• The state duration distribution for state $i$ is a truncated Gaussian with µ = length of segment $i$ and 3σ = 20% × length of segment $i$
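The duration model on this slide can be written down directly. The sketch below builds a discrete duration distribution for one state from its segment length, using µ = segment length and 3σ = 20% of the segment length as stated above. The slide does not say where the Gaussian is truncated or how it is discretized, so truncating at µ ± 3σ and normalizing over integer durations are assumptions, as is the function name.

```python
import numpy as np

def truncated_gaussian_duration_pmf(segment_length):
    """Duration pmf for one state: Gaussian with mu = segment length, 3*sigma = 20% of it.

    Assumptions (not from the slides): integer durations, truncation at mu +/- 3*sigma.
    """
    mu = float(segment_length)
    sigma = 0.2 * mu / 3.0
    d = np.arange(1, int(np.ceil(mu + 3.0 * sigma)) + 1)     # candidate durations 1, 2, ...
    w = np.exp(-0.5 * ((d - mu) / sigma) ** 2)
    w[(d < mu - 3.0 * sigma) | (d > mu + 3.0 * sigma)] = 0.0  # truncate the tails
    return d, w / w.sum()

# Example: a segment that is 30 samples long in the example pattern.
durations, pmf = truncated_gaussian_duration_pmf(30)
print(durations[np.argmax(pmf)], pmf[durations == 22], pmf[durations == 25])
```

With these settings a state may last between 80% and 120% of the length of its segment in the example pattern, which allows for moderate stretching or compression of a candidate pattern in time.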
Pattern Matching Algorithm
[Figure: candidate pattern segments aligned with model states S=1, S=2, …, S=M.]
• Given a candidate pattern $y_i y_{i+1} \ldots y_j$:
  • Compute the most likely state sequence $S_i S_{i+1} \ldots S_j$
  • The pattern matching is successful if $S_i S_{i+1} \ldots S_j = 1 \ldots M$
How can we detect a pattern? E.g., Sliding Window Matching
[Figure: amplitude versus time, illustrating sliding-window matching.]
Pre-Pattern and Post-Pattern States
[Figure: amplitude versus time, with the pre-pattern state and the post-pattern state marked on either side of the pattern.]