Data-Driven Prognostics of Lithium-Ion Rechargeable Batteries Using Bilinear Kernel Regression
Charlie Hubbard, John Bavlsik, Chinmay Hegde and Chao Hu
Iowa State University
Applications • Electric/hybrid vehicles • Medical devices • Mobile devices Image credit: http://uhaweb.hartford.edu/; https://rahulmittal.files.wordpress.com; http://www.carsdirect.com/
Health of a Li-Ion Battery • Available signals • Current • Voltage • Internal temperature • Battery-health indicators • State of Charge (SOC) • State of Health (SOH) • Remaining Useful Life (RUL) Image credit: www.ionarchive.com
Remaining Useful Life Estimation • Remaining Useful Life (RUL) • End of Service (EOS) limit determined by application • Measured in charge/discharge cycles
[Figure: Normalized capacity vs. number of cycles, showing capacity data against the EOS limit]
Methods for SOH/RUL Estimation • Model-based • Assume knowledge of the physical system • Fit data to a model of the system • Si et al., 2013 • Gebraeel & Pan, 2008 • Data-driven • Assume no knowledge of the physical system • Machine learning/pattern recognition methods determine the model • Neural networks (Liu et al., 2010) • Relevance vector machines (Hu et al., 2015)
Data Deficiencies • Types of deficiencies • Missing data • Noise • Handling of deficiencies • Remove spurious data points • Allow the model to ignore them
[Figure: Normalized capacity vs. charge/discharge cycles, illustrating missing and noisy capacity data]
Our Approach • Data-driven, least-squares-based kernel regression model • Add an additional regression step to find (and remove) noise in the training data • Add a kernel to transform the data (capacity data → feature space → RUL estimate) • Add regularization to encourage sparsity
Least-squares Regression • Feature vector: $y$, containing the $m$ most recent capacity readings • Empirically determined RUL: $z$ • Data matrix $Y$ and RUL vector $Z$:
$$Y = \begin{bmatrix} y_{11} & \cdots & y_{1m} \\ \vdots & \ddots & \vdots \\ y_{n1} & \cdots & y_{nm} \end{bmatrix}, \qquad Z = \begin{bmatrix} z_1 \\ \vdots \\ z_n \end{bmatrix}$$
Least-squares Regression • Given $Y$ and $Z$, solve: $\min_x \|Z - Yx\|_2^2$ • Prediction vector: $\hat{x}$ • Predicted RUL: $\hat{z} = \langle y, \hat{x} \rangle$
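A minimal sketch of these two steps in NumPy (the window size m, the helper names, and the use of np.linalg.lstsq are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def build_features(capacity, rul, m=3):
    """Build the data matrix Y (each row holds the m most recent
    capacity readings) and the RUL vector Z from a capacity series."""
    Y = np.array([capacity[i - m:i] for i in range(m, len(capacity) + 1)])
    Z = np.asarray(rul[m - 1:])
    return Y, Z

def fit_least_squares(Y, Z):
    """Ordinary least squares: x_hat = argmin_x ||Z - Y x||_2^2."""
    x_hat, *_ = np.linalg.lstsq(Y, Z, rcond=None)
    return x_hat

# Predicted RUL for a new feature vector y: z_hat = y @ x_hat
```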
Kernel Least-Squares Regression • Replace $Y$ with a kernel matrix, $L$ • Transforms the data to a higher-dimensional space • Allows us to fit a linear predictor $x$ to non-linear data • Kernel choice is flexible • Dependent on the data set • The Gaussian kernel is a common choice
[Figure: Normalized capacity vs. number of cycles, showing the non-linear capacity-fade curve and EOS limit]
Gaussian Kernel Matrix, $L$ • $j$-th row of $L$ given by:
$$L_j = \left[\, 1, \;\; \exp\!\left( \frac{-\| y_j - y_1 \|_2^2}{s^2} \right), \;\; \ldots, \;\; \exp\!\left( \frac{-\| y_j - y_n \|_2^2}{s^2} \right) \right]$$
i.e., $L_{j,1} = 1$ and $L_{j,k+1} = \exp\!\left( \frac{-\| y_j - y_k \|_2^2}{s^2} \right)$ • Add a vector of ones as the first column of $L$ • Allows a non-zero "y-intercept" for the linear function
Kernel Least-Squares Regression • Given $Y$ and $Z$, obtain $\hat{x}$ by solving: $\min_x \|Z - Lx\|_2^2$ • Given a feature vector $y$, compute $l(y)$, where $l(y)_1 = 1$ and $l(y)_k = \exp\!\left( \frac{-\|y - y_{k-1}\|_2^2}{s^2} \right)$, $k = \{2, \ldots, n+1\}$ • Predicted RUL: $\hat{z} = \langle l(y), \hat{x} \rangle$
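A sketch of the kernel construction and prediction under the definitions above (the bandwidth s and the array shapes are assumptions):

```python
import numpy as np

def gaussian_kernel_matrix(Y, s):
    """L: a ones column (intercept) followed by Gaussian similarities
    exp(-||y_j - y_k||^2 / s^2) between all pairs of training rows."""
    sq_dists = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / s ** 2)
    return np.hstack([np.ones((Y.shape[0], 1)), K])

def kernel_features(y, Y_train, s):
    """l(y): l(y)_1 = 1, l(y)_k = exp(-||y - y_{k-1}||^2 / s^2)."""
    sq = np.sum((Y_train - y) ** 2, axis=1)
    return np.concatenate([[1.0], np.exp(-sq / s ** 2)])

# Predicted RUL: z_hat = kernel_features(y, Y_train, s) @ x_hat
```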
Kernel Regression and LASSO
Overfitting • Kernel regression can over-fit to training data • Forcing $x$ to be sparse can prevent over-fitting • $x$ will be allowed to have only a few non-zero entries • Only the most representative points will be used to calculate $\hat{z}$
Enforcing Sparsity • Least Absolute Shrinkage and Selection Operator (LASSO) • Rewrite the regression problem (sketched below) as: $\min_x \|Z - Lx\|_2^2 + \lambda \|x\|_1$
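One off-the-shelf way to solve this problem is scikit-learn's Lasso (an assumption about tooling, not the authors' solver); note that scikit-learn scales the squared-error term by 1/(2n), so its alpha is a rescaled version of the slide's λ:

```python
from sklearn.linear_model import Lasso

def fit_kernel_lasso(L, Z, lam):
    """Sparse kernel predictor: argmin_x ||Z - L x||_2^2 + lam ||x||_1.
    sklearn's objective is (1/(2n)) ||Z - L x||^2 + alpha ||x||_1,
    hence alpha = lam / (2n) to match the slide's formulation."""
    n = L.shape[0]
    model = Lasso(alpha=lam / (2 * n), fit_intercept=False, max_iter=10000)
    model.fit(L, Z)
    return model.coef_  # most entries will be exactly zero
```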
Noise and Bilinear Regression • Noise appears in almost every data set • Writing $L = L_{\text{true}} + F$, we look to find and remove the errors $F$ from $L$ to obtain more accurate predictions
Bilinear Regression • Kernel regression with LASSO (noiseless training data): $\min_x \|Z - L_{\text{true}} x\|_2^2 + \lambda \|x\|_1$ • With noise in the training data, $L - F = L_{\text{true}}$: $\min_{x,F} \|Z - (L - F)x\|_2^2 + \lambda \|x\|_1$ • Include a regularization term on $F$: $\min_{x,F} \|Z - (L - F)x\|_2^2 + \lambda \|x\|_1 + \tau \|F\|_F^2$
Algorithm – Bilinear Regression
Setup
• INPUTS: Training data $\{(y_i, z_i)\}$, $i = 1, 2, \ldots, n$
• OUTPUTS: Estimated kernel prediction vector $\hat{\mathbf{x}}$
• PARAMETERS: Optimization parameters $\lambda$ and $\tau$, kernel bandwidth $s$, number of iterations $T$
Algorithm (sketched in code below)
Initialize: $\hat{\mathbf{x}}^0 \leftarrow 0$, $\hat{F}^0 \leftarrow 0$, $t \leftarrow 0$. Compute the kernel matrix $L$.
While $t < T$ do:
  $t \leftarrow t + 1$
  Set $\tilde{L} = L - \hat{F}^t$. Solve: $\hat{\mathbf{x}}^{t+1} = \arg\min_{\mathbf{x}} \lambda \|\mathbf{x}\|_1 + \|\mathbf{z} - \tilde{L}\mathbf{x}\|_2^2$
  Set $\tilde{\mathbf{z}} = \mathbf{z} - L\hat{\mathbf{x}}^{t+1}$. Solve: $\hat{F}^{t+1} = \arg\min_F \tau \|\mathrm{vec}(F)\|_2^2 + \|\tilde{\mathbf{z}} + F\hat{\mathbf{x}}^{t+1}\|_2^2$
  Record prediction error: $\mathrm{PredErr}(t) = \|\mathbf{z} - (L - \hat{F}^{t+1})\hat{\mathbf{x}}^{t+1}\|_2^2$
Find $t^*$ that minimizes $\mathrm{PredErr}(t)$. Output: $\hat{\mathbf{x}} \leftarrow \hat{\mathbf{x}}^{t^*}$
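A sketch of the alternating minimization in Python (the LASSO solver choice and its alpha rescaling are assumptions; the closed-form F update follows from setting the gradient of its ridge-type sub-problem to zero):

```python
import numpy as np
from sklearn.linear_model import Lasso

def bilinear_regression(L, z, lam, tau, T=20):
    """Alternate between a LASSO step for x (with the current error
    estimate F removed from L) and a closed-form update for F."""
    n, p = L.shape
    F_hat = np.zeros((n, p))
    errs, iterates = [], []
    for t in range(T):
        # x^{t+1} = argmin_x lam ||x||_1 + ||z - (L - F)x||_2^2
        lasso = Lasso(alpha=lam / (2 * n), fit_intercept=False, max_iter=10000)
        lasso.fit(L - F_hat, z)
        x_hat = lasso.coef_
        # F^{t+1} = argmin_F tau ||vec(F)||_2^2 + ||z_tilde + F x||_2^2
        # is row-separable; each row has a closed-form ridge solution:
        z_tilde = z - L @ x_hat
        F_hat = -np.outer(z_tilde, x_hat) / (tau + x_hat @ x_hat)
        errs.append(np.sum((z - (L - F_hat) @ x_hat) ** 2))
        iterates.append(x_hat.copy())
    t_star = int(np.argmin(errs))  # keep the best iterate
    return iterates[t_star]
```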
Data Set
Test Cells • 8 test cells • Normalized capacity data from 700 charge/discharge cycles • Cells were cycled from 2002 to 2012 • Weekly discharge rate
Noise Addition • Corrupted with Gaussian noise • Zero mean • Std. dev. $\tau = \{0, 0.005, 0.010, 0.015\}$
Experiment Setup
Cross Validations and Error Metric • Leave-one-out cross validations (CVs) • Root mean squared error (RMSE):
$$\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{j=1}^{8} \sum_{i \in \mathbf{J}_j} \left( \hat{z}_{\mathbf{Y} \setminus \mathbf{Y}_j}(\mathbf{y}_i) - z(\mathbf{y}_i) \right)^2 }$$
where fold $j$ holds out the data $\mathbf{Y}_j$ indexed by $\mathbf{J}_j$ and $N$ is the total number of predictions
Test Procedure • Feature vectors: three most recent capacity readings • For each noise level $\tau$: perform two 8-fold CVs
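A sketch of the cross-validated RMSE computation (the per-cell data layout and the fit/predict interface are illustrative assumptions):

```python
import numpy as np

def leave_one_cell_out_rmse(cells_Y, cells_z, fit, predict):
    """cells_Y / cells_z hold one feature matrix and one RUL vector per
    cell; each fold trains on all other cells and tests on one."""
    sq_errors = []
    for j in range(len(cells_Y)):
        Y_train = np.vstack([Y for i, Y in enumerate(cells_Y) if i != j])
        z_train = np.concatenate([z for i, z in enumerate(cells_z) if i != j])
        model = fit(Y_train, z_train)
        z_hat = predict(model, cells_Y[j])
        sq_errors.extend((z_hat - np.asarray(cells_z[j])) ** 2)
    return np.sqrt(np.mean(sq_errors))
```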
Results – RMSE

                     Noise in Training Data Only        Noise in Test and Training Data
                     (std. dev. of Gaussian noise)      (std. dev. of Gaussian noise)
Prediction Method    0       0.005   0.01    0.015      0.005   0.01    0.015
Lasso                31.24   33.05   34.72   50.42      42.33   61.53   83.67
Bi-Lasso             30.26   31.62   33.28   46.22      40.96   60.16   82.40
Bi-Tikhonov          29.57   30.92   32.80   48.88      40.57   60.58   83.52
RVM                  30.91   32.67   36.16   47.67      41.50   60.22   82.58
Results – Noise-free Test Data

RMSE (std. dev. of Gaussian noise in training data)
Prediction Method    0       0.005   0.01    0.015
Lasso                31.24   33.05   34.72   50.42
Bi-Lasso             30.26   31.62   33.28   46.22
Bi-Tikhonov          29.57   30.92   32.80   48.88
RVM                  30.91   32.67   36.16   47.67

[Figure: Comparison of empirical CDFs of absolute prediction errors for Bilinear-Tikhonov, Bilinear-LASSO, LASSO (no error modeling), and RVM]
Testing with Noisy Data
[Figure: Predicted RUL (cycles) vs. measured RUL (cycles) with noisy test data, at training/testing noise levels 0.000, 0.005, 0.010, and 0.015, compared against the perfect-prediction line]
Summary • RUL predictions in batteries can be of critical importance • Capacity fade data is nonlinear and often noisy • Our model leverages: • Error estimation (and removal) • Data transformations • Sparse predictions • Key contribution: Provide accurate RUL estimation in the presence of noisy data
References
• Hubbard, C., Bavlsik, J., Hu, C., & Hegde, C. (2016). Data-driven prognostics of lithium-ion rechargeable batteries using bilinear kernel regression. Annual Conference of the Prognostics and Health Management Society 2016.
• Liu, J., Saxena, A., Goebel, K., Saha, B., & Wang, W. (2010). An adaptive recurrent neural network for remaining useful life prediction of lithium-ion batteries. National Aeronautics and Space Administration, Ames Research Center, Moffett Field, CA.
• Hu, C., Jain, G., Schmidt, C., Strief, C., & Sullivan, M. (2015). Online estimation of lithium-ion battery capacity using sparse Bayesian learning. Journal of Power Sources, 289, 105-113.
• Si, X. S., Wang, W., Hu, C. H., Chen, M. Y., & Zhou, D. H. (2013). A Wiener-process-based degradation model with a recursive filter algorithm for remaining useful life estimation. Mechanical Systems and Signal Processing, 35(1), 219-237.
• Gebraeel, N., & Pan, J. (2008). Prognostic degradation models for computing and updating residual life distributions in a time-varying environment. IEEE Transactions on Reliability, 57(4), 539-550.