Resolving Profile Distortion for Electron-based IPMs using Machine Learning
3rd IPM Workshop, J-PARC (Tokai, Japan)
D. Vilsmeier (GSI), M. Sapinski (GSI), R. Singh (GSI)
18/09/2018
What is Machine Learning?
"Field of study that gives computers the ability to learn without being explicitly programmed." - Arthur Samuel (1959)
"Classical" approach: Input + Algorithm = Output
Machine Learning: Input + Output = Algorithm
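The contrast above can be sketched in a few lines: in the "classical" approach the rule is hard-coded, while in machine learning the rule is inferred from input/output examples. This toy example (doubling numbers, fitted with `np.polyfit`) is purely illustrative and not part of the talk.

```python
import numpy as np

# "Classical" approach: the algorithm (here: double the input) is written by hand.
def classical(x):
    return 2 * x

# Machine Learning: input/output pairs are given, the algorithm is learned.
inputs = np.array([1.0, 2.0, 3.0, 4.0])
outputs = np.array([2.0, 4.0, 6.0, 8.0])

# Fit a linear rule y = slope * x + intercept to the examples.
slope, intercept = np.polyfit(inputs, outputs, deg=1)
print(slope, intercept)  # learned rule: y ≈ 2 * x
```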
Machine Learning Toolbox
Unsupervised Learning: k-Means Clustering, Autoencoders, Principal Component Analysis
Supervised Learning: Artificial Neural Networks, Decision Trees, Linear Regression, k-Nearest Neighbors, Support Vector Machines, Random Forests
Reinforcement Learning: Q-Learning, Deep Deterministic Policy Gradient
... and many more
IPM Profile Distortion
Ideal case: particles move on straight lines towards the detector.
Real case: trajectories are influenced by initial momenta and by the interaction with the beam field.
Counteract via ...
Increase of electric field: results in smaller extraction times and hence smaller displacements; the limit of this measure is quickly reached.
Additional magnetic field: constrains the maximal displacement to the gyroradius of the resulting motion; usually an effective measure.
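To see why the magnetic field is effective, a rough non-relativistic estimate of the electron gyroradius r = m v / (q B) helps. The 10 eV initial energy is an illustrative assumption; the 0.2 T field strength matches the value used later in the talk.

```python
import math

# Physical constants (SI units).
M_E = 9.109e-31   # electron mass [kg]
Q_E = 1.602e-19   # elementary charge [C]

def gyroradius(kinetic_energy_ev, b_field_tesla):
    """Non-relativistic electron gyroradius r = m*v / (q*B)."""
    v = math.sqrt(2 * kinetic_energy_ev * Q_E / M_E)  # speed [m/s]
    return M_E * v / (Q_E * b_field_tesla)            # radius [m]

# Illustrative values: 10 eV initial electron energy, 0.2 T field.
r = gyroradius(10.0, 0.2)
print(f"gyroradius ≈ {r * 1e6:.1f} µm")
```

With these numbers the displacement is confined to a few tens of micrometers, i.e. below the typical beam widths quoted later in the talk.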
Distortion without magnetic field
Already observed in [W. DeLuca, IEEE 1969] (+ observation of focusing for electron collection).
R. E. Thern, "Space-charge Distortion in the Brookhaven Ionization Profile Monitor", PAC 1987: simulations + measurements, with good agreement for nominal extraction voltages:

    σ_m = σ + 0.302 (N^1.065 / σ^2.065) (1 + 3.6 R^1.54)^(−0.435)

Disagreement at lower extraction voltages.
Other approaches, including non-Gaussian beam shapes via iterative procedures, e.g. W. Graves, "Measurement of Transverse Emittance in the Fermilab Booster", PhD thesis 1994:

    σ_beam = c_1 + c_2 σ_measured + c_3 N
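The Thern-type parametrization can be evaluated directly. Note that the coefficients and the meaning/units of N, σ and R below simply follow the formula as quoted on this slide, so this is an illustrative sketch, not a faithful reimplementation of the original fit.

```python
def thern_sigma_measured(sigma, n_particles, r_aspect):
    """Thern-style parametrization of the measured RMS width.

    sigma: true beam width, n_particles: bunch population,
    r_aspect: aspect-ratio parameter R. Coefficients as quoted
    on the slide; units as in the original PAC 1987 paper.
    """
    broadening = (0.302 * n_particles**1.065 / sigma**2.065
                  * (1 + 3.6 * r_aspect**1.54)**-0.435)
    return sigma + broadening

# The measured profile is always wider than the true one, and the
# broadening grows with the bunch population.
print(thern_sigma_measured(2.0, 1.0, 1.0))
```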
Distortion with magnetic field
More complex motion, also due to the interaction with the beam's electric field: capturing effects as well as different electromagnetic drifts play a role.
The displacement from the initial position can be mainly ascribed to three different effects:
- Displacement of the gyro-center due to initial velocities (Δx_1)
- Displacement of the gyro-center due to space-charge interaction (Δx_2), in the space-charge region
- Displacement due to gyro-motion above the detector (Δx_3), in the detector region
The final motion is determined by the effects in the "space-charge region".
Electron trajectories
The resulting motion strongly depends on the starting position within the bunch and hence on the bunch shape itself.
Various electromagnetic drifts and interactions (ExB drift, polarization drift ~ dE/dt, capturing) create a complex dependence of the final gyro-motion on the initial conditions.
[Figure: electron motion around the p-bunch vs. "pure" gyro-motion]
Gyro-radius increase
This interaction effectively results in an increase of the gyro-radii, which in turn determines the profile distortion.
The increase itself depends on the starting position and thus on the bunch shape; this prevents a simple description in terms of other beam parameters (e.g. point-spread functions).
Profile distortion
Ideally a one-dimensional projection of the transverse beam profile is measured, but...
Beam parameters: E = 6.5 TeV, N_q = 2.1e11, σ_x = 0.27 mm, σ_y = 0.36 mm, 4σ_z = 0.9 ns
Magnetic field increase
N-turn B-fields: without space-charge, electrons at the bunch center will perform exactly N turns for specific magnetic field strengths.
Due to the space-charge interaction, though, only large field strengths are effective.
Using Machine Learning
1. Generate simulation data with https://pypi.org/project/virtual-ipm — 21,021 different cases considered, for 6.5 TeV protons, 4 kV / 85 mm extraction field, 0.2 T magnetic field:

    Parameter         | Range          | Step size
    Bunch pop. [1e11] | 1.1 -- 2.1 ppb | 0.1 ppb
    Bunch width (1σ)  | 270 -- 370 μm  | 5 μm
    Bunch height (1σ) | 360 -- 600 μm  | 20 μm
    Bunch length (4σ) | 0.9 -- 1.2 ns  | 0.05 ns

2. Split the data:
    Training: used to fit the model; split size ~60%.
    Validation: check generalization to unseen data; split size ~20%.
    Testing: evaluate final model performance; split size ~20%.

3. Evaluate on grid data and randomly sampled data.
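The 60/20/20 split can be sketched with NumPy as follows. The dataset here is random placeholder data (21,021 cases with 49 profile bins each, matching the numbers on these slides); only the shuffling and splitting logic is the point.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Placeholder dataset: 21021 simulated cases, 49 profile bins each.
n_cases = 21021
x = rng.normal(size=(n_cases, 49))
y = rng.normal(size=n_cases)

# Shuffle, then split ~60% / ~20% / ~20% into train / validation / test.
idx = rng.permutation(n_cases)
n_train = int(0.6 * n_cases)
n_val = int(0.2 * n_cases)

x_train, y_train = x[idx[:n_train]], y[idx[:n_train]]
x_val, y_val = x[idx[n_train:n_train + n_val]], y[idx[n_train:n_train + n_val]]
x_test, y_test = x[idx[n_train + n_val:]], y[idx[n_train + n_val:]]
```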
Artificial Neural Networks
Inspired by the human brain: many "neurons" linked together. Non-linearities are mapped through non-linear activation functions, e.g. ReLU, Tanh, Sigmoid.
Perceptron: y(x) = σ(W · x + b), with weights W and bias b, where σ applies the non-linearity.
Multi-Layer Perceptron: an input layer followed by several such layers stacked on top of each other.
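The perceptron equation y(x) = σ(W · x + b) translates directly into NumPy; the sizes (3 neurons, 4 inputs) are arbitrary illustrative choices.

```python
import numpy as np

def relu(z):
    """Rectified linear unit, the activation used later in the talk."""
    return np.maximum(0.0, z)

def perceptron(x, W, b, activation=relu):
    """Single layer: y(x) = sigma(W @ x + b)."""
    return activation(W @ x + b)

rng = np.random.default_rng(seed=1)
x = rng.normal(size=4)        # input vector
W = rng.normal(size=(3, 4))   # weight matrix: 3 neurons, 4 inputs
b = rng.normal(size=3)        # bias vector

y = perceptron(x, W, b)
print(y)  # all entries >= 0 because of the ReLU
```

A multi-layer perceptron is simply this operation applied repeatedly, with each layer's output feeding the next layer's input.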
ANN Implementation
Fully-connected feed-forward network with ReLU activation, implemented in Keras:

    from functools import partial
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.initializers import VarianceScaling
    from keras.optimizers import Adam

    IDense = partial(Dense, kernel_initializer=VarianceScaling())

    # Create feed-forward network.
    model = Sequential()
    # Since this is the first hidden layer we also need to specify
    # the shape of the input data (49 predictors).
    model.add(IDense(200, activation='relu', input_shape=(49,)))
    model.add(IDense(170, activation='relu'))
    model.add(IDense(140, activation='relu'))
    model.add(IDense(110, activation='relu'))
    # The network's output (beam sigma). This uses linear activation.
    model.add(IDense(1))

    model.compile(
        optimizer=Adam(lr=0.001),
        loss='mean_squared_error'
    )
    model.fit(
        x_train, y_train,
        batch_size=8, epochs=100, shuffle=True,
        validation_data=(x_val, y_val)
    )

Batch learning: iterate through the training set multiple times (= epochs); weight updates are performed in batches (of training samples). After each epoch the loss is computed on the validation data in order to prevent "overfitting".
Optimizer: D. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization", arXiv:1412.6980, 2014.
Why ANNs?
Universal approximation theorem: every finite continuous "target" function can be approximated with arbitrarily small error by a feed-forward network with a single hidden layer [Cybenko 1989; Hornik 1991]:

    y = Σ_{j=1}^{n} w_j^o σ( Σ_{k=1}^{d} w_{jk}^h x_k + b_j )

with n hidden units, a d-dimensional domain, activation function σ, and the weights w and biases b as the parameters to be "optimized". Works on compact subsets of R^d.
This is a proof of existence only, i.e. no universal optimization algorithm exists ("no free lunch" theorem).
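The single-hidden-layer formula above can be demonstrated numerically. In this sketch the hidden weights and biases are fixed at random and only the output weights w^o are fitted, by least squares rather than gradient descent, to approximate sin(x) on a compact interval; the target function and layer size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Target function on a compact interval.
x = np.linspace(-np.pi, np.pi, 500)
target = np.sin(x)

# Single hidden layer: y = sum_j w_j^o * tanh(w_j^h * x + b_j).
n_hidden = 200
w_h = rng.normal(scale=2.0, size=n_hidden)          # hidden weights
b_h = rng.uniform(-np.pi, np.pi, size=n_hidden)     # hidden biases
hidden = np.tanh(np.outer(x, w_h) + b_h)            # shape (500, 200)

# Fit the output weights by linear least squares.
w_o, *_ = np.linalg.lstsq(hidden, target, rcond=None)
approx = hidden @ w_o

print(np.max(np.abs(approx - target)))  # small approximation error
```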
Profile RMS Inference - Results
Very good results on simulation data: errors below 1%. Results are without consideration of noise on the profile data.
Other machine learning algorithms were tested as well:
- Linear regression (LR)
- Kernel ridge regression (KRR)
- Support vector machine (SVR)
- Multi-layer perceptron (= ANN)
RMS Inference with Noise
Linear regression amplifies noise in the predictions if it is not explicitly trained on it: compare no noise on the training data vs. similar noise on the training data.
The multi-layer perceptron amplifies noise as well; bounded activation functions could help, as could duplicating the data before "noising" it.
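The noise-augmentation idea mentioned above (duplicate the data, then add noise) can be sketched as follows; the copy count and noise level are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def augment_with_noise(x, y, n_copies=5, noise_std=0.01):
    """Duplicate each training sample n_copies times and add Gaussian
    noise to the inputs, so the model sees noisy profiles with the
    same (noise-free) targets and learns to be robust against noise."""
    x_rep = np.repeat(x, n_copies, axis=0)
    y_rep = np.repeat(y, n_copies, axis=0)
    x_noisy = x_rep + rng.normal(scale=noise_std, size=x_rep.shape)
    return x_noisy, y_rep

x = rng.normal(size=(100, 49))   # placeholder profiles, 49 bins each
y = rng.normal(size=100)
x_aug, y_aug = augment_with_noise(x, y)
```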
Full Profile Reconstruction
So far: the machine learning model takes the measured profile together with σ_z and N and computes the beam RMS σ_x.
Instead: let the machine learning model (given σ_z and N) compute the full beam profile.
Gaussian bunch shape
MLP architecture: 2 hidden layers, 88 nodes, tanh activation function.
Performance measure: mean squared error (MSE) between prediction y_p and target y,

    MSE = (1/N) Σ_{i=1}^{N} (y_{p,i} − y_i)^2

[Figure: error distributions, mean = 0.0024, std = 0.0045 vs. mean = 0.1231, std = 0.0808]
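The MSE performance measure above is a one-liner in NumPy:

```python
import numpy as np

def mse(prediction, target):
    """Mean squared error: (1/N) * sum_i (y_p,i - y_i)**2."""
    prediction = np.asarray(prediction, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.mean((prediction - target) ** 2)

print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # -> 0.333...
```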
Generalized Gaussian bunch shape
The generalized Gaussian,

    p(x) = β / (2 α Γ(1/β)) · e^(−(|x−μ|/α)^β)

was used for testing, while training (fitting) was performed with a Gaussian bunch shape.
The ANN model generalizes to different beam shapes:
- β = 3: mean = 0.0051, std = 0.0064 vs. mean = 0.1638, std = 0.0974
- β = 1.5: mean = 0.0068, std = 0.0087 vs. mean = 0.0278, std = 0.0237 (smaller distortion in this case)
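The generalized Gaussian density quoted above can be written out directly; note that β = 2 with α = √2·σ recovers the ordinary Gaussian, which serves as a sanity check.

```python
import math

def gen_gauss_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Generalized Gaussian density:
    beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha)**beta).
    beta = 2 recovers a Gaussian (with alpha = sqrt(2) * sigma);
    beta = 3 and beta = 1.5 are the test shapes used on this slide."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-(abs(x - mu) / alpha) ** beta)

# Sanity check: beta=2, alpha=sqrt(2) gives the standard normal density.
print(gen_gauss_pdf(0.0, alpha=math.sqrt(2.0)))  # ≈ 0.3989
```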