A Two-Stage Method for Commodity Price Trend Forecasting
SIGIR Workshop: FinIR, 30 July 2020
ustc_youdu: Bingjie Liang, Huixin Liu and Chujing He
University of Science and Technology of China
• Problem Description
The main task is to build prediction models that use data from 2003 to 2018 to predict the price movement direction of six metals in 2019 at three time horizons:
• 1 day (1d)
• 20 days (20d)
• 60 days (60d)
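The task can be read as binary labelling of a price series at the three horizons named in the results (1d, 20d, 60d). The sketch below is one possible way to build such labels. It assumes the label is 1 when the price `horizon` trading days ahead is strictly higher than today's price and 0 otherwise; the slides do not state the exact rule, and the function name and prices are made up for illustration.

```python
from typing import List, Optional

# Horizons in trading days, taken from the task names (1d, 20d, 60d).
HORIZONS = (1, 20, 60)


def direction_labels(prices: List[float], horizon: int) -> List[Optional[int]]:
    """Label each day 1 if the price `horizon` days later is higher, else 0.

    Assumption: "higher" means strictly greater. Days without a future
    price (the last `horizon` entries) get None.
    """
    labels: List[Optional[int]] = []
    for t in range(len(prices)):
        if t + horizon < len(prices):
            labels.append(1 if prices[t + horizon] > prices[t] else 0)
        else:
            labels.append(None)
    return labels


closes = [100.0, 101.0, 99.5, 99.5, 102.0]  # made-up closing prices
print(direction_labels(closes, 1))  # [1, 0, 0, 1, None]
```

With real data the same function would be called once per horizon in `HORIZONS`, and the trailing `None` entries dropped before training.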
• Data Description
• Time series data
– Daily transaction data in LME for six metals
– Daily transaction data of main relevant commodities and financial indices
• Textual data
– Analyst reports published by institutional traders and news reports, collected from both English and Chinese sources
• Data Preprocessing
[Slide text and most of the field table are not recoverable. Field descriptions that survive: Opening price, Highest price, Lowest price, Closing price, Trading volume, Open interest.]
• Feature Selection
• Use different feature combinations to train traditional classification models
– Naïve Bayes, KNN, Random Forest, SVM
• Input of classification model
– Feature values of the current trading day and the expected trading day
– Input (current day, expected day) → Classification model → Output (trend label: 0/1)
• Important features
– Open_Price, High_Price, Low_Price, Close_Price
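The feature-combination search described on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: a tiny 1-nearest-neighbour classifier stands in for the four models named above, the `Volume` column and all data values are invented, and the helper names `one_nn_accuracy` and `rank_feature_sets` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]


def one_nn_accuracy(train_X: List[List[float]], train_y: List[int],
                    test_X: List[List[float]], test_y: List[int]) -> float:
    """Accuracy of a 1-nearest-neighbour classifier (squared Euclidean distance)."""
    correct = 0
    for x, y in zip(test_X, test_y):
        nearest = min(
            range(len(train_X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
        )
        correct += int(train_y[nearest] == y)
    return correct / len(test_y)


def rank_feature_sets(train_rows: List[Row], train_y: List[int],
                      test_rows: List[Row], test_y: List[int],
                      features: Sequence[str],
                      max_size: int = 2) -> List[Tuple[float, Tuple[str, ...]]]:
    """Score every feature subset up to `max_size`; return them best first."""
    scored = []
    for k in range(1, max_size + 1):
        for combo in combinations(features, k):
            tr = [[row[f] for f in combo] for row in train_rows]
            te = [[row[f] for f in combo] for row in test_rows]
            scored.append((one_nn_accuracy(tr, train_y, te, test_y), combo))
    # sorted() is stable, so equally scored subsets keep enumeration order.
    return sorted(scored, key=lambda item: -item[0])


# Made-up toy data: Close_Price separates the classes, Volume does not.
train_rows = [
    {"Close_Price": 1.0, "Volume": 5.0},
    {"Close_Price": 2.0, "Volume": 1.0},
    {"Close_Price": 8.0, "Volume": 4.0},
    {"Close_Price": 9.0, "Volume": 2.0},
]
train_y = [0, 0, 1, 1]
test_rows = [
    {"Close_Price": 1.5, "Volume": 2.0},
    {"Close_Price": 8.5, "Volume": 5.0},
]
test_y = [0, 1]

ranked = rank_feature_sets(train_rows, train_y, test_rows, test_y,
                           ["Close_Price", "Volume"])
for score, combo in ranked:
    print(f"{score:.2f}  {combo}")
```

On this toy data the price column alone ranks first and the volume column alone ranks last, which mirrors how a search of this kind could surface the price features as the important ones.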
• Model Structure
• First Stage: Prediction (LSTM Network)
– Input: Close_Price window of length lag (lag = 5): x_{t-lag+1}, ⋯, x_{t-1}, x_t, normalized
– Output: Close_Price_Pred, the predicted x_{t+1}, x_{t+20} and x_{t+60}
• Second Stage: Classification
– 1d: Close_Price, Close_Price_Pred_1d → Random Forest → label: 0/1
– 20d: Close_Price, Close_Price_Pred_20d → LGB Classifier → label: 0/1
– 60d: Close_Price, Close_Price_Pred_60d → Random Forest → label: 0/1
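The data flow around the two stages can be sketched without the models themselves. The code below only shows how lagged windows and targets might be built for the first stage and how the second-stage input pairs the current close with a predicted close. The min-max scaling within each window is an assumption (the slide says only "normalize"), all function names are hypothetical, and the LSTM and the classifiers are deliberately left out.

```python
from typing import List, Tuple


def make_windows(series: List[float], lag: int,
                 horizon: int) -> List[Tuple[List[float], float]]:
    """Pair each window x_{t-lag+1..t} with the target x_{t+horizon}."""
    samples = []
    for t in range(lag - 1, len(series) - horizon):
        window = series[t - lag + 1 : t + 1]
        samples.append((window, series[t + horizon]))
    return samples


def min_max_normalize(window: List[float]) -> List[float]:
    """Scale a window to [0, 1]. Assumed scheme; the slide does not specify one."""
    lo, hi = min(window), max(window)
    if hi == lo:
        return [0.0] * len(window)
    return [(v - lo) / (hi - lo) for v in window]


def second_stage_features(close_today: float, close_pred: float) -> List[float]:
    """Second-stage classifier input: current close and predicted close."""
    return [close_today, close_pred]


series = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]  # made-up closing prices
samples = make_windows(series, lag=5, horizon=1)
first_window, first_target = samples[0]
print(min_max_normalize(first_window), first_target)
```

In a full pipeline the normalized windows would train the first-stage regressor for each horizon, and its predictions would be passed through `second_stage_features` to the per-horizon classifier.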
• Prediction Curve
Figure: Prediction curves for Aluminum on the validation set
• Results
Total          Task1(1d)      Task2(20d)     Task3(60d)
55.18225736    50.00000000    48.74835310    66.79841897
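The slides do not say how the Total score is computed, but the reported numbers are consistent with a plain average of the three task scores. A quick check:

```python
# Scores copied from the results slide.
task_scores = {
    "Task1(1d)": 50.00000000,
    "Task2(20d)": 48.74835310,
    "Task3(60d)": 66.79841897,
}

# Unweighted mean of the three tasks.
total = sum(task_scores.values()) / len(task_scores)
print(f"{total:.8f}")  # 55.18225736
```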
Thank you for listening!