gossip automated trading system (self-developed, to be open-sourced later)

Time: 2025-04-09 07:49:29

Table of contents

  • gossip automated trading system
  • requirements
  • data
    • Download Qlib Data
      • Download CN Data
      • Download US Data
      • Download CN Simple Data
      • Help
    • Using in Qlib
      • US data
      • CN data
  • highfreq
    • High-Frequency Dataset
      • Get High-Frequency Data
      • Dump & Reload & Reinitialize the Dataset
        • About Reinitialization
        • Run the Code
  • StockSelection
    • Introduction
    • ENVS
  • strategy and forecast
  • workflow

gossip automated trading system

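  • Built mainly on qlib and custom modules, the system covers data acquisition, stock selection, strategy analysis, model training, model prediction, and backtesting, forming a quantitative trading system for finance.
  • Currently supported algorithms: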
    • GBDT based on XGBoost (Tianqi Chen, et al. 2016)
    • GBDT based on LightGBM (Guolin Ke, et al. 2017)
    • GBDT based on Catboost (Liudmila Prokhorenkova, et al. 2017)
    • MLP based on pytorch
    • LSTM based on pytorch (Sepp Hochreiter, et al. 1997)
    • GRU based on pytorch (Kyunghyun Cho, et al. 2014)
    • ALSTM based on pytorch (Yao Qin, et al. 2017)
    • GATs based on pytorch (Petar Velickovic, et al. 2017)
    • SFM based on pytorch (Liheng Zhang, et al. 2017)
    • TFT based on tensorflow (Bryan Lim, et al. 2019)
    • TabNet based on pytorch (Sercan O. Arik, et al. 2019)
    • DoubleEnsemble based on LightGBM (Chuheng Zhang, et al. 2020)
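
As a rough illustration of how the listed algorithms could be selected, the sketch below maps three of the GBDT entries to Qlib model classes and builds a Qlib-style config dict. It assumes the system instantiates models through Qlib's config-driven convention; the names `MODEL_REGISTRY`, `make_model_config`, and `build_model` are hypothetical and not taken from the project.

```python
# Sketch only: maps three of the listed GBDT algorithms to Qlib model classes.
# MODEL_REGISTRY and both helper functions are hypothetical names.
MODEL_REGISTRY = {
    "lightgbm": ("LGBModel", "qlib.contrib.model.gbdt"),
    "xgboost": ("XGBModel", "qlib.contrib.model.xgboost"),
    "catboost": ("CatBoostModel", "qlib.contrib.model.catboost_model"),
}


def make_model_config(name, **kwargs):
    """Return a Qlib-style config dict for one of the registered models."""
    try:
        class_name, module_path = MODEL_REGISTRY[name.lower()]
    except KeyError:
        raise ValueError(f"unsupported model: {name!r}") from None
    return {"class": class_name, "module_path": module_path, "kwargs": dict(kwargs)}


def build_model(name, **kwargs):
    """Instantiate the model; needs Qlib and the backend library installed."""
    from qlib.utils import init_instance_by_config  # imported lazily

    return init_instance_by_config(make_model_config(name, **kwargs))
```

Building the config is separate from instantiating the model, so the mapping can be inspected or logged without Qlib installed.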

requirements

qlib
loguru
fire
requests
pandas
lxml
numpy
tqdm
yahooquery
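
A quick way to see which of these packages are still missing is to ask Python whether each one can be imported. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the logging package is assumed to be `loguru`. Note that Qlib is imported as `qlib` but published on PyPI as `pyqlib`.

```python
import importlib.util

# Import names from the requirements list; the logging package is assumed to be "loguru".
REQUIRED_MODULES = [
    "qlib", "loguru", "fire", "requests", "pandas",
    "lxml", "numpy", "tqdm", "yahooquery",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```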

data

Download Qlib Data

Download CN Data

python get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn

Download US Data

python get_data.py qlib_data --target_dir ~/.qlib/qlib_data/us_data --region us

Download CN Simple Data

python get_data.py qlib_data --name qlib_data_simple --target_dir ~/.qlib/qlib_data/cn_data --region cn

Help

python get_data.py qlib_data --help
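
The download commands differ only in the region, the target directory, and the optional `--name qlib_data_simple` flag, so they can be generated from one helper. The following is a minimal sketch, assuming `get_data.py` sits in the current working directory; `build_download_command` and `download` are hypothetical names.

```python
import subprocess
import sys
from pathlib import Path

# Target directories copied from the download commands in this section.
TARGET_DIRS = {
    "cn": "~/.qlib/qlib_data/cn_data",
    "us": "~/.qlib/qlib_data/us_data",
}


def build_download_command(region, simple=False, script="get_data.py"):
    """Build the argument list for one of the download commands."""
    if region not in TARGET_DIRS:
        raise ValueError(f"unknown region: {region!r}")
    if simple and region != "cn":
        raise ValueError("the simple dataset is only shown for the cn region")
    cmd = [sys.executable, script, "qlib_data"]
    if simple:
        cmd += ["--name", "qlib_data_simple"]
    # Expand "~" here because no shell is involved when the command runs.
    target_dir = str(Path(TARGET_DIRS[region]).expanduser())
    cmd += ["--target_dir", target_dir, "--region", region]
    return cmd


def download(region, simple=False):
    """Run the download; assumes get_data.py is in the current directory."""
    subprocess.run(build_download_command(region, simple), check=True)
```

Calling `download("cn", simple=True)` would then correspond to the "Download CN Simple Data" command.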

Using in Qlib

For more information: /en/latest/start/

US data

Download the data first (see Download US Data).

import qlib
from qlib.config import REG_US
provider_uri = "~/.qlib/qlib_data/us_data"  # target_dir
qlib.init(provider_uri=provider_uri, region=REG_US)

CN data

Download the data first (see Download CN Data).

import qlib
from qlib.config import REG_CN
provider_uri = "~/.qlib/qlib_data/cn_data"  # target_dir
qlib.init(provider_uri=provider_uri, region=REG_CN)
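
Because the US and CN snippets differ only in the data directory and the region constant, they can be folded into one helper that also checks that the data has been downloaded. This is a sketch under that assumption; `resolve_provider_uri` and `init_qlib` are hypothetical names, and Qlib is imported lazily so the path helper works without it.

```python
from pathlib import Path

# Data directories as used in the US and CN snippets.
PROVIDER_URIS = {
    "cn": "~/.qlib/qlib_data/cn_data",
    "us": "~/.qlib/qlib_data/us_data",
}


def resolve_provider_uri(region):
    """Return the expanded data directory for a region, without touching Qlib."""
    try:
        return Path(PROVIDER_URIS[region]).expanduser()
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None


def init_qlib(region):
    """Initialize Qlib for 'cn' or 'us'; fail early if the data is missing."""
    provider_uri = resolve_provider_uri(region)
    if not provider_uri.is_dir():
        raise FileNotFoundError(f"{provider_uri} not found; download the data first")

    import qlib  # imported lazily so resolve_provider_uri works without Qlib
    from qlib.config import REG_CN, REG_US

    regions = {"cn": REG_CN, "us": REG_US}
    qlib.init(provider_uri=str(provider_uri), region=regions[region])
```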

highfreq

High-Frequency Dataset

This dataset is an example for reinforcement learning (RL) high-frequency trading.

Get High-Frequency Data

Get high-frequency data by running the following command:

    python  get_data

Dump & Reload & Reinitialize the Dataset

The High-Frequency Dataset is implemented as in the . DatasetH is the subclass of , whose state can be dumped to or loaded from disk in pickle format.

About Reinitialization

After reloading the Dataset from disk, Qlib also supports reinitializing it. This means that users can reset some states of the Dataset or DataHandler, such as instruments, start_time, end_time, and segments, and generate new data according to those states.

The example is given in , users can run the code as follows.

Run the Code

Run the example by running the following command:

    python  dump_and_load_dataset
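
The dump, reload, and reinitialize cycle described in this section can be illustrated with a toy stand-in. The class below is not Qlib's dataset and does not use Qlib's API; `ToyDataset` and all of its methods are hypothetical and only mimic the idea that the states are written to disk while the data is regenerated from them.

```python
import pickle


class ToyDataset:
    """Toy stand-in: only the states are serialized, the data is rebuilt."""

    STATE_KEYS = ("instruments", "start_time", "end_time")

    def __init__(self, instruments, start_time, end_time):
        self.instruments = list(instruments)
        self.start_time = start_time
        self.end_time = end_time
        self._data = None  # derived data, rebuilt by setup_data()

    def to_bytes(self):
        """Dump the states (not the derived data) in pickle format."""
        return pickle.dumps({key: getattr(self, key) for key in self.STATE_KEYS})

    @classmethod
    def from_bytes(cls, payload):
        """Reload a dataset from previously dumped states."""
        return cls(**pickle.loads(payload))

    def config(self, **state):
        """Reset selected states such as instruments or the time range."""
        for key, value in state.items():
            if key not in self.STATE_KEYS:
                raise AttributeError(f"unknown state: {key}")
            setattr(self, key, value)

    def setup_data(self):
        """Regenerate the derived data from the current states."""
        self._data = [(name, self.start_time, self.end_time) for name in self.instruments]
        return self._data


dataset = ToyDataset(["SH600000"], "2021-01-01", "2021-01-18")
dataset.setup_data()

restored = ToyDataset.from_bytes(dataset.to_bytes())  # dump, then reload
restored.config(start_time="2021-01-19", end_time="2021-01-25")  # reinitialize
print(restored.setup_data())
```

Keeping the derived data out of the serialized payload is what makes reinitialization cheap: changing a state and calling `setup_data()` again is enough to produce data for a new time range.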

StockSelection

Introduction

This project demonstrates how to apply machine learning algorithms to distinguish “good” stocks from “bad” stocks. To this end, we construct 244 technical and fundamental features to characterize each stock, and label stocks according to their ranking with respect to the return-volatility ratio. Algorithms ranging from traditional statistical learning methods to recently popular deep learning methods, e.g. Logistic Regression (LR), Random Forest (RF), Deep Neural Network (DNN), and a Stacking Ensemble model, are trained to solve the classification task. A Genetic Algorithm is also used for feature selection. The effectiveness of the stock selection strategy is validated in the Chinese stock market from both statistical and practical aspects, showing that:

  • Stacking outperforms other models reaching an AUC score of 0.972;
  • Genetic Algorithm picks a subset of 114 features and the prediction performances of all models remain almost unchanged after the selection procedure, which suggests some features are indeed redundant;
  • LR and DNN are radical models; RF is a risk-neutral model; Stacking is somewhere between DNN and RF.
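
The labeling step mentioned in the introduction, ranking stocks by their return-volatility ratio, can be sketched as follows. The exact definition and the cut-offs used by the project are not given here, so this sketch assumes the ratio is the mean return divided by the sample standard deviation, and that the top and bottom 30% of the ranking become the “good” (1) and “bad” (0) classes; both helper names are hypothetical.

```python
from statistics import mean, stdev


def return_volatility_ratio(returns):
    """Mean return divided by the sample standard deviation of the returns."""
    volatility = stdev(returns)
    if volatility == 0:
        raise ValueError("volatility is zero; the ratio is undefined")
    return mean(returns) / volatility


def label_by_rank(returns_by_stock, top_fraction=0.3, bottom_fraction=0.3):
    """Label top-ranked stocks 1 and bottom-ranked stocks 0; skip the middle."""
    if len(returns_by_stock) < 2:
        raise ValueError("need at least two stocks to rank")
    ratios = {s: return_volatility_ratio(r) for s, r in returns_by_stock.items()}
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    n_bottom = max(1, int(len(ranked) * bottom_fraction))
    labels = {stock: 1 for stock in ranked[:n_top]}
    labels.update({stock: 0 for stock in ranked[-n_bottom:]})
    return labels


# Made-up periodic returns for four hypothetical tickers.
sample = {
    "A": [0.02, 0.01, 0.03],
    "B": [-0.01, 0.00, -0.02],
    "C": [0.01, -0.01, 0.00],
    "D": [0.05, -0.04, 0.02],
}
print(label_by_rank(sample))
```

Stocks in the middle of the ranking receive no label in this sketch, which keeps the two classes clearly separated for the classification task.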

ENVS

  • python = 3.5
numpy
pandas
matplotlib
math
os
sklearn
tensorflow
keras

strategy and forecast

workflow