Variance_Reduced_Replica_Exchange_Stochastic_Gradient_MCMC: Accelerating replica exchange via variance reduction (ICLR'21)

Time: 2024-03-23 23:42:33
[File properties]:

File name: Variance_Reduced_Replica_Exchange_Stochastic_Gradient_MCMC: Accelerating replica exchange via variance reduction (ICLR'21)

File size: 568KB

File format: ZIP

Updated: 2024-03-23 23:42:33


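Variance Reduced Replica Exchange SGHMC

Despite the advantages of gradient variance reduction in near-convex problems, a natural discrepancy between theory and practice is whether gradient noise should be avoided in non-convex problems. To fill this gap, we focus only on reducing the variance of the noisy energy estimators to exploit the theoretical acceleration, and no longer consider variance reduction of the noisy gradients, so that the empirical experience from momentum stochastic gradient descent (M-SGD) can be naturally imported.

Requirements: Python 2.7 or similar, numpy, CUDA

Classification: ResNet20 on CIFAR100 with batch size 256

Momentum stochastic gradient descent (M-SGD) with 500 epochs, batch size 256, and decreasing learning rates:
$ python bayes_cnn.py -sn 500 -chains 1 -lr 2e-6 -LRanneal 0.984 -T 1e-300 -burn 0.6

Stochastic gradient Hamiltonian Monte Carlo (SGHMC) with annealing temperatures during warm-up and a fixed temperature afterward:
$ python baye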
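
The idea of reducing the variance of the noisy energy estimator, as described above, can be illustrated with a control variate. The sketch below is not taken from the repository: the toy quadratic loss, the function names (`per_sample_energy`, `naive_energy`, `vr_energy`, `compare_estimators`), and the anchor parameter `beta_anchor` are all assumptions chosen for illustration. It compares a plain mini-batch energy estimate with one that uses the mini-batch only to estimate the difference from a fixed anchor point, whose full-data energy is computed once.

```python
import numpy as np


def per_sample_energy(beta, x):
    """Toy per-sample loss 0.5 * (x_i - beta)^2 (hypothetical model, not from the repo)."""
    return 0.5 * (x - beta) ** 2


def naive_energy(beta, x, batch_idx):
    """Plain mini-batch estimate of the full-data energy sum_i L(x_i | beta)."""
    scale = float(x.shape[0]) / batch_idx.shape[0]
    return scale * per_sample_energy(beta, x[batch_idx]).sum()


def vr_energy(beta, beta_anchor, anchor_full_energy, x, batch_idx):
    """Control-variate estimate of the same energy.

    The mini-batch only estimates the difference between the current
    parameter and the anchor; the anchor's exact full-data energy is
    then added back, which keeps the estimator unbiased.
    """
    scale = float(x.shape[0]) / batch_idx.shape[0]
    diff = per_sample_energy(beta, x[batch_idx]) - per_sample_energy(beta_anchor, x[batch_idx])
    return scale * diff.sum() + anchor_full_energy


def compare_estimators(n_trials=2000, batch_size=256, seed=0):
    """Draw repeated mini-batches and return (exact, naive, reduced) energies."""
    rng = np.random.RandomState(seed)
    x = rng.normal(loc=1.0, scale=2.0, size=5000)  # synthetic data (assumption)
    beta, beta_anchor = 0.9, 1.0                   # anchor close to current parameter
    anchor_full_energy = per_sample_energy(beta_anchor, x).sum()
    exact = per_sample_energy(beta, x).sum()
    naive, reduced = [], []
    for _ in range(n_trials):
        idx = rng.choice(x.shape[0], size=batch_size, replace=False)
        naive.append(naive_energy(beta, x, idx))
        reduced.append(vr_energy(beta, beta_anchor, anchor_full_energy, x, idx))
    return exact, np.array(naive), np.array(reduced)
```

Calling `compare_estimators()` and comparing `naive.var()` with `reduced.var()` shows how much tighter the control-variate estimate is when the anchor stays close to the current parameter. In replica exchange this matters because the swap decision depends on the estimated energy difference between chains, while the gradients used for the parameter updates are left untouched.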


[File preview]:
Variance_Reduced_Replica_Exchange_Stochastic_Gradient_MCMC-main
----trainer.py(8KB)
----.gitignore(113B)
----bayes_cnn.py(4KB)
----output()
--------cifar100_resnet20_batch_256_lr_2e-6_T_0.001_cycle_5_burn_0.7_seed_24439_74.40(116KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.0003_LRgap_0.66_Tgap_0.2_F_5e6_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_1_seed_58476(0B)
--------cifar100_resnet20_batch_256_chain_2_T_0.0003_LRgap_0.66_Tgap_0.2_F_5e6_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_0_seed_31604(7KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.003_LRgap_0.66_Tgap_0.2_F_5e5_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_0_seed_47841(7KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.003_LRgap_0.66_Tgap_0.2_F_5e5_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_1_seed_5056(0B)
--------cifar100_resnet20_batch_256_chain_2_T_0.01_LRgap_0.66_Tgap_0.2_F_1.5e5_VR_0_p_2_burn_0.6_alpha_0.3_cycle_1_seed_47211_74.43(157KB)
--------cifar100_resnet20_batch_256_chain_1_T_1e-300_seed_67154_71.81(66KB)
--------cifar100_resnet20_batch_256_lr_2e-6_T_0.001_cycle_5_burn_0.7_seed_37225_74.08(116KB)
--------cifar100_resnet20_batch_256_lr_2e-6_T_0.001_cycle_5_burn_0.7_seed_82929_74.34(116KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.003_LRgap_0.66_Tgap_0.2_F_5e5_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_1_seed_1682(0B)
--------cifar100_resnet20_batch_256_chain_1_T_1e-300_seed_81391_71.68(66KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.0001_LRgap_0.66_Tgap_0.2_F_1.5e7_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_0_seed_27443(7KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.01_LRgap_0.66_Tgap_0.2_F_1.5e5_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_seed_18803_74.78(303KB)
--------cifar100_resnet32_batch_256_chain_2_T_0.01_LRgap_0.66_Tgap_0.2_F_1.5e5_VR_0_p_2_burn_0.6_alpha_0.3_cycle_1_seed_35902_76.86(167KB)
--------cifar100_resnet32_batch_256_lr_2e-6_T_0.001_cycle_5_burn_0.7_seed_79580_76.42(116KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.01_LRgap_0.66_Tgap_0.2_F_1.5e5_VR_0_p_2_burn_0.6_alpha_0.3_cycle_1_seed_29292_73.06(156KB)
--------cifar100_resnet20_batch_256_chain_1_T_1e-300_seed_64760_72.19(66KB)
--------cifar100_resnet32_batch_256_chain_2_T_0.01_LRgap_0.66_Tgap_0.2_F_1.5e5_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_seed_72638_77.19(293KB)
--------cifar100_resnet32_batch_256_chain_2_T_0.01_LRgap_0.66_Tgap_0.2_F_1.5e5_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_seed_87124_77.54(309KB)
--------trial_1()
--------cifar100_resnet32_batch_256_chain_2_T_0.01_LRgap_0.66_Tgap_0.2_F_1.5e5_VR_0_p_2_burn_0.6_alpha_0.3_cycle_1_seed_45730_76.05(167KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.0003_LRgap_0.66_Tgap_0.2_F_5e6_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_1_seed_31965(9KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.0001_LRgap_0.66_Tgap_0.2_F_1.5e7_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_1_seed_97873(1KB)
--------run.log(0B)
--------cifar100_resnet20_batch_256_chain_2_T_0.01_LRgap_0.66_Tgap_0.2_F_1.5e5_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_seed_85674_75.17(315KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.001_LRgap_0.66_Tgap_0.2_F_1.5e6_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_1_seed_32983(9KB)
--------cifar100_resnet32_batch_256_lr_2e-6_T_0.001_cycle_5_burn_0.7_seed_84935_76.91(116KB)
--------cifar100_resnet20_batch_256_chain_2_T_0.0001_LRgap_0.66_Tgap_0.2_F_1.5e7_VR_1_p_2_burn_0.6_alpha_0.3_cycle_1_adapt_c_1_seed_69699(0B)
----sgmcmc.py(2KB)
----models()
--------cifar()
--------__init__.py(0B)
----tools()
--------transforms.py(2KB)
--------__init__.py(61B)
--------.bak()
--------torch_tools.py(3KB)
--------data_manipulation.py(5KB)
----README.md(2KB)
----grid_search.py(893B)
