Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

This is an open source PyTorch implementation of the pervasive attention model described in:

Maha Elbayad, Laurent Besacier, and Jakob Verbeek. 2018. Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction. In Proceedings of the 22nd Conference on Computational Natural Language Learning (CoNLL 2018)
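For orientation, here is a minimal illustrative sketch of the pervasive-attention idea, assuming a single masked convolution rather than the paper's full masked DenseNet: source and target embeddings are tiled into a 2D (target x source) grid, convolved with asymmetric padding so that each row only sees earlier target positions, and max-pooled over the source axis to score the next target token. All class names and hyper-parameters below are invented for the example and are not this repository's code.

import torch
import torch.nn as nn

class PervasiveAttentionSketch(nn.Module):
    """Illustrative sketch only -- not the repository's DenseNet model."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hidden=128, kernel=3):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # Extra padding on the target axis plus the crop below keeps the
        # convolution causal: output row i never sees target positions > i.
        self.conv = nn.Conv2d(2 * emb_dim, hidden, kernel_size=kernel,
                              padding=(kernel - 1, kernel // 2))
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # src: (batch, |s|), tgt: (batch, |t|)
        s = self.src_emb(src)   # (batch, |s|, emb_dim)
        t = self.tgt_emb(tgt)   # (batch, |t|, emb_dim)
        # Every (i, j) cell of the grid concatenates the embeddings of
        # target token i and source token j.
        grid = torch.cat([
            t.unsqueeze(2).expand(-1, -1, src.size(1), -1),
            s.unsqueeze(1).expand(-1, tgt.size(1), -1, -1),
        ], dim=-1)                                  # (batch, |t|, |s|, 2*emb_dim)
        h = self.conv(grid.permute(0, 3, 1, 2))     # (batch, hidden, |t|+k-1, |s|)
        h = h[:, :, :tgt.size(1), :]                # crop to restore target causality
        pooled = h.max(dim=3).values                # max-pool over the source axis
        # Row i scores the token that follows tgt[:, i] (teacher forcing).
        return self.out(pooled.transpose(1, 2))     # (batch, |t|, tgt_vocab)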

Requirements

pytorch (tested with v0.4.1)
subword-nmt
h5py (2.7.0)
tensorboardX

Usage:

IWSLT'14 pre-processing:

cd scripts
./prepare-iwslt14.sh
cd ..
python preprocess.py -d iwslt

Training:

mkdir -p save events
python train.py -c config/iwslt_l24.yaml

Note: in this setup the model takes up to 15 GB of GPU memory. If you want to train the model on a smaller GPU, try the memory-efficient implementation of the DenseNet or a Log-DenseNet:

python train.py -c config/iwslt_l24_efficient.yaml
python train.py -c config/iwslt_l24_log.yaml
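The memory savings behind the iwslt_l24_efficient.yaml configuration come from the standard memory-efficient DenseNet trick: recompute dense-layer activations during the backward pass instead of caching them. The block below is a generic sketch of that idea using torch.utils.checkpoint; it is not the repository's implementation, and the layer sizes are made up for illustration.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedDenseBlock(nn.Module):
    """Generic sketch of a checkpointed dense block (illustrative only)."""

    def __init__(self, channels=64, growth=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(
                nn.ReLU(),
                nn.Conv2d(channels + i * growth, growth, kernel_size=3, padding=1),
            )
            for i in range(num_layers)
        )

    def forward(self, x):
        # Assumes x already requires grad (as it would downstream of the
        # embedding layers); otherwise the reentrant checkpoint cannot
        # propagate gradients through the segment.
        features = [x]
        for layer in self.layers:
            inp = torch.cat(features, dim=1)
            # The layer's activations are freed after the forward pass and
            # recomputed during backward, trading compute for memory.
            features.append(checkpoint(layer, inp))
        return torch.cat(features, dim=1)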

Generation & evaluation

python generate.py -c config/iwslt_l24.yaml
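Conceptually, decoding re-feeds the growing target prefix, so the 2D grid gains one row per generated token; generate.py takes care of this together with evaluation. As a rough illustration only, a greedy loop over the toy model sketched earlier might look as follows, with hypothetical bos_id / eos_id arguments.

import torch

def greedy_decode(model, src, bos_id, eos_id, max_len=100):
    """Greedy decoding sketch for the toy PervasiveAttentionSketch model."""
    tgt = torch.full((src.size(0), 1), bos_id, dtype=torch.long, device=src.device)
    for _ in range(max_len):
        logits = model(src, tgt)                    # (batch, |t|, tgt_vocab)
        next_tok = logits[:, -1].argmax(dim=-1, keepdim=True)
        tgt = torch.cat([tgt, next_tok], dim=1)     # grid grows by one row
        if (next_tok == eos_id).all():
            break
    return tgt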
