Beam search

By default, translation is done using beam search. The -beam_size option can be used to trade off translation time against search accuracy, with -beam_size 1 giving greedy search. In practice, the small default beam size is often enough.

By setting -n_best greater than 1, beam search can also be used to provide an approximate n-best list of translations. For analysis, the translation command also accepts an oracle/gold file via -tgt and outputs a comparison of scores.

Hypotheses filtering

The beam search provides a built-in filter based on unknown words: -max_num_unks. Hypotheses with more unknown words than this value are dropped.


As dropped hypotheses temporarily reduce the beam size, the -pre_filter_factor is a way to increase the number of considered hypotheses before applying filters.
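The interaction between the unknown-word filter and the pre-filter factor can be sketched as follows. This is a minimal illustration, not the OpenNMT internals; the function names and list-based hypothesis representation are hypothetical.

```python
# Hypothetical sketch of unknown-word filtering in beam search.
# A hypothesis is represented as a list of tokens; names are illustrative.

def filter_hypotheses(hyps, max_num_unks, unk_token="<unk>"):
    """Drop hypotheses containing more than `max_num_unks` unknown tokens."""
    return [h for h in hyps if h.count(unk_token) <= max_num_unks]

def select_beam(candidates, beam_size, max_num_unks, pre_filter_factor=1):
    """With a pre-filter factor k, first consider beam_size * k candidates
    (assumed sorted by score) so that enough survive the filter to refill
    the beam, then keep the best beam_size of the survivors."""
    pool = candidates[: beam_size * pre_filter_factor]
    return filter_hypotheses(pool, max_num_unks)[:beam_size]
```

With a factor of 1, a step that drops hypotheses shrinks the effective beam; a larger factor pre-expands the candidate pool so the filter is less likely to leave the beam underfull.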


The beam search also supports various normalization techniques that are disabled by default and can be used to bias the scores generated by the model:

$$s(Y, X) = \frac{\log P(Y \mid X)}{lp(Y)} + cp(X, Y) + ep(X, Y)$$

where $X$ is the source, $Y$ is the current target, and the functions $lp$, $cp$ and $ep$ are defined below. An additional penalty on end of sentence tokens can also be added to prioritize longer sentences.

Length normalization

Scores are normalized by the following formula as defined in Wu et al. (2016):

$$lp(Y) = \frac{(5 + |Y|)^\alpha}{(5 + 1)^\alpha}$$

where $|Y|$ is the current target length and $\alpha$ is the length normalization coefficient -length_norm.
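The length penalty is a single expression; a minimal sketch, assuming the Wu et al. (2016) formulation above (the function name is illustrative):

```python
def length_penalty(target_len, alpha):
    # lp(Y) = (5 + |Y|)^alpha / (5 + 1)^alpha  (Wu et al., 2016)
    # alpha = 0 disables normalization (penalty is always 1).
    return (5 + target_len) ** alpha / (5 + 1) ** alpha
```

Dividing the model log-probability by this penalty counteracts the bias of raw log-probabilities toward short hypotheses.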

Coverage normalization

Scores are penalized by the following formula as defined in Wu et al. (2016):

$$cp(X, Y) = \beta \sum_{i=1}^{|X|} \log\left(\min\left(\sum_{j=1}^{|Y|} p_{i,j},\ 1.0\right)\right)$$

where $p_{i,j}$ is the attention probability of the $j$-th target word on the $i$-th source word, $|X|$ is the source length, $|Y|$ is the current target length and $\beta$ is the coverage normalization coefficient -coverage_norm.
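The coverage penalty can be computed directly from the attention matrix; a minimal sketch, assuming the formula above (names and the row-major attention layout are illustrative):

```python
import math

def coverage_penalty(attention, beta):
    """attention[j][i] is the attention probability of the j-th target
    word on the i-th source word. For each source word, sum its attention
    over all target words, cap at 1.0, take the log, and scale by beta."""
    src_len = len(attention[0])
    cp = 0.0
    for i in range(src_len):
        total = sum(row[i] for row in attention)
        cp += math.log(min(total, 1.0))
    return beta * cp
```

A source word that receives little total attention contributes a large negative log term, so hypotheses that fail to cover the source are penalized.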

End of sentence normalization

The score of the end of sentence token is penalized by the following formula:

$$ep(X, Y) = \gamma \frac{|X|}{|Y|}$$

where $|X|$ is the source length, $|Y|$ is the current target length and $\gamma$ is the end of sentence normalization coefficient -eos_norm.
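The end of sentence penalty, and the overall normalized score combining all three terms, can be sketched as follows (assuming the formulas above; the function names are illustrative):

```python
def eos_penalty(src_len, target_len, gamma):
    # ep(X, Y) = gamma * |X| / |Y|
    # The term shrinks as the target grows, so ending early is
    # penalized relative to longer hypotheses.
    return gamma * src_len / target_len

def normalized_score(log_prob, lp, cp, ep):
    # s(Y, X) = log P(Y|X) / lp(Y) + cp(X, Y) + ep(X, Y)
    return log_prob / lp + cp + ep
```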

Decoding with auxiliary language model

Beam search can use an additional language model to modify the score of each hypothesis, following the "Shallow Fusion" approach of Gulcehre et al. (2015):

$$\log p(y_t) = \log p_{tm}(y_t) + \lambda \log p_{lm}(y_t)$$

where $\log p_{lm}$ is the language model log-probability of the sequence and $\lambda$ is set by the -lm_weight parameter. To activate the language model, simply use -lm_model lm.t7.
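Shallow fusion is a per-token weighted sum of log-probabilities; a minimal sketch, assuming the formula above (the function name is illustrative):

```python
def shallow_fusion(tm_logprobs, lm_logprobs, lm_weight):
    """Combine translation model and language model log-probabilities
    for each candidate token: log p = log p_tm + lambda * log p_lm."""
    return [tm + lm_weight * lm for tm, lm in zip(tm_logprobs, lm_logprobs)]
```

The beam then ranks expansions by the fused score instead of the translation model score alone.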


The language model cannot use a bidirectional RNN and needs to share the same vocabulary (tokens and features) as the translation model.

Output attention to a file

The option -save_attention FILE can be used to save the attention state to a file during translation. The format of the file is as follows (compatible with NEMATUS):

sentence id ||| target words ||| score ||| source words ||| number of source words ||| number of target words
ATTENTION FOR T_1
ATTENTION FOR T_2
...
ATTENTION FOR T_n

where T_1 ... T_n are the target words and each attention line contains the space-separated attention probabilities to the source words.
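Reading one entry of this format back can be sketched as follows. This is an illustrative parser, not part of OpenNMT; the function name and dictionary keys are hypothetical.

```python
def parse_attention_entry(lines):
    """Parse one entry of the NEMATUS-compatible attention format:
    a metadata line with '|||'-separated fields, followed by one line of
    space-separated attention probabilities per target word."""
    fields = [f.strip() for f in lines[0].split("|||")]
    entry = {
        "sent_id": int(fields[0]),
        "target": fields[1].split(),
        "score": float(fields[2]),
        "source": fields[3].split(),
    }
    # One attention row per target word, each with one value per source word.
    entry["attention"] = [[float(p) for p in line.split()] for line in lines[1:]]
    return entry
```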

To visualize the beam search exploration, you can use the option -save_beam_to beam.json. It will save a JSON serialization of the beam search history.


This option requires the dkjson package.

This representation can then be visualized dynamically using the generate_beam_viz.py script from the OpenNMT/VisTools repository:

git clone https://github.com/OpenNMT/VisTools.git
cd VisTools
mkdir out/
python generate_beam_viz.py -d ~/OpenNMT/beam.json -o out/
firefox out/000000.html

Beam search visualization