Writing Poetry with a Python Program [1 Minute]: Classical Chinese Poetry Generation

First, let's see what kind of poems it writes.

[Three images of generated poems]

Well, they are a bit hard to understand; it is classical Chinese, after all...

Here is the program.

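from gensim.models import Word2Vec
from random import choice
import warnings
warnings.filterwarnings('ignore')  # suppress warnings

"""Configuration"""
path = '古诗词.txt'  # classical poetry corpus
window = 14  # context window size
min_count = 49  # drop characters that occur fewer than this many times
size = 140  # dimensionality of the character vectors
topn = 14  # number of candidates to sample from; larger means freer output
# Alternative settings for a couplet corpus:
# path = '春联.txt'
# window = 10
# min_count = 29
# size = 120
# topn = 11
# form name: (number of lines, characters per line)
literary_form = {'五言绝句': (4, 5),  # five-character quatrain
                 '七言绝句': (4, 7),  # seven-character quatrain
                 '对联': (2, 9)}  # couplet

"""Data loading"""
with open(path, encoding='utf-8') as f:
    # each line becomes one training sentence, tokenized into single characters
    ls_of_ls_of_c = [list(line.strip()) for line in f]

"""Model training"""
# gensim 3.x API; gensim 4 renamed `size` to `vector_size` and `index2word` to `index_to_key`
model = Word2Vec(ls_of_ls_of_c, size=size, window=window, min_count=min_count)
chr_dict = model.wv.index2word  # vocabulary as a list of characters

"""Text sequence generation"""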
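def poem_generator(title, form):
    # Keep only the characters from (char, score) pairs and drop punctuation
    strip_punct = lambda lst: [t[0] for t in lst if t[0] not in [',', '。']]
    # Pad the title to 4 characters
    if len(title) < 4:
        if not title:
            title += choice(chr_dict)  # empty input: start from a random character
        for _ in range(4 - len(title)):
            # Extend with a character similar to the last one (which must be in the vocabulary)
            similar_chr = strip_punct(model.wv.similar_by_word(title[-1], topn // 2))
            char = choice([c for c in similar_chr if c not in title])
            title += char
    # Generate the body
    poem = list(title)[-window:]
    for i in range(form[0]):  # lines
        for _ in range(form[1]):  # characters per line
            # Predict from the last `window` characters; request enough
            # candidates that at least one has not been used yet
            predict_chr = model.predict_output_word(poem[-window:], max(topn, len(poem) + 1))
            predict_chr = strip_punct(predict_chr)
            char = choice([c for c in predict_chr if c not in poem])
            poem.append(char)
        poem.append(',' if i % 2 == 0 else '。')  # comma after odd lines, full stop after even lines
    length = form[0] * (form[1] + 1)  # body length, punctuation included
    return '《%s》' % ''.join(poem[:-length]) + '\n' + ''.join(poem[-length:])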

if __name__ == '__main__':
    while True:  # runs until interrupted (e.g. Ctrl+C)
        title = input('Enter a title: ').strip()
        poem5 = poem_generator(title, literary_form['五言绝句'])
        print('\033[035m', poem5, '\033[0m', sep='')  # magenta
        poem7 = poem_generator(title, literary_form['七言绝句'])
        print('\033[033m', poem7, '\033[0m', sep='')  # yellow
        poem9 = poem_generator(title, literary_form['对联'])
        print('\033[036m', poem9, '\033[0m', sep='')  # cyan
        print()
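
To see the generation loop in isolation, here is a minimal sketch that runs without gensim or a corpus. The names `generate`, `toy_predict` and `ALPHABET` are made up for illustration: `toy_predict` is a hypothetical stand-in for `model.predict_output_word` that ignores the context and simply returns the first n characters of a fixed alphabet, so the output shows only the structure (sliding window, no repeated characters, alternating punctuation), not real poetry.

```python
from random import Random

def generate(title, lines, chars_per_line, predict, window=14, topn=14, rng=None):
    """Sketch of the generation loop. `predict(context, n)` is a hypothetical
    stand-in for model.predict_output_word and returns (char, score) pairs."""
    rng = rng or Random()
    poem = list(title)[-window:]
    for i in range(lines):
        for _ in range(chars_per_line):
            # Ask for more candidates than characters used so far,
            # so at least one unused candidate always remains
            candidates = predict(poem[-window:], max(topn, len(poem) + 1))
            pool = [c for c, _ in candidates
                    if c not in (',', '。') and c not in poem]
            poem.append(rng.choice(pool))
        # Comma after odd lines, full stop after even lines
        poem.append(',' if i % 2 == 0 else '。')
    length = lines * (chars_per_line + 1)
    return ''.join(poem[:-length]), ''.join(poem[-length:])

# Toy alphabet of 36 distinct characters (an assumption, not taken from the corpus)
ALPHABET = '春风花月夜山水云雨雪江河湖海天地日星松竹梅兰菊柳桃李燕雁鹤鱼舟桥楼台亭阁'

def toy_predict(context, n):
    # Ignores the context; a trained model would rank characters by probability
    return [(c, 1.0) for c in ALPHABET[:n]]

title, body = generate('春风', 4, 5, toy_predict, rng=Random(0))
print('《%s》' % title)
print(body)
```

With a trained model the candidate list depends on the preceding characters, which is what gives the output its loose thematic coherence.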

Corpus download link (I wanted to make it free, but the points cost is generated automatically o(╥﹏╥)o)
https://download.****.net/download/yellow_python/10946669
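
The program reads the corpus line by line and splits each line into single characters, so every character, punctuation included, is a token. A minimal sketch of that step, assuming a file with one verse or poem per line (the actual layout of the downloaded corpus is not shown here, and the two sample lines are only a stand-in for it):

```python
import io

# Hypothetical stand-in for the corpus file
sample = io.StringIO('床前明月光,疑是地上霜。\n举头望明月,低头思故乡。\n')

# Same tokenization as the program, with blank lines skipped
sentences = [list(line.strip()) for line in sample if line.strip()]

print(len(sentences))    # 2
print(sentences[0][:5])  # ['床', '前', '明', '月', '光']
```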

To follow the program, you need some background first.

Click here → gensim word vectors
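
As a quick taste of that background: a word-vector model maps each character to a vector, and a lookup such as `similar_by_word` ranks other characters by cosine similarity to the query. The sketch below shows the idea in plain Python; the helper names and the 3-dimensional vectors are made up for illustration, whereas a trained model learns its vectors from the corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(word, vectors, topn=2):
    """Rank every other word by cosine similarity to `word`."""
    scores = [(w, cosine(vectors[word], v))
              for w, v in vectors.items() if w != word]
    return sorted(scores, key=lambda t: t[1], reverse=True)[:topn]

# Made-up vectors, purely for illustration
toy_vectors = {
    '春': [0.9, 0.1, 0.0],
    '花': [0.8, 0.3, 0.1],
    '雪': [0.0, 0.2, 0.9],
    '冬': [0.1, 0.1, 0.8],
}

neighbours = most_similar('春', toy_vectors)
print(neighbours)  # '花' ranks first, then '冬'
```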