Pydub - combining split_on_silence with minimum length / file size

Question:

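I have two scripts: one splits audio into pieces of a certain length, the other splits the audio whenever there is a silent passage. Is it possible to split the audio on silence, but only after a certain amount of time has passed? I need large chunks, no longer than 5 minutes.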

Split script that ignores silence:

from pydub import AudioSegment
#from pydub.utils import mediainfo
from pydub.utils import make_chunks
import math

#flac_audio = AudioSegment.from_file("Kalimba.mp3", "mp3")
#flac_audio.export("audio.mp3", format="mp3")
myaudio = AudioSegment.from_file("Kalimba.mp3", "mp3")
channel_count = myaudio.channels  # number of channels
sample_width = myaudio.sample_width  # bytes per sample
duration_in_sec = len(myaudio) / 1000  # len() returns milliseconds
sample_rate = myaudio.frame_rate

print("sample_width=", sample_width)
print("channel_count=", channel_count)
print("duration_in_sec=", duration_in_sec)
print("frame_rate=", sample_rate)
bit_rate = 16  # bits per sample; an assumption, you can extract it from mediainfo("test.wav") dynamically

# Estimated size in bytes of the uncompressed (WAV) audio
wav_file_size = (sample_rate * bit_rate * channel_count * duration_in_sec) / 8
print("wav_file_size = ", wav_file_size)

file_split_size = 10000000  # 10 MB, i.e. 10,000,000 bytes
total_chunks = wav_file_size // file_split_size

# Get the chunk length by proportion (there is more than one way, of course):
# duration_in_sec (X) corresponds to wav_file_size (Y),
# so the duration in seconds (K) for a file size of 10 MB is
# K = X * 10 MB / Y

chunk_length_in_sec = math.ceil((duration_in_sec * file_split_size) / wav_file_size)  # in seconds
chunk_length_ms = chunk_length_in_sec * 1000
chunks = make_chunks(myaudio, chunk_length_ms)

# Export all of the individual chunks as mp3 files

for i, chunk in enumerate(chunks):
    chunk_name = "chunk{0}.mp3".format(i)
    print("exporting", chunk_name)
    chunk.export(chunk_name, format="mp3")

Split script that ignores length:

from pydub import AudioSegment 
from pydub.silence import split_on_silence 

sound = AudioSegment.from_mp3("my_file.mp3") 
chunks = split_on_silence(sound, 
    # must be silent for at least half a second 
    min_silence_len=500, 

    # consider it silent if quieter than -16 dBFS 
    silence_thresh=-16 

) 

for i, chunk in enumerate(chunks): 
    chunk.export("/path/to/output/dir/chunk{0}.wav".format(i), format="wav")

My suggestion is to use pydub.silence.split_on_silence() and then recombine the segments as needed, so that you end up with files that are roughly your target size.

from pydub import AudioSegment 
from pydub.silence import split_on_silence 

sound = AudioSegment.from_file("/path/to/file.mp3", format="mp3") 
chunks = split_on_silence(
    sound, 

    # split on silences longer than 1000ms (1 sec) 
    min_silence_len=1000, 

    # anything under -16 dBFS is considered silence 
    silence_thresh=-16, 

    # keep 200 ms of leading/trailing silence 
    keep_silence=200 
) 

# now recombine the chunks so that the parts are at least 90 sec long
target_length = 90 * 1000
output_chunks = [chunks[0]]
for chunk in chunks[1:]:
    if len(output_chunks[-1]) < target_length:
        output_chunks[-1] += chunk
    else:
        # if the last output chunk is longer than the target length,
        # we can start a new one
        output_chunks.append(chunk)

# now you have chunks that are longer than 90 seconds (except, possibly, the last one)
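
Since the question also mentions an upper limit, here is a minimal sketch of the same recombination idea with both a minimum and a maximum length. The function name `recombine` and its parameters are hypothetical, not part of pydub; the function only relies on `len()` and `+`, which pydub's AudioSegment supports (with `len()` in milliseconds).

```python
def recombine(chunks, min_len_ms, max_len_ms):
    """Greedily merge consecutive chunks.

    A chunk is appended to the current part while that part is still
    shorter than min_len_ms and the merge would not exceed max_len_ms.
    A single chunk that is already longer than max_len_ms is kept as is.
    """
    output = []
    for chunk in chunks:
        can_merge = (
            bool(output)
            and len(output[-1]) < min_len_ms
            and len(output[-1]) + len(chunk) <= max_len_ms
        )
        if can_merge:
            output[-1] = output[-1] + chunk
        else:
            output.append(chunk)
    return output
```

With the chunks from split_on_silence() above, this could be called as recombine(chunks, 90 * 1000, 5 * 60 * 1000). Parts can still come out shorter than the minimum when the next chunk would push them over the maximum, and a chunk with no silence for more than 5 minutes would have to be cut separately, for example with make_chunks().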

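Alternatively, you can use pydub.silence.detect_nonsilent() to find the ranges and make your own decisions about where to slice the original audio.

Note: I also posted this on a similar/duplicate GitHub issue.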
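
As a rough sketch of the detect_nonsilent() approach: the function returns a list of [start, end] ranges in milliseconds, and cut points can be placed in the silent gaps between them. The helper names `cut_points_from_ranges`, `slice_at` and `split_file` are hypothetical, and cutting in the middle of each gap is just one possible policy.

```python
def cut_points_from_ranges(nonsilent_ranges, target_ms):
    """Return cut positions (ms) in the middle of silent gaps,
    each at least target_ms after the previous cut (or the start)."""
    cuts = []
    last_cut = 0
    pairs = zip(nonsilent_ranges, nonsilent_ranges[1:])
    for (_, prev_end), (next_start, _) in pairs:
        gap_mid = (prev_end + next_start) // 2
        if gap_mid - last_cut >= target_ms:
            cuts.append(gap_mid)
            last_cut = gap_mid
    return cuts


def slice_at(sound, cuts):
    """Slice anything indexable by milliseconds at the given cut positions."""
    bounds = [0] + list(cuts) + [len(sound)]
    return [sound[a:b] for a, b in zip(bounds, bounds[1:])]


def split_file(path, target_ms, min_silence_len=1000, silence_thresh=-16):
    """Load a file with pydub and split it on silence after target_ms."""
    # Imported here so the helpers above work without pydub installed.
    from pydub import AudioSegment
    from pydub.silence import detect_nonsilent

    sound = AudioSegment.from_file(path)
    ranges = detect_nonsilent(
        sound, min_silence_len=min_silence_len, silence_thresh=silence_thresh
    )
    return slice_at(sound, cut_points_from_ranges(ranges, target_ms))
```

Unlike split_on_silence(), this keeps all of the original audio, silence included, so the parts add up to the full length of the input.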

Great, thanks – HCLivess

The solution was to use mp3splt instead: http://mp3splt.sourceforge.net/mp3splt_page/documentation/man.html

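-t TIME[>MIN_TIME] Time mode. This option will create an indefinite number of smaller files with a fixed time length specified by TIME (which has the same format described above). It is useful for splitting long files into smaller ones (for example, with the time length of a CD). The adjust option (-a) can be used to adjust the split points with silence detection. >MIN_TIME can be used to specify the theoretical minimum track length of the last segment; it avoids creating very small files as the last segment. Make sure to quote the argument when using MIN_TIME - "TIME>MIN_TIME".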

It can then be used from Python like this:

import os 
os.system("mp3splt inputfile.mp3")
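
To actually pass the time mode and adjust options quoted above, a sketch along these lines may help. The helper names and the default of "5.00" are hypothetical, and it assumes mp3splt is on the PATH and that TIME uses the minutes.seconds format from the man page. Passing the arguments as a list avoids the shell, so a "TIME>MIN_TIME" value needs no extra quoting.

```python
import subprocess


def build_mp3splt_command(input_path, time_spec="5.00", adjust=True):
    """Build the argument list for mp3splt's time mode (-t).

    time_spec is assumed to be minutes.seconds, e.g. "5.00" for 5 minutes,
    optionally followed by ">MIN_TIME" as described in the man page.
    """
    cmd = ["mp3splt", "-t", time_spec]
    if adjust:
        # -a adjusts the split points using silence detection
        cmd.append("-a")
    cmd.append(input_path)
    return cmd


def run_mp3splt(input_path, time_spec="5.00", adjust=True):
    """Run mp3splt and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_mp3splt_command(input_path, time_spec, adjust), check=True)
```

For example, run_mp3splt("inputfile.mp3", "5.00>1.00") would ask for 5 minute pieces while avoiding a final piece shorter than 1 minute.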