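Merging files in subdirectories according to an external table
Problem description:
I have a folder (split_libs) whose subfolders are named after the sample_name, and each one is associated with an sra_study; both values are given in columns 9 and 32 of SraRunTable3.txt. Each subfolder contains a seqs.fna file whose name I cannot change, because it is the output of a QIIME command.
I want to merge the seqs.fna files in the subfolders according to sra_study, by reading the subfolder name (= sample_name). For example, all seqs.fna files from the same SRA study should be merged into one file.
An overview of the directory layout: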
split_libs
    sample1
        seqs.fna
    sample2
        seqs.fna
    sample3
        seqs.fna
An overview of the SraRunTable:
(...)Sample_Name(...)SRA_Study(...)
sample_1 study_1
sample_2 study_1
sample_3 study_2
Here is what I have tried so far:
import os
from operator import itemgetter

fields = itemgetter(9, 32)
with open('/home/andre/Desktop/PRJEB0000/SraRunTable3.txt') as csvfile:
    next(csvfile)
    for line in csvfile:
        sample_name, sra_study = fields(line.split())
        for folder in os.listdir('./split_libs'):
            if folder == sample_name:
                with open('seqs.fna') as infile, open('/home/andre/Desktop/PRJEB0000/cat_fna/' + sra_study + ".fna", 'a') as outfile:
                    outfile.write(infile.read())
This question is a follow-up to "Joining files by corresponding columns in outside table".
Any contribution would be much appreciated!
Answer
import os
from operator import itemgetter

# itemgetter uses 0-based indices into the whitespace-split line
fields = itemgetter(9, 32)
with open('/home/andre/Desktop/PRJEB0000/SraRunTable3.txt') as csvfile:
    next(csvfile)  # skip the header row
    for line in csvfile:
        sample_name, sra_study = fields(line.split())
        # open the folder corresponding to sample_name and append its seqs to the appropriate study file
        with open('split_libs/' + sample_name + '/seqs.fna') as infile, open('/home/andre/Desktop/PRJEB0000/cat_fna/' + sra_study + ".fna", 'a') as outfile:
            outfile.write(infile.read())
All credit goes to Amanda Clare (who is not registered on Stack Overflow)!
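For anyone who wants something a bit more defensive, the same idea could be wrapped in a function roughly like the sketch below. This is only a sketch under assumptions: the function name merge_by_study and its parameters (table_path, split_dir, out_dir, name_col, study_col) are hypothetical and not part of the original answer. It skips table rows whose sample folder has no seqs.fna, creates the output folder if needed, and streams each file with shutil.copyfileobj instead of reading it fully into memory.

```python
import os
import shutil


def merge_by_study(table_path, split_dir, out_dir, name_col=9, study_col=32):
    """Append each sample's seqs.fna to <out_dir>/<sra_study>.fna.

    Hypothetical helper: column indices are 0-based positions in the
    whitespace-split table line, as in the answer above.
    Returns a dict mapping each study to the samples merged into it.
    """
    os.makedirs(out_dir, exist_ok=True)
    merged = {}
    with open(table_path) as table:
        next(table, None)  # skip the header row (no error on an empty file)
        for line in table:
            cols = line.split()
            if len(cols) <= max(name_col, study_col):
                continue  # blank or truncated line
            sample_name, sra_study = cols[name_col], cols[study_col]
            src = os.path.join(split_dir, sample_name, 'seqs.fna')
            if not os.path.isfile(src):
                continue  # no matching folder, or no seqs.fna inside it
            dst = os.path.join(out_dir, sra_study + '.fna')
            # Append mode: all samples of one study accumulate in one file
            with open(src) as infile, open(dst, 'a') as outfile:
                shutil.copyfileobj(infile, outfile)
            merged.setdefault(sra_study, []).append(sample_name)
    return merged
```

With the paths from the question it would be called as merge_by_study('/home/andre/Desktop/PRJEB0000/SraRunTable3.txt', 'split_libs', '/home/andre/Desktop/PRJEB0000/cat_fna'). Because the output files are opened in append mode, running it twice duplicates the sequences, so the output folder should be emptied before a rerun.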
@mhawke, as we discussed, here is the improved repost!