How to convert a two-column text file to FASTA format
Question:
I'm confused by this code. I have a file, testfile.txt, with the following contents:
Sclsc1_3349_SS1G_09805T0 TTGCGATCTATGCCGACGTTCCA
Sclsc1_8695_SS1G_14118T0 ATGGTTTCGGC
Sclsc1_12154_SS1G_05183T0 ATGGTTTCGGC
Sclsc1_317_SS1G_00317T0 ATGGTTTCGGC
Sclsc1_10094_SS1G_03122T0 ATGGTTTCGGC
I want to convert this file to FASTA format, like this:
>Sclsc1_3349_SS1G_09805T0
TTGCGATCTATGCCGACGTTCCA
>Sclsc1_8695_SS1G_14118T0
ATGGTTTCGGC
>Sclsc1_12154_SS1G_05183T0
ATGGTTTCGGC
>Sclsc1_317_SS1G_00317T0
ATGGTTTCGGC
>Sclsc1_10094_SS1G_03122T0
ATGGTTTCGGC
Here is my Python code (run as: python mycode.py testfile.txt outputfile.txt), but it does not produce the output I want. Can someone help me fix this code? Thanks!
import sys

# File input
fileInput = open(sys.argv[1], "r")
# File output
fileOutput = open(sys.argv[2], "w")
# Seq count
count = 1

# Loop through each line in the input file
print("Converting to FASTA...")
for strLine in fileInput:
    # Strip the endline character from each input line
    strLine = strLine.rstrip("\n")
    # Output the header (this writes the running count, not the sequence name)
    fileOutput.write("> " + str(count) + "\n")
    # This writes the whole input line, name and sequence together
    fileOutput.write(strLine + "\n")
    count = count + 1
print("Done.")

# Close the input and output file
fileInput.close()
fileOutput.close()
Answer:
Since you are on Linux, here is a short and fast awk one-liner:
awk '{ printf ">%s\n%s\n",$1,$2 }' testfile.txt > outputfile.txt
Contents of outputfile.txt:
>Sclsc1_3349_SS1G_09805T0
TTGCGATCTATGCCGACGTTCCA
>Sclsc1_8695_SS1G_14118T0
ATGGTTTCGGC
>Sclsc1_12154_SS1G_05183T0
ATGGTTTCGGC
>Sclsc1_317_SS1G_00317T0
ATGGTTTCGGC
>Sclsc1_10094_SS1G_03122T0
ATGGTTTCGGC
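If you would rather stay in Python, the same idea can be sketched there. The script in the question writes a running counter as the header and then the whole input line, so the name and the sequence never get separated. The version below splits each line on whitespace instead, much as awk does with $1 and $2. The names two_column_to_fasta and convert are made up for this sketch, and silently skipping lines with fewer than two fields is an assumption on my part, not something the question specifies.

```python
import sys


def two_column_to_fasta(lines):
    """Yield one FASTA record (header line + sequence line) per input line."""
    for line in lines:
        fields = line.split()  # split on any run of whitespace, like awk
        if len(fields) < 2:
            continue  # assumption: skip blank or malformed lines
        name, seq = fields[0], fields[1]
        yield ">{}\n{}\n".format(name, seq)


def convert(in_path, out_path):
    """Read the two-column file at in_path and write FASTA to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(two_column_to_fasta(src))


if __name__ == "__main__":
    if len(sys.argv) == 3:
        convert(sys.argv[1], sys.argv[2])
    else:
        print("usage: python mycode.py testfile.txt outputfile.txt")
```

Run it the same way as before: python mycode.py testfile.txt outputfile.txt. The with block closes both files even if an error occurs part-way through, so the explicit close() calls are no longer needed.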
Are you on Linux? – RomanPerekhrest
@RomanPerekhrest Yes – MAPK
How about a short command-line one-liner? – RomanPerekhrest