I have a function that calculates an "either or" type of list for my data.
import csv
import re
import sys

keyword = sys.argv[1]  # a name from the Name column

def exon_coords():
    exon_start_plus = []   # Plus strand coordinates
    exon_start_minus = []  # Minus strand coordinates
    for line in csv.reader(sys.stdin, csv.excel_tab):
        if len(line) > 3:  # Skip blank or short rows so line[3] is safe
            if re.search(keyword, str(line)):  # If arg keyword exists in this row
                if line[3] == "-":  # If the DNA strand is a minus strand
                    chrompos = line[0] + ";"  # Get the chromosome position
                    exon_start_minus.append(chrompos + line[1])  # Full exon position
                else:  # all other lines are plus strands
                    chrompos = line[0] + ";"
                    exon_start_plus.append(chrompos + line[1])
    return exon_start_minus, exon_start_plus  # Return both lists after the loop ends
The goal is then to write an output text file with the coordinates.
with open(keyword+"_plus.txt", "w") as thefile:
    for item in exon_start_plus:
        thefile.write("{}, ".format(item))
OR, if the keyword matched MINUS strands:
with open(keyword+"_minus.txt", "w") as thefile:
    for item in exon_start_minus:
        thefile.write("{}, ".format(item))
I tried putting these write blocks inside the function, but then the return statement did not give me the full list, and I ended up writing only one coordinate each time. I then put them at the end, but that results in empty files and empty strings. I would like to keep this as one function and have it determine whether the keyword (i.e. a gene ID) has coordinates on the plus or the minus strand. I have a gigantic data file containing this data, and the point is to avoid manually scanning the IDs to see whether they are on plus or minus DNA strands.
Thank you!
EDIT (sample data, had to remove some columns so I edited the code as well):
Position  Start    End      Strand  Overhang  Name
1         3798630  3798861  +       .         ENSPFOG0000001
1         3799259  3799404  +       .         ENSPFOG0000001
1         3809992  3810195  +       .         ENSPFOG0000001
1         3810582  3810729  +       .         ENSPFOG0000001
2         4084800  4084866  -       .         ENSPFOG0000002
2         4084466  4084566  -       .         ENSPFOG0000002
2         4084089  4084179  -       .         ENSPFOG0000002
So if I use ENSPFOG0000001 as my keyword, the script should determine that the strands are plus, collect the start coordinates in a list, and then output a file containing just the coordinates, named keyword+"_plus.txt". If the keyword were ENSPFOG0000002, it would collect the minus strand coordinates and create a file named keyword+"_minus.txt".
1 Answer
An empty list evaluates to False:
>>> exon_start_minus, exon_start_plus = [], []
>>> bool(exon_start_minus), bool(exon_start_plus)
(False, False)
>>> exon_start_minus, exon_start_plus = [1], []
>>> bool(exon_start_minus), bool(exon_start_plus)
(True, False)
>>> exon_start_minus, exon_start_plus = [1], [1]
>>> bool(exon_start_minus), bool(exon_start_plus)
(True, True)
>>> exon_start_minus, exon_start_plus = [], [1]
>>> bool(exon_start_minus), bool(exon_start_plus)
(False, True)
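So you can test for an empty list and take action as appropriate. With the last assignment above (exon_start_minus empty, exon_start_plus non-empty), only the first test prints anything:
>>> if exon_start_plus:
...     print('!!!')
...
!!!
>>> if exon_start_minus:
...     print('!!!')
...
>>>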
To retrieve both lists from the function:
exon_start_minus, exon_start_plus = exon_coords()
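Putting the two pieces together, here is a minimal sketch of how the truthiness test could drive the file writing. It makes several assumptions that are not in the question: exon_coords takes the rows and the keyword as parameters instead of reading sys.stdin and a global, the keyword is compared exactly against the Name column (index 5 in the sample data) rather than with re.search, write_coords and its out_dir parameter are hypothetical names, and the coordinates are joined with ", " so there is no trailing separator.

```python
import csv
import io
import os
import tempfile


def exon_coords(rows, keyword):
    """Collect 'chromosome;start' strings for rows whose Name equals keyword."""
    exon_start_minus, exon_start_plus = [], []
    for row in rows:
        # Assumed column layout: Position, Start, End, Strand, Overhang, Name
        if len(row) < 6 or row[5] != keyword:
            continue
        target = exon_start_minus if row[3] == "-" else exon_start_plus
        target.append(row[0] + ";" + row[1])
    return exon_start_minus, exon_start_plus


def write_coords(keyword, exon_start_minus, exon_start_plus, out_dir="."):
    """Write one file per non-empty list and return the paths written."""
    written = []
    for suffix, coords in (("_minus.txt", exon_start_minus),
                           ("_plus.txt", exon_start_plus)):
        if coords:  # an empty list is falsy, so no file is created for it
            path = os.path.join(out_dir, keyword + suffix)
            with open(path, "w") as thefile:
                thefile.write(", ".join(coords))
            written.append(path)
    return written


# Two rows taken from the sample data, tab-separated like the real input.
sample = (
    "1\t3798630\t3798861\t+\t.\tENSPFOG0000001\n"
    "2\t4084800\t4084866\t-\t.\tENSPFOG0000002\n"
)
rows = list(csv.reader(io.StringIO(sample), csv.excel_tab))
minus, plus = exon_coords(rows, "ENSPFOG0000002")
out_dir = tempfile.mkdtemp()  # scratch directory so the demo leaves no stray files
paths = write_coords("ENSPFOG0000002", minus, plus, out_dir)
```

In the real script you would pass csv.reader(sys.stdin, csv.excel_tab) as rows and sys.argv[1] as keyword. Because the writing happens only after exon_coords has returned, both lists are complete by the time the files are opened, which avoids the one-coordinate-per-file problem described in the question.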