How do I get the results of two loops within the same function?

Time: 2021-11-04 20:41:34

I have a function that calculates an "either or" type of list for my data.

import csv
import re
import sys

keyword = sys.argv[1]  # a name from the Name column

def exon_coords():
    exon_start_plus = []  # Plus strand coordinates
    exon_start_minus = []  # Minus strand coordinates
    for line in csv.reader(sys.stdin, csv.excel_tab):
        if len(line) >= 1:
            if re.search(keyword, str(line)):  # If arg keyword exists in this row
                chrompos = line[0] + ";"  # Get the chromosome position
                if line[3] == "-":  # If the DNA strand is a minus strand
                    exon_start_minus.append(chrompos + line[1])  # Full exon position
                else:  # all other lines are plus strands
                    exon_start_plus.append(chrompos + line[1])
    return exon_start_minus, exon_start_plus  # Return both lists once the loop has finished

The goal is then to write an output text file with the coordinates:

with open(keyword+"_plus.txt", "w") as thefile:
    for item in exon_start_plus:
        thefile.write("{}, ".format(item))

Or, if the keyword matched minus strands:

with open(keyword+"_minus.txt", "w") as thefile:
    for item in exon_start_minus:
        thefile.write("{}, ".format(item))

I tried putting these file-writing blocks inside the function, but then the return statement did not give me the full list, and I ended up writing only one coordinate each time. When I put them at the end instead, I got empty files and empty strings. I would like to keep this as one function and have it determine whether the keyword (i.e. a gene ID) has coordinates on the plus or the minus strand. I have a gigantic data file containing this data, and the point is to avoid manually scanning the IDs to see whether they are plus or minus DNA strands.

Thank you!

EDIT (sample data, had to remove some columns so I edited the code as well):

Position    Start   End Strand  Overhang    Name
1   3798630 3798861 +   .   ENSPFOG0000001
1   3799259 3799404 +   .   ENSPFOG0000001
1   3809992 3810195 +   .   ENSPFOG0000001
1   3810582 3810729 +   .   ENSPFOG0000001
2   4084800 4084866 -   .   ENSPFOG0000002
2   4084466 4084566 -   .   ENSPFOG0000002
2   4084089 4084179 -   .   ENSPFOG0000002

So if I use ENSPFOG0000001 as my keyword, the script should determine that the strands are plus, collect the start coordinates in a list, and output a file containing just the coordinates, named keyword+"_plus.txt". If the keyword were ENSPFOG0000002, it would collect the minus strand coordinates and create a file named keyword+"_minus.txt".

1 Answer

#1



An empty list evaluates to False:

>>> exon_start_minus, exon_start_plus = [], []
>>> bool(exon_start_minus), bool(exon_start_plus)
(False, False)
>>> exon_start_minus, exon_start_plus = [1], []
>>> bool(exon_start_minus), bool(exon_start_plus)
(True, False)
>>> exon_start_minus, exon_start_plus = [1], [1]
>>> bool(exon_start_minus), bool(exon_start_plus)
(True, True)
>>> exon_start_minus, exon_start_plus = [], [1]
>>> bool(exon_start_minus), bool(exon_start_plus)
(False, True)

So you can test for an empty list and take action as appropriate:

>>> if exon_start_plus:
    print('!!!')

!!!
>>> if exon_start_minus:
    print('!!!')

>>> 

To retrieve both lists from the function:

exon_start_minus, exon_start_plus =  exon_coords()
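
Putting the pieces together, one possible sketch is to write a file only for whichever list is non-empty. The helper name write_strand_files and its out_dir parameter are hypothetical and not part of the original code, and ", ".join is used in place of the per-item write, so there is no trailing separator in the output. The coordinates are hard-coded here from the sample data; in the real script they would come from exon_coords().

```python
import os
import tempfile


def write_strand_files(keyword, exon_start_minus, exon_start_plus, out_dir="."):
    """Write one file per non-empty list and return the paths written.

    Hypothetical helper: the name and the out_dir parameter are illustrative.
    """
    written = []
    for suffix, coords in (("minus", exon_start_minus), ("plus", exon_start_plus)):
        if coords:  # an empty list is falsy, so no file is created for it
            path = os.path.join(out_dir, "{}_{}.txt".format(keyword, suffix))
            with open(path, "w") as thefile:
                thefile.write(", ".join(coords))
            written.append(path)
    return written


# Example using two plus-strand rows from the sample data; the minus list is empty.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_strand_files(
        "ENSPFOG0000001", [], ["1;3798630", "1;3799259"], out_dir=tmp
    )
    names = [os.path.basename(p) for p in paths]
    with open(paths[0]) as fh:
        content = fh.read()
```

In the actual script this would be called as write_strand_files(keyword, *exon_coords()), so the function still returns both lists and the caller decides which files to create.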
