Getting the ranges of sequences of identical entries with a minimum length in a numpy array

Date: 2020-11-30 21:21:05

Consider an array whose entries are exclusively -1 or 1. How do I get the ranges of all slices that contain only 1s and have a minimum length t (e.g. t=3)?


Example:

>>> import numpy as np
>>> a = np.array([-1,-1,1,1,1,1,1,-1,1,-1,-1,1,1,1,1], dtype=int)
>>> a
array([-1, -1,  1,  1,  1,  1,  1, -1,  1, -1, -1,  1,  1,  1,  1])

Then the desired output for t=3 would be [(2,7),(11,15)], where each pair is (start, stop) with stop exclusive, as in slicing.


2 Answers

#1


3  

One approach using np.diff and np.where -


import numpy as np

t = 3  # minimum run length

# Pad `a` with -1 at both ends and take the first difference,
# so runs touching either edge of the array are detected too
dfa = np.diff(np.hstack((-1,a,-1)))

# A jump of +2 (-1 -> 1) marks the start of a run of 1s in `a`,
# a jump of -2 (1 -> -1) marks its exclusive stop
starts = np.where(dfa==2)[0]
stops = np.where(dfa==-2)[0]

# Mask of (start, stop) pairs whose run length is at least t
valid_mask = (stops - starts) >= t

# Finally collect the valid pairs as the output (a list of [start, stop] lists)
out = np.column_stack((starts,stops))[valid_mask].tolist()
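
For reuse, the same steps can be wrapped in a function that takes t as a parameter. This is only a sketch: the name ranges_of_ones is made up for illustration, and np.flatnonzero and np.concatenate are swapped in for np.where(...)[0] and np.hstack, which should behave the same way here.

```python
import numpy as np

def ranges_of_ones(a, t):
    """Return (start, stop) pairs, stop exclusive, for runs of 1 of length >= t.

    Hypothetical helper name; assumes `a` contains only -1 and 1.
    """
    a = np.asarray(a)
    # Pad with -1 on both sides so runs at the array edges produce a jump
    padded = np.concatenate(([-1], a, [-1]))
    dfa = np.diff(padded)
    starts = np.flatnonzero(dfa == 2)   # -1 -> 1 transitions
    stops = np.flatnonzero(dfa == -2)   # 1 -> -1 transitions (exclusive stop)
    keep = (stops - starts) >= t
    return list(zip(starts[keep].tolist(), stops[keep].tolist()))

a = np.array([-1, -1, 1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1, 1, 1], dtype=int)
print(ranges_of_ones(a, 3))  # [(2, 7), (11, 15)]
```

Returning tuples rather than nested lists matches the format asked for in the question.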

#2


0  

I don't know numpy very well, but wouldn't it be better to use a simple function?


def slices(a, t):
    """Return (start, stop) pairs, stop exclusive, for runs of 1 of length >= t."""
    result = []
    start = None  # index where the current run of 1s began, or None
    for i, val in enumerate(a):
        if val == 1:
            if start is None:  # start of a run
                start = i
        elif start is not None:  # a -1 ends the current run
            if i - start >= t:  # keep only runs of at least t
                result.append((start, i))
            start = None

    # a run of 1s that reaches the end of the array is closed here
    if start is not None and len(a) - start >= t:
        result.append((start, len(a)))

    return result
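
The same loop idea can also be written with itertools.groupby from the standard library, which groups consecutive equal values. This is an alternative sketch, not part of the answer above; the name runs_of_ones is made up for illustration.

```python
from itertools import groupby

def runs_of_ones(a, t):
    """Return (start, stop) pairs, stop exclusive, for runs of 1 of length >= t.

    Hypothetical helper name; works on any iterable of -1 and 1.
    """
    result = []
    pos = 0  # index of the first element of the current group
    for value, group in groupby(a):
        length = sum(1 for _ in group)  # size of this run of equal values
        if value == 1 and length >= t:
            result.append((pos, pos + length))
        pos += length
    return result

a = [-1, -1, 1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1, 1, 1]
print(runs_of_ones(a, 3))  # [(2, 7), (11, 15)]
```

Like the explicit loop, this runs in pure Python, so for large arrays the np.diff approach is likely to be faster.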
