How to extract specific rows from a numpy array, as in R

Date: 2021-09-15 16:32:08

I stored a BED file in a numpy array like this:

>>> t
array([['chr1', '2488152', '2488153'],
       ['chr1', '2488397', '2488398'],
       ['chr1', '2491262', '2491417'],
       ..., 
       ['chrX', '153628144', '153628282'],
       ['chrX', '154292795', '154292796'],
       ['chrX', '154294899', '154294900']], 
      dtype='|S9')

Usually, I do this job in R:

library(dplyr)
filter(t, chrom=='chr1')

How can I get the same result with numpy? And is there a better way to store a BED file and extract the specific lines I need?

Thanks for any help.

1 solution

#1



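Create a boolean mask by comparing the first column of every row in t to the value you want, then index t with that mask to keep only the matching rows. Because the array holds byte strings (dtype '|S9'), the value is written as a bytes literal (b'chr1') under Python 3.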

>>> t[:,0]
array([b'chr1', b'chr1', b'chr1', b'chrX', b'chrX', b'chrX'], 
      dtype='|S9')
>>> mask = t[:,0]==b'chr1'
>>> mask
array([ True,  True,  True, False, False, False], dtype=bool)
>>> t[mask]
array([[b'chr1', b'2488152', b'2488153'],
       [b'chr1', b'2488397', b'2488398'],
       [b'chr1', b'2491262', b'2491417']], 
      dtype='|S9')
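
On the second part of the question (a better way to store the BED file), one option is a numpy structured array, so that the coordinates are integers rather than strings and columns can be addressed by name. The following is only a minimal sketch under assumptions: the field names chrom, start and end, the sample rows, and the helper filter_chrom are illustrative and not part of the original post.

```python
import numpy as np

# Hypothetical sample rows standing in for the BED file in the question.
rows = [
    ("chr1", 2488152, 2488153),
    ("chr1", 2488397, 2488398),
    ("chr1", 2491262, 2491417),
    ("chrX", 153628144, 153628282),
    ("chrX", 154292795, 154292796),
    ("chrX", 154294899, 154294900),
]

# Structured dtype: the field names are assumptions chosen to mirror the R call.
bed_dtype = np.dtype([("chrom", "U9"), ("start", np.int64), ("end", np.int64)])
bed = np.array(rows, dtype=bed_dtype)


def filter_chrom(bed, chrom):
    """Return the rows whose 'chrom' field equals the given name."""
    mask = bed["chrom"] == chrom  # boolean mask, one entry per row
    return bed[mask]


chr1 = filter_chrom(bed, "chr1")

# Integer fields allow numeric conditions, combined element-wise with &.
long_chr1 = bed[(bed["chrom"] == "chr1") & (bed["end"] - bed["start"] > 1)]

print(chr1)
print(long_chr1)
```

The masking step is the same as in the transcript above; the difference is that bed["chrom"] replaces t[:,0], and the Unicode field type ("U9") means the comparison uses an ordinary string instead of a bytes literal.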
