Python strip() function - removing characters before/after a string

Date: 2021-07-07 22:52:05

I am trying to remove everything from the following array except the two numbers and the comma between them.

This is the array: [array([[ 1948.97753906, 1058.23937988]], dtype=float32)]

This array is always changing in size (can have 1 pair of numbers or 6 pairs etc) and being filled with different numbers, however, the format always remains the same.

I currently have the code below, but I think it only works when there is one pair of numbers in the array.

final = str(self.lostfeatures).strip('[array([[ ').strip(']], dtype=float32)')

Any help would be greatly appreciated!

5 answers

#1


1  

If that is really just a fixed prefix and suffix, use replace:

final = str(self.lostfeatures).replace('[array([[', '').replace(']], dtype=float32)]', '').strip()
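
For context on why replace is suggested rather than strip: str.strip() does not remove a prefix or suffix. It treats its argument as a set of characters and removes any of them from both ends. A small sketch of how that can go wrong, using a made-up sample string (the values are illustrative, not from the question) whose first number starts with a digit that also occurs in 'float32':

```python
# Hypothetical sample in the same format as the question; the numbers are made up.
s = "[array([[ 2048.5, 1058.25]], dtype=float32)]"

# strip() removes any leading/trailing characters found in its argument,
# in any order, so each argument behaves like a character set.
step1 = s.strip('[array([[ ')
step2 = step1.strip(']], dtype=float32)')

print(step1)
print(step2)  # the leading '2' of 2048.5 is lost, because '2' occurs in 'float32'
```

With the numbers from the question the strip() version happens to work, since neither number starts or ends with a character from the stripped sets.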

You can do something similar with a regex:

import re
numbers = re.findall(r'\d+\.\d+', str(self.lostfeatures))

which will also give you a list of the numbers themselves, as strings (so it's trivial to convert them to float from there).
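
As a rough sketch of that conversion, assuming the text holds several pairs in the same format (the sample string and its values are made up for illustration):

```python
import re

# Hypothetical text with two pairs, in the same format as the question.
text = "[array([[ 1948.5, 1058.25], [ 10.0, 20.75]], dtype=float32)]"

# Digits, a dot, more digits. 'float32' contains no dot, so it is skipped.
tokens = re.findall(r'\d+\.\d+', text)
values = [float(t) for t in tokens]

# Group the flat list of values into (x, y) pairs.
pairs = list(zip(values[0::2], values[1::2]))
print(pairs)
```

Note that this pattern ignores a leading minus sign and would split values printed in exponent notation, so it only suits data that looks like the sample.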

However, if you are calling str(lostfeatures), the original data must already be in an array, so why convert it to a string at all? You should be able to extract the numeric values directly like this:

lostfeatures[0][0]

(You appear to have two levels of nesting: lostfeatures[0] is array([[ 1948.97753906, 1058.23937988]]), so lostfeatures[0][0] is [1948.97753906, 1058.23937988].) It's not clear exactly what your data structure looks like, but indexing it directly would be by far the fastest approach.
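
A minimal sketch of working on the data directly, assuming lostfeatures is a list of (n, 2) float32 numpy arrays. That structure is a guess based on the printed output, and the values below are made up:

```python
import numpy as np

# Assumed structure: a list of (n, 2) float32 arrays. The values are made up.
lostfeatures = [np.array([[10.5, 20.25], [3.0, 4.75]], dtype=np.float32)]

# Stack all arrays row-wise, then convert to plain Python floats.
pairs = np.concatenate(lostfeatures).tolist()

# Format each pair as "x, y" without ever going through str(lostfeatures).
lines = ['{0}, {1}'.format(x, y) for x, y in pairs]
for line in lines:
    print(line)
```

np.concatenate is used here so the same code copes with one array of several rows or several one-row arrays; if the real structure differs, the indexing would need adjusting.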

#2


1  

I'll take a punt that you've got a 2D numpy array (self.features) of coordinate pairs(?) and that you want to format each row (position?), e.g.:

for pair in self.features:
    print('{0}, {1}'.format(*pair))

#3


0  

Using the string from your example, this should do what you're asking:

>>> x = "[array([[ 1948.97753906, 1058.23937988]], dtype=float32)]"
>>> print(x.split("[[")[1].split("]]")[0].strip())
1948.97753906, 1058.23937988

#4


0  

If the format is always the same, that is, it always starts with "[array([[ " and always ends with "]], dtype=float32)]", you should use a slice instead.

final = str(self.lostfeatures)[len('[array([[ '):-len(']], dtype=float32)]')]
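
Since a blind slice silently returns garbage when the format changes, one option is to check the prefix and suffix first. A small sketch; trim_fixed is a hypothetical helper name, not something from the question:

```python
PREFIX = '[array([['
SUFFIX = ']], dtype=float32)]'

def trim_fixed(text, prefix=PREFIX, suffix=SUFFIX):
    """Hypothetical helper: slice off prefix and suffix only if both are present."""
    if text.startswith(prefix) and text.endswith(suffix):
        # Slice between the two markers, then drop surrounding whitespace.
        return text[len(prefix):len(text) - len(suffix)].strip()
    raise ValueError('unexpected format: %r' % text)

x = "[array([[ 1948.97753906, 1058.23937988]], dtype=float32)]"
print(trim_fixed(x))
```

On Python 3.9 or later, str.removeprefix() and str.removesuffix() cover the same ground without manual length arithmetic.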

#5


0  

I'd probably recommend a regex for this use case:

import re

ptrn = re.compile(r'((?:\d+(?:\.\d+)?, ?)+(?:\d+(?:\.\d+)?))')

x = "[array([[ 1948.97753906, 1058.23937988]], dtype=float32)]"
print(ptrn.search(x).group(1))
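
One caveat for the multi-pair case raised in the question: search() returns only the first match. A sketch of collecting every pair with findall(), assuming each pair is printed inside its own brackets (the sample string is made up):

```python
import re

ptrn = re.compile(r'((?:\d+(?:\.\d+)?, ?)+(?:\d+(?:\.\d+)?))')

# Hypothetical two-pair string; the values are made up.
x = "[array([[ 1.5, 2.5], [ 3.5, 4.5]], dtype=float32)]"

first = ptrn.search(x).group(1)  # only the first run of comma-separated numbers
all_runs = ptrn.findall(x)       # every run, one string per pair
print(first)
print(all_runs)
```

The pattern allows at most one space after each comma, so output that pads numbers with extra spaces would need a looser separator such as ', *'.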
