Python - How to remove the rest of a string after a specific word/character

Time: 2022-09-13 09:56:55

I'm a complete Python noob, so please go easy.

I'm currently hacking/editing a Kodi plugin called Pseudo Library so that it cleans up the titles of the streams I'm grabbing, which lets me put them into a better-looking EPG.

Currently they look like this:

[COLOR white]3E (Now - 07 - 30 That '70s Show) - .strm

I've identified the code that produces this here:

FleName = (title + ' - ' + eptitle + '.strm').replace(":"," - ")
FleName = re.sub('[\/:*?<>|!@#$/:]', '', FleName)

and edited it as follows (messy, I know, and I'm sure there is a much better way; as I said above, I'm a noob!):

FleName = (title + '.strm').replace(":"," - ").replace("[COLOR white]","").replace("[COLOR blue]","")
FleName = re.sub('[\/:*?<>|!@#$/:]', '', FleName)

This then changes the above title to:

3E (Now - 07 - 30 That '70s Show).strm

What I really want the output to be is:

3E.strm

The closest answer I can find to my problem is here:

https://*.com/a/14599280

However I also have parentheses within parentheses to remove and the above does not solve that e.g.

Zee Cinema (Now - 19 - 15 Baazigar (1993)).strm

I've looked at strip to remove all characters after and including "(Now" but can't quite work it out. Please can someone provide a universal solution to my problem above so that whether the title is

[COLOR white]3E (Now - 07 - 30 That '70s Show) - .strm OR

[COLOR white]Zee Cinema (Now - 19 - 15 Baazigar (1993)).strm

it outputs just the title and .strm. So for the examples above, the results would be:

3E.strm
Zee Cinema.strm

Many thanks for looking and for hopefully helping me resolve my issue.

3 solutions

#1


FileName.split(']')[1].split('(')[0].strip() + ".strm"
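
To see what this one-liner does with the two titles from the question, here is a minimal sketch. The wrapper name `clean_title` is my own and not part of the plugin. It assumes every title still contains a `]` (that is, the colour tag has not been stripped beforehand); without one, the `[1]` index raises an IndexError.

```python
def clean_title(file_name):
    # Hypothetical helper wrapping the one-liner above.
    # Keeps what lies after the first ']' and before the first '('.
    return file_name.split(']')[1].split('(')[0].strip() + ".strm"

titles = [
    "[COLOR white]3E (Now - 07 - 30 That '70s Show) - .strm",
    "[COLOR white]Zee Cinema (Now - 19 - 15 Baazigar (1993)).strm",
]
cleaned = [clean_title(t) for t in titles]
print(cleaned)
```

Because only the first `(` matters, the nested parentheses in the second title are not a problem for this approach.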

#2


So essentially you have something of the form [something]text you want (something else).strm? The easiest way to solve this is to just ignore everything after the opening ( and before the extension:

re.sub(r"^[^\]]+\]([^(]+) \(.*\.strm$",r"\1.strm",FleName)

Be aware of the failure modes, however. For improperly formatted filenames, this will in most cases fail silently by leaving them unchanged. Craig's will in most cases fail with an exception. A more complex solution could raise an exception for a wider range of improperly formatted filenames, but neither of these solutions does.
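
The difference between the two failure modes can be sketched as follows. The names `clean_lenient` and `clean_strict` are hypothetical, and the strict variant is only one possible way to turn a non-match into an exception. The "bad" title is what the asker's own edit produces once the colour tag has already been removed, so the pattern no longer finds a `]`.

```python
import re

PATTERN = re.compile(r"^[^\]]+\]([^(]+) \(.*\.strm$")

def clean_lenient(name):
    # Same substitution as above: a non-matching name is returned unchanged.
    return PATTERN.sub(r"\1.strm", name)

def clean_strict(name):
    # Hypothetical stricter variant: raise instead of passing bad names through.
    match = PATTERN.match(name)
    if match is None:
        raise ValueError("unexpected title format: %r" % name)
    return match.group(1) + ".strm"

good = "[COLOR white]Zee Cinema (Now - 19 - 15 Baazigar (1993)).strm"
bad = "3E (Now - 07 - 30 That '70s Show).strm"  # colour tag already stripped

print(clean_lenient(good))
print(clean_lenient(bad))  # comes back unchanged
```

Calling `clean_strict(bad)` raises a ValueError instead, which may be easier to notice than a filename that quietly keeps its programme info.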

#3


Based on the pattern of the original titles, it seems like you need to get the text between the first ']' and the first '(' that follows it, strip the whitespace and append the extension. Here's an example:

import re

originalFileName = "[COLOR white]3E (Now - 07 - 30 That '70s Show) - .strm"
# Split on the last dot only, so a dot inside the title does not break the unpacking
fileName, fileExt = originalFileName.rsplit(".", 1)
newFileName = ".".join([re.search(r"\](.*?)\(", fileName).group(1).strip(), fileExt])
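
The question also asks about cutting everything from "(Now" onwards. `str.strip` cannot do that, since it only removes a set of characters from the two ends of a string, but `str.partition` can. The following is a rough sketch of that alternative, not something taken from the answers here; the helper name `title_to_filename` and its `marker` and `ext` parameters are assumptions made for illustration.

```python
import re

def title_to_filename(title, marker="(Now", ext=".strm"):
    # Hypothetical helper; the name and parameters are illustrative only.
    # Drop any [COLOR ...] or [/COLOR] tags, wherever they appear.
    title = re.sub(r"\[/?COLOR[^\]]*\]", "", title)
    # Remove the extension first so it is not duplicated later.
    if title.endswith(ext):
        title = title[:-len(ext)]
    # partition() returns the whole string as the head if the marker is absent.
    head = title.partition(marker)[0]
    # Trim leftover spaces and hyphens from both ends.
    return head.strip(" -") + ext

examples = [
    "[COLOR white]3E (Now - 07 - 30 That '70s Show) - .strm",
    "[COLOR white]Zee Cinema (Now - 19 - 15 Baazigar (1993)).strm",
    "3E.strm",
]
results = [title_to_filename(t) for t in examples]
print(results)
```

This version does not depend on the colour tag being present, so it should behave the same before or after the `.replace("[COLOR white]","")` calls from the question. One trade-off is that a channel name that itself begins or ends with a hyphen would lose it.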
