How to compare RPM versions in Python

Date: 2022-08-16 22:48:24

I'm trying to compare two lists of RPMs (currently installed, and available in a local repository) to see which RPMs are out of date. I've been tinkering with regexes, but there are so many different naming conventions for RPMs that I can't get a good list to work with. I don't have the actual RPM files on my drive, so I can't run rpm -qif.

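import re

# Intended captures: name, version, release, and whatever follows the last dot
pattern1 = re.compile(r'^([a-zA-Z0-9_\-\+]*)-([a-zA-Z0-9_\.]*)-([a-zA-Z0-9_\.]*)\.(.*)')
for rpm in listOfRpms:
    # rpm[0] holds the name-version-release string
    packageInfo = pattern1.search(rpm[0]).groups()
    print(packageInfo)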

This works for the vast majority of packages, but not all (about 2300 of 2400):

  yum-metadata-parser-1.1.2-2.el5
('yum-metadata-parser', '1.1.2', '2', 'el5') **What I need

But none of these work, for instance, unless I break others that worked before:

  • wvdial-1.54.0-3
  • xdelta-1.1.3-20
  • xdelta-1.1.3-20_2
  • xmlsec1-1.2.6-3
  • xmlsec1-1.2.6-3_2
  • ypbind-1.17.2-13
  • ypbind-1.17.2-8
  • ypserv-2.13-14
  • zip-2.3-27
  • zlib-1.2.3-3
  • zlib-1.2.3-3_2
  • zsh-4.2.6-1

5 answers

#1


15  

In RPM parlance, 2.el5 is the release field; 2 and el5 are not separate fields. However, release need not have a . in it as your examples show. Drop the \.(.*) from the end to capture the release field in one shot.
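
As a minimal sketch of that change, here is the question's pattern with the trailing \.(.*) dropped. The closing $ anchor and the parse_nvr helper name are assumptions added for illustration, not part of the original pattern.

```python
import re

# The question's pattern with the trailing \.(.*) removed. The final $ is an
# addition here so that unexpected trailing characters cause a non-match.
NVR_PATTERN = re.compile(r'^([a-zA-Z0-9_\-\+]*)-([a-zA-Z0-9_\.]*)-([a-zA-Z0-9_\.]*)$')

def parse_nvr(nvr):
    """Return (name, version, release), or None if the string does not match."""
    match = NVR_PATTERN.match(nvr)
    return match.groups() if match else None

print(parse_nvr('yum-metadata-parser-1.1.2-2.el5'))  # ('yum-metadata-parser', '1.1.2', '2.el5')
print(parse_nvr('xdelta-1.1.3-20_2'))                # ('xdelta', '1.1.3', '20_2')
```

Returning None instead of calling .groups() directly keeps one unexpected name from aborting the whole loop.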

So now you have a package name, version, and release. The easiest way to compare them is to use rpm's python module:

import rpm
# t1 and t2 are tuples of (version, release)
def compare(t1, t2):
    v1, r1 = t1
    v2, r2 = t2
    return rpm.labelCompare(('1', v1, r1), ('1', v2, r2))

What's that extra '1', you ask? That's epoch, and it overrides other version comparison considerations. Further, it's generally not available in the filename. Here, we're faking it to '1' for purposes of this exercise, but that may not be accurate at all. This is one of two reasons your logic is going to be off if you're going by file names alone.

The other reason that your logic may be different from rpm's is the Obsoletes field, which allows a package to be upgraded to a package with an entirely different name. If you're OK with these limitations, then proceed.
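
With those limitations in mind, the original goal (finding out-of-date packages from two lists) could be sketched roughly as follows. The find_outdated helper, the dictionary layout, and the numeric_compare stand-in are all hypothetical; in practice you would pass in a comparison function such as compare() above, which needs the rpm module.

```python
def find_outdated(installed, available, compare):
    """List package names whose available build compares newer.

    installed and available map a package name to a (version, release)
    tuple; compare is a function like compare() above that returns a
    positive number when its first argument is the newer one.
    """
    outdated = []
    for name, current in installed.items():
        candidate = available.get(name)
        if candidate is not None and compare(candidate, current) > 0:
            outdated.append(name)
    return sorted(outdated)


def numeric_compare(t1, t2):
    """Stand-in for compare(): only handles purely numeric dotted fields."""
    def key(t):
        return [int(part) for part in t[0].split('.')] + [int(t[1])]
    return (key(t1) > key(t2)) - (key(t1) < key(t2))


installed = {'ypbind': ('1.17.2', '8'), 'zip': ('2.3', '27')}
available = {'ypbind': ('1.17.2', '13'), 'zip': ('2.3', '27')}
print(find_outdated(installed, available, numeric_compare))  # ['ypbind']
```

Packages that are installed but missing from the repository are skipped here; whether to report them is a separate decision.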

If you don't have the rpm python library at hand, here's the logic for comparing each of release, version, and epoch as of rpm 4.4.2.3:

  • Search each string for alphabetic fields [a-zA-Z]+ and numeric fields [0-9]+ separated by junk [^a-zA-Z0-9]*.
  • Successive fields in each string are compared to each other.
  • Alphabetic sections are compared lexicographically, and the numeric sections are compared numerically.
  • In the case of a mismatch where one field is numeric and one is alphabetic, the numeric field is always considered greater (newer).
  • In the case where one string runs out of fields, the other is always considered greater (newer).
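
The rules above can be sketched in a few lines of Python. This is only an illustration of the five listed rules, not a port of rpm's C code; the simple_vercmp name is made up for this example, and anything rpm does beyond those rules is not covered.

```python
import re

# Alphabetic or numeric fields; everything else is treated as separator junk.
_FIELD = re.compile(r'[a-zA-Z]+|[0-9]+')

def simple_vercmp(a, b):
    """Return 1 if a is newer than b, -1 if older, 0 if equal."""
    fields_a = _FIELD.findall(a)
    fields_b = _FIELD.findall(b)
    for fa, fb in zip(fields_a, fields_b):
        a_num, b_num = fa.isdigit(), fb.isdigit()
        if a_num != b_num:
            # One numeric, one alphabetic: the numeric field is newer.
            return 1 if a_num else -1
        if a_num:
            # Both numeric: compare as integers, so 10 > 9.
            ka, kb = int(fa), int(fb)
        else:
            # Both alphabetic: plain string comparison.
            ka, kb = fa, fb
        if ka != kb:
            return 1 if ka > kb else -1
    # All shared fields are equal: the string with fields left over is newer.
    return (len(fields_a) > len(fields_b)) - (len(fields_a) < len(fields_b))

print(simple_vercmp('1.10', '1.9'))      # 1
print(simple_vercmp('3.7.4', '3.7.4a'))  # -1
```

You would apply such a function to epoch, version, and release in turn, stopping at the first non-zero result.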

See lib/rpmvercmp.c in the RPM source for the gory details.

#2


2  

Here's a working program based off of rpmdev-vercmp from the rpmdevtools package. You shouldn't need anything special installed but yum (which provides the rpmUtils.miscutils python module) for it to work.

The advantage over the other answers is you don't need to parse anything out, just feed it full RPM name-version strings like:

$ ./rpmcmp.py bash-3.2-32.el5_9.1 bash-3.2-33.el5.1
0:bash-3.2-33.el5.1 is newer
$ echo $?
12

Exit status 11 means the first one is newer, 12 means the second one is newer.

#!/usr/bin/python

import rpm
import sys
from rpmUtils.miscutils import stringToVersion

if len(sys.argv) != 3:
    print "Usage: %s <rpm1> <rpm2>" % sys.argv[0]
    sys.exit(1)

def vercmp((e1, v1, r1), (e2, v2, r2)):
    return rpm.labelCompare((e1, v1, r1), (e2, v2, r2))

(e1, v1, r1) = stringToVersion(sys.argv[1])
(e2, v2, r2) = stringToVersion(sys.argv[2])

rc = vercmp((e1, v1, r1), (e2, v2, r2))
if rc > 0:
    print "%s:%s-%s is newer" % (e1, v1, r1)
    sys.exit(11)

elif rc == 0:
    print "These are equal"
    sys.exit(0)

elif rc < 0:
    print "%s:%s-%s is newer" % (e2, v2, r2)
    sys.exit(12)

#3


1  

Based on Owen S's excellent answer, I put together a snippet that uses the system RPM bindings if available, but falls back to a regex based emulation otherwise:

try:
    from rpm import labelCompare as _compare_rpm_labels
except ImportError:
    # Emulate RPM field comparisons
    #
    # * Search each string for alphabetic fields [a-zA-Z]+ and
    #   numeric fields [0-9]+ separated by junk [^a-zA-Z0-9]*.
    # * Successive fields in each string are compared to each other.
    # * Alphabetic sections are compared lexicographically, and the
    #   numeric sections are compared numerically.
    # * In the case of a mismatch where one field is numeric and one is
    #   alphabetic, the numeric field is always considered greater (newer).
    # * In the case where one string runs out of fields, the other is always
    #   considered greater (newer).

    import re
    import warnings
    warnings.warn("Failed to import 'rpm', emulating RPM label comparisons")

    try:
        from itertools import zip_longest
    except ImportError:
        from itertools import izip_longest as zip_longest

    _subfield_pattern = re.compile(
        r'(?P<junk>[^a-zA-Z0-9]*)((?P<text>[a-zA-Z]+)|(?P<num>[0-9]+))'
    )

    def _iter_rpm_subfields(field):
        """Yield subfields as 2-tuples that sort in the desired order

        Text subfields are yielded as (0, text_value)
        Numeric subfields are yielded as (1, int_value)
        """
        for subfield in _subfield_pattern.finditer(field):
            text = subfield.group('text')
            if text is not None:
                yield (0, text)
            else:
                yield (1, int(subfield.group('num')))

    def _compare_rpm_field(lhs, rhs):
        # Short circuit for exact matches (including both being None)
        if lhs == rhs:
            return 0
        # Otherwise assume both inputs are strings
        lhs_subfields = _iter_rpm_subfields(lhs)
        rhs_subfields = _iter_rpm_subfields(rhs)
        for lhs_sf, rhs_sf in zip_longest(lhs_subfields, rhs_subfields):
            if lhs_sf == rhs_sf:
                # When both subfields are the same, move to next subfield
                continue
            if lhs_sf is None:
                # Fewer subfields in LHS, so it's less than/older than RHS
                return -1
            if rhs_sf is None:
                # More subfields in LHS, so it's greater than/newer than RHS
                return 1
            # Found a differing subfield, so it determines the relative order
            return -1 if lhs_sf < rhs_sf else 1
        # No relevant differences found between LHS and RHS
        return 0


    def _compare_rpm_labels(lhs, rhs):
        lhs_epoch, lhs_version, lhs_release = lhs
        rhs_epoch, rhs_version, rhs_release = rhs
        result = _compare_rpm_field(lhs_epoch, rhs_epoch)
        if result:
            return result
        result = _compare_rpm_field(lhs_version, rhs_version)
        if result:
            return result
        return _compare_rpm_field(lhs_release, rhs_release)

Note that I haven't tested this extensively for consistency with the C level implementation - I only use it as a fallback implementation that's at least good enough to let Anitya's test suite pass in environments where system RPM bindings aren't available.

#4


1  

A much simpler regex is /^(.+)-(.+)-(.+)\.(.+)\.rpm$/

I'm not aware of any restrictions on the package name (first capture). The only restriction on version and release is that they cannot contain '-'. There is no need to encode this in the pattern: the uncaptured '-' characters separate those fields, so a '-' inside one of them would split it rather than leave it a single field, and the resulting captures therefore never contain a '-'. Only the first capture, the name, can contain a '-', because being greedy it consumes all the extra '-' characters first.

Then there's the architecture. This regex places no restriction on the architecture name, except that it must not contain a '.'.

The capture results are [name, version, release, arch]
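
For illustration, the same pattern in Python might be used like this. The parse_rpm_filename helper and the sample file name are assumptions made for this example.

```python
import re

# Same pattern as above, written for Python's re module.
RPM_FILENAME = re.compile(r'^(.+)-(.+)-(.+)\.(.+)\.rpm$')

def parse_rpm_filename(filename):
    """Return [name, version, release, arch], or None if there is no match."""
    match = RPM_FILENAME.match(filename)
    return list(match.groups()) if match else None

# Hypothetical file name, used only to show the captures.
print(parse_rpm_filename('bash-3.2-32.el5.x86_64.rpm'))
# ['bash', '3.2', '32.el5', 'x86_64']
```

Because the pattern requires the arch and .rpm suffix, bare name-version-release strings such as those in the question do not match it.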

Caveats from Owen's answer about relying on the rpm name alone still apply.

Now you have to compare the version strings, which is not straightforward. I don't believe that can be done with a regex. You'd need to implement the comparison algorithm.

#5


0  

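RPM has Python bindings, and yum builds on them to provide rpmUtils.miscutils.compareEVR. Each argument is an (epoch, version, release) tuple: the first element is the epoch rather than the package name (so "foo" below is just the same placeholder on both sides), the middle is the version, and the third is the release, i.e. the packaging version. In the example below, I'm trying to figure out where 3.7.4a gets sorted.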

[root@rhel56 ~]# python
Python 2.4.3 (#1, Dec 10 2010, 17:24:35) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpmUtils.miscutils
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4", "1"), ("foo", "3.7.4", "1"))
0
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4", "1"), ("foo", "3.7.4a", "1")) 
-1
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4a", "1"), ("foo", "3.7.4", "1")) 
1
