How do I generate a unique ID in Python? [duplicate]

Time: 2021-07-06 19:38:32

This question already has an answer here:

I need to generate a unique ID based on a random value.

9 answers

#1


125  

Perhaps uuid.uuid4() might do the job. See uuid for more information.

#2


90  

You might want Python's UUID functions:

21.15. uuid — UUID objects according to RFC 4122

For example:

import uuid
print(uuid.uuid4())

7d529dd4-548b-4258-aa8e-23e34dc8d43d
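
As a minimal sketch of what the returned object offers (assuming Python 3; the variable names are illustrative only), the same UUID can be rendered in several forms:

```python
import uuid

# uuid4() builds the ID from random bits
new_id = uuid.uuid4()

as_text = str(new_id)   # 36 characters, hyphenated, like the output above
as_hex = new_id.hex     # 32 hex characters, no hyphens
as_int = new_id.int     # the same value as a 128-bit integer

print(as_text)
print(as_hex)
print(new_id.version)   # 4 for uuid4()
```

Which form to store is a design choice: the hyphenated text is the most recognizable, while the integer or hex form is more compact.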

#3


15  

Unique and random are mutually exclusive. Perhaps you want this?

import random

def uniqueid():
    # Start from a random 32-bit value, then count upward so no value repeats
    seed = random.getrandbits(32)
    while True:
        yield seed
        seed += 1

Usage:

import itertools

unique_sequence = uniqueid()
id1 = next(unique_sequence)
id2 = next(unique_sequence)
id3 = next(unique_sequence)
ids = list(itertools.islice(unique_sequence, 1000))

No two returned IDs are the same (unique), and the sequence starts from a randomized seed value.
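
If the IDs themselves should look random rather than sequential, one option is to remember what has already been handed out and redraw on a repeat. The following is only a sketch under that assumption; random_unique_ids and its bits parameter are hypothetical names, not part of the answer above, and the set grows with every ID issued.

```python
import itertools
import random

def random_unique_ids(bits=32):
    """Yield random integers, skipping any value already issued."""
    issued = set()
    while True:
        candidate = random.getrandbits(bits)
        if candidate not in issued:
            issued.add(candidate)
            yield candidate

ids = list(itertools.islice(random_unique_ids(), 1000))
```

Uniqueness here holds only within one generator instance, just as with the counter-based version.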

#4


5  

import time
import random
import socket
import hashlib

def guid( *args ):
    """
    Generates a universally unique ID.
    Any arguments only create more randomness.
    """
    # Python 3: int replaces long, and the L suffix on literals is gone
    t = int( time.time() * 1000 )
    r = int( random.random()*100000000000000000 )
    try:
        a = socket.gethostbyname( socket.gethostname() )
    except OSError:
        # if we can't get a network address, just imagine one
        a = random.random()*100000000000000000
    data = str(t)+' '+str(r)+' '+str(a)+' '+str(args)
    # hashlib needs bytes in Python 3, so encode the string first
    data = hashlib.md5(data.encode('utf-8')).hexdigest()

    return data
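
The ingredients mixed by hand above (current time, a host identifier, and some randomness) are close to what the standard library's uuid.uuid1() combines. A minimal sketch, assuming that a version 1 UUID is acceptable for your use case (it can reveal the machine's network address):

```python
import uuid

# uuid1() combines a timestamp, a clock sequence and a host (node) ID
first = uuid.uuid1()
second = uuid.uuid1()

print(first)
print(second)
print(first.version)  # 1 for uuid1()
```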

#5


4  

Maybe the uuid module?

#6


4  

Here is an implementation:

import random
from datetime import datetime

def __uniqueid__():
    """
      Generate a unique id with length 17 to 21.
      Ensure uniqueness even with daylight saving events (clocks adjusted one hour backward).

      If you generate 1 million ids per second for 100 years, you will generate
      2**25 (approx. seconds per year) * 10**6 (1 million ids per second) * 100 (years) = about 3.4 * 10**15 unique ids.

      With a 17-digit (radix 16) id, you can represent 16**17 = 295147905179352825856 ids (around 2.9 * 10**20).
      In fact, as we need far fewer than that, we accept that the format used to represent the id (seed + timestamp reversed)
      does not cover all numbers that could be represented with 35 digits (radix 16).

      If you generate 1 million ids per second with this algorithm, it will increase the seed by less than 2**12 per hour,
      so if a DST change moves the clock back one hour, we need to generate unique ids twice for the same period.
      The seed must cover at least the range 1 to 2**13. If we want to ensure uniqueness for two hours (100% contingency), we need
      a seed in the range 1 to 2**14. That is what we have with this algorithm. You have to increase seed_range_bits if you
      move your machine by airplane to another time zone, or if you can afford a computer that can generate
      more than 1 million ids per second.

      A word about predictability: this algorithm is absolutely NOT designed to generate unpredictable unique ids.
      You can add a SHA-1 or SHA-256 digest step at the end of this algorithm, but you will lose guaranteed uniqueness and enter the world of collision probabilities.
      Hash algorithms ensure that the same id always gives the same hash, but for two different ids it is
      possible to get the same hash, with a very small probability. You would rather choose a bijective function that maps
      a 35-digit (or longer) number to a 35-digit (or longer) number, based on a block cipher and a secret key. Read papers on breaking PRNG algorithms
      to be convinced that problems can occur as soon as you use the random library :)

      1 million ids per second?... On an Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz, you get:

      >>> timeit.timeit(uniqueid,number=40000)
      1.0114529132843018

      an average of 40000 ids/second
    """
    mynow=datetime.now
    sft=datetime.strftime
    # store old datetime each time in order to check if we generate during the same microsecond (very fast machine!)
    # or if a daylight saving event occurs (when clocks are adjusted backward) [rarely detected at this level]
    old_time=mynow() # fake init - on a very fast machine it could increase your seed to seed + 1... but we have our contingency :)
    # manage seed
    seed_range_bits=14 # max range for seed
    seed_max_value=2**seed_range_bits - 1 # seed could not exceed 2**nbbits - 1
    # get random seed
    seed=random.getrandbits(seed_range_bits)
    current_seed=str(seed)
    # producing new ids
    while True:
        # get current time 
        current_time=mynow()
        if current_time <= old_time:
            # previous id generated in the same microsecond or Daylight saving time event occurs (when clocks are adjusted backward)
            seed = max(1,(seed + 1) % seed_max_value)
            current_seed=str(seed)
        # generate new id (concatenate seed and timestamp as numbers)
        #newid=hex(int(''.join([sft(current_time,'%f%S%M%H%d%m%Y'),current_seed])))[2:-1]
        newid=int(''.join([sft(current_time,'%f%S%M%H%d%m%Y'),current_seed]))
        # save current time
        old_time=current_time
        # return a new id
        yield newid

""" you get a new id for each call of uniqueid() """
# Python 3: generators expose __next__ rather than next
uniqueid=__uniqueid__().__next__

import unittest
class UniqueIdTest(unittest.TestCase):
    def testGen(self):
        for _ in range(3):
            m=[uniqueid() for _ in range(10)]
            self.assertEqual(len(m),len(set(m)),"duplicates found !")
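
The docstring suggests a bijective mapping instead of a hash if the IDs should not look sequential. As a rough sketch of that idea only: multiplying by an odd constant modulo 2**128 is reversible, so distinct inputs stay distinct. The constant and the names scramble and unscramble are assumptions made for illustration, the code needs Python 3.8+ for pow(x, -1, m), and this is obfuscation, not the keyed block cipher the docstring has in mind.

```python
MODULUS = 2 ** 128  # large enough for the integer ids generated above
# Arbitrary odd constant, chosen for illustration (assumption)
MULTIPLIER = 0x9E3779B97F4A7C15F39CC0605CEDC835
# The modular inverse exists because MULTIPLIER is odd
INVERSE = pow(MULTIPLIER, -1, MODULUS)

def scramble(n):
    """Map n to another number in [0, MODULUS) with no collisions."""
    return (n * MULTIPLIER) % MODULUS

def unscramble(m):
    """Recover the original number from scramble(n)."""
    return (m * INVERSE) % MODULUS
```

Because the mapping is one-to-one, scramble(uniqueid()) keeps the uniqueness guarantee that a digest step would give up.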

Hope it helps!

#7


3  

This will work very quickly, but it generates monotonically increasing values (for a given thread) rather than random ones.

import threading

_uid = threading.local()
def genuid():
    if getattr(_uid, "uid", None) is None:
        _uid.tid = threading.current_thread().ident
        _uid.uid = 0
    _uid.uid += 1
    return (_uid.tid, _uid.uid)

It is thread safe, and working with tuples may have benefits over strings (they are shorter, if anything). If you do not need thread safety, feel free to remove the threading bits (instead of threading.local, use a plain attribute holder such as types.SimpleNamespace(), since a bare object() cannot take new attributes, and remove tid altogether).
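
For the single-threaded case described above, a plain counter is enough. A minimal sketch, assuming one process and IDs that only need to be unique within it; next_uid is a hypothetical name:

```python
import itertools

_counter = itertools.count(1)  # yields 1, 2, 3, ...

def next_uid():
    """Return the next integer ID; unique only within this process."""
    return next(_counter)

first = next_uid()
second = next_uid()
```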

Hope that helps.

#8


2  

Maybe this works for you:

str(uuid.uuid4().fields[-1])[:5]
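
Note that truncating to five characters leaves at most 100,000 distinct values, so repeats become likely quickly. A rough estimate using the standard birthday approximation, assuming (optimistically) that all values are equally likely; collision_probability is a hypothetical helper:

```python
import math

def collision_probability(n_ids, n_values=10 ** 5):
    """Approximate chance that at least two of n_ids random IDs coincide."""
    return 1.0 - math.exp(-n_ids * (n_ids - 1) / (2.0 * n_values))

for n in (10, 100, 1000):
    print(n, round(collision_probability(n), 3))
```

Under these assumptions, about a thousand IDs already make a collision very likely.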

#9


-6  

import time
def new_id():
    time.sleep(0.000001)
    return time.time()

On my system, time.time() seems to offer 6 significant figures after the decimal point. With a brief sleep it should be guaranteed unique with at least a moderate amount of randomness down in the last two or three digits.

You could hash it as well if you're worried.
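
Clock and sleep resolution vary between systems, so the sleep alone may not guarantee distinct readings everywhere. One way to keep the timestamp idea without relying on the sleep is to remember the last value returned and bump it when the clock has not advanced. A sketch under those assumptions (Python 3.7+ for time.time_ns; make_id_source is a hypothetical name, and this version is not thread-safe):

```python
import time

def make_id_source(clock=time.time_ns):
    """Return a function yielding strictly increasing nanosecond timestamps."""
    last = 0

    def new_id():
        nonlocal last
        now = clock()
        # If the clock stalled or went backwards, step past the last ID
        last = now if now > last else last + 1
        return last

    return new_id

new_id = make_id_source()
ids = [new_id() for _ in range(1000)]
```

The clock parameter exists only so the behaviour can be checked with a fake clock; in normal use the default is enough.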
