We have a system where we want to prevent the same credit card number being registered for two different accounts. As we don't store the credit card number internally - just the last four digits and expiration date - we cannot simply compare credit card numbers and expiration dates.


Our current idea is to store a hash (SHA-1) in our system of the credit card information when the card is registered, and to compare hashes to determine if a card has been used before.


Usually, a salt is used to avoid dictionary attacks. I assume we are vulnerable in this case, so we should probably store a salt along with the hash.


Do you guys see any flaws in this method? Is this a standard way of solving this problem?


Let's do a little math: Credit card numbers are typically 16 digits long. The first six digits are the issuer identification number (whose first digit is the 'major industry' identifier), and the last digit is the Luhn checksum. That leaves 9 digits 'free', for a total of 1,000,000,000 account numbers per issuer, multiplied by the number of potential issuer numbers (which is not likely to be very high). There are implementations that can do millions of hashes per second on everyday hardware, so no matter what salting you do, this is not going to be a big deal to brute force.
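To put rough numbers on this, here is a back-of-the-envelope sketch. The hash rate and the count of plausible issuer prefixes are assumptions picked purely for illustration, not measurements, and the number of free digits is left as a parameter because it depends on how long you take the issuer prefix to be.

```python
def brute_force_seconds(free_digits: int, issuer_prefixes: int,
                        hashes_per_second: float) -> float:
    """Worst-case time to hash every candidate card number once."""
    candidates = issuer_prefixes * 10 ** free_digits
    return candidates / hashes_per_second


# Assumed figures: 1,000 plausible issuer prefixes, 5 million hashes per second.
for free_digits in (8, 9):
    seconds = brute_force_seconds(free_digits, issuer_prefixes=1_000,
                                  hashes_per_second=5e6)
    print(f"{free_digits} free digits: about {seconds / 3600:.1f} hours")
```

Under these assumptions the whole space falls in hours to a few days on one machine, and a per-record salt does not change that: it only forces the attacker to repeat the search for each record.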


By sheer coincidence, when looking for something giving hash algorithm benchmarks, I found this article about storing credit card hashes, which says:


Storing credit cards using a simple single pass of a hash algorithm, even when salted, is fool-hardy. It is just too easy to brute force the credit card numbers if the hashes are compromised.



When hashing credit card number, the hashing must be carefully designed to protect against brute forcing by using strongest available cryptographic hash functions, large salt values, and multiple iterations.


The full article is well worth a thorough read. Unfortunately, the upshot seems to be that any circumstance that makes it 'safe' to store hashed credit card numbers will also make it prohibitively expensive to search for duplicates.




People are overthinking the design of this, I think. Use a salted, highly secure (i.e. "computationally expensive") hash like SHA-256, with a per-record unique salt.


You should do a low-cost, high accuracy check first, then do the high-cost definitive check only if that check hits.


Step 1:

Look for matches to the last 4 digits (and possibly also the exp. date, though there are some subtleties there that may need addressing).


Step 2:

If the simple check hits, use the stored salt, compute the hash value, and do the in-depth check.


The last 4 digits of the card number are the most variable part (partly because they include the Luhn check digit), so the percentage of in-depth checks that won't ultimately match (the false positive rate) will be very, very low (a fraction of a percent), which saves you a tremendous amount of overhead relative to the naive "do the hash check every time" design.
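The two steps could be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (StoredCard), the helper names, and the iteration count are made up, and PBKDF2 from Python's standard library stands in for whatever slow hash you actually pick.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware


@dataclass
class StoredCard:  # hypothetical record layout
    last4: str
    salt: bytes
    digest: bytes


def slow_hash(pan: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 used here as a stand-in for a slow, salted hash.
    return hashlib.pbkdf2_hmac("sha256", pan.encode(), salt, ITERATIONS)


def register(pan: str) -> StoredCard:
    salt = secrets.token_bytes(16)  # unique random salt per record
    return StoredCard(pan[-4:], salt, slow_hash(pan, salt))


def is_duplicate(pan: str, cards: list) -> bool:
    # Step 1: cheap filter on the stored last four digits.
    candidates = [c for c in cards if c.last4 == pan[-4:]]
    # Step 2: expensive check, once per surviving candidate only.
    return any(hmac.compare_digest(slow_hash(pan, c.salt), c.digest)
               for c in candidates)
```

In a real system step 1 would be an indexed database query rather than a list scan; the point is only that the slow hash runs for a handful of candidates instead of for every stored card.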




Do not store a simple SHA-1 of the credit card number; it would be way too easy to crack (especially since the last 4 digits are known). We had the same problem in my company: here is how we solved it.


First solution

  1. For each credit card, we store the last 4 digits, the expiration date, a long random salt (50 bytes long), and the salted hash of the CC number. We use the bcrypt hash algorithm because it is very secure and can be tuned to be as CPU-intensive as you wish. We tuned it to be very expensive (about 1 second per hash on our server!). But I guess you could use SHA-256 instead and iterate as many times as needed.

  2. When a new CC number is entered, we start by finding all the existing CC numbers that end with the same 4 digits and have the same expiration date. Then, for each matching CC, we check whether its stored salted hash matches the salted hash calculated from its salt and the new CC number. In other words, we check whether or not hash(stored_CC1_salt+CC2)==stored_CC1_hash.

Since we have roughly 100k credit cards in our database, we need to calculate about 10 hashes, so we get the result in about 10 seconds. In our case, this is fine, but you may want to tune bcrypt down a bit. Unfortunately, if you do, this solution will be less secure. On the other hand, if you tune bcrypt to be even more CPU-intensive, it will take more time to match CC numbers.
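The estimate works out as in the small sketch below. It assumes the last four digits are spread evenly over all 10,000 possible values and ignores the extra narrowing from the expiration date, so it is an upper-end figure rather than a measurement.

```python
def expected_lookup_seconds(total_cards: int, seconds_per_hash: float,
                            distinct_suffixes: int = 10_000) -> float:
    """Expected time for one duplicate check after filtering on the last 4 digits."""
    expected_candidates = total_cards / distinct_suffixes
    return expected_candidates * seconds_per_hash


# 100k stored cards and a hash tuned to take about 1 second each.
print(expected_lookup_seconds(100_000, 1.0))
```

Filtering on the expiration date as well shrinks the candidate set further, which is where the room to raise the work factor comes from.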


Even though I believe that this solution is way better than simply storing an unsalted hash of the CC number, it will not prevent a very motivated attacker (who manages to get a copy of the database) from breaking one credit card in an average time of 2 to 5 years. So if you have 100k credit cards in your database, and if the attacker has a lot of CPU, then he can recover a few credit card numbers every day!


This leads me to the belief that you should not calculate the hash yourself: you have to delegate that to someone else. This is the second solution (we are in the process of migrating to this second solution).


Second solution

Simply have your payment provider generate an alias for your credit card.


  1. For each credit card, you simply store whatever you want to store (for example the last 4 digits & the expiration date) plus a credit card number alias.

  2. When a new credit card number is entered, you contact your payment provider and give it the CC number (or you redirect the client to the payment provider, and he enters the CC number directly on the payment provider's web site). In return, you get the credit card alias! That's it. Of course you should make sure that your payment provider offers this option, and that the generated alias is actually secure (for example, make sure they don't simply calculate a SHA-1 on the credit card number!). Now the attacker has to break your system plus your payment provider's system if he wants to recover the credit card numbers.

It's simple, it's fast, it's secure (well, at least if your payment provider is). The only problem I see is that it ties you to your payment provider.
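As a rough sketch of the flow on your side, assuming the provider returns the same alias every time it sees the same card number: the PaymentProvider interface, alias_for, and FakeProvider below are all hypothetical stand-ins, not any real provider's API.

```python
import secrets
from typing import Protocol


class PaymentProvider(Protocol):  # hypothetical interface
    def alias_for(self, pan: str) -> str: ...


class FakeProvider:
    """In-memory stand-in for the remote provider, for demonstration only."""

    def __init__(self) -> None:
        self._aliases: dict[str, str] = {}

    def alias_for(self, pan: str) -> str:
        # A real provider keeps this mapping on its side; you never see it.
        return self._aliases.setdefault(pan, secrets.token_hex(16))


class CardRegistry:
    def __init__(self, provider: PaymentProvider) -> None:
        self._provider = provider
        self._owner_by_alias: dict[str, str] = {}

    def register(self, account_id: str, pan: str) -> bool:
        """Return False if the card is already registered to another account."""
        alias = self._provider.alias_for(pan)
        owner = self._owner_by_alias.setdefault(alias, account_id)
        return owner == account_id
```

Only the opaque alias is ever stored locally, so the duplicate check becomes a plain equality lookup with no hashing on your servers.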


Hope this helps.




PCI DSS states that you can store PANs (credit card numbers) using a strong one-way hash. They don't even require that it be salted. That said, you should salt it with a unique per-card value. The expiry date is a good start but perhaps a bit too short. You could add in other pieces of information from the card, such as the issuer. You should not use the CVV/security number as you are not allowed to store it. If you do use the expiry date, then when the cardholder gets issued a new card with the same number it will count as a different card. This could be a good or bad thing depending on your requirements.
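A small sketch of that last point, assuming the salt is derived from the expiry date: the same number with a renewed expiry produces a different hash. The function name and input format are made up, and a single SHA-256 pass is used only to keep the example short, not as a recommended strength.

```python
import hashlib


def card_key(pan: str, expiry: str) -> str:
    # The expiry date acts as the per-card salt (illustrative only).
    return hashlib.sha256(f"{expiry}:{pan}".encode()).hexdigest()


old = card_key("4111111111111111", "12/27")
renewed = card_key("4111111111111111", "12/30")
print(old == renewed)  # the renewed card counts as a different card
```

Because the salt is derived from data you already hold, the hash can be recomputed and looked up directly, which is what makes duplicate checks cheap with this approach.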


An approach to make your data more secure is to make each operation computationally expensive. For instance, if you apply MD5 twice it will take an attacker longer to crack the codes.
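The general idea, repeating the hash many times so that each guess costs more, might look like the sketch below. It swaps in SHA-256 for MD5 and uses a hand-rolled loop purely to show the principle; the function name and parameters are illustrative, and a vetted construction such as hashlib.pbkdf2_hmac is generally a safer choice than rolling your own.

```python
import hashlib


def iterated_hash(data: bytes, salt: bytes, rounds: int) -> bytes:
    """Hash salt + data, then re-hash the result until `rounds` passes are done."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    digest = hashlib.sha256(salt + data).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(digest).digest()
    return digest
```

Two rounds barely slow an attacker down; the round count has to be large enough that a single evaluation takes a noticeable amount of time.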


It's fairly trivial to generate valid credit card numbers and to attempt a charge for each possible expiry date. However, it is computationally expensive. If you make it more expensive to crack your hashes then it wouldn't be worthwhile for anyone to bother; even if they had the salts, hashes and the method you used.




SHA-1 isn't broken, per se. What the article shows is that it's possible to generate two strings which have the same hash value in less than brute-force time. You still aren't able to generate a string that hashes to a SPECIFIC value in a reasonable amount of time. There is a big difference between the two.




I believe I have found a fool-proof way to solve this problem. Someone please correct me if there is a flaw in my solution.


  1. Create a secure server on EC2, Heroku, etc. This server will serve ONE purpose and ONLY one purpose: hashing your credit card.

  2. Install a secure web server (Node.js, Rails, etc) on that server and set up the REST API call.

  3. On that server, use a unique salt (1000 characters) and SHA512 it 1000 times.

That way, even if hackers get your hashes, they would need to break into your server to find your formula.
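The hashing step on that server could look something like this sketch. The secret is generated on the fly here only so the example runs; in the setup described it would be created once and kept in that server's configuration. The names SERVER_SECRET and fingerprint are made up for illustration.

```python
import hashlib
import secrets

ROUNDS = 1000
# 500 random bytes rendered as 1000 hex characters. Assumed to be generated
# once and stored only on the hashing server, not regenerated per process.
SERVER_SECRET = secrets.token_hex(500)


def fingerprint(pan: str, secret: str = SERVER_SECRET, rounds: int = ROUNDS) -> str:
    """Secret-salted SHA-512, applied `rounds` times."""
    digest = (secret + pan).encode()
    for _ in range(rounds):
        digest = hashlib.sha512(digest).digest()
    return digest.hex()
```

Since the same secret is used for every card, equal card numbers give equal fingerprints and duplicates can be found by simple comparison. The flip side is that anyone who obtains the secret along with the hashes is back to an ordinary brute-force problem.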




Comparing hashes is a good solution. Make sure that you don't just salt all the credit card numbers with the same constant salt, though. Use a different salt (like the expiration date) on each card. This should make you fairly impervious to dictionary attacks.


From this Coding Horror article:


Add a long, unique random salt to each password you store. The point of a salt (or nonce, if you prefer) is to make each password unique and long enough that brute force attacks are a waste of time. So, the user's password, instead of being stored as the hash of "myspace1", ends up being stored as the hash of 128 characters of random unicode string + "myspace1". You're now completely immune to rainbow table attack.




Almost a good idea.


Storing just the hash is a good idea; it has served the password world for decades.


Adding a salt seems like a fair idea, and indeed makes a brute force attack that much harder for the attacker. But that salt is going to cost you a lot of extra effort when you actually check to ensure that a new CC is unique: You'll have to SHA-1 your new CC number N times, where N is the number of salts you have already used for all of the CCs you are comparing it to. If indeed you choose good random salts you'll have to do the hash for every other card in your system. So now it is you doing the brute force. So I would say this is not a scalable solution.
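To make the cost concrete, here is a small sketch that counts how many hashes one uniqueness check needs when every record has its own random salt and nothing narrows the search first. SHA-1 is used only because it is the hash under discussion, and the record layout is an assumption.

```python
import hashlib
import secrets


def salted_sha1(pan: str, salt: bytes) -> bytes:
    return hashlib.sha1(salt + pan.encode()).digest()


def find_duplicate(pan: str, records: list) -> tuple:
    """records holds (salt, digest) pairs. Returns (found, hashes_computed)."""
    hashes = 0
    for salt, digest in records:
        hashes += 1  # one fresh hash per stored record
        if salted_sha1(pan, salt) == digest:
            return True, hashes
    return False, hashes


# Build 50 stored cards, each with its own random salt.
records = []
for i in range(50):
    salt = secrets.token_bytes(16)
    records.append((salt, salted_sha1(str(4000000000000000 + i), salt)))

print(find_duplicate("5500000000000004", records))
```

A card that is not in the system costs one hash per stored record, so the work grows linearly with the size of the database.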


You see, in the password world a salt adds no cost because we just want to know if the clear text + salt hashes to what we have stored for this particular user. Your requirement is actually pretty different.


You'll have to weigh the trade-off yourself. Adding salt doesn't make your database secure if it does get stolen, it just makes decoding it harder. How much harder? If it changes the attack from requiring 30 seconds to requiring one day you have achieved nothing -- it will still be decoded. If it changes it from one day to 30 years you have achieved something worth considering.




Yes, comparing hashes should work fine in this case.




A salted hash should work just fine. Having a salt-per-user system should be plenty of security.




SHA-1 is broken. Of course, there isn't much information out there on what a good replacement is. SHA-2?




If you combine the last 4 digits of the card number with the card holder's name (or just last name) and the expiration date you should have enough information to make the record unique. Hashing is nice for security, but wouldn't you need to store/recall the salt in order to replicate the hash for a duplicate check?




I think a good solution, as hinted at above, would be to store a hash of, say, the card number, expiration date, and name. That way you can still do quick comparisons...




SHA-1 being broken is not a problem here. All "broken" means is that it's possible to calculate collisions (2 data sets that have the same SHA-1) more easily than you would expect. This might be a problem for accepting arbitrary files based on their SHA-1, but it has no relevance for an internal hashing application.




If you are using a payment processor like Stripe / Braintree, let them do the "heavy lifting".


They both offer card fingerprinting that you can safely store in your db and compare later to see if a card already exists:


  • Stripe returns fingerprint string - see doc

  • Braintree returns unique_number_identifier string - see doc
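Once you have such a fingerprint, the duplicate check can be pushed into the database with a unique constraint. The sketch below uses SQLite and a made-up table layout, and the fingerprint values are placeholders; fetching the real value from the processor's API is not shown.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cards ("
        " account_id TEXT NOT NULL,"
        " fingerprint TEXT NOT NULL UNIQUE)"
    )
    return conn


def register_card(conn: sqlite3.Connection, account_id: str, fingerprint: str) -> bool:
    """Insert the card; return False if the fingerprint is already stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO cards (account_id, fingerprint) VALUES (?, ?)",
                (account_id, fingerprint),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Letting the constraint reject the insert avoids a race between a separate "does it exist?" query and the insert when two registrations arrive at the same time.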



