I've got an XML file that includes an email address as part of each record. I'd like to obscure the email addresses (for privacy) but also keep their "uniqueness", so that records (purchases in this case) can still be combined when more than one comes from the same email address.
I figured there might be a way to use a regex to replace the characters just before and after the "@" with * or similar. Masking 3 or 4 characters on each side should preserve privacy and (for the most part) keep the "uniqueness".
Suggestions on the best way to do this (including some completely different options than what I'm thinking)?
Thanks.
1 solution
#1
1
The regex would look something like this: ([^@]{1,4})@(.{1,4}), which captures up to 4 characters on each side of the @.
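If you do this inside a program, a rough sketch in Python might look like the following. Python is only an assumption here, since the question doesn't name a language, and the function name mask_email_edges is made up for illustration. The character classes are also narrowed to exclude "<", ">" and whitespace so a match can't run into the surrounding XML markup.

```python
import re

# Assumed variant of the regex above: "<", ">" and whitespace are excluded
# on both sides so the match never swallows part of an XML tag.
EMAIL_EDGE = re.compile(r"([^@<>\s]{1,4})@([^@<>\s]{1,4})")


def mask_email_edges(text):
    """Replace up to 4 characters on each side of every "@" with asterisks."""

    def _mask(match):
        before, after = match.group(1), match.group(2)
        # Keep the "@" and emit one "*" per character that was captured.
        return "*" * len(before) + "@" + "*" * len(after)

    return EMAIL_EDGE.sub(_mask, text)


sample = "<email>john.smith@example.com</email>"
print(mask_email_edges(sample))
# prints: <email>john.s****@****ple.com</email>
```

Each side is masked independently here, so a 2-character local part still gets up to 4 characters masked on the domain side.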
How you do the replacement depends on your language and on how you are loading the file. If you are doing this just once in a text editor like UltraEdit, rather than inside a program, I would do something like this:
Replace all [^@>]@[^<] with *@*
Replace all [^@>]{2}@[^<]{2} with **@**
Replace all [^@>]{3}@[^<]{3} with ***@***
Replace all [^@>]{4}@[^<]{4} with ****@****
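Run in that order, each pass widens the mask by one character on both sides, so short email addresses still get partially masked. (The character classes are tweaked so the match does not include your XML tags.)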
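The four passes above can also be scripted rather than typed into an editor. This is only a sketch, again assuming Python; the function name mask_in_passes and the max_width parameter are invented for the example, while the patterns are the same ones listed above.

```python
import re


def mask_in_passes(text, max_width=4):
    """Apply the editor-style replacements in order, widths 1 through max_width."""
    for n in range(1, max_width + 1):
        # Same pattern as the manual steps: n characters that are not "@" or ">"
        # before the "@", and n characters that are not "<" after it.
        pattern = re.compile(r"[^@>]{%d}@[^<]{%d}" % (n, n))
        text = pattern.sub("*" * n + "@" + "*" * n, text)
    return text


print(mask_in_passes("<email>john.smith@example.com</email>"))
# prints: <email>john.s****@****ple.com</email>
print(mask_in_passes("<email>al@x.org</email>"))
# prints: <email>**@**org</email>
```

Note that every pass needs the same number of characters on both sides, so the shorter side sets the mask width: "al@x.org" stops at two asterisks each way because only two characters sit between the opening tag and the "@".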