I have a line of DNA code, and I'm trying to use a Java regular expression to match each codon (a 3-letter sequence) to an amino acid. Below is an example of one of the patterns:
Pattern A = Pattern.compile(("gct")||("gcc")||("gca")||("gcg"));
This syntax does not work, with or without the parentheses. Ultimately, the aim of the code is to count how many times each amino acid occurs in the DNA string, and since there are 20 or so amino acids, I have that many patterns. Can anyone help me find an elegant way of doing this?
I know I could use string1.equals(string2) and so on, but I would really rather use a regex for it. Any help would be much appreciated!
2 solutions
#1
4
You're passing Pattern.compile() a boolean expression, where it expects a string:
Pattern A = Pattern.compile("(gct)|(gcc)|(gca)|(gcg)");
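To get from a compiled pattern to the count the question asks for, one option is to loop over Matcher.find(). This is only a minimal sketch, assuming matches may start at any position in the string (no reading-frame alignment); the class name AlanineCount, the helper countMatches, and the sample string are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlanineCount {
    // The four codons from the question, joined with the regex alternation operator.
    static final Pattern ALANINE = Pattern.compile("gct|gcc|gca|gcg");

    // Counts non-overlapping matches found anywhere in the string.
    static int countMatches(Pattern pattern, String dna) {
        Matcher m = pattern.matcher(dna);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical input: "gct" at index 0 and "gcg" at index 6.
        System.out.println(countMatches(ALANINE, "gctaaagcgttt")); // prints 2
    }
}
```

Because find() scans from wherever the previous match ended, this counts codon-like substrings regardless of frame, which may or may not be what is wanted for real sequence data.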
#2
-1
This:
/("gct")||("gcc")||("gca")||("gcg")/
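Is roughly equivalent to:
/("gct")|/
Because the double || puts an empty alternative between the codons, and an empty alternative matches the empty string at any position, so the alternatives after it are never tried. And guess what? It will always match!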
Instead, use a single | and drop the inner quotes, which a regex treats as literal characters:
/(gct)|(gcc)|(gca)|(gcg)/
Or even better, use a character class for the third letter:
"gc[tcag]"
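The character-class form also scales to the "20 or so" patterns mentioned in the question. The following is a rough sketch rather than a definitive implementation: it assumes the DNA string is lowercase and should be read in frame, three letters at a time from index 0. The class name CodonCounter, the CODONS table, and the second entry (glycine, "gg[tcag]") are illustrative additions, not part of the original post:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class CodonCounter {
    // Illustrative table: only two amino acids are shown; the rest would be added the same way.
    static final Map<String, Pattern> CODONS = new LinkedHashMap<>();
    static {
        CODONS.put("Ala", Pattern.compile("gc[tcag]"));
        CODONS.put("Gly", Pattern.compile("gg[tcag]"));
    }

    // Splits the string into consecutive 3-letter codons and counts matches per amino acid.
    static Map<String, Integer> count(String dna) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : CODONS.keySet()) {
            counts.put(name, 0);
        }
        for (int i = 0; i + 3 <= dna.length(); i += 3) {
            String codon = dna.substring(i, i + 3);
            for (Map.Entry<String, Pattern> e : CODONS.entrySet()) {
                // matches() requires the whole codon to fit the pattern.
                if (e.getValue().matcher(codon).matches()) {
                    counts.merge(e.getKey(), 1, Integer::sum);
                    break; // a codon maps to at most one amino acid
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical input, read as gct | gga | gca | ttt.
        System.out.println(count("gctggagcattt")); // prints {Ala=2, Gly=1}
    }
}
```

Stepping through the string in threes keeps the count aligned to the reading frame, so a codon that only appears across a boundary between two neighbouring codons is not counted.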
Edit:
Wow, I didn't notice the boolean :) +1 to @Tim