Freeswitch Pocketsphinx does not recognize me

Time: 2020-12-19 09:00:55

Today I need help with Pocketsphinx speech recognition, which I use in Freeswitch. There is a demo, the "pizza demo", which does not work because the program doesn't "hear" me.

I tried another example with a Lua script, and there too Pocketsphinx does not "hear" me.

Maybe somebody knows what is not working. Because I didn't implement anything myself, I don't know which code to paste here. If you need some code or configuration, let me know.

My idea: maybe I have to set which .dic file Pocketsphinx should use. I hope somebody can help me.

EDIT:

2014-10-14 15:13:08.923330 [NOTICE] switch_channel.c:1055 New Channel sofia/internal/1001@myip [326a4157-aa80-48d2-bd7e-db8d8afd525b]
2014-10-14 15:13:09.042378 [INFO] mod_dialplan_xml.c:558 Processing me <1001>->74992 in context default
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 Open /usr/local/freeswitch/conf/vars.xml and change the default_password.
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 Once changed type 'reloadxml' at the console.
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
2014-10-14 15:13:19.932900 [INFO] switch_core_media.c:5162 Activating RTCP PORT 4077
2014-10-14 15:13:19.932900 [NOTICE] sofia_media.c:92 Pre-Answer sofia/internal/1001@myip!
2014-10-14 15:13:19.943925 [NOTICE] fssession.cpp:1167 Channel [sofia/internal/1001@myip] has been answered
INFO: cmd_ln.c(691): Parsing command line:
\
    -samprate 8000 \
    -hmm /usr/local/freeswitch/grammar/model/communicator \
    -jsgf /usr/local/freeswitch/grammar/pizza_order.gram \
    -lw 6.5 \
    -dict /usr/local/freeswitch/grammar/default.dic \
    -frate 50 \
    -silprob 0.005

Current configuration:
[NAME]      [DEFLT]     [VALUE]
-agc        none        none
-agcthresh  2.0     2.000000e+00
-alpha      0.97        9.700000e-01
-ascale     20.0        2.000000e+01
-aw     1       1
-backtrace  no      no
-beam       1e-48       1.000000e-48
-bestpath   yes     yes
-bestpathlw 9.5     9.500000e+00
-bghist     no      no
-ceplen     13      13
-cmn        current     current
-cmninit    8.0     8.0
-compallsen no      no
-debug              0
-dict               /usr/local/freeswitch/grammar/default.dic
-dictcase   no      no
-dither     no      no
-doublebw   no      no
-ds     1       1
-fdict
-feat       1s_c_d_dd   1s_c_d_dd
-featparams
-fillprob   1e-8        1.000000e-08
-frate      100     50
-fsg
-fsgusealtpron  yes     yes
-fsgusefiller   yes     yes
-fwdflat    yes     yes
-fwdflatbeam    1e-64       1.000000e-64
-fwdflatefwid   4       4
-fwdflatlw  8.5     8.500000e+00
-fwdflatsfwin   25      25
-fwdflatwbeam   7e-29       7.000000e-29
-fwdtree    yes     yes
-hmm                /usr/local/freeswitch/grammar/model/communicator
-input_endian   little      little
-jsgf               /usr/local/freeswitch/grammar/pizza_order.gram
-kdmaxbbi   -1      -1
-kdmaxdepth 0       0
-kdtree
-latsize    5000        5000
-lda
-ldadim     0       0
-lextreedump    0       0
-lifter     0       0
-lm
-lmctl
-lmname     default     default
-logbase    1.0001      1.000100e+00
-logfn
-logspec    no      no
-lowerf     133.33334   1.333333e+02
-lpbeam     1e-40       1.000000e-40
-lponlybeam 7e-29       7.000000e-29
-lw     6.5     6.500000e+00
-maxhmmpf   -1      -1
-maxnewoov  20      20
-maxwpf     -1      -1
-mdef
-mean
-mfclogdir
-min_endfr  0       0
-mixw
-mixwfloor  0.0000001   1.000000e-07
-mllr
-mmap       yes     yes
-ncep       13      13
-nfft       512     512
-nfilt      40      40
-nwpen      1.0     1.000000e+00
-pbeam      1e-48       1.000000e-48
-pip        1.0     1.000000e+00
-pl_beam    1e-10       1.000000e-10
-pl_pbeam   1e-5        1.000000e-05
-pl_window  0       0
-rawlogdir
-remove_dc  no      no
-round_filters  yes     yes
-samprate   16000       8.000000e+03
-seed       -1      -1
-sendump
-senlogdir
-senmgau
-silprob    0.005       5.000000e-03
-smoothspec no      no
-svspec
-tmat
-tmatfloor  0.0001      1.000000e-04
-topn       4       4
-topn_beam  0       0
-toprule
-transform  legacy      legacy
-unit_area  yes     yes
-upperf     6855.4976   6.855498e+03
-usewdphones    no      no
-uw     1.0     1.000000e+00
-var
-varfloor   0.0001      1.000000e-04
-varnorm    no      no
-verbose    no      no
-warp_params
-warp_type  inverse_linear  inverse_linear
-wbeam      7e-29       7.000000e-29
-wip        0.65        6.500000e-01
-wlen       0.025625    2.562500e-02

INFO: cmd_ln.c(691): Parsing command line:
\
    -alpha 0.97 \
    -dither yes \
    -doublebw no \
    -nfilt 31 \
    -ncep 13 \
    -lowerf 200 \
    -upperf 3500 \
    -nfft 256 \
    -wlen 0.0256 \
    -transform legacy \
    -feat s2_4x \
    -agc none \
    -cmn current \
    -varnorm no

Current configuration:
[NAME]      [DEFLT]     [VALUE]
-agc        none        none
-agcthresh  2.0     2.000000e+00
-alpha      0.97        9.700000e-01
-ceplen     13      13
-cmn        current     current
-cmninit    8.0     8.0
-dither     no      yes
-doublebw   no      no
-feat       1s_c_d_dd   s2_4x
-frate      100     50
-input_endian   little      little
-lda
-ldadim     0       0
-lifter     0       0
-logspec    no      no
-lowerf     133.33334   2.000000e+02
-ncep       13      13
-nfft       512     256
-nfilt      40      31
-remove_dc  no      no
-round_filters  yes     yes
-samprate   16000       8.000000e+03
-seed       -1      -1
-smoothspec no      no
-svspec
-transform  legacy      legacy
-unit_area  yes     yes
-upperf     6855.4976   3.500000e+03
-varnorm    no      no
-verbose    no      no
-warp_params
-warp_type  inverse_linear  inverse_linear
-wlen       0.025625    2.560000e-02

INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/local/freeswitch/grammar/model/communicator/feat.params
INFO: fe_interface.c(299): You are using the internal mechanism to generate the seed.
INFO: feat.c(713): Initializing feature stream to type: 's2_4x', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: mdef.c(517): Reading model definition: /usr/local/freeswitch/grammar/model/communicator/mdef
INFO: bin_mdef.c(179): Allocating 104160 * 8 bytes (813 KiB) for CD tree
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/freeswitch/grammar/model/communicator/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/freeswitch/grammar/model/communicator/means
INFO: ms_gauden.c(292): 1 codebook, 4 feature, size:
INFO: ms_gauden.c(294):  256x12
INFO: ms_gauden.c(294):  256x24
INFO: ms_gauden.c(294):  256x3
INFO: ms_gauden.c(294):  256x12
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/freeswitch/grammar/model/communicator/variances
INFO: ms_gauden.c(292): 1 codebook, 4 feature, size:
INFO: ms_gauden.c(294):  256x12
INFO: ms_gauden.c(294):  256x24
INFO: ms_gauden.c(294):  256x3
INFO: ms_gauden.c(294):  256x12
INFO: ms_gauden.c(354): 59 variance values floored
INFO: s2_semi_mgau.c(903): Loading senones from dump file /usr/local/freeswitch/grammar/model/communicator/sendump
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(990): Rows: 256, Columns: 6256
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0 0
INFO: dict.c(317): Allocating 137549 * 32 bytes (4298 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: /usr/local/freeswitch/grammar/default.dic
INFO: dict.c(211): Allocated 1010 KiB for strings, 1664 KiB for phones
INFO: dict.c(335): 133436 words read
INFO: dict.c(341): Reading filler dictionary: /usr/local/freeswitch/grammar/model/communicator/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 17 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 51^3 * 2 bytes (259 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 62832 bytes (61 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 62832 bytes (61 KiB) for single-phone word triphones
INFO: fsg_search.c(145): FSG(beam: -1080, pbeam: -1080, wbeam: -634; wip: -26, pip: 0)
INFO: jsgf.c(581): Defined rule: <pizza_order.g00000>
INFO: jsgf.c(581): Defined rule: PUBLIC <pizza_order.delivery>
INFO: fsg_model.c(215): Computing transitive closure for null transitions
INFO: fsg_model.c(270): 9 null transitions added
INFO: fsg_model.c(421): Adding silence transitions for <sil> to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++AE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++AH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++BACKGROUND++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++BREATH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++COUGH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++EH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++ER++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++LAUGH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++MM++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++MUMBLE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++NOISE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++OH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++SMACK++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++UH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++UH_NOISE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++UM++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++UM_NOISE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_search.c(366): Added 0 alternate word transitions
INFO: fsg_lextree.c(108): Allocated 832 bytes (0 KiB) for left and right context phones
INFO: fsg_lextree.c(253): 213 HMM nodes in lextree (199 leaves)
INFO: fsg_lextree.c(255): Allocated 27264 bytes (26 KiB) for all lextree nodes
INFO: fsg_lextree.c(258): Allocated 25472 bytes (24 KiB) for lextree leafnodes
2014-10-14 15:13:25.442814 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:25.442953] SSRC[1123956418]RTT[0.001266] A[2683662693] - DLSR[22111] - LSR[2683640499]
INFO: cmn_prior.c(121): cmn_prior_update: from <  8.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 >
INFO: cmn_prior.c(139): cmn_prior_update: to   <  7.58  0.08 -0.24 -0.08 -0.24 -0.18 -0.21 -0.15 -0.06 -0.18 -0.08 -0.11 -0.11 >
INFO: fsg_search.c(1032): 86 frames, 1666 HMMs (19/fr), 6967 senones (81/fr), 886 history entries (10/fr)

INFO: fsg_search.c(1417): Start node <sil>.0:22:85
INFO: fsg_search.c(1417): Start node <sil>.0:22:55
INFO: fsg_search.c(1417): Start node <sil>.0:22:85
INFO: fsg_search.c(1417): Start node <sil>.0:22:55
INFO: fsg_search.c(1417): Start node <sil>.0:22:85
INFO: fsg_search.c(1417): Start node takeout.0:21:33
INFO: fsg_search.c(1417): Start node pickup.0:19:71
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076)
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076)
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076)
INFO: fsg_search.c(1456): End node <sil>.26:28:85 (-1180)
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201)
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201)
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201)
INFO: fsg_search.c(1680): lattice start node <s>.0 end node </s>.86
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(</s>:86:86) = -333411
INFO: ps_lattice.c(1403): Joint P(O,S) = -333414 P(S|O) = -3
2014-10-14 15:13:28.822614 [WARNING] mod_pocketsphinx.c:348 Lost the text, never mind....
2014-10-14 15:13:30.922352 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:30.922476] SSRC[1123956418]RTT[0.001648] A[2684021799] - DLSR[53573] - LSR[2683968118]
2014-10-14 15:13:36.403317 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:36.403451] SSRC[1123956418]RTT[0.002731] A[2684381000] - DLSR[85028] - LSR[2684295793]
INFO: fsg_search.c(1032): 149 frames, 1750 HMMs (11/fr), 8700 senones (58/fr), 1006 history entries (6/fr)

INFO: fsg_search.c(1417): Start node <sil>.0:2:90
INFO: fsg_search.c(1417): Start node <sil>.0:2:90
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955)
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955)
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955)
INFO: fsg_search.c(1456): End node pickup.87:107:148 (-4233)
INFO: fsg_search.c(1680): lattice start node <s>.0 end node </s>.149
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(</s>:149:149) = -927641
INFO: ps_lattice.c(1403): Joint P(O,S) = -927641 P(S|O) = 0
2014-10-14 15:13:41.883453 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:41.883618] SSRC[1123956418]RTT[0.002487] A[2684740148] - DLSR[116488] - LSR[2684623497]
2014-10-14 15:13:44.732381 [NOTICE] sofia.c:952 Hangup sofia/internal/1001@myip [CS_EXECUTE] [NORMAL_CLEARING]
2014-10-14 15:13:44.732381 [ERR] SpeechTools.jm:368 Exception: Session is not active! (near: "          rv = this.asr.session.collectInput(this.asr.onInput, this.asr, 500);")
INFO: fsg_search.c(1032): 33 frames, 377 HMMs (11/fr), 1733 senones (52/fr), 275 history entries (8/fr)

2014-10-14 15:13:44.802526 [INFO] mod_pocketsphinx.c:257 Port Closed.
2014-10-14 15:13:44.823711 [NOTICE] switch_core_session.c:1633 Session 25 (sofia/internal/1001@myip) Ended
2014-10-14 15:13:44.823711 [NOTICE] switch_core_session.c:1637 Close Channel sofia/internal/1001@myip [CS_DESTROY]

EDIT 2:

I found out that the speech recognition works and detects my speech. The problem is that in SpeechTools.jm the result cannot be loaded from the XML and is undefined.

body = body.replace(/<\?.*?\?>/g, '');
console_log("debug", "----XML:\n" + body + "\n");
xml = new XML("<xml>" + body + "</xml>");
result = xml.result; //undefined

And my output from console_log:

<result grammar="pizza_order">
  <interpretation grammar="pizza_order" confidence="100">
    <input mode="speech">pickup</input>
  </interpretation>
</result>

1 solution

#1


2  

Okay, the speech recognition was working the whole time (see the edit). The real problem is that the whole script (SpeechTools.jm) does not work: they switched from the Mozilla JavaScript engine to Google V8 without updating the script. However, fixing the script is a JavaScript problem and no longer has anything to do with this question.
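
For readers who hit the same wall: `new XML(...)` is E4X, an extension that the Mozilla engine supported and V8 does not, which would explain why `xml.result` comes back undefined. What follows is only a minimal sketch of one possible workaround, not the actual fix to SpeechTools.jm. It assumes the result string always has the simple shape shown in the question, and `parseAsrResult` is a hypothetical helper name, not a FreeSWITCH function. It pulls the recognized text and the confidence out with regular expressions instead of an XML object.

```javascript
// Hypothetical helper, not part of SpeechTools.jm: extract the recognized
// text and confidence from the ASR result string without relying on E4X.
function parseAsrResult(body) {
  // Drop any XML declaration, as the original snippet does.
  var cleaned = String(body).replace(/<\?.*?\?>/g, '');

  // Confidence is an attribute on the <interpretation> element.
  var conf = /<interpretation\b[^>]*\bconfidence="(\d+)"/.exec(cleaned);
  // The recognized phrase is the text content of the <input> element.
  var input = /<input\b[^>]*>([\s\S]*?)<\/input>/.exec(cleaned);

  if (!input) {
    return null; // nothing recognizable in this string
  }
  return {
    text: input[1].trim(),
    confidence: conf ? parseInt(conf[1], 10) : null
  };
}

// Sample taken from the console_log output in the question; the XML
// declaration on the first line is added here only to exercise the stripping.
var sample =
  '<?xml version="1.0"?>\n' +
  '<result grammar="pizza_order">\n' +
  '  <interpretation grammar="pizza_order" confidence="100">\n' +
  '    <input mode="speech">pickup</input>\n' +
  '  </interpretation>\n' +
  '</result>';

var parsed = parseAsrResult(sample);
```

With the sample above, `parsed.text` holds the recognized word and `parsed.confidence` the numeric confidence. Regular expressions are good enough for this flat, machine-generated format, but they are not a general XML parser; a result with nested or escaped content would need something more robust.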
