Today I need help with the Pocketsphinx speech recognition that I use in FreeSWITCH. There is a demo, the "pizza demo", which does not work because the program doesn't "hear" me.
I tried another example with a Lua script, and there too Pocketsphinx does not "hear" me.
Maybe somebody knows what is not working. Because I didn't implement anything myself, I don't know which code to paste here. If you need some code or configuration, let me know.
My idea: maybe I must set which .dic file Pocketsphinx should use. I hope somebody can help me.
EDIT:
2014-10-14 15:13:08.923330 [NOTICE] switch_channel.c:1055 New Channel sofia/internal/1001@myip [326a4157-aa80-48d2-bd7e-db8d8afd525b]
2014-10-14 15:13:09.042378 [INFO] mod_dialplan_xml.c:558 Processing me <1001>->74992 in context default
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 Open /usr/local/freeswitch/conf/vars.xml and change the default_password.
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 Once changed type 'reloadxml' at the console.
2014-10-14 15:13:09.042378 [CRIT] mod_dptools.c:1628 WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
2014-10-14 15:13:19.932900 [INFO] switch_core_media.c:5162 Activating RTCP PORT 4077
2014-10-14 15:13:19.932900 [NOTICE] sofia_media.c:92 Pre-Answer sofia/internal/1001@myip!
2014-10-14 15:13:19.943925 [NOTICE] fssession.cpp:1167 Channel [sofia/internal/1001@myip] has been answered
INFO: cmd_ln.c(691): Parsing command line:
\
-samprate 8000 \
-hmm /usr/local/freeswitch/grammar/model/communicator \
-jsgf /usr/local/freeswitch/grammar/pizza_order.gram \
-lw 6.5 \
-dict /usr/local/freeswitch/grammar/default.dic \
-frate 50 \
-silprob 0.005
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /usr/local/freeswitch/grammar/default.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 50
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /usr/local/freeswitch/grammar/model/communicator
-input_endian little little
-jsgf /usr/local/freeswitch/grammar/pizza_order.gram
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 8.000000e+03
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\
-alpha 0.97 \
-dither yes \
-doublebw no \
-nfilt 31 \
-ncep 13 \
-lowerf 200 \
-upperf 3500 \
-nfft 256 \
-wlen 0.0256 \
-transform legacy \
-feat s2_4x \
-agc none \
-cmn current \
-varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no yes
-doublebw no no
-feat 1s_c_d_dd s2_4x
-frate 100 50
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 2.000000e+02
-ncep 13 13
-nfft 512 256
-nfilt 40 31
-remove_dc no no
-round_filters yes yes
-samprate 16000 8.000000e+03
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 3.500000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/local/freeswitch/grammar/model/communicator/feat.params
INFO: fe_interface.c(299): You are using the internal mechanism to generate the seed.
INFO: feat.c(713): Initializing feature stream to type: 's2_4x', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: mdef.c(517): Reading model definition: /usr/local/freeswitch/grammar/model/communicator/mdef
INFO: bin_mdef.c(179): Allocating 104160 * 8 bytes (813 KiB) for CD tree
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/freeswitch/grammar/model/communicator/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/freeswitch/grammar/model/communicator/means
INFO: ms_gauden.c(292): 1 codebook, 4 feature, size:
INFO: ms_gauden.c(294): 256x12
INFO: ms_gauden.c(294): 256x24
INFO: ms_gauden.c(294): 256x3
INFO: ms_gauden.c(294): 256x12
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/freeswitch/grammar/model/communicator/variances
INFO: ms_gauden.c(292): 1 codebook, 4 feature, size:
INFO: ms_gauden.c(294): 256x12
INFO: ms_gauden.c(294): 256x24
INFO: ms_gauden.c(294): 256x3
INFO: ms_gauden.c(294): 256x12
INFO: ms_gauden.c(354): 59 variance values floored
INFO: s2_semi_mgau.c(903): Loading senones from dump file /usr/local/freeswitch/grammar/model/communicator/sendump
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(990): Rows: 256, Columns: 6256
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0 0
INFO: dict.c(317): Allocating 137549 * 32 bytes (4298 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: /usr/local/freeswitch/grammar/default.dic
INFO: dict.c(211): Allocated 1010 KiB for strings, 1664 KiB for phones
INFO: dict.c(335): 133436 words read
INFO: dict.c(341): Reading filler dictionary: /usr/local/freeswitch/grammar/model/communicator/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 17 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 51^3 * 2 bytes (259 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 62832 bytes (61 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 62832 bytes (61 KiB) for single-phone word triphones
INFO: fsg_search.c(145): FSG(beam: -1080, pbeam: -1080, wbeam: -634; wip: -26, pip: 0)
INFO: jsgf.c(581): Defined rule: <pizza_order.g00000>
INFO: jsgf.c(581): Defined rule: PUBLIC <pizza_order.delivery>
INFO: fsg_model.c(215): Computing transitive closure for null transitions
INFO: fsg_model.c(270): 9 null transitions added
INFO: fsg_model.c(421): Adding silence transitions for <sil> to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++AE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++AH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++BACKGROUND++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++BREATH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++COUGH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++EH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++ER++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++LAUGH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++MM++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++MUMBLE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++NOISE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++OH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++SMACK++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++UH++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++UH_NOISE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++UM++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_model.c(421): Adding silence transitions for ++UM_NOISE++ to FSG
INFO: fsg_model.c(441): Added 8 silence word transitions
INFO: fsg_search.c(366): Added 0 alternate word transitions
INFO: fsg_lextree.c(108): Allocated 832 bytes (0 KiB) for left and right context phones
INFO: fsg_lextree.c(253): 213 HMM nodes in lextree (199 leaves)
INFO: fsg_lextree.c(255): Allocated 27264 bytes (26 KiB) for all lextree nodes
INFO: fsg_lextree.c(258): Allocated 25472 bytes (24 KiB) for lextree leafnodes
2014-10-14 15:13:25.442814 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:25.442953] SSRC[1123956418]RTT[0.001266] A[2683662693] - DLSR[22111] - LSR[2683640499]
INFO: cmn_prior.c(121): cmn_prior_update: from < 8.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
INFO: cmn_prior.c(139): cmn_prior_update: to < 7.58 0.08 -0.24 -0.08 -0.24 -0.18 -0.21 -0.15 -0.06 -0.18 -0.08 -0.11 -0.11 >
INFO: fsg_search.c(1032): 86 frames, 1666 HMMs (19/fr), 6967 senones (81/fr), 886 history entries (10/fr)
INFO: fsg_search.c(1417): Start node <sil>.0:22:85
INFO: fsg_search.c(1417): Start node <sil>.0:22:55
INFO: fsg_search.c(1417): Start node <sil>.0:22:85
INFO: fsg_search.c(1417): Start node <sil>.0:22:55
INFO: fsg_search.c(1417): Start node <sil>.0:22:85
INFO: fsg_search.c(1417): Start node takeout.0:21:33
INFO: fsg_search.c(1417): Start node pickup.0:19:71
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076)
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076)
INFO: fsg_search.c(1456): End node <sil>.56:58:85 (-1076)
INFO: fsg_search.c(1456): End node <sil>.26:28:85 (-1180)
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201)
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201)
INFO: fsg_search.c(1456): End node <sil>.0:22:85 (-6201)
INFO: fsg_search.c(1680): lattice start node <s>.0 end node </s>.86
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(</s>:86:86) = -333411
INFO: ps_lattice.c(1403): Joint P(O,S) = -333414 P(S|O) = -3
2014-10-14 15:13:28.822614 [WARNING] mod_pocketsphinx.c:348 Lost the text, never mind....
2014-10-14 15:13:30.922352 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:30.922476] SSRC[1123956418]RTT[0.001648] A[2684021799] - DLSR[53573] - LSR[2683968118]
2014-10-14 15:13:36.403317 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:36.403451] SSRC[1123956418]RTT[0.002731] A[2684381000] - DLSR[85028] - LSR[2684295793]
INFO: fsg_search.c(1032): 149 frames, 1750 HMMs (11/fr), 8700 senones (58/fr), 1006 history entries (6/fr)
INFO: fsg_search.c(1417): Start node <sil>.0:2:90
INFO: fsg_search.c(1417): Start node <sil>.0:2:90
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955)
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955)
INFO: fsg_search.c(1456): End node <sil>.122:124:148 (-955)
INFO: fsg_search.c(1456): End node pickup.87:107:148 (-4233)
INFO: fsg_search.c(1680): lattice start node <s>.0 end node </s>.149
INFO: ps_lattice.c(1365): Normalizer P(O) = alpha(</s>:149:149) = -927641
INFO: ps_lattice.c(1403): Joint P(O,S) = -927641 P(S|O) = 0
2014-10-14 15:13:41.883453 [NOTICE] switch_rtp.c:5132 Receiving an RTCP packet[2014-14-09 13:13:41.883618] SSRC[1123956418]RTT[0.002487] A[2684740148] - DLSR[116488] - LSR[2684623497]
2014-10-14 15:13:44.732381 [NOTICE] sofia.c:952 Hangup sofia/internal/1001@myip [CS_EXECUTE] [NORMAL_CLEARING]
2014-10-14 15:13:44.732381 [ERR] SpeechTools.jm:368 Exception: Session is not active! (near: " rv = this.asr.session.collectInput(this.asr.onInput, this.asr, 500);")
INFO: fsg_search.c(1032): 33 frames, 377 HMMs (11/fr), 1733 senones (52/fr), 275 history entries (8/fr)
2014-10-14 15:13:44.802526 [INFO] mod_pocketsphinx.c:257 Port Closed.
2014-10-14 15:13:44.823711 [NOTICE] switch_core_session.c:1633 Session 25 (sofia/internal/1001@myip) Ended
2014-10-14 15:13:44.823711 [NOTICE] switch_core_session.c:1637 Close Channel sofia/internal/1001@myip [CS_DESTROY]
EDIT 2:
I found out that the speech recognition works and detects my speech. The problem is that in SpeechTools.jm the result cannot be loaded from the XML and is undefined:
body = body.replace(/<\?.*?\?>/g, '');
console_log("debug", "----XML:\n" + body + "\n");
xml = new XML("<xml>" + body + "</xml>");
result = xml.result; //undefined
And this is my output from console_log:
<result grammar="pizza_order">
<interpretation grammar="pizza_order" confidence="100">
<input mode="speech">pickup</input>
</interpretation>
</result>
1 Answer
Okay, the speech recognition was working the whole time (see the edit). The real problem is that the whole script (SpeechTools.jm) is not working. They switched from the Mozilla JavaScript engine to Google V8 without updating the script. However, fixing the script is a JavaScript problem and has nothing to do with this question anymore.
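For anyone who lands here with the same symptom, here is a minimal sketch of one way to read the result without relying on the engine's XML support. It assumes the failure comes from `new XML(...)` and `xml.result` being E4X syntax, a SpiderMonkey extension that V8 does not implement. `parseAsrResult` is a hypothetical helper name and is not part of SpeechTools.jm. It only handles the simple result shape shown in the question.

```javascript
// Sketch only: extract the recognized text and confidence from the ASR
// result string with regular expressions instead of E4X.
// parseAsrResult is a hypothetical helper, not part of SpeechTools.jm.
function parseAsrResult(body) {
  // Drop the XML declaration, as the original script does.
  var xml = body.replace(/<\?.*?\?>/g, '');

  // Text between <input ...> and </input>.
  var input = /<input\b[^>]*>([\s\S]*?)<\/input>/.exec(xml);
  if (!input) {
    return null; // no <input> element, so nothing was recognized
  }

  // The confidence attribute on <interpretation>, if present.
  var conf = /<interpretation\b[^>]*\bconfidence="(\d+)"/.exec(xml);

  return {
    text: input[1].replace(/^\s+|\s+$/g, ''), // trim surrounding whitespace
    confidence: conf ? parseInt(conf[1], 10) : null
  };
}
```

Called with the XML from the question, this returns an object whose `text` is the recognized word and whose `confidence` is a number. A regular expression is not a general XML parser, so this is only reasonable because the result format is small and fixed.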