So Python is acting like it can't hear ANYTHING from my microphone at all.
Here's the problem. I have a Python (2.7) script that is supposed to use GStreamer to access my microphone and do speech recognition for me via Pocketsphinx. I'm using PulseAudio and my device is a Raspberry Pi. My microphone is a PlayStation 3 Eye.
Right off the bat: I have already gotten pocketsphinx_continuous to run correctly and recognize the words I have defined in my .dict and .lm files. Accuracy is around 85-90% over the couple of trial runs I've done. So I know my microphone is picking up sound normally via pocketsphinx + PulseAudio.
FYI I ran the following:
pocketsphinx_continuous -lm /home/pi/dev/scarlettPi/config/speech/lm/scarlett.lm -dict /home/pi/dev/scarlettPi/config/speech/dict/scarlett.dic -hmm /home/pi/dev/scarlettPi/config/speech/model/hmm/en_US/hub4wsj_sc_8k -silprob 0.1 -wip 1e-4 -bestpath 0
In my Python code I'm attempting to do the same thing, but I'm using GStreamer to access the microphone from Python. (Note: I'm a bit new to Python.)
Here is my code (thanks Josip Lisec for getting me this far):
import pi
from pi.becore import ScarlettConfig
from recorder import Recorder
from brain import Brain

import os
import json
import tempfile
#import sys

import pygtk
pygtk.require('2.0')
import gtk
import gobject
import pygst
pygst.require('0.10')
gobject.threads_init()
import gst

scarlett_config = ScarlettConfig()


class Listener:

    def __init__(self, gobject, gst):
        self.failed = 0
        # pulsesrc -> audioconvert -> audioresample -> vader (voice activity
        # detection) -> pocketsphinx -> fakesink
        self.pipeline = gst.parse_launch(' ! '.join(['pulsesrc',
                                                     'audioconvert',
                                                     'audioresample',
                                                     'vader name=vader auto-threshold=true',
                                                     'pocketsphinx lm=' + scarlett_config.get('LM') + ' dict=' + scarlett_config.get('DICT') + ' hmm=' + scarlett_config.get('HMM') + ' name=listener',
                                                     'fakesink']))

        listener = self.pipeline.get_by_name('listener')
        listener.connect('result', self.__result__)
        listener.set_property('configured', True)

        print "KEYWORDS WE'RE LOOKING FOR: " + scarlett_config.get('ourkeywords')

        bus = self.pipeline.get_bus()
        bus.add_signal_watch()
        bus.connect('message::application', self.__application_message__)
        self.pipeline.set_state(gst.STATE_PLAYING)

    def result(self, hyp, uttid):
        # Called with each hypothesis forwarded from the bus
        if hyp in scarlett_config.get('ourkeywords'):
            self.failed = 0
            self.listen()
        else:
            self.failed += 1
            if self.failed > 4:
                pi.speak("" + scarlett_config.get('scarlett_owner') + ", if you need me, just say my name.")
                self.failed = 0

    def listen(self):
        self.pipeline.set_state(gst.STATE_PAUSED)
        pi.play('pi-listening')
        Recorder(self)

    def cancel_listening(self):
        pi.play('pi-cancel')
        self.pipeline.set_state(gst.STATE_PLAYING)

    # question - sound recording
    def answer(self, question):
        pi.play('pi-cancel')
        print " * Contacting Google"
        destf = tempfile.mktemp(suffix='piresult')
        os.system('wget --post-file %s --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7" --header="Content-Type: audio/x-flac; rate=16000" -O %s -q "https://www.google.com/speech-api/v1/recognize?client=chromium&lang=en-US"' % (question, destf))
        #os.system("speech2text %s > %s" % (question, destf))
        b = open(destf)
        result = b.read()
        b.close()
        os.unlink(question)
        os.unlink(destf)
        if len(result) == 0:
            print " * nop"
            pi.play('pi-cancel')
        else:
            brain = Brain(json.loads(result))
            if brain.think() == False:
                print " * nop2"
                pi.play('pi-cancel')
        self.pipeline.set_state(gst.STATE_PLAYING)

    def __result__(self, listener, text, uttid):
        # The 'result' signal is emitted from a streaming thread, so repost
        # it to the bus as an application message for the main thread
        struct = gst.Structure('result')
        struct.set_value('hyp', text)
        struct.set_value('uttid', uttid)
        listener.post_message(gst.message_new_application(listener, struct))

    def __application_message__(self, bus, msg):
        msgtype = msg.structure.get_name()
        if msgtype == 'result':
            self.result(msg.structure['hyp'], msg.structure['uttid'])
The application is supposed to match on the keyword "Scarlett" and then perform an action after that.
When I run my application, I get the following output:
pi@scarlettpi ~/dev/scarlettPi/scripts/pi/bin $ ./pi
/usr/lib/python2.7/dist-packages/gtk-2.0/gtk/__init__.py:57: GtkWarning: could not open display
warnings.warn(str(e), _gtk.Warning)
INFO: cmd_ln.c(691): Parsing command line:
gst-pocketsphinx \
-samprate 8000 \
-cmn prior \
-fwdflat no \
-bestpath no \
-maxhmmpf 2000 \
-maxwpf 20
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath no no
-bestpathlw 9.5 9.500000e+00
-bghist no no
-ceplen 13 13
-cmn current prior
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes no
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf -1 2000
-maxnewoov 20 20
-maxwpf -1 20
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 8.000000e+03
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.1 1.000000e-01
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 1e-4 1.000000e-04
-wlen 0.025625 2.562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\
-nfilt 20 \
-lowerf 1 \
-upperf 4000 \
-wlen 0.025 \
-transform dct \
-round_filters no \
-remove_dc yes \
-svspec 0-12/13-25/26-38 \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-cmninit 56,-3,1 \
-varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 56,-3,1
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.000000e+00
-ncep 13 13
-nfft 512 512
-nfilt 40 20
-remove_dc no yes
-round_filters yes no
-samprate 16000 8.000000e+03
-seed -1 -1
-smoothspec no no
-svspec 0-12/13-25/26-38
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 4.000000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.500000e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(167): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(517): Reading model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: mdef.c(528): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/mdef
INFO: bin_mdef.c(513): 50 CI-phone, 143047 CD-phone, 3 emitstate/phone, 150 CI-sen, 5150 Sen, 27135 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(903): Loading senones from dump file /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/sendump
INFO: s2_semi_mgau.c(927): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(1022): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1296): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: dict.c(317): Allocating 4120 * 20 bytes (80 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: /home/pi/dev/scarlettPi/config/speech/dict/scarlett.dic
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(335): 13 words read
INFO: dict.c(341): Reading filler dictionary: /usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k/noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 11 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 50^3 * 2 bytes (244 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 30200 bytes (29 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 30200 bytes (29 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(477): ngrams 1=12, 2=18, 3=17
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(516): 12 = #unigrams created
INFO: ngram_model_arpa.c(195): Reading bigrams
INFO: ngram_model_arpa.c(533): 18 = #bigrams created
INFO: ngram_model_arpa.c(534): 3 = #prob2 entries
INFO: ngram_model_arpa.c(542): 3 = #bo_wt2 entries
INFO: ngram_model_arpa.c(292): Reading trigrams
INFO: ngram_model_arpa.c(555): 17 = #trigrams created
INFO: ngram_model_arpa.c(556): 2 = #prob3 entries
INFO: ngram_search_fwdtree.c(99): 12 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 12 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 12 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 152
INFO: ngram_search_fwdtree.c(338): after: 12 root, 24 non-root channels, 11 single-phone words
KEYWORDS WE'RE LOOKING FOR: [ 'scarlett', 'SCARLETT' ]
But it fails to match on anything. I almost think Python cannot hear anything from the microphone; there aren't even any attempts to recognize anything. pocketsphinx_continuous usually prints out a READY state when it's prepared to start listening... should I expect the same in Python?
Here are my Python packages:
pi@scarlettpi ~/dev/scarlettPi/scripts/pi/bin $ dpkg -l | grep -i python
ii idle 2.7.3-4 all IDE for Python using Tkinter (default version)
ii idle-python2.7 2.7.3-6 all IDE for Python (v2.7) using Tkinter
rc idle3 3.2.3-6 all IDE for Python using Tkinter (default version)
ii libpyside1.1:armhf 1.1.1-3 armhf Python bindings for Qt 4 (base files)
ii libpython2.6 2.6.8-1.1 armhf Shared Python runtime library (version 2.6)
ii libpython2.7 2.7.3-6 armhf Shared Python runtime library (version 2.7)
ii libshiboken1.1:armhf 1.1.1-1 armhf CPython bindings generator for C++ libraries - shared library
ii python 2.7.3-4 all interactive high-level object-oriented language (default version)
ii python-alsaaudio 0.5+svn36-1 armhf Alsa bindings for Python
ii python-cairo 1.8.8-1 armhf Python bindings for the Cairo vector graphics library
ii python-dbg 2.7.3-4 all debug build of the Python Interpreter (version 2.7)
ii python-dbus 1.1.1-1 armhf simple interprocess messaging system (Python interface)
ii python-dbus-dev 1.1.1-1 all main loop integration development files for python-dbus
ii python-dev 2.7.3-4 all header files and a static library for Python (default)
ii python-gi 3.2.2-2 armhf Python 2.x bindings for gobject-introspection libraries
ii python-gi-dbg 3.2.2-2 armhf Python bindings for the GObject library (debug extension)
ii python-gi-dev 3.2.2-2 all development headers for GObject Python bindings
ii python-gobject 3.2.2-2 all Python 2.x bindings for GObject - transitional package
ii python-gobject-2 2.28.6-10 armhf deprecated static Python bindings for the GObject library
ii python-gobject-2-dbg 2.28.6-10 armhf deprecated static Python bindings for the GObject library (debug extension)
ii python-gobject-2-dev 2.28.6-10 all development headers for the static GObject Python bindings
ii python-gobject-dbg 3.2.2-2 all Python 2.x debugging modules for GObject - transitional package
ii python-gobject-dev 3.2.2-2 all Python 2.x development headers for GObject - transitional package
ii python-gst0.10 0.10.22-3 armhf generic media-playing framework (Python bindings)
ii python-gst0.10-dbg 0.10.22-3 armhf generic media-playing framework (Python debug bindings)
ii python-gst0.10-dev 0.10.22-3 armhf generic media-playing framework (Python bindings)
ii python-gst0.10-rtsp 0.10.8-3 armhf GStreamer RTSP server plugin (Python bindings)
ii python-gtk2 2.24.0-3 armhf Python bindings for the GTK+ widget set
ii python-iplib 1.1-3 all Python library to convert amongst many different IPv4 notations
ii python-libxml2 2.8.0+dfsg1-7+nmu1 armhf Python bindings for the GNOME XML library
ii python-minimal 2.7.3-4 all minimal subset of the Python language (default version)
ii python-numpy 1:1.6.2-1.2 armhf Numerical Python adds a fast array facility to the Python language
ii python-pexpect 2.4-1 all Python module for automating interactive applications
ii python-pip 1.1-3 all alternative Python package installer
ii python-pkg-resources 0.6.24-1 all Package Discovery and Resource Access using pkg_resources
ii python-pyalsa 1.0.25-1 armhf Official ALSA Python binding library
ii python-pyside 1.1.1-3 all Python bindings for Qt4 (big metapackage)
ii python-pyside.phonon 1.1.1-3 armhf Qt 4 Phonon module - Python bindings
ii python-pyside.qtcore 1.1.1-3 armhf Qt 4 core module - Python bindings
ii python-pyside.qtdeclarative 1.1.1-3 armhf Qt 4 Declarative module - Python bindings
ii python-pyside.qtgui 1.1.1-3 armhf Qt 4 GUI module - Python bindings
ii python-pyside.qthelp 1.1.1-3 armhf Qt 4 help module - Python bindings
ii python-pyside.qtnetwork 1.1.1-3 armhf Qt 4 network module - Python bindings
ii python-pyside.qtopengl 1.1.1-3 armhf Qt 4 OpenGL module - Python bindings
ii python-pyside.qtscript 1.1.1-3 armhf Qt 4 script module - Python bindings
ii python-pyside.qtsql 1.1.1-3 armhf Qt 4 SQL module - Python bindings
ii python-pyside.qtsvg 1.1.1-3 armhf Qt 4 SVG module - Python bindings
ii python-pyside.qttest 1.1.1-3 armhf Qt 4 test module - Python bindings
ii python-pyside.qtuitools 1.1.1-3 armhf Qt 4 UI tools module - Python bindings
ii python-pyside.qtwebkit 1.1.1-3 armhf Qt 4 WebKit module - Python bindings
ii python-pyside.qtxml 1.1.1-3 armhf Qt 4 XML module - Python bindings
ii python-rpi.gpio 0.5.3a-1 armhf Python GPIO module for Raspberry Pi
ii python-setuptools 0.6.24-1 all Python Distutils Enhancements (setuptools compatibility)
ii python-simplejson 2.5.2-1 armhf simple, fast, extensible JSON encoder/decoder for Python
ii python-support 1.0.15 all automated rebuilding support for Python modules
ii python-tk 2.7.3-1 armhf Tkinter - Writing Tk applications with Python
ii python-yaml 3.10-4 armhf YAML parser and emitter for Python
ii python-yaml-dbg 3.10-4 armhf YAML parser and emitter for Python (debug build)
ii python2.6 2.6.8-1.1 armhf Interactive high-level object-oriented language (version 2.6)
ii python2.6-minimal 2.6.8-1.1 armhf Minimal subset of the Python language (version 2.6)
ii python2.7 2.7.3-6 armhf Interactive high-level object-oriented language (version 2.7)
ii python2.7-dbg 2.7.3-6 armhf Debug Build of the Python Interpreter (version 2.7)
ii python2.7-dev 2.7.3-6 armhf Header files and a static library for Python (v2.7)
ii python2.7-minimal 2.7.3-6 armhf Minimal subset of the Python language (version 2.7)
pi@scarlettpi ~/dev/scarlettPi/scripts/pi/bin $
Also, just to confirm that pocketsphinx is compiled against the right libraries:
pi@scarlettpi ~ $ ldd /usr/local/bin/pocketsphinx_continuous
/usr/lib/arm-linux-gnueabihf/libcofi_rpi.so (0xb6f9b000)
libpocketsphinx.so.1 => /usr/local/lib/libpocketsphinx.so.1 (0xb6f5a000)
libsphinxad.so.0 => /usr/local/lib/libsphinxad.so.0 (0xb6f4e000)
libsphinxbase.so.1 => /usr/local/lib/libsphinxbase.so.1 (0xb6f07000)
libpulse.so.0 => /usr/lib/arm-linux-gnueabihf/libpulse.so.0 (0xb6ea8000)
libpulse-simple.so.0 => /usr/lib/arm-linux-gnueabihf/libpulse-simple.so.0 (0xb6e9c000)
libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0xb6e7d000)
libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0xb6e0c000)
libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0xb6cdd000)
libjson.so.0 => /lib/arm-linux-gnueabihf/libjson.so.0 (0xb6ccd000)
libpulsecommon-2.0.so => /usr/lib/arm-linux-gnueabihf/pulseaudio/libpulsecommon-2.0.so (0xb6c6b000)
libdbus-1.so.3 => /lib/arm-linux-gnueabihf/libdbus-1.so.3 (0xb6c29000)
libcap.so.2 => /lib/arm-linux-gnueabihf/libcap.so.2 (0xb6c1e000)
librt.so.1 => /lib/arm-linux-gnueabihf/librt.so.1 (0xb6c0f000)
libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0xb6c04000)
libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0xb6bdb000)
/lib/ld-linux-armhf.so.3 (0xb6fa8000)
libX11-xcb.so.1 => /usr/lib/arm-linux-gnueabihf/libX11-xcb.so.1 (0xb6bd2000)
libX11.so.6 => /usr/lib/arm-linux-gnueabihf/libX11.so.6 (0xb6abe000)
libxcb.so.1 => /usr/lib/arm-linux-gnueabihf/libxcb.so.1 (0xb6a9f000)
libICE.so.6 => /usr/lib/arm-linux-gnueabihf/libICE.so.6 (0xb6a82000)
libSM.so.6 => /usr/lib/arm-linux-gnueabihf/libSM.so.6 (0xb6a73000)
libXtst.so.6 => /usr/lib/arm-linux-gnueabihf/libXtst.so.6 (0xb6a67000)
libwrap.so.0 => /lib/arm-linux-gnueabihf/libwrap.so.0 (0xb6a57000)
libsndfile.so.1 => /usr/lib/arm-linux-gnueabihf/libsndfile.so.1 (0xb69ee000)
libasyncns.so.0 => /usr/lib/arm-linux-gnueabihf/libasyncns.so.0 (0xb69e2000)
libattr.so.1 => /lib/arm-linux-gnueabihf/libattr.so.1 (0xb69d4000)
libXau.so.6 => /usr/lib/arm-linux-gnueabihf/libXau.so.6 (0xb69ca000)
libXdmcp.so.6 => /usr/lib/arm-linux-gnueabihf/libXdmcp.so.6 (0xb69be000)
libuuid.so.1 => /lib/arm-linux-gnueabihf/libuuid.so.1 (0xb69b1000)
libXext.so.6 => /usr/lib/arm-linux-gnueabihf/libXext.so.6 (0xb699b000)
libXi.so.6 => /usr/lib/arm-linux-gnueabihf/libXi.so.6 (0xb6986000)
libnsl.so.1 => /lib/arm-linux-gnueabihf/libnsl.so.1 (0xb696a000)
libFLAC.so.8 => /usr/lib/arm-linux-gnueabihf/libFLAC.so.8 (0xb691f000)
libvorbisenc.so.2 => /usr/lib/arm-linux-gnueabihf/libvorbisenc.so.2 (0xb67b2000)
libvorbis.so.0 => /usr/lib/arm-linux-gnueabihf/libvorbis.so.0 (0xb6782000)
libogg.so.0 => /usr/lib/arm-linux-gnueabihf/libogg.so.0 (0xb6775000)
libresolv.so.2 => /lib/arm-linux-gnueabihf/libresolv.so.2 (0xb6761000)
pi@scarlettpi ~ $
And if you need to see any information about my microphone (PS3 Eye):
I had to throw this in a pastebin; I ran out of room in this post.
http://pastebin.com/gSDZwRHc
Does anyone have any ideas why this isn't working? Please let me know if my question needs any clarification or if I can provide any more information to aid with debugging.
Thanks.
1 Solution

#1
So I finally got this guy working.
A couple of key things I needed to realize:
1. Even if you're using PulseAudio on your Raspberry Pi, as long as ALSA is still installed you can still use it. (This might seem like a no-brainer to others, but I honestly didn't realize I could use both of these at the same time.) Hint via syb0rg. A quick sanity check is sketched right after this list.
2. When it comes to sending large amounts of raw audio data (.wav format in my case) to Pocketsphinx via GStreamer, queues are your friend.
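On point 1: the python-alsaaudio package (it's already in my dpkg list further up) can confirm that ALSA still sees the cards while PulseAudio is running. A minimal sketch, assuming nothing beyond that package being installed:

import alsaaudio

# List the ALSA cards visible to the system; the PS3 Eye should show up
# alongside the Pi's onboard bcm2835 audio even with PulseAudio running.
print alsaaudio.cards()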
After messing around with gst-launch-0.10 on the command line for a while, I came across something that actually worked:
gst-launch-0.10 alsasrc device=hw:1 ! queue ! audioconvert ! audioresample ! queue ! vader name=vader auto-threshold=true ! pocketsphinx lm=/home/pi/dev/scarlettPi/config/speech/lm/scarlett.lm dict=/home/pi/dev/scarlettPi/config/speech/dict/scarlett.dic hmm=/usr/local/share/pocketsphinx/model/hmm/en_US/hub4wsj_sc_8k name=listener ! fakesink dump=1
So what's happening here?
- GStreamer is listening to device hw:1 (which is my PS3 Eye USB device). This device might vary; you can determine yours by running:
pi@scarlettpi ~ $ pacmd dump
Welcome to PulseAudio! Use "help" for usage information.
....
load-module module-alsa-card device_id="0" name="platform-bcm2835_AUD0.0" card_name="alsa_card.platform-bcm2835_AUD0.0" namereg_fail=false tsched=yes fixed_latency_range=no ignore_dB=no deferred_volume=yes card_properties="module-udev-detect.discovered=1"
load-module module-udev-detect
load-module module-bluetooth-discover
load-module module-esound-protocol-unix
load-module module-native-protocol-unix
load-module module-gconf
load-module module-default-device-restore
load-module module-rescue-streams
load-module module-always-sink
load-module module-intended-roles
load-module module-console-kit
load-module module-systemd-login
load-module module-position-event-sounds
load-module module-role-cork
load-module module-filter-heuristics
load-module module-filter-apply
load-module module-dbus-protocol
load-module module-switch-on-port-available
load-module module-cli-protocol-unix
load-module module-alsa-card device_id="1" name="usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" card_name="alsa_card.usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" namereg_fail=false tsched=yes fixed_latency_range=no ignore_dB=no deferred_volume=yes card_properties="module-udev-detect.discovered=1"
....
The important line to notice is:
load-module module-alsa-card device_id="1" name="usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" card_name="alsa_card.usb-OmniVision_Technologies__Inc._USB_Camera-B4.09.24.1-01-CameraB409241" namereg_fail=false tsched=yes fixed_latency_range=no ignore_dB=no deferred_volume=yes card_properties="module-udev-detect.discovered=1"
That's my PlayStation 3 Eye, and it's on device_id=1, hence hw:1. (If you'd rather not hard-code the device id, see the sketch after this list of steps.)
- The audio data coming in from the PS3 Eye gets resampled, added to a GStreamer queue, and has to pass through a vader element before moving on to pocketsphinx. By passing the audio through the vader element with the auto-threshold=true flag on, GStreamer can determine the background noise level, which can be important if you have a lousy soundcard or a far-field microphone. This is how the pocketsphinx element knows when an utterance starts and ends.
- Add the regular pocketsphinx arguments to the pipeline that we already determined (here).
- Pass everything into a fakesink, since we don't need to hear anything right now; we only need pocketsphinx to listen to everything. The dump=1 flag provides more debugging information to see what's being processed and whether audio is being accepted at all.
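If you'd rather not hard-code hw:1, the device_id can also be pulled out of the pacmd dump output from Python. This is just a rough sketch; the find_usb_card helper and the 'USB_Camera' match string are my own illustrative assumptions for the PS3 Eye, so adjust them for your hardware:

import re
import subprocess

def find_usb_card(match='USB_Camera'):
    # Scan `pacmd dump` for the module-alsa-card line whose name contains
    # the given string and return the matching ALSA device string.
    dump = subprocess.check_output(['pacmd', 'dump'])
    for line in dump.splitlines():
        if 'module-alsa-card' in line and match in line:
            m = re.search(r'device_id="(\d+)"', line)
            if m:
                return 'hw:' + m.group(1)
    return None

print find_usb_card()  # prints 'hw:1' on my setup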
After getting that to run successfully, the new Python code looks like this:
self.pipeline = gst.parse_launch(' ! '.join(['alsasrc device=' + scarlett_config.gimmie('audio_input_device'),
                                             'queue',
                                             'audioconvert',
                                             'audioresample',
                                             'queue',
                                             'vader name=vader auto-threshold=true',
                                             'pocketsphinx lm=' + scarlett_config.gimmie('LM') + ' dict=' + scarlett_config.gimmie('DICT') + ' hmm=' + scarlett_config.gimmie('HMM') + ' name=listener',
                                             'fakesink dump=1']))
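For completeness, the wiring around that pipeline is basically the same as in the question above; here's a minimal sketch of it. The gobject.MainLoop() at the end is just one way to keep the script alive so the bus messages actually get delivered (my real entry script drives the loop elsewhere, so treat that part as an assumption):

# Hook up the pocketsphinx element and the bus, same as before
listener = self.pipeline.get_by_name('listener')
listener.connect('result', self.__result__)
listener.set_property('configured', True)

bus = self.pipeline.get_bus()
bus.add_signal_watch()
bus.connect('message::application', self.__application_message__)

self.pipeline.set_state(gst.STATE_PLAYING)

# Something still has to run a main loop so the signal watch fires, e.g.:
#   loop = gobject.MainLoop()
#   loop.run()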
Hope this helps someone.
NOTE: Please excuse me if my GStreamer pipeline is using excessive elements. I'm fairly new to GStreamer, and I'm open to more efficient ways of doing this.