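Speech Recognition with Accord.Audio and Baidu

For the latest source code or technical questions, join QQ group 538327407.

My open-source projects and code on GitHub: https://github.com/linbin524

Goal

Simulate WeChat voice chat by recording: hold to record, release to send the voice message, then run speech recognition on it.

Note: Baidu's speech recognition limits each clip to 60 seconds, so you have to enforce that limit yourself.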
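
The hold-to-record flow and the 60-second cap can be sketched as a small state tracker. This is only an illustration, not part of the original project: `PushToTalkSession`, `Press`, `TryAddFrame` and `Release` are hypothetical names, and the class knows nothing about Accord or the Baidu SDK.

```csharp
using System;

// Hypothetical helper: tracks one hold-to-record session and refuses
// frames that would push the recording past the 60-second limit.
public sealed class PushToTalkSession
{
    // The limit stated in the note above.
    public static readonly TimeSpan MaxDuration = TimeSpan.FromSeconds(60);

    private TimeSpan recorded = TimeSpan.Zero;

    public bool IsRecording { get; private set; }

    public TimeSpan Recorded
    {
        get { return recorded; }
    }

    // Call when the record button is pressed down.
    public void Press()
    {
        recorded = TimeSpan.Zero;
        IsRecording = true;
    }

    // Call for every captured frame. Returns true if the frame fits and
    // should be kept; returns false if not recording or if the frame
    // would exceed the cap (the session then ends itself).
    public bool TryAddFrame(TimeSpan frameDuration)
    {
        if (!IsRecording) return false;
        if (recorded + frameDuration > MaxDuration)
        {
            IsRecording = false; // cap reached: the caller should stop capturing
            return false;
        }
        recorded += frameDuration;
        return true;
    }

    // Call when the button is released. Returns true if a recording was
    // still in progress, meaning the caller should stop and send it.
    public bool Release()
    {
        bool wasRecording = IsRecording;
        IsRecording = false;
        return wasRecording;
    }
}
```

In a form like the one in this post, `Press` would be called from the record button's `MouseDown` handler, `TryAddFrame(eventArgs.Signal.Duration)` from the new-frame handler, and `Release` from `MouseUp`. That wiring is an assumption about how the pieces would be connected, not code from the original project.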

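Implementation approach

The desktop version is a C# WinForms program. Accord handles the basic audio operations such as starting and stopping a recording. Clicking the Stop button automatically calls the Baidu speech recognition API and shows the recognized text in a text box.

Notes: speech recognition needs an array microphone. Register as a Baidu developer first. For the Baidu speech recognition API, see: http://ai.baidu.com/docs#/ASR-Online-Csharp-SDK/top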

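Result preview

Implementation steps

1. Download Accord and reference it for the audio operations

Accord official site: http://accord-framework.net/intro.html

The official site ships a sample demo; my version is adapted from that sample.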

Create your own project and reference the DLLs from the package.

UI code:

using System;
using System.Drawing;
using System.IO;
using System.Windows.Forms;
using Accord.Audio;
using Accord.Audio.Formats;
using Accord.DirectSound;
using Accord.Audio.Filters;
using Baidu.Aip.API;

namespace SampleApp
{
    public partial class MainForm : Form
    {
        private MemoryStream stream;
        private IAudioSource source;
        private IAudioOutput output;
        private WaveEncoder encoder;
        private WaveDecoder decoder;

        private float[] current;

        private int frames;
        private int samples;
        private TimeSpan duration;

        /// <summary>
        /// Note: speech recognition needs an array microphone.
        /// </summary>
        public MainForm()
        {
            InitializeComponent();

            // Configure the wavechart
            chart.SimpleMode = true;
            // The line width was missing from the listing; 1 is assumed here,
            // as in the Accord recorder sample.
            chart.AddWaveform("wave", Color.Green, 1, false);

            updateButtons();

            // Application.Idle += ProcessFrame;
        }

        void ProcessFrame(object sender, EventArgs e)
        {
        }

        /// <summary>
        ///   Starts recording audio from the sound card.
        /// </summary>
        private void btnRecord_Click(object sender, EventArgs e)
        {
            // Create capture device (this is the core part)
            source = new AudioCaptureDevice()
            {
                // The numeric values were missing from the listing. 16000 Hz and
                // 1 channel follow Baidu's 8k/16k mono requirement; 4096 is the
                // frame size of the Accord recorder sample. All three are assumptions.
                DesiredFrameSize = 4096,
                SampleRate = 16000, // sample rate
                //SampleRate = 22050, // sample rate
                Channels = 1,

                // We will be reading 16-bit PCM
                Format = SampleFormat.Format16Bit
            };

            // Wire up some events
            source.NewFrame += source_NewFrame;
            source.AudioSourceError += source_AudioSourceError;

            // Create buffer for wavechart control
            current = new float[source.DesiredFrameSize];

            // Create stream to store file
            stream = new MemoryStream();
            encoder = new WaveEncoder(stream);

            // Start
            source.Start();
            updateButtons();
        }

        /// <summary>
        ///   Plays the recorded audio stream.
        /// </summary>
        private void btnPlay_Click(object sender, EventArgs e)
        {
            // First, we rewind the stream
            stream.Seek(0, SeekOrigin.Begin);

            // Then we create a decoder for it
            decoder = new WaveDecoder(stream);

            // Configure the track bar so the cursor
            // can show the proper current position
            if (trackBar1.Value < decoder.Frames)
                decoder.Seek(trackBar1.Value);
            trackBar1.Maximum = decoder.Samples;

            // Here we can create the output audio device that will be playing the recording
            output = new AudioOutputDevice(this.Handle, decoder.SampleRate, decoder.Channels);

            // Wire up some events
            output.FramePlayingStarted += output_FramePlayingStarted;
            output.NewFrameRequested += output_NewFrameRequested;
            output.Stopped += output_PlayingFinished;

            // Start playing!
            output.Play();
            updateButtons();
        }

        /// <summary>
        ///   Stops recording or playback, then runs speech recognition.
        /// </summary>
        private void btnStop_Click(object sender, EventArgs e)
        {
            // Stops both cases
            if (source != null)
            {
                // If we were recording
                source.SignalToStop();
                source.WaitForStop();
            }
            if (output != null)
            {
                // If we were playing
                output.SignalToStop();
                output.WaitForStop();
            }

            updateButtons();

            // Also zero out the buffers and screen
            Array.Clear(current, 0, current.Length);
            updateWaveform(current, current.Length);

            // Send the recorded WAV data to Baidu and show the recognized text.
            // Note: this call passes the MemoryStream, while the AsrData listing
            // later in this post takes a file path.
            SpeechAPI speechApi = new SpeechAPI();
            string result = speechApi.AsrData(stream, "wav");
            tb_result.Text = "Recognition result: " + result;
        }

        /// <summary>
        ///   Called whenever the audio source reports an error.
        /// </summary>
        private void source_AudioSourceError(object sender, AudioSourceErrorEventArgs e)
        {
            throw new Exception(e.Description);
        }

        /// <summary>
        ///   Called whenever a new input audio frame arrives.
        /// </summary>
        private void source_NewFrame(object sender, NewFrameEventArgs eventArgs)
        {
            // Show the frame in the wavechart
            eventArgs.Signal.CopyTo(current);
            updateWaveform(current, eventArgs.Signal.Length);

            // Append the frame to the WAV stream
            encoder.Encode(eventArgs.Signal);

            // Keep running totals for the length display
            duration += eventArgs.Signal.Duration;
            samples += eventArgs.Signal.Samples;
            frames += eventArgs.Signal.Length;
        }

        private void output_FramePlayingStarted(object sender, PlayFrameEventArgs e)
        {
            updateTrackbar(e.FrameIndex);

            if (e.FrameIndex + e.Count < decoder.Frames)
            {
                // Decode the frame being played so it can be drawn,
                // then restore the decoder position
                int previous = decoder.Position;
                decoder.Seek(e.FrameIndex);

                Signal s = decoder.Decode(e.Count);
                decoder.Seek(previous);

                updateWaveform(s.ToFloat(), s.Length);
            }
        }

        private void output_PlayingFinished(object sender, EventArgs e)
        {
            updateButtons();

            Array.Clear(current, 0, current.Length);
            updateWaveform(current, current.Length);
        }

        private void output_NewFrameRequested(object sender, NewFrameRequestedEventArgs e)
        {
            e.FrameIndex = decoder.Position;

            Signal signal = decoder.Decode(e.Frames);

            if (signal == null)
            {
                // Nothing left to play
                e.Stop = true;
                return;
            }

            e.Frames = signal.Length;
            signal.CopyTo(e.Buffer);
        }

        private void updateWaveform(float[] samples, int length)
        {
            // Marshal to the UI thread when called from an audio thread
            if (InvokeRequired)
            {
                BeginInvoke(new Action(() =>
                {
                    chart.UpdateWaveform("wave", samples, length);
                }));
            }
            else
            {
                // Draw the samples passed in (the listing used the `current` field here)
                chart.UpdateWaveform("wave", samples, length);
            }
        }

        private void updateTrackbar(int value)
        {
            if (InvokeRequired)
            {
                BeginInvoke(new Action(() =>
                {
                    trackBar1.Value = Math.Max(trackBar1.Minimum, Math.Min(trackBar1.Maximum, value));
                }));
            }
            else
            {
                trackBar1.Value = Math.Max(trackBar1.Minimum, Math.Min(trackBar1.Maximum, value));
            }
        }

        private void updateButtons()
        {
            if (InvokeRequired)
            {
                BeginInvoke(new Action(updateButtons));
                return;
            }

            if (source != null && source.IsRunning)
            {
                // Recording
                btnBwd.Enabled = false;
                btnFwd.Enabled = false;
                btnPlay.Enabled = false;
                btnStop.Enabled = true;
                btnRecord.Enabled = false;
                trackBar1.Enabled = false;
            }
            else if (output != null && output.IsRunning)
            {
                // Playing
                btnBwd.Enabled = false;
                btnFwd.Enabled = false;
                btnPlay.Enabled = false;
                btnStop.Enabled = true;
                btnRecord.Enabled = false;
                trackBar1.Enabled = true;
            }
            else
            {
                // Idle
                btnBwd.Enabled = false;
                btnFwd.Enabled = false;
                btnPlay.Enabled = stream != null;
                btnStop.Enabled = false;
                btnRecord.Enabled = true;
                trackBar1.Enabled = decoder != null;

                trackBar1.Value = 0;
            }
        }

        private void MainFormFormClosed(object sender, FormClosedEventArgs e)
        {
            if (source != null) source.SignalToStop();
            if (output != null) output.SignalToStop();
        }

        private void saveFileDialog1_FileOk(object sender, System.ComponentModel.CancelEventArgs e)
        {
            // Write the recorded WAV data to the chosen file
            using (Stream fileStream = saveFileDialog1.OpenFile())
            {
                stream.WriteTo(fileStream);
            }
        }

        private void saveToolStripMenuItem_Click(object sender, EventArgs e)
        {
            saveFileDialog1.ShowDialog(this);
        }

        private void updateTimer_Tick(object sender, EventArgs e)
        {
            // TotalSeconds gives the full fractional length; Seconds is only
            // the whole-seconds component (0 to 59)
            lbLength.Text = String.Format("Length: {0:00.00} sec.", duration.TotalSeconds);
        }

        private void aboutToolStripMenuItem_Click(object sender, EventArgs e)
        {
            new AboutBox().ShowDialog(this);
        }

        private void closeToolStripMenuItem_Click(object sender, EventArgs e)
        {
            Close();
        }

        private void btnIncreaseVolume_Click(object sender, EventArgs e)
        {
            adjustVolume(1.25f);
        }

        private void btnDecreaseVolume_Click(object sender, EventArgs e)
        {
            adjustVolume(0.75f);
        }

        private void adjustVolume(float value)
        {
            // Decode the whole recording
            stream.Seek(0, SeekOrigin.Begin);
            decoder = new WaveDecoder(stream);
            var signal = decoder.Decode();

            // Scale the samples in place
            var volume = new VolumeFilter(value);
            volume.ApplyInPlace(signal);

            // Re-encode over the same stream
            stream.Seek(0, SeekOrigin.Begin);
            encoder = new WaveEncoder(stream);
            encoder.Encode(signal);
        }
    }
}

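Baidu speech recognition API

Baidu already provides an SDK. The audio formats it supports are listed below.

Supported audio formats

Raw PCM recordings must use an 8 kHz or 16 kHz sample rate, 16-bit depth, and a single channel (mono). Supported formats: pcm (uncompressed), wav (uncompressed, PCM-encoded), amr (compressed).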
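
Before uploading a recording, it can help to check that its WAV header actually matches these constraints. The following is a minimal sketch that reads the standard RIFF/WAVE `fmt ` chunk; `WavFormatCheck` and `IsAcceptable` are hypothetical names, and the check covers only the wav case (not raw pcm or amr).

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: checks that a WAV stream is uncompressed PCM,
// mono, 16-bit, sampled at 8 kHz or 16 kHz.
public static class WavFormatCheck
{
    public static bool IsAcceptable(Stream wav)
    {
        long start = wav.Position;
        try
        {
            if (wav.Length < 12) return false;
            wav.Seek(0, SeekOrigin.Begin);

            using (var reader = new BinaryReader(wav, Encoding.ASCII, true))
            {
                if (ReadTag(reader) != "RIFF") return false;
                reader.ReadInt32(); // overall RIFF size, not needed here
                if (ReadTag(reader) != "WAVE") return false;

                // Walk the chunks until the "fmt " chunk is found
                while (wav.Position + 8 <= wav.Length)
                {
                    string id = ReadTag(reader);
                    int size = reader.ReadInt32();
                    if (size < 0) return false;

                    if (id == "fmt ")
                    {
                        if (size < 16 || wav.Position + 16 > wav.Length) return false;
                        short formatTag = reader.ReadInt16(); // 1 = PCM
                        short channels = reader.ReadInt16();
                        int sampleRate = reader.ReadInt32();
                        reader.ReadInt32(); // byte rate
                        reader.ReadInt16(); // block align
                        short bitsPerSample = reader.ReadInt16();

                        return formatTag == 1
                            && channels == 1
                            && bitsPerSample == 16
                            && (sampleRate == 8000 || sampleRate == 16000);
                    }

                    // Skip other chunks; chunk data is padded to an even size
                    wav.Seek(size + (size & 1), SeekOrigin.Current);
                }
            }
            return false;
        }
        finally
        {
            wav.Position = start; // leave the stream where the caller had it
        }
    }

    private static string ReadTag(BinaryReader reader)
    {
        return Encoding.ASCII.GetString(reader.ReadBytes(4));
    }
}
```

A caller could run `WavFormatCheck.IsAcceptable(stream)` on the recorded `MemoryStream` just before the recognition call and warn the user when it returns false, assuming the encoder has written a complete header by then.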

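        public string AsrData(string filePath, string format = "pcm", int rate = 16000)
        {
            // The default rate was missing from the listing; 16000 is assumed
            // (one of the two supported rates).
            // Read the whole recording and send it to Baidu in one request
            var data = File.ReadAllBytes(filePath);
            var result = _asrClient.Recognize(data, format, rate);
            return result.ToString();
        }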

Evaluation:

Recognition quality is poor with an ordinary microphone; an array microphone is needed to get usable results.
