Speech Recognition with Python


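Have you ever wondered how to get started with speech recognition using a bit of simple Python magic? Would you like to tell your computer what to search for on the web using nothing but your voice, from your own script?

First, as usual, we need some great modules that are just waiting for us to build great scripts with them. In speech recognition, one of the first things that has to happen is recognizing our voice accurately. That job is done by a class called Recognizer, found in the most popular module for this kind of work.

When we create an instance of this class, it takes on the job of recognizing speech. The class has several settings that need to be configured so that it works more effectively when recognizing speech from an audio source.

But wait, hold your horses, OK? Which package or module are we going to use in this article to get speech recognition working once and for all? Well, let me tell you right away, dear friend.

The answer is the SpeechRecognition package. Each instance of its Recognizer class has seven methods for recognizing speech from an audio source, each backed by a different API. They are: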

recognize_bing: Microsoft Bing Speech
recognize_google: Google Web Speech API (the one we will use in this article)
recognize_google_cloud: Google Cloud Speech - requires installation of the google-cloud-speech package
recognize_houndify: Houndify by SoundHound
recognize_ibm: IBM Speech to Text
recognize_sphinx: CMU Sphinx - requires installing PocketSphinx
recognize_wit: Wit.ai

Of these seven methods, only recognize_sphinx works offline, using the CMU Sphinx engine. All the others need an internet connection.

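With that cleared up, let's move on to installing the speech recognition module. Start by creating a folder that will hold our script and all the dependencies we are about to install. I called mine "speech recognition", but pick the best name you can think of, OK?

Once the folder exists, change into it and run the following command to install this magical module: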

pip install SpeechRecognition

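Besides that, since we will speak our query into the microphone and have the script look it up in the browser, we need one more dependency to make it work. Let's install it:

PyAudio: the installation process for PyAudio depends on your operating system. Ironically, the easiest install is on Windows, which I find a little odd. No Windows hate here, but you know what I mean, right?

Debian Linux: if you are on a Debian-based Linux (such as Ubuntu), you can install PyAudio with apt: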

sudo apt-get install python3-pyaudio

After that you may still need to run pip install pyaudio, especially if you are working in a virtual environment.

macOS: on macOS, first install PortAudio with Homebrew, then install PyAudio with pip:

brew install portaudio
pip install pyaudio

Windows: on Windows, you can install PyAudio directly with pip:

pip install pyaudio
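Before moving on, it can help to confirm that both packages are visible to the interpreter you will run the script with. This is a minimal sketch using only the standard library; the helper name is_installed is hypothetical, and it only checks that the modules can be located, not that the microphone works.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the interpreter can locate the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Package name on PyPI versus the module name you import in code.
for package, module in (("SpeechRecognition", "speech_recognition"),
                        ("PyAudio", "pyaudio")):
    status = "found" if is_installed(module) else "missing"
    print(f"{package} ({module}): {status}")
```

If one of them shows up as missing inside a virtual environment, the package was probably installed for a different interpreter.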

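Writing the script

Alright, folks! That's the setup done, and now it's time to put those bad boys to work, right? Let's build our code.

Open your favorite code editor and create a new file called sp_recog.py. I hope you are using VS Code on your side of the screen like I am, haha.

Time to write the first lines of code. Let's import the knights that will perform all the magic in this script by typing the following into our new file: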

# Importing the libraries that will do the magic part 🐵
import speech_recognition as sr
import webbrowser as wb

Now let's create a function to hold our whole routine.

def fn_speech_recognition():

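Next comes the microphone. It is created with the Microphone class (a class, not a method), and passing 0 as the device_index parameter selects the first microphone the system lists. The object does not open the device yet; that only happens once it is used in a with block a few steps further down.

    sr.Microphone(device_index=0)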

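If you are curious which microphones your computer has, you can list them with the following line. The position of each name in the list is the device_index to use for that microphone.

    print(f"MICs Found on this Computer: \n {sr.Microphone.list_microphone_names()}")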
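Since the position of a name in that list doubles as its device_index, a small helper can pick a microphone by name instead of hard-coding 0. This is only a sketch: find_mic_index is a hypothetical helper, and the device names in sample_names are invented so the example runs without any audio hardware.

```python
def find_mic_index(names, keyword):
    """Return the index of the first device whose name contains keyword, else None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Skip entries without a usable name.
        if name and keyword in name.lower():
            return index
    return None


# Invented sample; in the script this would be sr.Microphone.list_microphone_names().
sample_names = ["HDA Intel PCH: ALC3246 Analog", "USB Headset Mono", "default"]
print(find_mic_index(sample_names, "usb"))
```

The returned index could then be passed as device_index when creating the Microphone, falling back to the default device when the helper returns None.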

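Now let's create the Recognizer instance and set a couple of its most important parameters so that it runs smoothly:

    # Creating a recognition object instance
    r = sr.Recognizer()
    # Audio quieter than this energy level is treated as silence
    r.energy_threshold = 4000
    # Keep the threshold fixed instead of letting it drift while listening
    r.dynamic_energy_threshold = False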
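To get a feel for what an energy threshold of 4000 means, here is a simplified illustration: the loudness of a chunk of audio is measured as the root mean square (RMS) of its samples and compared with the threshold. This is a teaching sketch under that assumption, not the library's own code; the function names and the sample values are invented.

```python
import math


def rms(samples):
    """Root mean square of a sequence of signed audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_loud_enough(samples, energy_threshold=4000):
    """True when the chunk is louder than the threshold, i.e. probably speech."""
    return rms(samples) > energy_threshold


quiet_room = [120, -80, 60, -150]         # invented 16-bit sample values
speaking = [9000, -12000, 11000, -8000]   # invented 16-bit sample values

print(is_loud_enough(quiet_room))
print(is_loud_enough(speaking))
```

A higher threshold makes the recognizer ignore more background sound but also requires you to speak louder, which is why the script asks the user to speak loud and clear.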

The microphone now becomes our source for capturing the command the user gives. We do this with the adjust_for_ambient_noise and listen methods:

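    # device_index=0 is the first listed device; omit it to use the system default
    with sr.Microphone(device_index=0) as source:
        print('Please Speak Loud and Clear:')
        # calibrate the energy threshold against the ambient noise
        r.adjust_for_ambient_noise(source)
        # take voice input from the microphone
        audio = r.listen(source)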
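The call to adjust_for_ambient_noise does not filter the audio; it recalibrates the energy threshold from what the microphone hears while nobody is speaking. A rough mental model is an exponentially smoothed threshold that settles a bit above the room's noise level. The sketch below shows that idea only; the function name and the ratio and damping constants are assumptions chosen for illustration and are not taken from the library.

```python
def calibrate_threshold(start_threshold, ambient_energies, ratio=1.5, damping=0.85):
    """Move the threshold toward ratio * ambient energy, one audio chunk at a time.

    ratio   -- how far above the ambient level the threshold should sit (assumed)
    damping -- share of the old threshold kept on each step (assumed)
    """
    threshold = start_threshold
    for energy in ambient_energies:
        target = energy * ratio
        threshold = threshold * damping + target * (1 - damping)
    return threshold


# Start from the 4000 set earlier and "listen" to a steady, quiet room.
calibrated = calibrate_threshold(4000, [200] * 200)
print(round(calibrated, 2))
```

Seen this way, the 4000 assigned earlier acts as a starting point that the calibration step then adjusts to the room.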

As always, I recommend a try...except block to handle errors. Let's finish the code for this tutorial by adding that block, like this:

        try:
            phrase = r.recognize_google(audio)
            print(f"Did you just say: {phrase} ?")
            url = "https://www.google.com/search?q="
            search_url = url + phrase
            wb.open(search_url)
        # speech is unintelligible
        except sr.UnknownValueError:
            print("Could not understand what you've requested.")
        # the API could not be reached or rejected the request
        except sr.RequestError as msg:
            print(msg)
        # sr.WaitTimeoutError is not caught here: it comes from
        # r.listen(source, timeout=...), not from recognize_google
        else:
            print("Your results will appear in the default browser. Good bye for now...")
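One detail worth hardening is the search URL: a recognized phrase can contain spaces or characters such as & that have a special meaning inside a query string. A minimal sketch of building the URL with the standard library's urllib.parse.quote_plus follows; the helper name build_search_url is hypothetical.

```python
from urllib.parse import quote_plus

SEARCH_BASE = "https://www.google.com/search?q="


def build_search_url(phrase):
    """Return a Google search URL with the phrase safely encoded."""
    return SEARCH_BASE + quote_plus(phrase)


print(build_search_url("python speech recognition"))
# https://www.google.com/search?q=python+speech+recognition
```

In the script, wb.open(build_search_url(phrase)) could take the place of concatenating url and phrase by hand.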

Finally, we call the function and start using our script:

fn_speech_recognition()

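Final source code:

Now that we have broken the script down, here is the final source code in one piece for you to test and play with. Please leave a comment with other approaches or better solutions so the post can be updated when needed.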


import speech_recognition as sr
import webbrowser as wb


def fn_speech_recognition():
    print(f"MICs Found on this Computer: \n {sr.Microphone.list_microphone_names()}")
    # Creating a recognition object
    r = sr.Recognizer()
    # Audio quieter than this energy level is treated as silence
    r.energy_threshold = 4000
    r.dynamic_energy_threshold = False

    # device_index=0 is the first listed device; omit it to use the system default
    with sr.Microphone(device_index=0) as source:
        print('Please Speak Loud and Clear:')
        # calibrate the energy threshold against the ambient noise
        r.adjust_for_ambient_noise(source)
        # take voice input from the microphone
        audio = r.listen(source)
        try:
            phrase = r.recognize_google(audio)
            print(f"Did you just say: {phrase} ?")
            url = "https://www.google.com/search?q="
            search_url = url + phrase
            wb.open(search_url)
        # speech is unintelligible
        except sr.UnknownValueError:
            print("Could not understand what you've requested.")
        # the API could not be reached or rejected the request
        except sr.RequestError as msg:
            print(msg)
        # sr.WaitTimeoutError is not caught here: it comes from
        # r.listen(source, timeout=...), not from recognize_google
        else:
            print("Your results will appear in the default browser. Good bye for now...")


fn_speech_recognition()

Category: Python
Tags: Python, speech recognition
Published: 2021-12-06 22:53:13
Link: http://elephdev.com/python/350.html
