Speech Recognition (STT/TTS) Quick Start - SpeechRecognition with Google

ai-creator 2020. 3. 6. 17:37

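Unlike ETRI, Google offers its STT service as a paid product.

Using Google STT directly requires the following setup:

1. Create a service account key

2. Register billing information

3. Install the gcloud tool

To get started with STT more easily, this post uses the Python SpeechRecognition library instead.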

> Reference URL : pypi.org/project/SpeechRecognition/

<< Steps >>

Step 1	Install the libraries
Step 2	Implementation (Quick Start)

Step 1) Install the libraries

$ pip install SpeechRecognition
$ pip install pyaudio

PyAudio is required for microphone input.

(Note) On macOS?

$ pip install SpeechRecognition

$ brew install portaudio

$ pip install pyaudio

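(Note) What if you get a cl.exe error when installing on Windows?

=> Download the wheel file that matches your Python version and Windows environment.

Copy the .whl file into the directory where your command prompt is currently open.

$ pip install <wheel filename>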
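
Before moving on to Step 2, it can be useful to confirm that both packages are importable. The following is a minimal sketch using only the standard library; the helper name `check_installed` is hypothetical, and it assumes the packages are imported as `speech_recognition` and `pyaudio`.

```python
import importlib.util


def check_installed(modules=("speech_recognition", "pyaudio")):
    """Return a dict mapping each module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Print one status line per module.
    for name, found in check_installed().items():
        print(f"{name}: {'OK' if found else 'missing'}")
```

If `pyaudio` is reported as missing, only the microphone example in Step 2-2 is affected; the file-based example in Step 2-1 does not need it.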

Step 2) Implementation (Quick Start)

Step 2-1) Speech recognition from an audio file

ㅁ File used

hello.wav (content: "안녕하세요 오늘도 멋진 하루 되세요", i.e. "Hello, have a wonderful day today as well")

ㅁ Source code

> Source code reference URL : github.com/Uberi/speech_recognition/blob/master/examples/microphone_recognition.py

import speech_recognition as sr

AUDIO_FILE = "hello.wav"

# Use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
    audio = r.record(source)  # read the entire audio file

# Recognize speech using the Google Web Speech API (limited to 50 requests per day)
try:
    print("Google Speech Recognition thinks you said : " + r.recognize_google(audio, language='ko'))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
    
# Result
# Google Speech Recognition thinks you said : 안녕하세요 오늘도 멋진 하루 되세요
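
If recognition fails on a file of your own, checking the file's format first can help narrow down the cause. The sketch below uses only Python's standard `wave` module and assumes an uncompressed PCM WAV file; the helper name `describe_wav` is hypothetical and is not part of SpeechRecognition.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file as a dict."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_sec": frames / rate,  # total frames / frames per second
        }


# Example usage (requires hello.wav in the current directory):
# print(describe_wav("hello.wav"))
```

A file that `wave` cannot open (for example, a compressed format saved with a .wav extension) is worth converting to PCM WAV before passing it to `sr.AudioFile`.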

Step 2-2) Speech recognition from the microphone

ㅁ Source code

import speech_recognition as sr

# Create an audio source from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# Recognize speech using the Google Web Speech API (limited to 50 requests per day)
try:
    print("Google Speech Recognition thinks you said : " + r.recognize_google(audio, language='ko'))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
    
# Result
# Google Speech Recognition thinks you said : 안녕하세요
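
In a noisy room, or when nobody speaks, `r.listen(source)` may wait longer than expected. The following sketch shows one possible variation, not the method used in this post: it calibrates for ambient noise and then listens with a timeout. The helper names `capture` and `main` and the default values (5 and 10 seconds) are assumptions for illustration; `adjust_for_ambient_noise`, the `timeout` and `phrase_time_limit` arguments of `listen`, and `sr.WaitTimeoutError` are provided by the SpeechRecognition library.

```python
def capture(recognizer, source, timeout=5, phrase_time_limit=10):
    """Calibrate for background noise, then record a single phrase."""
    # Sample the source for one second to set the energy threshold.
    recognizer.adjust_for_ambient_noise(source, duration=1)
    # Give up if no speech starts within `timeout` seconds,
    # and stop recording after `phrase_time_limit` seconds of speech.
    return recognizer.listen(
        source, timeout=timeout, phrase_time_limit=phrase_time_limit
    )


def main():
    # Imported here so that capture() can be defined without a microphone setup.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Say something!")
        try:
            audio = capture(r, source)
        except sr.WaitTimeoutError:
            print("No speech detected before the timeout")
            return

    try:
        print("Google Speech Recognition thinks you said : "
              + r.recognize_google(audio, language='ko'))
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
```

Call `main()` to run it with a real microphone.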

Step 2-3) Saving to a file

# Write the captured audio (the `audio` object from Step 2-2) to a WAV file
with open("microphone-results.wav", "wb") as f:
    f.write(audio.get_wav_data())

You can check the results quickly!

<< References >>

- SpeechRecognition documentation (Link)

- Saving to an audio file (Link)