[Python] GPT 3 Voice Interaction - (Speech to Text Module)

COMKONG 2023. 5. 4. 16:01

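This post shares Python code that calls the GPT 3 API to ask questions and get answers.

There are a few things to prepare before running the Python code.

1. Check your GPT 3 API key

You can find it on the OpenAI site. (Be careful not to leak it.)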
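
One way to lower the risk of leaking the key is to keep it out of the script and read it from an environment variable instead. The sketch below is only an illustration: the variable name OPENAI_API_KEY and the helper load_api_key are assumptions made for this example, not something the code in this post uses.

```python
import os

def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you exported in your shell.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With this in place, the key assignment in the script could become `openai.api_key = load_api_key()`, so the key never appears in a file you might share.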

2. Set up the Anaconda environment

If you already have Anaconda installed, create a new environment.

conda create -n openchat python
conda activate openchat

Then install the required packages with pip.

pip install "openai<1"
pip install pyaudio
pip install SpeechRecognition
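
Before moving on, it can help to confirm that the three packages above are visible from the active environment. The following is a minimal sketch using only the standard library; the helper name missing_modules is made up for this example. Note that the import names differ from the pip names (SpeechRecognition is imported as speech_recognition).

```python
import importlib.util

# Import names, not pip package names.
REQUIRED_MODULES = ["openai", "pyaudio", "speech_recognition"]

def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required modules are available.")
```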

3. Run the Python code

import openai
import pyaudio  # not used directly, but sr.Microphone needs it installed
import speech_recognition as sr


# Paste the API key you created (keep it secret)
openai.api_key = "<your API key>"

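# Speech to text: record from the microphone and transcribe the audio
def speech_to_text():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Speak something!")
        # Blocks until a phrase is heard and followed by silence
        audio = r.listen(source)
    try:
        # Sends the audio to Google's Web Speech API (needs internet access)
        text = r.recognize_google(audio)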
        print(f"You said: {text}")
        return text
    except sr.UnknownValueError:
        print("Could not understand audio")
        return ""
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition service; {e}")
        return ""
  
# Conversation history shared across calls so GPT sees the earlier turns
messages = []

# Ask a question through the ChatGPT API and return the answer
# (openai.ChatCompletion exists only in openai versions before 1.0)
def ask_question_v2(question):
    messages.append({"role": "user", "content": question})
    completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    assistant_content = completion.choices[0].message["content"].strip()
    messages.append({"role": "assistant", "content": assistant_content})
    return assistant_content


while True:

    # Recognize the user's speech
    question = speech_to_text()
    print(question)
    check = input("If this sentence is correct, type 1: ")

    if check == "1":
        print("OK, let's ask GPT")
        # Call the function that asks the question
        answer = ask_question_v2(question)
        print(answer)
    else:
        print("Repeat the speech")

    # End the voice conversation if the user said "exit" or "stop"
    if "exit" in question or "stop" in question:
        print("Ending the voice conversation.")
        break
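
The loop above ends the conversation with a substring test, so a word such as "nonstop" would also end it. A whole-word check is one possible alternative. The sketch below is an illustration only; wants_to_exit and EXIT_WORDS are names made up for this example and are not part of the code above.

```python
import re

# Words that should end the conversation (taken from the loop above).
EXIT_WORDS = {"exit", "stop"}

def wants_to_exit(transcript, exit_words=EXIT_WORDS):
    """Return True if any whole word in the transcript is an exit word."""
    # Lowercase first so that "Exit" and "STOP" are matched too.
    words = re.findall(r"[a-z']+", transcript.lower())
    return any(word in exit_words for word in words)
```

In the loop, the final check could then read `if wants_to_exit(question): break`.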

Speech to text is sometimes inaccurate, so I added code that asks you to confirm the result of that step.

If the STT result is correct, type 1.
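
The confirmation step can also be pulled out into a small function, which makes it easy to try without a microphone. This is a sketch under assumptions: the name confirm_transcript and the ask parameter are invented for this example, and unlike the loop in this post it strips surrounding spaces from the reply and skips empty transcripts.

```python
def confirm_transcript(transcript, ask=input):
    """Show the recognized sentence and return True only if the user types 1.

    `ask` defaults to the built-in input(); passing another callable lets
    you exercise the function without a keyboard.
    """
    if not transcript:
        # Nothing was recognized, so there is nothing to confirm.
        return False
    reply = ask(f'Recognized: "{transcript}". Type 1 if this is correct: ')
    return reply.strip() == "1"
```

A call such as `confirm_transcript(question)` would replace the `input(...)` line and the `check == "1"` comparison.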

(The API can be called only 3 times per minute, so it would be a waste to spend a call on a garbled question. That is why I added this step.)
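
If you want the script itself to respect the limit of 3 calls per minute mentioned above, a small sliding-window limiter is one option. The class below is a hedged sketch, not part of the original code: CallLimiter and its parameters are hypothetical names, and the clock and sleep arguments exist only so the behavior can be checked without actually waiting.

```python
import time
from collections import deque

class CallLimiter:
    """Allow at most max_calls calls within any window of period seconds."""

    def __init__(self, max_calls=3, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)
```

To use it, create `limiter = CallLimiter()` once before the loop and call `limiter.wait()` right before `ask_question_v2(question)`.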

License: Attribution, NonCommercial, NoDerivatives (CC BY-NC-ND)
