OpenAI with TTS in Unity

  • nathanenglish
  • Feb 5, 2023
  • 2 min read

Updated: Feb 19, 2023

As of now, the OpenAI API and Oculus TTS are working together well enough.


I'm using the OpenAI API with this GitHub package, as set up by the wonderful Sarge.


With a few tweaks to the TTS and ChatGPT code, the two flow into each other well enough. The responses generated by ChatGPT are fed into the TTS Speaker's Speak() function and read aloud. (I had to do a little editing of the variables, setting the response variable back to null after each call, and wrap the function in a Coroutine to get the timing right, but it works now.)
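
A minimal sketch of that hand-off, assuming a hypothetical ChatGPTClient wrapper with a SendPrompt() method and a Response string (those names are mine, not the package's); TTSSpeaker is the Oculus Voice SDK component, and its namespace depends on the SDK version you have installed:

```csharp
using System.Collections;
using UnityEngine;

public class ChatToSpeech : MonoBehaviour
{
    [SerializeField] private ChatGPTClient chatClient; // hypothetical wrapper around the OpenAI package
    [SerializeField] private TTSSpeaker ttsSpeaker;    // Oculus Voice SDK TTS speaker component

    public void Ask(string prompt)
    {
        StartCoroutine(SpeakWhenReady(prompt));
    }

    private IEnumerator SpeakWhenReady(string prompt)
    {
        chatClient.SendPrompt(prompt);

        // Wait until the ChatGPT wrapper has filled in its response.
        yield return new WaitUntil(() => !string.IsNullOrEmpty(chatClient.Response));

        // Hand the generated text to the TTS speaker to be read aloud.
        ttsSpeaker.Speak(chatClient.Response);

        // Reset the response so the next call doesn't re-read stale text.
        chatClient.Response = null;
    }
}
```
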


One unforeseen problem is that the TTS can only read about 140 characters aloud at a time, which is incredibly annoying, as it limits every response to very short sentences.



What I'll have to do is write a quick string parser that takes the generated responses, splits them into separate strings of less than 140 characters (ideally splitting at natural spots where periods or commas exist), and then plays the audio as a queue. This might be a little hard to pull off, but it'll be a good experiment to try, and a great thing to get working.
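
Something along these lines should work for the splitter; the class name, the 140-character default, and the exact set of break characters are just my assumptions:

```csharp
using System.Collections.Generic;

public static class ResponseSplitter
{
    // Splits a long response into chunks under maxLength characters,
    // preferring to break at sentence or clause punctuation, then at a
    // space, and only hard-cutting if no natural break point exists.
    public static List<string> Split(string text, int maxLength = 140)
    {
        var chunks = new List<string>();
        if (string.IsNullOrEmpty(text)) return chunks;

        int start = 0;
        while (start < text.Length)
        {
            if (text.Length - start <= maxLength)
            {
                chunks.Add(text.Substring(start).Trim());
                break;
            }

            // Search backwards within the window for a natural break point.
            int windowEnd = start + maxLength - 1;
            int cut = text.LastIndexOfAny(new[] { '.', '!', '?', ';', ',' }, windowEnd, maxLength);
            if (cut <= start)
                cut = text.LastIndexOf(' ', windowEnd, maxLength);
            if (cut <= start)
                cut = windowEnd; // no good break point: hard cut

            chunks.Add(text.Substring(start, cut - start + 1).Trim());
            start = cut + 1;
        }

        chunks.RemoveAll(string.IsNullOrEmpty);
        return chunks;
    }
}
```
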


The TTS code holds a queue of clips, which I think will be pretty integral to this idea. If I can create a list of some sort that loads the audio clips and depopulates as they're read aloud, I'll be in good shape. I'm going to begin working on this parser idea and hopefully have it implemented soon. Once that's working, I'll start on the mouth visemes and planning out a state machine.
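
A rough sketch of that queued playback, reusing the splitter above. The IsSpeaking check is an assumption about what the speaker component exposes, so it may need adjusting to whatever the TTS code actually provides:

```csharp
using System.Collections;
using System.Collections.Generic;
using UnityEngine;

public class SpeechQueue : MonoBehaviour
{
    [SerializeField] private TTSSpeaker ttsSpeaker; // Oculus Voice SDK TTS speaker component

    private readonly Queue<string> pending = new Queue<string>();
    private bool playing;

    // Enqueue every chunk of a parsed response and start playback if idle.
    public void EnqueueResponse(string fullResponse)
    {
        foreach (var chunk in ResponseSplitter.Split(fullResponse))
            pending.Enqueue(chunk);

        if (!playing)
            StartCoroutine(PlayQueue());
    }

    private IEnumerator PlayQueue()
    {
        playing = true;
        while (pending.Count > 0)
        {
            ttsSpeaker.Speak(pending.Dequeue());

            // Wait for the current clip to start and then finish before
            // moving on. Assumes an IsSpeaking-style flag on the speaker.
            yield return new WaitUntil(() => ttsSpeaker.IsSpeaking);
            yield return new WaitWhile(() => ttsSpeaker.IsSpeaking);
        }
        playing = false;
    }
}
```
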
