ChatGPT now lets you talk with it or submit pictures for prompts
- OpenAI is rolling out new features for ChatGPT.
- Users will now be able to prompt the chatbot with their voice or a picture.
- The features will initially only be available to users who pay for ChatGPT.
Since ChatGPT's debut, OpenAI has been constantly updating the chatbot with new features. The latest update brings two new ways to give ChatGPT a prompt: by voice or by picture.
Today, OpenAI announced a couple of new capabilities coming to ChatGPT in the next two weeks. One of the new features will allow users to submit prompts with their voice and have the AI bot speak back.
Instead of typing something into the field, you’ll be able to tap a button and ask your question verbally. ChatGPT will then convert what you say into text and feed that text to its large language model (LLM). When it answers, it will convert the text-based answer back into speech you can hear. It’s not all that different from how you would use a virtual assistant like Google Assistant or Alexa.
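The flow described above (speech to text, text to the LLM, the LLM's answer back to speech) can be sketched as three stages chained together. This is only an illustration of the general shape, not OpenAI's implementation: the names `voice_turn`, `fake_transcribe`, `fake_generate`, and `fake_synthesize` are hypothetical stand-ins, and the "audio" here is just encoded text so the example runs without any speech models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    transcript: str     # what the speech-to-text stage produced
    reply_text: str     # what the language model answered
    reply_audio: bytes  # the answer after the text-to-speech stage


def voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> VoiceTurn:
    """Run one spoken exchange through the three stages in order."""
    transcript = transcribe(audio)        # speech -> text
    reply_text = generate(transcript)     # text -> model answer
    reply_audio = synthesize(reply_text)  # answer -> speech
    return VoiceTurn(transcript, reply_text, reply_audio)


# Hypothetical stand-ins so the sketch is runnable. A real app would
# call a speech recognizer, an LLM, and a speech synthesizer instead.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate(prompt: str) -> str:
    return f"You asked: {prompt}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    turn = voice_turn(
        b"tell me a bedtime story",
        fake_transcribe,
        fake_generate,
        fake_synthesize,
    )
    print(turn.transcript)
    print(turn.reply_text)
```

Passing the stages in as functions keeps the sketch independent of any particular speech or language model; each one could be swapped for a real service without changing the flow.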
Use your voice to engage in a back-and-forth conversation with ChatGPT. Speak with it on the go, request a bedtime story, or settle a dinner table debate. Sound on 🔊 pic.twitter.com/3tuWzX0wtS (OpenAI, @OpenAI, September 25, 2023)
OpenAI already has a speech-to-text system called Whisper. For the spoken replies, the company is now rolling out a new text-to-speech model that is “capable of crafting realistic synthetic voices from just a few seconds of real speech.”
The company sees potential in this technology beyond voice prompts, pointing out that it is working with Spotify on the pilot of Spotify's Voice Translation feature. As OpenAI explains, this would allow podcasts to be translated into other languages in the podcaster’s own voice.
Such technology carries an inherent danger: a malicious actor could use it to impersonate others and commit fraud. In its blog post, OpenAI acknowledges the risk and claims the technology will only be used for specific use cases and partnerships.
The underlying research — voice generation and image understanding — offers a glimpse at what much more advanced AI will be capable of in the future. Learn more about this update and our safety measures: https://t.co/uNZjgbR5Bm (OpenAI, @OpenAI, September 25, 2023)
The other new capability, image prompts, is something Google’s Bard chatbot was given months ago. Much as with Google Lens, which powers Bard’s image prompt capabilities, you’ll be able to submit a picture and ChatGPT will try to figure out what you’re asking for. If you want to clarify what you’re looking for, the app has a drawing tool to help you pinpoint something specific. You’ll also have the option to speak or type questions to go with the image.
Just like the voice feature, this capability comes with risks. For example, you wouldn’t want someone to be able to input a photo of you and have the chatbot provide them with details about you. On this point, the company states:
We’ve also taken technical measures to significantly limit ChatGPT’s ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy.
While these features should make ChatGPT more functional and easier to use, there are clear downsides to the technology. The company has implemented guardrails, but it’s unknown whether they will be enough to prevent bad actors from abusing these tools.