Creating a Text-to-Speech AI Agent in JavaScript using OpenAI API

Feb 7, 2025 - 14:47

Introduction

Have you ever wanted to convert text into speech using AI? OpenAI’s Text-to-Speech (TTS) API allows developers to generate high-quality speech from text. In this blog, we will build a simple AI-powered TTS agent in JavaScript using OpenAI's API. By the end, you'll have a working program that converts any text into speech and plays it back.

Prerequisites

Before we begin, ensure you have the following:

  • Node.js installed (Download here)
  • An OpenAI API Key (Get it here)
  • Basic knowledge of JavaScript

Step 1: Install Dependencies

We will use axios to interact with OpenAI’s API and play-sound to play the generated audio.

npm install axios play-sound

Step 2: Writing the TTS Function

We will create a function that:

  • Sends a request to OpenAI’s TTS API
  • Saves the generated audio
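  • Plays the audio file

const axios = require('axios');
// play-sound relies on a command-line audio player installed on the system
// (for example afplay, mpg123, or mplayer).
const player = require('play-sound')();
const fs = require('fs');

// Prefer an environment variable so the key is not committed with the code.
const OPENAI_API_KEY = process.env.OPENAI_API_KEY || 'your-api-key';

async function textToSpeech(text) {
    try {
        // Request speech audio; the response body is binary MP3 data.
        const response = await axios.post(
            'https://api.openai.com/v1/audio/speech',
            {
                model: 'tts-1',
                input: text,
                voice: 'alloy',
            },
            {
                headers: {
                    'Authorization': `Bearer ${OPENAI_API_KEY}`,
                    'Content-Type': 'application/json'
                },
                responseType: 'arraybuffer'
            }
        );

        // Save the audio to disk, then play it.
        const filePath = 'output.mp3';
        fs.writeFileSync(filePath, response.data);
        console.log('Playing audio...');
        player.play(filePath, (err) => {
            if (err) console.error('Playback error:', err);
        });
    } catch (error) {
        // With responseType 'arraybuffer', API error bodies arrive as raw bytes,
        // so decode them to text before logging.
        const details = error.response
            ? Buffer.from(error.response.data).toString('utf8')
            : error.message;
        console.error('Error:', details);
    }
}

textToSpeech("Hello, this is an AI-generated voice!");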
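
Longer texts may need to be split before they are sent. The sketch below assumes the speech endpoint accepts at most 4096 characters of input per request (check the current API reference before relying on that figure). splitIntoChunks and MAX_INPUT_CHARS are hypothetical names introduced here for illustration; the helper prefers to break after sentence-ending punctuation.

```javascript
// Assumption: the speech endpoint limits `input` to 4096 characters per request.
const MAX_INPUT_CHARS = 4096;

// Hypothetical helper: split text into chunks no longer than maxLen,
// preferring to break after sentence-ending punctuation.
function splitIntoChunks(text, maxLen = MAX_INPUT_CHARS) {
    // Split into sentences, keeping trailing punctuation and whitespace.
    // The second alternative catches a final sentence with no punctuation.
    const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
    const chunks = [];
    let current = '';

    for (const sentence of sentences) {
        // Keep adding sentences while the chunk still fits.
        if (current.length + sentence.length <= maxLen) {
            current += sentence;
            continue;
        }
        if (current.trim()) chunks.push(current.trim());
        current = '';

        // A single sentence longer than maxLen is cut at fixed positions.
        let rest = sentence;
        while (rest.length > maxLen) {
            chunks.push(rest.slice(0, maxLen));
            rest = rest.slice(maxLen);
        }
        current = rest;
    }

    if (current.trim()) chunks.push(current.trim());
    return chunks;
}
```

Each chunk could then be passed to textToSpeech in turn. Because the function above always writes to output.mp3, a per-chunk file name would be needed to keep one chunk from overwriting another.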

Step 3: Running the Script

Save the file as tts.js and run it using:

node tts.js
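
To speak a different sentence on each run without editing the file, the text can be read from the command line. This is a minimal sketch; getTextFromArgs is a hypothetical helper name and is not part of the original script.

```javascript
// Hypothetical helper: take the text to speak from the command line,
// falling back to a default sentence when no argument is given.
function getTextFromArgs(argv, fallback = 'Hello, this is an AI-generated voice!') {
    // argv[0] is the node binary and argv[1] is the script path,
    // so user-supplied words start at index 2.
    const text = argv.slice(2).join(' ').trim();
    return text || fallback;
}

// In tts.js, this line would replace the hard-coded call:
// textToSpeech(getTextFromArgs(process.argv));
```

With that change, the script could be run as: node tts.js "Good morning, everyone"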

Customization

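  • Change the Voice: OpenAI provides several voices, including alloy, echo, fable, onyx, nova, and shimmer. Change the voice field in the request body to try a different one.
  • Integrate into a Web App: Use this in a frontend React/Next.js project by calling the API through your own backend, so the API key never reaches the browser.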
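
When the voice becomes a parameter, it helps to validate it before making a network call. The sketch below builds the same request body used in Step 2. buildSpeechRequest and KNOWN_VOICES are hypothetical names, and the voice list is an assumption based on the voices documented for tts-1 at the time of writing.

```javascript
// Assumption: voice names documented for the tts-1 model at the time of
// writing; check the current API reference for the up-to-date list.
const KNOWN_VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'];

// Hypothetical helper: build the JSON body for the speech request,
// rejecting bad input before any network call is made.
function buildSpeechRequest(text, voice = 'alloy', model = 'tts-1') {
    if (typeof text !== 'string' || text.trim() === '') {
        throw new Error('text must be a non-empty string');
    }
    if (!KNOWN_VOICES.includes(voice)) {
        throw new Error(
            `Unknown voice "${voice}". Expected one of: ${KNOWN_VOICES.join(', ')}`
        );
    }
    return { model, input: text, voice };
}
```

The returned object could be passed as the second argument to axios.post in place of the inline body, for example buildSpeechRequest(text, 'nova').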

Conclusion

With just a few lines of JavaScript, we have successfully built a powerful AI-powered text-to-speech agent. Whether for accessibility, automation, or just for fun, AI-driven voice synthesis is a game-changer. Try it out and enhance your projects with realistic AI voices!