OpenAI Node.js Integration: Complete Developer's Guide [2024]

Karandeep Singh
• 24 minutes

Meta Description: Master OpenAI Node.js integration with our comprehensive guide, packed with real-world examples from my experience building production applications. Learn everything from basic setup to advanced implementations with practical code examples.

Summary: After spending countless hours integrating OpenAI with Node.js applications, I’ve created this comprehensive guide to share my insights and practical implementations. This guide covers everything I’ve learned about building robust AI applications using OpenAI Node.js integration, complete with code examples and production-ready solutions.

Understanding OpenAI Node.js Fundamentals

When I first started working with OpenAI and Node.js, I quickly realized that a solid foundation is crucial. The OpenAI Node.js integration opens up incredible possibilities for AI-powered applications, but it requires careful consideration of authentication, rate limiting, and error handling right from the start.

In my production applications, I’ve found that proper initialization and configuration management are critical. Here’s how I structure my OpenAI Node.js setup:

// Configuration and OpenAIApi belong to the v3 line of the openai package
const express = require('express');
const { Configuration, OpenAIApi } = require('openai');
const dotenv = require('dotenv');
const path = require('path');

// Load environment variables based on environment
dotenv.config({
    path: path.join(__dirname, `.env.${process.env.NODE_ENV || 'development'}`)
});

// Initialize Express application
const app = express();
app.use(express.json()); // parse JSON request bodies for the routes below

// OpenAI configuration with environment-specific settings
const configuration = new Configuration({
    apiKey: process.env.OPENAI_API_KEY,
    organization: process.env.OPENAI_ORG_ID, // Optional but recommended
    basePath: process.env.OPENAI_API_BASE_PATH // Useful for custom endpoints
});

// Create OpenAI instance with configuration
const openai = new OpenAIApi(configuration);

// Basic health check endpoint
app.get('/health', (req, res) => {
    res.json({
        status: 'healthy',
        timestamp: new Date().toISOString(),
        environment: process.env.NODE_ENV
    });
});

// Initialize environment-specific settings
const settings = {
    development: {
        maxRetries: 3,
        timeout: 30000,
        debug: true
    },
    production: {
        maxRetries: 5,
        timeout: 60000,
        debug: false
    }
}[process.env.NODE_ENV || 'development'];

module.exports = { app, openai, settings };

Let me break down why I structure the initialization this way:

  1. Environment-Specific Configuration:

    • I use separate .env files for different environments
    • This helps manage different API keys and settings across environments
    • Makes it easier to debug issues in development without affecting production
  2. Flexible Settings Object:

    • Different timeout values for development and production
    • Adjustable retry attempts based on environment
    • Debug mode that can be toggled easily
  3. Health Check Endpoint:

    • Essential for monitoring in production
    • Includes environment and timestamp information
    • Helps quickly identify which instance is responding
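
The environment-specific settings described above can be sketched as a small pure function that picks a profile and fails fast on misconfiguration. The name `loadSettings` and the choice to treat `OPENAI_API_KEY` as the only required variable are assumptions for illustration; they are not part of the setup shown earlier.

```javascript
// Settings profiles mirroring the values used in the setup above.
const PROFILES = {
    development: { maxRetries: 3, timeout: 30000, debug: true },
    production: { maxRetries: 5, timeout: 60000, debug: false }
};

// Hypothetical helper: resolve settings from an env-like object.
// Throws early so a misconfigured instance never starts serving traffic.
function loadSettings(env) {
    const name = env.NODE_ENV || 'development';
    const profile = PROFILES[name];
    if (!profile) {
        throw new Error(`Unknown NODE_ENV '${name}'`);
    }
    if (!env.OPENAI_API_KEY) {
        throw new Error('OPENAI_API_KEY is not set');
    }
    return { environment: name, ...profile };
}
```

Calling `loadSettings(process.env)` at startup would surface an unknown environment name (for example `staging`) as an error instead of silently yielding `undefined` settings.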

OpenAI Node.js Authentication and Security

During my years of implementing OpenAI in Node.js applications, I’ve learned that security isn’t just about storing API keys safely - it’s about building a comprehensive security layer. Here’s my battle-tested approach:

const jwt = require('jsonwebtoken');
const rateLimit = require('express-rate-limit');
const crypto = require('crypto');

// Custom encryption for API keys
class SecurityManager {
    constructor(encryptionKey) {
        this.algorithm = 'aes-256-gcm';
        // Hash the secret so the key is exactly the 32 bytes AES-256 requires
        this.encryptionKey = crypto
            .createHash('sha256')
            .update(String(encryptionKey))
            .digest()
            .subarray(0, 32);
    }

    // Encrypt OpenAI API key for storage
    encryptApiKey(apiKey) {
        const iv = crypto.randomBytes(12); // unique nonce per encryption
        const cipher = crypto.createCipheriv(
            this.algorithm,
            this.encryptionKey,
            iv
        );

        let encryptedKey = cipher.update(apiKey, 'utf8', 'hex');
        encryptedKey += cipher.final('hex');
        const authTag = cipher.getAuthTag();

        return {
            encryptedKey,
            iv: iv.toString('hex'),
            authTag: authTag.toString('hex')
        };
    }

    // Decrypt stored API key
    decryptApiKey(encryptedData) {
        const decipher = crypto.createDecipheriv(
            this.algorithm,
            this.encryptionKey,
            Buffer.from(encryptedData.iv, 'hex')
        );

        decipher.setAuthTag(Buffer.from(encryptedData.authTag, 'hex'));
        let decrypted = decipher.update(encryptedData.encryptedKey, 'hex', 'utf8');
        decrypted += decipher.final('utf8'); // throws if the data was tampered with
        return decrypted;
    }
}

// Authentication middleware
// getUserApiKey is an application-specific lookup that is not shown in this guide
const authMiddleware = (securityManager) => async (req, res, next) => {
    try {
        const authHeader = req.headers.authorization;
        if (!authHeader?.startsWith('Bearer ')) {
            return res.status(401).json({
                error: 'Invalid authentication header'
            });
        }

        const token = authHeader.split(' ')[1];
        const decoded = jwt.verify(token, process.env.JWT_SECRET);

        // Retrieve user's encrypted OpenAI API key
        const userApiKey = await getUserApiKey(decoded.userId);
        if (!userApiKey) {
            return res.status(403).json({
                error: 'No API key found for user'
            });
        }

        // Decrypt API key for use
        req.openaiKey = securityManager.decryptApiKey(userApiKey);
        req.userId = decoded.userId;
        next();
    } catch (error) {
        console.error('Authentication error:', error);
        res.status(401).json({
            error: 'Authentication failed'
        });
    }
};

// Rate limiting configuration
// getUserTier is an application-specific lookup that is not shown in this guide
const rateLimiter = rateLimit({
    windowMs: 15 * 60 * 1000, // 15 minutes
    max: async (req) => {
        // Dynamic rate limiting based on user tier
        const userTier = await getUserTier(req.userId);
        const limits = {
            free: 100,
            pro: 1000,
            enterprise: 5000
        };
        return limits[userTier] || limits.free;
    },
    message: {
        error: 'Rate limit exceeded. Please try again later.'
    },
    // req.userId is only set once authMiddleware has run, so mount the limiter after it
    keyGenerator: (req) => req.userId || req.ip,
    handler: (req, res) => {
        res.status(429).json({
            error: 'Rate limit exceeded',
            retryAfter: res.getHeader('Retry-After'),
            limit: req.rateLimit.limit,
            remaining: req.rateLimit.remaining
        });
    }
});

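// Implementation example
// The route path and the ENCRYPTION_KEY variable name are placeholders
const securityManager = new SecurityManager(process.env.ENCRYPTION_KEY);

app.post('/api/chat',
    authMiddleware(securityManager),
    rateLimiter,
    async (req, res) => {
        try {
            // Build a client with the requesting user's own key
            const openai = new OpenAIApi(new Configuration({
                apiKey: req.openaiKey
            }));

            const response = await openai.createChatCompletion({
                model: "gpt-4",
                messages: req.body.messages,
                user: req.userId // For OpenAI's monitoring
            });

            res.json({
                success: true,
                data: response.data
            });
        } catch (error) {
            handleOpenAIError(error, res);
        }
    }
);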

// Error handling utility
function handleOpenAIError(error, res) {
    const errorResponses = {
        'insufficient_quota': {
            status: 402,
            message: 'API quota exceeded'
        },
        'invalid_api_key': {
            status: 401,
            message: 'Invalid API key'
        },
        'rate_limit_exceeded': {
            status: 429,
            message: 'Rate limit exceeded'
        }
    };

    // The identifier may appear in either `code` or `type`, so check both
    const apiError = error.response?.data?.error;
    const errorResponse = errorResponses[apiError?.code]
        || errorResponses[apiError?.type]
        || {
            status: 500,
            message: 'Internal server error'
        };

    res.status(errorResponse.status).json({
        success: false,
        error: errorResponse.message,
        details: process.env.NODE_ENV === 'development' ? error.message : undefined
    });
}

Let me explain why I’ve implemented security this way:

  1. API Key Encryption:

    • I never store OpenAI API keys in plain text
    • Using AES-256-GCM provides strong encryption
    • The auth tag ensures data integrity
    • Separate encryption keys for different environments
  2. Dynamic Rate Limiting:

    • Limits based on user tiers allow flexible access control
    • Per-user tracking instead of just IP-based
    • Custom error responses with retry information
    • Separate limits for different endpoints
  3. Authentication Flow:

    • JWT-based authentication for user identification
    • Encrypted API keys retrieved per request
    • User-specific configuration and limits
    • Detailed error handling for security issues
  4. Error Handling:

    • Specific error responses for different scenarios
    • Environment-aware error details
    • Structured error format for frontend handling
    • Logging for security monitoring

In production, I’ve found this setup helps prevent common security issues like:

  • API key exposure
  • Rate limit abuse
  • Unauthorized access
  • Token manipulation
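
To make the integrity claim concrete, here is a minimal standalone sketch of an AES-256-GCM round trip using only Node's built-in `crypto` module. The helper names (`deriveKey`, `encrypt`, `decrypt`) and the sample values are hypothetical; the point is that a modified ciphertext fails authentication rather than decrypting to garbage.

```javascript
const crypto = require('crypto');

// Derive a 32-byte key from a passphrase. SHA-256 keeps the sketch short;
// a slow KDF such as crypto.scryptSync is the safer choice for real secrets.
function deriveKey(passphrase) {
    return crypto.createHash('sha256').update(passphrase).digest();
}

function encrypt(plaintext, key) {
    const iv = crypto.randomBytes(12); // fresh 96-bit nonce per message
    const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
    const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
    return {
        data: data.toString('hex'),
        iv: iv.toString('hex'),
        authTag: cipher.getAuthTag().toString('hex')
    };
}

function decrypt(box, key) {
    const decipher = crypto.createDecipheriv('aes-256-gcm', key, Buffer.from(box.iv, 'hex'));
    decipher.setAuthTag(Buffer.from(box.authTag, 'hex'));
    // final() throws if the ciphertext or the tag was modified
    return Buffer.concat([
        decipher.update(Buffer.from(box.data, 'hex')),
        decipher.final()
    ]).toString('utf8');
}

const key = deriveKey('example-passphrase');
const box = encrypt('sk-example', key);
const roundTrip = decrypt(box, key);

// Flip the first ciphertext byte to simulate tampering in storage.
const firstByte = parseInt(box.data.slice(0, 2), 16) ^ 0xff;
const tampered = {
    ...box,
    data: firstByte.toString(16).padStart(2, '0') + box.data.slice(2)
};
let tamperDetected = false;
try {
    decrypt(tampered, key);
} catch (err) {
    tamperDetected = true;
}
```

Because the nonce is random per call, encrypting the same key twice produces different ciphertexts, which is why the `iv` has to be stored alongside the encrypted value.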

OpenAI Node.js Chat Completions

After implementing numerous chat applications with OpenAI, I’ve developed a robust system for handling conversations. Here’s my production-tested approach:

// Conversation manager for handling complex chat interactions
class ConversationManager {
    constructor(openai) {
        this.openai = openai;
        this.conversations = new Map(); // in-memory, per process
        this.maxContextLength = 4096; // Adjustable based on model
    }

    // Generate dynamic system message based on context
    async createSystemMessage(context) {
        return {
            role: "system",
            content: `You are an AI assistant with expertise in ${context.expertise}. 
                     Communication style: ${context.style}. 
                     Primary focus: ${context.focus}`
        };
    }

    // Process and optimize conversation history
    async optimizeHistory(messages, maxTokens = 2048) {
        let tokenCount = 0;
        const optimizedMessages = [];
        // Walk a reversed copy (newest first) so the stored history is not mutated
        for (const message of [...messages].reverse()) {
            // Rough estimate: about 4 characters per token
            const estimatedTokens = Math.ceil(message.content.length / 4);
            if (tokenCount + estimatedTokens > maxTokens) {
                break;
            }
            optimizedMessages.unshift(message); // restore chronological order
            tokenCount += estimatedTokens;
        }
        return optimizedMessages;
    }

    // Main chat completion handler
    async generateCompletion(userId, messages, options = {}) {
        try {
            const {
                model = "gpt-4",
                temperature = 0.7,
                context = {},
                stream = false
            } = options;

            // Get or initialize conversation history
            if (!this.conversations.has(userId)) {
                this.conversations.set(userId, []);
            }

            const conversationHistory = this.conversations.get(userId);
            const systemMessage = await this.createSystemMessage(context);
            // Optimize conversation history
            const optimizedHistory = await this.optimizeHistory(conversationHistory);
            // Prepare messages for completion
            const completionMessages = [
                systemMessage,
                ...optimizedHistory,
                ...messages
            ];

            const response = await this.openai.createChatCompletion({
                model,
                messages: completionMessages,
                temperature,
                stream,
                presence_penalty: 0.6,
                frequency_penalty: 0.5,
                max_tokens: options.maxTokens || 1000,
                user: userId
            }, stream ? { responseType: 'stream' } : undefined);

            // Handle streaming responses (history is not updated in this mode)
            if (stream) {
                return response;
            }

            // Update conversation history
            const newMessage = {
                role: "assistant",
                content: response.data.choices[0].message.content
            };
            this.conversations.get(userId).push(...messages, newMessage);

            return {
                completion: newMessage.content,
                usage: response.data.usage,
                conversationId: userId
            };
        } catch (error) {
            throw new ChatCompletionError(error);
        }
    }

    // Memory management and cleanup
    cleanup(userId) {
        this.conversations.delete(userId);
    }
}

// Custom error handling for chat completions
class ChatCompletionError extends Error {
    constructor(error) {
        super(error.response?.data?.error?.message || error.message);
        this.name = 'ChatCompletionError';
        this.originalError = error;
        this.isOpenAIError = error.response?.data?.error !== undefined;
    }
}

// Implementation with Express
const conversationManager = new ConversationManager(openai);

app.post('/api/chat/completion', async (req, res) => {
    // userId, messages and context are read from the body here;
    // behind the auth middleware, prefer req.userId
    const {
        userId,
        messages,
        context = {},
        stream = false,
        model = "gpt-4"
    } = req.body;

    try {
        if (stream) {
            const completionStream = await conversationManager.generateCompletion(
                userId,
                messages,
                { stream: true, context, model }
            );

            res.setHeader('Content-Type', 'text/event-stream');
            res.setHeader('Cache-Control', 'no-cache');
            res.setHeader('Connection', 'keep-alive');

            let buffer = '';
            completionStream.data.on('data', (chunk) => {
                buffer += chunk.toString();
                const lines = buffer.split('\n');
                buffer = lines.pop(); // keep a partial line for the next chunk
                for (const line of lines) {
                    if (line.trim() === '') continue;
                    if (line.trim() === 'data: [DONE]') {
                        res.write('data: [DONE]\n\n');
                        continue;
                    }
                    try {
                        const parsed = JSON.parse(line.replace(/^data: /, ''));
                        const content = parsed.choices[0].delta.content;
                        if (content) {
                            // JSON-encode so newlines in the text cannot break SSE framing
                            res.write(`data: ${JSON.stringify(content)}\n\n`);
                        }
                    } catch (error) {
                        console.error('Stream parsing error:', error);
                    }
                }
            });

            completionStream.data.on('end', () => res.end());
            return;
        }

        const response = await conversationManager.generateCompletion(
            userId,
            messages,
            { context, model }
        );

        res.json({
            success: true,
            data: response
        });
    } catch (error) {
        console.error('Chat completion error:', error);
        res.status(error.isOpenAIError ? 400 : 500).json({
            success: false,
            error: error.message
        });
    }
});
Key aspects of this implementation:

  1. Conversation Management:

    • Maintains conversation history per user
    • Optimizes context window usage
    • Dynamic system messages based on context
    • Memory cleanup for inactive conversations
  2. Stream Processing:

    • Handles both streaming and non-streaming responses
    • Efficient chunk processing
    • Error handling for stream parsing
    • Clean connection handling
  3. Context Optimization:

    • Token counting estimation
    • Prioritizes recent messages
    • Maintains conversation coherence
    • Prevents context window overflow
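
The context optimization described above comes down to a budgeted walk from the newest message backwards. The sketch below isolates that idea as a pure function; `estimateTokens` and `trimHistory` are hypothetical names, and the four-characters-per-token ratio is only a rough heuristic (a real tokenizer would give exact counts).

```javascript
// Rough heuristic: about 4 characters per token.
function estimateTokens(text) {
    return Math.ceil(text.length / 4);
}

// Keep the newest messages whose combined estimate fits the budget.
// Returns a new array in chronological order; the input is not mutated.
function trimHistory(messages, maxTokens) {
    const kept = [];
    let used = 0;
    for (let i = messages.length - 1; i >= 0; i--) {
        const cost = estimateTokens(messages[i].content);
        if (used + cost > maxTokens) break;
        kept.unshift(messages[i]);
        used += cost;
    }
    return kept;
}

const history = [
    { role: 'user', content: 'a'.repeat(40) },      // ~10 tokens
    { role: 'assistant', content: 'b'.repeat(40) }, // ~10 tokens
    { role: 'user', content: 'c'.repeat(20) }       // ~5 tokens
];
const trimmed = trimHistory(history, 15);
```

With a budget of 15 estimated tokens, only the two most recent messages fit, and the oldest one is dropped. Stopping at the first message that does not fit (rather than skipping it and continuing) keeps the retained history contiguous.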

OpenAI Node.js Prompt Engineering

Through extensive testing and real-world applications, I’ve developed a structured approach to prompt engineering in Node.js. Here’s my system for creating dynamic, context-aware prompts:

// Prompt engineering system for OpenAI Node.js
const NodeCache = require('node-cache');

class PromptEngineering {
    constructor() {
        this.templates = new Map();
        this.functions = new Map();
        this.promptCache = new NodeCache({ stdTTL: 3600 });
    }

    // Register reusable prompt templates
    registerTemplate(name, template, schema = {}) {
        this.templates.set(name, {
            template,
            schema,
            created: new Date().toISOString()
        });
    }

    // Dynamic prompt builder with variable interpolation
    async buildPrompt(templateName, variables, context = {}) {
        // Context changes the output, so it must be part of the cache key
        const cacheKey = `${templateName}-${JSON.stringify(variables)}-${JSON.stringify(context)}`;
        const cached = this.promptCache.get(cacheKey);
        if (cached) {
            return cached;
        }

        const template = this.templates.get(templateName);
        if (!template) {
            throw new Error(`Template '${templateName}' not found`);
        }

        let prompt = template.template;

        // Replace variables in template
        for (const [key, value] of Object.entries(variables)) {
            const regex = new RegExp(`\\{\\{${key}\\}\\}`, 'g');
            // Function replacer: "$" sequences in the value are inserted literally
            prompt = prompt.replace(regex, () => String(value));
        }

        // Add context-specific modifications
        if (context.tone) {
            prompt += `\nPlease respond in a ${context.tone} tone.`;
        }

        if (context.format) {
            prompt += `\nFormat the response as ${context.format}.`;
        }

        this.promptCache.set(cacheKey, prompt);
        return prompt;
    }

    // Function calling template generator
    registerFunction(name, parameters, description) {
        this.functions.set(name, {
            name,
            description,
            parameters: {
                type: 'object',
                properties: parameters,
                required: Object.keys(parameters)
            }
        });
    }

    // Generate function-calling enabled completion
    async generateFunctionCompletion(openai, messages, functionName) {
        const function_call = {
            name: functionName
        };

        const functions = [this.functions.get(functionName)];

        return await openai.createChatCompletion({
            model: "gpt-4",
            messages,
            functions,
            function_call
        });
    }
}

// Example implementation
const promptEngine = new PromptEngineering();

// Register templates
promptEngine.registerTemplate(
    'code_review',
    `Review the following code with a focus on {{focus_areas}}.
     Consider best practices for {{language}}.
     Security considerations: {{security_level}}.
     Code to review:
     {{code}}
     Please provide:
     1. Potential issues
     2. Suggested improvements
     3. Security considerations
     4. Performance optimizations`,
    {
        focus_areas: 'string',
        language: 'string',
        security_level: 'string',
        code: 'string'
    }
);

// Register function calling templates
promptEngine.registerFunction(
    'analyzeCode',
    {
        code_quality: {
            type: 'object',
            properties: {
                maintainability: { type: 'number' },
                complexity: { type: 'number' },
                security_score: { type: 'number' }
            }
        },
        issues: {
            type: 'array',
            items: {
                type: 'object',
                properties: {
                    severity: { type: 'string' },
                    description: { type: 'string' },
                    line_number: { type: 'number' }
                }
            }
        }
    },
    'Analyze code quality and identify potential issues'
);

// Express route implementation
app.post('/api/prompt/code-review', async (req, res) => {
    try {
        const { code, language, focus_areas } = req.body;

        // Build prompt from template
        const prompt = await promptEngine.buildPrompt('code_review', {
            code,
            language,
            focus_areas,
            security_level: 'high'
        }, {
            tone: 'professional',
            format: 'structured feedback'
        });

        // Generate completion with function calling
        const messages = [
            { role: "user", content: prompt }
        ];

        const functionResponse = await promptEngine.generateFunctionCompletion(
            openai,
            messages,
            'analyzeCode'
        );

        // The model returns the function arguments as a JSON string
        const analysis = JSON.parse(
            functionResponse.data.choices[0].message.function_call.arguments
        );

        // Generate detailed review using analysis
        const reviewResponse = await openai.createChatCompletion({
            model: "gpt-4",
            messages: [
                ...messages,
                {
                    role: "function",
                    name: "analyzeCode",
                    content: JSON.stringify(analysis)
                }
            ]
        });

        res.json({
            success: true,
            analysis,
            review: reviewResponse.data.choices[0].message.content
        });
    } catch (error) {
        console.error('Prompt generation error:', error);
        res.status(500).json({
            success: false,
            error: error.message
        });
    }
});

Key features of this prompt engineering system:

  1. Template Management:

    • Reusable prompt templates
    • Schema validation for variables
    • Context-aware modifications
    • Caching for frequently used prompts
  2. Function Calling:

    • Structured function definitions
    • Type-safe parameter schemas
    • Combined function and chat completions
    • Detailed code analysis capabilities
  3. Context Awareness:

    • Dynamic tone adjustment
    • Format specification
    • Variable interpolation
    • Security level considerations
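
Variable interpolation is the part of this system most likely to fail quietly, for example when a caller forgets a variable and the raw `{{placeholder}}` reaches the model. The following standalone sketch shows one way to make that failure loud. `renderTemplate` is a hypothetical helper, not a method of the class above.

```javascript
// Replace {{name}} placeholders; throw if the template needs a variable
// that was not supplied, instead of leaving the raw placeholder in the prompt.
function renderTemplate(template, variables) {
    return template.replace(/\{\{(\w+)\}\}/g, (match, name) => {
        if (!Object.prototype.hasOwnProperty.call(variables, name)) {
            throw new Error(`Missing template variable '${name}'`);
        }
        // Returning from a replacer function inserts the value literally,
        // so "$&" or "$1" inside user-supplied code is not reinterpreted.
        return String(variables[name]);
    });
}

const rendered = renderTemplate(
    'Review this {{language}} code with a focus on {{focus_areas}}.',
    { language: 'JavaScript', focus_areas: 'error handling' }
);
```

Scanning the template for placeholders, rather than looping over the supplied variables, is what allows missing values to be detected at all.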

OpenAI Node.js Vector Database Integration

// PineconeClient is the legacy Pinecone client interface; the exact request
// shapes for upsert and query depend on the client version installed
const { PineconeClient } = require('@pinecone-database/pinecone');
const { v4: uuidv4 } = require('uuid');

class VectorDatabaseManager {
    constructor(openai) {
        this.openai = openai;
        this.pinecone = new PineconeClient();
        this.initialized = false;
    }

    async initialize() {
        if (!this.initialized) {
            await this.pinecone.init({
                environment: process.env.PINECONE_ENVIRONMENT,
                apiKey: process.env.PINECONE_API_KEY
            });
            this.index = this.pinecone.Index(process.env.PINECONE_INDEX);
            this.initialized = true;
        }
    }

    async generateEmbedding(text) {
        const response = await this.openai.createEmbedding({
            model: 'text-embedding-ada-002',
            input: text.replace(/\n/g, ' ')
        });
        return response.data.data[0].embedding;
    }

    async upsertDocument(text, metadata = {}) {
        await this.initialize();
        const embedding = await this.generateEmbedding(text);
        const vectorId = uuidv4();

        await this.index.upsert({
            vectors: [{
                id: vectorId,
                values: embedding,
                metadata: {
                    ...metadata,
                    text, // stored so search results can return the original text
                    timestamp: new Date().toISOString()
                }
            }]
        });

        return vectorId;
    }

    async semanticSearch(query, topK = 5, filter = {}) {
        await this.initialize();
        const queryEmbedding = await this.generateEmbedding(query);

        const searchResponse = await this.index.query({
            vector: queryEmbedding,
            topK,
            // Only send a filter when one was actually provided
            ...(Object.keys(filter).length > 0 ? { filter } : {}),
            includeMetadata: true
        });

        return searchResponse.matches.map(match => ({
            score: match.score,
            text: match.metadata.text,
            metadata: match.metadata
        }));
    }
}
// Express route implementation
// One shared instance, so the Pinecone client is initialized only once
const vectorDB = new VectorDatabaseManager(openai);

app.post('/api/vector/search', async (req, res) => {
    try {
        const { query, filters = {}, limit = 5 } = req.body;
        const results = await vectorDB.semanticSearch(query, limit, filters);

        // Enhance results with GPT analysis
        const analysisPrompt = `Analyze these search results for relevance to "${query}":
${JSON.stringify(results, null, 2)}`;

        const analysis = await openai.createChatCompletion({
            model: "gpt-4",
            messages: [{
                role: "user",
                content: analysisPrompt
            }]
        });

        res.json({
            success: true,
            results,
            analysis: analysis.data.choices[0].message.content
        });
    } catch (error) {
        console.error('Vector search error:', error);
        res.status(500).json({
            success: false,
            error: error.message
        });
    }
});

OpenAI Node.js Real-time Streaming with WebSocket

const WebSocket = require('ws');
const server = require('http').createServer(app);
const wss = new WebSocket.Server({ server });

// uuidv4 comes from the uuid package imported in the vector database section
class StreamingManager {
    constructor(openai) {
        this.openai = openai;
        this.activeStreams = new Map();
    }

    async handleStream(ws, messages, options = {}) {
        const streamId = uuidv4();
        this.activeStreams.set(streamId, ws);
        // Tell the client which id to use for a later stop_stream message
        ws.send(JSON.stringify({ type: 'start', streamId }));

        try {
            const response = await this.openai.createChatCompletion({
                model: options.model || "gpt-4",
                messages,
                stream: true,
                temperature: options.temperature || 0.7
            }, { responseType: 'stream' });

            let buffer = '';
            response.data.on('data', (chunk) => {
                buffer += chunk.toString();
                const lines = buffer.split('\n');
                buffer = lines.pop(); // keep a partial line for the next chunk
                for (const line of lines) {
                    if (line.trim() === '') continue;
                    if (line.trim() === 'data: [DONE]') {
                        ws.send(JSON.stringify({ type: 'done' }));
                        continue;
                    }

                    try {
                        const message = line.replace(/^data: /, '');
                        const parsed = JSON.parse(message);
                        if (parsed.choices[0].delta.content) {
                            ws.send(JSON.stringify({
                                type: 'content',
                                content: parsed.choices[0].delta.content
                            }));
                        }
                    } catch (error) {
                        console.error('Stream parsing error:', error);
                    }
                }
            });

            response.data.on('end', () => {
                ws.send(JSON.stringify({ type: 'end' }));
                this.activeStreams.delete(streamId);
            });

        } catch (error) {
            this.activeStreams.delete(streamId);
            ws.send(JSON.stringify({
                type: 'error',
                error: error.message
            }));
        }
    }

    closeStream(streamId) {
        const ws = this.activeStreams.get(streamId);
        if (ws) {
            ws.close();
            this.activeStreams.delete(streamId);
        }
    }
}

// WebSocket implementation
const streamingManager = new StreamingManager(openai);

wss.on('connection', (ws) => {
    ws.on('message', async (message) => {
        try {
            const data = JSON.parse(message);
            switch (data.type) {
                case 'start_stream':
                    await streamingManager.handleStream(
                        ws,
                        data.messages,
                        data.options
                    );
                    break;
                case 'stop_stream':
                    streamingManager.closeStream(data.streamId);
                    break;
            }
        } catch (error) {
            ws.send(JSON.stringify({
                type: 'error',
                error: error.message
            }));
        }
    });
});
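
One detail worth isolating from the streaming code: network chunks do not respect line boundaries, so a single `data:` event can arrive split across two chunks, and parsing each chunk on its own would fail on both halves. The sketch below shows a small buffered parser that only emits complete lines. `createSSEParser` is a hypothetical helper, and the sample payload is a shortened stand-in for a real streaming event.

```javascript
// Stateful parser: feed() accepts raw chunks and returns the payloads of
// every complete `data:` line seen so far. An incomplete trailing line is
// held in the buffer until the next chunk arrives.
function createSSEParser() {
    let buffer = '';
    return {
        feed(chunk) {
            buffer += chunk.toString();
            const lines = buffer.split('\n');
            buffer = lines.pop(); // last element is an unfinished line (or '')
            const payloads = [];
            for (const line of lines) {
                const trimmed = line.trim();
                if (!trimmed.startsWith('data:')) continue;
                payloads.push(trimmed.slice(5).trim());
            }
            return payloads;
        }
    };
}

const parser = createSSEParser();
// A JSON event split across two network chunks:
const first = parser.feed('data: {"choices":[{"delta":{"con');
const second = parser.feed('tent":"Hi"}}]}\n\ndata: [DONE]\n\n');
const delta = JSON.parse(second[0]).choices[0].delta.content;
```

The first call returns nothing because no line has been completed yet; the second returns both the reassembled JSON event and the `[DONE]` sentinel, which the caller still needs to check for before calling `JSON.parse`.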

OpenAI Node.js Performance Optimization

const cluster = require('cluster');
const numCPUs = require('os').cpus().length;
const Redis = require('ioredis');

class PerformanceOptimizer {
    constructor() {
        // ioredis commands already return promises, so no promisify is needed
        this.redis = new Redis(process.env.REDIS_URL);
    }

    // Request deduplication via a short-lived result cache
    async deduplicate(key, ttl, generator) {
        const cached = await this.redis.get(`lock:${key}`);
        if (cached) {
            return JSON.parse(cached);
        }

        const result = await generator();
        await this.redis.set(`lock:${key}`, JSON.stringify(result), 'EX', ttl);
        return result;
    }

    // Load balancing using cluster
    setupCluster(app) {
        // isPrimary replaces the deprecated isMaster (Node 16+)
        if (cluster.isPrimary) {
            console.log(`Primary ${process.pid} is running`);

            for (let i = 0; i < numCPUs; i++) {
                cluster.fork();
            }

            cluster.on('exit', (worker, code, signal) => {
                console.log(`Worker ${worker.process.pid} died`);
                cluster.fork(); // Replace dead worker
            });
        } else {
            app.listen(process.env.PORT, () => {
                console.log(`Worker ${process.pid} started`);
            });
        }
    }

    // Batch requests; processRequest is supplied by the caller
    async batchRequests(requests, processRequest, batchSize = 5) {
        const batches = [];
        for (let i = 0; i < requests.length; i += batchSize) {
            batches.push(requests.slice(i, i + batchSize));
        }

        const results = [];
        for (const batch of batches) {
            const batchResults = await Promise.all(
                batch.map(req => processRequest(req))
            );
            results.push(...batchResults);
            // Rate limiting pause between batches
            await new Promise(resolve => setTimeout(resolve, 1000));
        }

        return results;
    }

    // Request queuing
    async queueRequest(req) {
        const key = `queue:${req.id}`;
        const length = await this.redis.rpush(key, JSON.stringify(req));
        // Set expiration for queue items
        await this.redis.expire(key, 3600);
        return length;
    }

    // Monitor performance (Express middleware)
    monitorPerformance(req, res, next) {
        const start = process.hrtime();
        res.on('finish', () => {
            const [seconds, nanoseconds] = process.hrtime(start);
            const duration = seconds * 1000 + nanoseconds / 1e6; // milliseconds
            // Log performance metrics
            console.log({
                path: req.path,
                method: req.method,
                status: res.statusCode,
                duration,
                timestamp: new Date()
            });
        });
        next();
    }
}
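// Implementation example
const optimizer = new PerformanceOptimizer();

app.use(optimizer.monitorPerformance);

app.post('/api/batch', async (req, res) => {
    try {
        const { requests } = req.body;
        // Assumes each batch item has the shape { messages: [...] }
        const results = await optimizer.batchRequests(requests, async (request) => {
            const completion = await openai.createChatCompletion({
                model: "gpt-4",
                messages: request.messages
            });
            return completion.data;
        });
        res.json({
            success: true,
            results
        });
    } catch (error) {
        res.status(500).json({
            success: false,
            error: error.message
        });
    }
});

// Start clustered server
optimizer.setupCluster(app);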
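
A result cache only helps once the first request has finished. To avoid duplicate API calls while an identical request is still running, the in-flight promise itself can be shared. This is a minimal single-process sketch; `coalesce`, `inFlight`, and the `prompt:abc` key are hypothetical names, and a multi-instance deployment would still need a shared store such as Redis.

```javascript
// Map of key -> promise for requests that have started but not settled.
const inFlight = new Map();

// Concurrent callers with the same key share one promise, so the
// generator (for example an API call) runs only once.
function coalesce(key, generator) {
    if (inFlight.has(key)) {
        return inFlight.get(key);
    }
    const promise = Promise.resolve()
        .then(generator)
        .finally(() => inFlight.delete(key)); // allow fresh calls afterwards
    inFlight.set(key, promise);
    return promise;
}

// Stand-in for a slow upstream call; counts how often it really runs.
let calls = 0;
const slowLookup = () => new Promise((resolve) => {
    calls += 1;
    setTimeout(() => resolve('result'), 10);
});

const p1 = coalesce('prompt:abc', slowLookup);
const p2 = coalesce('prompt:abc', slowLookup);
```

Both callers receive the same promise object, so `slowLookup` is invoked once for the pair. The entry is removed when the promise settles, which means failures are not cached and the next caller triggers a fresh attempt.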

OpenAI Node.js Error Handling and Retry Logic

// Exponential backoff delay: 1s, 2s, 4s, ... capped at 30s
const exponentialDelay = (retryNumber) => Math.min(1000 * 2 ** retryNumber, 30000);

class OpenAIErrorHandler {
    constructor() {
        this.maxRetries = 3;
        this.errorMetrics = new Map();
    }

    async executeWithRetry(operation, context = {}) {
        let lastError;
        for (let attempt = 1; attempt <= this.maxRetries; attempt++) {
            try {
                return await operation();
            } catch (error) {
                lastError = error;
                this.trackError(error);
                if (!this.isRetryable(error)) {
                    throw this.enhanceError(error, context);
                }
                if (attempt === this.maxRetries) break; // no point waiting after the last attempt

                const delay = exponentialDelay(attempt - 1);
                console.log(`Retry attempt ${attempt} after ${delay}ms delay`);
                await this.sleep(delay);
            }
        }

        throw this.enhanceError(lastError, {
            ...context,
            retriesExhausted: true
        });
    }

    isRetryable(error) {
        // The original list did not survive; these entries are assumptions
        const retryableErrors = [
            'rate_limit_exceeded',
            'server_error'
        ];
        const status = error?.response?.status;

        return retryableErrors.includes(error?.response?.data?.error?.type) ||
               status === 429 ||
               status >= 500 ||
               error?.code === 'ECONNRESET' ||
               error?.code === 'ETIMEDOUT';
    }

    enhanceError(error, context) {
        const enhancedError = new Error(error.message);
        enhancedError.originalError = error;
        enhancedError.context = context;
        enhancedError.timestamp = new Date().toISOString();
        enhancedError.correlationId = context.correlationId;
        return enhancedError;
    }

    trackError(error) {
        const errorType = error?.response?.data?.error?.type || 'unknown';
        const current = this.errorMetrics.get(errorType) || 0;
        this.errorMetrics.set(errorType, current + 1);
    }

    sleep(ms) {
        return new Promise(resolve => setTimeout(resolve, ms));
    }

    getErrorMetrics() {
        return Object.fromEntries(this.errorMetrics);
    }
}
// Implementation example
const errorHandler = new OpenAIErrorHandler();

app.post('/api/robust-completion', async (req, res) => {
    const correlationId = uuidv4();
    try {
        const result = await errorHandler.executeWithRetry(
            async () => {
                const completion = await openai.createChatCompletion({
                    model: "gpt-4",
                    messages: req.body.messages,
                    max_tokens: 1000
                });
                return completion.data;
            },
            {
                correlationId,
                userId: req.userId,
                operation: 'chat_completion'
            }
        );

        res.json({
            success: true,
            data: result
        });

    } catch (error) {
        console.error('Enhanced error:', {
            error: error.message,
            context: error.context,
            originalError: error.originalError
        });

        res.status(500).json({
            success: false,
            error: error.message,
            correlationId,
            retryable: errorHandler.isRetryable(error.originalError)
        });
    }
});
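
When many clients hit a rate limit at the same moment, plain exponential delays make them all retry in lockstep. A common refinement is to randomize each delay ("full jitter"). The sketch below is one way to compute it; `backoffDelay` and its default values are assumptions for illustration, not part of the handler above.

```javascript
// Exponential backoff with full jitter: the delay is drawn uniformly from
// [0, min(maxMs, baseMs * 2^attempt)). `random` is injectable for testing.
function backoffDelay(attempt, { baseMs = 500, maxMs = 30000, random = Math.random } = {}) {
    const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
    return Math.floor(random() * ceiling);
}

// Largest possible delays for a few attempts (random pinned just below 1):
const ceilings = [0, 1, 2, 3, 10].map(
    (attempt) => backoffDelay(attempt, { random: () => 0.999999 })
);
```

The ceiling doubles with each attempt until it reaches `maxMs`, so late retries never wait longer than the cap, while the random draw spreads concurrent clients across the whole interval.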

OpenAI Node.js Monitoring and Analytics

const prometheus = require('prom-client');
const EventEmitter = require('events');

class OpenAIMonitor extends EventEmitter {
    constructor() {
        super();
        this.metrics = {};
        this.setupMetrics();
    }

    setupMetrics() {
        // Request duration histogram
        this.metrics.requestDuration = new prometheus.Histogram({
            name: 'openai_request_duration_seconds',
            help: 'Duration of OpenAI API requests',
            labelNames: ['operation', 'model', 'status']
        });

        // Token usage counter
        this.metrics.tokenUsage = new prometheus.Counter({
            name: 'openai_token_usage_total',
            help: 'Total tokens used by model',
            labelNames: ['model', 'type']
        });

        // Error counter
        this.metrics.errors = new prometheus.Counter({
            name: 'openai_errors_total',
            help: 'Total number of errors',
            labelNames: ['type', 'operation']
        });

        // Active requests gauge
        this.metrics.activeRequests = new prometheus.Gauge({
            name: 'openai_active_requests',
            help: 'Number of active requests'
        });
    }

    async trackRequest(operation, func) {
        const start = process.hrtime();
        this.metrics.activeRequests.inc();

        try {
            const result = await func();
            const [seconds, nanoseconds] = process.hrtime(start);
            const duration = seconds + nanoseconds / 1e9;

            this.metrics.requestDuration.observe({
                operation,
                model: result.model,
                status: 'success'
            }, duration);

            if (result.usage) {
                this.metrics.tokenUsage.inc({
                    model: result.model,
                    type: 'prompt'
                }, result.usage.prompt_tokens);

                this.metrics.tokenUsage.inc({
                    model: result.model,
                    type: 'completion'
                }, result.usage.completion_tokens);
            }

            return result;
        } catch (error) {
            const [seconds, nanoseconds] = process.hrtime(start);
            const duration = seconds + nanoseconds / 1e9;

            this.metrics.requestDuration.observe({
                operation,
                model: 'unknown',
                status: 'error'
            }, duration);

            this.metrics.errors.inc({
                type: error.type || 'unknown',
                operation
            });

            throw error;
        } finally {
            this.metrics.activeRequests.dec();
        }
    }

    getMetrics() {
        return prometheus.register.metrics();
    }
}
// Implementation
const monitor = new OpenAIMonitor();

app.get('/metrics', async (req, res) => {
    res.set('Content-Type', prometheus.register.contentType);
    res.end(await monitor.getMetrics());
});

app.post('/api/monitored-completion', async (req, res) => {
    try {
        const result = await monitor.trackRequest(
            'chat_completion', // Operation label; the exact name is an assumption
            async () => {
                const completion = await openai.createChatCompletion({
                    model: "gpt-4",
                    messages: req.body.messages
                });
                return completion.data;
            }
        );

        res.json({ success: true, data: result });
    } catch (error) {
        res.status(500).json({
            success: false,
            error: error.message
        });
    }
});
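
The timing logic inside trackRequest can also be tried out on its own, without Prometheus. The sketch below is a minimal, dependency-free variant that uses process.hrtime.bigint() (the BigInt form of the same monotonic clock) and reports the outcome instead of rethrowing. The helper names timeAsync and secondsSince are hypothetical and are not part of prom-client or the OpenAI SDK.

```javascript
// Convert a start timestamp (nanoseconds, BigInt) into elapsed seconds.
function secondsSince(startNs) {
    return Number(process.hrtime.bigint() - startNs) / 1e9;
}

// Run an async function and report how long it took and how it ended.
// Unlike trackRequest, this helper returns the error instead of rethrowing it.
async function timeAsync(func) {
    const start = process.hrtime.bigint();
    try {
        const result = await func();
        return { status: 'success', seconds: secondsSince(start), result };
    } catch (error) {
        return { status: 'error', seconds: secondsSince(start), error };
    }
}

// Usage: time any promise-returning call
timeAsync(async () => 'hello').then(outcome => {
    console.log(outcome.status, outcome.seconds);
});
```

In a real service the measured seconds value would be passed to the histogram's observe() call, as the class above does.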

OpenAI Node.js Production Deployment

// deployment/ecosystem.config.js
module.exports = {
    apps: [{
        name: 'openai-api',
        script: 'src/server.js',
        instances: 'max',
        exec_mode: 'cluster',
        env: {
            NODE_ENV: 'production',
            PORT: 3000
        },
        env_production: {
            NODE_ENV: 'production',
            PORT: 80
        }
    }]
};

# Dockerfile
FROM node:18-alpine

# Set working directory
WORKDIR /usr/src/app

# Install PM2 globally
RUN npm install pm2 -g

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm ci --only=production

# Copy application files
COPY . .

# Create non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

# Expose port (matches PORT in the default env block of ecosystem.config.js)
EXPOSE 3000

# Start with PM2
CMD ["pm2-runtime", "deployment/ecosystem.config.js"]

// deployment/load-balancer.js
const YAML = require('yaml');
const fs = require('fs');

class LoadBalancer {
    constructor() {
        this.config = {
            apiKeys: [],
            currentIndex: 0,
            usageMetrics: new Map()
        };
        this.loadConfiguration();
    }

    loadConfiguration() {
        const config = YAML.parse(
            fs.readFileSync('./config/openai-config.yml', 'utf8')
        );
        this.config.apiKeys = config.api_keys.map(key => ({
            key: key.value,
            weight: key.weight || 1,
            rateLimit: key.rate_limit || 3000
        }));
    }

    async getNextApiKey() {
        const now = Date.now();
        let selectedKey = null;

        for (const keyConfig of this.config.apiKeys) {
            const usage = this.config.usageMetrics.get(keyConfig.key) || {
                requests: 0,
                lastReset: now
            };

            // Reset usage if window expired
            if (now - usage.lastReset > 60000) {
                usage.requests = 0;
                usage.lastReset = now;
            }

            // Check if key is within rate limit
            if (usage.requests < keyConfig.rateLimit) {
                usage.requests += 1; // Count this request against the key
                selectedKey = keyConfig;
                this.config.usageMetrics.set(keyConfig.key, usage);
                break; // Stop at the first key with spare capacity
            }
        }

        if (!selectedKey) {
            throw new Error('All API keys are at capacity');
        }

        return selectedKey.key;
    }
}
// deployment/scaling.js
class AutoScaler {
    constructor() {
        this.metrics = {
            requestCount: 0,
            errorCount: 0,
            latency: []
        };
        this.thresholds = {
            requestThreshold: 1000,
            errorThreshold: 0.05,
            latencyThreshold: 2000
        };
    }

    // gatherMetrics(), shouldScaleDown(), scaleUp(), and scaleDown() are
    // platform-specific and are not shown here.
    async monitorPerformance() {
        const currentMetrics = await this.gatherMetrics();
        if (this.shouldScale(currentMetrics)) {
            await this.scaleUp();
        } else if (this.shouldScaleDown(currentMetrics)) {
            await this.scaleDown();
        }
    }

    shouldScale(metrics) {
        return metrics.requestRate > this.thresholds.requestThreshold ||
               metrics.errorRate > this.thresholds.errorThreshold ||
               metrics.avgLatency > this.thresholds.latencyThreshold;
    }
}

// deployment/health-check.js
class HealthCheck {
    constructor() {
        this.services = new Map();
        this.healthStatus = {
            openai: true,
            database: true,
            cache: true
        };
    }

    // checkDatabaseHealth() and checkCacheHealth() depend on your storage
    // layer and are not shown here.
    async checkHealth() {
        try {
            // Check OpenAI connection
            await this.checkOpenAIHealth();
            // Check database connection
            await this.checkDatabaseHealth();
            // Check cache connection
            await this.checkCacheHealth();
            return {
                status: 'healthy',
                services: this.healthStatus,
                timestamp: new Date().toISOString()
            };
        } catch (error) {
            return {
                status: 'unhealthy',
                error: error.message,
                services: this.healthStatus,
                timestamp: new Date().toISOString()
            };
        }
    }

    async checkOpenAIHealth() {
        try {
            // gpt-3.5-turbo is a chat model, so probe the chat endpoint
            // rather than the legacy completions endpoint
            await openai.createChatCompletion({
                model: "gpt-3.5-turbo",
                messages: [{ role: "user", content: "test" }],
                max_tokens: 1
            });
            this.healthStatus.openai = true;
        } catch (error) {
            this.healthStatus.openai = false;
            throw new Error('OpenAI service unhealthy');
        }
    }
}

// Implementation
const loadBalancer = new LoadBalancer();
const autoScaler = new AutoScaler();
const healthCheck = new HealthCheck();

app.get('/health', async (req, res) => {
    const health = await healthCheck.checkHealth();
    res.status(health.status === 'healthy' ? 200 : 500).json(health);
});

// Middleware to rotate API keys
app.use(async (req, res, next) => {
    try {
        const apiKey = await loadBalancer.getNextApiKey();
        req.openaiKey = apiKey;
        next(); // Hand the request on once a key is attached
    } catch (error) {
        res.status(503).json({
            error: 'Service at capacity'
        });
    }
});
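
The load balancer configuration reads a weight for each key, but the selection loop shown above never uses it. One possible way to honor the weights is weighted random selection, sketched below. The function name pickWeightedKey and the injectable random parameter are assumptions made for illustration; they do not come from the original LoadBalancer class.

```javascript
// Pick one key at random, with probability proportional to its weight.
// `random` is injectable so the choice can be made deterministic in tests.
function pickWeightedKey(keys, random = Math.random) {
    const totalWeight = keys.reduce((sum, entry) => sum + entry.weight, 0);
    if (totalWeight <= 0) {
        throw new Error('No API keys with positive weight');
    }

    let threshold = random() * totalWeight;
    for (const entry of keys) {
        threshold -= entry.weight;
        if (threshold < 0) {
            return entry;
        }
    }
    // Guard against floating-point rounding at the upper edge
    return keys[keys.length - 1];
}

// Usage: 'key-b' should be chosen about three times as often as 'key-a'
const candidates = [
    { key: 'key-a', weight: 1 },
    { key: 'key-b', weight: 3 }
];
console.log(pickWeightedKey(candidates).key);
```

In getNextApiKey, this could be applied to the subset of keys that are still under their rate limit, so that weights and limits work together.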

OpenAI Node.js Security and Compliance

const helmet = require('helmet');
const rateLimit = require('express-rate-limit');
const sanitize = require('sanitize-html');

class SecurityManager {
    constructor() {
        this.setupSecurityMiddleware();
    }

    setupSecurityMiddleware() {
        // Basic security headers
        app.use(helmet());

        // Rate limiting
        app.use(this.rateLimiter());

        // Content security policy
        app.use(this.contentSecurityPolicy());

        // Input sanitization (register after the JSON body parser)
        app.use(this.sanitizeInput());
    }

    rateLimiter() {
        return rateLimit({
            windowMs: 15 * 60 * 1000,
            max: 100,
            message: 'Too many requests from this IP'
        });
    }

    contentSecurityPolicy() {
        return helmet.contentSecurityPolicy({
            directives: {
                defaultSrc: ["'self'"],
                scriptSrc: ["'self'", "'unsafe-inline'"],
                styleSrc: ["'self'", "'unsafe-inline'"],
                imgSrc: ["'self'", 'data:', 'https:'],
                connectSrc: ["'self'", 'https://api.openai.com']
            }
        });
    }

    sanitizeInput() {
        return (req, res, next) => {
            if (req.body) {
                for (const key in req.body) {
                    if (typeof req.body[key] === 'string') {
                        req.body[key] = sanitize(req.body[key]);
                    }
                }
            }
            next(); // Without this call the request would hang
        };
    }
}
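
The sanitizeInput middleware above only cleans top-level string fields. Chat requests usually carry their text inside a nested messages array, which that loop would skip. A recursive walk is one way to cover nested values. The sketch below is an assumption-laden illustration: deepSanitize is a hypothetical helper, and stripTags is only a stand-in so the example runs without dependencies. In the guide's setup you would pass the sanitize-html function instead.

```javascript
// Recursively apply `sanitizeString` to every string in a JSON-like value.
// Returns a new structure and leaves the input untouched.
function deepSanitize(value, sanitizeString) {
    if (typeof value === 'string') {
        return sanitizeString(value);
    }
    if (Array.isArray(value)) {
        return value.map(item => deepSanitize(item, sanitizeString));
    }
    if (value !== null && typeof value === 'object') {
        const result = {};
        for (const [key, inner] of Object.entries(value)) {
            result[key] = deepSanitize(inner, sanitizeString);
        }
        return result;
    }
    return value; // Numbers, booleans, and null pass through unchanged
}

// Stand-in sanitizer for this demo only; a regex is not a safe HTML sanitizer
const stripTags = text => text.replace(/<[^>]*>/g, '');

const body = {
    messages: [{ role: 'user', content: '<b>Hello</b> there' }],
    temperature: 0.2
};
const cleaned = deepSanitize(body, stripTags);
console.log(cleaned.messages[0].content); // "Hello there"
```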

OpenAI Node.js Testing Framework

// tests/openai.test.js
const { expect } = require('chai');
const sinon = require('sinon');
const { OpenAITestSuite } = require('./test-utils');

class OpenAITesting {
    constructor() {
        this.mockedResponses = new Map();
        this.testSuite = new OpenAITestSuite();
    }

    // Mock OpenAI responses for testing
    mockCompletion(prompt, response) {
        this.mockedResponses.set(prompt, {
            data: {
                choices: [{
                    message: { content: response }
                }]
            }
        });
    }

    async runTests() {
        describe('OpenAI Integration Tests', () => {
            let openaiStub;

            beforeEach(() => {
                openaiStub = sinon.stub(openai, 'createChatCompletion');
                openaiStub.callsFake(async (params) => {
                    // Look up the mock by the first message's content
                    // (the lookup key is an assumption)
                    const mockedResponse = this.mockedResponses.get(
                        params.messages[0].content
                    );
                    if (!mockedResponse) {
                        throw new Error('No mocked response found');
                    }
                    return mockedResponse;
                });
            });

            afterEach(() => {
                // Restore the real method so stubs do not leak between tests
                openaiStub.restore();
            });

            it('should handle chat completions', async () => {
                const result = await this.testSuite.testChatCompletion({
                    messages: [{ role: 'user', content: 'Test prompt' }]
                });

                expect(result).to.have.property('success', true);
            });

            it('should handle rate limiting', async () => {
                openaiStub.rejects({
                    response: {
                        data: {
                            error: {
                                type: 'rate_limit_exceeded'
                            }
                        }
                    }
                });

                const result = await this.testSuite.testRateLimiting();
                // Assert on `result` according to what testRateLimiting() returns
            });
        });
    }
}

// Integration test examples
describe('OpenAI Integration', () => {
    let testManager;

    before(() => {
        testManager = new OpenAITesting();
        testManager.mockCompletion(
            'Test prompt',
            'Mocked response'
        );
    });

    it('should handle vector storage', async () => {
        const vectorDb = new VectorDatabaseManager(openai);
        const testDocument = 'Test document for vector storage';
        const vectorId = await vectorDb.upsertDocument(testDocument);

        const searchResults = await vectorDb.semanticSearch('test');

        // Minimal assumed checks; adapt them to the shapes your manager returns
        expect(vectorId).to.exist;
        expect(searchResults).to.exist;
    });
});
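
The stubbing approach above can also be expressed without sinon by passing a fake client into the code under test. The sketch below is a minimal illustration using only Node's built-in assert module. The names createFakeOpenAI and askModel are hypothetical; the fake only mimics the response shape that the guide reads from createChatCompletion (data.choices[0].message.content).

```javascript
const { strictEqual, rejects } = require('node:assert');

// Hypothetical fake client: answers from a Map keyed by the last user message
function createFakeOpenAI(responsesByPrompt) {
    const calls = [];
    return {
        calls,
        async createChatCompletion(params) {
            calls.push(params);
            const prompt = params.messages[params.messages.length - 1].content;
            if (!responsesByPrompt.has(prompt)) {
                throw new Error(`No mocked response for: ${prompt}`);
            }
            return {
                data: {
                    choices: [{ message: { content: responsesByPrompt.get(prompt) } }]
                }
            };
        }
    };
}

// Code under test: receives the client as a parameter so it can be swapped
async function askModel(client, prompt) {
    const completion = await client.createChatCompletion({
        model: 'gpt-4',
        messages: [{ role: 'user', content: prompt }]
    });
    return completion.data.choices[0].message.content;
}

async function runFakeClientChecks() {
    const fake = createFakeOpenAI(new Map([['Test prompt', 'Mocked response']]));

    strictEqual(await askModel(fake, 'Test prompt'), 'Mocked response');
    strictEqual(fake.calls.length, 1);
    await rejects(() => askModel(fake, 'Unknown prompt'), /No mocked response/);
    return 'all checks passed';
}

runFakeClientChecks().then(console.log);
```

Injecting the client keeps tests free of network calls and API costs, at the price of maintaining a fake that tracks the real response shape.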

OpenAI Node.js CI/CD Pipeline

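# .github/workflows/openai-deployment.yml
name: OpenAI Node.js CI/CD

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

  deploy:
    needs: test
    # Deploy only on pushes to main, not on pull requests
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to production
        uses: digitalocean/action-doctl@v2
        with:
          token: ${{ secrets.DIGITALOCEAN_TOKEN }}
      - name: Build and push Docker image
        # DigitalOcean image paths normally include your registry name
        # before the image name; adjust the tag below to match yours.
        run: |
          doctl registry login
          docker build -t registry.digitalocean.com/openai-nodejs-app .
          docker push registry.digitalocean.com/openai-nodejs-app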

OpenAI Node.js Documentation System

// documentation/ApiDocumentation.js
class ApiDocumentation {
    constructor() {
        this.docs = new Map();
        this.examples = new Map();
    }

    /**
     * @api {post} /api/chat Generate chat completion
     * @apiName ChatCompletion
     * @apiGroup OpenAI
     * @apiVersion 1.0.0
     * @apiParam {Object[]} messages Array of message objects
     * @apiParam {String} messages.role Role of the message sender
     * @apiParam {String} messages.content Content of the message
     * @apiSuccess {Boolean} success Indicates if request was successful
     * @apiSuccess {String} completion Generated completion text
     */
    documentEndpoint(path, method, description) {
        this.docs.set(`${method}:${path}`, {
            path,
            method,
            description,
            parameters: [],
            responses: []
        });
    }

    addExample(path, method, example) {
        const key = `${method}:${path}`;
        if (!this.examples.has(key)) {
            this.examples.set(key, []);
        }
        this.examples.get(key).push(example);
    }

    generateDocs() {
        let markdown = '# OpenAI Node.js API Documentation\n\n';

        for (const [endpoint, doc] of this.docs) {
            markdown += `## ${endpoint}\n\n`;
            markdown += `${doc.description}\n\n`;

            const examples = this.examples.get(endpoint) || [];
            if (examples.length > 0) {
                markdown += '### Examples\n\n';
                examples.forEach((example, index) => {
                    markdown += `#### Example ${index + 1}\n\n`;
                    markdown += '```javascript\n';
                    markdown += JSON.stringify(example, null, 2);
                    markdown += '\n```\n\n';
                });
            }
        }

        return markdown;
    }
}

// Implementation
const apiDocs = new ApiDocumentation();

apiDocs.documentEndpoint('/api/chat', 'POST', 'Generate chat completion using OpenAI GPT models');
apiDocs.addExample('/api/chat', 'POST', {
    request: {
        messages: [
            { role: 'user', content: 'Hello, how are you?' }
        ]
    },
    response: {
        success: true,
        completion: 'I am doing well, thank you for asking!'
    }
});

// Generate documentation
const docs = apiDocs.generateDocs();

OpenAI Node.js Advanced Fine-tuning Implementation

// Note: the method names below follow the client style used throughout this
// guide. Confirm that your installed SDK version exposes them; the v4 SDK
// uses openai.files.create and openai.fineTuning.jobs.create instead.
class FineTuningManager {
    constructor(openai) {
        this.openai = openai;
        this.trainingJobs = new Map();
    }

    async prepareTrainingData(rawData) {
        const formattedData = rawData.map(item => ({
            messages: [
                { role: "system", content: item.systemPrompt },
                { role: "user", content: item.userInput },
                { role: "assistant", content: item.expectedOutput }
            ]
        }));

        // Create JSONL file for fine-tuning (one JSON object per line)
        const trainingFile = formattedData
            .map(item => JSON.stringify(item))
            .join('\n');

        const file = await this.openai.createFile({
            file: Buffer.from(trainingFile),
            purpose: 'fine-tune'
        });

        return file.data.id;
    }

    async startFineTuning(fileId, modelConfig = {}) {
        const {
            baseModel = "gpt-3.5-turbo",
            epochs = 3,
            batchSize = 4
        } = modelConfig;

        const fineTuningJob = await this.openai.createFineTuningJob({
            training_file: fileId,
            model: baseModel,
            hyperparameters: {
                n_epochs: epochs,
                batch_size: batchSize
            }
        });

        this.trainingJobs.set(fineTuningJob.data.id, {
            status: 'training',
            startTime: new Date(),
            config: modelConfig
        });

        return fineTuningJob.data.id;
    }

    async monitorTrainingProgress(jobId) {
        const job = await this.openai.retrieveFineTuningJob(jobId);
        const metrics = {
            status: job.data.status,
            trainedTokens: job.data.trained_tokens,
            elapsedTime: Date.now() - this.trainingJobs.get(jobId).startTime,
            lossMetrics: job.data.result_files
        };

        // Keep the locally tracked job in sync with the latest status
        this.trainingJobs.set(jobId, {
            ...this.trainingJobs.get(jobId),
            status: metrics.status
        });

        return metrics;
    }
}

// Implementation example
app.post('/api/fine-tune', async (req, res) => {
    const fineTuningManager = new FineTuningManager(openai);
    try {
        const fileId = await fineTuningManager.prepareTrainingData(
            req.body.trainingData // Request field name is an assumption
        );

        const jobId = await fineTuningManager.startFineTuning(fileId, {
            baseModel: "gpt-3.5-turbo",
            epochs: 5,
            batchSize: 8
        });

        res.json({
            success: true,
            jobId,
            message: 'Fine-tuning job started successfully'
        });
    } catch (error) {
        console.error('Fine-tuning error:', error);
        res.status(500).json({
            success: false,
            error: error.message
        });
    }
});
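
Uploading a malformed training file wastes a round trip, so it can help to check the examples locally before building the JSONL payload. The sketch below is one possible pre-flight check, not an official validator. The helper names validateExample and toJsonl are hypothetical, and the set of allowed roles is an assumption based on the three roles that prepareTrainingData produces.

```javascript
// Roles produced by prepareTrainingData (assumed to be the full allowed set)
const ALLOWED_ROLES = new Set(['system', 'user', 'assistant']);

// Return a list of problems for one training example (empty when it is valid)
function validateExample(example, index) {
    if (!example || !Array.isArray(example.messages) || example.messages.length === 0) {
        return [`example ${index}: "messages" must be a non-empty array`];
    }

    const problems = [];
    example.messages.forEach((message, position) => {
        if (!message || !ALLOWED_ROLES.has(message.role)) {
            problems.push(`example ${index}, message ${position}: unknown role`);
        }
        if (!message || typeof message.content !== 'string' || message.content.trim() === '') {
            problems.push(`example ${index}, message ${position}: content must be a non-empty string`);
        }
    });

    if (!example.messages.some(message => message && message.role === 'assistant')) {
        problems.push(`example ${index}: no assistant message to learn from`);
    }
    return problems;
}

// Serialize validated examples as JSONL: one JSON object per line
function toJsonl(examples) {
    const problems = examples.flatMap(validateExample);
    if (problems.length > 0) {
        throw new Error(`Invalid training data:\n${problems.join('\n')}`);
    }
    return examples.map(example => JSON.stringify(example)).join('\n');
}

// Usage
const jsonl = toJsonl([
    { messages: [
        { role: 'user', content: 'Hi' },
        { role: 'assistant', content: 'Hello!' }
    ] }
]);
console.log(jsonl.split('\n').length); // 1
```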

OpenAI Node.js Custom Model Pipeline

class ModelPipeline {
    constructor() {
        this.stages = [];
        this.middleware = new Map();
    }

    addStage(name, processor) {
        this.stages.push({ name, processor });
    }

    addMiddleware(name, middleware) {
        this.middleware.set(name, middleware);
    }

    async process(input) {
        let result = input;

        for (const stage of this.stages) {
            // Apply pre-processing middleware
            const preMiddleware = this.middleware.get(`pre:${stage.name}`);
            if (preMiddleware) {
                result = await preMiddleware(result);
            }

            // Process stage
            result = await stage.processor(result);

            // Apply post-processing middleware
            const postMiddleware = this.middleware.get(`post:${stage.name}`);
            if (postMiddleware) {
                result = await postMiddleware(result);
            }
        }

        return result;
    }
}

// Example implementation
const pipeline = new ModelPipeline();

// Add preprocessing stage
pipeline.addStage('preprocess', async (input) => {
    return {
        ...input, // Carry systemPrompt and other fields to later stages
        text: input.text.toLowerCase().trim()
    };
});

// Add embedding generation
pipeline.addStage('embedding', async (input) => {
    const embedding = await openai.createEmbedding({
        model: "text-embedding-ada-002",
        input: input.text
    });

    return {
        ...input,
        embedding: embedding.data.data[0].embedding
    };
});

// Add completion generation
pipeline.addStage('completion', async (input) => {
    const completion = await openai.createChatCompletion({
        model: "gpt-4",
        messages: [
            { role: "system", content: input.systemPrompt },
            { role: "user", content: input.text }
        ]
    });

    return {
        ...input,
        completion: completion.data.choices[0].message.content
    };
});

// Add middleware for logging
pipeline.addMiddleware('pre:completion', async (input) => {
    console.log(`Processing completion for: ${input.text}`);
    return input;
});

// Usage example
app.post('/api/process', async (req, res) => {
    try {
        const result = await pipeline.process({
            text: req.body.text,
            systemPrompt: req.body.systemPrompt
        });

        res.json({
            success: true,
            result
        });
    } catch (error) {
        res.status(500).json({
            success: false,
            error: error.message
        });
    }
});

Key Takeaways from OpenAI Node.js Integration

  1. Always implement robust error handling and retries
  2. Use vector databases for efficient similarity searches
  3. Monitor API usage and implement proper rate limiting
  4. Secure your API keys and implement proper authentication
  5. Use streaming for real-time responses
  6. Implement proper logging and monitoring
  7. Follow deployment best practices
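
To make the first takeaway concrete, here is a minimal sketch of retrying with exponential backoff and jitter. It is an illustration under stated assumptions rather than the guide's own error handler: withRetry and isRetryableStatus are hypothetical names, and the predicate assumes errors expose an HTTP status at error.response.status.

```javascript
// Retry an async operation with exponential backoff plus random jitter.
// `sleep` is injectable so tests can skip the real waiting time.
async function withRetry(operation, options = {}) {
    const {
        maxAttempts = 3,
        baseDelayMs = 500,
        isRetryable = () => true,
        sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
    } = options;

    let lastError;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return await operation(attempt);
        } catch (error) {
            lastError = error;
            if (attempt === maxAttempts || !isRetryable(error)) {
                break;
            }
            // Delay doubles each attempt: base, 2x base, 4x base, ...
            const backoff = baseDelayMs * 2 ** (attempt - 1);
            const jitter = Math.random() * baseDelayMs;
            await sleep(backoff + jitter);
        }
    }
    throw lastError;
}

// Hypothetical predicate: retry on rate limits (429) and server errors (5xx).
// Errors without a response object are treated as not retryable here.
function isRetryableStatus(error) {
    const status = error.response && error.response.status;
    return status === 429 || (status >= 500 && status < 600);
}
```

A call might look like withRetry(() => openai.createChatCompletion(params), { isRetryable: isRetryableStatus }). The jitter spreads retries from many clients over time so they do not all hit the API at the same instant.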

Additional Resources

  1. Official OpenAI Documentation: https://platform.openai.com/docs
  2. LangChain Documentation: https://js.langchain.com/docs
  3. Vector Database Documentation:
  4. Node.js Best Practices: https://github.com/goldbergyoni/nodebestpractices