Part 4 of 6

Choosing an AI API Provider

Most web developers integrate AI through APIs rather than building models from scratch. Several providers offer powerful AI capabilities you can access with a few lines of code.

Major AI API Providers

  • OpenAI – the most popular choice
  • Anthropic (Claude) – known for strong reasoning
  • Google (Gemini) – offers a free tier
  • Open-source models – self-hosted, for privacy and control

Getting Started with OpenAI API

Let's walk through a complete example using OpenAI's API (the most popular choice).

Step 1: Get API Key

  1. Sign up at platform.openai.com
  2. Navigate to API keys section
  3. Create a new API key
  4. Store it securely (never commit to GitHub!), e.g. in an environment variable as sketched below
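
For example, keep the key in a local .env file that never enters version control. This sketch assumes the dotenv package; the variable name matches the code examples that follow.

# .env  (add this file to .gitignore)
OPENAI_API_KEY=sk-...

// Load it at startup (Node.js)
import 'dotenv/config';

console.log(process.env.OPENAI_API_KEY ? 'API key loaded' : 'API key missing');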

Step 2: Install SDK

# Node.js
npm install openai

# Python
pip install openai

Step 3: Basic Implementation

// Node.js/JavaScript example
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY, // Store in environment variable
});

async function generateResponse(userMessage) {
  try {
    const completion = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        {
          role: "system",
          content: "You are a helpful customer support assistant."
        },
        {
          role: "user",
          content: userMessage
        }
      ],
      temperature: 0.7,
      max_tokens: 500,
    });

    return completion.choices[0].message.content;
  } catch (error) {
    console.error('OpenAI API error:', error);
    throw error;
  }
}

// Usage (top-level await requires an ES module context)
const response = await generateResponse("How do I reset my password?");
console.log(response);

Key Parameters Explained

  • model – Which AI model to use (gpt-4o, gpt-4-turbo, gpt-3.5-turbo, etc.)
  • messages – Array of conversation messages with roles (system, user, assistant)
  • temperature – Randomness/creativity (0 = focused and nearly deterministic, 2 = very random)
  • max_tokens – Maximum length of response (limits cost)
  • top_p – Alternative to temperature for controlling randomness (adjust one or the other, not both)
  • presence_penalty – Encourages talking about new topics (-2 to 2)
  • frequency_penalty – Reduces repetition (-2 to 2)

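To see how several of these fit together, here is an illustrative request; the values are examples to tune, not recommendations (top_p is omitted since it is normally adjusted instead of temperature, not alongside it):

// Illustrative request combining the parameters above
const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Suggest three blog post titles about CSS." }],
  temperature: 0.9,       // Lean creative
  max_tokens: 200,        // Cap response length and cost
  presence_penalty: 0.5,  // Nudge toward new topics
  frequency_penalty: 0.5, // Discourage repeated phrasing
});

console.log(completion.choices[0].message.content);
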
Architecture Patterns for AI Integration

Pattern 1: Client-Side Direct (Simple but Limited)

Why avoid: Your API key is exposed in client code, allowing anyone to use (and abuse) your quota.
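
As a quick sketch of the anti-pattern (do not ship this), the key ends up in the JavaScript bundle served to every visitor:

// DON'T: instantiating the client in browser code exposes the key
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'sk-...', // A bundled key is readable in DevTools by anyone
  dangerouslyAllowBrowser: true, // The SDK makes you opt in to this risk explicitly
});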

Pattern 2: Backend Proxy (Recommended)

Implementation example:

// Backend API endpoint (Express.js)
import express from 'express';
import OpenAI from 'openai';

const app = express();
app.use(express.json()); // Needed so req.body is populated from JSON requests
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.post('/api/chat', async (req, res) => {
  // 1. Authenticate user (authenticateUser, isRateLimited, and logUsage
  // below are app-specific helpers, not shown here)
  const user = await authenticateUser(req);
  if (!user) {
    return res.status(401).json({ error: 'Unauthorized' });
  }

  // 2. Rate limiting
  if (await isRateLimited(user.id)) {
    return res.status(429).json({ error: 'Too many requests' });
  }

  // 3. Validate and sanitize input
  const { message } = req.body;
  if (!message || message.length > 1000) {
    return res.status(400).json({ error: 'Invalid message' });
  }

  try {
    // 4. Call OpenAI API
    const completion = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: message }
      ],
      max_tokens: 500,
    });

    // 5. Log usage for billing
    await logUsage(user.id, completion.usage);

    // 6. Return response
    res.json({
      response: completion.choices[0].message.content,
      usage: completion.usage
    });
  } catch (error) {
    console.error('OpenAI error:', error);
    res.status(500).json({ error: 'AI service unavailable' });
  }
});

// Frontend code
async function askAI(message) {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });

  if (!response.ok) throw new Error('Request failed');
  return response.json();
}
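
For completeness, a call mirroring the earlier usage example (the question is illustrative):

// Usage
const { response: answer } = await askAI('What are your support hours?');
console.log(answer);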

Pattern 3: Streaming Responses

For better UX, stream AI responses token-by-token instead of waiting for the complete response.

// Backend - streaming endpoint
app.post('/api/chat/stream', async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  try {
    const stream = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: req.body.message }],
      stream: true,
    });

    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content || '';
      if (content) {
        res.write(`data: ${JSON.stringify({ content })}\n\n`);
      }
    }

    res.write('data: [DONE]\n\n');
  } catch (error) {
    // Headers are already sent, so report the failure in-stream
    console.error('Streaming error:', error);
    res.write(`data: ${JSON.stringify({ error: 'Stream failed' })}\n\n`);
  }
  res.end();
});

// Frontend - receive streaming response
async function streamAIResponse(message) {
  const response = await fetch('/api/chat/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = ''; // Network chunks can split an SSE line, so keep a partial-line buffer

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // Hold back the last (possibly incomplete) line

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = line.slice(6);
        if (data === '[DONE]') return;

        const parsed = JSON.parse(data);
        updateUI(parsed.content); // App-specific function to update the UI incrementally
      }
    }
  }
}

Cost Management Strategies

AI API costs can add up quickly. Implement these strategies to control expenses:

1. Caching

import crypto from 'crypto';
import Redis from 'ioredis';

const redis = new Redis();

// Build a stable cache key from the prompt text
function hash(text) {
  return crypto.createHash('sha256').update(text).digest('hex');
}

async function getCachedAIResponse(prompt) {
  // Check cache first
  const cached = await redis.get(`ai:${hash(prompt)}`);
  if (cached) return JSON.parse(cached);

  // Call AI if not cached
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
  });
  const result = response.choices[0].message.content;

  // Cache for 1 hour
  await redis.setex(`ai:${hash(prompt)}`, 3600, JSON.stringify(result));

  return result;
}

2. Token Limits & Truncation

import { encoding_for_model } from 'tiktoken';

function truncateToTokenLimit(text, maxTokens = 4000) {
  const encoding = encoding_for_model('gpt-4');
  const tokens = encoding.encode(text);

  if (tokens.length <= maxTokens) {
    encoding.free(); // The WASM encoder must be freed explicitly
    return text;
  }

  // Truncate, then decode the token bytes back to a string
  const truncated = tokens.slice(0, maxTokens);
  const result = new TextDecoder().decode(encoding.decode(truncated));
  encoding.free();
  return result;
}

3. Model Selection

// Use cheaper models for simple tasks
// (prices are illustrative; check current pricing, as rates change)
function chooseModel(taskComplexity) {
  if (taskComplexity === 'simple') {
    return 'gpt-3.5-turbo'; // ~$0.0005 per 1K input tokens
  } else {
    return 'gpt-4o'; // ~$0.0025 per 1K input / ~$0.01 per 1K output tokens
  }
}

4. Rate Limiting

import rateLimit from 'express-rate-limit';

const aiRateLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 50, // Limit each client to 50 requests per window (keyed by IP by default)
  message: 'Too many AI requests, please try again later.',
});

app.post('/api/chat', aiRateLimiter, async (req, res) => {
  // ... handle request
});

Error Handling Best Practices

async function robustAICall(prompt, options = {}) {
  const maxRetries = 3;
  let lastError;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await openai.chat.completions.create({
        model: "gpt-4o", // Sensible default; callers can override via options
        ...options,
        messages: [{ role: "user", content: prompt }],
      });

      return response.choices[0].message.content;

    } catch (error) {
      lastError = error;

      // Don't retry on certain errors
      if (error.status === 400) {
        throw new Error('Invalid request: ' + error.message);
      }

      // Retry on rate limits with exponential backoff
      if (error.status === 429) {
        const delay = Math.pow(2, attempt) * 1000;
        console.log(`Rate limited, retrying in ${delay}ms...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }

      // Retry on server errors
      if (error.status >= 500) {
        console.log(`Server error, retrying attempt ${attempt + 1}...`);
        await new Promise(resolve => setTimeout(resolve, 1000));
        continue;
      }

      throw error;
    }
  }

  throw new Error(`Failed after ${maxRetries} attempts: ${lastError.message}`);
}
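
One more safeguard worth pairing with retries: timeouts, so a stalled call fails fast instead of hanging. The OpenAI Node SDK accepts a per-request timeout in milliseconds; the 30-second value below is an illustrative choice, not a recommendation.

// Abort the request if no response arrives within 30 seconds
const completion = await openai.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
  },
  { timeout: 30_000 }
);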

Key Takeaways

  • Major AI providers: OpenAI (most popular), Anthropic (best reasoning), Google (free tier), open source (privacy/control).
  • Never expose API keys in client code—always use a backend proxy.
  • Backend proxy pattern: frontend → your server → AI API (enables auth, rate limiting, input validation).
  • Stream responses for better UX—show text as it's generated instead of waiting.
  • Manage costs: cache responses, set token limits, choose cheaper models for simple tasks, implement rate limiting.
  • Robust error handling: retry with exponential backoff for rate limits, don't retry on client errors (400s).
  • Key parameters: model, temperature (randomness), max_tokens (cost control), messages (conversation context).
  • Monitor usage and set budget alerts to avoid surprise bills.
  • Log all AI requests for debugging, analytics, and compliance.
  • Implement timeouts to prevent hanging requests.

Next, let's explore AI-powered development tools that can accelerate your workflow.