Choosing an AI API Provider
Most web developers integrate AI through APIs rather than building models from scratch. Several providers offer powerful AI capabilities you can access with a few lines of code.
Major AI API Providers
- OpenAI – the most popular choice; the GPT family (gpt-4o, gpt-4-turbo, gpt-3.5-turbo) powers the examples below
- Anthropic – the Claude models, known for strong reasoning
- Google – the Gemini models, with a generous free tier
- Open-source models (self-hosted or via a hosting provider) – maximum privacy and control
Getting Started with OpenAI API
Let's walk through a complete example using OpenAI's API (the most popular choice).
Step 1: Get API Key
- Sign up at platform.openai.com
- Navigate to the API keys section
- Create a new API key
- Store it securely (never commit it to GitHub!)
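A common way to keep the key out of your code is an environment variable. A minimal sketch, assuming Node.js and the dotenv package (the .env file itself must also be listed in .gitignore):

```javascript
// .env (never committed) contains a single line:
// OPENAI_API_KEY=<your key>
import 'dotenv/config'; // loads .env into process.env

console.log(Boolean(process.env.OPENAI_API_KEY)); // true if the key loaded
```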
Step 2: Install SDK
```bash
# Node.js
npm install openai

# Python
pip install openai
```
Step 3: Basic Implementation
```javascript
// Node.js/JavaScript example
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY, // store in an environment variable
});

async function generateResponse(userMessage) {
  try {
    const completion = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        {
          role: "system",
          content: "You are a helpful customer support assistant."
        },
        {
          role: "user",
          content: userMessage
        }
      ],
      temperature: 0.7,
      max_tokens: 500,
    });
    return completion.choices[0].message.content;
  } catch (error) {
    console.error('OpenAI API error:', error);
    throw error;
  }
}

// Usage (top-level await works in ES modules)
const response = await generateResponse("How do I reset my password?");
console.log(response);
```
Key Parameters Explained
- model – Which AI model to use (gpt-4o, gpt-4-turbo, gpt-3.5-turbo, etc.)
- messages – Array of conversation messages with roles (system, user, assistant)
- temperature – Randomness/creativity (0 = near-deterministic, 2 = very random)
- max_tokens – Maximum length of the response in tokens (also caps cost)
- top_p – Nucleus-sampling alternative to temperature (adjust one or the other, not both)
- presence_penalty – Encourages talking about new topics (-2 to 2)
- frequency_penalty – Reduces repetition (-2 to 2)
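To make these knobs concrete, here is a sketch of two request configurations, one tuned for predictable answers and one for brainstorming (the model choice and the exact values are illustrative, not prescriptive):

```javascript
// Predictable, low-variance output: good for support answers or extraction
const factual = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize our refund policy." }],
  temperature: 0,    // minimize randomness
  max_tokens: 200,   // short answers, bounded cost
});

// Varied, creative output: good for brainstorming
const creative = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Suggest ten product taglines." }],
  temperature: 1.2,        // more randomness
  presence_penalty: 0.6,   // nudge toward new topics
  frequency_penalty: 0.4,  // damp verbatim repetition
  max_tokens: 400,
});
```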
Architecture Patterns for AI Integration
Pattern 1: Client-Side Direct (Simple but Limited)
In this pattern, the browser calls the AI provider's API directly, with your API key embedded in the frontend bundle.
Why avoid it: your API key is exposed in client code, so anyone can extract it and use (and abuse) your quota.
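For illustration only, this is roughly what the anti-pattern looks like (the OpenAI SDK even makes you opt in with a flag whose name is itself a warning):

```javascript
// DON'T do this in production code
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'sk-...', // hard-coded key, shipped to every visitor's browser
  dangerouslyAllowBrowser: true, // the SDK requires this opt-in for a reason
});
```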
Pattern 2: Backend Proxy (Recommended)
Here the frontend calls your own server, which holds the API key and forwards requests to the AI provider. This chokepoint is where you add authentication, rate limiting, input validation, and usage logging.
Implementation example:
```javascript
// Backend API endpoint (Express.js)
import express from 'express';
import OpenAI from 'openai';

const app = express();
app.use(express.json()); // required so req.body is parsed

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.post('/api/chat', async (req, res) => {
  // 1. Authenticate user (authenticateUser is your app's auth helper)
  const user = await authenticateUser(req);
  if (!user) {
    return res.status(401).json({ error: 'Unauthorized' });
  }

  // 2. Rate limiting (isRateLimited is sketched below)
  if (await isRateLimited(user.id)) {
    return res.status(429).json({ error: 'Too many requests' });
  }

  // 3. Validate and sanitize input
  const { message } = req.body;
  if (!message || message.length > 1000) {
    return res.status(400).json({ error: 'Invalid message' });
  }

  try {
    // 4. Call OpenAI API
    const completion = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: message }
      ],
      max_tokens: 500,
    });

    // 5. Log usage for billing (logUsage is sketched below)
    await logUsage(user.id, completion.usage);

    // 6. Return response
    res.json({
      response: completion.choices[0].message.content,
      usage: completion.usage
    });
  } catch (error) {
    console.error('OpenAI error:', error);
    res.status(500).json({ error: 'AI service unavailable' });
  }
});

app.listen(3000); // start the server
```
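The authenticateUser, isRateLimited, and logUsage calls above are placeholders for your own infrastructure. A minimal in-memory sketch of the latter two, fine for a demo but not for production (use Redis or a database there):

```javascript
// Hypothetical helpers -- swap in Redis or a database for real deployments
const requestCounts = new Map(); // userId -> timestamps of recent requests

async function isRateLimited(userId, limit = 50, windowMs = 15 * 60 * 1000) {
  const now = Date.now();
  const recent = (requestCounts.get(userId) ?? []).filter(t => now - t < windowMs);
  recent.push(now);
  requestCounts.set(userId, recent);
  return recent.length > limit;
}

async function logUsage(userId, usage) {
  // usage carries prompt_tokens, completion_tokens, and total_tokens
  console.log(`[usage] user=${userId} total_tokens=${usage?.total_tokens}`);
}
```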
```javascript
// Frontend code
async function askAI(message) {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });
  if (!response.ok) throw new Error('Request failed');
  return response.json();
}
```
Pattern 3: Streaming Responses
For a better user experience, stream AI responses token by token instead of making users wait for the complete response.
```javascript
// Backend - streaming endpoint (Server-Sent Events)
app.post('/api/chat/stream', async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  const stream = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: req.body.message }],
    stream: true,
  });

  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || '';
    if (content) {
      res.write(`data: ${JSON.stringify({ content })}\n\n`);
    }
  }

  res.write('data: [DONE]\n\n');
  res.end();
});
```
```javascript
// Frontend - receive streaming response
async function streamAIResponse(message) {
  const response = await fetch('/api/chat/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = ''; // an event can be split across network chunks

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep any incomplete line for the next chunk

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = line.slice(6);
        if (data === '[DONE]') return;
        const parsed = JSON.parse(data);
        updateUI(parsed.content); // app-specific incremental UI update
      }
    }
  }
}
```
Cost Management Strategies
AI API costs can add up quickly. Implement these strategies to control expenses:
1. Caching
```javascript
import { createHash } from 'node:crypto';
import Redis from 'ioredis';

const redis = new Redis();
const hash = (s) => createHash('sha256').update(s).digest('hex');

async function getCachedAIResponse(prompt) {
  // Check cache first
  const cached = await redis.get(`ai:${hash(prompt)}`);
  if (cached) return JSON.parse(cached);

  // Call AI if not cached
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
  });
  const result = response.choices[0].message.content;

  // Cache for 1 hour
  await redis.setex(`ai:${hash(prompt)}`, 3600, JSON.stringify(result));
  return result;
}
```
2. Token Limits & Truncation
```javascript
import { encoding_for_model } from 'tiktoken';

function truncateToTokenLimit(text, maxTokens = 4000) {
  const encoding = encoding_for_model('gpt-4');
  try {
    const tokens = encoding.encode(text);
    if (tokens.length <= maxTokens) return text;

    // Truncate, then decode the UTF-8 bytes back to a string
    const truncated = tokens.slice(0, maxTokens);
    return new TextDecoder().decode(encoding.decode(truncated));
  } finally {
    encoding.free(); // tiktoken encoders hold WASM memory
  }
}
```
3. Model Selection
```javascript
// Use cheaper models for simple tasks
// (prices below are indicative; check the provider's current pricing page)
function chooseModel(taskComplexity) {
  if (taskComplexity === 'simple') {
    return 'gpt-3.5-turbo'; // ~$0.0005 per 1K input tokens
  } else {
    return 'gpt-4o'; // ~$0.0025-$0.01 per 1K tokens (input/output)
  }
}
```
4. Rate Limiting
```javascript
import rateLimit from 'express-rate-limit';

const aiRateLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 50, // limit each client (keyed by IP unless you set keyGenerator) to 50 requests per window
  message: 'Too many AI requests, please try again later.',
});

app.post('/api/chat', aiRateLimiter, async (req, res) => {
  // ... handle request
});
```
Error Handling Best Practices
```javascript
async function robustAICall(prompt, options = {}) {
  const maxRetries = 3;
  let lastError;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await openai.chat.completions.create({
        model: "gpt-4o", // default; callers can override via options
        ...options,
        messages: [{ role: "user", content: prompt }],
      });
      return response.choices[0].message.content;
    } catch (error) {
      lastError = error;

      // Don't retry on certain errors
      if (error.status === 400) {
        throw new Error('Invalid request: ' + error.message);
      }

      // Retry on rate limits with exponential backoff
      if (error.status === 429) {
        const delay = Math.pow(2, attempt) * 1000;
        console.log(`Rate limited, retrying in ${delay}ms...`);
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }

      // Retry on server errors
      if (error.status >= 500) {
        console.log(`Server error, retrying attempt ${attempt + 1}...`);
        await new Promise(resolve => setTimeout(resolve, 1000));
        continue;
      }

      throw error;
    }
  }

  throw new Error(`Failed after ${maxRetries} attempts: ${lastError.message}`);
}
```
Key Takeaways
- Major AI providers: OpenAI (most popular), Anthropic (best reasoning), Google (free tier), open source (privacy/control).
- Never expose API keys in client code—always use a backend proxy.
- Backend proxy pattern: frontend → your server → AI API (enables auth, rate limiting, input validation).
- Stream responses for better UX—show text as it's generated instead of waiting.
- Manage costs: cache responses, set token limits, choose cheaper models for simple tasks, implement rate limiting.
- Robust error handling: retry with exponential backoff for rate limits, don't retry on client errors (400s).
- Key parameters: model, temperature (randomness), max_tokens (cost control), messages (conversation context).
- Monitor usage and set budget alerts to avoid surprise bills.
- Log all AI requests for debugging, analytics, and compliance.
- Implement timeouts to prevent hanging requests (see the sketch below).
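A minimal timeout sketch for the fetch-based frontend helper shown earlier, using AbortController (the 15-second budget is an arbitrary choice):

```javascript
async function askAIWithTimeout(message, timeoutMs = 15_000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message }),
      signal: controller.signal, // aborts the request when the timer fires
    });
    if (!response.ok) throw new Error('Request failed');
    return await response.json();
  } finally {
    clearTimeout(timer); // always clear, on success or failure
  }
}
```

On the server side, the official openai Node SDK also accepts a timeout option (in milliseconds) when you construct the client, if you prefer to bound the upstream call there.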
Next, let's explore AI-powered development tools that can accelerate your workflow.