
Optimizations

Building an AI-powered code review system at scale requires aggressive optimization, particularly around data ingestion and AI model interactions.

Instead of re-indexing the entire repository on every change, we implement an incremental approach using Git diffs.

  1. Commit Tracking: The Repository model stores the lastIndexedCommitSha.
  2. GitHub Diff: We use the GitHub compare API to get the list of files changed between the last indexed commit and the current HEAD.
  3. Targeted Invalidation: Only chunks belonging to changed files are deleted and re-created.
inngest/functions/index.ts
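// Files that differ between the last indexed commit and the commit being indexed.
const changedFiles = await getChangedFiles(
  owner,
  repo,
  token,
  repository.lastIndexedCommitSha,
  effectiveCommitSha
);

// Drop stale chunks for those files only; chunks for untouched files are kept.
if (changedFiles.length > 0) {
  await prisma.codeChunk.deleteMany({
    where: {
      repoId: repository.id,
      filePath: { in: changedFiles },
    },
  });
}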
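The body of getChangedFiles is not shown here. As a rough sketch of one piece of it: GitHub's compare endpoint (GET /repos/{owner}/{repo}/compare/{base}...{head}) returns a files array whose entries carry filename, status and, for renames, previous_filename. The helper below, planInvalidation, is a hypothetical name and not part of the codebase. It assumes the caller has already fetched that array, and it separates paths whose chunks should be deleted from paths that still exist and need re-indexing.

```typescript
// Subset of the fields on each entry of the compare response's `files` array.
interface CompareFile {
  filename: string;
  status: string; // e.g. "added", "modified", "removed", "renamed"
  previous_filename?: string; // set when the file was renamed
}

interface ChangePlan {
  toInvalidate: string[]; // paths whose stored chunks should be deleted
  toReindex: string[]; // paths that still exist and need fresh chunks
}

// Hypothetical helper: turn compare results into an invalidation plan.
function planInvalidation(files: CompareFile[]): ChangePlan {
  const toInvalidate: string[] = [];
  const toReindex: string[] = [];
  const pushUnique = (list: string[], path: string): void => {
    if (list.indexOf(path) === -1) list.push(path);
  };

  for (const file of files) {
    // A renamed file's old chunks are stored under its previous path.
    if (file.status === "renamed" && file.previous_filename) {
      pushUnique(toInvalidate, file.previous_filename);
    }
    pushUnique(toInvalidate, file.filename);
    // Removed files have nothing left to index.
    if (file.status !== "removed") {
      pushUnique(toReindex, file.filename);
    }
  }
  return { toInvalidate, toReindex };
}
```

Handling renames and removals explicitly avoids leaving orphaned chunks under paths that no longer exist. Very large comparisons may be truncated or paginated by the API, so a production version would likely need a fallback to a full re-index.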

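Calling the Gemini embedding API one chunk at a time would be slow, while sending every request at once would quickly hit rate limits. Instead, we split the chunks into fixed-size batches: requests within a batch run concurrently, and batches run one after another, which caps the number of in-flight requests.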

const EMBEDDING_BATCH_SIZE = 15;
const batches = chunkArray(processedChunks, EMBEDDING_BATCH_SIZE);

// Batches run sequentially; the requests inside each batch run concurrently.
for (const batch of batches) {
  const chunksWithEmbeddings = await Promise.all(
    batch.map(async (chunk) => {
      const embedding = await generateEmbedding(chunk.content);
      return { ...chunk, embedding };
    })
  );
  // Persist each batch as soon as its embeddings are ready.
  await bulkInsertCodeChunks(chunksWithEmbeddings);
}
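The chunkArray helper used above is not shown. A minimal sketch of what such a helper could look like follows; the signature is an assumption inferred from the call site.

```typescript
// Split `items` into consecutive slices of at most `size` elements.
// Assumed signature; the project's own helper may differ.
function chunkArray<T>(items: T[], size: number): T[][] {
  if (size <= 0 || Math.floor(size) !== size) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}
```

With a batch size of 15, at most 15 embedding requests are in flight at any time, and the final batch simply holds whatever is left over.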

By using Tree-sitter for chunking, we ensure that the retrieval system returns logical units (like an entire function) rather than arbitrary slices of a file. This improves the quality of the AI review, because the model sees a function’s full implementation instead of a fragment cut off at a fixed line or character count.
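The chunking idea can be sketched as a walk over the syntax tree that emits one chunk per function-level node. To stay self-contained, the example runs against a minimal node interface that mirrors the type, startIndex, endIndex and children properties of Tree-sitter syntax nodes rather than importing a parser. The node type names come from the JavaScript/TypeScript grammars, and the choice of which types count as a chunk is an assumption. It also assumes that the offsets index directly into the source string.

```typescript
// Minimal stand-in for a Tree-sitter syntax node (subset of its properties).
interface SyntaxNodeLike {
  type: string;
  startIndex: number;
  endIndex: number;
  children: SyntaxNodeLike[];
}

interface Chunk {
  kind: string;
  content: string;
}

// Assumed set of node types treated as one logical unit.
const CHUNK_NODE_TYPES = [
  "function_declaration",
  "method_definition",
  "class_declaration",
];

function extractChunks(root: SyntaxNodeLike, source: string): Chunk[] {
  const chunks: Chunk[] = [];
  const visit = (node: SyntaxNodeLike): void => {
    if (CHUNK_NODE_TYPES.indexOf(node.type) !== -1) {
      // Emit the whole unit and stop descending so it is never split.
      chunks.push({
        kind: node.type,
        content: source.slice(node.startIndex, node.endIndex),
      });
      return;
    }
    for (const child of node.children) {
      visit(child);
    }
  };
  visit(root);
  return chunks;
}
```

Stopping at the first matching node keeps a class and its methods together. A real chunker would probably also split very large units and collect top-level code that sits outside any function.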

All external API calls (Gemini, GitHub, Inngest) are wrapped in retry logic with exponential backoff and jitter to handle transient failures and rate limiting.
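A minimal sketch of such a wrapper follows. The names (withRetry, backoffDelayMs), the default limits, and the choice of "full jitter" (a uniformly random delay between zero and the exponential ceiling) are assumptions for illustration, not the project's actual settings.

```typescript
// Delay before the next attempt: a random value in [0, min(maxMs, baseMs * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number,
  maxMs: number,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Hypothetical wrapper: retry `fn` until it succeeds or attempts run out.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 5,
  baseMs: number = 200,
  maxMs: number = 10000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts - 1) {
        break; // no point sleeping after the final attempt
      }
      const delay = backoffDelayMs(attempt, baseMs, maxMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A call site might look like withRetry(() => generateEmbedding(chunk.content)). This sketch retries on any error; a production wrapper would normally retry only transient failures such as timeouts, HTTP 429 and 5xx responses, and fail fast on everything else.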

By using Inngest, we gain “Durable Execution”. If a worker crashes or a server restarts mid-job, Inngest remembers which steps were completed, so those steps are not re-run and the job resumes from the first unfinished step.
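The mechanism can be illustrated with a toy model of step memoization. This is not Inngest's API or implementation (in Inngest, the comparable unit of work is a step defined with step.run). It is a deliberately synchronous sketch in which runStep and StepStore are hypothetical names, and the store stands in for state that survives a crash.

```typescript
// Stand-in for durable storage of completed step results.
type StepStore = Record<string, unknown>;

// Run `fn` once per step id; on later runs, return the recorded result instead.
function runStep<T>(store: StepStore, id: string, fn: () => T): T {
  if (Object.prototype.hasOwnProperty.call(store, id)) {
    return store[id] as T; // step already completed in an earlier run
  }
  const result = fn();
  store[id] = result; // record the result only after the step succeeds
  return result;
}

// Example job: if "embed" throws, a re-run with the same store skips "diff".
function indexJob(
  store: StepStore,
  diff: () => string[],
  embed: (files: string[]) => number
): number {
  const files = runStep(store, "diff", diff);
  return runStep(store, "embed", () => embed(files));
}
```

Because a step's result is recorded only after it succeeds, a failed step is retried on the next run while everything before it is replayed from the store. This is also why each step should be safe to repeat.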