Architecture Deep Dive
How Offworld works internally
Understand Offworld's internal architecture.
System Architecture
┌─────────────────────────────────────────────┐
│        Frontend (Cloudflare Workers)        │
│     TanStack Start + Router + shadcn/ui     │
└──────────────┬──────────────────────────────┘
               │ Convex WebSocket
┌──────────────▼──────────────────────────────┐
│           Backend (Convex Cloud)            │
│   Queries, Mutations, Actions, Workflows    │
├─────────────────────────────────────────────┤
│  ┌─────────────┐        ┌──────────────┐    │
│  │  Database   │        │  File Store  │    │
│  └─────────────┘        └──────────────┘    │
│  ┌─────────────┐        ┌──────────────┐    │
│  │ RAG/Vector  │        │  Workflows   │    │
│  └─────────────┘        └──────────────┘    │
└──────────────┬──────────────────────────────┘
               │
┌──────────────▼──────────────────────────────┐
│                External APIs                │
│     GitHub API • Gemini AI • Embeddings     │
└─────────────────────────────────────────────┘
Data Flow
Repository Analysis Flow
- User triggers analysis → repos.startAnalysis mutation
- Workflow launched → WorkflowManager.start(analyzeRepository)
- GitHub data fetched → Actions call GitHub API
- Files ingested → RAG embeds and stores top 500 files
- AI analysis → Gemini generates summary, architecture, diagrams
- Database updated → Progressive updates after each step
- Frontend re-renders → Convex subscriptions push updates
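The ingestion step above embeds only the top 500 files, which implies ranking the file tree first. The sketch below shows one way such a selection could work. The `FileEntry` shape, the extension allowlist, the size cap, and the "shallower paths first" ordering are all assumptions made for illustration and are not taken from the Offworld source.

```typescript
// Hypothetical shape of one entry in the fetched file tree.
interface FileEntry {
  path: string;
  size: number; // bytes
}

// Illustrative allowlist and limits; the real values are not documented here.
const SOURCE_EXTENSIONS = new Set([".ts", ".tsx", ".js", ".py", ".go", ".rs", ".md"]);
const MAX_FILE_BYTES = 100000;
const MAX_FILES = 500;

// Return the lowercase extension of the file name, or "" for dotfiles
// and names without an extension.
function extensionOf(path: string): string {
  const name = path.slice(path.lastIndexOf("/") + 1);
  const dot = name.lastIndexOf(".");
  return dot > 0 ? name.slice(dot).toLowerCase() : "";
}

// Keep source-like files under the size cap, preferring shallower paths
// (closer to the repo root), then smaller files, and cap the result.
function selectFilesForIngestion(files: FileEntry[], limit: number = MAX_FILES): FileEntry[] {
  const depth = (p: string): number => p.split("/").length;
  return files
    .filter((f) => SOURCE_EXTENSIONS.has(extensionOf(f.path)) && f.size <= MAX_FILE_BYTES)
    .sort((a, b) => depth(a.path) - depth(b.path) || a.size - b.size)
    .slice(0, limit);
}
```

Capping the list before any content is fetched keeps the number of GitHub API calls and embedding requests bounded regardless of repository size.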
Chat Flow
- User sends message → conversations.sendMessage mutation
- Agent invoked → actions.chat.processMessage with Gemini
- Tools called → Agent decides which tools to use
- Context retrieved → Tools query RAG, DB, or GitHub
- Response generated → Gemini synthesizes answer
- Message stored → Saved to messages table
- Frontend updates → Real-time via subscription
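The "tools called" and "context retrieved" steps can be pictured as a small dispatcher: the model emits tool calls, and each call is routed to a function that returns context. This is a minimal sketch under stated assumptions. The `Tool` contract, the `ToolCall` shape, and the `searchCode` and `getIssue` tool names are hypothetical, and the tools are stubs rather than real RAG, database, or GitHub queries.

```typescript
// Hypothetical tool contract: each tool takes a string input and returns context text.
type Tool = (input: string) => Promise<string>;

// A request the model might emit when it decides to call a tool.
interface ToolCall {
  tool: string;
  input: string;
}

// Illustrative registry with stubbed tools; real tools would query RAG, the DB, or GitHub.
const tools: Record<string, Tool> = {
  searchCode: async (query) => `rag results for "${query}"`,
  getIssue: async (num) => `issue #${num} from database`,
};

// Run every requested tool call and collect the context for the final answer.
// Unknown tool names produce an error note instead of throwing, so one bad
// call does not abort the whole turn.
async function gatherContext(
  calls: ToolCall[],
  registry: Record<string, Tool> = tools,
): Promise<string[]> {
  return Promise.all(
    calls.map(async ({ tool, input }) => {
      const run = registry[tool];
      return run ? run(input) : `unknown tool: ${tool}`;
    }),
  );
}
```

The collected strings would then be passed back to the model so it can synthesize the final response.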
Backend Architecture
File Organization
packages/backend/convex/
├── schema.ts                   # Database schema
├── auth.ts, auth.config.ts     # Better Auth setup
├── repos.ts                    # Repository queries/mutations
├── architectureEntities.ts     # Entity queries
├── issues.ts                   # Issue analysis
├── pullRequests.ts             # PR analysis
├── chat.ts                     # Chat queries/mutations
├── gemini.ts                   # All AI generation
├── github.ts                   # GitHub API actions
├── prompts.ts                  # AI prompts
├── aiValidation.ts             # Zod schemas
├── workflows/
│   └── analyzeRepository.ts    # 11-step workflow
└── agent/
    ├── codebaseAgent.ts        # Agent setup
    └── tools.ts                # 9 agent tools
Database Schema
5 main tables:
- repositories - Repo metadata + analysis results
  - fullName, stars, language
  - summary, architecture, mermaidDiagram
  - status (queued, processing, completed, failed)
- architectureEntities - Discovered components
  - name, description, importance, layer
  - path, githubUrl
- issues - GitHub issues + AI analysis
  - number, title, difficulty, skills
  - filesTouch, difficultyRationale
- pullRequests - PRs + AI analysis
  - number, title, impact, summary
  - filesChanged
- conversations - Chat threads
  - title, repoId, userId
  - Related: messages table
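The field lists above can be read as record shapes. The sketch below expresses two of them as plain TypeScript types. It is not the Convex schema from schema.ts: the field types and which fields are optional are assumptions, and `isFinished` is a hypothetical helper added for illustration.

```typescript
// Status values listed for the repositories table.
type RepoStatus = "queued" | "processing" | "completed" | "failed";

// Assumed shape of a repositories row. Analysis fields are optional here
// because they would be filled in progressively as the workflow runs.
interface RepositoryRecord {
  fullName: string;
  stars: number;
  language?: string;
  summary?: string;
  architecture?: string;
  mermaidDiagram?: string;
  status: RepoStatus;
}

// Assumed shape of an architectureEntities row.
interface ArchitectureEntityRecord {
  name: string;
  description: string;
  importance: number; // assumed numeric score
  layer: string;
  path: string;
  githubUrl: string;
}

// True once the analysis has reached a terminal state.
function isFinished(repo: RepositoryRecord): boolean {
  return repo.status === "completed" || repo.status === "failed";
}
```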
Workflow: analyzeRepository
11 steps (2-5 minutes):
1. validateRepository() // Check GitHub API
2. handleReIndex() // Clear old data if re-index
3. fetchFileTree() // Get all file paths
4. calculateIterations() // Based on repo size
5. ingestToRAG() // Embed top 500 files
6. generateSummary() // 300-word overview
7. discoverArchitecture() // 2-5 iterations
8. consolidateEntities() // Filter to top 5-15
9. generateDiagrams() // Mermaid + narrative
10. analyzeIssues() // Fetch & analyze issues
11. analyzePRs() // Fetch & analyze PRs
Progressive updates: Database updated after each step.
Error handling: Failed steps retry automatically.
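Automatic retries are normally handled by the workflow layer rather than by each step. To make the behavior concrete, here is a generic retry-with-backoff sketch. `withRetry`, `RetryOptions`, and the delay schedule are hypothetical and do not describe how Offworld or Convex configures retries.

```typescript
// Hypothetical retry settings for a single workflow step.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `step`, retrying on failure with exponential backoff
// (baseDelayMs, then 2x, 4x, ...). Rethrows the last error once
// all attempts are used up.
async function withRetry<T>(step: () => Promise<T>, opts: RetryOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return await step();
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxAttempts) {
        await sleep(opts.baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}
```

Retrying is only safe when a step is idempotent, for example when it overwrites the same database fields rather than appending new rows on each attempt.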
Progressive Architecture Discovery
Iteration loop (2-5 times):
for (let i = 0; i < iterationCount; i++) {
  // Get context from previous iterations
  const previousContext = await getPreviousEntities(ctx, repoId);
  // Ask AI to discover more specific components
  const newEntities = await discoverIteration({
    fileTree,
    previousContext,
    iterationNumber: i + 1, // 1-based iteration number
  });
  // Save to database
  await saveEntities(ctx, repoId, newEntities);
}
Result: Hierarchical discovery (packages → modules → components).
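After the discovery loop, the workflow's consolidation step narrows the accumulated entities to the top 5-15. A minimal sketch of that idea follows, assuming entities carry a numeric `importance` score. Deduplicating by `path` and the default cap of 15 are assumptions for illustration, not the documented behavior of consolidateEntities().

```typescript
// Simplified entity shape for this sketch.
interface Entity {
  name: string;
  path: string;
  importance: number; // higher is more important
}

// Merge entities discovered across iterations: keep the highest-importance
// entry per path, then return the top `maxEntities` by importance.
function consolidate(entities: Entity[], maxEntities: number = 15): Entity[] {
  const byPath = new Map<string, Entity>();
  for (const e of entities) {
    const existing = byPath.get(e.path);
    if (!existing || e.importance > existing.importance) {
      byPath.set(e.path, e);
    }
  }
  return Array.from(byPath.values())
    .sort((a, b) => b.importance - a.importance)
    .slice(0, maxEntities);
}
```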
RAG System
Ingestion:
// packages/backend/convex/github.ts
export const ingestRepository = action({
  handler: async (ctx, { fullName }) => {
    // 1. Fetch file tree from GitHub
    const files = await fetchFileTree(fullName);
    // 2. Filter to top 500 files (by size, extension)
    const topFiles = filterRelevantFiles(files);
    // 3. Chunk and embed
    for (const file of topFiles) {
      const content = await fetchFileContent(file.path);
      await ctx.vectorSearch("rag", `repo:${fullName}`).add({
        text: content,
        metadata: { path: file.path }
      });
    }
  }
});
Search:
// agent/tools.ts
const results = await ctx.vectorSearch("rag", `repo:${fullName}`)
  .search(query, { limit: 5 });
AI Integration
All AI calls in gemini.ts:
export const generateSummary = action({
  handler: async (ctx, { fileTree, metadata }) => {
    const response = await generateObject({
      model: gemini("gemini-2.5-flash-exp"),
      schema: SummarySchema,
      prompt: SUMMARY_PROMPT.replace("{FILE_TREE}", fileTree)
    });
    return response.object.summary;
  }
});
Validation with Zod (aiValidation.ts):
import { z } from "zod";
const ArchitectureEntitySchema = z.object({
  name: z.string().min(1),
  description: z.string().min(50),
  importance: z.number().min(0).max(1),
  layer: z.enum(["entry-point", "core", "feature", "utility", "integration"]),
  path: z.string()
});
Frontend Architecture
Route Structure
apps/web/src/routes/
├── __root.tsx                  # Root layout (header, auth)
├── index.tsx                   # Home page
├── _github/                    # GitHub layout
│   └── $owner_.$repo/          # Repo layout
│       ├── index.tsx           # Summary tab
│       ├── arch/               # Architecture
│       │   ├── index.tsx       # Entity list
│       │   └── $slug.tsx       # Entity detail
│       ├── issues/
│       │   ├── index.tsx       # Issue list
│       │   └── $number.tsx     # Issue detail
│       ├── pr/
│       │   ├── index.tsx       # PR list
│       │   └── $number.tsx     # PR detail
│       └── chat/
│           ├── index.tsx       # New chat
│           └── $chatId.tsx     # Chat thread
Convex Integration
Setup (router.tsx):
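import { ConvexReactClient } from "convex/react";
import { ConvexQueryClient } from "@convex-dev/react-query";
import { QueryClient } from "@tanstack/react-query";
import { createRouter } from "@tanstack/react-router";
import { routeTree } from "./routeTree.gen";
// ConvexQueryClient wraps a ConvexReactClient, so create that client type here
const convex = new ConvexReactClient(import.meta.env.VITE_CONVEX_URL);
const convexQueryClient = new ConvexQueryClient(convex);
export const router = createRouter({
  routeTree,
  context: {
    queryClient: new QueryClient(),
    convexReactClient: convex,
    convexQueryClient
  }
});
Usage (components):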
// Reactive query (auto-subscribes)
const repo = useQuery(api.repos.getByFullName, { fullName });
// Mutation (one-off)
const startAnalysis = useMutation(api.repos.startAnalysis);
// Action (async)
const getOwnerInfo = useAction(api.github.getOwnerInfo);
Component Patterns
Loading states:
if (!repo) return <Skeleton />;
if (repo.status === "processing") return <ProgressIndicator />;
return <RepoContent repo={repo} />;
Progressive rendering:
{repo.summary && <SummaryCard summary={repo.summary} />}
{repo.entities && <ArchitectureList entities={repo.entities} />}
{repo.diagram && <MermaidDiagram code={repo.diagram} />}
Key Patterns
Progressive Updates
Instead of "loading for 5 minutes", update UI after each workflow step:
// Backend
await ctx.runMutation(internal.repos.updateSummary, { summary });
// Frontend sees update immediately via Convex subscription
await ctx.runMutation(internal.repos.updateArchitecture, { entities });
// Frontend sees architecture!
Case-Insensitive Lookups
All repo queries handle case variations:
const repo = await ctx.db
  .query("repositories")
  .withIndex("by_fullName_lower", (q) =>
    q.eq("fullNameLower", fullName.toLowerCase())
  )
  .first();
Path Validation
LLM-generated paths checked against GitHub file tree:
const validPath = fileTree.some(f => f.path === llmPath);
if (!validPath) {
  console.warn("Invalid path from LLM:", llmPath);
  return null; // Skip invalid entity
}
This prevents hallucinated file paths from breaking GitHub links.
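The check above scans the file tree once per path. When a whole batch of entities is validated, building a Set of known paths first makes each lookup constant time. The following is a sketch of that variant; `dropInvalidPaths` and the `TreeFile` shape are hypothetical names used for illustration.

```typescript
// Minimal shape of a file tree entry for this sketch.
interface TreeFile {
  path: string;
}

// Build the path lookup once, then drop any entity whose path
// does not exist in the GitHub file tree.
function dropInvalidPaths<T extends { path: string }>(entities: T[], fileTree: TreeFile[]): T[] {
  const known = new Set(fileTree.map((f) => f.path));
  return entities.filter((e) => known.has(e.path));
}
```

Note that the comparison is exact, so a path that differs only in case or has a leading slash is treated as invalid.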
Next Steps
- Learn about Deployment
- Read Contributing guidelines
- Check Technical Reference for API details