Database ORM Usage¶
BidScript utilises a multi-tier data architecture that combines Redis caching, MongoDB persistence (transitioning to PostgreSQL), and Drizzle ORM for type-safe database access. This document explores how the frontend and backend services interact with this data layer, with particular focus on document management workflows.
Current Architecture Overview¶
The BidScript data architecture implements a cache-first, persistence-second strategy that provides exceptional performance whilst maintaining data durability. The system currently operates in a transitional state, with MongoDB serving as the persistence layer and PostgreSQL being introduced as the unified replacement.
Three-Tier Data Flow¶
- Cache Layer (Redis): Primary data access point with 15-minute TTL
- Persistence Layer (MongoDB → PostgreSQL): Durable storage with background synchronisation
- ORM Layer (Drizzle): Type-safe database operations across both services
Service Architecture¶
Both the Next.js frontend and NestJS backend utilise Drizzle ORM, but serve different architectural roles:
- Frontend: Direct database access for user sessions, authentication, and real-time features
- Backend: Orchestrates caching strategy, handles document processing, and manages persistence
Database Architecture Evolution¶
The database architecture has evolved through several phases to reach the current sophisticated multi-tier approach:
Phase 1: MSSQL with Prisma (Historical)¶
Initial Setup: The application originally used MSSQL with Prisma ORM for user authentication, integrated seamlessly with NextAuth.js.
// Historical NextAuth configuration with Prisma (no longer used)
import NextAuth from "next-auth";
import { PrismaAdapter } from "@next-auth/prisma-adapter";
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

export default NextAuth({
  adapter: PrismaAdapter(prisma),
  // ... other NextAuth configuration
});
Phase 2: Multi-Database Architecture (Historical)¶
Dual-Database Phase: Added MongoDB VCore for vector indexing capabilities alongside MSSQL, creating a complex multi-database architecture.
Challenges:
- Complex cross-database queries
- Data synchronisation issues
- Multiple connection strings and environments
- Inconsistent transaction boundaries
Phase 3: Current Unified Solution¶
Modern Architecture: PostgreSQL with pgvector extension, replacing both previous databases whilst maintaining Redis caching for optimal performance.
Key Improvements:
- Single database for all data types
- Native vector search capabilities
- ACID compliance across all operations
- Simplified deployment and monitoring
Migration Pain Points Resolved¶
The transition from Prisma to Drizzle specifically addressed several production issues:
- Shadow Database Complexity: Prisma's shadow database requirement during migrate dev caused problems in cloud environments
- Environment Management: Better support for multiple environments without complex shadow database setups
- Migration Debugging: Simplified debugging process compared to Prisma's more complex migration system
- Performance in Serverless: Improved query latency in serverless database environments
See the Database Overview for architectural diagrams and the complete migration timeline.
Current Implementation Benefits¶
Performance Improvements¶
- Sub-100ms Response Times: Redis-first caching strategy
- Background Persistence: Non-blocking MongoDB synchronisation
- Query Optimisation: Single database eliminates cross-database joins
- Vector Search: Native pgvector performance vs. external vector databases
Developer Experience¶
- Type Safety: Full TypeScript integration across both services
- Simplified Migrations: Drizzle's programmatic migration management (a minimal sketch follows this list)
- Unified Schema: Shared schema definitions between frontend and backend
- Better Debugging: Transparent query generation and error reporting
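Migrations can be applied programmatically at startup or in a deploy step. A minimal sketch using Drizzle's node-postgres migrator, assuming the conventional ./drizzle output folder (the paths are assumptions, not the project's actual layout):

// Minimal programmatic migration sketch; applies any pending SQL
// migrations generated by drizzle-kit before the app starts.
import { drizzle } from "drizzle-orm/node-postgres";
import { migrate } from "drizzle-orm/node-postgres/migrator";
import { Client } from "pg";

async function runMigrations(): Promise<void> {
  const client = new Client({ connectionString: process.env.POSTGRES_URL });
  await client.connect();
  try {
    // "./drizzle" is the drizzle-kit default output folder; adjust if needed
    await migrate(drizzle(client), { migrationsFolder: "./drizzle" });
  } finally {
    await client.end();
  }
}

runMigrations().catch((error) => {
  console.error("Migration failed:", error);
  process.exit(1);
});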
Operational Excellence¶
- Single Point of Monitoring: One database instead of multiple systems
- Simplified Backups: Unified backup and disaster recovery
- Consistent Security: Single authentication and encryption model
- Reduced Complexity: Fewer moving parts and connection strings
For implementation details, see:
- Backend Drizzle Usage
- Frontend Drizzle Usage
- API Integration Patterns
- Technical Architecture Details
Unified PostgreSQL Benefits¶
- Single Database: Eliminates complexity of managing multiple database systems
- Native Vector Operations: pgvector provides high-performance vector similarity search
- ACID Compliance: Full transactional consistency across all data types
- Simplified Access Control: User permissions and document vectors in the same database
- Reduced Latency: No cross-database queries or data synchronisation delays
- Cost Efficiency: Single database instance instead of multiple database services
Authentication Migration¶
While Prisma provided excellent NextAuth.js integration, Drizzle now handles authentication tables with equal effectiveness:
// Current Drizzle schema for NextAuth.js tables
import { pgTable, serial, integer, varchar, timestamp } from "drizzle-orm/pg-core";
import { users } from "./users";

export const accounts = pgTable("Account", {
  id: serial("id").primaryKey(),
  userId: integer("userId")
    .notNull()
    .references(() => users.id),
  type: varchar("type", { length: 255 }).notNull(),
  provider: varchar("provider", { length: 255 }).notNull(),
  // ... other NextAuth fields
});

export const sessions = pgTable("Session", {
  id: serial("id").primaryKey(),
  sessionToken: varchar("sessionToken", { length: 255 }).notNull().unique(),
  userId: integer("userId")
    .notNull()
    .references(() => users.id),
  expires: timestamp("expires").notNull(),
});
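The Account and Session tables above reference a users table defined alongside them in the shared schema. A minimal sketch of that definition, assuming standard NextAuth.js fields (only the id column is actually implied by the references calls above):

// Hypothetical users table sketch -- the real definition lives in
// drizzle/schema/users.ts; field names beyond id are assumptions
import { pgTable, serial, varchar, timestamp } from "drizzle-orm/pg-core";

export const users = pgTable("User", {
  id: serial("id").primaryKey(),
  name: varchar("name", { length: 255 }),
  email: varchar("email", { length: 255 }).notNull().unique(),
  emailVerified: timestamp("emailVerified"),
});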
Current Database Architecture¶
Unified PostgreSQL Solution¶
As of June 2025, all of BidScript's structured data lives in a single PostgreSQL database, with document data following as the MongoDB migration completes. This unified approach provides:
- Consistency: Single source of truth for all application data
- Performance: Optimised queries without cross-database complexity
- Maintenance: Simplified backup, monitoring, and administration
- Scalability: Better horizontal and vertical scaling options
Migration Progress¶
The migration away from MSSQL has been completed successfully: all historical relational data now lives in PostgreSQL, accessed via Drizzle. Document data is still being migrated from MongoDB VCore, as described in the PostgreSQL Migration Strategy section below.
Redis-First Caching Strategy¶
Cache Architecture¶
BidScript implements a Redis-first caching strategy for document management that prioritises performance whilst ensuring data durability. The system uses a sophisticated caching pattern that handles both immediate response requirements and background persistence.
Cache Key Structure¶
Document data is cached using a hierarchical key structure:
const getDocKey = (projectId: string, documentId: string) => {
  return `tenderdoc:${projectId}:${documentId}:data`;
};
Each cached document includes:
- Data: The complete document object
- Meta: Field-level timestamps for selective updates
- TTL: 15-minute expiration (900 seconds)
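Putting these together, a cache entry stored under a key such as tenderdoc:p123:d456:data has roughly this shape (inferred from the caching code below; not a published contract):

// Shape of a cached Redis value, inferred from cacheDocumentToRedis
interface CachedDocument {
  data: TenderDocumentType;     // the complete document object
  meta: Record<string, number>; // per-field unix timestamps for selective updates
}

// Stored with a 900-second TTL, e.g.:
// SET tenderdoc:p123:d456:data '{"data":{...},"meta":{"name":1718000000}}' EX 900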
Backend Caching Implementation¶
The backend service (TenderPackDBService) implements a comprehensive caching strategy:
// Cache documents to Redis with immediate MongoDB persistence
private async cacheDocumentToRedis(
  doc: TenderDocumentType,
  fieldMeta?: Partial<Record<keyof TenderDocumentType, number>>
): Promise<void> {
  const key = this.getDocKey(doc.projectId, doc.documentId);
  const now = Math.floor(Date.now() / 1000);

  const redisValue = {
    data: doc,
    meta: {} as Record<string, number>,
  };

  // Set field-level timestamps for selective updates
  for (const field in doc) {
    const typedField = field as keyof TenderDocumentType;
    redisValue.meta[typedField] = fieldMeta?.[typedField] ?? now;
  }

  // Cache to Redis with TTL
  await this.redis.set(
    key,
    JSON.stringify(redisValue),
    'EX',
    this.TTL_SECONDS,
  );

  // Immediately persist to MongoDB to ensure durability
  try {
    await this.persistRedisDocsToMongoDB([doc]);
    this.logger.log(`Cached doc ${doc.documentId} to Redis and immediately persisted to MongoDB`);
  } catch (error) {
    this.logger.error(`Failed to immediately persist doc ${doc.documentId} to MongoDB: ${error.message}`);
  }
}
The getDocuments Endpoint: A Complete Example¶
The getDocuments endpoint exemplifies the cache-first, persistence-second strategy and demonstrates the complete data flow between frontend and backend services.
Backend Implementation (tender-parsing.controller.ts)¶
@Public()
@Post('getDocuments')
async getDocuments(
  @Body() body: GetTenderDocumentation,
): Promise<Array<TenderDocumentType>> {
  const projectId = body.projectId;

  // Delegate to database service for cache-first retrieval
  const result: TenderDocumentType[] =
    await this.tenderPackDBService.getProjectDocumentation(projectId);

  return result;
}
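The GetTenderDocumentation DTO itself is not shown in this excerpt; given how the controller uses it, a minimal sketch would be (the validation decorators are an assumption):

// Hypothetical request DTO -- only projectId is implied by the controller above
import { IsString, IsNotEmpty } from 'class-validator';

export class GetTenderDocumentation {
  @IsString()
  @IsNotEmpty()
  projectId: string;
}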
Backend Service Layer (tender-pack-db.service.ts)¶
The service implements a three-stage fallback strategy:
- Primary: Redis Cache Lookup
- Secondary: MongoDB Fallback
- Tertiary: Background Cache Population
public async getProjectDocumentation(
  projectId: string,
): Promise<Array<TenderDocumentType>> {
  const redisKeysPattern = `tenderdoc:${projectId}:*:data`;

  try {
    // Stage 1: Check Redis cache first
    const stream = this.redis.scanStream({ match: redisKeysPattern });
    const redisDocs: TenderDocumentType[] = [];

    for await (const keys of stream) {
      if (keys.length) {
        const results = await this.redis.mget(...keys);
        for (const docStr of results) {
          if (docStr) {
            const parsed = JSON.parse(docStr);
            redisDocs.push(parsed.data || parsed);
          }
        }
      }
    }

    if (redisDocs.length > 0) {
      this.logger.log(`Found ${redisDocs.length} documents in Redis - returned`);

      // Background persistence to MongoDB without blocking response
      setImmediate(() => {
        this.persistRedisDocsToMongoDB(redisDocs).catch((error) => {
          this.logger.error(`Background MongoDB persistence failed: ${error.message}`);
        });
      });

      return redisDocs; // Return immediately from cache
    }

    // Stage 2: MongoDB fallback if cache miss
    const db = this.mongoClient.db('tender-docs');
    const collection = db.collection('documents');
    const mongoDocs = (await collection
      .find({ projectId })
      .project({ _id: 0 })
      .toArray()) as TenderDocumentType[];

    if (mongoDocs.length > 0) {
      this.logger.log(`Found ${mongoDocs.length} documents in MongoDB; caching to Redis`);

      // Stage 3: Populate Redis cache from MongoDB
      const pipeline = this.redis.pipeline();
      const now = Math.floor(Date.now() / 1000);

      for (const doc of mongoDocs) {
        const key = this.getDocKey(doc.projectId, doc.documentId);
        const value = {
          data: doc,
          meta: {} as Record<string, number>,
        };
        for (const field in doc) {
          value.meta[field] = now;
        }
        pipeline.set(key, JSON.stringify(value), 'EX', this.TTL_SECONDS);
      }

      await pipeline.exec();
    }

    return mongoDocs;
  } catch (error) {
    this.logger.error(`Error fetching project documents: ${error.message}`);
    throw error;
  }
}
Frontend Implementation (ResourceHubContext.tsx)¶
The frontend React context provides a clean interface for document loading:
// Load documents for a specific project
const loadProjectDocuments = useCallback(
  async (projectId: string) => {
    if (!projectId) {
      console.warn("Cannot load documents: No projectId provided");
      return;
    }

    setIsLoadingDocuments(true);
    setIsLoading(true);
    setDocumentsError(null);

    try {
      console.log("Loading documents for project:", projectId);

      // API call using the configured apiClient
      const response = await apiClient.post(
        "/api/tender-parsing/getDocuments",
        {
          projectId: projectId,
        }
      );

      if (response.status === 200 || response.status === 201) {
        const documents: DocumentType[] = response.data || [];
        console.log("Loaded documents from backend:", documents);
        setProjectDocuments(documents);
      } else {
        throw new Error(`Failed to load documents. Status: ${response.status}`);
      }
    } catch (error) {
      console.error("Error loading project documents:", error);
      setDocumentsError(
        error instanceof Error ? error.message : "Failed to load documents"
      );
      setProjectDocuments([]);
    } finally {
      setIsLoadingDocuments(false);
      setIsLoading(false);
    }
  },
  [setIsLoading]
);
Frontend Usage in Components¶
The document upload and labelling steps demonstrate practical usage:
Document Upload Step (components/steps/document-upload-step.tsx):
const { selectedProject, projectDocuments } = useResourceHub();
// Check if we have pre-existing documents from backend cache/persistence
const hasPreExistingDocuments = projectDocuments && projectDocuments.length > 0;
const preExistingCount = projectDocuments?.length || 0;
Document Labelling Step (components/steps/document-labeling-step.tsx):
const { selectedProject, loadProjectDocuments } = useResourceHub();
// Documents are automatically loaded when project is selected
// and accessible via the context for immediate use
Performance Benefits¶
This architecture provides several key performance advantages:
- Sub-100ms Response Times: Redis cache hits return documents almost instantly
- Background Persistence: MongoDB writes don't block user interactions
- Automatic Cache Population: MongoDB data automatically populates Redis on cache misses
- Graceful Degradation: System continues functioning even if Redis is unavailable
Frontend-Backend Communication Flow¶
The complete data flow for document retrieval:
sequenceDiagram
    participant F as Frontend Component
    participant RC as ResourceHub Context
    participant API as API Client
    participant NC as NestJS Controller
    participant DS as Database Service
    participant R as Redis
    participant M as MongoDB

    F->>RC: loadProjectDocuments(projectId)
    RC->>API: POST /api/tender-parsing/getDocuments
    API->>NC: HTTP Request
    NC->>DS: getProjectDocumentation(projectId)
    DS->>R: SCAN tenderdoc:projectId:*:data

    alt Cache Hit
        R-->>DS: Return cached documents
        DS-->>NC: Return documents immediately
        DS->>M: Background persistence (async)
    else Cache Miss
        R-->>DS: No cached documents
        DS->>M: Query documents collection
        M-->>DS: Return documents
        DS->>R: Populate cache with documents
        DS-->>NC: Return documents
    end

    NC-->>API: Return documents array
    API-->>RC: HTTP Response
    RC->>RC: setProjectDocuments(documents)
    RC-->>F: Documents available in context
Error Handling and Resilience¶
The system implements comprehensive error handling:
- Redis Failures: Automatically fall back to MongoDB (a minimal sketch of the pattern follows this list)
- MongoDB Failures: Return cached data if available
- Network Issues: Frontend displays appropriate error messages
- Timeout Handling: Custom timeouts prevent hanging requests
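The first two fallbacks share a common shape. A minimal sketch of the pattern (illustrative only, not the actual implementation):

// Illustrative cache-with-fallback wrapper
async function getWithFallback<T>(
  readCache: () => Promise<T | null>,
  readStore: () => Promise<T>,
): Promise<T> {
  try {
    const cached = await readCache();    // Redis lookup
    if (cached !== null) return cached;  // cache hit: return immediately
  } catch (error) {
    console.error("Redis unavailable, falling back to MongoDB:", error);
  }
  return readStore();                    // cache miss or Redis failure
}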
See the API Client documentation for details on request timeout configurations and Technical Details for error boundary implementations.
Drizzle ORM Integration Across Services¶
Frontend Drizzle Usage¶
The frontend Next.js application uses Drizzle primarily for user authentication, session management, and real-time features that require direct database access.
Configuration (drizzle.config.ts)¶
import type { Config } from "drizzle-kit";

export default {
  schema: "./drizzle/schema/*",
  out: "./drizzle",
  driver: "pg",
  dbCredentials: {
    connectionString: process.env.POSTGRES_URL!,
  },
  verbose: true,
  strict: true,
} satisfies Config;
Database Connection (lib/database.ts)¶
import "dotenv/config";
import { drizzle } from "drizzle-orm/neon-serverless";
import { neon } from "@neondatabase/serverless";
if (!process.env.POSTGRES_URL) {
throw new Error("POSTGRES_URL environment variable is not defined");
}
const sql = neon(process.env.POSTGRES_URL);
export const db = drizzle(sql);
Schema Organisation¶
The frontend shares schema definitions with the backend, organised by domain:
drizzle/schema/
├── users.ts # User authentication
├── company.ts # Company/organisation data
├── session.ts # NextAuth session management
├── account.ts # OAuth account linking
├── projects.ts # Project management
├── themes.ts # Answer themes
├── knowledgeBase.ts # Knowledge base articles
└── collaboration.ts # Real-time collaboration
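If the domain files are re-exported through a single entry point, both services can import from one path. A hypothetical barrel file (an assumption; the examples below import from @/drizzle/schema):

// drizzle/schema/index.ts -- hypothetical barrel file
export * from "./users";
export * from "./company";
export * from "./session";
export * from "./account";
export * from "./projects";
export * from "./themes";
export * from "./knowledgeBase";
export * from "./collaboration";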
Frontend Usage Examples¶
User Authentication (integrated with NextAuth.js):
import type { NextAuthOptions } from "next-auth";
import { DrizzleAdapter } from "@auth/drizzle-adapter";
import { db } from "@/lib/database";
import { users, accounts, sessions } from "@/drizzle/schema";

// NextAuth.js adapter integration
export const authOptions: NextAuthOptions = {
  adapter: DrizzleAdapter(db, {
    usersTable: users,
    accountsTable: accounts,
    sessionsTable: sessions,
    // ... other table mappings
  }),
  // ... other NextAuth configuration
};
Project Data Access:
import { db } from "@/lib/database";
import { projects } from "@/drizzle/schema";
import { eq, desc } from "drizzle-orm";

export async function getUserProjects(userId: string) {
  return await db
    .select()
    .from(projects)
    .where(eq(projects.userId, userId))
    .orderBy(desc(projects.createdAt));
}
Backend Drizzle Usage¶
The NestJS backend uses Drizzle for comprehensive data operations, including the sophisticated caching and persistence strategy described above.
Configuration (src/db/database.ts)¶
import "dotenv/config";
import { drizzle } from "drizzle-orm/node-postgres";
import { Client } from "pg";
if (!process.env.POSTGRES_URL) {
throw new Error("POSTGRES_URL environment variable is not defined");
}
const client = new Client({
connectionString: process.env.POSTGRES_URL,
ssl:
process.env.NODE_ENV === "production"
? { rejectUnauthorized: false }
: false,
});
client.connect();
export const db = drizzle(client);
Service Integration Pattern¶
Backend services inject the database connection and combine it with Redis caching:
import { Injectable, Logger } from "@nestjs/common";
import { db } from "../db/database";
import { projects, users } from "../db/schema";
import { eq, and } from "drizzle-orm";

@Injectable()
export class ProjectService {
  private readonly logger = new Logger(ProjectService.name);

  async createProject(userId: string, projectData: CreateProjectDto) {
    try {
      const result = await db
        .insert(projects)
        .values({
          userId,
          name: projectData.name,
          description: projectData.description,
          createdAt: new Date(),
        })
        .returning();

      return result[0];
    } catch (error) {
      this.logger.error(`Error creating project: ${error.message}`);
      throw error;
    }
  }
}
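The example above only touches PostgreSQL. Combining Drizzle with the Redis layer follows the same cache-first shape as the document service; a hedged sketch (the method name, key format, and TTL reuse are illustrative, not the actual ProjectService):

// Illustrative cache-first read -- a sketch, not the actual implementation
async getProject(projectId: string) {
  const key = `project:${projectId}`;

  // 1. Try Redis first
  const cached = await this.redis.get(key);
  if (cached) return JSON.parse(cached);

  // 2. Fall back to PostgreSQL via Drizzle
  const rows = await db.select().from(projects).where(eq(projects.id, projectId));
  const project = rows[0] ?? null;

  // 3. Populate the cache on a hit (15-minute TTL, matching the document cache)
  if (project) {
    await this.redis.set(key, JSON.stringify(project), 'EX', 900);
  }
  return project;
}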
PostgreSQL Migration Strategy¶
The transition from MongoDB to PostgreSQL represents a significant architectural improvement, unifying both relational and vector data in a single database.
Current State: Hybrid Architecture¶
During the transition period, the system operates with:
- MongoDB: Document storage for tender documents, questions, and metadata
- PostgreSQL: User data, projects, themes, sessions (via Drizzle)
- Redis: Caching layer for both data sources
Target State: Unified PostgreSQL¶
The target architecture consolidates all data into PostgreSQL:
-- Document storage with JSON columns for flexibility
CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id),
    document_id VARCHAR(255) NOT NULL,
    name VARCHAR(255) NOT NULL,
    content_type VARCHAR(100),
    tags TEXT[],
    metadata JSONB,
    respondent_questions JSONB,
    text_content TEXT,
    embedding VECTOR(1536), -- pgvector for semantic search
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    UNIQUE(project_id, document_id)
);

-- Vector similarity search index
CREATE INDEX ON documents
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
Migration Benefits¶
- Simplified Operations: Single database to manage and monitor
- ACID Compliance: Full transactional guarantees across all data
- Unified Queries: Join document data with user/project data efficiently (see the sketch after this list)
- Vector Search: Native pgvector support eliminates external vector databases
- Better Performance: Reduced latency from eliminating cross-database queries
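Once documents live in PostgreSQL, unified queries and vector search can surface in a single Drizzle call. A hedged sketch, assuming a documents schema object mirroring the DDL above (the schema imports and embedding value are illustrative):

// Semantic search joined with project ownership in one round trip --
// a sketch; assumes documents/projects schema objects exist in ../db/schema
import { sql, eq } from "drizzle-orm";
import { db } from "../db/database";
import { documents, projects } from "../db/schema";

export async function searchUserDocuments(userId: string, embedding: number[]) {
  const vector = JSON.stringify(embedding); // pgvector accepts '[0.1,0.2,...]' literals
  return await db
    .select({
      documentId: documents.documentId,
      name: documents.name,
      // <=> is pgvector's cosine-distance operator
      distance: sql<number>`${documents.embedding} <=> ${vector}::vector`,
    })
    .from(documents)
    .innerJoin(projects, eq(documents.projectId, projects.id))
    .where(eq(projects.userId, userId))
    .orderBy(sql`${documents.embedding} <=> ${vector}::vector`)
    .limit(10);
}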
Schema Synchronisation¶
Both frontend and backend share identical schema definitions to ensure type safety:
// Shared schema definition (projects.ts)
import { pgTable, uuid, varchar, text, timestamp } from "drizzle-orm/pg-core";

export const projects = pgTable("projects", {
  id: uuid("id").primaryKey().defaultRandom(),
  userId: varchar("user_id", { length: 255 }).notNull(),
  name: varchar("name", { length: 255 }).notNull(),
  description: text("description"),
  status: varchar("status", { length: 50 }).default("active"),
  createdAt: timestamp("created_at").defaultNow(),
  updatedAt: timestamp("updated_at").defaultNow(),
});

export type Project = typeof projects.$inferSelect;
export type InsertProject = typeof projects.$inferInsert;
This ensures that both services maintain type safety when working with the same data structures.
Document Management Workflow Integration¶
Complete Document Lifecycle¶
The document management system demonstrates the integration of all architectural components:
- Upload (document-upload-step.tsx): Frontend uploads documents; backend processes and caches them
- Processing: Backend AI services tag and analyse documents, updating the cache
- Labelling (document-labeling-step.tsx): Frontend displays cached documents for user interaction
- Persistence: Background synchronisation ensures MongoDB durability
- Future Migration: Gradual transition to PostgreSQL unified storage
Cross-Component Communication¶
The system utilises React contexts and NestJS services to maintain data consistency:
Frontend Context (ResourceHubContext.tsx):
- Manages document state across components
- Handles optimistic updates for better UX
- Provides error handling and loading states
- Automatically refreshes data when needed
Backend Services (tender-pack-db.service.ts):
- Orchestrates cache-first data access
- Manages background persistence
- Handles failover scenarios
- Provides consistent API interfaces
See the Workspace Page and Document Table Component documentation for detailed component-level integration examples.
For API communication patterns, refer to the API Integration documentation, and for frontend-specific Drizzle usage, see the Frontend Drizzle documentation.
Summary and Next Steps¶
Architecture Overview¶
BidScript's database architecture represents a sophisticated evolution from a complex multi-database setup to a streamlined, performance-optimised solution:
Current State: Redis-first caching with MongoDB persistence and PostgreSQL for structured data
Target State: Unified PostgreSQL with pgvector, maintaining Redis caching for optimal performance
Key Performance Metrics¶
The current architecture delivers:
- Sub-100ms response times for cached document requests
- Zero blocking operations for background persistence
- Automatic failover between cache and persistence layers
- Type-safe operations across both frontend and backend services
Development Benefits¶
For developers working with BidScript:
- Consistent APIs: Both services use identical Drizzle schemas and TypeScript types
- Predictable Caching: Redis-first strategy with transparent fallback behaviour
- Error Resilience: Multiple fallback layers ensure system reliability
- Future-Proof: PostgreSQL migration path provides clear architectural direction
Implementation References¶
- Document Upload Flow: See Workspace Page for frontend implementation
- Document Labelling: See Workspace Page for user interaction patterns
- Backend Processing: Reference the ResourceHub Tender Parsing for service orchestration
- Caching Strategy: Study the getDocuments endpoint as the canonical example of the cache-first approach
Migration Timeline¶
The transition to unified PostgreSQL storage is progressing:
- ✅ Phase 1: Drizzle ORM integration (Complete)
- ✅ Phase 2: Redis caching implementation (Complete)
- 🔄 Phase 3: MongoDB → PostgreSQL data migration (In Progress)
- 📋 Phase 4: Vector search optimisation (Planned)
For the most current implementation details, always reference the Database Overview and service-specific documentation in the Backend Modules and Database Documentation sections.