Blob Service (src/azure/blob.service.ts)
Overview
The BlobService manages document storage and retrieval in the BidScript backend using Azure Blob Storage. It exposes methods for uploading, downloading, listing, and deleting documents, and supports metadata management and URL generation.
Dependencies
import { Injectable, Logger, NotFoundException } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';
import { BlobServiceClient, ContainerClient, BlockBlobClient } from '@azure/storage-blob';
import { Readable } from 'stream'; // needed for the Readable type in uploadDocument's signature
import { DocumentMetadata } from '../types/interfaces/document.interface';
Key Features
- Document upload and storage
- Document retrieval and download
- URL generation for document access
- Metadata management
- Container and blob management
- Error handling and logging
Core Methods
uploadDocument
async uploadDocument(
  file: Buffer | Readable,
  metadata?: DocumentMetadata,
  options?: {
    filename?: string;
    contentType?: string;
    containerName?: string;
  }
): Promise<{
  id: string;
  url: string;
  metadata: DocumentMetadata;
  contentType: string;
  size: number;
}>
Uploads a document to Azure Blob Storage.
Parameters:
- file: The document content as a Buffer or Readable stream
- metadata: Optional metadata for the document
- options: Upload options
  - filename: Custom filename (defaults to a generated UUID)
  - contentType: MIME type (auto-detected if not provided)
  - containerName: Target container name (uses the default if not specified)
Returns:
- Object containing the document ID, URL, metadata, content type, and size
Example:
// Assumes `import * as fs from 'fs';` at the top of the file
const documentBuffer = fs.readFileSync('document.pdf');
const result = await this.blobService.uploadDocument(documentBuffer, {
  title: 'Contract Document',
  tags: ['contract', 'legal'],
  userId: '123'
}, {
  contentType: 'application/pdf'
});
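The defaults described for the upload options can be sketched as a small pure function. This is only an illustration under stated assumptions: `resolveUploadOptions`, `MIME_BY_EXTENSION`, the injected `generateId` callback, and the `application/octet-stream` fallback are hypothetical names and choices, not taken from the BlobService source, which may detect content types differently.

```typescript
// Hypothetical extension-to-MIME table; the real service may use another mechanism.
const MIME_BY_EXTENSION: Record<string, string> = {
  '.pdf': 'application/pdf',
  '.txt': 'text/plain',
  '.json': 'application/json',
};

interface UploadOptions {
  filename?: string;
  contentType?: string;
  containerName?: string;
}

interface ResolvedUpload {
  blobName: string;
  contentType: string;
  containerName: string;
}

// Returns the lower-cased extension including the dot, or '' if there is none.
function extensionOf(name: string): string {
  const dot = name.lastIndexOf('.');
  return dot > 0 ? name.slice(dot).toLowerCase() : '';
}

// Applies the documented defaults: generated ID for the name, detected MIME
// type for the content type, and the default container.
// `generateId` stands in for a UUID generator (assumption).
function resolveUploadOptions(
  options: UploadOptions,
  generateId: () => string,
  defaultContainer = 'documents',
): ResolvedUpload {
  const blobName = options.filename ?? generateId();
  const detected: string | undefined = MIME_BY_EXTENSION[extensionOf(blobName)];
  return {
    blobName,
    contentType: options.contentType ?? detected ?? 'application/octet-stream',
    containerName: options.containerName ?? defaultContainer,
  };
}
```

An explicit `contentType` always wins over detection in this sketch, which matches the "auto-detected if not provided" wording above.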
getDocument
async getDocument(
  documentId: string,
  options?: {
    containerName?: string;
  }
): Promise<{
  content: Buffer;
  metadata: DocumentMetadata;
  contentType: string;
  size: number;
}>
Retrieves a document from Azure Blob Storage.
Parameters:
- documentId: The ID of the document to retrieve
- options: Retrieval options
  - containerName: Source container name (uses the default if not specified)
Returns:
- Object containing the document content, metadata, content type, and size
Throws:
- NotFoundException: If the document doesn't exist
Example:
try {
  const document = await this.blobService.getDocument('document-id');
  console.log(`Retrieved document: ${document.metadata.title}`);
  console.log(`Size: ${document.size} bytes`);
  // Process document.content
} catch (error) {
  if (error instanceof NotFoundException) {
    console.error('Document not found');
  } else {
    console.error('Error retrieving document:', error);
  }
}
getDocumentUrl
async getDocumentUrl(
  documentId: string,
  options?: {
    containerName?: string;
    expiresInMinutes?: number;
    permissions?: 'read' | 'write' | 'delete' | 'all';
  }
): Promise<string>
Generates a time-limited URL for accessing a document, scoped to the requested permissions.
Parameters:
- documentId: The ID of the document
- options: URL generation options
  - containerName: Container name
  - expiresInMinutes: URL expiration time in minutes (default: 60)
  - permissions: Access permissions (default: 'read')
Returns:
- URL for accessing the document
Example:
// Generate a read-only URL that expires in 30 minutes
const url = await this.blobService.getDocumentUrl('document-id', {
  expiresInMinutes: 30,
  permissions: 'read'
});
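The expiry and permission options suggest a shared access signature (SAS) URL, although the source does not confirm this. Assuming that is the case, the sketch below shows how the documented defaults might be turned into a validity window and a permission string. `resolveAccessWindow` and the letter mapping (in particular `'all'` as `'rwd'`) are hypothetical and not taken from the BlobService source.

```typescript
type DocumentPermission = 'read' | 'write' | 'delete' | 'all';

// Assumed mapping onto SAS permission letters (r = read, w = write, d = delete).
const PERMISSION_LETTERS: Record<DocumentPermission, string> = {
  read: 'r',
  write: 'w',
  delete: 'd',
  all: 'rwd',
};

interface UrlOptions {
  expiresInMinutes?: number;
  permissions?: DocumentPermission;
}

interface AccessWindow {
  startsOn: Date;
  expiresOn: Date;
  permissions: string;
}

// Applies the documented defaults (60 minutes, 'read') and computes the expiry.
// `now` is injectable so the calculation can be checked deterministically.
function resolveAccessWindow(options: UrlOptions = {}, now: Date = new Date()): AccessWindow {
  const minutes = options.expiresInMinutes ?? 60;
  if (!Number.isFinite(minutes) || minutes <= 0) {
    throw new RangeError('expiresInMinutes must be a positive number');
  }
  return {
    startsOn: now,
    expiresOn: new Date(now.getTime() + minutes * 60 * 1000),
    permissions: PERMISSION_LETTERS[options.permissions ?? 'read'],
  };
}
```

With `@azure/storage-blob`, values like these would typically be passed to `generateBlobSASQueryParameters`, with the permission string parsed by `BlobSASPermissions.parse`. Whether the service does so is an assumption.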
listDocuments
async listDocuments(
  options?: {
    containerName?: string;
    prefix?: string;
    maxResults?: number;
  }
): Promise<{
  id: string;
  url: string;
  metadata: DocumentMetadata;
  contentType: string;
  size: number;
  lastModified: Date;
}[]>
Lists documents in a container.
Parameters:
- options: Listing options
  - containerName: Container name
  - prefix: Filter by name prefix
  - maxResults: Maximum number of results
Returns:
- Array of document objects
Example:
// List up to 20 documents whose names start with 'pdf/'
// (the prefix filters by blob name, not by content type)
const documents = await this.blobService.listDocuments({
  prefix: 'pdf/',
  maxResults: 20
});
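Because `prefix` filters by name, a blob stored under `pdf/` is returned regardless of its content type, and a PDF stored elsewhere is not. The following in-memory sketch illustrates those semantics together with the `maxResults` cap. `selectBlobs` and `BlobEntry` are hypothetical names, and the real method presumably queries the container rather than an array.

```typescript
interface BlobEntry {
  name: string;
  contentType: string;
}

// Keeps entries whose name starts with the prefix, then caps the result count.
function selectBlobs(
  blobs: BlobEntry[],
  options: { prefix?: string; maxResults?: number } = {},
): BlobEntry[] {
  const prefix = options.prefix ?? '';
  const matching = blobs.filter((blob) => blob.name.startsWith(prefix));
  return options.maxResults === undefined
    ? matching
    : matching.slice(0, Math.max(0, options.maxResults));
}

const sampleBlobs: BlobEntry[] = [
  { name: 'pdf/contract.pdf', contentType: 'application/pdf' },
  { name: 'pdf/notes.txt', contentType: 'text/plain' },
  { name: 'reports/q1.pdf', contentType: 'application/pdf' },
];

// Matches both entries under 'pdf/', including the text file,
// and skips the PDF stored under 'reports/'.
const underPdf = selectBlobs(sampleBlobs, { prefix: 'pdf/', maxResults: 20 });
```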
deleteDocument
async deleteDocument(
  documentId: string,
  options?: {
    containerName?: string;
  }
): Promise<boolean>
Deletes a document from Azure Blob Storage.
Parameters:
- documentId: The ID of the document to delete
- options: Deletion options
  - containerName: Container name
Returns:
- Boolean indicating whether the deletion was successful
Example:
const deleted = await this.blobService.deleteDocument('document-id');
if (deleted) {
  console.log('Document deleted successfully');
}
Implementation Details
Azure Blob Storage Client Initialization
private blobServiceClient: BlobServiceClient;
private containerClient: ContainerClient;
private defaultContainerName: string;
constructor(private configService: ConfigService) {
  const connectionString = this.configService.get<string>('AZURE_STORAGE_CONNECTION_STRING');
  if (!connectionString) {
    throw new Error('Azure Storage connection string not configured');
  }
  this.blobServiceClient = BlobServiceClient.fromConnectionString(connectionString);
  this.defaultContainerName = this.configService.get<string>('AZURE_STORAGE_CONTAINER', 'documents');
  this.containerClient = this.blobServiceClient.getContainerClient(this.defaultContainerName);
  // Ensure the container exists. A constructor cannot await, so creation may
  // still be in flight when the service is first used. Failures are logged in
  // initializeContainer(); the catch here prevents an unhandled promise rejection.
  this.initializeContainer().catch(() => undefined);
}
private async initializeContainer(): Promise<void> {
  try {
    await this.containerClient.createIfNotExists();
    this.logger.log(`Container '${this.defaultContainerName}' initialized`);
  } catch (error) {
    this.logger.error(`Failed to initialize container: ${error.message}`);
    throw error;
  }
}
Metadata Serialization
// Blob metadata values must be strings, so objects and arrays are stored as JSON.
private serializeMetadata(metadata: DocumentMetadata): Record<string, string> {
  const serialized: Record<string, string> = {};
  for (const [key, value] of Object.entries(metadata)) {
    if (value === undefined || value === null) continue;
    if (typeof value === 'object') {
      serialized[key] = JSON.stringify(value);
    } else {
      serialized[key] = String(value);
    }
  }
  return serialized;
}
private deserializeMetadata(metadata: Record<string, string>): DocumentMetadata {
  const deserialized: DocumentMetadata = {};
  for (const [key, value] of Object.entries(metadata)) {
    try {
      // Try to parse as JSON. Note that this also converts strings such as
      // '123' or 'true' into a number or boolean.
      deserialized[key] = JSON.parse(value);
    } catch (e) {
      // Not JSON, use as-is
      deserialized[key] = value;
    }
  }
  return deserialized;
}
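A round trip through the two helpers is worth tracing, because parsing every value as JSON changes the type of string values that happen to look like numbers or booleans. The sketch below reimplements the logic as standalone functions over a generic `Metadata` type (a stand-in for `DocumentMetadata`, whose definition is not shown here) and adds `deserializeStructuredOnly`, a hypothetical stricter variant offered as one possible mitigation rather than as the service's behavior.

```typescript
type Metadata = Record<string, unknown>;

// Same logic as serializeMetadata above: objects become JSON, the rest strings.
function serializeMetadata(metadata: Metadata): Record<string, string> {
  const serialized: Record<string, string> = {};
  for (const [key, value] of Object.entries(metadata)) {
    if (value === undefined || value === null) continue;
    serialized[key] = typeof value === 'object' ? JSON.stringify(value) : String(value);
  }
  return serialized;
}

// Same logic as deserializeMetadata above: parse as JSON, fall back to the string.
function deserializeMetadata(stored: Record<string, string>): Metadata {
  const result: Metadata = {};
  for (const [key, value] of Object.entries(stored)) {
    try {
      result[key] = JSON.parse(value);
    } catch {
      result[key] = value;
    }
  }
  return result;
}

// Hypothetical variant: only values that look like JSON objects or arrays are parsed.
function deserializeStructuredOnly(stored: Record<string, string>): Metadata {
  const result: Metadata = {};
  for (const [key, value] of Object.entries(stored)) {
    const looksStructured = value.startsWith('{') || value.startsWith('[');
    if (!looksStructured) {
      result[key] = value;
      continue;
    }
    try {
      result[key] = JSON.parse(value);
    } catch {
      result[key] = value;
    }
  }
  return result;
}

const original: Metadata = { title: 'Contract Document', tags: ['contract', 'legal'], userId: '123' };
const stored = serializeMetadata(original);     // tags is stored as '["contract","legal"]'
const restored = deserializeMetadata(stored);   // userId comes back as the number 123
const strict = deserializeStructuredOnly(stored); // userId stays the string '123'
```

The stricter variant has its own trade-off: values that were originally numbers or booleans come back as strings, so callers would need to convert them explicitly.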
Integration with Other Services
The BlobService integrates with:
- DocumentParseService: For processing documents after upload
- RAG Module: For document storage as part of the RAG pipeline
- Editor Module: For saving edited documents
Error Handling
The service handles the following error scenarios:
- Connection errors: Issues connecting to Azure Storage
- Authentication errors: Invalid connection string or credentials
- Not found errors: Document not found in container
- Permission errors: Insufficient permissions
- Throttling errors: Exceeding Azure Storage request limits
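One way to tell these scenarios apart is to inspect the `statusCode` and `code` fields that Azure SDK errors usually carry. The sketch below is an assumption-laden illustration: `classifyStorageError`, `StorageErrorLike`, and the lookup table are hypothetical, and the listed error codes reflect commonly documented Azure Storage and Node.js values that should be checked against the official references before use.

```typescript
type StorageErrorKind =
  | 'connection'
  | 'authentication'
  | 'not_found'
  | 'permission'
  | 'throttling'
  | 'unknown';

// Minimal shape of the error fields inspected here (assumption).
interface StorageErrorLike {
  statusCode?: number;
  code?: string;
}

// Assumed mapping from Node.js network codes and Azure Storage error codes.
const KIND_BY_CODE: Record<string, StorageErrorKind> = {
  ECONNREFUSED: 'connection',
  ECONNRESET: 'connection',
  ENOTFOUND: 'connection',
  ETIMEDOUT: 'connection',
  AuthenticationFailed: 'authentication',
  AuthorizationFailure: 'permission',
  AuthorizationPermissionMismatch: 'permission',
  BlobNotFound: 'not_found',
  ContainerNotFound: 'not_found',
  ServerBusy: 'throttling',
};

// Classifies by error code first, then falls back to the HTTP status.
function classifyStorageError(error: StorageErrorLike): StorageErrorKind {
  const code = error.code;
  if (code !== undefined && Object.prototype.hasOwnProperty.call(KIND_BY_CODE, code)) {
    return KIND_BY_CODE[code];
  }
  switch (error.statusCode) {
    case 401:
      return 'authentication';
    case 403:
      return 'permission';
    case 404:
      return 'not_found';
    case 429:
    case 503:
      return 'throttling';
    default:
      return 'unknown';
  }
}
```

Checking the code before the status matters in this sketch because a 403 response can indicate either a failed authentication or insufficient permissions.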
Logging
The service uses the NestJS Logger for detailed logging:
private readonly logger = new Logger(BlobService.name);
// Usage
this.logger.log(`Uploading document with size ${file.length} bytes`);
this.logger.error(`Error uploading document: ${error.message}`, error.stack);
Configuration
Required environment variables: