Validation Workflow with DDEX Workbench

Learn how to integrate the DDEX Suite with DDEX Workbench and other validation tools for comprehensive message validation and quality assurance.

Problem Statement

DDEX message validation requires multiple layers of checks:

  • Schema Validation: Ensuring XML conforms to XSD schemas
  • Business Rule Validation: Checking DDEX-specific business logic
  • Platform Compliance: Meeting DSP-specific requirements
  • Data Quality: Verifying metadata completeness and accuracy
  • Reference Integrity: Ensuring all IDs and references are valid
  • Content Matching: Validating that metadata matches actual content

Manual validation is time-consuming, error-prone, and doesn't scale to large catalogs. Different validation tools also have varying coverage and may produce conflicting results.

Solution Approach

The DDEX Suite integrates with validation tools to create a comprehensive validation pipeline that catches errors early and ensures high-quality DDEX messages before submission to platforms.

Validation Stack

| Layer | Tool | Purpose | DDEX Suite Integration |
|---|---|---|---|
| Schema | DDEX Workbench | XSD validation | Pre-validation before Workbench |
| Business Rules | DDEX Suite | DDEX-specific logic | Built-in validation |
| Platform | DSP APIs | Platform requirements | Preset-based validation |
| Content | Audio analysis | Technical validation | Metadata cross-check |
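
Conceptually, these layers run in order of increasing cost, and a failure at an earlier layer short-circuits the later ones. The sketch below illustrates that sequencing; the layer functions are hypothetical stand-ins for the tools in the table, not part of the DDEX Suite API.

// Minimal sketch of layered validation sequencing (layer functions are hypothetical)
type LayerResult = { isValid: boolean; errors: string[] };
type ValidationLayer = (xml: string) => Promise<LayerResult>;

async function runValidationLayers(xml: string, layers: ValidationLayer[]): Promise<LayerResult> {
  const errors: string[] = [];
  for (const layer of layers) {
    const result = await layer(xml);
    errors.push(...result.errors);
    if (!result.isValid) {
      // Stop at the first failing layer: later checks assume earlier ones passed
      return { isValid: false, errors };
    }
  }
  return { isValid: true, errors };
}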

DDEX Workbench Integration

Basic Integration Setup

import { DDEXParser, DdexBuilder } from 'ddex-suite';
import { exec } from 'child_process';
import { promisify } from 'util';
import { writeFileSync, readFileSync, unlinkSync } from 'fs';
import path from 'path';

const execAsync = promisify(exec);

class DDEXWorkbenchIntegration {
  private workbenchPath: string;
  private parser: DDEXParser;
  private builder: DdexBuilder;

  constructor(workbenchPath: string) {
    this.workbenchPath = workbenchPath;
    this.parser = new DDEXParser();
    this.builder = new DdexBuilder();
  }

  async validateWithWorkbench(xmlContent: string): Promise<WorkbenchValidationResult> {
    // First, validate with DDEX Suite built-in validation
    const suiteValidation = await this.validateWithSuite(xmlContent);

    if (!suiteValidation.isValid) {
      return {
        isValid: false,
        source: 'DDEX_SUITE',
        errors: suiteValidation.errors,
        warnings: suiteValidation.warnings,
        workbenchResult: null
      };
    }

    // Create a temporary file for DDEX Workbench
    const tempFilePath = path.join(process.cwd(), `temp_validation_${Date.now()}.xml`);
    writeFileSync(tempFilePath, xmlContent, 'utf-8');

    try {
      // Run DDEX Workbench validation
      const workbenchResult = await this.runWorkbenchValidation(tempFilePath);

      return {
        isValid: workbenchResult.isValid && suiteValidation.isValid,
        source: 'COMBINED',
        errors: [...suiteValidation.errors, ...workbenchResult.errors],
        warnings: [...suiteValidation.warnings, ...workbenchResult.warnings],
        workbenchResult: workbenchResult,
        suiteResult: suiteValidation
      };

    } finally {
      // Clean up the temporary file
      try {
        unlinkSync(tempFilePath);
      } catch (e) {
        console.warn('Failed to clean up temp file:', e);
      }
    }
  }

  private async validateWithSuite(xmlContent: string): Promise<ValidationResult> {
    try {
      // Parse with DDEX Suite
      const parseResult = await this.parser.parse(xmlContent, {
        validateReferences: true,
        includeRawExtensions: true
      });

      // Validate structure and business rules
      const validation = await this.builder.validate();

      // Additional DDEX Suite specific checks
      const additionalErrors: string[] = [];
      const additionalWarnings: string[] = [];

      // Check for common issues
      if (parseResult.flat.releases.length === 0) {
        additionalErrors.push('No releases found in message');
      }

      parseResult.flat.releases.forEach(release => {
        if (!release.releaseId) {
          additionalErrors.push('Release missing required ID');
        }

        if (!release.title) {
          additionalErrors.push(`Release ${release.releaseId} missing title`);
        }

        if (release.trackCount === 0) {
          additionalWarnings.push(`Release ${release.releaseId} has no tracks`);
        }
      });

      parseResult.flat.soundRecordings.forEach(track => {
        if (!track.isrc) {
          additionalWarnings.push(`Track ${track.title} missing ISRC`);
        }

        if (!track.duration) {
          additionalWarnings.push(`Track ${track.title} missing duration`);
        }
      });

      return {
        isValid: validation.isValid && additionalErrors.length === 0,
        errors: [...validation.errors, ...additionalErrors],
        warnings: [...validation.warnings, ...additionalWarnings]
      };

    } catch (error) {
      return {
        isValid: false,
        errors: [`DDEX Suite parsing failed: ${error.message}`],
        warnings: []
      };
    }
  }

  private async runWorkbenchValidation(filePath: string): Promise<WorkbenchResult> {
    try {
      // Run the DDEX Workbench command-line tool
      const command = `java -jar "${this.workbenchPath}" validate "${filePath}" --format json`;
      const { stdout, stderr } = await execAsync(command, { timeout: 30000 });

      if (stderr) {
        console.warn('DDEX Workbench stderr:', stderr);
      }

      // Parse Workbench JSON output
      const result = JSON.parse(stdout);

      return {
        isValid: result.isValid || result.validationStatus === 'VALID',
        errors: result.errors || [],
        warnings: result.warnings || [],
        schemaVersion: result.schemaVersion,
        validationTime: result.validationTime,
        fileSize: result.fileSize,
        rawOutput: result
      };

    } catch (error) {
      // Handle Workbench execution errors
      if (error.code === 'ENOENT') {
        throw new Error('DDEX Workbench not found. Please check the path.');
      }

      if (error.signal === 'SIGTERM') {
        throw new Error('DDEX Workbench validation timed out');
      }

      // Try to parse the error output if it's JSON
      try {
        const errorResult = JSON.parse(error.stdout || error.message);
        return {
          isValid: false,
          errors: errorResult.errors || [error.message],
          warnings: errorResult.warnings || [],
          schemaVersion: null,
          validationTime: null,
          fileSize: null,
          rawOutput: errorResult
        };
      } catch (parseError) {
        throw new Error(`DDEX Workbench validation failed: ${error.message}`);
      }
    }
  }

  async validateAndRepair(xmlContent: string): Promise<ValidationRepairResult> {
    const validation = await this.validateWithWorkbench(xmlContent);

    if (validation.isValid) {
      return {
        originalValid: true,
        repairedXml: xmlContent,
        repairsMade: [],
        finalValidation: validation
      };
    }

    // Attempt automatic repairs
    const repairs: string[] = [];
    let repairedXml = xmlContent;

    try {
      // Parse with DDEX Suite for repair attempts
      const parseResult = await this.parser.parse(xmlContent);

      // Apply common repairs
      if (this.hasRepairableIssues(validation)) {
        const repairResult = await this.applyAutomaticRepairs(parseResult, validation);
        repairedXml = repairResult.xml;
        repairs.push(...repairResult.repairs);
      }

      // Re-validate the repaired XML
      const finalValidation = await this.validateWithWorkbench(repairedXml);

      return {
        originalValid: false,
        repairedXml: repairedXml,
        repairsMade: repairs,
        finalValidation: finalValidation
      };

    } catch (error) {
      return {
        originalValid: false,
        repairedXml: xmlContent,
        repairsMade: [],
        finalValidation: validation,
        repairError: error.message
      };
    }
  }

  private hasRepairableIssues(validation: WorkbenchValidationResult): boolean {
    const repairableErrorPatterns = [
      /missing.*required.*element/i,
      /invalid.*date.*format/i,
      /missing.*namespace/i,
      /duplicate.*id/i
    ];

    return validation.errors.some(error =>
      repairableErrorPatterns.some(pattern => pattern.test(error))
    );
  }

  private async applyAutomaticRepairs(parseResult: any, validation: WorkbenchValidationResult): Promise<RepairResult> {
    const repairs: string[] = [];

    // Reset the builder and rebuild with corrections
    this.builder.reset();

    // Apply repairs to releases
    for (const release of parseResult.flat.releases) {
      const repairedRelease = this.repairRelease(release, validation.errors);
      this.builder.addRelease(repairedRelease);

      // The repair helpers always return shallow copies, so compare by content
      if (JSON.stringify(repairedRelease) !== JSON.stringify(release)) {
        repairs.push(`Repaired release ${release.releaseId}`);
      }
    }

    // Apply repairs to resources
    for (const resource of parseResult.flat.soundRecordings) {
      const repairedResource = this.repairResource(resource, validation.errors);
      this.builder.addResource(repairedResource);

      if (JSON.stringify(repairedResource) !== JSON.stringify(resource)) {
        repairs.push(`Repaired resource ${resource.soundRecordingId}`);
      }
    }

    // Rebuild the XML
    const repairedXml = await this.builder.build();

    return {
      xml: repairedXml,
      repairs: repairs
    };
  }

  private repairRelease(release: any, errors: string[]): any {
    const repaired = { ...release };

    // Fix missing required fields
    if (!repaired.releaseDate && errors.some(e => e.includes('releaseDate'))) {
      repaired.releaseDate = new Date().toISOString().split('T')[0];
    }

    if (!repaired.territory && errors.some(e => e.includes('territory'))) {
      repaired.territory = 'Worldwide';
    }

    // Fix invalid date formats
    if (repaired.releaseDate && !this.isValidDate(repaired.releaseDate)) {
      repaired.releaseDate = this.parseAndFormatDate(repaired.releaseDate);
    }

    return repaired;
  }

  private repairResource(resource: any, errors: string[]): any {
    const repaired = { ...resource };

    // Fix missing duration
    if (!repaired.duration && errors.some(e => e.includes('duration'))) {
      repaired.duration = 'PT3M30S'; // Default 3:30
    }

    // Fix invalid ISRC format
    if (repaired.isrc && !this.isValidISRC(repaired.isrc)) {
      repaired.isrc = this.normalizeISRC(repaired.isrc);
    }

    return repaired;
  }

  private isValidDate(dateString: string): boolean {
    const dateRegex = /^\d{4}-\d{2}-\d{2}$/;
    return dateRegex.test(dateString) && !isNaN(Date.parse(dateString));
  }

  private parseAndFormatDate(dateString: string): string {
    try {
      const parsed = new Date(dateString);
      return parsed.toISOString().split('T')[0];
    } catch {
      return new Date().toISOString().split('T')[0];
    }
  }

  private isValidISRC(isrc: string): boolean {
    const isrcRegex = /^[A-Z]{2}[A-Z0-9]{3}\d{7}$/;
    return isrcRegex.test(isrc);
  }

  private normalizeISRC(isrc: string): string {
    // Remove spaces and hyphens, convert to uppercase
    const normalized = isrc.replace(/[\s-]/g, '').toUpperCase();

    // If it matches the pattern after normalization, return it
    if (this.isValidISRC(normalized)) {
      return normalized;
    }

    // If still invalid, fall back to a placeholder
    return 'USRC17600000';
  }
}

// Type definitions
interface WorkbenchValidationResult {
  isValid: boolean;
  source: 'DDEX_SUITE' | 'WORKBENCH' | 'COMBINED';
  errors: string[];
  warnings: string[];
  workbenchResult: WorkbenchResult | null;
  suiteResult?: ValidationResult;
}

interface WorkbenchResult {
  isValid: boolean;
  errors: string[];
  warnings: string[];
  schemaVersion: string | null;
  validationTime: number | null;
  fileSize: number | null;
  rawOutput: any;
}

interface ValidationResult {
  isValid: boolean;
  errors: string[];
  warnings: string[];
}

interface ValidationRepairResult {
  originalValid: boolean;
  repairedXml: string;
  repairsMade: string[];
  finalValidation: WorkbenchValidationResult;
  repairError?: string;
}

interface RepairResult {
  xml: string;
  repairs: string[];
}
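
With the class and types in place, a typical single-file run looks like the sketch below. The jar path and file name are placeholders; substitute your local Workbench install.

// Example usage (sketch): the jar path and file name are placeholders
async function validateOneFile(): Promise<void> {
  const integration = new DDEXWorkbenchIntegration('/opt/ddex/ddex-workbench.jar');
  const xml = readFileSync('release.xml', 'utf-8');

  const result = await integration.validateWithWorkbench(xml);
  console.log(`Valid: ${result.isValid} (source: ${result.source})`);
  result.errors.forEach(e => console.error(`ERROR: ${e}`));
  result.warnings.forEach(w => console.warn(`WARN: ${w}`));

  if (!result.isValid) {
    // Optionally attempt automatic repair of known-fixable issues
    const repair = await integration.validateAndRepair(xml);
    console.log(`Repairs applied: ${repair.repairsMade.length}`);
  }
}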

Batch Validation Workflow

class BatchValidationWorkflow {
  private integration: DDEXWorkbenchIntegration;
  private maxConcurrent: number;

  constructor(workbenchPath: string, maxConcurrent = 5) {
    this.integration = new DDEXWorkbenchIntegration(workbenchPath);
    this.maxConcurrent = maxConcurrent;
  }

  async validateBatch(filePaths: string[]): Promise<BatchValidationResult> {
    const results: ValidationFileResult[] = [];
    const semaphore = new Semaphore(this.maxConcurrent);

    console.log(`Starting batch validation of ${filePaths.length} files...`);

    const promises = filePaths.map(async (filePath, index) => {
      await semaphore.acquire();

      try {
        const result = await this.validateSingleFile(filePath, index + 1, filePaths.length);
        results.push(result);
        return result;
      } finally {
        semaphore.release();
      }
    });

    await Promise.all(promises);

    return this.compileBatchResults(results);
  }

  private async validateSingleFile(filePath: string, index: number, total: number): Promise<ValidationFileResult> {
    const startTime = Date.now();
    console.log(`[${index}/${total}] Validating ${path.basename(filePath)}...`);

    try {
      const xmlContent = readFileSync(filePath, 'utf-8');
      const validation = await this.integration.validateWithWorkbench(xmlContent);

      const duration = Date.now() - startTime;

      return {
        filePath,
        fileName: path.basename(filePath),
        isValid: validation.isValid,
        errors: validation.errors,
        warnings: validation.warnings,
        validationTime: duration,
        fileSize: xmlContent.length,
        success: true
      };

    } catch (error) {
      const duration = Date.now() - startTime;

      return {
        filePath,
        fileName: path.basename(filePath),
        isValid: false,
        errors: [error.message],
        warnings: [],
        validationTime: duration,
        fileSize: 0,
        success: false
      };
    }
  }

  private compileBatchResults(results: ValidationFileResult[]): BatchValidationResult {
    const valid = results.filter(r => r.isValid && r.success);
    const invalid = results.filter(r => !r.isValid || !r.success);
    const totalErrors = results.reduce((sum, r) => sum + r.errors.length, 0);
    const totalWarnings = results.reduce((sum, r) => sum + r.warnings.length, 0);
    // Guard against an empty batch to avoid division by zero
    const avgValidationTime = results.length
      ? results.reduce((sum, r) => sum + r.validationTime, 0) / results.length
      : 0;

    // Categorize common errors
    const errorCategories = this.categorizeErrors(results);

    return {
      totalFiles: results.length,
      validFiles: valid.length,
      invalidFiles: invalid.length,
      totalErrors,
      totalWarnings,
      averageValidationTime: avgValidationTime,
      errorCategories,
      results,
      summary: {
        successRate: results.length ? (valid.length / results.length) * 100 : 0,
        mostCommonErrors: this.getMostCommonErrors(results, 5),
        recommendedActions: this.getRecommendedActions(errorCategories)
      }
    };
  }

  private categorizeErrors(results: ValidationFileResult[]): ErrorCategories {
    const categories: ErrorCategories = {
      schema: [],
      businessRules: [],
      references: [],
      dataQuality: [],
      technical: []
    };

    results.forEach(result => {
      result.errors.forEach(error => {
        if (this.isSchemaError(error)) {
          categories.schema.push({ file: result.fileName, error });
        } else if (this.isBusinessRuleError(error)) {
          categories.businessRules.push({ file: result.fileName, error });
        } else if (this.isReferenceError(error)) {
          categories.references.push({ file: result.fileName, error });
        } else if (this.isDataQualityError(error)) {
          categories.dataQuality.push({ file: result.fileName, error });
        } else {
          categories.technical.push({ file: result.fileName, error });
        }
      });
    });

    return categories;
  }

  private isSchemaError(error: string): boolean {
    return /schema|xsd|element.*not.*allowed|invalid.*structure/i.test(error);
  }

  private isBusinessRuleError(error: string): boolean {
    return /business.*rule|ddex.*rule|must.*contain|required.*when/i.test(error);
  }

  private isReferenceError(error: string): boolean {
    return /reference|id.*not.*found|unresolved.*link|invalid.*reference/i.test(error);
  }

  private isDataQualityError(error: string): boolean {
    return /missing.*required|invalid.*format|empty.*field|quality/i.test(error);
  }

  private getMostCommonErrors(results: ValidationFileResult[], limit: number): Array<{error: string, count: number}> {
    const errorCounts = new Map<string, number>();

    results.forEach(result => {
      result.errors.forEach(error => {
        // Normalize similar errors
        const normalizedError = this.normalizeError(error);
        errorCounts.set(normalizedError, (errorCounts.get(normalizedError) || 0) + 1);
      });
    });

    return Array.from(errorCounts.entries())
      .map(([error, count]) => ({ error, count }))
      .sort((a, b) => b.count - a.count)
      .slice(0, limit);
  }

  private normalizeError(error: string): string {
    // Normalize similar errors by removing specific IDs and file references
    return error
      .replace(/[A-Z0-9]{10,}/g, '[ID]')
      .replace(/line \d+/g, 'line [N]')
      .replace(/\d{4}-\d{2}-\d{2}/g, '[DATE]')
      .trim();
  }

  private getRecommendedActions(categories: ErrorCategories): string[] {
    const actions: string[] = [];

    if (categories.schema.length > 0) {
      actions.push('Review XML schema compliance - use DDEX Suite builder to ensure valid structure');
    }

    if (categories.businessRules.length > 0) {
      actions.push('Check DDEX business rules - ensure all required fields are present');
    }

    if (categories.references.length > 0) {
      actions.push('Validate ID references - ensure all referenced IDs exist in the message');
    }

    if (categories.dataQuality.length > 0) {
      actions.push('Improve data quality - check for missing required fields and invalid formats');
    }

    if (categories.technical.length > 0) {
      actions.push('Address technical issues - check file encoding and XML formatting');
    }

    return actions;
  }
}

// Semaphore for controlling concurrency
class Semaphore {
  private permits: number;
  private waiting: Array<() => void> = [];

  constructor(permits: number) {
    this.permits = permits;
  }

  async acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return;
    }

    return new Promise<void>(resolve => {
      this.waiting.push(resolve);
    });
  }

  release(): void {
    this.permits++;
    if (this.waiting.length > 0) {
      const next = this.waiting.shift()!;
      this.permits--;
      next();
    }
  }
}

// Type definitions for batch validation
interface ValidationFileResult {
  filePath: string;
  fileName: string;
  isValid: boolean;
  errors: string[];
  warnings: string[];
  validationTime: number;
  fileSize: number;
  success: boolean;
}

interface BatchValidationResult {
  totalFiles: number;
  validFiles: number;
  invalidFiles: number;
  totalErrors: number;
  totalWarnings: number;
  averageValidationTime: number;
  errorCategories: ErrorCategories;
  results: ValidationFileResult[];
  summary: {
    successRate: number;
    mostCommonErrors: Array<{error: string, count: number}>;
    recommendedActions: string[];
  };
}

interface ErrorCategories {
  schema: Array<{file: string, error: string}>;
  businessRules: Array<{file: string, error: string}>;
  references: Array<{file: string, error: string}>;
  dataQuality: Array<{file: string, error: string}>;
  technical: Array<{file: string, error: string}>;
}
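
Wiring the workflow to a directory of files looks like the sketch below (the directory and jar path are placeholders). Note that the Semaphore caps how many Workbench processes run at once; each invocation spawns a separate JVM, so maxConcurrent is effectively a CPU and memory budget rather than a simple thread count.

// Example usage (sketch): directory and jar path are placeholders
import { readdirSync } from 'fs';

async function runBatch(): Promise<void> {
  const workflow = new BatchValidationWorkflow('/opt/ddex/ddex-workbench.jar', 4);

  const files = readdirSync('./ddex-files')
    .filter(f => f.endsWith('.xml'))
    .map(f => path.join('./ddex-files', f));

  const batch = await workflow.validateBatch(files);

  console.log(`Success rate: ${batch.summary.successRate.toFixed(1)}%`);
  batch.summary.mostCommonErrors.forEach(({ error, count }) =>
    console.log(`${count}x ${error}`)
  );
  batch.summary.recommendedActions.forEach(a => console.log(`ACTION: ${a}`));
}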

Python Validation Workflow

# Python implementation for validation workflows
import asyncio
import json
import os
import re
import tempfile
from pathlib import Path
from typing import List, Dict, Any, Optional
from ddex_parser import DDEXParser
from ddex_builder import DdexBuilder
import pandas as pd

class DDEXValidationWorkflow:
    def __init__(self, workbench_path: str):
        self.workbench_path = workbench_path
        self.parser = DDEXParser()
        self.builder = DdexBuilder()

    async def validate_with_workbench(self, xml_content: str) -> Dict[str, Any]:
        """Validate DDEX XML using both DDEX Suite and DDEX Workbench"""

        # First validate with DDEX Suite
        suite_result = await self.validate_with_suite(xml_content)

        if not suite_result['is_valid']:
            return {
                'is_valid': False,
                'source': 'DDEX_SUITE',
                'errors': suite_result['errors'],
                'warnings': suite_result['warnings'],
                'workbench_result': None
            }

        # Then validate with DDEX Workbench
        workbench_result = await self.run_workbench_validation(xml_content)

        return {
            'is_valid': suite_result['is_valid'] and workbench_result['is_valid'],
            'source': 'COMBINED',
            'errors': suite_result['errors'] + workbench_result['errors'],
            'warnings': suite_result['warnings'] + workbench_result['warnings'],
            'suite_result': suite_result,
            'workbench_result': workbench_result
        }

    async def validate_with_suite(self, xml_content: str) -> Dict[str, Any]:
        """Validate using DDEX Suite built-in validation"""
        try:
            # Parse with validation
            result = self.parser.parse(xml_content)

            # Perform additional business rule checks
            errors = []
            warnings = []

            # Check for missing releases
            if result.release_count == 0:
                errors.append('No releases found in message')

            # Validate release data
            for release in result.releases:
                if not release.get('release_id'):
                    errors.append('Release missing required ID')

                if not release.get('title'):
                    errors.append(f"Release {release.get('release_id', 'unknown')} missing title")

                if not release.get('display_artist'):
                    errors.append(f"Release {release.get('release_id', 'unknown')} missing artist")

            # Additional validation using the builder
            builder_validation = await self.builder.validate()
            if not builder_validation.is_valid:
                errors.extend(builder_validation.errors)
                warnings.extend(builder_validation.warnings)

            return {
                'is_valid': len(errors) == 0,
                'errors': errors,
                'warnings': warnings,
                'parse_result': result
            }

        except Exception as e:
            return {
                'is_valid': False,
                'errors': [f'DDEX Suite validation failed: {str(e)}'],
                'warnings': []
            }

    async def run_workbench_validation(self, xml_content: str) -> Dict[str, Any]:
        """Run DDEX Workbench validation"""
        with tempfile.NamedTemporaryFile(mode='w', suffix='.xml', delete=False) as temp_file:
            temp_file.write(xml_content)
            temp_file_path = temp_file.name

        try:
            # Run DDEX Workbench
            cmd = [
                'java', '-jar', self.workbench_path,
                'validate', temp_file_path,
                '--format', 'json'
            ]

            process = await asyncio.create_subprocess_exec(
                *cmd,
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE
            )

            stdout, stderr = await process.communicate()

            if process.returncode != 0:
                raise RuntimeError(f'DDEX Workbench failed: {stderr.decode()}')

            # Parse the JSON result
            result = json.loads(stdout.decode())

            return {
                'is_valid': result.get('isValid', False),
                'errors': result.get('errors', []),
                'warnings': result.get('warnings', []),
                'schema_version': result.get('schemaVersion'),
                'validation_time': result.get('validationTime'),
                'raw_output': result
            }

        except Exception as e:
            return {
                'is_valid': False,
                'errors': [f'Workbench validation failed: {str(e)}'],
                'warnings': []
            }

        finally:
            # Clean up the temp file
            try:
                os.unlink(temp_file_path)
            except OSError:
                pass

    async def validate_batch(self, file_paths: List[str], max_concurrent: int = 5) -> Dict[str, Any]:
        """Validate multiple files concurrently"""
        semaphore = asyncio.Semaphore(max_concurrent)

        async def validate_single_file(file_path: str) -> Dict[str, Any]:
            async with semaphore:
                try:
                    with open(file_path, 'r', encoding='utf-8') as f:
                        xml_content = f.read()

                    result = await self.validate_with_workbench(xml_content)
                    result['file_path'] = file_path
                    result['file_name'] = Path(file_path).name
                    result['file_size'] = len(xml_content)

                    return result

                except Exception as e:
                    return {
                        'file_path': file_path,
                        'file_name': Path(file_path).name,
                        'is_valid': False,
                        'errors': [f'File processing failed: {str(e)}'],
                        'warnings': [],
                        'file_size': 0
                    }

        print(f"Starting batch validation of {len(file_paths)} files...")

        tasks = [validate_single_file(path) for path in file_paths]
        results = await asyncio.gather(*tasks)

        # Compile batch statistics
        return self.compile_batch_results(results)

    def compile_batch_results(self, results: List[Dict[str, Any]]) -> Dict[str, Any]:
        """Compile batch validation results with statistics"""
        valid_files = [r for r in results if r['is_valid']]
        invalid_files = [r for r in results if not r['is_valid']]

        # Error analysis
        all_errors = []
        for result in results:
            all_errors.extend(result.get('errors', []))

        error_counts = {}
        for error in all_errors:
            normalized = self.normalize_error(error)
            error_counts[normalized] = error_counts.get(normalized, 0) + 1

        most_common_errors = sorted(
            error_counts.items(),
            key=lambda x: x[1],
            reverse=True
        )[:10]

        return {
            'total_files': len(results),
            'valid_files': len(valid_files),
            'invalid_files': len(invalid_files),
            # Guard against an empty batch to avoid division by zero
            'success_rate': len(valid_files) / len(results) * 100 if results else 0.0,
            'total_errors': sum(len(r.get('errors', [])) for r in results),
            'total_warnings': sum(len(r.get('warnings', [])) for r in results),
            'most_common_errors': most_common_errors,
            'results': results,
            'recommendations': self.generate_recommendations(results)
        }

    def normalize_error(self, error: str) -> str:
        """Normalize error messages for pattern analysis"""
        # Remove specific IDs and numbers
        normalized = re.sub(r'[A-Z0-9]{10,}', '[ID]', error)
        normalized = re.sub(r'\d{4}-\d{2}-\d{2}', '[DATE]', normalized)
        normalized = re.sub(r'line \d+', 'line [N]', normalized)

        return normalized.strip()

    def generate_recommendations(self, results: List[Dict[str, Any]]) -> List[str]:
        """Generate actionable recommendations based on validation results"""
        recommendations = []

        # Analyze error patterns
        all_errors = []
        for result in results:
            all_errors.extend(result.get('errors', []))

        schema_errors = sum(1 for e in all_errors if 'schema' in e.lower() or 'xsd' in e.lower())
        reference_errors = sum(1 for e in all_errors if 'reference' in e.lower() or 'id' in e.lower())
        date_errors = sum(1 for e in all_errors if 'date' in e.lower())
        missing_field_errors = sum(1 for e in all_errors if 'missing' in e.lower() or 'required' in e.lower())

        if schema_errors > 0:
            recommendations.append('Use DDEX Suite builder to ensure XML schema compliance')

        if reference_errors > 0:
            recommendations.append('Enable reference validation in DDEX Suite parser')

        if date_errors > 0:
            recommendations.append('Validate date formats (use ISO 8601: YYYY-MM-DD)')

        if missing_field_errors > 0:
            recommendations.append('Review platform requirements for required fields')

        success_rate = len([r for r in results if r['is_valid']]) / len(results) * 100 if results else 0.0

        if success_rate < 50:
            recommendations.append('Consider a data quality audit - over 50% of files have validation errors')
        elif success_rate < 80:
            recommendations.append('Focus on common error patterns to improve validation success rate')

        return recommendations

    def export_validation_report(self, batch_result: Dict[str, Any], output_path: str):
        """Export a detailed validation report"""
        # Create the detailed report
        report = {
            'summary': {
                'total_files': batch_result['total_files'],
                'valid_files': batch_result['valid_files'],
                'invalid_files': batch_result['invalid_files'],
                'success_rate': batch_result['success_rate'],
                'total_errors': batch_result['total_errors'],
                'total_warnings': batch_result['total_warnings']
            },
            'error_analysis': {
                'most_common_errors': batch_result['most_common_errors'],
                'recommendations': batch_result['recommendations']
            },
            'file_details': []
        }

        # Add per-file details
        for result in batch_result['results']:
            file_detail = {
                'file_name': result['file_name'],
                'file_path': result['file_path'],
                'is_valid': result['is_valid'],
                'error_count': len(result.get('errors', [])),
                'warning_count': len(result.get('warnings', [])),
                'errors': result.get('errors', []),
                'warnings': result.get('warnings', [])
            }
            report['file_details'].append(file_detail)

        # Save the report
        with open(output_path, 'w') as f:
            json.dump(report, f, indent=2)

        print(f"Validation report saved to {output_path}")

    def create_validation_dashboard(self, batch_result: Dict[str, Any]) -> pd.DataFrame:
        """Create a pandas DataFrame for validation analysis"""
        data = []

        for result in batch_result['results']:
            data.append({
                'file_name': result['file_name'],
                'is_valid': result['is_valid'],
                'error_count': len(result.get('errors', [])),
                'warning_count': len(result.get('warnings', [])),
                'file_size': result.get('file_size', 0),
                'has_schema_errors': any('schema' in e.lower() for e in result.get('errors', [])),
                'has_reference_errors': any('reference' in e.lower() for e in result.get('errors', [])),
                'has_date_errors': any('date' in e.lower() for e in result.get('errors', []))
            })

        df = pd.DataFrame(data)

        print("\nValidation Dashboard:")
        print(f"Success Rate: {batch_result['success_rate']:.1f}%")
        print(f"Average Errors per File: {df['error_count'].mean():.1f}")
        print(f"Files with Schema Errors: {df['has_schema_errors'].sum()}")
        print(f"Files with Reference Errors: {df['has_reference_errors'].sum()}")
        print(f"Files with Date Errors: {df['has_date_errors'].sum()}")

        return df

# Usage example
async def main():
    # Initialize the validation workflow
    validator = DDEXValidationWorkflow('/path/to/ddex-workbench.jar')

    # Validate a single file
    with open('sample.xml', 'r') as f:
        xml_content = f.read()

    single_result = await validator.validate_with_workbench(xml_content)
    print(f"Single file validation: {'PASS' if single_result['is_valid'] else 'FAIL'}")

    if single_result['errors']:
        print("Errors:")
        for error in single_result['errors']:
            print(f"  - {error}")

    # Validate a batch of files
    file_paths = ['file1.xml', 'file2.xml', 'file3.xml']
    batch_result = await validator.validate_batch(file_paths)

    print(f"\nBatch validation complete:")
    print(f"Success rate: {batch_result['success_rate']:.1f}%")
    print(f"Valid files: {batch_result['valid_files']}/{batch_result['total_files']}")

    # Export a detailed report
    validator.export_validation_report(batch_result, 'validation_report.json')

    # Create the analysis dashboard
    df = validator.create_validation_dashboard(batch_result)

    # Save the DataFrame for further analysis
    df.to_csv('validation_analysis.csv', index=False)

# Run the validation workflow
if __name__ == "__main__":
    asyncio.run(main())

Continuous Integration Validation

# GitHub Actions workflow for automated DDEX validation
name: DDEX Validation Pipeline

on:
  push:
    paths: ['ddex-files/**/*.xml']
  pull_request:
    paths: ['ddex-files/**/*.xml']

jobs:
  validate-ddex:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'

      - name: Setup Java (for DDEX Workbench)
        uses: actions/setup-java@v4
        with:
          java-version: '11'
          distribution: 'temurin'

      - name: Install DDEX Suite
        run: npm install ddex-parser ddex-builder

      - name: Download DDEX Workbench
        run: wget https://releases.ddex.net/workbench/ddex-workbench-latest.jar

      - name: Validate DDEX Files
        run: node scripts/validate-ddex-files.js
        env:
          WORKBENCH_PATH: ./ddex-workbench-latest.jar

      - name: Upload Validation Report
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: validation-report
          path: validation-report.json

      - name: Comment PR with Results
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const report = JSON.parse(fs.readFileSync('validation-report.json', 'utf8'));

            const comment = `## DDEX Validation Results

            - **Total Files**: ${report.summary.total_files}
            - **Valid Files**: ${report.summary.valid_files}
            - **Success Rate**: ${report.summary.success_rate.toFixed(1)}%
            - **Total Errors**: ${report.summary.total_errors}

            ${report.summary.success_rate < 100 ? '⚠️ **Some files failed validation**' : '✅ **All files passed validation**'}

            ### Most Common Errors
            ${report.error_analysis.most_common_errors.slice(0, 5).map(e => `- ${e[0]} (${e[1]} occurrences)`).join('\n')}

            ### Recommendations
            ${report.error_analysis.recommendations.map(r => `- ${r}`).join('\n')}
            `;

            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: comment
            });
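
The workflow assumes the repository provides a scripts/validate-ddex-files.js entry point; that script is not shown elsewhere in this guide, so the sketch below is a hypothetical minimal version. It is plain Node.js JavaScript (so the workflow's node invocation runs it directly), shells out to the Workbench jar via the same CLI contract used earlier, and writes validation-report.json in the shape the PR-comment step reads. The error-analysis fields are left empty here for brevity.

// Sketch of scripts/validate-ddex-files.js (hypothetical; adapt to your repository)
const fs = require('fs');
const path = require('path');
const { execFileSync } = require('child_process');

function validateXmlFile(filePath, workbenchJar) {
  // Assumes the CLI contract used earlier: java -jar workbench.jar validate <file> --format json
  try {
    const stdout = execFileSync(
      'java',
      ['-jar', workbenchJar, 'validate', filePath, '--format', 'json'],
      { encoding: 'utf-8', timeout: 30000 }
    );
    const parsed = JSON.parse(stdout);
    return { isValid: !!parsed.isValid, errors: parsed.errors || [] };
  } catch (err) {
    return { isValid: false, errors: [String(err.message || err)] };
  }
}

const dir = 'ddex-files';
const files = fs.readdirSync(dir).filter(f => f.endsWith('.xml'));
const results = files.map(f => ({
  file_name: f,
  ...validateXmlFile(path.join(dir, f), process.env.WORKBENCH_PATH)
}));

const validCount = results.filter(r => r.isValid).length;
const report = {
  summary: {
    total_files: results.length,
    valid_files: validCount,
    success_rate: results.length ? (validCount / results.length) * 100 : 0,
    total_errors: results.reduce((n, r) => n + r.errors.length, 0)
  },
  error_analysis: { most_common_errors: [], recommendations: [] },
  file_details: results
};

fs.writeFileSync('validation-report.json', JSON.stringify(report, null, 2));
// Exit non-zero when any file is invalid so the pipeline surfaces the failure
process.exit(validCount === results.length ? 0 : 1);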

Performance Considerations

Validation Optimization

// Optimize validation for large batches by caching results per content hash
import { createHash } from 'crypto';

class OptimizedValidator {
  private validationCache = new Map<string, WorkbenchValidationResult>();

  constructor(private integration: DDEXWorkbenchIntegration) {}

  async validateWithCaching(xmlContent: string): Promise<WorkbenchValidationResult> {
    // Generate a content hash for caching
    const hash = this.generateHash(xmlContent);

    const cached = this.validationCache.get(hash);
    if (cached) {
      return cached;
    }

    const result = await this.integration.validateWithWorkbench(xmlContent);
    this.validationCache.set(hash, result);

    return result;
  }

  private generateHash(content: string): string {
    return createHash('sha256').update(content).digest('hex');
  }
}
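
One caveat with the cache above: the Map grows without bound in a long-running process. A simple mitigation, assuming first-in-first-out eviction is acceptable for your workload, is to cap the number of entries before inserting:

// Bounded cache insert (assumption: FIFO eviction is acceptable)
const MAX_CACHE_ENTRIES = 10000;

function setWithEviction<K, V>(cache: Map<K, V>, key: K, value: V): void {
  if (cache.size >= MAX_CACHE_ENTRIES) {
    // Maps iterate in insertion order, so the first key is the oldest entry
    const oldest = cache.keys().next().value;
    if (oldest !== undefined) {
      cache.delete(oldest);
    }
  }
  cache.set(key, value);
}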

Common Pitfalls and Solutions

Pitfall 1: Workbench Version Compatibility

// WRONG: Assuming any Workbench version works
const result = await runWorkbench(xml);

// RIGHT: Check Workbench version compatibility
const workbenchVersion = await this.getWorkbenchVersion();
if (!this.isCompatibleVersion(workbenchVersion)) {
  throw new Error(`Workbench version ${workbenchVersion} not supported`);
}
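
The snippet above assumes getWorkbenchVersion and isCompatibleVersion helpers. One possible implementation is sketched below; it assumes the jar prints its version when invoked with a --version flag (verify against your Workbench build) and that the supported range starts at major version 2, both of which are assumptions to adjust for your deployment.

// Hedged sketch: assumes the jar supports --version and that 2.x+ is compatible
private async getWorkbenchVersion(): Promise<string> {
  const { stdout } = await execAsync(
    `java -jar "${this.workbenchPath}" --version`,
    { timeout: 10000 }
  );
  return stdout.trim();
}

private isCompatibleVersion(version: string): boolean {
  const major = parseInt(version.split('.')[0], 10);
  return Number.isFinite(major) && major >= 2;
}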

Pitfall 2: Error Message Parsing

// WRONG: Assuming error format
const errors = result.errors; // May be string or array

// RIGHT: Normalize error format
const errors = Array.isArray(result.errors) ? result.errors : [result.errors];

Pitfall 3: Timeout Handling

// WRONG: No timeout protection
const result = await exec(workbenchCommand);

// RIGHT: Implement timeout
const result = await exec(workbenchCommand, { timeout: 30000 });

This comprehensive validation workflow ensures high-quality DDEX messages through multiple validation layers, automated workflows, and detailed error analysis.