Round-Trip Fidelity Guide
Comprehensive guide to preserving all data through parse → modify → build cycles with the DDEX Suite, ensuring perfect data integrity for complex workflows.
Problem Statement
Round-trip fidelity is critical for DDEX processing workflows where you need to:
- Parse existing DDEX XML while preserving all original data
- Make targeted modifications without losing unrelated information
- Generate new XML that maintains all non-modified elements exactly
- Preserve extension data that may not be understood by your application
- Maintain XML formatting and namespace declarations when possible
Without proper round-trip fidelity, modifications can inadvertently:
- Remove extension elements from third-party tools
- Lose XML comments and processing instructions
- Change namespace prefixes and formatting
- Drop unknown metadata fields
- Alter element ordering in ways that break partner integrations
Solution Approach
The DDEX Suite provides comprehensive round-trip fidelity through:
- Graph Model Preservation: Maintains the complete DDEX structure
- Extension Handling: Preserves unknown elements and attributes
- Raw XML Retention: Optionally stores original XML for critical sections
- Deterministic Building: Ensures consistent output formatting
- Validation-Safe Modifications: Guarantees schema compliance
Understanding Graph vs Flattened Models
Graph Model - Complete Fidelity
import { DDEXParser } from 'ddex-parser';
const parser = new DDEXParser();
const result = await parser.parse(xmlContent, {
preserveExtensions: true, // Keep unknown elements
includeComments: true, // Preserve XML comments
rawExtensions: true // Store raw XML for extensions
});
// Graph model preserves complete structure
console.log(result.graph.message.header.messageId);
console.log(result.graph.message.releaseList.releases[0].releaseReference);
// Extensions are preserved
console.log(result.graph.extensions); // Unknown elements
console.log(result.graph.rawXmlSections); // Raw XML preservation
Flattened Model - Developer Convenience
// Flattened model for easier manipulation
console.log(result.flat.releases[0].title);
console.log(result.flat.releases[0].artists);
// But still maintains fidelity links
console.log(result.flat.releases[0]._graphRef); // Link to graph model
console.log(result.flat.releases[0]._extensions); // Preserved extensions
Complete Round-Trip Workflow
Basic Round-Trip Example
import { DDEXParser, DDEXBuilder } from 'ddex-suite';
async function modifyReleaseTitleWithFidelity(
originalXml: string,
newTitle: string
): Promise<string> {
// Step 1: Parse with full fidelity preservation
const parser = new DDEXParser();
const parseResult = await parser.parse(originalXml, {
preserveExtensions: true,
includeComments: true,
rawExtensions: true,
validateReferences: true
});
// Step 2: Modify only the target field
parseResult.flat.releases[0].title = newTitle;
// Step 3: Build with fidelity preservation
const builder = new DDEXBuilder();
const buildRequest = parseResult.toBuildRequest();
const newXml = await builder.build(buildRequest, {
preserveExtensions: true,
maintainFormatting: true,
deterministicOutput: true
});
return newXml;
}
// Verify round-trip fidelity
async function verifyRoundTrip(originalXml: string) {
const parser = new DDEXParser();
const builder = new DDEXBuilder();
// Parse original
const original = await parser.parse(originalXml, { preserveExtensions: true });
// Build without modifications
const rebuilt = await builder.build(original.toBuildRequest());
// Parse rebuilt to compare
const rebuiltParsed = await parser.parse(rebuilt, { preserveExtensions: true });
// Deep comparison
const isIdentical = await compareStructures(original, rebuiltParsed);
console.log(`Round-trip fidelity: ${isIdentical ? 'PASS' : 'FAIL'}`);
return isIdentical;
}
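The compareStructures helper used above is not part of the DDEX Suite API. A minimal sketch, assuming the graph model serializes to JSON without circular references, is to canonicalize both graphs with sorted keys and compare the strings:
// Minimal structural comparison sketch (not a DDEX Suite API); assumes the graph
// model is JSON-serializable without circular references.
async function compareStructures(a: any, b: any): Promise<boolean> {
  const canonical = (value: any): string =>
    JSON.stringify(value, (_key, v) =>
      v && typeof v === 'object' && !Array.isArray(v)
        ? Object.keys(v).sort().reduce((acc: any, k) => ((acc[k] = v[k]), acc), {})
        : v
    );
  return canonical(a.graph) === canonical(b.graph);
}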
Advanced Extension Preservation
interface ExtensionPreservationOptions {
preserveUnknownElements: boolean;
preserveUnknownAttributes: boolean;
preserveNamespaceDeclarations: boolean;
preserveElementOrder: boolean;
preserveWhitespace: boolean;
}
async function parseWithFullExtensionSupport(xmlContent: string) {
const parser = new DDEXParser();
const result = await parser.parse(xmlContent, {
preserveExtensions: true,
rawExtensions: true,
extensionOptions: {
preserveUnknownElements: true,
preserveUnknownAttributes: true,
preserveNamespaceDeclarations: true,
preserveElementOrder: true,
preserveWhitespace: false // Usually safe to normalize
}
});
// Access preserved extensions
console.log('Unknown elements:', result.extensions.unknownElements);
console.log('Unknown attributes:', result.extensions.unknownAttributes);
console.log('Raw XML sections:', result.rawXmlSections);
return result;
}
// Custom extension handler
class CustomExtensionHandler {
async processExtensions(extensions: any[]): Promise<any[]> {
return extensions.map(ext => {
// Add custom processing while preserving original
return {
...ext,
_processed: true,
_originalXml: ext._rawXml
};
});
}
async restoreExtensions(processedExtensions: any[]): Promise<string[]> {
return processedExtensions.map(ext => ext._originalXml);
}
}
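Wiring the handler into a parse result could look like the sketch below; result.graph.extensions is the collection shown in the Graph Model example, and the per-extension _rawXml field is an assumption of this handler rather than a documented property:
// Hypothetical wiring: post-process preserved extensions, keep the raw XML for rebuilding.
const handler = new CustomExtensionHandler();
const processed = await handler.processExtensions(result.graph.extensions ?? []);
// ...application-specific work on `processed`...
const rawXmlForRebuild = await handler.restoreExtensions(processed);
console.log(`Restored ${rawXmlForRebuild.length} raw extension fragments`);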
Python Round-Trip Workflows
DataFrame Integration with Fidelity
from ddex_parser import DDEXParser
from ddex_builder import DDEXBuilder
import pandas as pd
from typing import Dict, Any
async def modify_catalog_with_fidelity(
xml_content: str,
modifications: Dict[str, Any]
) -> str:
"""Modify catalog data while preserving all other information"""
# Parse with full preservation
parser = DDEXParser()
parse_result = await parser.parse(
xml_content,
preserve_extensions=True,
include_comments=True,
raw_extensions=True
)
# Convert to DataFrame for bulk operations
df = parser.to_dataframe(xml_content)
# Apply modifications efficiently
for release_id, changes in modifications.items():
mask = df['release_id'] == release_id
for field, value in changes.items():
df.loc[mask, field] = value
# Rebuild preserving extensions
builder = DDEXBuilder()
# Convert back with fidelity preservation
build_request = await builder.from_dataframe(
df,
original_parse_result=parse_result, # Preserves extensions
preserve_extensions=True
)
return await builder.build(build_request)
async def compare_dataframes_for_fidelity(
original_xml: str,
modified_xml: str
) -> pd.DataFrame:
"""Compare DataFrames to verify what changed"""
parser = DDEXParser()
original_df = parser.to_dataframe(original_xml)
modified_df = parser.to_dataframe(modified_xml)
# Identify differences
comparison = original_df.compare(modified_df, align_axis=1)
return comparison
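A hypothetical invocation of the two functions above; the file path, release ID, and column name are placeholders:
import asyncio

# Illustrative usage; 'catalog.xml', the release ID, and the 'title' column are placeholders.
async def main():
    with open('catalog.xml', 'r', encoding='utf-8') as f:
        xml_content = f.read()

    updated_xml = await modify_catalog_with_fidelity(
        xml_content,
        modifications={'R123456': {'title': 'Remastered Edition'}},
    )

    # Confirm that only the intended fields changed
    changes = await compare_dataframes_for_fidelity(xml_content, updated_xml)
    print(changes)

asyncio.run(main())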
Extension-Aware Data Processing
import json
from dataclasses import dataclass
from typing import Dict, List, Optional
from ddex_parser import DDEXParser
from ddex_builder import DDEXBuilder
@dataclass
class ExtensionData:
namespace: str
element_name: str
attributes: Dict[str, str]
content: Optional[str]
raw_xml: str
class FidelityPreservingProcessor:
def __init__(self):
self.preserved_extensions: List[ExtensionData] = []
self.namespace_mappings: Dict[str, str] = {}
async def process_with_extensions(
self,
xml_content: str,
processor_func: callable
) -> str:
"""Process DDEX while preserving all extensions"""
parser = DDEXParser()
result = await parser.parse(
xml_content,
preserve_extensions=True,
raw_extensions=True
)
# Store extensions
self.preserved_extensions = self._extract_extensions(result)
self.namespace_mappings = result.namespace_mappings
# Process the structured data
processed_data = await processor_func(result.flat)
# Rebuild with extensions
builder = DDEXBuilder()
build_request = self._create_build_request_with_extensions(
processed_data,
self.preserved_extensions
)
return await builder.build(build_request)
def _extract_extensions(self, parse_result) -> List[ExtensionData]:
extensions = []
for ext in parse_result.extensions.unknown_elements:
extensions.append(ExtensionData(
namespace=ext.namespace,
element_name=ext.local_name,
attributes=ext.attributes,
content=ext.text_content,
raw_xml=ext.raw_xml
))
return extensions
def _create_build_request_with_extensions(
self,
processed_data,
extensions: List[ExtensionData]
):
# Create build request that includes extensions
build_request = {
'message': processed_data,
'extensions': [
{
'namespace': ext.namespace,
'element_name': ext.element_name,
'attributes': ext.attributes,
'content': ext.content,
'raw_xml': ext.raw_xml
}
for ext in extensions
],
'namespace_mappings': self.namespace_mappings
}
return build_request
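A sketch of how the processor might be driven; the processing function is application-specific, and the attribute access on the flattened model (releases, title) mirrors the TypeScript examples rather than a documented Python shape:
# Illustrative processor; assumes the flattened model exposes releases with a title attribute.
async def normalize_titles(flat_data):
    for release in flat_data.releases:
        release.title = release.title.strip()
    return flat_data

async def run_with_extensions(xml_content: str) -> str:
    processor = FidelityPreservingProcessor()
    return await processor.process_with_extensions(xml_content, normalize_titles)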
Schema Evolution and Versioning
Handling Version Differences
class VersionAwareFidelityHandler {
async migrateWithFidelity(
xmlContent: string,
fromVersion: string,
toVersion: string
): Promise<string> {
const parser = new DDEXParser();
const builder = new DDEXBuilder();
// Parse with version-specific handling
const parseResult = await parser.parse(xmlContent, {
version: fromVersion,
preserveExtensions: true,
versionMigration: {
targetVersion: toVersion,
preserveIncompatibleFields: true,
addVersionExtensions: true
}
});
// Version-specific transformations
const migrated = await this.applyVersionTransformations(
parseResult,
fromVersion,
toVersion
);
// Build with target version
return await builder.build(migrated.toBuildRequest(), {
version: toVersion,
preserveExtensions: true
});
}
private async applyVersionTransformations(
parseResult: any,
fromVersion: string,
toVersion: string
): Promise<any> {
const transformations = this.getVersionTransformations(fromVersion, toVersion);
for (const transformation of transformations) {
parseResult = await transformation.apply(parseResult);
}
return parseResult;
}
private getVersionTransformations(from: string, to: string) {
const transformationMap = {
'3.8.2->4.2': [
new ResourceTypeTransformation(),
new MetadataFieldTransformation(),
new IdentifierFormatTransformation()
],
'4.2->4.3': [
new StreamingMetadataTransformation(),
new TerritoryCodeTransformation()
]
};
return transformationMap[`${from}->${to}`] || [];
}
}
class ResourceTypeTransformation {
async apply(parseResult: any): Promise<any> {
// Transform resource types while preserving extensions
for (const resource of parseResult.flat.resources) {
if (resource.type === 'SoundRecording') {
// Preserve original in extension
resource._extensions = resource._extensions || {};
resource._extensions.originalType = resource.type;
// Apply transformation
resource.type = 'AudioResource';
}
}
return parseResult;
}
}
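Migration could then be invoked as below; the remaining transformation classes referenced in the map (MetadataFieldTransformation, IdentifierFormatTransformation, StreamingMetadataTransformation, TerritoryCodeTransformation) are assumed to expose the same apply() method as ResourceTypeTransformation:
// Illustrative call; originalXml is a placeholder for your ERN 4.2 document.
const handler = new VersionAwareFidelityHandler();
const migratedXml = await handler.migrateWithFidelity(originalXml, '4.2', '4.3');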
Testing Round-Trip Fidelity
Comprehensive Fidelity Test Suite
interface FidelityTestCase {
name: string;
inputXml: string;
modification?: (data: any) => void;
expectedChanges?: string[];
preservedElements?: string[];
}
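The TestResult type returned by the suite is not defined above; a minimal shape consistent with how runSingleTest populates it might be:
// Assumed result shape, inferred from the fields assigned in runSingleTest below.
interface TestResult {
  testName: string;
  passed: boolean;
  differences: string[];
  preservedExtensions: boolean;
  metrics: Record<string, number>;
  error?: string;
}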
class FidelityTestSuite {
async runFidelityTests(testCases: FidelityTestCase[]): Promise<TestResult[]> {
const results: TestResult[] = [];
for (const testCase of testCases) {
const result = await this.runSingleTest(testCase);
results.push(result);
}
return results;
}
private async runSingleTest(testCase: FidelityTestCase): Promise<TestResult> {
const parser = new DDEXParser();
const builder = new DDEXBuilder();
try {
// Parse original
const original = await parser.parse(testCase.inputXml, {
preserveExtensions: true,
includeComments: true
});
// Apply modification if specified
if (testCase.modification) {
testCase.modification(original);
}
// Build new XML
const rebuiltXml = await builder.build(original.toBuildRequest());
// Parse rebuilt for comparison
const rebuilt = await parser.parse(rebuiltXml, {
preserveExtensions: true,
includeComments: true
});
// Compare structures
const comparison = await this.compareStructures(original, rebuilt);
return {
testName: testCase.name,
passed: comparison.identical,
differences: comparison.differences,
preservedExtensions: comparison.preservedExtensions,
metrics: comparison.metrics
};
} catch (error) {
return {
testName: testCase.name,
passed: false,
error: error.message,
differences: [],
preservedExtensions: false,
metrics: {}
};
}
}
private async compareStructures(original: any, rebuilt: any) {
const differences: string[] = [];
let preservedExtensions = true;
// Compare graph structures
const graphDiff = this.deepCompare(original.graph, rebuilt.graph);
differences.push(...graphDiff);
// Compare extensions
if (original.extensions.length !== rebuilt.extensions.length) {
differences.push(`Extension count mismatch: ${original.extensions.length} vs ${rebuilt.extensions.length}`);
preservedExtensions = false;
}
// Compare flattened data
const flatDiff = this.deepCompare(original.flat, rebuilt.flat);
differences.push(...flatDiff);
return {
identical: differences.length === 0,
differences,
preservedExtensions,
metrics: {
totalElements: this.countElements(original.graph),
extensionCount: original.extensions.length,
namespaceCount: Object.keys(original.namespaces || {}).length
}
};
}
private deepCompare(obj1: any, obj2: any, path = ''): string[] {
const differences: string[] = [];
if (typeof obj1 !== typeof obj2) {
differences.push(`Type mismatch at ${path}: ${typeof obj1} vs ${typeof obj2}`);
return differences;
}
if (obj1 === null || obj2 === null) {
if (obj1 !== obj2) {
differences.push(`Null mismatch at ${path}: ${obj1} vs ${obj2}`);
}
return differences;
}
if (typeof obj1 === 'object') {
const keys1 = Object.keys(obj1);
const keys2 = Object.keys(obj2);
const allKeys = new Set([...keys1, ...keys2]);
for (const key of allKeys) {
const newPath = path ? `${path}.${key}` : key;
if (!(key in obj1)) {
differences.push(`Missing key in original: ${newPath}`);
} else if (!(key in obj2)) {
differences.push(`Missing key in rebuilt: ${newPath}`);
} else {
differences.push(...this.deepCompare(obj1[key], obj2[key], newPath));
}
}
} else if (obj1 !== obj2) {
differences.push(`Value mismatch at ${path}: ${obj1} vs ${obj2}`);
}
return differences;
}
}
// Example test cases
const fidelityTests: FidelityTestCase[] = [
{
name: 'No modification round-trip',
inputXml: originalXml,
// No modification - should be identical
},
{
name: 'Title modification preserves extensions',
inputXml: xmlWithExtensions,
modification: (data) => {
data.flat.releases[0].title = 'New Title';
},
expectedChanges: ['releases[0].title'],
preservedElements: ['extensions', 'comments', 'namespaces']
},
{
name: 'Add resource preserves structure',
inputXml: originalXml,
modification: (data) => {
data.flat.resources.push({
id: 'A123456789',
type: 'SoundRecording',
duration: 'PT3M45S'
});
},
expectedChanges: ['resources.length'],
preservedElements: ['message.header', 'releaseList.structure']
}
];
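Running the suite is then a matter of feeding it the test cases; countElements used inside compareStructures is assumed to be a small private helper that walks the graph and counts nodes:
// Illustrative run; originalXml and xmlWithExtensions come from your test fixtures.
const suite = new FidelityTestSuite();
const results = await suite.runFidelityTests(fidelityTests);
for (const result of results) {
  console.log(`${result.passed ? 'PASS' : 'FAIL'} ${result.testName}`);
  result.differences.forEach(diff => console.log(`  - ${diff}`));
}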
Automated Fidelity Validation
import asyncio
import copy
from typing import List, Dict, Any
from ddex_parser import DDEXParser
from ddex_builder import DDEXBuilder
class AutomatedFidelityValidator:
def __init__(self):
self.test_results = []
async def validate_bulk_processing(
self,
xml_files: List[str],
modification_func: callable
) -> Dict[str, Any]:
"""Validate fidelity across multiple files"""
results = {
'total_files': len(xml_files),
'passed': 0,
'failed': 0,
'failures': []
}
for file_path in xml_files:
try:
with open(file_path, 'r') as f:
xml_content = f.read()
# Test round-trip fidelity
passed = await self._test_file_fidelity(
xml_content,
modification_func
)
if passed:
results['passed'] += 1
else:
results['failed'] += 1
results['failures'].append(file_path)
except Exception as e:
results['failed'] += 1
results['failures'].append(f"{file_path}: {str(e)}")
return results
async def _test_file_fidelity(
self,
xml_content: str,
modification_func: callable
) -> bool:
"""Test fidelity for a single file"""
parser = DDEXParser()
builder = DDEXBuilder()
# Parse original
original = await parser.parse(
xml_content,
preserve_extensions=True
)
# Create unmodified copy for comparison
unmodified_copy = copy.deepcopy(original)
# Apply modifications
if modification_func:
modification_func(original)
# Build new XML
rebuilt_xml = await builder.build(original.to_build_request())
# Parse rebuilt
rebuilt = await parser.parse(
rebuilt_xml,
preserve_extensions=True
)
# Compare preserved elements
return self._compare_preserved_elements(
unmodified_copy,
rebuilt,
modification_func
)
def _compare_preserved_elements(
self,
original: Any,
rebuilt: Any,
modification_func: callable
) -> bool:
"""Compare elements that should be preserved"""
# Elements that should always be preserved
preserved_paths = [
'graph.message.header.messageId',
'graph.message.header.sender',
'extensions',
'namespace_mappings'
]
for path in preserved_paths:
original_value = self._get_nested_value(original, path)
rebuilt_value = self._get_nested_value(rebuilt, path)
if original_value != rebuilt_value:
print(f"Fidelity violation at {path}")
return False
return True
def _get_nested_value(self, obj: Any, path: str) -> Any:
"""Get value from nested object path"""
keys = path.split('.')
current = obj
for key in keys:
if hasattr(current, key):
current = getattr(current, key)
elif isinstance(current, dict) and key in current:
current = current[key]
else:
return None
return current
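A hypothetical bulk run over a directory of DDEX files; the glob pattern is a placeholder, and the modification function assumes the flattened model exposes releases with a title attribute:
import glob

# Placeholder modification; attribute access on the flattened model is assumed.
def bump_titles(parse_result):
    for release in parse_result.flat.releases:
        release.title = f"{release.title} (2024 Remaster)"

async def main():
    validator = AutomatedFidelityValidator()
    report = await validator.validate_bulk_processing(glob.glob('catalog/*.xml'), bump_titles)
    print(f"{report['passed']}/{report['total_files']} files preserved fidelity")
    for failure in report['failures']:
        print(f"  failed: {failure}")

asyncio.run(main())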
Common Pitfalls and Solutions
1. Extension Loss During Modification
Pitfall: Modifying flattened data without preserving graph extensions
// DON'T - Extensions lost
result.flat.releases[0] = { ...newReleaseData }; // Overwrites extensions
// DO - Preserve extensions
result.flat.releases[0] = {
...result.flat.releases[0], // Preserve existing data including _extensions
...newReleaseData, // Apply modifications
_extensions: result.flat.releases[0]._extensions // Explicitly preserve
};
2. Namespace Declaration Loss
Pitfall: Not preserving namespace prefixes and declarations
# DON'T - Namespace context lost
build_request = {
'message': modified_data
# Missing namespace_mappings
}
# DO - Preserve namespace context
build_request = {
'message': modified_data,
'namespace_mappings': original_parse_result.namespace_mappings,
'preserve_prefixes': True
}
3. Element Order Changes
Pitfall: Rebuilding changes element order unintentionally
// Configure builder for deterministic order
const xml = await builder.build(buildRequest, {
preserveElementOrder: true,
deterministicOutput: true,
sortingStrategy: 'preserve-original'
});
Performance Considerations
- Extension Storage: Raw XML storage increases memory usage by ~20-30%
- Parsing Overhead: Full fidelity parsing is ~15% slower than basic parsing (see the timing sketch after this list)
- Build Complexity: Preserving extensions adds ~10% to build time
- Memory Management: Use streaming for large files with extensions
- Comparison Overhead: Deep structure comparison is expensive for large documents
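These figures are rough estimates and vary by document. A quick way to measure the parsing overhead in your own environment, using only the parse options shown earlier, is to time a basic parse against a full-fidelity parse:
// Rough benchmark sketch; results depend on document size and runtime.
async function measureFidelityOverhead(xmlContent: string) {
  const parser = new DDEXParser();

  const t0 = performance.now();
  await parser.parse(xmlContent);
  const basicMs = performance.now() - t0;

  const t1 = performance.now();
  await parser.parse(xmlContent, { preserveExtensions: true, rawExtensions: true });
  const fullMs = performance.now() - t1;

  console.log(`Basic parse: ${basicMs.toFixed(1)} ms, full fidelity: ${fullMs.toFixed(1)} ms`);
  console.log(`Overhead: ${(((fullMs - basicMs) / basicMs) * 100).toFixed(1)}%`);
}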
Links to API Documentation
- Parser TypeScript API
- Builder TypeScript API
- Data Models Overview
- Python Parser API
- Builder Types Reference
Following these patterns helps maintain round-trip fidelity across DDEX processing workflows, preserving data integrity through complex modification cycles.