Legacy Data Migration Guide
This document explains how to migrate data from the legacy MySQL/MariaDB physics stores database to the new PostgreSQL schema.
Overview
The migration script (migrate_legacy_data.py) supports two data sources:
Direct MySQL/MariaDB connection (recommended for live databases)
SQL backup file parsing (for existing dump files)
The script transforms and imports data from the legacy system into the new schema structure. Here’s the mapping:
Table Mappings
Legacy Table |
New Table(s) |
Description |
|---|---|---|
|
|
User accounts with role mapping and UUID generation |
|
|
Inventory items with automatic category creation |
|
N/A |
Supplier information integrated into item descriptions |
|
|
Completed transactions |
|
|
Pending requests treated as draft quotes |
|
N/A |
Historical data (can be added as sales if needed) |
|
N/A |
Reference data (not migrated) |
|
N/A |
VAT configuration integrated into items |
Key Transformations
User IDs: Legacy integer IDs → UUID strings
Categories: Auto-generated from stock item patterns
SKUs: Legacy stock IDs →
LEGACY-{id}formatPrices: Decimal precision handling
VAT: 20% default rate applied consistently
Dates: Legacy date formats → PostgreSQL timestamps
Prerequisites
Python 3.7+ with pip
PostgreSQL database with the new schema already created
Database connection details (host, port, database, username, password)
MySQL/MariaDB access (if using direct database connection)
Installation
Install Python dependencies:
pip install -r migration-requirements.txt
Ensure your PostgreSQL database is running and accessible
Verify the database schema is up to date by running migrations in your application
Usage
Method 1: Direct MySQL Database Connection (Recommended)
Connect directly to the legacy MySQL/MariaDB database:
python migrate_legacy_data.py \
--mysql-db \
--mysql-host py-it.lancs.ac.uk \
--mysql-user your_username \
--mysql-password your_password \
--mysql-database physics_stores \
--pg-connection "postgresql://username:password@localhost:5432/database_name"
Method 2: SQL File Parsing
Parse an existing SQL backup file:
python migrate_legacy_data.py \
--sql-file "dev-samples/legacy/physics_stores_backup-Friday-18h.sql" \
--pg-connection "postgresql://username:password@localhost:5432/database_name"
Dry Run (Parse Only)
To test the data loading without making any database changes:
From MySQL database:
python migrate_legacy_data.py \
--mysql-db \
--mysql-host py-it.lancs.ac.uk \
--mysql-user your_username \
--mysql-password your_password \
--mysql-database physics_stores \
--pg-connection "postgresql://username:password@localhost:5432/database_name" \
--dry-run
From SQL file:
python migrate_legacy_data.py \
--sql-file "dev-samples/legacy/physics_stores_backup-Friday-18h.sql" \
--pg-connection "postgresql://username:password@localhost:5432/database_name" \
--dry-run
PowerShell Helper Script
Use the provided PowerShell script for easier execution:
Direct MySQL connection:
.\run-migration.ps1 -UseMySQL -MySQLUser "your_username" -MySQLPassword "your_password"
SQL file parsing:
.\run-migration.ps1 -SqlFile "dev-samples\legacy\physics_stores_backup-Friday-18h.sql"
Dry run:
.\run-migration.ps1 -UseMySQL -MySQLUser "your_username" -DryRun
Connection String Examples
Local PostgreSQL:
postgresql://postgres:password@localhost:5432/lustores
Remote PostgreSQL:
postgresql://user:pass@host.example.com:5432/dbname
With SSL:
postgresql://user:pass@host.example.com:5432/dbname?sslmode=require
Data Source Options
Direct MySQL Connection (Recommended)
Advantages:
Always gets the latest data from the live database
More reliable data type handling
Automatic handling of MySQL-specific date formats
Better error reporting for connection issues
No need to generate and transfer large SQL dump files
Requirements:
Network access to the MySQL server (py-it.lancs.ac.uk)
Valid MySQL credentials
PyMySQL Python package (included in requirements)
SQL File Parsing
Advantages:
Works offline once you have the SQL dump
No need for ongoing database access
Can work with historical backups
Disadvantages:
Data may be outdated depending on when the dump was created
Requires manual SQL parsing which may miss edge cases
Larger files require more memory
Migration Process
The script performs these steps in order:
Load Legacy Data
MySQL mode: Connects to database and reads tables directly
File mode: Parses CREATE TABLE structures and INSERT statements
Builds in-memory data structures
Migrate Users
Generates new UUIDs
Maps legacy access levels to roles
Forces password changes for security
Process Reference Data
Extracts supplier information
Processes VAT parameters
Create Categories and Items
Auto-generates categories based on item descriptions
Creates inventory items with supplier info in descriptions
Maps legacy stock IDs to new item IDs
Migrate Sales Data
Groups charge records by charge_id
Creates sales with calculated totals
Creates individual sale items
Migrate Quotes
Converts issues to draft quotes
Creates quote items where possible
Generate Summary Report
Shows record counts for all tables
Displays mapping statistics
Data Validation
After migration, verify:
User count matches between legacy and new system
Item stock levels are correctly transferred
Sales totals calculate correctly with VAT
Foreign key relationships are properly established
Validation Queries
-- Check user migration
SELECT COUNT(*) FROM users;
-- Check items with categories
SELECT c.name, COUNT(i.id) as item_count
FROM categories c
LEFT JOIN items i ON c.id = i.category_id
GROUP BY c.name;
-- Check sales totals
SELECT COUNT(*) as sales_count,
SUM(total_amount) as total_sales
FROM sales;
-- Check quotes
SELECT COUNT(*) as quote_count,
SUM(total_amount) as total_quotes
FROM quotes;
Common Issues and Solutions
1. Connection Errors
Verify PostgreSQL is running
Check connection string format
Ensure database exists and user has permissions
2. Data Parsing Errors
Check for special characters in legacy data
Verify SQL file encoding (should be UTF-8)
Review any warnings about unparseable dates
3. Foreign Key Violations
Ensure users are migrated first
Check that referenced items exist before creating sales/quotes
4. Duplicate Data
The script handles conflicts for users (by email) and categories (by name)
For items, it uses unique SKUs to prevent duplicates
Rollback Procedure
If you need to rollback the migration:
-- WARNING: This will delete all migrated data
DELETE FROM sale_items;
DELETE FROM sales;
DELETE FROM quote_items;
DELETE FROM quotes;
DELETE FROM stock_movements;
DELETE FROM items;
DELETE FROM categories;
DELETE FROM users WHERE id LIKE '%-%-%-%-%'; -- UUIDs only
Custom Modifications
The script can be modified for specific requirements:
Category Logic: Edit
_extract_category()methodUser Role Mapping: Modify
_map_user_role()methodVAT Rates: Adjust VAT calculation logic
Date Parsing: Update
_parse_legacy_date()for different formats
Support
For issues or questions:
Check the console output for specific error messages
Review the PostgreSQL logs for database-related errors
Use the dry-run mode to test parsing without database changes