Legacy Data Migration Guide

This document explains how to migrate data from the legacy MySQL/MariaDB physics stores database to the new PostgreSQL schema.

Overview

The migration script (migrate_legacy_data.py) supports two data sources:

  1. Direct MySQL/MariaDB connection (recommended for live databases)

  2. SQL backup file parsing (for existing dump files)

The script transforms and imports data from the legacy system into the new schema structure. Here’s the mapping:

Table Mappings

Legacy Table

New Table(s)

Description

users

users

User accounts with role mapping and UUID generation

stock

items + categories

Inventory items with automatic category creation

supplier

N/A

Supplier information integrated into item descriptions

charge

sales + sale_items

Completed transactions

issues

quotes + quote_items

Pending requests treated as draft quotes

orders

N/A

Historical data (can be added as sales if needed)

periods

N/A

Reference data (not migrated)

vatparams

N/A

VAT configuration integrated into items

Key Transformations

  1. User IDs: Legacy integer IDs → UUID strings

  2. Categories: Auto-generated from stock item patterns

  3. SKUs: Legacy stock IDs → LEGACY-{id} format

  4. Prices: Decimal precision handling

  5. VAT: 20% default rate applied consistently

  6. Dates: Legacy date formats → PostgreSQL timestamps

Prerequisites

  1. Python 3.7+ with pip

  2. PostgreSQL database with the new schema already created

  3. Database connection details (host, port, database, username, password)

  4. MySQL/MariaDB access (if using direct database connection)

Installation

  1. Install Python dependencies:

pip install -r migration-requirements.txt
  1. Ensure your PostgreSQL database is running and accessible

  2. Verify the database schema is up to date by running migrations in your application

Usage

Method 2: SQL File Parsing

Parse an existing SQL backup file:

python migrate_legacy_data.py \
  --sql-file "dev-samples/legacy/physics_stores_backup-Friday-18h.sql" \
  --pg-connection "postgresql://username:password@localhost:5432/database_name"

Dry Run (Parse Only)

To test the data loading without making any database changes:

From MySQL database:

python migrate_legacy_data.py \
  --mysql-db \
  --mysql-host py-it.lancs.ac.uk \
  --mysql-user your_username \
  --mysql-password your_password \
  --mysql-database physics_stores \
  --pg-connection "postgresql://username:password@localhost:5432/database_name" \
  --dry-run

From SQL file:

python migrate_legacy_data.py \
  --sql-file "dev-samples/legacy/physics_stores_backup-Friday-18h.sql" \
  --pg-connection "postgresql://username:password@localhost:5432/database_name" \
  --dry-run

PowerShell Helper Script

Use the provided PowerShell script for easier execution:

Direct MySQL connection:

.\run-migration.ps1 -UseMySQL -MySQLUser "your_username" -MySQLPassword "your_password"

SQL file parsing:

.\run-migration.ps1 -SqlFile "dev-samples\legacy\physics_stores_backup-Friday-18h.sql"

Dry run:

.\run-migration.ps1 -UseMySQL -MySQLUser "your_username" -DryRun

Connection String Examples

Local PostgreSQL:

postgresql://postgres:password@localhost:5432/lustores

Remote PostgreSQL:

postgresql://user:pass@host.example.com:5432/dbname

With SSL:

postgresql://user:pass@host.example.com:5432/dbname?sslmode=require

Data Source Options

SQL File Parsing

Advantages:

  • Works offline once you have the SQL dump

  • No need for ongoing database access

  • Can work with historical backups

Disadvantages:

  • Data may be outdated depending on when the dump was created

  • Requires manual SQL parsing which may miss edge cases

  • Larger files require more memory

Migration Process

The script performs these steps in order:

  1. Load Legacy Data

    • MySQL mode: Connects to database and reads tables directly

    • File mode: Parses CREATE TABLE structures and INSERT statements

    • Builds in-memory data structures

  2. Migrate Users

    • Generates new UUIDs

    • Maps legacy access levels to roles

    • Forces password changes for security

  3. Process Reference Data

    • Extracts supplier information

    • Processes VAT parameters

  4. Create Categories and Items

    • Auto-generates categories based on item descriptions

    • Creates inventory items with supplier info in descriptions

    • Maps legacy stock IDs to new item IDs

  5. Migrate Sales Data

    • Groups charge records by charge_id

    • Creates sales with calculated totals

    • Creates individual sale items

  6. Migrate Quotes

    • Converts issues to draft quotes

    • Creates quote items where possible

  7. Generate Summary Report

    • Shows record counts for all tables

    • Displays mapping statistics

Data Validation

After migration, verify:

  1. User count matches between legacy and new system

  2. Item stock levels are correctly transferred

  3. Sales totals calculate correctly with VAT

  4. Foreign key relationships are properly established

Validation Queries

-- Check user migration
SELECT COUNT(*) FROM users;

-- Check items with categories
SELECT c.name, COUNT(i.id) as item_count 
FROM categories c 
LEFT JOIN items i ON c.id = i.category_id 
GROUP BY c.name;

-- Check sales totals
SELECT COUNT(*) as sales_count, 
       SUM(total_amount) as total_sales 
FROM sales;

-- Check quotes
SELECT COUNT(*) as quote_count, 
       SUM(total_amount) as total_quotes 
FROM quotes;

Common Issues and Solutions

1. Connection Errors

  • Verify PostgreSQL is running

  • Check connection string format

  • Ensure database exists and user has permissions

2. Data Parsing Errors

  • Check for special characters in legacy data

  • Verify SQL file encoding (should be UTF-8)

  • Review any warnings about unparseable dates

3. Foreign Key Violations

  • Ensure users are migrated first

  • Check that referenced items exist before creating sales/quotes

4. Duplicate Data

  • The script handles conflicts for users (by email) and categories (by name)

  • For items, it uses unique SKUs to prevent duplicates

Rollback Procedure

If you need to rollback the migration:

-- WARNING: This will delete all migrated data
DELETE FROM sale_items;
DELETE FROM sales;
DELETE FROM quote_items; 
DELETE FROM quotes;
DELETE FROM stock_movements;
DELETE FROM items;
DELETE FROM categories;
DELETE FROM users WHERE id LIKE '%-%-%-%-%'; -- UUIDs only

Custom Modifications

The script can be modified for specific requirements:

  1. Category Logic: Edit _extract_category() method

  2. User Role Mapping: Modify _map_user_role() method

  3. VAT Rates: Adjust VAT calculation logic

  4. Date Parsing: Update _parse_legacy_date() for different formats

Support

For issues or questions:

  1. Check the console output for specific error messages

  2. Review the PostgreSQL logs for database-related errors

  3. Use the dry-run mode to test parsing without database changes