Data Migration Script Guide
Note
This guide documents the Python-based data_migration_script.py for migrating
from the legacy PHP Physics Stores system (MySQL) to the modern LUStores system (PostgreSQL).
Overview
The data_migration_script.py is a specialized migration tool designed to handle the specific migration from:
Source: MySQL-based
physicsstoresdatabase (Legacy PHP system)Target: PostgreSQL-based
university_inventorydatabase (Modern LUStores)
Script Location
/data/LUStores/scripts/data_migration_script.py
Features
- ✅ Automated Schema Migration
Automatically applies any pending SQL migrations from
/migrations/directoryTracks applied migrations to avoid duplicates
- ✅ Data Preservation
Preserves existing admin users during migration
Creates fallback “Legacy System” user for data integrity
Maintains referential integrity across all tables
- ✅ Comprehensive Logging
Detailed migration logs saved to
data_migration.logReal-time console output with progress indicators
Error tracking and statistics
- ✅ Data Transformation
Maps legacy MySQL schema to modern PostgreSQL schema
Handles password hashing with bcrypt
Converts data types and formats automatically
Prerequisites
System Requirements
On the migration machine (can be local or VM):
Python 3.8+
Required Python packages:
pip install psycopg2-binary pymysql bcrypt
Access Requirements:
SSH/network access to legacy MySQL database
SSH/network access to target PostgreSQL database
Read permissions on legacy database
Write permissions on target database
Schema Export File
The script requires a schema mapping file:
/data/LUStores/migrations/database_schemas_export_20250911_152208.json
This JSON file contains the schema structure of both source and target databases.
Migration Process
Data Flow Diagram
graph LR subgraph "Legacy System" MYSQL[(MySQL<br/>physicsstores)] TABLES["Tables:<br/>• users<br/>• stock<br/>• supplier<br/>• orders<br/>• issues<br/>• charge"] end subgraph "Migration Script" SCRIPT[data_migration_script.py] TRANSFORM[Data Transformation<br/>& Mapping] VALIDATE[Validation &<br/>Error Handling] end subgraph "Modern System" POSTGRES[(PostgreSQL<br/>university_inventory)] NEWTABLES["Tables:<br/>• users<br/>• items<br/>• suppliers<br/>• orders + order_items<br/>• sales + sale_items<br/>• chargecodes"] end MYSQL --> TABLES TABLES --> SCRIPT SCRIPT --> TRANSFORM TRANSFORM --> VALIDATE VALIDATE --> NEWTABLES NEWTABLES --> POSTGRES style MYSQL fill:#ff6b6b style POSTGRES fill:#6bcf7f style SCRIPT fill:#4d96ffTable Mappings
The script performs the following table migrations:
Legacy Table |
Target Table(s) |
Notes |
|---|---|---|
|
|
Role mapping: LEVEL ≥10→admin, ≥5→manager, else→user |
|
|
All items assigned to “General” category |
|
|
Direct mapping with address concatenation |
|
|
COSTCENTRE becomes primary code |
|
|
Denormalized→normalized structure |
|
|
Stock issues converted to sales records |
Field Transformations
Users Table
Legacy (MySQL) Target (PostgreSQL)
────────────── ───────────────────
USERNAME → email (with @physics.lancs.ac.uk if missing @)
USERPASSWORD → password_hash (bcrypt hashed)
USERNAME → first_name
'User' → last_name
LEVEL (10+/5+/other) → role (admin/manager/user)
(always) → is_active = true
(always) → must_change_password = true
Items Table
Legacy (MySQL) Target (PostgreSQL)
────────────── ───────────────────
DESC1 → name
CODE → sku
DESC1 + DESC2 → description
(all items) → category_id = "General" category
PRICE → price
0.20 → vat_rate (20% VAT)
false → vat_included
BALANCE → current_stock
MIN → minimum_stock
HIDDEN != 'Y' → is_active
Suppliers Table
Legacy (MySQL) Target (PostgreSQL)
────────────── ───────────────────
CODE → id
NAME → name
(none) → contact (null)
(none) → email (null)
TELEPHONE → phone
ADDRESS1-4 (concatenated) → address
Orders & Order Items
Legacy (MySQL) Target (PostgreSQL)
────────────── ───────────────────
Orders (denormalized):
ORDER_NO → orders.order_id
SUPPLIER → orders.supplier_id
DATE → orders.status (if set: completed, else: pending)
SUM(VALUE) → orders.total_amount
Order Items (created from order rows):
STOCK_CODE → order_items.item_sku
PRICE → order_items.unit_cost
QTY → order_items.quantity
VALUE → order_items.total_cost
RECD_DATE != null → order_items.received = true
Issues → Sales Migration
Legacy (MySQL) Target (PostgreSQL)
────────────── ───────────────────
Sales (created from issues):
STOCK_CODE + DATE + counter → sales.sale_id (LEGACY-CODE-DATE-NNNNNN)
'GENERAL' → sales.charge_code
(calculated) → sales.subtotal_amount
(calculated at 20% VAT) → sales.vat_amount
(total inc VAT) → sales.total_amount
'completed' → sales.status
Customer Info (JSON):
ISSUEDTO → customer_info.name
REASON → customer_info.reason
COSTCENTRE, ACTIVITY, etc. → customer_info.legacy_data
Usage Guide
Basic Usage
Command-line syntax:
python scripts/data_migration_script.py \
--schema-file migrations/database_schemas_export_20250911_152208.json \
--source-host legacy-db.university.edu \
--source-port 3306 \
--source-user readonly_user \
--source-password 'legacy_password' \
--source-database physicsstores \
--target-host localhost \
--target-port 5432 \
--target-user postgres \
--target-password 'postgres_password' \
--target-database university_inventory
Step-by-Step Migration
Step 1: Prepare Backup
Before running migration, backup the target database:
docker compose -f docker-compose.prod.yml exec -T db \
pg_dump -U postgres university_inventory \
| gzip > "backups/pre_migration_$(date +%Y%m%d_%H%M%S).sql.gz"
Step 2: Verify Access
Test connections to both databases:
# Test MySQL connection
mysql -h legacy-db.university.edu -u readonly_user -p physicsstores \
-e "SELECT COUNT(*) FROM stock;"
# Test PostgreSQL connection
docker compose -f docker-compose.prod.yml exec db \
psql -U postgres -d university_inventory -c "SELECT version();"
Step 3: Run Migration
Execute the migration script:
cd /data/LUStores
python scripts/data_migration_script.py \
--schema-file migrations/database_schemas_export_20250911_152208.json \
--source-host YOUR_LEGACY_HOST \
--source-user YOUR_LEGACY_USER \
--source-password 'YOUR_LEGACY_PASSWORD' \
--source-database physicsstores \
--target-host localhost \
--target-port 5432 \
--target-user postgres \
--target-password 'YOUR_POSTGRES_PASSWORD' \
--target-database university_inventory
Step 4: Monitor Progress
The script provides real-time output:
[2025-01-22 14:30:15] - INFO - Starting complete data migration...
[2025-01-22 14:30:15] - INFO - Source: physicsstores (mysql)
[2025-01-22 14:30:15] - INFO - Target: university_inventory (postgres)
[2025-01-22 14:30:16] - INFO - Step 0: Verifying and applying database migrations...
[2025-01-22 14:30:18] - INFO - Step 1: Clearing target database...
[2025-01-22 14:30:20] - INFO - Step 3: Migrating categories...
[2025-01-22 14:30:21] - INFO - Step 4: Migrating suppliers...
[2025-01-22 14:30:45] - INFO - Migrated supplier: Acme Corp -> ID ACME001
[2025-01-22 14:31:15] - INFO - Step 5: Migrating items...
[2025-01-22 14:32:30] - INFO - Migrated item: LAB-001 -> ID 1523
...
[2025-01-22 14:45:00] - INFO - Migration completed successfully!
Step 5: Review Log File
Check the detailed log:
tail -100 data_migration.log
Step 6: Verify Migration
Run verification queries:
-- Check migrated counts
SELECT 'users' as table, COUNT(*) as count FROM users
UNION ALL
SELECT 'items', COUNT(*) FROM items
UNION ALL
SELECT 'suppliers', COUNT(*) FROM suppliers
UNION ALL
SELECT 'orders', COUNT(*) FROM orders
UNION ALL
SELECT 'chargecodes', COUNT(*) FROM chargecodes
UNION ALL
SELECT 'sales', COUNT(*) FROM sales;
Migration Sequence
The script follows this exact order (critical for foreign key dependencies):
graph TD START([Start Migration]) STEP0[Step 0: Apply SQL Migrations] STEP1[Step 1: Clear Target Database<br/>Preserve Admin Users] STEP2[Step 2: Skip User Migration<br/>Create Legacy System User] STEP3[Step 3: Migrate Categories<br/>Create default categories] STEP4[Step 4: Migrate Suppliers] STEP5[Step 5: Migrate Items] STEP6[Step 6: Migrate Chargecodes] STEP7[Step 7: Migrate Orders & Order Items] STEP8[Step 8: Migrate Issues as Sales] STEP9[Step 9: Restore Admin Users] SUMMARY[Print Migration Summary] DONE([Migration Complete]) START --> STEP0 STEP0 --> STEP1 STEP1 --> STEP2 STEP2 --> STEP3 STEP3 --> STEP4 STEP4 --> STEP5 STEP5 --> STEP6 STEP6 --> STEP7 STEP7 --> STEP8 STEP8 --> STEP9 STEP9 --> SUMMARY SUMMARY --> DONE style START fill:#4d96ff style STEP1 fill:#ffd93d style STEP5 fill:#ffd93d style DONE fill:#6bcf7fMigration Statistics
After completion, the script prints a summary:
============================================================
MIGRATION SUMMARY
============================================================
users : 0 migrated, 0 errors, 5 skipped
categories : 6 migrated, 0 errors, 0 skipped
suppliers : 45 migrated, 0 errors, 0 skipped
items : 1523 migrated, 0 errors, 12 skipped
chargecodes : 89 migrated, 0 errors, 3 skipped
orders : 234 migrated, 0 errors, 1 skipped
order_items : 567 migrated, 0 errors, 2 skipped
stock_movements : 3421 migrated, 0 errors, 15 skipped
------------------------------------------------------------
TOTAL : 5885 migrated, 0 errors
Success rate: 99.5%
============================================================
Understanding the Output
migrated: Successfully transferred to target database
skipped: Skipped due to duplicates or missing required fields
errors: Failed during migration (should be 0 for successful migration)
Troubleshooting
Common Errors
Connection Refused
Error:
pymysql.err.OperationalError: (2003, "Can't connect to MySQL server...")
Solutions:
Verify source database host is accessible:
ping legacy-db.university.edu
Check firewall allows MySQL port 3306
Verify credentials are correct
Check if MySQL allows remote connections
Authentication Failed
Error:
psycopg2.OperationalError: FATAL: password authentication failed
Solutions:
Verify PostgreSQL password in
.env.prodmatches commandCheck PostgreSQL
pg_hba.confallows connectionsEnsure user has sufficient privileges
Duplicate Key Violations
Error:
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint
Solutions:
The script should have cleared the database first (Step 1)
If running multiple times, ensure database is cleared between runs
Check for pre-existing data that shouldn’t be there
Out of Memory
Error:
MemoryError: Unable to allocate array
Solutions:
Increase available RAM
Process large tables in batches (modify script)
Close other applications during migration
Schema File Not Found
Error:
FileNotFoundError: [Errno 2] No such file or directory: 'migrations/database_schemas_export...'
Solutions:
Ensure you’re running from
/data/LUStoresdirectoryVerify schema file exists:
ls -la migrations/database_schemas_export_*.json
Use absolute path if needed
Partial Migration Recovery
If migration fails partway through:
Option 1: Restore and Retry (Recommended)
# Stop application
docker compose -f docker-compose.prod.yml stop app
# Restore backup
gunzip -c backups/pre_migration_TIMESTAMP.sql.gz | \
docker compose -f docker-compose.prod.yml exec -T db \
psql -U postgres -d university_inventory
# Fix the issue, then retry migration
Option 2: Continue from Failure Point
The script tracks which tables succeeded. You can manually complete remaining tables if needed (advanced users only).
Post-Migration Tasks
Verify Data Integrity
Run validation queries to ensure data is correct
Reset Admin Password
docker compose -f docker-compose.prod.yml exec app \ npm run admin:reset-password admin@university.edu
Test Application
Log in with admin account
Verify items are visible
Test search functionality
Create a test order
Create Post-Migration Backup
docker compose -f docker-compose.prod.yml exec -T db \ pg_dump -U postgres university_inventory \ | gzip > "backups/post_migration_$(date +%Y%m%d_%H%M%S).sql.gz"
Archive Legacy System
Once verified successful:
Take final backup of legacy system
Document legacy system shutdown date
Preserve access for auditing purposes
Best Practices
Before Migration
☑ Test on staging first - Never run directly on production
☑ Backup everything - Both source and target databases
☑ Schedule downtime - Plan for 2-4 hours depending on data size
☑ Verify schema file - Ensure it matches both databases
☑ Check disk space - Ensure sufficient space for migration
☑ Notify users - Inform them of the migration window
During Migration
☑ Monitor logs - Watch for errors in real-time
☑ Don’t interrupt - Let the script complete fully
☑ Check resource usage - Monitor CPU, memory, disk I/O
☑ Save migration log - Keep
data_migration.logfor records
After Migration
☑ Verify counts - Ensure record counts match expected
☑ Test functionality - Test all critical features
☑ Review skipped records - Investigate why records were skipped
☑ Document issues - Note any problems encountered
☑ Train users - Ensure team knows about any changes
Performance Considerations
For large datasets (>100,000 records):
Migration may take several hours
Consider running during off-hours
Monitor database server performance
Ensure adequate disk space for transaction logs
Consider increasing PostgreSQL
shared_bufferstemporarily
Estimated Times
Records |
Estimated Time |
Recommended RAM |
|---|---|---|
< 10,000 |
5-15 minutes |
2GB |
10,000 - 50,000 |
15-45 minutes |
4GB |
50,000 - 100,000 |
45-90 minutes |
8GB |
> 100,000 |
2-4 hours |
16GB+ |