Moving an e-commerce store to a new platform is a process whose success depends largely on the quality of the prepared database. Migration is not limited to exporting and importing files—it requires a thorough audit, cleanup, and mapping of information according to the target system's architecture. Data structure errors can paralyze analytics, cause SEO problems, and create order fulfillment issues. This technical guide explains how to prepare data for migration to ensure consistency and minimize the risk of operational downtime.
Why Does Data Quality Determine Migration Success?
When changing e-commerce platforms, the "garbage in, garbage out" rule often applies. Any errors, duplicates, or outdated information in the old system will be carried into the new one unless verified beforehand. Preparing the database is especially critical when planning a migration to Shopify, particularly from systems with different data structures such as Magento or WooCommerce. Inconsistent product data can negatively affect store filtering, and errors in customer records make marketing personalization difficult. Proper file preparation helps avoid costly manual fixes after launching the new store version. Bad data directly affects reporting and analytics—if product identifiers or categories are mapped incorrectly, historical sales comparisons become impossible without advanced data processing. The operational cost of fixing errors after launch, such as manually editing thousands of variants or correcting inventory levels, often far exceeds the investment in a thorough pre-migration audit.
Source Database Audit: Selecting Data to Migrate
The first step is inventorying your assets. Not all data accumulated over years in the old system is worth migrating. Excess information can unnecessarily burden the database and extend import time. Before import, defining the scope of migratable data helps organize files and set priorities. The decision about what goes into the new system should be based on business and technical usefulness analysis.
Record Selection Criteria
- Products: Migrate only active assortment and archived products that generate organic traffic. Products discontinued years ago with no SEO value are better archived in an external spreadsheet.
- Customers: Migrate customers who purchased within the last 2–3 years or have active newsletter consent. Older records often contain outdated email addresses, which can harm domain reputation for marketing sends.
- Orders: Order history matters for analytics, but import is often the most technically complex. Decide whether you need full history from the store's inception or only the last 12–24 months needed for returns and claims handling.
- Static content: Blog articles and informational pages require verification of link freshness, formatting, and alignment with the new URL structure.
Data Cleaning: HTML Cleanup and Duplicate Removal Techniques
Data cleaning is the most labor-intensive preparation stage. Systems such as WordPress often store unnecessary HTML tags, inline styles, and CSS classes in product descriptions that can break content display in a new template. Data standardization also includes validating email addresses (removing spaces, checking format) and phone numbers to match international standards (e.g. +48).
Removing Unnecessary Formatting from Descriptions
When exporting from open-source systems, product descriptions often contain code fragments like <div style="color: #000;"> or visual editor-specific classes. Before import, clean these texts, keeping only basic semantic tags such as <ul>, <li>, <strong>, or <h2>. In spreadsheets you can use =REGEXREPLACE(A1; "<[^>]*>"; "") to remove all tags entirely, or more advanced scripts that preserve selected tags. This avoids situations where an old layout breaks a modern, responsive template.
Standardizing Attributes and Variants
Inconsistent naming is a common problem in large databases. For example, the same color may be recorded as "Red", "red", or "RED". Before mapping data, unify all attribute values. Spreadsheet functions such as PROPER() for letter case or TRIM() to remove extra spaces are helpful. Removing duplicate SKUs (Stock Keeping Units) is essential because the system requires unique identifiers for each product variant. If two different products share the same SKU, the importer will report an error or overwrite data.
Architecture Mapping: Aligning Fields with Shopify Logic
Shopify's data structure differs from systems such as Magento, where categories are hierarchical. Shopify uses collections, which can be manual or automated (based on tags, prices, or keywords). Understanding the Handle column and how product variants are defined is key.
Handle Column Logic and Its Impact on SEO
Handle is a unique product name that serves as a database identifier and URL element (e.g. domain.com/products/product-name). Every product must have a unique Handle. If a product has multiple variants (e.g. different sizes), all rows for those variants in the CSV must share the same Handle. This lets the system group them under one product page. When generating Handles from product names, remove accented characters and replace spaces with hyphens to maintain SEO consistency.
Variant Mapping: Option Name and Option Value
In the import file, variants are defined with Option1 Name (e.g. Size) and Option1 Value (e.g. XL). The system allows up to 3 options per product (e.g. Size, Color, Material). Option names must be identical for all products in a given category. Using "Size" and "Size label" interchangeably will break filters on product listings. Each variant should have its own SKU, weight (Variant Grams), and price (Variant Price).
CSV File Technical Aspects: Encoding, Separators, and Limits
CSV files must meet strict technical requirements. The most important parameter is UTF-8 encoding. Using another standard (e.g. Windows-1250) will cause errors displaying accented characters. Use a comma as the field separator, and values containing special characters should be quoted. Also remember limits: the standard importer handles files up to 15 MB. One product can have a maximum of 100 variants (unless the store uses Shopify Plus and newer API interfaces supporting more variants). For large databases, split files into smaller batches or use external bulk import tools.
Managing Media and Metafields
Product images are not uploaded directly in the CSV file. The file contains only publicly accessible image URLs. During import, the system downloads those files, optimizes them, and stores them on its own servers.
Image Hosting During Migration
The safest approach is to leave images on the old server until migration completes. If the old server is shut down before import, images will not be downloaded. Alternatively, bulk-upload files to Content → Files in the admin panel and generate new links. Ensure file names contain no spaces or special characters that could interrupt Shopify's download process.
Preparing Metafields
Custom data such as product composition, user manuals, or additional technical parameters should be mapped to Metafields. Before import, define the metafield structure in the admin panel, specifying type (e.g. text, number, file). CSV columns for metafields should follow the namespace.key schema. This preserves product-specific information that does not fit in the standard description.
Test Import (Dry Run) and Error Resolution
Before the final import to the production store, run a test on a development (sandbox) store. The most common errors during migration to Shopify usually stem from minor CSV structure mistakes that can be avoided by verifying a small sample (e.g. 20 products of varying complexity). After a trial import, check whether sale prices (Compare at price) are visible, inventory matches the source database, and products are correctly assigned to collections. The test scenario should also include adding an imported product to the cart and proceeding to checkout to ensure variants and weights are interpreted correctly by shipping systems.
Technical Summary of Data Preparation
Data preparation requires precision and understanding of the target platform's architecture. Success depends on a clean database, correct column mapping, and strict adherence to CSV technical standards. Proper data preparation also supports later Shopify analytics in e-commerce, giving a clear sales picture from day one and eliminating the need to manually fix thousands of records later. A systematic approach to auditing and cleaning the database minimizes downtime risk and customer service errors after the new platform launches.
FAQ
What Are the Most Common Causes of CSV Import Errors in Shopify?
The most common issues are incorrect character encoding (missing UTF-8), header structure errors, missing values in the Handle column, and exceeding file size or product variant limits.
Can Customer Passwords Be Imported During Migration to Shopify?
No—for security reasons Shopify does not allow importing passwords in plain or encrypted form. After migration, send customers invitations to activate accounts in the new system.
How Do I Prepare Product Images for Bulk Import?
Images must be available at public URLs. Place direct links to image files in the CSV; Shopify will download and optimize them during import.
What Is Handle in a Shopify Import File and Why Does It Matter?
Handle is a unique product identifier used to create friendly URLs and link product variants into one database record.
How Do I Handle Custom Data Not in the Default CSV?
Custom data such as specific technical parameters should be mapped to Metafields. This requires preparing the appropriate column structure in a spreadsheet or using apps that support Metafield import.
How Do I Avoid Duplicating Products During a Repeat Import?
Shopify identifies products by the Handle column. If you enable overwriting existing products during import, the system updates records with the same Handle instead of creating new ones.
Bibliography
- Shopify Developer Changelog - The product variant limit is now 2048 for all merchants - Shopify has increased the product variant limit to 2048 for all merchants using compatible themes and the new GraphQL Product API, moving away from the old 100 variant limit.
- Shopify Help Center - For security reasons Shopify does not allow importing customer passwords from other e-commerce platforms via CSV or standard API. Passwords in Shopify are encrypted using platform-specific security algorithms.
- Data migration challenges - Data structure errors during migration lead to analytics problems (incorrect reports) and order fulfillment issues.
- How migration affects SEO - Data structure and URL errors negatively affect SEO, site indexing, and search visibility.