Automating Drupal Migrations: How to Go from an Estimated One Week of Downtime to Two Minutes
About Dan Harris ● Founder, Webdrips.com ○ Drupal-based web design and development shop ○ Founded in July 2011 ○ Nine years Drupal experience ○ 21 years professional experience ● Twitter @webdrips ● Email dan@webdrips.com
Note About the Migration Process Although we’re covering a Drupal 6 to 7 migration in this presentation, most, if not all, of the ideas presented here should work for any Drupal-to-Drupal migration.
Overview: Initial Plan/Estimates ● Initial estimate: one week of downtime ● SQL queries would be used to export/import content wherever Drupal Migrate coverage was limited ● The only automation would come from the Migrate modules ● Reuse the existing Drupal 7 architecture
Overview: Updated Plan ● Virtually zero downtime ○ Intermediate goal: one day of downtime or less ● Complete the migration in one business day ● Over 99% automated ● D7 site to be built from scratch during the migration
About the Drupal 6 Site ● Architecturally, it was a mess (a “Frankensite”) ○ The migration provided a chance to clean up the architecture and code ● Six custom themes (one base theme, five subthemes) ● 35 custom modules ● 151 contributed modules
About the Drupal 6 Site ● 1,000 privileged users ● About 400K non-privileged users ● 25 content types, including webforms ● Over 2,500 pages
About the Drupal 7 Site ● 106 modules ● Bootstrap as the primary theme ● One Bootstrap subtheme and four sub-subthemes ● Only six content types ● 11 Features provided the architecture
Automated Migration Process Requirements ● Migrate modules: migrate, migrate_extras, migrate_d2d, migrate_webform ● Import modules: menu_import, path_redirect_import ● Four custom modules ● Migration and deployment scripts ● Fast server with an SSD
Migration Script Overview Requirements: ● Create a new Drupal 7 site ● Build out the site architecture with Features ● Enable modules ● Migrate content from D6 to D7 ● Import items that couldn’t be migrated This provided a repeatable, reliable process
Migration Script Highlights (Review)
Build the site: drush site-install
Enable features and modules: drush en feature_name -y
Migrate each entity: drush mi entity
Custom Migration Modules 1. Disable “edits” to the D6 site a. Basically redirect webform pages, admin pages, and paths like node/add and node/edit (see the sketch below) 2. Views (implemented with Features) used only for migration status and post-processing 3. migrate_d2d-based migration classes 4. CSV-based migrations
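A minimal sketch of the “freeze the D6 site” idea from item 1, assuming a hypothetical custom module named nvidia_freeze; the locked paths and the uid-1 exemption are illustrative, and webform submission pages would be handled the same way:

<?php
// Drupal 6 hook_init(): redirect content-editing and admin paths to the front
// page so nothing changes on the old site once the migration has started.
function nvidia_freeze_init() {
  global $user;
  // Assumption: the account running the migration (uid 1) still needs access.
  if ($user->uid == 1) {
    return;
  }
  $is_node_edit = arg(0) == 'node' && (arg(1) == 'add' || arg(2) == 'edit');
  $is_admin = arg(0) == 'admin';
  if ($is_node_edit || $is_admin) {
    drupal_set_message(t('The site is read-only while we move to the new platform.'), 'warning');
    drupal_goto('<front>');
  }
}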
Drupal Migrate/D2D/Extras ● Handled most of the heavy lifting ○ Everything except menu links, path redirects, and slideshows ● Extensive drush support ● Plenty of methods available to massage data (prepareRow(), prepare(), complete()) ● D2D simplifies the migration code (see the sketch below)
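A minimal sketch of what a migrate_d2d 2.x node migration class can look like; the class name, content type, and field names are illustrative, not the actual deck code:

class NvidiaArticleMigration extends DrupalNode6Migration {
  public function __construct(array $arguments) {
    parent::__construct($arguments);
    // Map a D6 CCK field onto the corresponding D7 field; everything else
    // (title, body, author, dates) is handled by the parent class.
    $this->addFieldMapping('field_summary', 'field_teaser_text');
  }

  public function prepareRow($row) {
    // Returning FALSE from prepareRow() skips the row entirely.
    if (parent::prepareRow($row) === FALSE) {
      return FALSE;
    }
    // "Massage" the data here, e.g. drop rows that should not be migrated.
    if (empty($row->title)) {
      return FALSE;
    }
  }
}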
Migrating Users Challenges ● Nearly 400K unprivileged users ● Needed to assign users to organic groups ○ Based on how webform questions were answered ● Had to fix user passwords ○ Fixed by writing directly to the users table inside the migration
Migrate Users Code
Unprivileged vs. privileged came down to a simple query:

class NvidiaPrivilegedUserMigration extends NvidiaUserMigration {
  protected function query() {
    $query = parent::query();
    // Privileged users are the ones with an nvidia.com e-mail address.
    $query->condition('u.mail', '%nvidia.com', 'LIKE');
    return $query;
  }
}
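The unprivileged migration (the NvidiaUnprivilegedUserMigration class that the multithreading slides later extend) would simply flip the operator; a minimal sketch:

class NvidiaUnprivilegedUserMigration extends NvidiaUserMigration {
  protected function query() {
    $query = parent::query();
    // Everyone without an nvidia.com address is an unprivileged user.
    $query->condition('u.mail', '%nvidia.com', 'NOT LIKE');
    return $query;
  }
}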
Migrate Users Code
Fix the password:

public function complete($account, $row) {
  parent::complete($account, $row);
  // Keep the original password hash from the D6 site. The normal user save
  // path would treat the value as plain text and re-hash it, so write the
  // hash straight into the users table instead.
  $account->pass = $row->pass;
  db_update('users')
    ->fields(array('pass' => $account->pass))
    ->condition('uid', $account->uid)
    ->execute();
  $this->nvidia_memberships($row);
}
Assign Users to Groups (Review)

public function nvidia_memberships($row) {
  // Pull this user's webform submissions from the D6 source database and
  // concatenate the answers (data) per question (cid).
  $membership_query = Database::getConnection('default', 'd6source')
    ->select('webform_submissions', 'ws');
  $membership_query->join('webform_submitted_data', 'wd', 'wd.sid = ws.sid');
  $membership_query->fields('wd', array('cid'));
  $membership_query->fields('ws', array('nid'));
  $membership_query->addExpression('group_concat(data)', 'data');
  $membership_query->groupBy('ws.sid');
  $membership_query->groupBy('cid');
  $membership_query->condition('ws.uid', $row->uid);
  // Only the webforms whose answers determine group ("program") membership.
  $membership_query->condition('ws.nid', array(1234567, 2345678, 3456789, 4567890, 5678901), 'IN');
  // The answers are then mapped to an organic group membership; the slide
  // shows the call without its arguments (the rest was reviewed live).
  $membership_id = nvidia_og_membership_associate_user_with_program();
}
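For readers without the live review, here is a hedged sketch of what that last step could look like if written against the Organic Groups 7.x-2.x API directly, with nvidia_map_answers_to_program() as a hypothetical stand-in for the site's own mapping logic (the deck's real helper is nvidia_og_membership_associate_user_with_program()):

// Inside nvidia_memberships($row), after building $membership_query as above:
foreach ($membership_query->execute() as $record) {
  // Hypothetical helper: turn the concatenated webform answers into the node
  // ID of the organic group ("program") this user belongs to.
  $program_nid = nvidia_map_answers_to_program($record->nid, $record->cid, $record->data);
  if ($program_nid) {
    // og_group() is the OG 7.x-2.x API call for adding a member to a group.
    og_group('node', $program_nid, array(
      'entity_type' => 'user',
      'entity' => user_load($row->uid),
    ));
  }
}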
Node Migration Challenges ● Body images & links with absolute paths ● Empty fields sometimes caused display issues ● Had to deal with “interesting” architecture decisions on the D6 site ● Moved larger files to the cloud ● Reduced the number of content types
Node Migration Code Dealing with textarea images: ● Needed to use the Simple HTML DOM Parser library ● Code reviewed live (a sketch follows below)
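The reviewed code is not in the deck. A minimal sketch of the idea, assuming the Simple HTML DOM Parser library sits in sites/all/libraries and that body images and links just need the old absolute domain stripped (the domain and library path are illustrative):

public function prepareRow($row) {
  if (parent::prepareRow($row) === FALSE) {
    return FALSE;
  }
  // str_get_html() and find() come from the Simple HTML DOM Parser library.
  require_once DRUPAL_ROOT . '/sites/all/libraries/simple_html_dom/simple_html_dom.php';
  $html = str_get_html($row->body);
  if ($html) {
    foreach ($html->find('a, img') as $element) {
      // Images use src, links use href; strip the old domain from both so
      // the markup keeps working on the new site.
      $attribute = $element->tag == 'img' ? 'src' : 'href';
      $element->$attribute = str_replace('http://www.example.com/', '/', $element->$attribute);
    }
    $row->body = (string) $html;
  }
}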
How a Strange Dev Decision Can Affect a Migration
The D6 product page and the variables table in the database (reviewed) led to the following code:

$variable_name = 'nvidia_product_disable_product_image_' . $row->nid;
$query = Database::getConnection('default', 'd6source')
  ->select('variable', 'v')
  ->fields('v', array('name', 'value'))
  ->condition('v.name', $variable_name, '=')
  ->execute()
  ->fetchAll();
// The D6 site stored a per-node "disable product image" flag as a serialized
// value in the variables table; 'i:1;' is a serialized integer 1, i.e. disabled.
if (!empty($query) && $query[0]->value == 'i:1;') {
  $row->field_inline_image = NULL;
}
Remove Empty Textarea Fields

public function prepare($entity, stdClass $row) {
  foreach ($row as $key => $value) {
    // Empty source fields caused display issues on the D7 site, so clear the
    // destination property when the source value is missing or NULL.
    if (!isset($row->$key)) {
      $entity->$key = NULL;
    }
  }
}
“Non-Standard” Entity Migrations (Review) ● D2D handles established Drupal entities well ○ nodes, users, taxonomy, etc. ● But what if you want to migrate block content to an entity? ○ CSV migration to the rescue (see the sketch below)
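A minimal sketch of a CSV-based migration of the kind described, assuming the block content was first exported to a CSV file; the file path, columns, and destination bundle (a hypothetical 'promo' node type) are all illustrative:

class NvidiaBlockContentMigration extends Migration {
  public function __construct($arguments) {
    parent::__construct($arguments);
    $this->description = t('Import legacy block content from a CSV export.');
    // Column positions in the CSV, plus a human-readable label for each.
    $columns = array(
      0 => array('id', 'Legacy block ID'),
      1 => array('title', 'Title'),
      2 => array('body', 'Body'),
    );
    $this->source = new MigrateSourceCSV(
      DRUPAL_ROOT . '/sites/default/files/migrate/blocks.csv',
      $columns,
      array('header_rows' => 1)
    );
    $this->destination = new MigrateDestinationNode('promo');
    $this->map = new MigrateSQLMap($this->machineName,
      array('id' => array('type' => 'int', 'unsigned' => TRUE, 'not null' => TRUE)),
      MigrateDestinationNode::getKeySchema()
    );
    $this->addFieldMapping('title', 'title');
    $this->addFieldMapping('body', 'body');
  }
}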
Challenges ● Biggest challenge was reducing the migration time ○ The original estimate just for migrating users was over 40 hours ○ Eventually that was reduced to roughly 3 hours ○ We tweaked my.cnf, php.ini, and drush.ini ○ Got a really fast server with Intel Xeon processors, fast RAM, and an SSD
Challenges ● Installing modules in the right order ○ Circular dependencies ○ Features that add fields need to be installed before the migration ● Relationships between content ○ Both nodes need to exist before a relationship can be created (see the stub sketch below) ○ “Parent” content that did not exist on the original site
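One common way Migrate 2.x handles the “both nodes need to exist” problem is to map the reference through the parent migration and let it create stubs. A minimal sketch with hypothetical field and migration names (the exact createStub() signature differs slightly between Migrate releases):

// In the child migration's constructor: resolve the D6 parent nid through the
// 'NvidiaParentNode' migration's map table, creating a stub if it isn't there yet.
$this->addFieldMapping('field_parent_node', 'parent_nid')
  ->sourceMigration('NvidiaParentNode');

// In the parent migration class: create a placeholder node that Migrate later
// overwrites when the real parent row is imported.
protected function createStub(Migration $migration, array $source_id) {
  $node = new stdClass();
  $node->type = 'page';
  $node->title = t('Stub for source node @nid', array('@nid' => reset($source_id)));
  $node->uid = 1;
  node_save($node);
  return isset($node->nid) ? array($node->nid) : FALSE;
}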
Migration timeline ● -7 days to release: Content freeze ● -2 days: Automated rebuild, content migration, and editorial approval ● -8 hours: Registration lockdown and migration start ● -2 hours: Batch processing of content by editors and final tests
Accelerating migration ● Use Drush ● Single pass for each item ○ Migration objects are big and slow ○ Don’t load an object from DB twice ● Multithreading ○ https://www.deeson.co.uk/labs/multi-processing-part-2-how-make-migrate-move
Add multithreading to a working migration class ● Not very portable ○ needs a Drush extension ○ needs to run on the ‘fast’ server ● Very effective
Add multithreading to a working migration class ● Sub-class the migration ● Make all the sub-migrations use the same index ● Make each sub-migration work on a small “chunk” of the index ● Break the migration into parts and send the chunks to multiple threads
Add multithreading to a working migration class

<?php

class NVMultiThread extends NvidiaUnprivilegedUserMigration {

  public function __construct($args) {
    // This is boilerplate needed by D2D.
    $args += array(
      'source_connection' => NVIDIA_MIGRATE_SOURCE_DATABASE,
      'source_version' => 6,
      'format_mappings' => array(
        '1' => 'filtered_html',
        '2' => 'full_html',
        '3' => 'plain_text',
        '4' => 'full_html',
      ),
      'description' => t('Multithreaded migration of users from Drupal 6'),
      'role_migration' => 'Role',
    );
Add multithreading to a working migration class

    parent::__construct($args);
    $this->limit = empty($args['limit']) ? 100 : $args['limit'];
    $this->offset = empty($args['offset']) ? 0 : $args['offset'];

    // Map/index table: every sub-migration uses the same index definition,
    // so they all write into the same map table.
    $this->map = new MigrateSQLMap('nvidiaunprivilegeduser',
      array(
        'uid' => array(
          'type' => 'int',
          'unsigned' => TRUE,
          'not null' => TRUE,
          'description' => 'User migration reference',
        ),
      ),
      MigrateDestinationUser::getKeySchema()
    );
  }
Add multithreading to a working migration class

  protected function query() {
    // Modify the original query to limit the number of items to work on,
    // using the limit/offset defaults set in the constructor.
    $query = parent::query();
    $query->range($this->offset, $this->limit);
    return $query;
  }
}
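The deck relies on the Deeson Drush multi-processing extension to launch the threads; how the chunks map onto migrations isn't shown. A minimal sketch of one way to register a sub-migration per chunk, using Migration::registerMigration() with purely illustrative chunk sizes and machine names:

// Register one NVMultiThread instance per chunk of the user index; each
// instance imports $chunk_size rows starting at its own offset, and each
// spawned process then runs one of these machine names.
$chunk_size = 1000;
$total_rows = 400000;
for ($offset = 0, $i = 0; $offset < $total_rows; $offset += $chunk_size, $i++) {
  Migration::registerMigration('NVMultiThread', 'NVMultiThreadChunk' . $i, array(
    'limit' => $chunk_size,
    'offset' => $offset,
  ));
}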
Measuring the improvement ● Same server ● Destination DB restored from backup after each run ● Same source DB ● Both DBs on the same server ● MySQL tuned for concurrency issues
Measuring the improvement
1,000 rows, 100 per thread

Threads   Time    Speed
1         71s     845/min
2         60s     1000/min
3         54s     1111/min
Measuring the improvement
10,000 rows, 1,000 per thread

Threads   Time    Speed
1         707s    848/min
2         303s    1980/min
3         300s    2000/min
4         291s    2061/min
5         351s    1709/min
Measuring the improvement
50,000 rows, 5,000 per thread

Threads   Time    Speed
3         1990s   1507/min
4         1562s   1920/min
5         1303s   2302/min
6         1637s   1832/min