Migrating SQL in Drupal 8 with Migrate Tools and Migrate Plus

Submitted by christophe on Tue, 13/06/2017 - 22:21
Drupal migration


UPDATE FOR DRUPAL 8.4
We must wait for this issue to be fixed to use Migrate Tools with Drush 9 (that is used by default by 8.4). On Drush 9, the drush mi  (migrate-import) alias is also replaced by drush mim.
If you still use Drush 8 in global (and not from your local vendor/bin) read this issue.

Migrate API is awesome, and if you plan some custom migration, a few contributed modules does the heavy lifting for you.

This article should be regarded as a list of steps to follow to achieve a simple migration from another SQL data source than Drupal. So we will not go in deep into the explanations of the Migrate theory, for this subject, refer to the documentation of the Migrate API.
Also, we will finish with some debugging techniques and a first shot of a bash script for rerolling a migration from scratch in case of configuration change.

Here is a repository with the code exposed below.

You can also find a kickstarter article for JSON custom migration.

SQL to Drupal fields

Here is a simple case :

  • We want to migrate rows from a company table into nodes from a Company content type.
  • Here is the mapping for the table columns into the Drupal node fields, test data are provided below, in point 4.
    • name (varchar) -> node title
    • description (text) -> body, default node text field
    • phone (varchar) -> telephone field
    • email (varchar) -> email field
    • website URL (varchar) -> link field
    • id (int) -> no field, kept for source - destination mapping
  • We also want simple processing like trim for telephone numbers and validate for email addresses.

Note that the id will be used by the Migrate system to keep a mapping of the source id / destination entity id, for update and status purposes, in the migrate_map_company Drupal table that is created by the Migrate system.

For other concepts like entity reference, great code examples are available in the migrate_plus module, that comes with two submodules.

  • migrate_example
  • migrate_example_advanced

If you are still interested by covering entity reference and other Migrate concepts in another article, just drop a word.

At this point it is still good to know that


1. Create a new Drupal project

A good starting point is the Composer template for Drupal projects.


2. Add dependencies

Get the Migrate Tools and Migrate Plus modules.

# migrate_tools requires migrate_plus
composer require drupal/migrate_tools


3. Scaffold a module with Drupal Console

# Via the Composer template for Drupal project, preferably use the tools provided under the /vendor/bin directory (drupal console, drush, phpcs, ...) and not the globally installed.
./vendor/bin/drupal gm

 // Welcome to the Drupal module generator

 Enter the new module name:
 > My Custom Migrate

 Enter the module machine name [my_custom_migrate]:
 > 

 Enter the module Path [/modules/custom]:
 > 

 Enter module description [My Awesome Module]:
 > Custom migrate example

 Enter package name [Custom]:
 > Migration

 Enter Drupal Core version [8.x]:
 > 

 Do you want to generate a .module file (yes/no) [yes]:
 > 

 Define module as feature (yes/no) [no]:
 > 

 Do you want to add a composer.json file to your module (yes/no) [yes]:
 > no

 Would you like to add module dependencies (yes/no) [no]:
 > yes

 Module dependencies separated by commas (i.e. context, panels):
 > migrate_tools

 Do you want to generate a unit test class (yes/no) [yes]:
 > no

 Do you want to generate a themeable template (yes/no) [yes]:
 > no


 Do you confirm generation? (yes/no) [yes]:
 > yes

Generated or updated files

 1 - /var/www/dev/drupal8/web/modules/custom/my_custom_migrate/my_custom_migrate.info.yml
 2 - /var/www/dev/drupal8/web/modules/custom/my_custom_migrate/my_custom_migrate.module


4. Reference your source database

Here we go!
Modify the sites/default/settings.php file and append the reference to the source database. Copy this at the bottom of the file, right after the $databases['default']['default'] = array (...);

// Use the 'migrate' key.
$databases['migrate']['default'] = array (
  // Replace by your source database name and credentials
  'database' => 'my_source_database',
  'username' => 'root',
  'password' => 'root',
  'prefix' => '',
  'host' => 'localhost',
  'port' => '3306',
  'namespace' => 'Drupal\\Core\\Database\\Driver\\mysql',
  'driver' => 'mysql',
);

Here is a SQL sample that creates the Company table and populates 100 records


5. Create your content type and fields

Create a Company content type and make sure that it contains the following fields.

Company content type fields


6. Plugins for the Extract - Transform - Load phases (ETL)

Only the Source plugin definition is required, Process (e.g. case transform) and Destination plugins are here to help on customizing the migration to your needs.

The Process comes with a default one-to-one field migration (the get Plugin).
Here is the full list of all the core Process plugins available, like concat, explode, flatten, default_value and much more.

Destination plugins also have core definition like entity:node. Here you can also define your custom entity types, but this is not covered by this article.

So we have

  • Extract : Source plugin
  • Transform : Process plugin
  • Load : Destination plugin

In my_custom_migrate module, create the following directory structure

  • src/Plugin/migrate/source
  • src/Plugin/migrate/process

Source plugin

Add the Company.php file in the src/Plugin/migrate/source directory where you will create the source fields definition.

<?php

/**
 * @file
 * Contains Drupal\my_custom_migrate\Plugin\migrate\source\Company
 */

namespace Drupal\my_custom_migrate\Plugin\migrate\source;

use Drupal\migrate\Plugin\migrate\source\SqlBase;

/**
 * Source plugin for Companies.
 *
 * @MigrateSource(
 *   id = "company"
 * )
 */
class Company extends SqlBase {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = $this->select('company', 'c')
      ->fields('c', array(
        'id',
        'name',
        'description',
        'phone',
        'email',
        'website',
      ));
    return $query;
  }

  /**
   * {@inheritdoc}
   */
  public function fields() {
    $fields = array(
      'id' => $this->t('Autoincrement ID'),
      'name' => $this->t('Company name'),
      'description' => $this->t('HTML content'),
      'phone' => $this->t('Telephone number'),
      'email' => $this->t('Email address'),
      'website' => $this->t('Website URL'),
    );
    return $fields;
  }

  /**
   * {@inheritdoc}
   */
  public function getIds() {
    return [
      'id' => [
        'type' => 'integer',
        'alias' => 'c',
      ],
    ];
  }
}


Process plugins

We will cover here 2 Process plugins : Trim and ValidateMail.

Trim

Some modules are removing whitespace (like Email), some not (like Telephone) while storing their values.
Migration is a good moment to clean up.

Note that for such a trivial task we should use the callback process plugin with the trim php function.
This plugin example is only here to expose both ways to achieve it.

Example with the callback, in the Migrate Plus yml file (see more in point 7. Define the Migrate Plus yml file).

process:
  destination_field:
    plugin: callback
    callable: trim
    source: source_field

Example with the Process plugin

<?php

/**
 * @file
 * Contains Drupal\my_custom_migrate\Plugin\migrate\process\Trim
 */

namespace Drupal\my_custom_migrate\Plugin\migrate\process;

use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\MigrateException;
use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\Row;

/**
 * Removes the first and last whitespaces.
 * For demo purpose only, for such a trivial task use the callback process plugin :
 * https://www.drupal.org/docs/8/api/migrate-api/migrate-process/process-plugin-callback
 *
 * @MigrateProcessPlugin(
 *   id = "trim"
 * )
 */
class Trim extends ProcessPluginBase {

  /**
   * {@inheritdoc}
   */
  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    if (is_string($value)) {
      return trim($value);
    }
    else {
      throw new MigrateException(sprintf('%s is not a string', var_export($value, TRUE)));
    }
  }
}


ValidateMail

Nothing really different here, apart from the MigrateException that we do not want to throw : let's say that the email field in the content type is not mandatory, we do not want the row to be skipped for an invalid email.
So let's import the row, but do not populate the field with a wrong email address, just an empty string. Ideally, we should add some code to log this for later manual processing.

<?php

/**
 * @file
 * Contains Drupal\my_custom_migrate\Plugin\migrate\process\ValidateMail
 */

namespace Drupal\my_custom_migrate\Plugin\migrate\process;

use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\Row;

/**
 * Checks if the mail syntax is correct.
 *
 * @MigrateProcessPlugin(
 *   id = "validate_mail"
 * )
 */
class ValidateMail extends ProcessPluginBase {

  /**
   * {@inheritdoc}
   */
  public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) {
    $value = trim($value);
    if (\Drupal::service('email.validator')->isValid($value)) {
      return $value;
    }
    else {
      // throw new MigrateException(sprintf('%s is not a mail', var_export($value, TRUE)));
      // do not throw Exception, just an empty value so our row is still imported.
      return '';
    }
  }
}


7. Define the Migrate Plus yml file

Create the my_custom_migrate/config/install/migrate_plus.migration.company.yml that defines the migration "metadata" (id and group that will be used by the drush commands) and references the source, process and destination plugins.

# Migration metadata
id: company
label: Migrate list of companies
migration_group: my_custom_migrate

# Source plugin, id defined in the @MigrateSource annotation
# in the src/Plugin/migrate/source/Company.php file.
source:
  plugin: company

# Destination plugin
destination:
  plugin: entity:node

# Process plugin
process:
  # Node type (bundle)
  type:
    plugin: default_value
    default_value: company

  # One-to-one field mapping using the default "get" process plugin.
  title: name
  # We can also use body/summary, body/format
  # see core/modules/text/config/schema/text.schema.yml
  # field.value.text_with_summary, used by the node entity for the body field.
  body/value: description
  field_telephone:
    plugin: trim
    source: phone
  field_email:
    plugin: validate_mail
    source: email
  field_website: website # and not field_website/url

# Nothing needed here, it is a single table without any other relation.
migration_dependencies: {}


8. Enable your module

drush en my_custom_migrate -y


9. Execute migration

Before we start, check the status of the migration.

# Shorthand for drush migrate-status
drush ms

drush migrate-status

Then import the migration group.

# Shorthand for drush migrate-import
drush mi company
# Note that in this case, this is similar to drush mi --group=my_custom_migrate, because company is the only item in the group

drush migrate-import

You can also rollback your migration

# Shorthand for drush migrate-rollback, separate with commas if multiple: drush mr company,contact
drush mr company

Other Drush Migrate commands


10. Debugging techniques

Nice tips from Mike Ryan found on StackOverflow:

I usually run migrations in drush and, when not stepping through in a debugger, use drush_print_r(). The key points you want to instrument are prepareRow() (dump $row to make sure the source data is being pulled correctly in the first place) and prepare() (dump $node to make sure the source values are getting properly mapped into the node object that's about to be saved).

Source Plugin

Edit the Company.php Source plugin and implement the prepareRow method to dump the rows and check if the source data is being pulled correctly.

  public function prepareRow(Row $row) {
    drush_print_r($row);
    return parent::prepareRow($row);
  }

You can also limit the rows to be imported on the query method, by adding a range. This will also be reflected with drush ms that uses this query to display the amount of items to be imported.

public function query() {
    $query = $this->select('company', 'c')
      ->fields('c', array(
        'id',
        'name',
        'description',
        'phone',
        'email',
        'website',
      ))->range(0,10);
    return $query;
}

 

Migrate Plus yml configuration

If you change the config/install configuration yml file(s), you will need to reinstall or use the Configuration development contributed module.

If you choose the first option, you can use what follows.

Implement the hook_uninstall() on a my_custom_migrate.install file.

<?php

/**
 * Implements hook_uninstall().
 */
function my_custom_migrate_uninstall() {
  // @review with like
  $configs  = [
    'migrate_plus.migration_group.my_custom_migrate',
    'migrate_plus.migration.company',
  ];
  foreach($configs as $config) {
    $delete = \Drupal::database()->delete('config')
      ->condition('name', $config)
      ->execute();
  }
  drupal_flush_all_caches();
}

With the devel module installed run this to uninstall and reinstall your module.

drush dre my_custom_migrate -y

Then remove all the content (with devel generate installed).

drush genc 0 --kill --types=company

Note that you can (and should) use drush mr to remove these created nodes, but for entities created with the entity_generate process plugin (e.g. terms references created along with the nodes), you should use devel generate to get rid of all the remaining entities.

# remove terms from the tags vocabulary created by the entity_reference process plugin (not shown in this example)
drush generate-terms tags --kill 0

Here is a draft of bash script to launch this reroll.

#!/bin/bash

# Prompt for confirmation
read -p "This will reinstall the custom migrate module, delete all the entities from the content type and restart import. Are you sure (y/n)? " -n 1 -r
# Move to a new line
echo
if [[ $REPLY =~ ^[Yy]$ ]]
then
    SECONDS=0
    echo "Getting initial status..."
    # Status
    drush ms
    echo "Reinstalling the custom migrate module..."
    # Assumes that devel installed
    drush dre my_custom_migrate -y
    echo "Removing all the nodes..."
    # Delete all the content, assumes that devel generate is installed
    drush genc 0 --kill --types=company
    echo "Status..."
    # Status
    drush ms
    echo "Reimport..."
    # Import
    drush mi --group=my_custom_migrate
    echo "$SECONDS seconds elapsed."
fi

Migrate all the things

Resources

Comments

I have to reinstall the module anytime I change the yaml file? What an absolute pleasure that must be, and so fast!

Add new comment

Restricted HTML

  • Allowed HTML tags: <a href hreflang> <em> <strong> <cite> <blockquote cite> <code> <ul type> <ol start type> <li> <dl> <dt> <dd> <h2 id> <h3 id> <h4 id> <h5 id> <h6 id>
  • Lines and paragraphs break automatically.
  • Web page addresses and email addresses turn into links automatically.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.