MySQL数据库迁移到PostgreSQL

时间:2020-12-26 14:25:37

MySQL数据库迁移到PostgreSQL

查了不少资料,也尝试了一些,最后采用的办法如下:

 1. 导出mysql表定义(无数据)

mysqldump --no-data [dbname] >dbdef.sql

2. 使用mysql2postgres把脚本转换为pgsql

3. 上面生成的脚本还不一定很完美,可以尝试导入pgsql,调试错误并手动修改之。我遇到的问题就只有一个,mysql列定义中的zerofill需要手工去掉。一些unsinged定义会生成constraint,如果不需要可以去掉。另外,trigger都有问题,只能后面手工重建

4. 导出mysql数据:

mysqldump -v -nt --complete-insert=TRUE --compact --no-create-info --skip-quote-names [dbname] >dbdata.sql

老一些版本的pgsql如果不支持批量插入的话还需要加上--extended-insert=FALSE,这个性能损失巨大。

5. 转义符

mysql默认字符串里的'\'是转义符,而pgsql默认不是,修改postgresql.conf:

backslash_quote = on

escape_string_warning = off

standard_conforming_strings = off

数据导入完成后可以改回默认值。

5. pgsql里导入表定义和数据

psql -d [dbname] <dbdef.sql

psql -d [dbname] <dbdata.sql

6. 重建trigger

7. 自增主键(字段)的处理

由于导入数据时此字段都是有值的,所以pgsql里面seq并不会增加,可以用如下语句设置自增列的当前值:

SELECT setval('sample_id_seq',max(id)) from sample;

 

最后,如果数据量大,导入时考虑性能可以先把主键、索引、约束都去掉,导入完成后再加上。另外,psql客户端从管道导入数据似乎不够快,可以用tcp方式psql -h localhost ,还有一些为大数据量导入优化的参数,大概列一下:

autovacuum = off

wal_level = minimal

archive_mode = off

full_page_writes = off

fsync = off

checkpoint_segments = 50  
checkpoint_timeout = 1h

maintenance_work_mem视内存情况尽量大点


2,

Drupal 6.x. site Migration from MySQL to PostgreSQL (howto)

MySQL数据库迁移到PostgreSQL

I'm not a MySQL fun, so some troubles with MySQL database (MyISAM data corruption and too slow InnoDB) force migration of three existing Drupal 6 sites from MySQL to PostgreSQL. Since Drupal 6.2 some PgSQL-related bugs (sample) in core modules has been fixed, so I have a try.

Drupal 6.5 and PostgreSQL 8.3 compatibility

Drupal Core Modules

I've installed clean Drupal 6.5 on top of PostgreSQL 8.3.3 (with UTF8 database encoding) and played with it a lot. There is no heavy troubles, only minor ones:

  1. My database instance has non-UTF charset set as default client_encoding (in postgresql.conf). It is not possible to change this setting due to large amount of legacy code. It is much simpler to explicitly set connection encoding after connect to database.
    Patch: drupal.pgsql.diff.gz (see also my bug report).
  2. Blob data (in cache_* tables) is not escaped correctly. PostgreSQL requires 'E' prefix before backslash-escaped values. So, database log is filled with warnings (2 lines on each cached element).
    Patch: drupal65-pgsql8x-patch2.diff.gz and and bug report (with same patch attached).
  3. MySQL truncates too long string values (longer than specified in DDL), that violates SQL standards (but more 'user friendly'). PgSQL refuses such records and raises an error. This problem occurs in many places, such as too long module name passed to watchdog() function, too long translated string in Locale module and so on. There is no easy way to fix it, you need to examine database logs and fix Drupal code step by step. I've fixed just one module name ('comment_subscribe' becomes 'comment_subscrib').

Contributed Drupal Modules

I use only ten 3rd-party modules. Most of them works under PgSQL without problems:

  1. Admin Block
  2. GeSHi Filter и GeSHi node
  3. Image, Image assist, Image Attach, ImageMagick Advanced Options
  4. Advanced Help
  5. Site Menu
  6. Taxonomy Menu
  7. CAPTCHA pack, Random CAPTCHA
  8. Tagadelic
  9. Views

Comment Subscribe Module changes

The Comment Subscribe is working with PgSQL only after heavy rewrite. This module uses many non-standard MySQL features such as if/ifnul (instead of case/coalesce), concat (instead of || operator) and multi-table update clause.
Patch: comment_subscribe-pgsql.diff and bug report.

MySQL to PgSQL Drupal Migration

There is simple: you need to convert database data structures, copy data, than change data source in drupal configuration. Sounds easy.

In real life it is not so easy:

  • Backup and Migrate module does not compatible with PgSQL. Amount of required changes is not small.
  • There is several mysql2pgsql conversion scripts does not do the job (I've tried several ones).

So, I opt to alternate way: create new empty site with same data structures, than move data by special migration script.

In Drupal any module may have own data tables in database. So real migration procedure is:

  1. Backup your existing site! Both code and database !!!
  2. Install Drupal from source, add patches (see above), add modules and module patches (if any). If you already have running Drupal site, you need only PgSQL-compatibility patches (see above).

    Also, your existing site should have separate directory under sites/, not just sites/default.
  3. Create any non-used DNS name (i.e. www-new.mysite.com), create new PostgreSQL database, than install Drupal for this hostname on fresh database. You can supply any site data on installation phase, this data will be overwritten on data copy phase.
  4. Enable all modules, enabled on migrating site. Module enabling will create all tables/sequences in your PostgreSQL database.
  5. Run Magic Migration Script (see below). Your new site now have all data copied (with exception of cache_* tables and locale).
  6. If you use Locale module, export your Locale data on source site and import on new site.
  7. Change data source ($db_url in settings.php) in your old site settings.php to new database.
  8. That's all, folks!

Note: You should have shell-access to your server.

Magic drupal-mysql2pgsql.pl Script

Download it here: drupal-mysql2pgsql.pl.gz (do not forget to gunzip it).

The script requires Perl 5.x (I use 5.8, but Perl 5.6 should work) with DBI, DBD:Pg and DBD::mysql libraries.

Usage: drupal-mysql2pgsql.pl source-db dest-db, source-db - source MySQL database,
dest-db - destination PostgreSQL.

You either should passwordless access to both databases or should change the script to add your username/password. This is one-time use script, so I have not added many command-line parsing code.

How it works:

  • Both databases are queried for list of tables. If any table exists in source DB, but not in destination one, the script will warn you.
  • All data in destination tables is erased.
  • Most data from source tables is copied into destination. Locale data and cache not copied.
  • All sequencer initialized to max(column);

All destination data changes is done within transaction.

The script is provided AS IS. It works for me, you can change it as you wish.

Comments

The best thing i've ever found

MySQL数据库迁移到PostgreSQL

Thanks a lot for this man !!! The script rocks. Finally my long awaited desire got completed. I converted my site to PGSQL. With PGSQL it has become much more fast than with MySQL.

Though I don't know perl, I managed to edit the thing to my requirements coz I know PHP well + Google ;)

Drupal core patches necessary?

MySQL数据库迁移到PostgreSQL

This howto refers to Drupal 6.5. There were the core patches necessary. But is this with Drupal 6.15 still true?
I'd prefer not to patch the core files because it could cause problems next time a Drupal update is on the schedule.

@nilesh.3892: Did you apply the patches?

@ Alex Tutubalin: Thanks for this howto!
Did your patch make it into the core?

No i didn't do so. I'm using

MySQL数据库迁移到PostgreSQL

No i didn't do so. I'm using 6.15 and not 6.5. I don't have any problems and performance is much better than MySQL on PG 8.4 and PHP 5.3 (mod_fastcgi)

How to make use of the script?

MySQL数据库迁移到PostgreSQL

Hi and thank you for the script, which I badly need to make use of!
Excuse me for my ignorance but as someone who knows no Perl and new to Postgresql there are a number of ambiguities which prohibited me to deploy the script. I hope you could clarify:

  • Is the patch still needed for Drupal 6.22?
  • How to make "passwordless accounts"?
  • If I want to use credentials in the script, how it can be done. Say I made a fresh install of Drupal with a Postgresql database 'PGDB' which can be used by 'PGUSER' with password 'PGPASS' and I have my original site in mysql 'MYDB' whose database credentials are 'MYUSER' and 'MYPASS' . So how can I implement these credentials in the database connect lines:

my $srcdb = DBI->connect("dbi:mysql:database=$ARGV[0]") or die;
my $destdb = DBI->connect("dbi:Pg:dbname=$ARGV[1]" ) or die;

And what will be my perl conversion command?

I really appreciate your further explanations.

Very limited script

MySQL数据库迁移到PostgreSQL

This script is quite naive, assumes 1:1 matching data types. It did not work for me as anything nontrivial like binary data (e.g. cache table) and geodata does not work this way.

ps. if you need schemas with password, just modify first lines to something like this:
my $srcdb = DBI->connect("dbi:mysql:database=$ARGV[0]","root","root")



Postgresql 8.3.x warnings on improperly escaped blobs

Postgresql 8.3.x requires 'E' prefix before string literals with backslashed escapes (i.e. field = E'\n\r\t' instead of field='\n\r\t')

Without it LOTs of db warnings are produced by caching module.

Attached patch adds this prefix in db_encode_blob() function.

NOTE: this is only an idea. For general use, this code needs to be reworked with PgSQL version checks.

Files: 
Comment File Size Author
  MySQL数据库迁移到PostgreSQL drupal65-pgsql8x-patch2.diff 318 bytes Alex_Tutubalin

Comments

Any news on it? Will it be introduced to HEAD?

+Subscribing


3mysql的文章


w to make a proper migration from MySQL to PostgreSQL

Migrations are always horrible. Doesn't matter from which system to which other system. About what you should take care when you think about migration by the example of migration from MySQL to PostgreSQL.

Contents

 [hide

Why do you want to migrate?

There are different reasons why a user wants to migrate. Mostly for migrations from MySQL to PostgreSQL it is one of the following:

  • performance reasons
  • stableness
  • reliability
  • GIS
  • Replication issues
  • Cluster issues
  • High availability issues
  • localisation/globalisation issues
  • need of object relational features
  • sick of bugs
  • costs for license (dual license, GPL)
  • license (BSD instead of GPL)

Check profitability

First of all you should check if a migration really makes sense.

Just imagine you were a developer of live online games. You developed the game three or five years ago by using MySQL. All worked fine at the beginning but today you have more then a million users and 4000 queries per second. You already optimised all what you can optimise at and for MySQL but you still often have problems with the database server.

Consider, PostgreSQL isn't able to handle this huge load out of the box too. You will have other problems then with MySQL but you will have problems too. A huge effort on optimisation is necessary for PostgreSQL here too. This would mean you have to optimise your database for PostgreSQL and you have to optimise your full source code for PostgreSQL. This would cost you half of a year work until you will have an advantage by using PostgreSQL.

Now consider the life time of your game. Usually such kinds of games have a life time from just a few years. Your game already reached the highest point in its life. It will be one of the top live online games maybe this year and maybe next year but not longer. So it just makes no sense to make such huge changes here.

It will be better to let the old game on the old system and start the new project (a new game) directly by using PostgreSQL.

How long need a migration?

Just dumping the database and change the different SQL stuff from the system to PostgreSQL related SQL and then import all into PostgreSQL often isn't good enough. You will have the same problems as before or different problems which are not better then the old problems. When you wanted to migrate for performance reasons it also could happen that PostgreSQL will have an inferior performance then MySQL has had.

This means you have to change lots of other stuff. I have never seen a migration that was faster then 3 months. Usually, you will need 3 months up to half a year until all will work proper and you will be lucky with the new system.

Use the advantages of the system

Usually, when you want to migrate then you want to use the advantages of the system. PostgreSQL has different advantages then MySQL and you have to do different stuff when you want to get a benefit of the advantages. Let me show you some examples:

At MySQL you often have to use a bad db design to avoid joins and subqueries (because of bad performance here).Indexing is just implemented rudimentary. You need different storage engines for full text search and transaction handling. Trigger handling especially before trigger is ugly and buggy. Procedures and functions only possible in SQL. Not Null behaviour isn't familiar. Non DDL/DCL transactions. Non check constraints. Often you will find usage of the advantages of known MySQL bugs.

PostgreSQL also has an excellent performance by using a good database design. Joins and subqueries work fast. You can choose between several index algorithm and indexing works as expected. You also have a good performance on DDL indexing stuff. You only have one storage engine. You won't have problems with triggers, procedures/functions. Also you can write procedures/functions in almost every language. Not Null behaviour is familiar. You will have DDL/DCL transactions and check constraints and there are non known bugs at the moment for PostgreSQL. You also can create rules when you need them as advantage.

You will find two different philosophies outside:

Often from MySQL users you will hear: Put most of database logic into the application.

Often from PostgreSQL users you will hear: Put database logic in the database.

My own opinion is: Put database logic in the database, it doesn't matter which system. Unfortunately, with MySQL sometimes you just have to put it into the application because of missing feature or other disadvantages.

You have lots of features at PostgreSQL where for MySQL not even has source code. So when you want to have an advantage of a migration to PostgreSQL you should use this features as well.

MySQL hierarchy: Database/Schema -> Table -> Column

PostgreSQL hierarchy: Database -> Schema -> Table -> Column

When you will ask MySQL here you often will get the answer: our databases are not real databases. What we call databases is the same as schema at PostgreSQL. What PostgreSQL is calling database is called an instance in MySQL.

When you ask me or some other PostgreSQLer here you will get the answer: they just have one less hierarchy level. Database is database and table is table they just don't have the schema level.

Anyway, when you want to migrate of course you can use these 4 hierarchy levels too and not only three.

Because MySQL has so many bugs you often find workarounds for bugs in application code. Of course this has to be repaired when you want to have an advantage on PostgreSQL.

Data types is the next topic. Neither timestamp nor datetime from MySQL work in the same way as timestamp in PostgreSQL. Also text/varchar/char is different. Looking to performance just consider, that you can't use indexes on data type text in MySQL. You can use them in PostgreSQL. Also you won't have different performance for text, varchar and char.

Summary

To make a proper migration:

  1. Think about the database, the design. Think about it independent from what already exist. Re-design the database and use the advantages from PostgreSQL.
  2. Re-design the application/software

Lots of work ...

But this is the best way for migration from MySQL to PostgreSQL. Often migration isn't profitable and it is better to stay on MySQL and just design the new/upcoming project with PostgreSQL.

If profitability says a migration makes sense and you really do all this work then you will be happy and won't have further problems in nearer future.


4,

Skip to end of metadata Go to start of metadata

JIRA's data can be migrated from one database to:

  1. A different database on the same database server,
  2. The same database type on a different server (e.g. from one PostgreSQL server to another PostgreSQL server) or
  3. A different type of database server (e.g. from a MySQL server to a PostgreSQL server).

To do this, follow the appropriate procedure:

Migrating JIRA's data to the same type of database

Use this procedure to migrate JIRA's data to:

  • A different database on the same database server, or
  • The same database type on a different database server (e.g. from one PostgreSQL server to another PostgreSQL server).

To migrate JIRA's data to the same type of database:

  1. Use your database server's native tools to either:
    • Copy your JIRA database to a new database on the same database server installation, or
    • Copy/migrate your JIRA database to a new database of the same type on a different database server installation.
      MySQL数据库迁移到PostgreSQL Please Note:
      • If you are unable to do either of these tasks, use the Migrating JIRA's database to a different type of database server procedure (below) instead.
      • You could use this procedure to migrate JIRA's data to a different type of database server (e.g. MySQL to PostgreSQL). However, you would need to find tools that support these processes. Furthermore, Atlassian does not provide support for this strategy.
  2. Once your new database has been populated with JIRA's data, shut down your JIRA server.
  3. Make a backup of your JIRA Home Directory and JIRA Installation Directory.
  4. Reconfigure your JIRA server's connection to your database:
    • If you installed a 'Recommended' distribution of JIRA, you can use the JIRA Configuration Tool (by runningbin/config.sh (for Linux/Solaris) or bin\config.bat (for Windows) in your JIRA Installation Directory), which provides a convenient GUI that allows you to reconfigure JIRA's database connection settings.
    • If any of the following points applies to your situation, you need to manually configure the dbconfig.xml file in your JIRA Home Directory. Refer to the appropriate database configuration guide in the Connecting JIRA to a Database  section for the manual configuration instructions.
      • You are using JIRA WAR
      • You have a console-only connection to your JIRA server
      • You would prefer to configure your database connection manually (for custom configuration purposes).

Migrating JIRA's data to a different type of database server

Use this procedure to migrate JIRA's data to a different type of database server (e.g. from a MySQL server to a PostgreSQL server).

MySQL数据库迁移到PostgreSQL You can also use this procedure if your JIRA installation is currently using the internal HSQL database (which is only supported for evaluating JIRA) and you need to switch your JIRA installation across to using a  supported database (which are supported for JIRA installations used in a production environment).

  1. Create an export of your data as an XML backup. See Backing Up Data for details.
    MySQL数据库迁移到PostgreSQL Please note that JIRA's XML backup utility does not back up attachments (if you have attachments enabled).
  2. Create a new database on your new database server to house JIRA's data. See the appropriate database configuration guide in the Connecting JIRA to a Database section for the database creation instructions.
  3. Shut down your JIRA server.
  4. Make a backup of your JIRA Home Directory and JIRA Installation Directory.
  5. Delete the dbconfig.xml file in your JIRA Home Directory.
  6. Restart JIRA and you should see the first step of the JIRA Setup Wizard for configuring your database connection.
  7. Configure JIRA's connection to your new database (created in step 2 above) and click the 'Next' button.
  8. On the 'Application Properties' setup page, click the 'import your existing data' link and restore your data from the XML backup created in step 1 above.
    MySQL数据库迁移到PostgreSQL

7 Comments

  1. dbconfig.xml does not exist so this could be an actual problem 

  2. Anonymous

    Check here.... on a linux install

    /var/atlassian/application-data/jira

  3. Anonymous

    Here in windows: Program Files\Atlassian\Application Data\JIRA

  4. Using this proccess obtain a lot of errors like:

     

    /secure/SetupImport.jspa [jira.action.admin.OfbizImportHandler] Exception importing entity: org.ofbiz.c
    ore.entity.GenericEntityException: while inserting: [GenericEntity:Status][id,10036][description,ANY ARBITRARY TEXT][sequence,43][name,
    ANY ARBITRARY TEXT LONGER THAN 60 CHARACTERS][iconurl,/images/icons/status_generic.gif] (SQL Exception while executing the following:INSERT INTO issuestatus (ID, SEQUENCE, pname, DESCRIPTIO
    N, ICONURL) VALUES (?, ?, ?, ?, ?) (ORA-12899: value too large for column "JIRA4_USER"."ISSUESTATUS"."PNAME" (actual: 63, maximum: 60)
    ))

     

     Moving from Oracle to Oracle (I am not allowed to capy Oracle scheme).

    Any known issue?

    The only one information I found about it, is that the xml was corrupted ( Export was done twice with quite similar results) and recommends that should fix it manually, the xml is quite big and could be really painfull try to fix it.

    Any help will be apprecciated.

  5. Quick question: Does the first workflow also cover the scenario of upgrading a database, e.g. postgres 8.4 to 9.0?

    If I migrate the database with the native postgres migration tools into a new 9.0 database, do I only have to update dbconfig.xml to point to this new location? 

  6. Anonymous

    How can I tell if I am using the internal HSQL database? 

     

    1. Anonymous

      Please ignore the comment about the internal HSQL database