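MySQL Operations in Practice (5.6): How Character Set Settings Affect mysqldump
mysqldump without a character set specified
Without an explicit character set, this run used utf8 (see the SET NAMES utf8 line in the dump header). The default depends on the environment: the header shows that the client is mysqldump from the 5.7.39 distribution, which defaults to utf8, whereas mysqldump from MySQL 8.0 defaults to utf8mb4.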
$ mysqldump -uroot test test_load > test_dump.sql

$ more test_dump.sql
-- MySQL dump 10.13  Distrib 5.7.39, for osx10.17 (x86_64)
--
-- Host: localhost    Database: test
-- ------------------------------------------------------
-- Server version 8.0.31

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

$ grep INSERT test_dump.sql | od -t x1
0000000 49 4e 53 45 52 54 20 49 4e 54 4f 20 60 74 65 73
0000020 74 5f 6c 6f 61 64 60 20 56 41 4c 55 45 53 20 28
0000040 27 e5 88 97 e5 88 97 e5 88 97 e5 88 97 e5 88 97
0000060 41 41 41 27 2c 27 e5 88 97 e5 88 97 e5 88 97 e5
0000100 88 97 e5 88 97 41 41 41 27 29 3b 0a
0000114
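To confirm which character set a dump was written with, without paging through the whole header, the SET NAMES line can be extracted directly. The following is only a minimal sketch: the function name dump_charset is hypothetical (it is not part of MySQL), and it assumes the header format shown in the dump above.

```shell
# Print the character set declared by the first SET NAMES line of a dump file.
# dump_charset is a hypothetical helper name, not a MySQL tool.
dump_charset() {
    # Keep only the word that follows "SET NAMES", then stop at the first hit.
    sed -n 's/.*SET NAMES \([A-Za-z0-9_]*\).*/\1/p' "$1" | head -n 1
}
```

For example, `dump_charset test_dump.sql` would be expected to print utf8 for the dump shown above.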
mysqldump with a character set specified
$ mysqldump --default-character-set gbk -uroot test test_load > test_dump_gbk.sql

$ more test_dump_gbk.sql
-- MySQL dump 10.13  Distrib 5.7.39, for osx10.17 (x86_64)
--
-- Host: localhost    Database: test
-- ------------------------------------------------------
-- Server version 8.0.31

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES gbk */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `test_load`
--

DROP TABLE IF EXISTS `test_load`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `test_load` (
  `c_gbk` varchar(100) CHARACTER SET gbk COLLATE gbk_chinese_ci DEFAULT NULL,
  `c_utf8` varchar(100) CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
/*!40101 SET character_set_client = @saved_cs_client */;

$ grep INSERT test_dump_gbk.sql | od -t x1
0000000 49 4e 53 45 52 54 20 49 4e 54 4f 20 60 74 65 73
0000020 74 5f 6c 6f 61 64 60 20 56 41 4c 55 45 53 20 28
0000040 27 c1 d0 c1 d0 c1 d0 c1 d0 c1 d0 41 41 41 27 2c
0000060 27 c1 d0 c1 d0 c1 d0 c1 d0 c1 d0 41 41 41 27 29
0000100 3b 0a
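The two od listings differ only in the bytes written for 列: e5 88 97 in the utf8 dump and c1 d0 in the gbk dump, for both columns regardless of their column character set. A small helper makes that comparison easy to repeat. This is a sketch under assumptions: insert_hex is a hypothetical name, and it uses sed instead of grep because some grep builds report non-UTF-8 input as a binary file instead of printing the matching line.

```shell
# Print the bytes of every INSERT line in a dump as space-separated hex,
# all on one line. insert_hex is a hypothetical helper name.
insert_hex() {
    # LC_ALL=C makes sed treat the file as raw bytes, whatever its encoding.
    LC_ALL=C sed -n '/^INSERT/p' "$1" | od -An -v -t x1 | tr -s ' \t\n' ' '
}
```

Running `insert_hex test_dump.sql` and `insert_hex test_dump_gbk.sql` side by side should reproduce the byte difference shown above.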
When backing up and restoring data with mysqldump, it is recommended to specify utf8mb4 as the character set.
utf8mb4
Test data
mysql> CREATE TABLE `test_dump` (
  `a` varchar(100) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

mysql> insert into test_dump values (unhex('f09f9882')),('列');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select a, hex(a) from test_dump;
+------+----------+
| a    | hex(a)   |
+------+----------+
| ?    | F09F9882 |
| 列   | E58897   |
+------+----------+
2 rows in set (0.00 sec)
Export the data
$ mysqldump --default-character-set utf8mb4 -uroot test test_dump > test_dump_utf8mb4.txt
$ mysqldump --default-character-set utf8 -uroot test test_dump > test_dump_utf8mb3.txt
Inspect the files
$ grep INSERT test_dump_utf8mb4.txt
INSERT INTO `test_dump` VALUES ('😂'),('列');

$ grep INSERT test_dump_utf8mb3.txt
INSERT INTO `test_dump` VALUES ('?'),('列');
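If default-character-set is not set to utf8mb4, characters that need four bytes in UTF-8 (such as the emoji F09F9882 above) cannot be exported: they are written as ? and the original data is lost. The 3-byte character 列 is unaffected. For this reason utf8mb4 is the recommended setting.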
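To judge whether a UTF-8 file contains any characters that utf8 (utf8mb3) cannot represent, one option is to count 4-byte sequences. In valid UTF-8, every 4-byte sequence starts with a byte in the range f0 to f4, so counting those bytes counts such characters. The following is a rough sketch that assumes the input really is valid UTF-8; count_4byte_chars is a hypothetical name.

```shell
# Count UTF-8 characters that need four bytes (lead byte f0..f4) in a file.
# count_4byte_chars is a hypothetical helper name; input is assumed to be valid UTF-8.
count_4byte_chars() {
    # One hex byte per line, then count the lines that are 4-byte lead bytes.
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
    od -An -v -t x1 "$1" | tr -s ' \t\n' '\n' | grep -c '^f[0-4]$' || true
}
```

On test_dump_utf8mb4.txt this would be expected to report 1 (the 😂 row), assuming the rest of the dump is ASCII or 3-byte text; a non-zero count is a sign that dumping with utf8 would lose data.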
Summary
When backing up data with mysqldump, specify the utf8mb4 character set so that no data is lost during export because it cannot be encoded.