Thinking_Out_Loud

MySQL中的GROUP_CONCAT()

2018-09-28

背景

今天要在负责维护的内部系统中做一次数据迁移(心跳加速),表结构不变,而是改变一些字段的存取方式。第一次用到了MySQL里的GROUP_CONCAT()函数,在这里记录一下。

准备测试表&数据

下面是简单的测试用表:

1
2
3
4
5
6
7
8
CREATE TABLE `tab_test` (
`id` mediumint(8) NOT NULL AUTO_INCREMENT,
`pid` mediumint(8) NOT NULL,
`version` char(8) NOT NULL DEFAULT '',
`str_1` varchar(20) NOT NULL DEFAULT '',
`str_2` varchar(20) NOT NULL DEFAULT '',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

准备测试数据:

1
2
3
4
5
6
INSERT INTO `tab_test` VALUES(1,1,'v1',"Joe","Tag1");
INSERT INTO `tab_test` VALUES(2,1,'v1',"Ben","Tag2");
INSERT INTO `tab_test` VALUES(3,1,'v2',"Lily","Tag3");
INSERT INTO `tab_test` VALUES(4,2,'v1',"Alice","Tag4");
INSERT INTO `tab_test` VALUES(5,2,'v2',"Buddy","Tag5");
INSERT INTO `tab_test` VALUES(6,2,'v1',"Coco","Tag6");

具体的需求就是按pid分组且按id增序,把每组的str_1str_2合并到该组第一条记录的str_1中,值得注意的是,version不同的记录不能合并到一起。

刚开始还在考虑是不是需要写py脚本来操作,不过还好有GROUP_CONCAT(),现在只要写SQL就足够了。

用法示例

来看看GROUP_CONCAT()的用法和示例:

1
2
3
4
GROUP_CONCAT([DISTINCT] expr [,expr ...]
[ORDER BY {unsigned_integer | col_name | expr}
[ASC | DESC] [,col_name ...]]
[SEPARATOR str_val])

两种用法都可以:

1
2
3
4
5
6
7
8
9
10
SELECT student_name,
GROUP_CONCAT(test_score)
FROM student
GROUP BY student_name;

SELECT student_name,
GROUP_CONCAT(DISTINCT test_score
ORDER BY test_score DESC SEPARATOR ' ')
FROM student
GROUP BY student_name;

撸起袖子就是干

依葫芦画瓢先试试效果:

1
2
3
4
5
6
7
SELECT `id`,GROUP_CONCAT(
`str_1`
ORDER BY `id`
SEPARATOR ','
) `merged_str_1`
FROM `tab_test`
GROUP BY `pid`,`version`;

输出结果也很棒棒,合乎预期,按pidversion分组然后拼接str_1到每组第一条记录:

1
2
3
4
5
6
7
8
+----+--------------+
| id | merged_str_1 |
+----+--------------+
| 1 | Joe,Ben |
| 3 | Lily |
| 4 | Alice,Coco |
| 5 | Buddy |
+----+--------------+

继续拼装:

1
2
3
4
5
6
7
8
9
10
UPDATE `tab_test` t1,(
SELECT `id`,CONCAT_WS(
',',
TRIM(',' FROM GROUP_CONCAT(`str_1` ORDER BY `id` SEPARATOR ',')),
TRIM(',' FROM GROUP_CONCAT(`str_2` ORDER BY `id` SEPARATOR ','))
) `merged_str_1`
FROM `tab_test`
GROUP BY `pid`,`version`) t2
SET t1.str_1 = t2.merged_str_1
WHERE t1.id = t2.id;

更新后的表:

1
2
3
4
5
6
7
8
9
10
+----+-----+---------+----------------------+-------+
| id | pid | version | str_1 | str_2 |
+----+-----+---------+----------------------+-------+
| 1 | 1 | v1 | Joe,Ben,Tag1,Tag2 | Tag1 |
| 2 | 1 | v1 | Ben | Tag2 |
| 3 | 1 | v2 | Lily,Tag3 | Tag3 |
| 4 | 2 | v1 | Alice,Coco,Tag4,Tag6 | Tag4 |
| 5 | 2 | v2 | Buddy,Tag5 | Tag5 |
| 6 | 2 | v1 | Coco | Tag6 |
+----+-----+---------+----------------------+-------+

那么剩下的就是单纯的删除工作了。

Tags: MySQL