Is that query okay? I tracked down the behavior of the REPLACE clause, which you should be aware of

Hello. I'm Mandai, the Wild Team member on the development team.
This time, I'd like to take a closer look at the behavior of the REPLACE clause, which you may use every day without even thinking about it.

How do you update the data?

Our company uses MySQL as the main RDBMS for system development.
Normally, when updating a table with a unique constraint,

  • SELECT and check for duplicates, then branch to INSERT or UPDATE
  • If you try to INSERT and find a duplicate, catch the exception and proceed to UPDATE processing

You may be forced to choose between the above two options, but in fact (and not really) there is also an update process using the REPLACE clause

If you execute this REPLACE clause thinking that it simply replaces data as it literally says, it may result in data inconsistencies and lead to unexpected situations

Let's look at a simple example to see how it works

Let's see it in action

First, create a table like this:

mysql> CREATE TABLE `users` (`id` INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, `name` VARCHAR(16) NOT NULL UNIQUE, `level` INT UNSIGNED NOT NULL DEFAULT 1); Query OK, 0 rows affected (0.02 sec) mysql> DESC `users`; +-------+-----+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------+------+------+-----+---------+----------------+ | id | int(10) unsigned | NO | PRI | NULL | auto_increment | | name | varchar(16) | NO | UNI | NULL | | | level | int(10) unsigned | NO | | 1 | | +-------+------------------+------+-----+---------+----------------+ 3 rows in set (0.00 sec)

The name field is set as a unique key.
Next, we will input some simple dummy data.

This time, I tried generating dummy data using generatedata.com

I think the atmosphere will be something like this:

mysql> INSERT INTO `users` (`name`,`level`) VALUES ("Pamela",8),("Dexter",4),("Ava",1),("Kelly",8),("Leila",2),("Quin",8),("Salvador",8),("Orli",2),("Ignacia",9),("Brennan",8); Query OK, 10 rows affected (0.00 sec) Records: 10 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM `users`; | Quin | 8 | | 7 | Salvador | 8 | | 8 | Orli | 2 | | 9 | Ignacia | 9 | | 10 | Brennan | 8 | +----+----------+-------+ 10 rows in set (0.00 sec)

Now, let's try inserting a record with a duplicate name

mysql> INSERT INTO `users` (`name`,`level`) VALUES ("Pamela",1); ERROR 1062 (23000): Duplicate entry 'Pamela' for key 'name' mysql> SELECT * FROM `users`; +----+----------+-------+ | id | name | level | +----+----------+-------+ | 1 | Pamela | 8 | | 2 | Dexter | 4 | | | 3 | Ava | 1 | | 4 | Kelly | 8 | | 5 | Leila | 2 | | 6 | Quin | 8 | | 7 | Salvador | 8 | | 8 | Orli | 10 rows in set (0.00 sec)

An error occurred because of a duplicate.
This is expected behavior.

Normally, you would use the UPDATE clause here to update the record

mysql> UPDATE `users` SET `level`=10 WHERE `name` = "Pamela"; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> select * from users; +----+----------+-------+ | id | name | level | +----+----------+-------+ | 1 | Pamela | 10 | | 2 | Dexter | 4 | | 3 | Ava | 1 | | 4 | Kelly | 8 | | 5 | Leila | 2 | | 6 | Quin | in set (0.00 sec)

Now let's try updating a record using the REPLACE clause

mysql> REPLACE INTO `users` (`name`,`level`) VALUES ("Pamela",1); Query OK, 2 rows affected (0.00 sec) mysql> SELECT * FROM `users`; 3 | Ava | 1 | | 4 | Kelly | 8 | | 5 | Leila | 2 | | 6 | Quin | 8 | | 7 | Salvador | 8 | | 8 | Orli | 2 | | 9 | Ignacia | 9 | sec)

At this point, did you notice that something scary has happened?
Yes, somehow the primary key of the record with name = "Pamela" has been changed!
Let's take a closer look at this fact, which is as scary as a Junji Inagawa ghost story.

The REPLACE clause is not just a replacement

This fact is clear from the previous experiment, but why did this happen?
By the way, the reason why 11 is missing from id is because the INSERT statement was executed, and this is not the subject of this ghost story.

In short, the reason for this is

The REPLACE clause "INSERTs a record with the same data after a DELETE."


It's scary.
The idea that the primary key changes automatically is scary.
If you use it frequently, thinking that it's a convenient query that automatically determines whether to INSERT or UPDATE, you
may not have stepped on any landmines, but before you know it, you may find yourself in a sea of ​​landmines.

However, this phenomenon occurs when

  • Has an auto-incremented primary key
  • At the same time, there is a field that is a unique key
  • When a unique key is duplicated

I
think the correct answer is that they didn't implement the REPLACE clause after knowing its behavior.

There's a similar syntax called "INSERT ... ON DUPLICATE KEY UPDATE," but
it's not safe to use with statement-based replication or tables with composite primary (unique) keys
, so I think its usage is limited.
Also, many framework O/R mappers don't support it out of the box (because it's a MySQL dialect, not standard SQL), so if you use it, you won't get the benefits of the O/R mapper.
That said, there are reports that the ON DUPLICATE KEY UPDATE syntax can be processed much faster than SELECT → INSERT (UPDATE), so
if your MySQL system is taking a long time to process data updates, it might be worth trying.

MySQL 5.6 series Japanese documentation
MySQL 5.7 series documentation (English)
(The contents are roughly the same in the English version of the 5.7 series documentation)

Also, INSERT will unnecessarily count up the auto-increment, which may
lead to the tragedy of running out of primary keys.
If you want to be on the safe side, I think SELECT → INSERT (UPDATE) is the safest option.

Wow, generatedata.com is really useful.

If you want to consult with a development professional

At Beyond, we combine our extensive track record, technology, and know-how in system development with OSS technology and cloud technologies such as AWS to provide contracted development of web systems with reliable quality and excellent cost performance

We also handle server-side/back-end development and proprietary API collaboration development, making full use of our technology and know-how in building and operating web system/application infrastructure for large-scale, high-load games, applications, and digital content

If you have any problems with your development project, please visit the following website

● Web system development
● Server-side development (API/DB)

If you found this article useful, please click [Like]!
5
Loading...
5 votes, average: 1.00 / 15
23,875
X Facebook Hatena Bookmark pocket

The person who wrote this article

About the author

Yoichi Bandai

My main job is developing web APIs for social games, but I'm also grateful to be able to do a variety of other work, including marketing.
My portrait rights within Beyond are CC0.