[EagerLoad] Which is more useful: leftJoin or with? [N+1 problem]


I'm Enoki from the System Development Department, and my seniors and superiors have been telling me recently that
"anyone who has never written a raw query should be seriously concerned about this." (By the way, I'm now in charge of MySQL training for new graduates, so I'm seriously concerned about this.)

N+1 problem

(Leaving my story aside) Let
's say the following relationships were set up.

class Bolg { public function comments(): HasMany { return $this->hasMany(Comment::class); } }

Get the Bolg model

$blogs= Bolg::query()->first();

Here the following query is issued:

"select * from `blogs`"

By looping further, we can get all the Comment models associated with each Blog model

$comments= $blogs->map(function ($blog) { return $blogs->comments->first(); });

Naturally, since it is a loop, the following query will be executed as many times as there are chapters

"select * from `blogs` where `blogs`.`blog_id` = ? and `blogs`.`blog_id` is not null"

The initial query to retrieve all records is executed once + N queries are issued.
This is the 1+N or N+1 problem.

The problem is that, naturally, as many queries as there are chapters, the load is increased and this is simply a problem

join and with avoid N+1

We have now reached the main point

As mentioned above, issuing a query each time is extremely inefficient, so you should use (left)join or with.
(By the way, in some cases it may be better to retrieve each time, so choose carefully.)
Each of these can be used to prevent a large number of queries from being sent.
However, they each do something different.

What is join?

The join is an inner join, which
takes the data that is common between the tables and eliminates the rest.

leftJoin performs a Left Outer Join
, returning all rows from the left table and any matching rows from the right table.
You get the data that matches the join condition while retaining all the data from the left table.

These joins are a feature of SQL.
They join multiple tables within a query.
In terms of queries, they look like this:

0 => array:3[ "query" => "select * from `blogs` inner join `comments` on `blogs`.`id` = `blog_id`" "bindings" => [] "time" => 1.5 ],

It's just one query!
That's a huge improvement!

So what about with?

with

The with query
runs two times, which is worse than join , but it's a huge improvement over the N+1 loops.

By the way, the string that is the argument to with is the name of the relationship definition method in the model file.
(It is essentially the same as the name of the related model, but if you don't define it as a method, you won't be able to use with, so be careful.)

$blogAndComments= Blog::query() ->with('comments') ->first();

Looking at the query, we can see that using whereIn allows us to retrieve all the data at once

0 => array:3[ "query" => "select * from `blogs`" "bindings" => [] "time" => 0.5 ], 1 => array:3[ "query" => "select * from `blogs` where `blogs`.`blogt_id` in (?)" "bindings" => [] "time" => 1.5 ],

Characteristics and uses of each

We obtained the model using two methods, so let's take a look at the characteristics of each

● (left) Join
Combines models based on join conditions.
Join is an SQL function used to retrieve data from multiple tables at once.
When retrieving, only the amount of data that matches the join conditions is created .
In this example, you can think of the final data being created as the same number as the Comment model.
If multiple tables are joined, invalid data may be generated.

However, the ability to obtain the data all at once is attractive, and
it is ideal when it is not required as a model or when simple, flat data is desired.

● With
retrieves models in the form of an associative array without combining them.
pre-loading the model relationships in ORM , it solves the N+1 problem.

In the example, the Comment model associated with the Bolg model will exist under the "relations" property.
Models can be nested and retrieved in the hierarchical order of their parent-child relationships, so models are not merged.
Related models and related data are retrieved all at once, eliminating redundant data.
This is also useful when you want to treat the retrieved data as a model as is.

By the way, with can be used to retrieve relationships as long as they continue, such as (ModelA.ModelB.ModelC.), but since nesting means that all passed models will be included as outer shells, you will eventually need to remove them by unset or similar

Summary and digression

Now that all the information is out, I'll summarize it and then add a little digression before concluding

summary

What do you think?
There was a lot of information, so I put it in a table.

I don't know which one is truly useful!!!
The result is roughly speaking retrieving data from multiple tables, but what they do is very different.

So, the general conclusion is, "Change the acquisition method according to the data format you want."
The easiest way to tell is to determine
whether you want the model in its original form or whether it's ok to combine it Or, you can use either the nested or flat form depending on which one is easier to use in future processing

In addition, I think it should
be changed depending on the total amount of data you want to retrieve Since with loads relations, the deeper the nesting, the slower it will be than join .
In other words, it depends.

A side note (EagerLoad)

I would like to write about EagerLoad.
You may have already understood by now that it basically means that you don't want to issue SQL every time.

This is a bit boring, so let me go into more detail
"EagerLoad is only available for dynamic properties"That's what it means.
(Maybe that doesn't make sense)
First, let's talk about the difference between dynamic properties and relationship methods.
That is to say,

$blog= new Blog; $blog->comments;

This is the difference

$blog= new Blog; $blog->comments();

->comments is a dynamic property that returns a Collection and
has the property "lazy loading," meaning that the related data is loaded only when accessed,
whereas ->comments() returns a related object.

You cannot chain query builders together with dynamic properties, but

you can chain them like this: ->comments()->where('create_user_id', 1)

So it seems better to retrieve it as a related object.
Here, remember the properties of EagerLoad such as with.

Yes, you have issued a query beforehand .
Since the data has been retrieved (in the relations property),
the query will not be executed when you write $blog->comment->create_user_id .

This
means that "EagerLoad cannot be used unless the property is dynamic", or in other words,
, the subsequent processing must also be dynamic properties, otherwise it will be meaningless!!" It's a paradoxical situation

So how do you apply conditions, restrictions and searches when retrieving data?!

$blog= Bolg::with(['comments' => function ($query) { $query->where('create_user_id', 1); }])->first(); $comments= $blog->comments;

By doing this, you can combine EagerLoad with conditions, constraints, and searches.
By the way, if you have conditions that you think you will use often, you can make them into methods in the model file, which will improve readability and be more convenient.

The end

I hope this article will make access to databases around the world a little easier

If you found this article helpful , please give it a like!
6
Loading...
6 votes, average: 1.00 / 16
848
X facebook Hatena Bookmark pocket

The person who wrote this article

About the author

Enoki

I play everything from FPS, RPG, MMO, crafting, etc.