Introducing the WHATWG URL API, officially implemented in Node.js 8.0!

2017.06.08

Web system development

Table of contents

1 What is the WHATWG URL API?
2 Behavior of URL objects
3 URLSearchParams class
4 Summary

Hello.
I'm Mandai, the development team's resident wild one.

A little while ago, on May 30, 2017, Node.js 8.0.0 was released.
This version bundles npm 5.0.0, whose caching code has been rewritten and is reportedly faster.

A tweet comparing its speed with earlier versions has also been posted; in that example, installation appears to finish in about one fifth of the time taken by previous versions.

With #npm5 about to come out, I thought I'd update those benchmarks.

Here's the npm5 code I'm working on, vs npm@4.6.1 on a popular repo pic.twitter.com/KWPfbpE46p

— ✨11x gayer Kat✨ (@maybekatz) May 19, 2017

Node.js 8.0 currently ships with V8 5.8, but it is said to be compatible with V8 5.9 and V8 6.0, so further speedups can be expected as the V8 engine is upgraded in future releases.
→ Node.js 8.0 released. npm 5.0 bundle, Node.js API included, official support for WHATWG URL parser, etc. - Publickey

This time, I would like to take a look at the WHATWG URL API, which was officially implemented in Node.js 8.0.

What is the WHATWG URL API?

Actually, the WHATWG URL API has existed since the Node.js 7 series, but it became the official version in 8.0.0.
Many people may have already used it, but since it was in an "experimental" position, they may have been a little hesitant to use it in a production environment.

This API is intended to standardize URL parsing and is provided as an addition to the existing url module.

    const URL = require('url').URL;
    const beyondUrl = new URL('http://www.beyondjapan.com/?abc=123&xyz=999#first');

    console.log(beyondUrl);

    // Result
    /*
    URL {
      href: 'http://www.beyondjapan.com/?abc=123&xyz=999#first',
      origin: 'http://www.beyondjapan.com',
      protocol: 'http:',
      username: '',
      password: '',
      host: 'www.beyondjapan.com',
      hostname: 'www.beyondjapan.com',
      port: '',
      pathname: '/',
      search: '?abc=123&xyz=999',
      searchParams: URLSearchParams { 'abc' => '123', 'xyz' => '999' },
      hash: '#first'
    }
    */

Of course, the existing url.parse() is still available and works the same as before.

    const url = require('url');
    const beyondUrl = 'http://www.beyondjapan.com/?abc=123&xyz=999#first';

    console.log(url.parse(beyondUrl));

    // Result
    /*
    Url {
      protocol: 'http:',
      slashes: true,
      auth: null,
      host: 'www.beyondjapan.com',
      port: null,
      hostname: 'www.beyondjapan.com',
      hash: '#first',
      search: '?abc=123&xyz=999',
      query: 'abc=123&xyz=999',
      pathname: '/',
      path: '/?abc=123&xyz=999',
      href: 'http://www.beyondjapan.com/?abc=123&xyz=999#first'
    }
    */

Confusingly, the class names differ only in capitalization (URL versus Url), and the contents of the returned objects differ slightly as well.
A convenient feature of the WHATWG URL API is that the query string comes back already parsed, under the searchParams key.

This alone makes you want to use it.
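As a rough side-by-side, the sketch below reads the same parameter through both APIs. It assumes the legacy url.parse() is called with true as its second argument, which asks it to parse the query string into a plain object; the variable names are only for illustration.

```javascript
const url = require('url');
const { URL } = url;

const input = 'http://www.beyondjapan.com/?abc=123&xyz=999#first';

// Legacy API: with true as the second argument, query becomes a plain object.
const legacy = url.parse(input, true);
console.log(legacy.query.abc); // 123

// WHATWG API: searchParams is always a URLSearchParams object.
const whatwg = new URL(input);
console.log(whatwg.searchParams.get('abc')); // 123
```

With the legacy API you have to opt in to query parsing, while the WHATWG URL object always carries searchParams.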

Behavior of URL objects

Each component of the URL object returned by the WHATWG URL API can also be accessed as a property.

    const u = 'http://www.beyondjapan.com/?abc=123&xyz=999#first';
    const URL = require('url').URL;
    const beyondUrl = new URL(u);

    console.log(beyondUrl.hostname);

    // Result
    // www.beyondjapan.com
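One thing to keep in mind when reading properties this way is that the URL constructor throws if the input cannot be parsed as an absolute URL. The following is a minimal sketch of a guard around it; tryParseUrl is a hypothetical helper name, not something the url module provides.

```javascript
const { URL } = require('url');

// Hypothetical helper: returns a URL object, or null when parsing fails.
function tryParseUrl(input) {
  try {
    return new URL(input);
  } catch (err) {
    // new URL() throws a TypeError for input it cannot parse.
    return null;
  }
}

const ok = tryParseUrl('http://www.beyondjapan.com/?abc=123&xyz=999#first');
console.log(ok.hostname); // www.beyondjapan.com
console.log(ok.pathname); // /
console.log(ok.hash);     // #first

const bad = tryParseUrl('not a url');
console.log(bad);         // null
```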

Try using a different host name.

    const u = 'http://www.beyondjapan.com/?abc=123&xyz=999#first';
    const URL = require('url').URL;
    const beyondUrl = new URL(u);

    beyondUrl.hostname = 'example.com';
    console.log(beyondUrl);

    // Result
    /*
    URL {
      href: 'http://example.com/?abc=123&xyz=999#first',
      origin: 'http://example.com',
      protocol: 'http:',
      username: '',
      password: '',
      host: 'example.com',
      hostname: 'example.com',
      port: '',
      pathname: '/',
      search: '?abc=123&xyz=999',
      searchParams: URLSearchParams { 'abc' => '123', 'xyz' => '999' },
      hash: '#first'
    }
    */

In Node.js 8.0, the hostname setter picks up only the host name, so even if you assign a value that includes a port number, as below, only the host name changes.

    const u = 'http://www.beyondjapan.com/?abc=123&xyz=999#first';
    const URL = require('url').URL;
    const beyondUrl = new URL(u);

    beyondUrl.hostname = 'example.com:443'; // Try adding the port number
    console.log(beyondUrl);

    // Result
    /*
    URL {
      href: 'http://example.com/?abc=123&xyz=999#first',
      origin: 'http://example.com',
      protocol: 'http:', // unchanged
      username: '',
      password: '',
      host: 'example.com',
      hostname: 'example.com',
      port: '', // unchanged
      pathname: '/',
      search: '?abc=123&xyz=999',
      searchParams: URLSearchParams { 'abc' => '123', 'xyz' => '999' },
      hash: '#first'
    }
    */

To change the port number, set the port property explicitly.

    const u = 'http://www.beyondjapan.com/?abc=123&xyz=999#first';
    const URL = require('url').URL;
    const beyondUrl = new URL(u);

    beyondUrl.port = 443;
    console.log(beyondUrl);

    // Result
    /*
    URL {
      href: 'http://www.beyondjapan.com:443/?abc=123&xyz=999#first',
      origin: 'http://www.beyondjapan.com:443',
      protocol: 'http:', // unchanged
      username: '',
      password: '',
      host: 'www.beyondjapan.com:443',
      hostname: 'www.beyondjapan.com',
      port: '443',
      pathname: '/',
      search: '?abc=123&xyz=999',
      searchParams: URLSearchParams { 'abc' => '123', 'xyz' => '999' },
      hash: '#first'
    }
    */

The host property behaves differently, however: assigning a value that includes a port number changes both the host name and the port.

    const u = 'http://www.beyondjapan.com/?abc=123&xyz=999#first';
    const URL = require('url').URL;
    const beyondUrl = new URL(u);

    beyondUrl.host = 'example.com:443';
    console.log(beyondUrl);

    // Result
    /*
    URL {
      href: 'http://example.com:443/?abc=123&xyz=999#first',
      origin: 'http://example.com:443',
      protocol: 'http:', // unchanged
      username: '',
      password: '',
      host: 'example.com:443', // changed
      hostname: 'example.com', // changed
      port: '443', // changed
      pathname: '/',
      search: '?abc=123&xyz=999',
      searchParams: URLSearchParams { 'abc' => '123', 'xyz' => '999' },
      hash: '#first'
    }
    */
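A related detail, sketched here with an https URL rather than the http one used above: when the port you assign is the default for the scheme (443 for https), the port is normalized away and reads back as an empty string. The port 8443 is just an illustrative value.

```javascript
const { URL } = require('url');

const u = new URL('https://www.beyondjapan.com/?abc=123');

// host accepts "hostname:port" in a single assignment.
u.host = 'example.com:8443';
console.log(u.hostname); // example.com
console.log(u.port);     // 8443

// 443 is the default port for https, so it is dropped from the URL.
u.port = '443';
console.log(u.port === ''); // true
console.log(u.href);        // https://example.com/?abc=123
```

This is why the earlier http example keeps port '443': 443 is not the default port for http, so it has to stay in the URL.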

URLSearchParams class

So far we have looked at the behavior of the URL object; next, let's look at the URLSearchParams class, an instance of which is available as the searchParams property of a URL object.
This class, implemented in the Node.js 7 series, parses a query string and provides getters and setters for it.

The official documentation has a section comparing it with the querystring module. The URLSearchParams class appears to be less flexible than querystring, so this does not make the querystring module unnecessary.
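One concrete example of that flexibility, as a small sketch: querystring.parse() accepts custom separator and assignment characters, while URLSearchParams always splits on & and =. The input string with ; and : is made up for illustration.

```javascript
const querystring = require('querystring');
const { URLSearchParams } = require('url');

const input = 'abc:123;xyz:456';

// querystring lets you choose the pair separator and the key/value delimiter.
const parsed = querystring.parse(input, ';', ':');
console.log(parsed.abc, parsed.xyz); // 123 456

// URLSearchParams only understands & and =, so the whole string becomes one key.
const params = new URLSearchParams(input);
console.log(params.get('abc'));  // null
console.log([...params.keys()]); // [ 'abc:123;xyz:456' ]
```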

The URLSearchParams class is exported from the url module, so it can also be used on its own.
That makes it a powerful class that can be used not only to parse query strings but also to build them.

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const qsObject = { abc: 123, xyz: 456, aaa: 789 };
    const qsIterable = [
      ['abc', 123],
      ['xyz', 456],
      ['aaa', 789],
    ];
    const qsMap = new Map();
    qsMap.set('abc', 123);
    qsMap.set('xyz', 456);
    qsMap.set('aaa', 789);
    function* qsGenerator() {
      yield ['abc', 123];
      yield ['xyz', 456];
      yield ['aaa', 789];
    }

    const params1 = new URLSearchParams(qs);            // From a regular query string
    const params2 = new URLSearchParams(qsObject);      // From a plain object
    const params3 = new URLSearchParams(qsIterable);    // From an iterable (array of pairs)
    const params4 = new URLSearchParams(qsMap);         // From a Map object
    const params5 = new URLSearchParams(qsGenerator()); // From a generator

    console.log(params1);
    console.log(params2);
    console.log(params3);
    console.log(params4);
    console.log(params5);

    // Result
    /*
    URLSearchParams { 'abc' => '123', 'xyz' => '456', 'aaa' => '789' }
    URLSearchParams { 'abc' => '123', 'xyz' => '456', 'aaa' => '789' }
    URLSearchParams { 'abc' => '123', 'xyz' => '456', 'aaa' => '789' }
    URLSearchParams { 'abc' => '123', 'xyz' => '456', 'aaa' => '789' }
    URLSearchParams { 'abc' => '123', 'xyz' => '456', 'aaa' => '789' }
    */

It is not a picky eater at all: it will happily take query strings, objects, arrays, Map objects, and generators.
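To illustrate the "generation" side, here is a minimal sketch that builds a query string and attaches it to a URL. The /search path and the parameter names are made up for the example.

```javascript
const { URL, URLSearchParams } = require('url');

// Build a query string from a plain object; values are converted to strings
// and spaces are encoded as "+".
const params = new URLSearchParams({ q: 'node js', page: 2 });
console.log(params.toString()); // q=node+js&page=2

// Attach it to a URL, then keep editing through searchParams.
const target = new URL('http://www.beyondjapan.com/search');
target.search = params.toString();
target.searchParams.append('sort', 'new');
console.log(target.href);
// http://www.beyondjapan.com/search?q=node+js&page=2&sort=new
```

Changes made through target.searchParams are reflected in target.href, so there is no need to reassemble the URL by hand.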

The created URLSearchParams object has various methods.

append adds a key/value pair

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    params.append('bbb', 963);
    console.log(params.toString());

    // Result
    // abc=123&xyz=456&aaa=789&bbb=963

delete removes a key

    const {URLSearchParams} = require('url');

    // The query string includes bbb so that there is something to delete.
    const qs = 'abc=123&xyz=456&aaa=789&bbb=963';
    const params = new URLSearchParams(qs);

    params.delete('bbb');
    console.log(params.toString());

    // Result
    // abc=123&xyz=456&aaa=789
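One detail worth a quick check: when a key appears more than once, delete removes every pair with that name, not just the first. A minimal sketch, using a made-up query string:

```javascript
const { URLSearchParams } = require('url');

const params = new URLSearchParams('a=1&b=2&a=3');

// Both "a" pairs are removed in one call.
params.delete('a');
console.log(params.toString()); // b=2
console.log(params.has('a'));   // false
```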

entries returns an iterator

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    for (let v of params.entries()) console.log(v);

    // Result
    /*
    [ 'abc', '123' ]
    [ 'xyz', '456' ]
    [ 'aaa', '789' ]
    */

forEach loops over all entries

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    params.forEach((value, key, p) => {
      console.log(value, key, p);
    });

    // Result
    /*
    123 abc URLSearchParams { 'abc' => '123', 'xyz' => '456', 'aaa' => '789' }
    456 xyz URLSearchParams { 'abc' => '123', 'xyz' => '456', 'aaa' => '789' }
    789 aaa URLSearchParams { 'abc' => '123', 'xyz' => '456', 'aaa' => '789' }
    */

get returns the value for the given key

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    console.log(params.get('abc'));

    // Result
    // 123

getAll returns all the values for the given key

To show how it differs from get, I prepared two samples.

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    console.log(params.getAll('abc'));

    // Result
    // [ '123' ]

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789&abc=777';
    const params = new URLSearchParams(qs);

    console.log(params.getAll('abc'));

    // Result
    // [ '123', '777' ]

A URLSearchParams object allows duplicate keys, which is why the getAll method is provided.

Incidentally, get is specified to return only the first value registered for a duplicated key, so the later values can be reached only through getAll or by looping over the object.
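If you want a plain object that does not lose the duplicate values, one possible approach is sketched below. toPlainObject is a hypothetical helper written for this example, not part of the url module; it keeps single values as strings and collects duplicates into arrays.

```javascript
const { URLSearchParams } = require('url');

// Hypothetical helper: single values stay strings, duplicates become arrays.
function toPlainObject(params) {
  const result = {};
  // A Set removes repeated key names so each key is processed once.
  for (const key of new Set(params.keys())) {
    const values = params.getAll(key);
    result[key] = values.length === 1 ? values[0] : values;
  }
  return result;
}

const params = new URLSearchParams('abc=123&xyz=456&abc=777');
console.log(params.get('abc'));     // 123
console.log(toPlainObject(params)); // { abc: [ '123', '777' ], xyz: '456' }
```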

has checks whether a key exists

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    console.log(params.has('abc'));

    // Result
    // true

keys returns an iterator of keys

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    for (let k of params.keys()) console.log(k);

    // Result
    /*
    abc
    xyz
    aaa
    */

set overwrites a value

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    params.set('vvv', 247);
    console.log(params.toString());

    // Result
    // abc=123&xyz=456&aaa=789&vvv=247

    // Setting the same key again overwrites it; no duplicate is added.
    params.set('vvv', 247);
    console.log(params.toString());

    // Result
    // abc=123&xyz=456&aaa=789&vvv=247

If the key does not exist, set behaves like append; if the key exists, it overwrites that key's value.
If the key exists more than once, the first occurrence receives the new value and the remaining duplicates are removed.

    const {URLSearchParams} = require('url');

    const qs = 'a=1&a=2&a=3';
    const params = new URLSearchParams(qs);

    console.log(params.toString());

    // At this point
    // a=1&a=2&a=3

    params.set('a', 4);
    console.log(params.toString());

    // Result
    // a=4
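To make the difference between append and set a little more visible, here is a small sketch with another key mixed in between the duplicates. The query string is made up for illustration.

```javascript
const { URLSearchParams } = require('url');

const params = new URLSearchParams('a=1&b=2&a=3');

// append always adds a new pair at the end, even if the key already exists.
params.append('a', 4);
console.log(params.toString()); // a=1&b=2&a=3&a=4

// set keeps the position of the first "a", replaces its value,
// and removes the other "a" pairs.
params.set('a', 9);
console.log(params.toString()); // a=9&b=2
```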

sort performs a destructive sort

Sorts the entries by key name.
Reverse sorting does not appear to be supported.

Note that sort does not return a new, reordered URLSearchParams object; it reorders the object it is called on (not that the order usually matters much).

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    console.log(params.toString());
    params.sort();
    console.log(params.toString());

    // Result
    // Before execution
    // abc=123&xyz=456&aaa=789
    // After execution
    // aaa=789&abc=123&xyz=456
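If you do need descending order, one workaround is to sort first and then build a new URLSearchParams from the reversed entries. This is only a sketch of one approach; note that reversing also flips the relative order of values that share the same key.

```javascript
const { URLSearchParams } = require('url');

const params = new URLSearchParams('abc=123&xyz=456&aaa=789');

// sort() reorders params in place (ascending by key name).
params.sort();

// Spreading yields [key, value] pairs, which the constructor accepts back.
const reversed = new URLSearchParams([...params].reverse());

console.log(params.toString());   // aaa=789&abc=123&xyz=456
console.log(reversed.toString()); // xyz=456&abc=123&aaa=789
```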

values returns an iterator of values

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    for (let v of params.values()) console.log(v);

    // Result
    /*
    123
    456
    789
    */

URLSearchParams object itself is iterable

    const {URLSearchParams} = require('url');

    const qs = 'abc=123&xyz=456&aaa=789';
    const params = new URLSearchParams(qs);

    for (const [k, v] of params) console.log(k, v);

    // Result
    /*
    abc 123
    xyz 456
    aaa 789
    */
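Because the object is iterable, it can also be handed to anything that consumes [key, value] pairs. A minimal sketch using the spread syntax and Map follows; keep in mind that a Map holds one value per key, so with duplicate keys only the last value would survive.

```javascript
const { URLSearchParams } = require('url');

const params = new URLSearchParams('abc=123&xyz=456&aaa=789');

// Spread into an array of [key, value] pairs.
const asArray = [...params];
console.log(asArray); // [ [ 'abc', '123' ], [ 'xyz', '456' ], [ 'aaa', '789' ] ]

// Or build a Map for keyed lookups.
const asMap = new Map(params);
console.log(asMap.get('xyz')); // 456
```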

Summary

I have summarized the WHATWG URL API and URLSearchParams, which will likely play an important role in URL parsing from now on. I hope this helps your understanding.
Handling query strings, which used to be surprisingly tedious to implement, looks like it will go much more smoothly.

That's it.

About the author

Yoichi Mandai

My main job is developing web APIs for social games, but I'm also fortunate to be able to do a lot of other work, including marketing.
Also, within Beyond, my likeness rights are treated as CC0.