Parsing a URL

I want to share a useful utility written in pure JavaScript: URL. It is a small URL parser that works almost like window.location, but does not reload the browser page when you manipulate it.

Along the way I will say a few words about getters and setters in JavaScript.

UPD1: by popular request, here are some examples up front:

// current page URL = 'http://my.site.com/somepath/'
var u = new URL('relative/path/index.html')
u.href // http://my.site.com/somepath/relative/path/index.html
u.href = '/absolute/path.php?a=8#some-hash'
u.href // http://my.site.com/absolute/path.php?a=8#some-hash
u.hash // #some-hash
u.protocol = 'https:'
u.href // https://my.site.com/absolute/path.php?a=8#some-hash
u.host = 'another.site.com:8080'
u.href // https://another.site.com:8080/absolute/path.php?a=8#some-hash
u.port // 8080
// and so on

It works in FF3+ (possibly in 2+, I have not tried) and in IE6+ (and that part is my know-how :-)).
The article also contains a fully cross-browser implementation, though it is a little more cumbersome to use:

// current page URL = 'http://my.site.com/somepath/'
var u = new URL('relative/path/index.html')
u.href() // http://my.site.com/somepath/relative/path/index.html
u.href('/absolute/path.php?a=8#some-hash')
u.href() // http://my.site.com/absolute/path.php?a=8#some-hash
// ...

And yes, I give the listings in full. Sorry, but it has to be that way.

UPD2: a brief note on the purpose of my library:
This tool grew out of practical needs.
I have already seen several homegrown implementations with a similar purpose in large JS projects such as TinyMCE. In a rich text editor (RTE) you often deal with links to resources, and those links need to be processed in real time.

Specifically, I had to parse the current URL, change or add a parameter in the search part, and then redirect.
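A minimal sketch of that task, working on the search string alone. The helper name `setSearchParam` is hypothetical and is not part of the library; it only assumes that the search part is available as a string such as '?a=8'.

```javascript
// Hypothetical helper (not part of the library): returns a new search
// string in which the parameter `name` is set to `value`.
function setSearchParam(search, name, value) {
    var query = search.replace(/^\?/, '');          // drop the leading "?"
    var pairs = query ? query.split('&') : [];
    var encoded = encodeURIComponent(name) + '=' + encodeURIComponent(value);
    var found = false;

    for (var i = 0; i < pairs.length; i++) {
        // Compare the decoded key of each "key=value" pair.
        var key = decodeURIComponent(pairs[i].split('=')[0]);
        if (key === name) {
            pairs[i] = encoded;                     // change an existing parameter
            found = true;
        }
    }
    if (!found) pairs.push(encoded);                // or add a new one

    return '?' + pairs.join('&');
}

setSearchParam('?a=8', 'page', 2);        // "?a=8&page=2"
setSearchParam('?a=8&page=1', 'page', 2); // "?a=8&page=2"
```

The result can then be written back to the search part of a parsed URL before redirecting.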

You can surely think of more uses.

Problem

What is the problem? The problem is that:

We cannot use the window.location object itself, because changing almost any part of it (everything except the hash) makes the browser navigate away from the current page.
We cannot create another such object through the Location constructor: browsers forbid it.
The behavior of the object itself is rather non-trivial.
And I did not find any ready-made implementation :)

I mentioned the non-trivial behavior. Here it is:

Figure: Link URL Parts

When any part of the URL changes, the others must be updated to match.

Parsing

In effect I am building a likeness of window.location, so I take the names of the parts from there. Let us look at an example:

Figure: Parse URL Parts

No comment needed :)

Like it or not, we cannot do without RegExp

The main work will be done, of course, by a regular expression:

var pattern = "^(([^:/\\?#]+):)?(//(([^:/\\?#]*)(?::([^/\\?#]*))?))?([^\\?#]*)(\\?([^#]*))?(#(.*))?$";

Now in more detail:

var pattern =
    // Match #0. The whole URL (#0 is HREF in window.location terms).
    // E.g. #0 == "https://example.com:8080/some/path/index.html?p=1&q=2&r=3#some-hash"
    "^" +
    // Match #1 & #2. SCHEME (#1 is PROTOCOL in window.location terms).
    // E.g. #1 == "https:", #2 == "https"
    "(([^:/\\?#]+):)?" +
    // Match #3-#6. AUTHORITY (#4 is HOST, #5 is HOSTNAME, #6 is PORT in window.location terms).
    // E.g. #3 == "//example.com:8080", #4 == "example.com:8080", #5 == "example.com", #6 == "8080"
    "(" +
        "//(([^:/\\?#]*)(?::([^/\\?#]*))?)" +
    ")?" +
    // Match #7. PATH (#7 is PATHNAME in window.location terms).
    // E.g. #7 == "/some/path/index.html"
    "([^\\?#]*)" +
    // Match #8 & #9. QUERY (#8 is SEARCH in window.location terms).
    // E.g. #8 == "?p=1&q=2&r=3", #9 == "p=1&q=2&r=3"
    "(\\?([^#]*))?" +
    // Match #10 & #11. FRAGMENT (#10 is HASH in window.location terms).
    // E.g. #10 == "#some-hash", #11 == "some-hash"
    "(#(.*))?" +
    "$";

As you might guess, this RegExp will work not only in JavaScript but in many other languages as well. Use it in good health! ;)
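To see the groups in action, here is a short sketch that runs the pattern against the sample URL from the comments and against a relative path. The variable names `rx`, `parts` and `rel` are chosen for illustration only.

```javascript
var pattern = "^(([^:/\\?#]+):)?(//(([^:/\\?#]*)(?::([^/\\?#]*))?))?([^\\?#]*)(\\?([^#]*))?(#(.*))?$";
var rx = new RegExp(pattern);

// A full URL: every group of interest is filled in.
var parts = rx.exec("https://example.com:8080/some/path/index.html?p=1&q=2&r=3#some-hash");
parts[1];  // "https:"                  protocol
parts[4];  // "example.com:8080"        host
parts[5];  // "example.com"             hostname
parts[6];  // "8080"                    port
parts[7];  // "/some/path/index.html"   pathname
parts[8];  // "?p=1&q=2&r=3"            search
parts[10]; // "#some-hash"              hash

// A relative path: the scheme and authority groups do not participate
// in the match, so JavaScript reports them as undefined.
var rel = rx.exec("relative/path/index.html");
rel[1];    // undefined
rel[7];    // "relative/path/index.html"
```

Because the optional groups come back as undefined, the code that consumes the match has to supply defaults for them.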

Attempt #1

function URL(url) {
    url = url || "";
    this.parse(url);
}

URL.prototype = {
    // After changing this.href, call this.parse()
    href: "",

    // After changing any of these, call this.update()
    protocol: "",
    host: "",
    hostname: "",
    port: "",
    pathname: "",
    search: "",
    hash: "",

    parse: function (url) {
        url = url || this.href;
        var pattern = "^(([^:/\\?#]+):)?(//(([^:/\\?#]*)(?::([^/\\?#]*))?))?([^\\?#]*)(\\?([^#]*))?(#(.*))?$";
        var rx = new RegExp(pattern);
        var parts = rx.exec(url);

        this.href = parts[0] || "";
        this.protocol = parts[1] || "";
        this.host = parts[4] || "";
        this.hostname = parts[5] || "";
        this.port = parts[6] || "";
        this.pathname = parts[7] || "/";
        this.search = parts[8] || "";
        this.hash = parts[10] || "";

        this.update();
    },

    update: function () {
        // No protocol given: take it from the current page
        if (!this.protocol)
            this.protocol = window.location.protocol;

        // Strip leading whitespace from the pathname
        this.pathname = this.pathname.replace(/^\s*/g, '');

        if (!this.host && this.pathname && !/^\//.test(this.pathname)) {
            // Relative path: resolve it against the directory of the current page
            var _p = window.location.pathname.split('/');
            _p[_p.length - 1] = this.pathname;
            this.pathname = _p.join('/');
        }

        // No hostname given: take it from the current page
        if (!this.hostname)
            this.hostname = window.location.hostname;

        this.host = this.hostname + (("" + this.port) ? ":" + this.port : "");
        this.href = this.protocol + '//' + this.host + this.pathname + this.search + this.hash;
    },

    /**
     * Same as window.location.assign: navigates to the given URL.
     */
    assign: function (url) {
        this.parse(url);
        window.location.assign(this.href);
    },

    /**
     * Same as window.location.replace: navigates to the given URL
     * without adding an entry to the history.
     */
    replace: function (url) {
        this.parse(url);
        window.location.replace(this.href);
    }
};

In detail

As you can see, the attributes familiar from window.location are here: href, port, hash, etc.
The familiar window.location methods are present too: assign(...) and replace(...).
The parse(...) method does the main work: it splits the URL into its component parts.
The update(...) method refreshes all the parts after one of them has been changed.

Everything would be fine, except that we force the user to call update(...) or parse(...) after every change to any part of the URL (the port, for example). That is horrible. The user can forget to do it, and then everything falls apart.
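The failure mode is easy to reproduce. The object below is a deliberately simplified stand-in for the Attempt #1 class (plain fields plus an update() method, no parsing and no window.location), written only to illustrate the stale state.

```javascript
// Simplified stand-in for the Attempt #1 object: plain fields plus update().
var u = {
    protocol: 'http:',
    hostname: 'my.site.com',
    port: '',
    pathname: '/index.html',
    host: '',
    href: '',
    update: function () {
        this.host = this.hostname + (this.port ? ':' + this.port : '');
        this.href = this.protocol + '//' + this.host + this.pathname;
    }
};

u.update();
u.port = '8080';      // the user changes one part...
var stale = u.href;   // ...and href is still "http://my.site.com/index.html"

u.update();           // only an explicit update() brings the parts back in sync
var fresh = u.href;   // "http://my.site.com:8080/index.html"
```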

Unfortunately, with this implementation there is no way around that. But it can be done differently :)

Attempt #2

Now I will propose an acceptable option. We need getters and setters. The most obvious way is to create a pair of methods like getProtocol() and setProtocol(newProtocol) for each parameter. But I do not like this approach because it is bulky.

Let's do it in a more JavaScript way. There will be a single protocol(...) method: called without arguments it acts as a getter, and called with one argument it acts as a setter.

We will hide the real data in the closure.
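The idea in isolation can be sketched as follows. The factory `makeAccessor` and its `onChange` callback are hypothetical names used only for this illustration; the listing that follows writes each accessor out by hand instead.

```javascript
// Hypothetical factory: returns a function that reads the value when
// called without arguments and writes it when called with one argument.
function makeAccessor(initial, onChange) {
    var value = initial;                 // hidden in the closure
    return function (val) {
        if (typeof val != "undefined") { // an argument was passed: act as a setter
            value = val;
            if (onChange) onChange(value);
        }
        return value;                    // always return the current value
    };
}

var log = [];
var port = makeAccessor('', function (v) { log.push(v); });

port();        // ""      getter
port('8080');  // "8080"  setter, onChange is called
port();        // "8080"  getter again
```

One consequence of this design is that undefined cannot be stored as a value, because it is what signals a read.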

var URL;

// The closure hides the helper functions parseURL and updateURL from outside code.
(function () {

    URL = function (url) {
        // The real data lives in this closure, so it is private
        // and every URL object gets its own copy.
        var href, protocol, host, hostname, port, pathname, search, hash;

        // Get/set href. The setter calls parseURL.call(this),
        // so that parseURL runs with this URL object as `this`.
        this.href = function (val) {
            if (typeof val != "undefined") {
                href = val;
                parseURL.call(this);
            }
            return href;
        };

        // Get/set protocol. Unlike the href setter, the protocol setter
        // calls updateURL.call(this) to rebuild href from the parts.
        this.protocol = function (val) {
            if (typeof val != "undefined") {
                // Empty value: keep the current protocol or take it from window.location
                if (!val)
                    val = protocol || window.location.protocol;
                protocol = val;
                updateURL.call(this);
            }
            return protocol;
        };

        // Get/set host. host, hostname and port depend on one another,
        // so setting host rewrites all three.
        this.host = function (val) {
            if (typeof val != "undefined") {
                val = val || '';
                var v = val.split(':');
                var h = v[0], p = v[1] || '';
                host = val;
                hostname = h;
                port = p;
                updateURL.call(this);
            }
            return host;
        };

        // Get/set hostname. Keeps host in sync with hostname and port.
        this.hostname = function (val) {
            if (typeof val != "undefined") {
                if (!val)
                    val = hostname || window.location.hostname;
                hostname = val;
                host = val + (("" + port) ? ":" + port : "");
                updateURL.call(this);
            }
            return hostname;
        };

        // Get/set port. Keeps host in sync with hostname and port.
        this.port = function (val) {
            if (typeof val != "undefined") {
                port = val;
                host = hostname + (("" + port) ? ":" + port : "");
                updateURL.call(this);
            }
            return port;
        };

        // Get/set pathname. A relative pathname, i.e. one that does not
        // start with '/', is resolved against the current directory.
        this.pathname = function (val) {
            if (typeof val != "undefined") {
                if (val.indexOf("/") != 0) { // relative url
                    var _p = (pathname || window.location.pathname).split("/");
                    _p[_p.length - 1] = val;
                    val = _p.join("/");
                }
                pathname = val;
                updateURL.call(this);
            }
            return pathname;
        };

        // Get/set search
        this.search = function (val) {
            if (typeof val != "undefined") {
                search = val;
            }
            return search;
        };

        // Get/set hash
        this.hash = function (val) {
            if (typeof val != "undefined") {
                hash = val;
            }
            return hash;
        };

        url = url || "";
        parseURL.call(this, url);
    };

    URL.prototype = {
        /**
         * Same as window.location.assign: navigates to the given URL.
         */
        assign: function (url) {
            parseURL.call(this, url);
            window.location.assign(this.href());
        },

        /**
         * Same as window.location.replace: navigates to the given URL
         * without adding an entry to the history.
         */
        replace: function (url) {
            parseURL.call(this, url);
            window.location.replace(this.href());
        }
    };

    // Splits the URL into its parts.
    // Must be called with a URL object as `this`: parseURL.call(obj, url).
    function parseURL(url) {
        if (this._innerUse)
            return;

        url = url || this.href();
        var pattern = "^(([^:/\\?#]+):)?(//(([^:/\\?#]*)(?::([^/\\?#]*))?))?([^\\?#]*)(\\?([^#]*))?(#(.*))?$";
        var rx = new RegExp(pattern);
        var parts = rx.exec(url);

        // Prevent infinite recursion
        this._innerUse = true;

        this.href(parts[0] || "");
        this.protocol(parts[1] || "");
        //this.host(parts[4] || "");
        this.hostname(parts[5] || "");
        this.port(parts[6] || "");
        this.pathname(parts[7] || "/");
        this.search(parts[8] || "");
        this.hash(parts[10] || "");

        delete this._innerUse;

        updateURL.call(this);
    }

    // Rebuilds href from the parts.
    // Must be called with a URL object as `this`: updateURL.call(obj).
    // It is called from the setters whenever a part changes.
    function updateURL() {
        if (this._innerUse)
            return;

        // Prevent infinite recursion
        this._innerUse = true;

        this.href(this.protocol() + '//' + this.host() + this.pathname() + this.search() + this.hash());

        delete this._innerUse;
    }

})();

In general, the code is self-documenting, so I will explain only the key points:

The prototype has lost two methods, parse(...) and update(...), which were moved into the standalone functions parseURL(...) and updateURL(...).
All the data (href, port, host, etc.) has also left the prototype and settled in the closure created by the constructor. It is now accessed through getters and setters.
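One more point worth isolating is the _innerUse flag. The href setter calls the parser, the parser calls the part setters, and the part setters rebuild href, so without a guard the calls would recurse forever. The sketch below reduces this to a "hostname:port" string; `HostPort`, `parse`, `update` and `_busy` are hypothetical names for this illustration, with `_busy` playing the role of _innerUse.

```javascript
function HostPort() {
    var href = '', hostname = '', port = '';

    this.href = function (val) {
        if (typeof val != 'undefined') { href = val; parse.call(this); }
        return href;
    };
    this.hostname = function (val) {
        if (typeof val != 'undefined') { hostname = val; update.call(this); }
        return hostname;
    };
    this.port = function (val) {
        if (typeof val != 'undefined') { port = val; update.call(this); }
        return port;
    };
}

// Splits href into parts. The part setters call update(), which is
// suppressed while the guard flag is set.
function parse() {
    if (this._busy) return;
    var m = /^([^:]*)(?::(.*))?$/.exec(this.href());
    this._busy = true;
    this.hostname(m[1]);
    this.port(m[2] || '');
    delete this._busy;
}

// Rebuilds href from parts. The href setter calls parse(), which is
// suppressed while the guard flag is set.
function update() {
    if (this._busy) return;
    this._busy = true;
    this.href(this.hostname() + (this.port() ? ':' + this.port() : ''));
    delete this._busy;
}

var hp = new HostPort();
hp.href('my.site.com:80');
hp.hostname();    // "my.site.com"
hp.port('8080');
hp.href();        // "my.site.com:8080"
```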

Examples

Now straight to the examples. After all, the main thing is to see it in action.

// current page URL = 'http://my.site.com/somepath/'
var u = new URL('relative/path/index.html')
u.href() // http://my.site.com/somepath/relative/path/index.html
u.href('/absolute/path.php?a=8#some-hash')
u.href() // http://my.site.com/absolute/path.php?a=8#some-hash
u.hash() // #some-hash
u.protocol('https:')
u.href() // https://my.site.com/absolute/path.php?a=8#some-hash
u.host('another.site.com:8080')
u.href() // https://another.site.com:8080/absolute/path.php?a=8#some-hash
u.port() // 8080
// and so on

That's it. Everything works.
All in all, this is a perfectly usable version. Let's call it version 1.0 final.
Now let's move on to version 2.0 alpha, where true getters and setters come into play.

Attempt #3

I will give the code first and then go over the interesting points.

var URL;

(function () {

    var isIE = window.navigator.userAgent.indexOf('MSIE') != -1;

    URL = function (url) {
        var data = {href: '', protocol: '', host: '', hostname: '', port: '', pathname: '', search: '', hash: ''};

        var gs = {
            getHref: function () {
                return data.href;
            },
            setHref: function (val) {
                data.href = val;
                parseURL.call(this);
                return data.href;
            },

            getProtocol: function () {
                return data.protocol;
            },
            setProtocol: function (val) {
                if (!val)
                    val = data.protocol || window.location.protocol; // update || init
                data.protocol = val;
                updateURL.call(this);
                return data.protocol;
            },

            getHost: function () {
                return data.host;
            },
            setHost: function (val) {
                val = val || '';
                var v = val.split(':');
                var h = v[0], p = v[1] || '';
                data.host = val;
                data.hostname = h;
                data.port = p;
                updateURL.call(this);
                return data.host;
            },

            getHostname: function () {
                return data.hostname;
            },
            setHostname: function (val) {
                if (!val)
                    val = data.hostname || window.location.hostname; // update || init
                data.hostname = val;
                data.host = val + (("" + data.port) ? ":" + data.port : "");
                updateURL.call(this);
                return data.hostname;
            },

            getPort: function () {
                return data.port;
            },
            setPort: function (val) {
                data.port = val;
                data.host = data.hostname + (("" + data.port) ? ":" + data.port : "");
                updateURL.call(this);
                return data.port;
            },

            getPathname: function () {
                return data.pathname;
            },
            setPathname: function (val) {
                if (val.indexOf("/") != 0) { // relative url
                    var _p = (data.pathname || window.location.pathname).split("/");
                    _p[_p.length - 1] = val;
                    val = _p.join("/");
                }
                data.pathname = val;
                updateURL.call(this);
                return data.pathname;
            },

            getSearch: function () {
                return data.search;
            },
            setSearch: function (val) {
                return data.search = val;
            },

            getHash: function () {
                return data.hash;
            },
            setHash: function (val) {
                return data.hash = val;
            }
        };

        if (isIE) { // IE5.5+
            var el = document.createElement('div');
            el.style.display = 'none';
            document.body.appendChild(el);

            el.assign = URL.prototype.assign;
            el.replace = URL.prototype.replace;

            var keys = ["href", "protocol", "host", "hostname", "port", "pathname", "search", "hash"];

            el.onpropertychange = function () {
                var pn = event.propertyName;
                var pv = event.srcElement[event.propertyName];

                if (this._holdOnMSIE || pn == '_holdOnMSIE')
                    return pv;

                this._holdOnMSIE = true;
                for (var i = 0, l = keys.length; i < l; i++)
                    el[keys[i]] = data[keys[i]];
                this._holdOnMSIE = false;

                for (var i = 0, l = keys.length; i < l; i++) {
                    var key = keys[i];
                    if (pn == key) {
                        var sKey = 'set' + key.substr(0, 1).toUpperCase() + key.substr(1);
                        return gs[sKey].call(el, pv);
                    }
                }
            };

            url = url || "";
            parseURL.call(el, url);

            return el;
        } else if (URL.prototype.__defineSetter__) { // FF
            var keys = ["href", "protocol", "host", "hostname", "port", "pathname", "search", "hash"];
            for (var i = 0, l = keys.length; i < l; i++) {
                (function (i) {
                    var key = keys[i];
                    var gKey = 'get' + key.substr(0, 1).toUpperCase() + key.substr(1);
                    var sKey = 'set' + key.substr(0, 1).toUpperCase() + key.substr(1);
                    URL.prototype.__defineGetter__(key, gs[gKey]);
                    URL.prototype.__defineSetter__(key, gs[sKey]);
                })(i);
            }

            url = url || "";
            parseURL.call(this, url);
        }
    };

    URL.prototype = {
        assign: function (url) {
            parseURL.call(this, url);
            window.location.assign(this.href);
        },

        replace: function (url) {
            parseURL.call(this, url);
            window.location.replace(this.href);
        }
    };

    function parseURL(url) {
        if (this._innerUse)
            return;

        url = url || this.href;
        var pattern = "^(([^:/\\?#]+):)?(//(([^:/\\?#]*)(?::([^/\\?#]*))?))?([^\\?#]*)(\\?([^#]*))?(#(.*))?$";
        var rx = new RegExp(pattern);
        var parts = rx.exec(url);

        // Prevent infinite recursion
        this._innerUse = true;

        this.href = parts[0] || "";
        this.protocol = parts[1] || "";
        //this.host = parts[4] || "";
        this.hostname = parts[5] || "";
        this.port = parts[6] || "";
        this.pathname = parts[7] || "/";
        this.search = parts[8] || "";
        this.hash = parts[10] || "";

        if (!isIE)
            delete this._innerUse;
        else
            this._innerUse = false;

        updateURL.call(this);
    }

    function updateURL() {
        if (this._innerUse)
            return;

        // Prevent infinite recursion
        this._innerUse = true;

        this.href = this.protocol + '//' + this.host + this.pathname + this.search + this.hash;

        if (!isIE)
            delete this._innerUse;
        else
            this._innerUse = false;
    }

})();

Let us look at how the getters and setters are created:

Firefox case:
var keys = ["href", "protocol", "host", "hostname", "port", "pathname", "search", "hash"];
for (var i = 0, l = keys.length; i < l; i++) {
    (function (i) {
        var key = keys[i];
        var gKey = 'get' + key.substr(0, 1).toUpperCase() + key.substr(1);
        var sKey = 'set' + key.substr(0, 1).toUpperCase() + key.substr(1);
        URL.prototype.__defineGetter__(key, gs[gKey]);
        URL.prototype.__defineSetter__(key, gs[sKey]);
    })(i);
}

We use the magic URL.prototype.__defineGetter__ and URL.prototype.__defineSetter__. As a result we get the pseudo-attributes url.href, url.pathname, etc.; reading or changing them actually calls the handler functions.
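For comparison, the same pseudo-attribute effect can be sketched with the standard ES5 Object.defineProperty. This is an assumption for illustration, not what the listing above uses, and the `Link` constructor is hypothetical. Here the accessors are defined on each instance rather than on the prototype, so every object reads and writes the data in its own closure.

```javascript
// Hypothetical constructor: accessor properties backed by closure data.
function Link(hostname, port) {
    var data = { hostname: hostname, port: port };
    var self = this;

    // Read/write pseudo-attributes, defined per instance.
    ['hostname', 'port'].forEach(function (key) {
        Object.defineProperty(self, key, {
            get: function () { return data[key]; },
            set: function (val) { data[key] = String(val); },
            enumerable: true
        });
    });

    // A derived, read-only pseudo-attribute: recomputed on every read,
    // so it can never go stale.
    Object.defineProperty(this, 'host', {
        get: function () {
            return data.hostname + (data.port ? ':' + data.port : '');
        },
        enumerable: true
    });
}

var a = new Link('my.site.com', '');
var b = new Link('another.site.com', '8080');

a.port = 81;   // plain assignment goes through the setter
a.host;        // "my.site.com:81"
b.host;        // "another.site.com:8080", unaffected by changes to a
```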
Internet Explorer case: here the dancing with a tambourine begins. IE versions before 8 have no getter/setter mechanism at all. However, there is a wonderful event, onpropertychange. There is nothing left but to use it. A complication arises, though: the event exists only on DOM elements, and even then only when the element is already attached to the DOM. So this is what we do:
var el = document.createElement('div');
el.style.display = 'none';
document.body.appendChild(el);
// ...
el.onpropertychange = function () {
    var pn = event.propertyName;                   // name of the changed property
    var pv = event.srcElement[event.propertyName]; // its new value
    // ...
}
// ...
return el;
// We return el, i.e. new URL(...) yields
// not a URL object but a DIV element.
// That is not great, since, for one thing,
// instanceof will not work. But this is IE, after all :)

Examples #2

// current page URL = 'http://my.site.com/somepath/'
var u = new URL('relative/path/index.html')
u.href // http://my.site.com/somepath/relative/path/index.html
u.href = '/absolute/path.php?a=8#some-hash'
u.href // http://my.site.com/absolute/path.php?a=8#some-hash
u.hash // #some-hash
u.protocol = 'https:'
u.href // https://my.site.com/absolute/path.php?a=8#some-hash
u.host = 'another.site.com:8080'
u.href // https://another.site.com:8080/absolute/path.php?a=8#some-hash
u.port // 8080
// and so on

Works in FF3+ and IE6+. It can be adapted for Safari / Chrome. As for Opera, I am not sure; RTFM required.

That's it

I hope I did something useful and did not waste my day writing this article :-)
PS: I am also thinking of writing a separate article on getters and setters in different browsers. Firefox is not the only browser out there. (A small plug: so as not to flood Habrahabr with my stream of thoughts, welcome to my blog, http://web-by-kott.blogspot.com/ . It is still rather empty, but I am only getting started.)

Source: https://habr.com/ru/post/65407/
