
Practical JS: innerHTML problems

Note: The following is a translation of Julien Lecomte's article “The Problem With innerHTML”, in which the author examines the problems that arise when using the innerHTML property in modern browsers and offers a number of tips on how to avoid them. My comments appear in italics.

The innerHTML property is extremely popular among web developers because of its simplicity and convenience: it lets you replace the HTML content of a particular tag in a single assignment. You can also use the DOM Level 2 API (removeChild, createElement, appendChild), but innerHTML is a much simpler and more efficient way to modify the DOM tree. However, using innerHTML brings a number of problems that should be avoided:


There are a few more minor flaws that are also worth mentioning:





Personally, I am most concerned about the security and memory usage issues related to the innerHTML property. These problems are far from new, however, and some bright minds have already noticed them and proposed solutions.

Douglas Crockford has written a purge function that removes some circular references caused by the addition of event handlers to HTML elements, allowing the garbage collector to completely free all memory associated with them.
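To make the idea concrete, here is a minimal sketch of a purge-style helper. It is not necessarily Crockford's exact code. It assumes that nodes expose attributes and childNodes the way DOM elements do, and that handlers were assigned as properties such as onclick. The child and root objects are hypothetical stand-ins for DOM nodes, included only so the snippet runs outside a browser.

```javascript
// Sketch of a purge-style helper: walk a node tree and null out every
// function-valued property that is named after one of the node's attributes
// (onclick, onmouseover, ...), which breaks the DOM <-> JavaScript cycle.
function purge(node) {
    var attrs = node.attributes, children = node.childNodes, i, name;
    if (attrs) {
        // Walk backwards in case the attribute list changes while handlers are cleared
        for (i = attrs.length - 1; i >= 0; i -= 1) {
            name = attrs[i].name;
            if (typeof node[name] === 'function') {
                node[name] = null;
            }
        }
    }
    if (children) {
        for (i = 0; i < children.length; i += 1) {
            purge(children[i]);
        }
    }
}

// Hypothetical stand-ins for a DOM subtree (assumption, not real elements).
var child = { attributes: [{ name: 'onclick' }], childNodes: [], onclick: function () {} };
var root = { attributes: [{ name: 'id' }], childNodes: [child], id: 'box' };

purge(root);
console.log(child.onclick); // null
```

Handlers registered through a library's listener API are not stored as element properties, so a helper like this does not see them; they have to be removed through that library.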

Removing <script> tags from an HTML string is not as easy as it seems at first glance. The regular expression used for this has to be fairly complex, and I am not sure that mine covers every possible case. Below is the version I personally use in my work:

  /<script[^>]*>[\S\s]*?<\/script[^>]*>/ig
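As a quick illustration, the expression can be applied with String.prototype.replace. The helper name stripScripts and the sample string are made up for this example. The i flag makes the match case-insensitive, and [\S\s]*? lazily matches any characters, including line breaks, up to the nearest closing tag.

```javascript
// The expression from the article: an opening <script ...> tag, then the
// shortest possible run of any characters, then a closing </script ...> tag.
var SCRIPT_RE = /<script[^>]*>[\S\s]*?<\/script[^>]*>/ig;

// Hypothetical helper name, used only for this illustration.
function stripScripts(html) {
    return html.replace(SCRIPT_RE, '');
}

var dirty = '<p>Hello</p><SCRIPT defer>alert(1)</SCRIPT><p>World</p>' +
            '<script type="text/javascript">\nalert(2);\n</script>';

console.log(stripScripts(dirty)); // <p>Hello</p><p>World</p>
```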


Now let's try to combine both of these techniques in a single setInnerHTML function (UPD: thanks to everyone who left comments: I have corrected all the errors and holes you pointed out. I also decided to place the setInnerHTML function in YAHOO.util.Dom)

YAHOO.util.Dom.setInnerHTML = function (el, html) {
	el = YAHOO.util.Dom.get(el);
	if (!el || typeof html !== 'string') {
		return null;
	}

	// Break circular references
	(function (o) {

		var a = o.attributes, i, l, n, c;
		if (a) {
			l = a.length;
			for (i = 0; i < l; i += 1) {
				n = a[i].name;
				// Null out handlers assigned as properties (onclick, ...)
				if (typeof o[n] === 'function') {
					o[n] = null;
				}
			}
		}

		a = o.childNodes;

		if (a) {
			l = a.length;
			for (i = 0; i < l; i += 1) {
				c = o.childNodes[i];

				// Purge the child node recursively
				arguments.callee(c);

				// Remove all event handlers
				// added to the element via YUI addListener
				YAHOO.util.Event.purgeElement(c);
			}
		}

	})(el);

	// Remove all scripts from the HTML string and set the innerHTML property
	el.innerHTML = html.replace(/<script[^>]*>[\S\s]*?<\/script[^>]*>/ig, "");

	// Return a reference to the first child node
	return el.firstChild;
};


Voila! Please let me know if I missed something in the implementation of this function, or if the regular expression needs to be improved.

UPD: There are a huge number of ways to insert malicious code into a web page. The setInnerHTML function only normalizes how <script> tags are handled across all A-grade browsers. If you are going to insert HTML that cannot be trusted, sanitize it on the server side first. There are plenty of libraries for this.
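A small check illustrates that limitation. The sample strings below are made up for this example. Neither contains a <script> tag, so the expression leaves both untouched, even though a browser could still run the code they carry.

```javascript
// Same expression as in setInnerHTML above.
var SCRIPT_RE = /<script[^>]*>[\S\s]*?<\/script[^>]*>/ig;

// Hypothetical payloads that carry code without any <script> tag.
var samples = [
    '<img src="x" onerror="alert(1)">',        // inline event handler attribute
    '<a href="javascript:alert(2)">click</a>'  // javascript: URL
];

// Keep only the samples that pass through the filter unchanged.
var untouched = samples.filter(function (html) {
    return html.replace(SCRIPT_RE, '') === html;
});

console.log(untouched.length); // 2
```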

Thanks to everyone who read the note. I will be glad to hear from you on how to improve the given example.

Source: https://habr.com/ru/post/31413/

