The browser is a really hostile programming environment.
Douglas Crockford, The JavaScript Programming Language (video lecture)
The next part of the book will talk about web browsers. Without web browsers, there would be no JavaScript. And even if there were, no one would have paid any attention to it.
From the very beginning, web technology was decentralized, not only technically but also in the way it evolved. Various browser vendors added new functionality in ad hoc and sometimes poorly thought-out ways, and this functionality often ended up being adopted by other browsers and eventually set down as a standard.
This is both a blessing and a curse. On the one hand, it is great not to have a central party in control, so that the technology can be developed by various parties, sometimes cooperating and sometimes competing. On the other hand, the haphazard way in which the Web was developed means that the result is not exactly a shining example of internal consistency. Some parts of it are downright confusing.
Networks and the Internet
Computer networks have been around since the 1950s. If you run a cable between two or more computers and allow them to send data back and forth, you can do all kinds of wonderful things. And if connecting two machines in the same building lets you do wonderful things, connecting machines all over the planet should be even better. The technology to do this was developed in the 1980s, and the resulting network is called the Internet. It has lived up to its promise.
A computer can use this network to send bits to another computer. For the communication to be effective, the computers at both ends must know what the bits are supposed to represent. The meaning of any given sequence of bits depends entirely on the kind of thing it is trying to express and on the encoding mechanism used.
A network protocol describes a style of communication over a network. There are protocols for sending email, for fetching email, for sharing files, and even for controlling computers that happen to be infected by malicious software.
For example, a simple chat protocol might consist of one computer sending the bits that represent the text “CHAT?” to another machine, and the other responding with “OK!” to confirm that it understands the protocol. They can then proceed to send each other strings of text, read the text sent by the other side, and display whatever they receive on their screens.
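The handshake described above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the function names answerHandshake and handshakeAccepted are hypothetical, the text is assumed to be encoded as UTF-8 bytes, and no real network is involved, so the "bits" are simply passed from one function to the other.

```javascript
// Hypothetical helpers for the toy chat protocol; UTF-8 is an assumed encoding.
const encoder = new TextEncoder(); // text -> bytes
const decoder = new TextDecoder(); // bytes -> text

// "Server" side: inspect the bytes of an opening message and return the
// bytes of the reply, or null if the peer seems to speak another protocol.
function answerHandshake(bytes) {
  const text = decoder.decode(bytes);
  return text === "CHAT?" ? encoder.encode("OK!") : null;
}

// "Client" side: check whether the reply confirms the protocol.
function handshakeAccepted(replyBytes) {
  return replyBytes !== null && decoder.decode(replyBytes) === "OK!";
}

const opening = encoder.encode("CHAT?");
console.log(Array.from(opening)); // the byte values 67, 72, 65, 84, 63
console.log(handshakeAccepted(answerHandshake(opening))); // true
```

Both sides agree on the meaning of the bytes only because they share the same encoding and the same expected texts, which is exactly what a protocol pins down.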
Most protocols are built on top of other protocols. Our example chat protocol treats the network as a streamlike device into which you can put bits and have them arrive at the correct destination in the correct order. Ensuring those things is already a difficult technical problem. The Transmission Control Protocol (TCP) is a protocol that solves this problem. All Internet-connected devices “speak” it, and most communication on the Internet is built on top of it.
A TCP connection works as follows: one computer must be waiting, or “listening,” for other computers to start talking to it. To be able to listen for different kinds of communication at the same time on a single machine, each listener has a number (called a port) associated with it. Most protocols specify which port should be used by default. For example, when we want to send an email using the SMTP protocol, the machine through which we send it is expected to be listening on port 25.
Another computer can then establish a connection by connecting to the target machine using the correct port number. If the target machine can be reached and is listening on that port, the connection is established. The listening computer is called the server, and the connecting computer is called the client.
Such a connection acts as a two-way pipe through which bits can flow: the machines on both ends can put data into it. Once the bits are transmitted, the machine on the other side can read them out again. This is a convenient model. You could say that TCP provides an abstraction of the network.
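To make the listen, connect, and two-way pipe steps concrete, here is a minimal sketch that runs both ends on one machine. It assumes Node.js and its built-in net module, which the text does not mention; the name demoConnection, the loopback address 127.0.0.1, and the upper-casing reply are invented for illustration.

```javascript
// Assumption: this runs under Node.js, whose built-in "net" module exposes TCP.
const net = require("node:net");

// Runs one client/server exchange on this machine and resolves with the
// text the client received. Port 0 asks the operating system for any free port.
function demoConnection() {
  return new Promise((resolve, reject) => {
    // The server: waits for connections and answers each message in upper case,
    // then closes its side of the pipe.
    const server = net.createServer((socket) => {
      socket.on("data", (chunk) => socket.end(chunk.toString().toUpperCase()));
    });
    server.on("error", reject);

    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      // The client: connects to the port the server listens on and writes data.
      const client = net.connect(port, "127.0.0.1", () => client.write("hello"));
      let received = "";
      client.on("data", (chunk) => { received += chunk; });
      // When the server closes the pipe, stop listening and report the result.
      client.on("end", () => server.close(() => resolve(received)));
      client.on("error", reject);
    });
  });
}

demoConnection().then((text) => console.log(text)); // logs "HELLO"
```

Neither side needs to know how the bytes travel between them; each simply writes into and reads from its end of the connection.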
Web
The World Wide Web (not to be confused with the Internet as a whole) is a set of protocols and formats that allow us to visit web pages in a browser. The “Web” part of the name refers to the fact that such pages can easily link to each other, forming a huge mesh that users can move through.
To add content to the Web, all you need to do is connect a machine to the Internet and have it listen on port 80, using the Hypertext Transfer Protocol (HTTP). This protocol allows other computers to request documents over the network.
Each document on the Web is named by a Uniform Resource Locator (URL), which looks something like this:
http:
The first part tells us that this URL uses the HTTP protocol (as opposed to, for example, encrypted HTTP, which would be written as https://). Then comes the part that identifies which server we are requesting the document from. Last is a path string that identifies the specific document or resource we are interested in.
Each machine connected to the Internet has its own IP address, which looks something like 37.187.37.82. It can sometimes be used directly in place of the server name in a URL. But lists of numbers are harder to remember and type than names, so you usually register a domain name that points to a specific machine (or set of machines). I registered eloquentjavascript.net to point at the IP address of a machine I control, and can therefore use that domain name to serve web pages.
If you type the URL shown above into your browser's address bar, it will try to retrieve and display the document at that URL. First, your browser has to find out what address the eloquentjavascript.net domain refers to. Then, using the HTTP protocol, it connects to the server at that address and asks for the resource named /12_browser.html.
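The three parts of a URL can be inspected with the standard URL class, which is available in browsers and in Node.js. The sketch below assumes an address assembled from the domain and path named in this section; the variable name address is arbitrary.

```javascript
// Assumption: the address is built from the domain and path mentioned in the text.
const address = new URL("http://eloquentjavascript.net/12_browser.html");

console.log(address.protocol); // "http:"
console.log(address.hostname); // "eloquentjavascript.net"
console.log(address.pathname); // "/12_browser.html"

// An empty port string means "use the protocol's default", which is 80 for HTTP.
const port = address.port === "" ? 80 : Number(address.port);
console.log(port); // 80
```

The hostname is what the browser has to resolve to an IP address, and the pathname is what it then asks the server for.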
In chapter 17, we take a closer look at the HTTP protocol.
HTML
HTML, which stands for Hypertext Markup Language, is the document format used for web pages. An HTML document contains text, as well as tags that give structure to the text, describing things such as links, paragraphs, and headings.
A simple HTML document might look like this:
<!doctype html>
<html>
  <head>
    <title>My homepage</title>
  </head>
  <body>
    <h1> </h1>
    <p>, .</p>
    <p> ! <a href="http://eloquentjavascript.net">here</a>.</p>
  </body>
</html>
The tags, wrapped in angle brackets (< and >), provide information about the structure of the document. Everything else is just plain text.
The document starts with <!doctype html>, which tells the browser to interpret it as modern HTML, as opposed to the various dialects that were in use in the past.
HTML documents have a head and a body. The head contains information about the document, and the body contains the document itself. In this case, we first declared that the title of the page is “My homepage” and then gave a document containing a heading (<h1>, meaning “heading 1”; the tags <h2> to <h6> produce headings of other sizes) and two paragraphs (<p>).
Tags come in several forms. An element, such as the body, a paragraph, or a link, is started by an opening tag like <p> and ended by a closing tag like </p>. Some opening tags, such as the one for the link (<a>), contain extra information in the form of name="value" pairs. These are called attributes. In this case, the destination of the link is given as href="http://eloquentjavascript.net", where href stands for “hypertext reference”.
Some kinds of tags do not enclose anything and therefore do not need to be closed. An example is the image tag
<img src="http://example.com/image.jpg">
which shows a picture located at a given URL.
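The tag forms described above (opening tag, attributes, closing tag, and tags that are never closed) can be summarized by a small function that builds the text of an element. This is a rough sketch, not a real templating tool: renderElement and voidElements are hypothetical names, the list of unclosed tags is deliberately incomplete, and attribute values and content are inserted without any escaping.

```javascript
// A few elements that never wrap content and therefore take no closing tag.
// (Assumption: a short, incomplete list, for illustration only.)
const voidElements = new Set(["img", "br", "hr", "input", "meta", "link"]);

// Builds the HTML text for one element. Note: values are NOT escaped here.
function renderElement(name, attributes = {}, content = "") {
  // Each attribute becomes a name="value" pair inside the opening tag.
  const attributeText = Object.entries(attributes)
    .map(([key, value]) => ` ${key}="${value}"`)
    .join("");
  const openingTag = `<${name}${attributeText}>`;
  return voidElements.has(name)
    ? openingTag
    : `${openingTag}${content}</${name}>`;
}

console.log(renderElement("a", { href: "http://eloquentjavascript.net" }, "here"));
// → <a href="http://eloquentjavascript.net">here</a>
console.log(renderElement("img", { src: "http://example.com/image.jpg" }));
// → <img src="http://example.com/image.jpg">
```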
To include angle brackets in the text of a document, you need a special notation, since they have a special meaning in HTML. An opening angle bracket (the “less than” sign) is written as &lt;, and a closing bracket (“greater than”) is written as &gt;. In HTML, an ampersand (&) followed by a word and a semicolon (;) is called an entity and is replaced by the character it encodes.
This is similar to the way backslashes are used in JavaScript strings. Since this mechanism gives the ampersand a special meaning too, an ampersand in the text has to be written as &amp;. Inside an attribute value wrapped in double quotes, a quotation mark is written as &quot;.
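The four replacements just described are easy to apply mechanically. The following is a minimal sketch; the names entities and escapeHTML are made up for this example, and only the four characters discussed above are handled.

```javascript
// The characters with special meaning in HTML text and double-quoted attributes,
// mapped to the entities that stand for them.
const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };

// Replaces every special character with its entity. A single pass over the
// text is used so that the ampersands of inserted entities are not escaped again.
function escapeHTML(text) {
  return text.replace(/[&<>"]/g, (character) => entities[character]);
}

console.log(escapeHTML('<a href="x">Tom & Jerry</a>'));
// → &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Passing text through such a function before placing it in a page makes the browser display the characters literally rather than interpret them as markup.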
HTML is parsed in a remarkably error-tolerant way. If tags that should be there are missing, the browser reconstructs them. The way in which this is done has been standardized, so you can rely on all modern browsers to do it in the same way.
The following document will be treated just like the one shown earlier:
<!doctype html>

<title>My homepage</title>
<h1> </h1>
<p>, .
<p> ! <a href=http://eloquentjavascript.net>here</a>.
The <html>, <head>, and <body> tags are gone. The browser knows that <title> belongs in the head and that <h1> belongs in the body. In addition, the paragraphs are no longer explicitly closed, since opening a new paragraph or reaching the end of the document closes them implicitly. The quotes around the link address are also gone.
This book will usually omit the <html>, <head>, and <body> tags from examples for brevity. But I will close tags and put quotes around attribute values.
I will also usually omit the doctype. I do not advise you to do the same: browsers can do strange things when the doctype is missing. Consider it implicitly present in the examples, even when it is not shown.
HTML and JavaScript
In the context of our book, the most important HTML tag is <script>. It allows you to include a JavaScript program in a document.
<h1>, .</h1> <script>alert("Hello!");</script>
Such a script runs as soon as the browser encounters the <script> tag while parsing the HTML. Opening this page will pop up an alert dialog.
Including large programs directly in HTML documents is often impractical. The <script> tag can be given an src attribute in order to fetch a script file (a text file containing a JavaScript program) from a URL.
<h1>, .</h1> <script src="code/hello.js"></script>
The code/hello.js file contains the same simple program, alert('Hello!');. When an HTML page references another URL as part of itself, such as an image file or a script, the browser retrieves it immediately and includes it in the page.
A script tag must always be closed with </script>, even if it refers to a script file and does not contain any code. If you forget to do this, the rest of the page will be interpreted as part of the script.
Some attributes can also contain a JavaScript program. The <button> tag (which shows up as a button on the page) has an onclick attribute, whose content is run whenever the button is clicked.
<button onclick="alert('!');"> </button>
Notice that I used single quotes for the string in the onclick attribute, since double quotes are already used to delimit the attribute itself. I could have used &quot; instead, but that would make the program harder to read.
Sandbox
Running programs downloaded from the Internet is potentially dangerous. You do not know much about the people behind the sites you visit, and they do not necessarily mean well. Running programs written by people who mean you harm is how you get your computer infected with viruses, lose your data, or hand access to your accounts to third parties.
Yet the attraction of the Web is that you can surf it without having to trust all the pages you visit. This is why browsers severely limit what a JavaScript program may do. It cannot look at the files on your computer or modify anything not related to the page it was embedded in.
Isolating a programming environment in this way is called sandboxing, the idea being that the program is playing harmlessly in a sandbox. But you should imagine this particular kind of sandbox as a cage made of thick steel bars.
The hard part of sandboxing is allowing programs enough room to be useful while at the same time restricting them from doing anything dangerous. Lots of useful functionality, such as communicating with other servers or reading the contents of the clipboard, can also be used to invade the user's privacy.
Every now and then, someone comes up with a new way to circumvent the limitations of a browser and do something harmful, ranging from leaking minor private information to taking over the whole machine the browser runs on. The browser developers respond by fixing the hole, and all is well again - until the next problem is discovered and, hopefully, publicized rather than secretly exploited by some government or mafia.
Compatibility and browser wars
In the early stages of the Web, a browser called Mosaic dominated the market. After a few years, the balance shifted to Netscape, which was in turn largely supplanted by Microsoft's Internet Explorer. At any point where a single browser was dominant, its vendor felt entitled to unilaterally invent new features for the Web. Since most people used the same browser, websites simply started using those features, paying no attention to other browsers.
This was the dark age of compatibility, often called the “browser wars.” Web developers were left with two or three incompatible platforms instead of one unified Web. To make things worse, the browsers in use around 2003 were all full of bugs, and each browser had its own. Life was hard for people writing web pages.
Mozilla Firefox, a not-for-profit offshoot of Netscape, challenged Internet Explorer's hegemony in the late 2000s. Because Microsoft was not particularly interested in staying competitive at the time, Firefox took a substantial share of the market. Around the same time, Google introduced its Chrome browser, and Apple's Safari gained popularity. This led to a situation with four major players rather than one.
The new players had a more serious attitude toward standards and better engineering practices, which led to better compatibility and fewer bugs. Microsoft, seeing its market share shrink, came around and adopted these standards. If you are starting to learn web development today, consider yourself lucky. The latest versions of the major browsers behave quite uniformly and have relatively few bugs.
That is not to say that the situation is perfect just yet. Some of the people using the Web are, for reasons of inertia or corporate policy, stuck with very old browsers. Until those browsers die out entirely, writing websites that work for them will require a great deal of arcane knowledge about their shortcomings and quirks. This book is not about those quirks. Rather, it aims to present the modern, sane style of web programming.