In this article, you will learn the basics of the Web and of the HTML language, so that you better understand the architecture of the Internet in your ongoing projects.
INTERNET HISTORY
• 1962, Paul Baran, US Air Force: fault-tolerant network project
• 1969: ARPANET: university project
• 1972, Ray Tomlinson: email
• 1975, ARPANET: United States Defense Communications Agency, with NCP protocol
• 1983: Arpanet becomes Internet, with TCP/IP protocol
• 1990: CERN, near Geneva on the Franco-Swiss border: World Wide Web (WWW)
You surf the Web thanks to:
- an Internet connection,
- a browser installed on your computer,
- site addresses, or a search tool
- a web server hosting the website, installed in a remote computer
INTERNET / WORLD WIDE WEB
Internet:
Global network, connecting several networks on a planetary scale
WWW:
Huge set of documents stored on computers around the world
- Multimedia hypertext
- Distributed, Multi-platform
- Dynamic, Interactive
- Global
INTERNET PROTOCOLS
- TCP/IP: Transmission Control Protocol / Internet Protocol
- HTTP: HyperText Transfer Protocol
- HTTPS: Secure HTTP
- FTP: File Transfer Protocol
WHO MANAGES THE WEB?
- The web is not controlled by a single entity,
- It is impossible for a single organization to define the rules of the web
- Two groups have a great influence:
the World Wide Web Consortium (W3C) and browser manufacturers
DEFINITIONS
- Web page:
Document on the web which may contain text, images, sound and video
Generally, it is an HTML file, possibly accompanied by other files (images, scripts, etc.)
- Website:
Set of web pages connected by links, hosted on a web server, and accessible to Internet users
- Web project:
All the files making up a website, stored in a folder
IP ADDRESS
- Method of addressing computers in the TCP/IP protocol:
Unique number identifying a computer on the network
- IP Version 4:
4 numbers of 8 bits each (32 bits in total), ranging from 0 to 255, separated by dots
Example: 205.237.24.101
- IP Version 6:
8 numbers of 16 bits each (128 bits in total), written in hexadecimal and separated by colons
Example: 2001:0db8:0000:85a3:0000:0000:ac1f:8001
- The TCP/IP suite includes a system called DNS (Domain Name System), which associates names with the IP addresses of servers so that they are easier to use and remember
Example: 205.237.24.101 is associated with the name www.cybermachin.qc.ca
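The two address formats can be checked with a short sketch. It uses Python's standard `ipaddress` module and the example addresses given above; the variable names are only illustrative.

```python
import ipaddress

# Example addresses taken from the text above.
v4 = ipaddress.ip_address("205.237.24.101")
v6 = ipaddress.ip_address("2001:0db8:0000:85a3:0000:0000:ac1f:8001")

# An IPv4 address is stored on 4 bytes (4 x 8 = 32 bits),
# an IPv6 address on 16 bytes (8 x 16 = 128 bits).
v4_bytes = list(v4.packed)
v4_bits = len(v4.packed) * 8
v6_bits = len(v6.packed) * 8

# IPv6 addresses are often written in a shortened form: leading zeros
# are dropped and one run of all-zero groups is replaced by "::".
v6_short = v6.compressed

print(v4.version, v4_bytes, v4_bits)
print(v6.version, v6_bits, v6_short)
```

Each of the four IPv4 numbers fits in one byte, which is why its values range from 0 to 255.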
URL ADDRESS
Uniform Resource Locator: pointer directed to an element located on the web, such as a web page.
Contains the following information:
- The protocol used: HTTP, FTP, ...
- The name of the server or its IP address, indicating where to look for the information: www.cyber.com, 10.1.0.50, …
- The location of the file within the site; if it is not given, a default location and a default file are used
example: http://www.umontreal.ca/
http://132.204.5.67
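As an illustration, the parts of a URL listed above can be separated with Python's standard `urllib.parse` module. This is a minimal sketch: the helper `describe_url` and the path `/docs/page.html` are hypothetical and only serve the example.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the pieces described in the text."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "server": parts.hostname,
        # When no location is given, the server falls back to a default one.
        "location": parts.path or "/",
    }

print(describe_url("http://www.umontreal.ca/"))
print(describe_url("http://132.204.5.67"))
print(describe_url("http://www.cyber.com/docs/page.html"))  # hypothetical path
```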
WEB BROWSER
Program that allows you to consult and search for information on the web.
Must be installed on client workstations.
Purpose of browsers:
- interpret and display web pages,
- navigate through web pages,
- interpret client-side scripts
Examples: Microsoft Edge, Mozilla Firefox, Opera, Google Chrome
WEB SERVER
- Program that runs on a remote, powerful computer called a server
- Intended to respond to browser requests
- The browser requests access to a page located on the web server computer, using the HTTP protocol and the server address
- The server sends the requested web page
- The browser then formats the information received from the server.
- The term "web server" is often used for the computer on which the server program runs; strictly speaking, that computer is the host, because it hosts the website's programs and files.
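The request and response exchange described above can be sketched without any network access. The following toy simulation only builds and reads the text messages that HTTP uses; the page store `PAGES`, the function names, and the address www.site.com are assumptions made for the example, not a real server implementation.

```python
# Hypothetical pages held by the toy server: path -> HTML body.
PAGES = {"/docB.html": "<html><body>Document B</body></html>"}

def build_request(host, path):
    """Text a browser could send to ask for one page (HTTP/1.1 GET)."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"

def handle_request(raw_request):
    """Toy server: read the request line, answer with the page or a 404."""
    request_line = raw_request.split("\r\n", 1)[0]
    _method, path, _version = request_line.split(" ")
    if path in PAGES:
        status, body = "200 OK", PAGES[path]
    else:
        status, body = "404 Not Found", "<html><body>Not found</body></html>"
    headers = (
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body.encode())}\r\n"
    )
    return f"HTTP/1.1 {status}\r\n{headers}\r\n{body}"

def parse_response(raw_response):
    """Browser side: separate the status code from the page content."""
    head, body = raw_response.split("\r\n\r\n", 1)
    status_code = int(head.split(" ")[1])
    return status_code, body

request = build_request("www.site.com", "/docB.html")
status_code, body = parse_response(handle_request(request))
print(status_code, body)
```

A real browser and server exchange the same kind of text over a TCP connection, with many more headers.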
WEB APPLICATION
- Application hosted in a web server
- Has a web page interface
- Allows you to create a dynamic website
- Two kinds of technology currently coexist:
client-side technology
server-side technology
CLIENT-SIDE WEB TECHNOLOGY
- Generally uses languages integrated with browsers, like JavaScript
- The programs (scripts) are interpreted on the client workstation after the page is downloaded
- The processing carried out is light: validations
SERVER-SIDE WEB TECHNOLOGY
- Generally uses server-level languages, such as PHP, ASP, or Java
- Programs (scripts) are executed on the server before the page is downloaded
- The processing carried out can be heavy: calculations, for example
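To illustrate the idea of a page being produced on the server before it is downloaded, here is a minimal sketch in Python (the text above names PHP, ASP, and Java; Python is used here only for illustration). The function `render_page` and its parameters are hypothetical.

```python
import html
from datetime import date

def render_page(visitor_name, today):
    """Build the HTML on the server; the browser only receives the result."""
    # Escape user-supplied text so it cannot inject its own tags.
    safe_name = html.escape(visitor_name)
    return (
        "<html><body>"
        f"<p>Hello, {safe_name}! Today is {today.isoformat()}.</p>"
        "</body></html>"
    )

page = render_page("Ada <admin>", date(2024, 1, 15))
print(page)
```

The browser never sees the Python code, only the finished HTML text.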
DATA ENCRYPTION / COMPRESSION
Data encryption:
- Consists of encoding the information before sending it over the network, so that it cannot be read if it is intercepted.
- It is decoded upon receipt
- Secure sites use encryption, with the HTTPS protocol
Example: RSA encryption, …
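The principle behind RSA can be shown with a toy sketch. The primes below are deliberately tiny, so this is an illustration of the arithmetic only and offers no security; real keys use numbers hundreds of digits long, and real HTTPS connections involve more steps than this. All names and values are chosen for the example.

```python
# Toy RSA with tiny primes, for illustration only.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent, chosen coprime with phi
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

def encrypt(m):
    """Encode a number smaller than n with the public key (e, n)."""
    return pow(m, e, n)

def decrypt(c):
    """Decode with the private key (d, n)."""
    return pow(c, d, n)

message = 65
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)
print(ciphertext, recovered)
```

Anyone may know `e` and `n`, but only the holder of `d` can recover the message.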
Data Compression:
- Consists of reducing the size of the information in order to facilitate its transfer and storage; lossless formats (zip, jar) restore the data exactly, while lossy formats (jpeg) discard some detail
Example: zip, jpeg, jar, …
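Lossless compression can be tried with Python's standard `zlib` module, which implements the DEFLATE method also used in zip files. The sample data is made up for the example.

```python
import zlib

# Repetitive data, such as HTML markup, compresses well.
original = b"<p>Hello web!</p>" * 100
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed), restored == original)
```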
THE DIFFERENT TYPES OF WEBSITES
Websites are generally classified by area of use, indicated by a suffix at the end of the site address.
The main ones are:
- .com: commercial
- .gov: government
- .org: organizations
- .edu: education
- .qc.ca, .fr, … : depending on the country
INTERNET GLOSSARY
- INTERNET: Network of networks, planetary scale
- World Wide Web or Web: Worldwide web of documents with hypertext links
- TCP/IP: Transmission Control Protocol / Internet Protocol
- IP address: Unique address of a node (such as a computer) on the Internet
- URL: Uniform Resource Locator, address on the net
- HTTP: Hypertext Transfer Protocol, protocol for transferring web pages and data
- HTML language: HyperText Markup Language, language for presenting web content
- Tags: HTML elements, written between < >
- Browser: Client software, sends requests to the web server, such as Edge, Firefox
- Web server: Server software, receives and processes requests from the browser, such as Apache, Tomcat, IIS
- Web page: a hypertext document, containing hyperlinks, published on the Internet
- Website: set of web pages hosted on a web server
- Website designer: site developer
- Webmaster: Site Administrator
Web Architecture with Proxy and Cache
The proxy-cache plays the same role as your computer’s cache memory: it retains the last data used and when it is requested again, returns it much faster than if it were to be requested for the first time.
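The caching behaviour described above can be sketched as follows. This is a simplified model under stated assumptions: the class `ProxyCache` and the function `fake_origin` are hypothetical, and a real proxy would also expire or evict old entries.

```python
class ProxyCache:
    """Keep a copy of each page so repeated requests skip the origin server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._store = {}           # url -> cached content
        self.origin_requests = 0   # how often the origin was really contacted

    def get(self, url):
        if url not in self._store:
            # First request: ask the origin server and remember the answer.
            self.origin_requests += 1
            self._store[url] = self._fetch(url)
        # Later requests are answered directly from the cache.
        return self._store[url]

def fake_origin(url):
    """Stand-in for a slow remote web server."""
    return f"<html>content of {url}</html>"

proxy = ProxyCache(fake_origin)
first = proxy.get("http://www.site.com/docB.html")
second = proxy.get("http://www.site.com/docB.html")
print(proxy.origin_requests, first == second)
```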
Log Files
Information retained in the log files:
- Client IP address
- day + time (hh:mm:ss) of access
- Client browser type (Mozilla, Edge, Opera, Chrome, etc.)
- Resource requested (page or Script)
- Keywords used in search engines
- Execution errors (e.g., 403 Forbidden, 404 Not Found, 500 Internal Server Error, etc.)
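As a sketch of how this information can be read back, the snippet below parses one log line with a regular expression. It assumes the widely used "combined" log layout; the actual layout depends on the server configuration, and the sample line is invented for the example.

```python
import re

# Pattern for one line in the assumed "combined" log layout.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<browser>[^"]*)"'
)

# Invented sample line, not taken from a real server.
sample_line = (
    '205.237.24.101 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /docB.html HTTP/1.1" 404 153 '
    '"http://www.site.com/" "Mozilla/5.0"'
)

def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry

entry = parse_log_line(sample_line)
print(entry["ip"], entry["time"], entry["resource"], entry["status"], entry["browser"])
```

Search keywords, when present, usually appear inside the referrer field.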
The links
To a document in the same folder:
<a href="docB.html">link for document B</a>
To a document in another site:
<a href="http://www.site.com/docB.html">
link for document B in another site
</a>
To an email address:
<a href="mailto:winckler@irit.fr">
email from Marco Winckler
</a>
To a region in another document:
<a href="docB.html#target">
link for the "target" region of document B
</a>
With an image as anchor:
<a href="/">
<img src="logo.gif" alt="home page">
</a>
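To see how a program reads these links, here is a minimal sketch that collects the `href` values with Python's standard `html.parser` module. The class name `LinkCollector` is hypothetical, and the sample page simply reuses the links shown above.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

page = '''
<a href="docB.html">link for document B</a>
<a href="http://www.site.com/docB.html">link for document B in another site</a>
<a href="mailto:winckler@irit.fr">email from Marco Winckler</a>
<a href="docB.html#target">link for the "target" region of document B</a>
<a href="/"><img src="logo.gif" alt="home page"></a>
'''

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```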
List with sub elements
<ol type="1">
<li>WEB
<ol type="a">
<li>HTML</li>
<li>CSS</li>
</ol> </li>
<li>Graphics
<ol type="a">
<li>Images</li>
<li>Drawings</li>
</ol></li>
</ol>