Web Basics

In this article, you will learn the basics of the Web and of the HTML language, which will give you a better understanding of Internet architecture for your own projects.

INTERNET HISTORY

• 1962, Paul Baran (RAND Corporation, working for the US Air Force): fault-tolerant network project
• 1969: ARPANET: university project
• 1972, Ray Tomlinson: email
• 1975: ARPANET is handed over to the United States Defense Communications Agency; it runs on the NCP protocol
• 1983: ARPANET adopts the TCP/IP protocol and becomes the Internet
• 1990: CERN, near Geneva on the Franco-Swiss border: World Wide Web (WWW)

You surf the Web thanks to:

  • an Internet connection,
  • a browser installed on your computer,
  • site addresses, or a search tool,
  • a web server hosting the website, installed on a remote computer

INTERNET / WORLD WIDE WEB

Internet:
Global network that interconnects many smaller networks

WWW:
Huge set of documents stored on computers around the world

  • Multimedia hypertext
  • Distributed, Multi-platform
  • Dynamic, Interactive
  • Global

INTERNET PROTOCOLS

  • TCP/IP:
    Transmission Control Protocol / Internet Protocol
  • HTTP:
    HyperText Transfer Protocol
  • HTTPS:
    Secure HTTP
  • FTP:
    File Transfer Protocol

WHO MANAGES THE WEB?

  • The web is not controlled by a single entity
  • It is impossible for a single organization to define the rules of the web
  • Two groups have a great influence:
    the World Wide Web Consortium (W3C) and browser manufacturers

DEFINITIONS

  • Web page:
    Document on the web which may contain text, images, sound and video
    Generally, it is an HTML file, possibly accompanied by other files (images, scripts, etc.)
  • Website:
    Set of web pages connected by links, hosted on a web server, and accessible to Internet users
  • Web project:
    All the files making up a website, stored in a folder

IP ADDRESS

  • Method of addressing computers in the TCP/IP protocol:
    Unique number of a computer in the network
  • IP Version 4:
    4 numbers of 8 bits each (32 bits in total), ranging from 0 to 255, separated by dots
    Example: 205.237.24.101
  • IP Version 6:
    8 numbers of 16 bits each (128 bits in total), written in hexadecimal and separated by colons
    Example: 2001:0db8:0000:85a3:0000:0000:ac1f:8001
  • The Domain Name System (DNS), which works on top of TCP/IP, associates names with the IP addresses of servers so that they are easier to use and remember
    Example: 205.237.24.101 is associated with the name www.cybermachin.qc.ca
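
The sizes quoted above can be checked with a short sketch using Python's standard `ipaddress` module. The helper name `describe_address` is an illustrative assumption, not part of the article; the two addresses are the examples given above.

```python
import ipaddress

def describe_address(text):
    """Return basic facts about an IPv4 or IPv6 address string."""
    address = ipaddress.ip_address(text)
    return {
        "version": address.version,
        "bits": address.max_prefixlen,     # 32 for IPv4, 128 for IPv6
        "bytes": list(address.packed),     # raw bytes, most significant first
        "short_form": address.compressed,  # leading zeros dropped, one zero run collapsed to "::"
    }

v4 = describe_address("205.237.24.101")
v6 = describe_address("2001:0db8:0000:85a3:0000:0000:ac1f:8001")

print(v4["bits"], v4["bytes"])
print(v6["bits"], v6["short_form"])
```

Each of the four IPv4 numbers fits in one byte, which is why it cannot exceed 255. The IPv6 short form is only a more compact way of writing the same 128-bit address.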

URL ADDRESS

Uniform Resource Locator: pointer directed to an element located on the web, such as a web page.
Contains the following information:

  • The protocol used: HTTP, FTP, …
  • The name of the server or its IP address, indicating where to look for the information: www.cyber.com, 10.1.0.50, …
  • The location of the file within the site. If it is not given, a default location and a default file are used
    example: http://www.umontreal.ca/
    http://132.204.5.67
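
As a rough illustration of these three parts, the sketch below splits the two example URLs with Python's standard `urllib.parse`. The function name `describe_url` and the choice to report an empty path as "/" are assumptions made for this example.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the three parts listed above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,    # e.g. "http" or "ftp"
        "server": parts.hostname,    # a name or an IP address
        "path": parts.path or "/",   # no path given: fall back to the site root
    }

print(describe_url("http://www.umontreal.ca/"))
print(describe_url("http://132.204.5.67"))
```

Which file is actually returned for the root path is decided by the server's configuration; `index.html` is a common convention, not a rule.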

WEB BROWSER

Program that allows you to consult and search for information on the web.
Must be installed on client workstations.
Purpose of browsers:

  • interpret and display web pages,
  • navigate between web pages,
  • interpret client-side scripts.

Examples: Microsoft Edge, Mozilla Firefox, Opera, Google Chrome

WEB SERVER

  • Program that runs on a remote, powerful computer called a server
  • Intended to respond to browser requests
  • The browser requests access to a page located on the web server computer, using the HTTP protocol and the server address
  • The server sends the requested web page
  • The browser then formats the information received from the server.
  • The web server is often confused with the computer where the server program is located, which is called Host, because it hosts the website program and files.
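
The request and response exchange described above can be sketched on a single machine with Python's standard library. Everything specific here is an assumption for illustration: the page content, the `PageHandler` and `fetch_once` names, and the use of the local address 127.0.0.1 with a port picked by the operating system.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello from the server</h1></body></html>"

class PageHandler(BaseHTTPRequestHandler):
    """Server side: answer every GET request with the same small page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # silence the default per-request log line

def fetch_once():
    """Start a local server, request one page as a browser would, then stop."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Client side: ignore any system proxy so the request goes straight to the server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"http://127.0.0.1:{port}/index.html", timeout=5) as response:
            return response.status, response.read()
    finally:
        server.shutdown()
        server.server_close()

status, body = fetch_once()
print(status, body.decode("utf-8"))
```

A real browser would go one step further and render the returned HTML instead of printing it.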

WEB APPLICATION

  • Application hosted on a web server
  • Has a web page interface
  • Makes it possible to build a dynamic website
  • Two kinds of technology currently coexist:
    client-side technology
    server-side technology

CLIENT-SIDE WEB TECHNOLOGY

  • Generally uses languages built into browsers, such as JavaScript
  • The programs (scripts) are interpreted on the client workstation after the page is downloaded
  • The processing carried out is usually light, for example validating form input

SERVER-SIDE WEB TECHNOLOGY

  • Uses server-side languages, such as PHP, ASP, or Java
  • The programs (scripts) are executed on the server before the page is downloaded
  • The processing carried out is usually heavier, for example calculations
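
To make the server-side idea concrete, here is a minimal sketch of a script that performs a calculation and builds the page before anything is sent to the browser. It is written in Python rather than the languages listed above, and the function name `render_order_page` and the order data are invented for the example.

```python
from html import escape

def render_order_page(customer, prices):
    """Compute the total on the server, then build the HTML sent to the browser."""
    total = sum(prices)
    rows = "".join(f"<li>{price:.2f}</li>" for price in prices)
    return (
        "<html><body>"
        f"<h1>Order for {escape(customer)}</h1>"  # escape user-supplied text
        f"<ul>{rows}</ul>"
        f"<p>Total: {total:.2f}</p>"
        "</body></html>"
    )

page = render_order_page("A & B", [10, 2.5])
print(page)
```

The browser only ever receives the finished HTML; the calculation itself never leaves the server.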

DATA ENCRYPTION / COMPRESSION

Data encryption:

  • Consists of encoding the information before sending it over the network so that anyone who intercepts it cannot read it.
  • It is decoded upon receipt.
  • Secure sites use encryption, with the HTTPS protocol
    Example: RSA encryption, …

Data Compression:

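  • Consists of reducing the size of the information in order to facilitate its transfer and storage
  • Lossless formats restore the data exactly; lossy formats discard some detail to save more space
    Example: zip and jar (lossless), jpeg (lossy), …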
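
Lossless compression can be demonstrated with Python's standard `zlib` module, which implements the DEFLATE method also used in zip archives. The helper name `compress_round_trip` and the sample data are assumptions for this sketch.

```python
import zlib

def compress_round_trip(data):
    """Compress, decompress, and report sizes plus whether the data survived intact."""
    packed = zlib.compress(data)
    restored = zlib.decompress(packed)
    return len(data), len(packed), restored == data

# Repetitive content, such as HTML markup, compresses particularly well.
original = b"<li>item</li>\n" * 200
size_before, size_after, identical = compress_round_trip(original)
print(size_before, size_after, identical)
```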

THE DIFFERENT TYPES OF WEBSITES

Websites are generally classified by area of use, indicated by a suffix (the top-level domain) at the end of the site's address.
The main ones are:

  • .com: commerce
  • .gov: government
  • .org: organizations
  • .edu: education
  • .ca, .fr, …: country (.qc.ca is used for Quebec within .ca)

INTERNET GLOSSARY

  • INTERNET: Network of networks, planetary scale
  • World Wide Web or Web: Worldwide web of documents with hypertext links
  • TCP/IP: Transmission Control Protocol / Internet Protocol
  • IP address: Unique address of a node (such as a computer) on the Internet
  • URL: Uniform Resource Locator, address of a resource on the net
  • HTTP: HyperText Transfer Protocol, protocol for transferring web pages and data
  • HTML: HyperText Markup Language, the language used to structure and present web pages
  • Tags: markers written between < and > that delimit HTML elements
  • Browser: Client software, such as Edge or Firefox, that sends requests to the web server
  • Web server: Server software, such as Apache, Tomcat, or IIS, that receives and processes requests from the browser
  • Web page: a hypertext document, containing hyperlinks, published on the Internet
  • Website: set of web pages hosted on a web server
  • Website designer: site developer
  • Webmaster: Site Administrator

Web Architecture with Proxy and Cache

The proxy-cache plays the same role as your computer’s cache memory: it keeps recently used data and, when that data is requested again, returns it much faster than a first-time request would.

[Figure: web architecture with a proxy cache]
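
The caching behaviour can be sketched in a few lines. This is a simplified model under stated assumptions: the `ProxyCache` class, the `origin` function standing in for the remote web server, and the example URL are all hypothetical, and the cache never expires anything.

```python
class ProxyCache:
    """Keep a copy of each fetched page and reuse it on later requests."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # function that contacts the real server
        self._store = {}                 # URL -> page body
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1               # served from the cache, no network trip
            return self._store[url]
        self.misses += 1                 # first request: go to the origin server
        body = self._fetch(url)
        self._store[url] = body
        return body

origin_calls = []

def origin(url):
    """Stand-in for the remote web server; records every request it receives."""
    origin_calls.append(url)
    return f"<html>content of {url}</html>"

cache = ProxyCache(origin)
first = cache.get("http://www.site.com/docB.html")
second = cache.get("http://www.site.com/docB.html")
print(cache.hits, cache.misses, len(origin_calls))
```

Real proxies also decide when a stored copy is too old to reuse, typically based on HTTP caching headers, which this sketch leaves out.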

Log Files

Information retained in the log files:

  • Client IP address
  • Date and time (hh:mm:ss) of access
  • Client browser type (Mozilla, Edge, Opera, Chrome, etc.)
  • Resource requested (page or script)
  • Keywords used in search engines
  • Execution errors (e.g. 403 Forbidden, 404 Not Found, 500 Internal Server Error)
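
As an illustration, the sketch below extracts these fields from one log line. It assumes the widely used "combined" log layout; the sample line, the `parse_log_line` name, and the idea that search keywords arrive in a `q` parameter of the referring URL are assumptions for the example, since the article does not specify a log format.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed layout: ip - - [time] "method resource protocol" status size "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return the fields of one log line as a dictionary."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("line is not in the expected format")
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Search keywords, when present, are read from the referring URL's query string.
    query = urlsplit(entry["referrer"]).query
    entry["keywords"] = parse_qs(query).get("q", [])
    return entry

SAMPLE = (
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] '
    '"GET /docB.html HTTP/1.1" 404 209 '
    '"https://www.example.com/search?q=web+basics" "Mozilla/5.0"'
)

entry = parse_log_line(SAMPLE)
print(entry["ip"], entry["resource"], entry["status"], entry["keywords"])
```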

The links

To a document in the same folder:
<a href="docB.html">link for document B</a>

To a document in another site:
<a href="http://www.site.com/docB.html">
link for document B in another site
</a>

To an email address:
<a href="mailto:winckler@irit.fr">
email from Marco Winckler
</a>

To a region in another document:
<a href="docB.html#target">
link for the "target" region of document B
</a>

With an image as anchor:
<a href="/">
<img src="logo.gif" alt="home page">
</a>
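
How a browser turns each `href` above into a full address can be approximated with `urljoin` from Python's standard library. The base address `http://www.site.com/docs/docA.html` (the page assumed to contain the links) and the helper name `resolve_links` are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the links.
BASE = "http://www.site.com/docs/docA.html"

def resolve_links(base, hrefs):
    """Map each href to the absolute address it points to."""
    return {href: urljoin(base, href) for href in hrefs}

resolved = resolve_links(BASE, [
    "docB.html",                     # same folder
    "http://www.site.com/docB.html", # already absolute
    "mailto:winckler@irit.fr",       # different scheme, left untouched
    "docB.html#target",              # region inside another document
    "/",                             # root of the site
])

for href, target in resolved.items():
    print(href, "->", target)
```

Relative links are resolved against the folder of the current page, while a leading "/" always starts from the root of the site.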

List with sub elements

<ol type="1">
  <li>WEB
    <ol type="a">
      <li>HTML</li>
      <li>CSS</li>
    </ol>
  </li>
  <li>Graphics
    <ol type="a">
      <li>Images</li>
      <li>Drawings</li>
    </ol>
  </li>
</ol>

Tables