Understanding the basics of Internet communications

From Joomla! Documentation

Introduction

When people browse a webpage, they do not care how the Internet actually works, how their computer manages to connect to it, or how the browser manages to display the text and images of that page. All these details are "transparent" to the user (the user neither notices nor needs to care about them) as long as everything works the way it should. Developers, however, have to learn some basic concepts and principles of how computers communicate through the Internet in order to troubleshoot the common communication problems they will face every day.

Concepts about computer communications

Let us review the following concepts before addressing some theory about computer communications.

Transparent

In the context of computer communications, the term "transparent" means that one system is "unaware" of another. For example, the Internet is a data service that runs equally well over a cable connection or a wireless connection. The cable connection uses a communication protocol called Ethernet, and the wireless connection could use one of several wireless protocols such as Wi-Fi, WiMAX or Bluetooth, but in the end the Internet service is the same and its information is not altered by any of those protocols. In that sense, the communication protocol is transparent to the Internet service and vice versa.

Parts of the communication process

Any communication process has a few basic elements that must be present in order to send a message from one point to another. These parts are:

  • Message: The information that should travel from one point to another.
  • Transmitter: The starting point, the element that wants to send a message.
  • Receiver: The end point, the element that receives the message.
  • Channel: The transmission method used on a specific medium.
  • Medium: The material, substance or electromagnetic wave used to transmit the information.

In the context of computer communication, the message is the binary information that needs to be transmitted, the transmitter and receiver can be any of the computers connected to the network, the transmission medium can be a copper cable, optical fiber, infrared light or radio waves, and the channel is the transmission method implemented on that specific medium.

The concepts of "communication channel" and "communication medium" can be tricky because they depend on the context, and some people treat them as the same thing. For example, an FM radio broadcast uses electromagnetic waves as the "medium", while the "channel" is those electromagnetic waves at one specific frequency (the transmission method). In human voice communication, we use the voice to transmit messages (channel) and the air is our medium: without air, sound waves cannot travel.

On a more specific, scientific level, sound waves are changes in air pressure. These pressure changes are energy propagating through a substance, in this case air, which is a mixture of gases (mainly nitrogen and oxygen). Knowing this, we can say that voice communication uses energy propagation as the communication channel and gases as the communication medium.

Communication and transmission modes

Some transmission media can handle communication in several directions and can also carry several communication channels at the same time. For example:

  • A single strand of optical fiber is normally used to carry light in only one direction, which is why fiber links usually use one strand per direction. This mode of communication is called unidirectional communication, and the transmission mode is called unidirectional transmission.
  • Ethernet copper cables handle two-way communication by using separate copper wires for each direction (4 wires in total in 10 and 100 Mbit/s Ethernet). This mode of communication is called bidirectional communication, and the transmission mode is called bidirectional transmission.
  • A Wi-Fi router uses microwaves to transmit information to one or more devices over several ranges of microwave frequencies, and the devices send information back to the router using microwaves as well. The medium is electromagnetic waves and the channels are those waves at different frequencies. This kind of communication is called bidirectional communication, and the transmission mode is called omnidirectional transmission because the electromagnetic waves travel in all directions.
  • FM radio stations use omnidirectional antennas to broadcast to any device within range. This kind of communication is called unidirectional communication, and the transmission mode is called omnidirectional transmission because the electromagnetic waves travel in all directions.
  • Some companies use directional microwave antennas to transmit information between just two buildings. This kind of communication is called bidirectional communication, and the transmission mode is called directional transmission because the electromagnetic waves are focused on one point and direction.

In summary:

  • Unidirectional communication = Messages travel in one direction only, from the transmitter to the receiver.
  • Bidirectional communication = Messages travel in both directions, so each side can transmit and receive.
  • Unidirectional transmission = Energy can propagate in one direction.
  • Bidirectional transmission = Energy can propagate in 2 directions.
  • Omnidirectional transmission = Energy can propagate in all directions.
  • Directional transmission = Energy is focused on one particular point and direction.

Communication protocol

A protocol is a set of rules to keep order and control. People learn basic rules of conversation early on: if two people speak at the same time, neither will understand the other. This is a communication problem, and it is why one person speaks while the other listens and then waits for their turn to answer. Computers apply the same concept.

Computer communication protocols define the steps and methods to follow to achieve successful communication between two or more computers. These protocols also include algorithms to recover the communication process from several kinds of fault states after a communication problem.

Communication collisions

A collision is a common problem that happens when computers share the same medium and channel and try to transmit at the same time. When this occurs the information gets corrupted, and the affected transmissions are stopped until the protocol's algorithms resolve the problem.

For example, 5 computers are connected to an Ethernet network using Ethernet cables plugged into a network hub, as shown in the following figure:

+-----+  +-----+  +-----+  +-----+  +-----+  
| PC1 |  | PC2 |  | PC3 |  | PC4 |  | PC5 |  
+-----+  +-----+  +-----+  +-----+  +-----+
   |        |        |        |        |
   |        |        |        |        |
   |        *---*    |    *---*        |
   |            |    |    |            |
   *--------*   |    |    |   *--------*
            |   |    |    |   |      
            |   |    |    |   |
        +-------------------------+
        |           HUB           |
        +-------------------------+

All these computers share the same medium and channel of communication, which means that the information sent by one computer is received by all the others. If PC1 needs to send a message to PC2, the Ethernet protocol only allows it to transmit when the network is silent (no ongoing transmissions). Once silence is detected, PC1 transmits the information to PC2.

Any other computer that was waiting to send will also start as soon as it detects silence, so with hundreds of transmissions per second collisions eventually happen. A computer that detects a collision floods the network with a "collision detected" (jam) signal, something like a coach blowing a whistle continuously until everybody is quiet. All the computers involved then stop transmitting, and each waits a small random amount of time before trying to send its information again. Because the waiting times are random, it is unlikely that the same computers collide again on the next attempt.
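The random waiting time can be sketched in a few lines of Python. This is only an illustration, assuming the "truncated binary exponential backoff" rule used by classic shared-medium Ethernet: after each collision in a row, the range of possible waiting slots doubles. The function names are hypothetical, and the slot time shown is the value for 10 Mbit/s Ethernet.

```python
import random

SLOT_TIME_US = 51.2   # one slot time on 10 Mbit/s Ethernet, in microseconds
MAX_EXPONENT = 10     # the waiting range stops growing after 10 collisions
MAX_ATTEMPTS = 16     # the frame is given up after 16 failed attempts


def max_backoff_slots(collisions):
    """Largest number of slots a station may wait after `collisions` collisions in a row."""
    if not 1 <= collisions < MAX_ATTEMPTS:
        raise ValueError("collisions must be between 1 and 15")
    return 2 ** min(collisions, MAX_EXPONENT) - 1


def backoff_delay_us(collisions, rng=random):
    """Pick a random waiting time (in microseconds) before the next attempt."""
    slots = rng.randint(0, max_backoff_slots(collisions))
    return slots * SLOT_TIME_US


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 15):
        print(n, max_backoff_slots(n), backoff_delay_us(n))
```

Two computers that collide once each pick 0 or 1 slots, so they have an even chance of choosing different delays. Every further collision widens the range and makes another clash less likely.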

Network hubs are old network devices that receive a message on one port and retransmit it to all the other ports. They have been replaced by network switches, which do not retransmit the information to every port. Instead, the switch determines which port leads to the computer that needs the information and sends it only there. This small detail makes the network dramatically faster and almost free of collisions.
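The difference between the two devices can be sketched as follows. This is a simplified model, not how any particular product is implemented: it assumes the switch learns which port each computer is on by looking at the sender address of the traffic it sees (real Ethernet switches do this with hardware MAC addresses). The class names and the use of "PC1", "PC2" as addresses are hypothetical.

```python
class Hub:
    """Repeats everything it receives on all the other ports."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))

    def forward(self, in_port, src, dst):
        # A hub ignores addresses: repeat on every port except the incoming one.
        return [p for p in self.ports if p != in_port]


class LearningSwitch:
    """Sends traffic only to the port where the destination was last seen."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.table = {}  # address -> port where that address was last seen

    def forward(self, in_port, src, dst):
        self.table[src] = in_port  # learn where the sender is connected
        if dst in self.table:
            out_port = self.table[dst]
            return [] if out_port == in_port else [out_port]
        # Unknown destination: behave like a hub this one time.
        return [p for p in self.ports if p != in_port]


if __name__ == "__main__":
    switch = LearningSwitch(5)
    print(switch.forward(0, "PC1", "PC2"))  # PC2 not known yet: sent to all other ports
    print(switch.forward(1, "PC2", "PC1"))  # PC1 was learned on port 0
    print(switch.forward(0, "PC1", "PC2"))  # PC2 was learned on port 1
```

Once the table is filled, each message occupies only the cable of its destination, which is why the other computers can keep transmitting without colliding.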

Wi-Fi routers have to deal with communication collisions all the time

Computer networks based on Wi-Fi can also suffer from collisions, because all the devices share electromagnetic waves as their transmission medium. Some people think that a Wi-Fi router uses the air as its medium, which is wrong: a Wi-Fi signal can be transmitted here on Earth or in space, where there is no air at all, because electromagnetic waves travel on their own. Wi-Fi devices use advanced algorithms and methods to avoid and resolve collisions as much as possible. Wi-Fi uses several ranges of radio frequencies to transmit information, which allows communication between several devices on separate channels, but sometimes two or more devices try to use the same channel at the same time and a collision happens. The recovery algorithm detects that the information was not received correctly and the data is retransmitted. Wi-Fi networks normally work well under many conditions and with a considerable number of connected devices, but problems arise when several Wi-Fi networks are too close to each other. In that case users can experience reduced network speed, among other problems.

LAN

LAN stands for Local Area Network. A LAN is a group of computers connected within a small geographical area, for example a house, an office or a small company.

MAN

MAN stands for Metropolitan Area Network. A MAN is a network of networks connected across an area the size of a city. Some companies have private networks linking all of their branches in a particular city.

WAN

WAN stands for Wide Area Network. A WAN is the network from which the other networks get services such as Internet access. A WAN can be as wide as a country; some big companies have private WANs to interconnect all their branches and offices at a national level.

Internet

The Internet is the network of networks. Its scale is global, and it is also known as cyberspace, where people around the world have access to almost any kind of information and service. Some companies create virtual private networks over an Internet connection to connect all their branches and offices around the world; basically, anyone in the world can create their own virtual private network.

VPN

VPN stands for Virtual Private Network. A VPN uses communication protocols that encrypt the data and create a "logical network" over a plain Internet connection, or over any network such as a WAN, MAN or LAN. A logical network is something like a network inside a virtual world: it can behave the way it needs to while being unaware of the real networks underneath. Internet communications can be intercepted at any point, which represents a huge security threat, but VPN connections are strongly encrypted, which makes the data inaccessible to anyone who is not part of the VPN.

Internet communications

When two computers communicate through the Internet, a huge number of protocols and services do a lot of complex work in less than the blink of an eye to transmit a message or media file, whether from the other side of the world or from the office next door. We could spend a good six months talking about all the protocols and mechanisms that make the magic happen, but we only need to understand a few of them to continue our journey as web developers.

TCP/IP

The most important protocol we need to understand is TCP/IP, also known as the Internet protocol suite. It is actually two protocols working together, and their jobs are:

  • TCP: Chops the information into smaller chunks of data called packets.
  • TCP: Tags each packet with a port number. This port number identifies the application that is waiting for the information at the other end.
  • TCP: Transmits the packets one by one.
  • IP: Tags each packet with the IP addresses of the sender and the destination. The destination IP address is used to route the packet across any number of networks (the Internet).
  • TCP: When a packet arrives at its destination, it is validated to check its data integrity.
  • TCP: The receiver sends "acknowledgement messages" back to the sender confirming which packets arrived correctly, and anything that is not acknowledged is sent again. This ensures that all the packets are valid and none of them is missing.
  • TCP: "Glues" all the packets together to form the original message.

An important characteristic of TCP is that the information is delivered complete, in order and checked for errors, so it is ideal for messages that cannot afford to lose or alter any information during transmission.
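The chop, validate, acknowledge and glue steps can be illustrated with a toy model in Python. This is not real TCP: it is a sketch that assumes a simple per-chunk sequence number and a CRC-32 checksum from the standard `zlib` module, and all the function names are hypothetical.

```python
import random
import zlib


def split_into_packets(message, size):
    """Chop `message` (bytes) into numbered packets, each with its own checksum."""
    packets = []
    for seq, start in enumerate(range(0, len(message), size)):
        chunk = message[start:start + size]
        packets.append({"seq": seq, "data": chunk, "crc": zlib.crc32(chunk)})
    return packets


def receive(packets):
    """Validate packets; return (acknowledged sequence numbers, stored chunks)."""
    stored = {}
    for packet in packets:
        if zlib.crc32(packet["data"]) == packet["crc"]:
            stored[packet["seq"]] = packet["data"]  # valid: keep and acknowledge
    return set(stored), stored


def reassemble(stored, total):
    """Glue the chunks together in sequence order; fail if one is missing."""
    if set(stored) != set(range(total)):
        raise ValueError("some packets are still missing")
    return b"".join(stored[seq] for seq in range(total))


if __name__ == "__main__":
    message = b"Hello from the other side of the world!"
    packets = split_into_packets(message, 8)
    random.shuffle(packets)  # packets may arrive out of order
    acked, stored = receive(packets)
    print(reassemble(stored, len(packets)) == message)
```

In this model the sender would compare the acknowledged sequence numbers with the ones it sent and transmit the missing packets again until the receiver can rebuild the whole message.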

UDP/IP

UDP is the counterpart of TCP. Its job is similar, and it also works together with IP:

  • UDP: Sends the information as independent chunks of data called packets (in UDP they are also known as datagrams).
  • UDP: Tags each packet with a port number. This port number identifies the application that is waiting for the information at the other end.
  • UDP: Transmits the packets one by one, without waiting for any confirmation.
  • IP: Tags each packet with the IP addresses of the sender and the destination. The destination IP address is used to route the packet across any number of networks (the Internet).
  • UDP: When a packet arrives at its destination, a checksum is used to check its data integrity; corrupted packets are simply discarded.
  • UDP: Hands whatever arrived to the application, which "tries" to form the original message even if some packets are missing or out of order.

UDP uses an "optimistic philosophy": it does the best it can to send and receive all the packets, but does nothing about missing or corrupted ones. In other words, UDP does not send back an acknowledgement for each packet to ensure that all of them arrived and are valid.

This protocol may sound error prone, and it is, but it has one advantage over TCP: speed. Suppose a packet (any packet) takes about 200 ms to arrive at its destination. If it is a TCP packet, the receiver sends back an acknowledgement that takes another 200 ms to return, so the sender needs about 400 ms to know that a single packet was delivered. UDP, on the other hand, sends the packet and forgets about it, so a UDP packet completes its transmission in 200 ms. This makes UDP especially attractive for data streaming such as video game connections, radio broadcasts, video broadcasts and Voice over IP, where the loss of small pieces of information is unimportant or almost imperceptible.
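The "send and forget" behaviour can be tried with Python's standard `socket` module. The sketch below assumes both ends run on the same machine (the loopback address 127.0.0.1) and lets the operating system pick a free port; the function name is hypothetical.

```python
import socket


def send_one_datagram(payload):
    """Send `payload` over UDP to a receiver on this same machine and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # do not wait forever for a lost packet
        port = receiver.getsockname()[1]

        # Fire and forget: no connection is opened and no acknowledgement comes back.
        sender.sendto(payload, ("127.0.0.1", port))

        data, _sender_address = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(send_one_datagram(b"frame 1 of a video stream"))
```

Note that `sendto` returns as soon as the packet is handed to the network. If the packet were lost on a real network, the sender would never find out, and the receiver would only notice through its timeout.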

Network sockets

A network socket address is composed of an IP address and a TCP or UDP port number, like this:

 IP-number:Port-number 

For example

 173.194.73.106:80 

That is an IP address and port number for browsing the www.google.com home page.

A network socket identifies the computer that is going to receive a packet (the IP address) and the application that is waiting for that packet (the port number). Looking at the Google socket, "173.194.73.106" is the server's IP address and "80" is the port, a standard port also known as the HTTP port, used for web services.
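As a small illustration, the IP-number:Port-number notation can be split and validated with Python's standard `ipaddress` module. The function name is hypothetical and the sketch only handles IPv4 addresses.

```python
import ipaddress


def parse_socket_address(text):
    """Split "IP:port" into an (IPv4Address, int) pair, validating both parts."""
    host, separator, port_text = text.strip().rpartition(":")
    if not separator:
        raise ValueError("expected the form IP-number:Port-number")
    address = ipaddress.IPv4Address(host)  # raises ValueError if malformed
    port = int(port_text)
    if not 0 <= port <= 65535:
        raise ValueError("port numbers range from 0 to 65535")
    return address, port


if __name__ == "__main__":
    address, port = parse_socket_address("173.194.73.106:80")
    print(address, port)
```

The pair returned here has the same shape as the `(host, port)` tuples that Python's `socket` functions expect once the address is converted back to a string.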

Servers, services and clients

The meaning of the term "server" depends on the context. In the hardware context, a server is normally a dedicated computer that provides data services continuously, 24/7. For example, a web server could be a single computer dedicated to handling and delivering web content and applications such as web pages.

In the software context, a server is an application, also known as a service. These applications normally run in the background as daemons: applications/services that run silently, waiting and listening for incoming requests. For example, Apache is a web server that serves web pages; it listens for requests arriving on port 80 and delivers the requested web pages or web information.

The meaning of the term "client" also depends on the context. In the software context, a client is the application that makes requests to the server, receives the information coming from the server and interprets it for the user. For example, web browsers such as Google Chrome or Mozilla Firefox are web clients: they make HTTP requests to web servers and, once the information is received, display the content so the user can read (consume) it.

In the context of computer communication applications, the roles are defined by who starts the conversation: the client is the side that initiates a request, and the server is the side that waits for requests and answers them. Information flows in both directions (request and response) without the roles changing. The same computer or program can still play both roles at different times; for example, a web server acts as a client when it asks a DNS server or a database server for information.
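A minimal sketch of a server that listens and a client that sends it a request, using Python's standard `socket` and `threading` modules. It assumes both sides run on the same machine and exchange a single short message; the function names and the reply text are hypothetical.

```python
import socket
import threading


def serve_one_request(listener):
    """Server side: wait for one client, read its request, send a reply."""
    connection, _client_address = listener.accept()
    with connection:
        request = connection.recv(1024)  # enough for the short request used here
        connection.sendall(b"You asked for: " + request)


def ask_server(port, request):
    """Client side: open the connection, send a request, read the whole reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
        client.sendall(request)
        chunks = []
        while True:
            chunk = client.recv(1024)
            if not chunk:  # the server closed the connection: reply complete
                break
            chunks.append(chunk)
        return b"".join(chunks)


def demo(request=b"index.html"):
    # The server starts listening first, like a daemon waiting in the background.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_one_request, args=(listener,))
        worker.start()
        reply = ask_server(port, request)
        worker.join()
        return reply


if __name__ == "__main__":
    print(demo())
```

A real server such as Apache does the same thing in a loop, accepting many connections at once on a well-known port instead of a port chosen by the operating system.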

Domain Names and DNS servers

Domain names are human-readable alternatives to IP addresses; people are more comfortable memorizing names than numbers. A domain name is translated into an IP address by DNS (Domain Name System) servers, which together hold the records that map Internet domain names to their assigned IP addresses. For example, the following list shows some popular domain names and IP addresses they have resolved to (these change over time):

  • www.google.com ---> 173.194.75.104
  • www.joomla.org ---> 206.123.111.172
  • www.php.net ---> 69.147.83.199

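Type the IP address in your browser's address bar to visit a page using the IP address instead of the domain name; if the DNS server is down, going directly to the IP address may still work. Note that this does not work for every site, because many servers host several domains on one IP address and need the domain name to know which site to serve.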

Thanks to domain names, a website can keep its domain name and change its IP address at any time. When someone changes the IP address of a domain name, the new address is not recognized everywhere immediately; it can take up to 72 hours to become visible, because DNS servers around the world keep copies of earlier answers and only update them periodically. When a computer makes a DNS resolution request, the result is also saved in a local DNS cache for a certain amount of time, so the computer doesn't have to ask for DNS resolution on every page request.

To resolve domain names you need the IP address of a DNS server. These addresses are normally assigned automatically by your ISP (Internet Service Provider); check your current network configuration to see which DNS servers are in use.

The following steps show how a computer makes a DNS resolution request:

  • The PC requests the page "www.joomla.org".
  • The PC sends a DNS request to a DNS server.
  • The DNS server receives the request and resolves the domain name "www.joomla.org" ---> "206.123.111.172".
  • The PC receives the IP address and makes an HTTP request to load the page.
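The resolution steps, together with the local DNS cache mentioned earlier, can be sketched in Python. By default the sketch asks the operating system's resolver through `socket.gethostbyname`; the class name, the 300-second lifetime and the injectable `resolver` and `clock` parameters are assumptions made for illustration, not how any particular operating system implements its cache.

```python
import socket
import time


class DnsCache:
    """Remember name -> IP answers for `ttl_seconds` before asking again."""

    def __init__(self, resolver=socket.gethostbyname, ttl_seconds=300.0,
                 clock=time.monotonic):
        self.resolver = resolver
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # name -> (ip, time when the entry expires)

    def lookup(self, name):
        now = self.clock()
        cached = self.entries.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]  # still fresh: no DNS request is made
        ip = self.resolver(name)  # ask the DNS server
        self.entries[name] = (ip, now + self.ttl_seconds)
        return ip


if __name__ == "__main__":
    cache = DnsCache()
    print(cache.lookup("www.joomla.org"))  # first call: real DNS request
    print(cache.lookup("www.joomla.org"))  # second call: answered from the cache
```

The same expiry idea explains why an IP change takes time to become visible: every cache along the way keeps serving the old answer until its copy expires.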

HTTP and HTTPS

HTTP stands for HyperText Transfer Protocol. This protocol is the basis of web communications. Requests and responses are sent as plain text, and the pages themselves are usually written in HTML markup; the browser is in charge of reading, parsing, interpreting and rendering the text, figures and images that the user sees on the screen.

This protocol runs over TCP/IP and uses port 80 as the standard port for HTTP transmissions.
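To show what "plain text" means here, the following Python sketch builds the text of a simple HTTP/1.1 GET request, the kind of message a browser sends over its TCP connection to port 80. The function name is hypothetical, and a real browser sends many more header lines.

```python
def build_get_request(host, path="/"):
    """Return the bytes a simple client would send to request `path` from `host`."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, resource and protocol version
        f"Host: {host}",         # which website on that server we want
        "Connection: close",     # ask the server to close when it is done
        "",                      # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    print(build_get_request("www.joomla.org").decode("ascii"))
```

The server answers in the same style: a status line such as `HTTP/1.1 200 OK`, some header lines, an empty line and then the content of the page.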

FTP and SFTP

PING
