Linux Articles

Tuesday, October 07, 2008

Basic HTTP Protocol: How a Web Server and a Web Client (Browser) Talk to Each Other


A web server is a server used specifically to deliver web pages or websites. It stores pages in its directories and serves them when a web client (browser) requests them via the HTTP protocol. Every web server has an IP address (Internet Protocol address), which lets the browser locate the server on the Internet.

The HTTP protocol is used to distribute information on the World Wide Web in the form of websites, web pages and so on. Its main aim is to distribute information over the Internet.

Suppose you have a printed document to distribute to the public, your visitors or your customers. What will you do? First you create the document by printing the information you want to distribute on paper or other media. When the document is ready, you inform your customers or the public that they can collect it from your office front desk. People come to your office, ask the front desk for the document and take it away.

If we translate this example into HTTP terms, the printed document is the 'web page', your office is the 'web server' and your front desk is the 'port' that serves HTTP requests.

Below are the steps you might take to distribute information in a printed document to your customers or the general public:

  1. You prepare a document by printing the information you want to distribute on paper.
  2. You store it in your office.
  3. You tell your front desk to hand over the paper to anyone who asks for it.
  4. You inform your customers or the public that an informational document is available from your office front desk.
  5. Your customer visits your office.
  6. Your customer asks for the document at the front desk.
  7. Your front desk officer verifies whether the customer is asking for the correct document.
  8. If so, the front desk gives the document to the person who asked for it.
  9. If not, the officer informs the visitor that the requested document is not available at your office.
In the steps above, you distribute the information as a printed document, which limits the number of copies and may be costly. Now let's distribute the same information over the Internet via the HTTP protocol, on what is widely known as the World Wide Web (WWW).

  1. You create a web page with the information you want to distribute.
  2. You store it on your web server.
  3. You configure your web server to give out only the right document (web page).
  4. You publish your web server address (URL) to your target audience.
  5. Your audience enters your web server address in their browser (web client).
  6. Your audience's browser connects to your web server via the HTTP protocol on port 80, the standard server port for HTTP, and asks for the document you offered.
  7. Your web server verifies whether the browser is asking for the right document (web page).
  8. If the browser is asking for the right document, your web server gives it away.
  9. If the browser is asking for a document you are not distributing, your web server informs it that the requested document is not available.
From the example above, you can see how the HTTP protocol can distribute information in an unlimited number of copies.
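
The nine web server steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how Apache works internally: the `PAGES` dictionary, the page text, the names `lookup`, `FrontDesk` and `demo`, and the use of a free local port instead of port 80 are all made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "office shelf": the only documents this server hands out.
PAGES = {"/articles": b"<html><body>My article</body></html>"}


def lookup(path):
    """Steps 7 to 9: decide whether the requested document is available."""
    body = PAGES.get(path)
    if body is None:
        return 404, b"<html><body>Not Found</body></html>"
    return 200, body


class FrontDesk(BaseHTTPRequestHandler):
    """Plays the front desk: checks each request and answers it."""

    def do_GET(self):
        status, body = lookup(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def demo():
    """Start the server locally, request two paths, return the status codes."""
    # Port 0 asks the OS for any free port; a public site would use port 80.
    server = HTTPServer(("127.0.0.1", 0), FrontDesk)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    statuses = []
    try:
        for path in ("/articles", "/falsefile"):
            conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()
            statuses.append(response.status)
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return statuses
```

Calling `demo()` requests one page that exists and one that does not, so the front desk hands over the first and refuses the second.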

Below is an example of a web server and a web client talking to each other via the HTTP protocol:

By the HTTP standard, web servers use port 80 to serve documents, files and information unless otherwise specified. Say your web server address is www.bauani.org and you put your document in the directory /articles . Then the URL of the document will be http://www.bauani.org/articles .
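
As a small sketch of how a URL breaks down into the pieces a browser needs, the Python standard library can split the example URL into host, port and path. The function name `locate` is made up for this illustration, and it assumes a plain `http://` URL, for which the port defaults to 80 when none is written.

```python
from urllib.parse import urlsplit


def locate(url):
    """Split an http URL into the host, port and path a browser needs."""
    parts = urlsplit(url)
    # urlsplit reports the port as None when the URL names no port;
    # for the http scheme the default is then 80.
    port = parts.port if parts.port is not None else 80
    path = parts.path or "/"
    return parts.hostname, port, path


print(locate("http://www.bauani.org/articles"))
# prints ('www.bauani.org', 80, '/articles')
```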

When someone wants a copy of the document located at the URL http://www.bauani.org/articles, they enter the URL in their browser. The browser first looks up the IP address of the web server, which is how it locates the server on the Internet, as described earlier in this article. Say the IP address of www.bauani.org is 69.10.136.107 . The browser opens a TCP connection to IP address 69.10.136.107 on port 80; at the same time, a temporary port is opened on the client side to receive data from the web server. (Every server-client connection needs a port on both the server and the client. The server port is fixed, but the client port differs from one connection to the next. We will discuss this in another article.) When the TCP connection between the browser and the web server is established, the browser (web client) sends the following basic request, usually along with additional headers such as 'Referer:' and 'User-Agent:', to port 80 of the server:



GET /articles HTTP/1.1
Host:www.bauani.org
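
The request above can also be sent by hand over a TCP socket. The following is a rough sketch under assumptions, not what a real browser does: the helper names `build_get_request` and `fetch_raw` are made up, and the extra `Connection: close` header is added only so the server closes the connection when it has finished sending. The operating system chooses the temporary client port automatically.

```python
import socket


def build_get_request(host, path):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        "GET %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Connection: close",  # assumption: ask the server to close when done
        "",  # the empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_raw(host, path, port=80):
    """Send the request over TCP and return the raw response bytes."""
    # create_connection resolves the host name and opens the TCP connection;
    # the client-side port is picked by the operating system.
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, `fetch_raw("www.bauani.org", "/articles")` would return the status line, the headers and the HTML body as one block of bytes, provided the server is reachable.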


As the requested document '/articles' is available on the server and this is a valid HTTP request, the server responds with the code '200 OK' and starts sending the document as HTML code, as shown below:

HTTP/1.1 200 OK
Date: Mon, 06 Oct 2008 21:38:55 GMT
Server: Apache/1.3.37 (Unix)
Last-Modified: Mon, 06 Oct 2008 19:36:09 GMT
ETag: "3c6002c-4c0a-48ea68a9"
Accept-Ranges: bytes
Content-Length: 19466
Content-Type: text/html

(The document content served by the server starts here. I have removed the HTML code as it is too long.)


Then the client browser parses the HTML code and displays the web page to the user.
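
Before a browser can display anything, it has to separate the status line and the headers from the document itself. Here is a minimal sketch of that step, assuming the whole response is already in memory as bytes. The function name `parse_response` is made up, and real clients handle many more cases (chunked bodies, repeated headers and so on).

```python
def parse_response(raw):
    """Split raw response bytes into (status code, reason, headers, body)."""
    # An empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line, for example: HTTP/1.1 200 OK
    parts = lines[0].split(" ", 2)
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return code, reason, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 19466\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html>"
)
code, reason, headers, body = parse_response(sample)
print(code, reason, headers["content-type"])
# prints 200 OK text/html
```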

If there were no file named 'articles', the server would respond with the code '404 Not Found'. Here is an example of the server's reply when a non-existent web page is requested. Below is a request for 'falsefile', which is not on the server. The request was:

GET /falsefile HTTP/1.1
Host:www.bauani.org

The server responded:

HTTP/1.1 404 Not Found
Date: Mon, 06 Oct 2008 21:51:13 GMT
Server: Apache/1.3.37 (Unix) mod_gzip/1.3.26.1a mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2635.SR1.2 mod_ssl/2.8.28 OpenSSL/0.9.7a PHP-CGI/0.1b
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1

(The document content served by the server starts here. I have removed the HTML code as it is too long.)



I hope this article helps you understand the basics of the HTTP protocol.

In future articles, I will try to cover the SMTP protocol, the POP3 protocol and so on. Till then, goodbye.

Apache Web Server Performance Tuning and Quick Security Tips

In web publishing, most Apache server administrators' thinking comes down to one question: "How can I improve the performance of the Apache web server without sacrificing the security of the web application?"

The September 2008 Web Server Survey shows that out of 181,277,835 active web sites (181 million), Apache is used as the web server on 91,418,412 (91.4 million) domains or subdomains.

As Apache is the most popular web server among website publishers, a web administrator should know how to improve site performance with little effort. Before tuning any web server, you must have a clear idea of how the HTTP protocol works.

If you maintain an Apache web server that has only a few visitors per hour, day or even month, the default configuration that comes with the Apache distribution is enough to handle the HTTP requests from visitors' browsers. But for a server with thousands of visitors every hour, maintaining the performance of the website becomes very difficult.

Even removing a single image from a page can noticeably improve the performance of a website, because each image costs the browser an extra HTTP request. Below I give some quick tips for tuning your server to handle thousands of page requests from your visitors' browsers.

Today I will discuss two features of the Apache web server related to caching.
The good news is that a standard install of Apache generates the 'ETag' and 'Last-Modified' response headers by default. Below are the HTTP request and response from a standard Apache installation on CentOS:

HTTP Client Request:

GET /favicon.ico HTTP/1.1
Accept: */*
User-Agent: Chrome/0.3.154.0
Host: www.bauani.org

HTTP Response from Server:

HTTP/1.1 200 OK
Last-Modified: Fri, 29 Oct 2004 14:51:01 GMT
ETag: "163c078-8be-418258d5"
Content-Type: image/x-icon

With the 'ETag' and 'Last-Modified' headers, the browser can tell in future requests whether the file has changed. Say I now visit another page of this site: the browser will again request favicon.ico, but since the browser cache already has a copy of the file, it sends the same request with a few extra headers, as below:

Request:
GET /favicon.ico HTTP/1.1
Accept: */*
If-Modified-Since: Fri, 29 Oct 2004 14:51:01 GMT
If-None-Match: "163c078-8be-418258d5"
User-Agent: Chrome/0.3.154.0
Host: www.bauani.org

Response from Server:
HTTP/1.1 304 Not Modified
ETag: "163c078-8be-418258d5"

With the two extra headers 'If-Modified-Since' and 'If-None-Match', the browser makes the download of /favicon.ico conditional. The server's response '304 Not Modified' means the browser does not need to download the file again, because the copy in the browser cache has not been modified since it was last downloaded. If you are the server administrator of www.bauani.org, you have just saved a little bandwidth, equivalent to the size of the favicon.ico file.
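
The exchange above can be sketched from both sides in Python. This is a simplified illustration under assumptions, not Apache's actual implementation: the function names `conditional_headers` and `status_for` are made up, headers are passed as plain dictionaries, and ETags are compared as exact strings (real servers also handle weak ETags marked with `W/`).

```python
from email.utils import parsedate_to_datetime


def conditional_headers(cached):
    """Client side: build the extra headers from a cached response."""
    extra = {}
    if "Last-Modified" in cached:
        extra["If-Modified-Since"] = cached["Last-Modified"]
    if "ETag" in cached:
        extra["If-None-Match"] = cached["ETag"]
    return extra


def status_for(request, etag, last_modified):
    """Server side: return 304 if the client's copy is current, else 200."""
    if "If-None-Match" in request:
        # When an ETag is supplied, it takes precedence over the date check.
        tags = [tag.strip() for tag in request["If-None-Match"].split(",")]
        return 304 if etag in tags or "*" in tags else 200
    if "If-Modified-Since" in request:
        since = parsedate_to_datetime(request["If-Modified-Since"])
        changed = parsedate_to_datetime(last_modified)
        return 304 if changed <= since else 200
    return 200  # unconditional request: always send the file


cached = {
    "Last-Modified": "Fri, 29 Oct 2004 14:51:01 GMT",
    "ETag": '"163c078-8be-418258d5"',
}
request = conditional_headers(cached)
print(status_for(request, '"163c078-8be-418258d5"', "Fri, 29 Oct 2004 14:51:01 GMT"))
# prints 304
```

If the file on the server changes, its ETag and modification date change with it, the comparison fails, and the server answers 200 with the new file.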

Source: http://www.sweeting.org/mark/blog/2007/02/06/website-performance-tip-for-apache
