How to access data on the World Wide Web? | Hypertext Transfer Protocol | HTTP

The World Wide Web, also known as WWW, acts as a repository that links information together from all over the world. It provides a distributed client-server service in which the web browser acts as the client and the sites hosting the documents play the role of the server. The WWW mainly accesses data via HTTP. The Hypertext Transfer Protocol works like a combination of FTP and SMTP. It transfers files and uses the services of TCP, but it is simpler than FTP because it uses a single TCP connection and has no separate control connection. The data transferred between client and server resembles SMTP messages, and MIME-like headers control the format of the messages. Both the HTTP server and the HTTP client are able to interpret HTTP messages. When the client sends a request to the HTTP server, the server sends the data/content back to the client as a response. HTTP uses TCP port 80 by default.

Data transfer on www

The Hypertext Transfer Protocol is stateless: the server keeps no memory of earlier requests, so each request is handled on its own. It uses the services of TCP for data transfer on the WWW. The client initiates a data transaction by sending a request to the server, and the server answers with a response.

[Figure: How HTTP accesses data on the World Wide Web]

Let us have a look at the format of the request and response messages that travel across the WWW during a data transaction:

The formats of the request and response messages are nearly the same in an HTTP data transaction. A request message has 3 parts: Request line, Header, and Body, while a response message is also made up of 3 parts: Status line, Header, and Body.
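
The three-part layout can be sketched in a few lines of Python. This is only an illustration of the textual HTTP/1.x format, not a full implementation; the helper names build_request and split_message, the path /cgi-bin/form and the host www.example.com are assumptions made up for the example.

```python
CRLF = "\r\n"  # HTTP/1.x ends every line with carriage return + line feed


def build_request(request_line, headers, body=""):
    """Join the three parts of a request: request line, header lines, body."""
    header_lines = [f"{name}: {value}" for name, value in headers]
    # The empty string produces the blank line that separates headers from body.
    return CRLF.join([request_line, *header_lines, "", body])


def split_message(raw):
    """Split a raw message into (start line, list of header lines, body)."""
    head, _, body = raw.partition(CRLF + CRLF)
    start_line, *header_lines = head.split(CRLF)
    return start_line, header_lines, body


request = build_request(
    "POST /cgi-bin/form HTTP/1.1",  # hypothetical path, for illustration only
    [("Host", "www.example.com"), ("Content-Length", "9")],
    "name=test",
)
print(split_message(request))
```

The same split_message helper works on a response, because only the first line differs between the two message types.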

[Figure: Format of the HTTP request and HTTP response]

Request Line & Status Line: The request line is the first line of the request message sent to the server. It consists of 3 parts: Request type, URL, and HTTP version. The status line is the first line of the response message that the server sends to the client in reply to the request. It also has 3 parts: HTTP version, Status code, and Status phrase.
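
As a rough sketch, both lines can be taken apart by splitting on spaces. The function and class names below are assumptions chosen for the example, and the code assumes well-formed input in the textual HTTP/1.x style.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    url: str
    version: str


class StatusLine(NamedTuple):
    version: str
    code: int
    phrase: str


def parse_request_line(line: str) -> RequestLine:
    # Request type, URL and HTTP version are separated by single spaces.
    method, url, version = line.split(" ")
    return RequestLine(method, url, version)


def parse_status_line(line: str) -> StatusLine:
    # The phrase may itself contain spaces ("Not Found"), so split at most twice.
    version, code, phrase = line.split(" ", 2)
    return StatusLine(version, int(code), phrase)


print(parse_request_line("GET /index.html HTTP/1.1"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```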

The request type is the method. Several methods can be given as the request type in a request message:

  • GET is used to request a document from the Server
  • POST is used to send some data/information from the Client to the Server
  • HEAD is used to request information about a document, not the actual document
  • PUT is used to send a document from the Client to the Server
  • CONNECT is used to establish a tunnel through a proxy (it was only reserved in early HTTP/1.1)
  • OPTIONS is used to ask about the available options
  • TRACE is used to echo the incoming request
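
The difference between GET and HEAD is easy to see on a throwaway local server. The sketch below uses Python's standard http.server and http.client modules; the handler, the fetch helper and the sample document are assumptions made up for this example, and the server only listens on the local machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

DOCUMENT = b"<html><body>Hello</body></html>"  # hypothetical document


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(DOCUMENT)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(DOCUMENT)  # GET returns the document itself

    def do_HEAD(self):
        self._send_headers()  # HEAD returns the same headers but no body

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(method):
    """Start a local server on a free port, send one request, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request(method, "/index.html")
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Length"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


print(fetch("GET"))   # status, Content-Length header and the document bytes
print(fetch("HEAD"))  # same status and header, but an empty body
```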

A URL is a standard way of specifying any kind of information on the Internet. It defines 4 things: the Protocol, which is the client/server program used to retrieve the document; the Host, which is the computer where the information is located; the Port, which is optional and is inserted between the host and the path, separated from the host by a colon; and the Path, which tells where the file is actually located.
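
The four parts can be pulled out of a URL with Python's standard urllib.parse module. The URL below is a made-up example.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.example.com:8080/docs/index.html")
print(parts.scheme)    # protocol: http
print(parts.hostname)  # host: www.example.com
print(parts.port)      # port: 8080
print(parts.path)      # path: /docs/index.html

# The port is optional. When it is left out, urlsplit reports None and the
# client falls back to the default port of the protocol (80 for HTTP).
default_port = urlsplit("http://www.example.com/docs/index.html").port
print(default_port)    # None
```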

The HTTP version identifies the version of the protocol in use. HTTP/1.1 and HTTP/2 are both widely deployed today, and HTTP/3 is also in use.

The Status Code consists of 3 digits. Codes in the 100 range are informational, codes in the 200 range indicate a successful request, codes in the 300 range redirect the client to another URL, codes in the 400 range indicate a client-side error, and codes in the 500 range indicate a server-side error.
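
Since the first digit decides the category, a client can classify any status code with integer division. A minimal sketch, where the function name status_class is an assumption for this example:

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the category of a 3-digit status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


print(status_class(204))  # Success
print(status_class(404))  # Client Error
```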

The Status Phrase describes the status code in textual form. Let us have a look at what the status codes mean:

 

Code  Phrase                  Description
Informational
100   Continue                The initial part of the request has been received; the client may continue with its request
101   Switching Protocols     The server is complying with a client request to switch to the protocols defined in the Upgrade header
Success
200   OK                      The request is successful
201   Created                 A new URL is created
202   Accepted                The request is accepted but not immediately acted upon
204   No Content              There is no content in the body
Redirection
301   Moved Permanently       The requested URL is no longer in use by the server
302   Moved Temporarily       The requested URL has moved temporarily (newer specifications call this code "Found")
304   Not Modified            The document has not been modified
Client Error
400   Bad Request             There is a syntax error in the request
401   Unauthorized            The request lacks proper authorization
403   Forbidden               Service is denied
404   Not Found               The document is not found
405   Method Not Allowed      The method is not supported for the requested URL
406   Not Acceptable          The requested format is not acceptable
Server Error
500   Internal Server Error   A server-side error, such as a crash, has occurred
501   Not Implemented         The server does not support the requested action
503   Service Unavailable     The service is temporarily unavailable but may be requested in the future
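
Many languages ship a lookup table for these codes. As a small sketch, Python's standard http.HTTPStatus enumeration maps a code to its phrase; the helper name phrase_for is an assumption for this example. Note that Python follows the newer specifications and reports 302 as "Found".

```python
from http import HTTPStatus


def phrase_for(code: int) -> str:
    """Look up the standard phrase for a status code."""
    return HTTPStatus(code).phrase


for code in (100, 200, 301, 404, 503):
    print(code, phrase_for(code))
```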

Header: The header is used to exchange additional information between the Client and the Server. There may be one or more header lines. Each header line consists of a header name, a colon, a space, and a header value. A header line belongs to one of 4 categories: General Header, Request Header, Response Header, and Entity Header. A request message can contain only general, request, and entity headers, while a response message can contain only general, response, and entity headers.
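
A header line can be split at its first colon to get the name and the value. The sketch below is a simplification: the function names are assumptions for this example, and repeated headers simply overwrite each other, which a real parser would handle more carefully.

```python
def parse_header_line(line):
    """Split 'Name: value' at the first colon and trim surrounding spaces."""
    name, sep, value = line.partition(":")
    if not sep or not name.strip():
        raise ValueError(f"not a header line: {line!r}")
    return name.strip(), value.strip()


def parse_headers(lines):
    """Build a lookup table of headers keyed by lower-cased name."""
    # Header names are case-insensitive, so normalise them for lookup.
    return {name.lower(): value for name, value in map(parse_header_line, lines)}


headers = parse_headers([
    "Host: www.example.com:8080",  # made-up host; the value keeps its own colon
    "Accept: text/html",
    "User-Agent: demo-client/1.0",  # hypothetical client name
])
print(headers["host"])
```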

The general header gives general information about the message. It can appear in both request and response messages. The general headers and their meanings are:

  • Cache-control specifies information about caching
  • Connection specifies whether connection should be closed or not
  • Date tells current date
  • MIME-version tells which MIME version is in use
  • Upgrade specifies preferred communication protocol

The request header specifies the client's configuration and preferred document format. It can appear only in request messages. The request headers are:

  • Accept shows the medium format the client can accept
  • Accept-charset shows the character set client can handle
  • Accept-encoding shows encoding scheme the client can handle
  • Accept-language shows the language client can accept
  • Authorization carries the credentials that show what permissions the client has
  • From shows e-mail address of user
  • Host shows host and port no. of server
  • If-modified-since sends the document only if it is newer than the specified date
  • If-match sends the document only if it matches the given tag
  • If-none-match sends the document only if it does not match the given tag
  • If-range sends only the portion of the document that is missing
  • If-unmodified-since sends the document only if it has not changed since the specified date
  • Referer specifies the URL of the document that linked to the requested one (the header name is spelled this way in the protocol)
  • User-agent identifies client program
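
The conditional headers let a server skip sending a document the client already has. Here is a simplified sketch of how a server might evaluate If-modified-since; the function name conditional_status is an assumption for this example, and real servers also take entity tags and other conditions into account.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(last_modified, if_modified_since=None):
    """Return 200 to send the document, or 304 if the client's copy is current."""
    if if_modified_since is None:
        return 200  # unconditional request: always send the document
    client_copy_date = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) > client_copy_date:
        return 200
    return 304


document_time = datetime(2024, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
header_value = format_datetime(document_time, usegmt=True)  # HTTP date format
print(header_value)
print(conditional_status(document_time, header_value))
```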

The response header specifies the server's configuration and special information about the request. It can appear only in response messages. Some response headers are described here:

  • Accept-ranges shows whether the server accepts the range requested by the client
  • Age shows the age of the document
  • Public lists the methods the server supports
  • Retry-after specifies the date after which the server is available
  • Server shows the name and version of the server

The entity header gives information about the body of the document. It is mainly present in response messages, but request messages that carry a body, such as those using the POST or PUT method, can also contain it. The entity headers are:

  • Allow lists all the valid methods that can be used with the URL
  • Content-encoding specifies the encoding scheme
  • Content-language specifies the language of the document
  • Content-length shows the length of the document
  • Content-range specifies the range of the document
  • Content-type specifies the medium type
  • Etag gives an entity tag
  • Expires gives the date and time when the content may change
  • Last-modified gives the date and time of the last change
  • Location specifies the location of the created or moved document
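
A few of these headers can be derived directly from the document being sent. The sketch below is one possible way to do it; the function name entity_headers and the choice of text/html are assumptions for this example. The point worth noticing is that the length is counted in bytes after encoding, not in characters.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def entity_headers(document, last_modified, charset="utf-8"):
    """Encode a text document and describe the resulting body with headers."""
    body = document.encode(charset)
    headers = {
        "Content-Type": f"text/html; charset={charset}",
        # The length counts bytes, so non-ASCII characters may count more than once.
        "Content-Length": str(len(body)),
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    return headers, body


headers, body = entity_headers("café", datetime(2024, 6, 1, 12, tzinfo=timezone.utc))
print(headers)
```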

Body: The body is an optional part of both request and response messages. It contains the document to be sent or received.

[Figure: Example of HTTP data transfer on the WWW]

Example of HTTP data transaction on WWW

Suppose a Client wants to send data to the Server by using the POST method. The request line shows the method (POST), the URL, and the HTTP version (2.0). There are 4 header lines here. The body of the request message contains the input information. The response message contains the status line and 4 header lines. The server creates a CGI document and includes it as the body.
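
A comparable exchange can be tried out locally with Python's standard library. This is a sketch under stated assumptions, not a reproduction of the example above: the handler stands in for a CGI program, the path /cgi-bin/form and the form field "name" are made up, the server only runs on the local machine, and Python's http.server speaks HTTP/1.0 by default.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode


class FormHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Content-Length tells the server how many body bytes to read.
        length = int(self.headers.get("Content-Length", 0))
        form = parse_qs(self.rfile.read(length).decode("utf-8"))
        name = form.get("name", ["unknown"])[0]
        reply = f"<html><body>Hello, {name}</body></html>".encode("utf-8")
        self.send_response(200)  # also adds the Server and Date headers
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def post_form(fields):
    """Start a local server, POST the form fields to it and return the reply."""
    server = HTTPServer(("127.0.0.1", 0), FormHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request(
            "POST",
            "/cgi-bin/form",  # hypothetical path
            body=urlencode(fields),
            headers={
                "Content-Type": "application/x-www-form-urlencoded",
                "Accept": "text/html",
            },
        )
        response = conn.getresponse()
        result = (
            response.status,
            response.reason,
            dict(response.getheaders()),
            response.read().decode("utf-8"),
        )
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, reason, headers, text = post_form({"name": "Alice"})
print(status, reason)
print(headers)
print(text)
```

The client library fills in the Host and Content-Length request headers on its own, which is why they do not appear in the headers dictionary passed to the request.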
