Chapter 29
29.1 Lecture Goal
29.2 Uniform Resource Locator (URL)
29.3 HTML
29.4 Web Browser
29.5 HTTP
29.6 MIME
29.7 RFC
29.8 Encoding and Decoding
29.9 Encoding Example Escape Sequence
29.10 Virtual Directory
29.11 Web Browser Fetches a Page
29.12 HTTP Client Request
29.13 File Extension and MIME
29.14 MIME Encoding
29.15 HTTP Status Codes
29.16 HTTP Redirection
29.17 HTTP Request per 1 TCP/IP Connection
29.18 Server Architecture
Summary
Exercises
Network Programming Part III
29.1 Lecture Goal
The goal of this lecture is to develop a small Web Server.
This Web Server will serve HTTP requests sent by a Web Browser for the
following URLs:
http://www.ku.com/default.html
http://www.ku.com/index.asp
http://www.ku.com/win32.html
http://www.ku.com/courses/win32.html
29.2 Uniform Resource Locator (URL)
Anatomy of a URL (Uniform Resource Locator):
http://www.ku.com/courses/win32.html
http://               protocol
www.ku.com            Web Server
courses/win32.html    location of the file on the server
A URL may also name the port explicitly: http://www.ku.com:80/.../....
:80 specifies the port number to use for the connection.
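The anatomy above can be sketched in C as follows. This is only an illustration, not part of the lecture's server: the names url_parts and parse_http_url are hypothetical, only the http:// protocol is handled, and port 80 is assumed when the URL names no port.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the pieces of a URL (not from the lecture). */
struct url_parts {
    char host[256];   /* e.g. "www.ku.com" */
    int  port;        /* 80 when the URL has no ":port" part */
    char path[1024];  /* e.g. "/courses/win32.html", always starts with '/' */
};

/* Splits an "http://" URL into host, port and path.
 * Returns 0 on success, -1 if the URL is not understood. */
int parse_http_url(const char *url, struct url_parts *out)
{
    const char *prefix = "http://";
    size_t prefix_len = strlen(prefix);
    if (strncmp(url, prefix, prefix_len) != 0)
        return -1;                          /* only http:// is handled here */

    const char *host = url + prefix_len;
    size_t host_len = strcspn(host, ":/");  /* host ends at ':', '/' or end */
    if (host_len == 0 || host_len >= sizeof out->host)
        return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    const char *rest = host + host_len;
    out->port = 80;                         /* default HTTP port */
    if (*rest == ':') {
        char *end;
        long port = strtol(rest + 1, &end, 10);
        if (end == rest + 1 || port < 1 || port > 65535)
            return -1;
        out->port = (int)port;
        rest = end;
    }

    if (*rest == '\0') {                    /* no path given: use the root */
        strcpy(out->path, "/");
        return 0;
    }
    if (*rest != '/' || strlen(rest) >= sizeof out->path)
        return -1;
    strcpy(out->path, rest);
    return 0;
}
```

For http://www.ku.com/courses/win32.html this yields the host www.ku.com, port 80 and the path /courses/win32.html.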
29.3 HTML
HTML stands for Hyper Text Mark-up Language.
This language contains text-formatting information, e.g. font faces, font
colors, font sizes, alignment etc., and also contains hyperlinks: text that
can be clicked to go to another HTML document on the Internet. HTML tags are
embedded within normal text to make it hypertext.
29.4 Web Browser
A Web Browser is an HTTP client. Examples are:
Microsoft Internet Explorer
Netscape Navigator
These web browsers connect to your HTTP web server, request a document, and
display it in their window.
29.5 HTTP
HTTP is a stateless protocol.
• No information or “state” is maintained about previous HTTP requests
• Easier to implement than state-aware protocols
29.6 MIME
MIME stands for Multi-purpose Internet Mail Extensions.
MIME contains encoding features, added to enable the transfer of binary data,
e.g. images (GIF, JPEG etc.), via mail. Using MIME encoding, HTTP can now
transfer complex binary data, e.g. images and video.
29.7 RFC
RFC is short for Request for Comments, a series of notes about the Internet,
started in 1969 (when the Internet was the ARPANET). An Internet document can
be submitted to the IETF by anyone, but the IETF decides if the document
becomes an RFC. Eventually, if it gains enough interest, it may evolve into
an Internet standard.
HTTP version 1.1 is specified in Internet RFC 2616 (Fielding, et al.). Each
RFC is designated by an RFC number. Once published, an RFC never changes.
Modifications to an original RFC are assigned a new RFC number.
29.8 Encoding and Decoding
HTTP is a text transport protocol.
Transferring binary data over HTTP needs data encoding and decoding because
binary characters are not permitted. Similarly, some characters are not
permitted in a URL, e.g. SPACE. Here, URL encoding is used: a space, for
example, is written as %20.
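A minimal sketch of URL encoding and decoding in C follows. The function names url_encode and url_decode are hypothetical, and the set of characters passed through unchanged (letters, digits, and - _ . ~ /) is an assumption chosen for path-like strings. A real server may need a different set.

```c
#include <ctype.h>
#include <stddef.h>

/* Percent-encodes src into dst. Letters, digits and - _ . ~ / are copied
 * as they are; every other byte becomes %XX (two hex digits).
 * Returns 0 on success, -1 if dst is too small. */
int url_encode(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~' || c == '/') {
            if (n + 1 >= dst_size) return -1;   /* keep room for the '\0' */
            dst[n++] = (char)c;
        } else {
            if (n + 3 >= dst_size) return -1;
            dst[n++] = '%';
            dst[n++] = hex[c >> 4];
            dst[n++] = hex[c & 0x0F];
        }
    }
    if (n >= dst_size) return -1;
    dst[n] = '\0';
    return 0;
}

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Reverses url_encode: turns each %XX back into a single byte.
 * Returns 0 on success, -1 on a malformed escape or a too-small dst. */
int url_decode(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;
    while (*src != '\0') {
        if (n + 1 >= dst_size) return -1;
        if (*src == '%') {
            int hi = hex_value(src[1]);
            int lo = (hi >= 0) ? hex_value(src[2]) : -1;
            if (lo < 0) return -1;              /* "%", "%4" or "%zz" */
            dst[n++] = (char)(hi * 16 + lo);
            src += 3;
        } else {
            dst[n++] = *src++;
        }
    }
    if (n >= dst_size) return -1;
    dst[n] = '\0';
    return 0;
}
```

With this sketch, the path "/my docs/win 32.html" is encoded as "/my%20docs/win%2032.html", and decoding gives the original path back.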
29.9 Encoding Example Escape Sequence
Including a Carriage Return / Line feed (new line) in a string:
printf("Line One\nThis is a new line");
Including a character in a string that is not found on our normal keyboards:
printf("The funny character \xB2");
29.10 Virtual Directory
The Home Directory is the directory on disk that the root of a Web Server
represents.
IIS (Internet Information Server) has c:\inetpub\wwwroot\ as its default Home
Directory.
In http://www.ku.com/courses/win32.html, /courses/ corresponds either to a
Physical Directory, c:\inetpub\wwwroot\courses, or to a Virtual Directory.
In a Web Server, we may specify that /courses/ will represent some other
physical directory on the Web Server, like D:\MyWeb\. Then /courses/ will be
a Virtual Directory.
In Windows 2000 and IIS 5.0 (Internet Information Server), a folder's
“Web Sharing…” option is used to create a Virtual Directory for any folder.
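The mapping from a URL path to a file on disk might be sketched as below. This is an assumption about how a small server could do it, not how IIS does it: the table vdirs, the variable home_dir and the function map_url_path are hypothetical names, and the directories are the ones used as examples in the text.

```c
#include <stdio.h>
#include <string.h>

/* One virtual directory: a URL prefix and the physical directory it names. */
struct vdir {
    const char *url_prefix;
    const char *physical_dir;
};

/* Hypothetical configuration, taken from the examples in the text. */
static const char *home_dir = "c:\\inetpub\\wwwroot\\";
static const struct vdir vdirs[] = {
    { "/courses/", "D:\\MyWeb\\" },
};

/* Translates a URL path such as "/courses/win32.html" into a physical path.
 * Returns 0 on success, -1 if the path is rejected or out is too small. */
int map_url_path(const char *url_path, char *out, size_t out_size)
{
    /* Reject anything containing ".." so a request cannot climb out of the
     * served directories. This is deliberately strict. */
    if (url_path[0] != '/' || strstr(url_path, "..") != NULL)
        return -1;

    const char *base = home_dir;
    const char *rest = url_path + 1;        /* skip the leading '/' */
    for (size_t i = 0; i < sizeof vdirs / sizeof vdirs[0]; i++) {
        size_t len = strlen(vdirs[i].url_prefix);
        if (strncmp(url_path, vdirs[i].url_prefix, len) == 0) {
            base = vdirs[i].physical_dir;   /* virtual directory matched */
            rest = url_path + len;
            break;
        }
    }

    int n = snprintf(out, out_size, "%s%s", base, rest);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    for (char *p = out; *p != '\0'; p++)    /* URL '/' becomes Windows '\' */
        if (*p == '/')
            *p = '\\';
    return 0;
}
```

With this table, /courses/win32.html maps to D:\MyWeb\win32.html, while /default.html maps to c:\inetpub\wwwroot\default.html.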
29.11 Web Browser Fetches a Page
To fetch http://www.ku.com/courses/win32.html, the browser performs these
steps:
• Hostname/DNS lookup for www.ku.com to get the IP address
• HTTP protocol uses port 80
• Connect to port 80 of the IP address discovered above
• Request /courses/win32.html from the server
29.12 HTTP Client Request
GET /courses/win32.html HTTP/1.0 CRLF CRLF
The request line is followed by 2 Carriage-Return / Line-feed (CRLF)
sequences.
GET                    Method
/courses/win32.html    Resource Identifier
HTTP/1.0               HTTP Version
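On the server side, the request line has to be split back into its three parts. The sketch below shows one way to do that. The struct http_request and the function parse_request_line are hypothetical names, and the fixed buffer sizes are arbitrary assumptions.

```c
#include <stdio.h>
#include <string.h>

/* The three parts of an HTTP request line (hypothetical type). */
struct http_request {
    char method[16];      /* e.g. "GET" */
    char resource[1024];  /* e.g. "/courses/win32.html" */
    char version[16];     /* e.g. "HTTP/1.0" */
};

/* Parses the first line of a request such as
 * "GET /courses/win32.html HTTP/1.0\r\n\r\n".
 * Returns 0 on success, -1 if the line is malformed. */
int parse_request_line(const char *request, struct http_request *out)
{
    /* The request line ends at the first CR/LF sequence. */
    const char *eol = strstr(request, "\r\n");
    if (eol == NULL)
        return -1;

    char line[1100];
    size_t len = (size_t)(eol - request);
    if (len >= sizeof line)
        return -1;
    memcpy(line, request, len);
    line[len] = '\0';

    /* The width limits keep sscanf from overflowing the fixed-size fields. */
    if (sscanf(line, "%15s %1023s %15s",
               out->method, out->resource, out->version) != 3)
        return -1;
    if (strncmp(out->version, "HTTP/", 5) != 0)
        return -1;
    return 0;
}
```

A server would typically check that the method is one it supports before opening the requested file.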
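The server's response begins with a status line, followed by headers:
HTTP/1.1 200 OK            } Status Line: HTTP version, status code, description
Content-type: text/html
Content-Length: 2061
CRLF
Headers are delimited by CR/LF sequences, and an empty line (a CR/LF on its
own) ends the headers. The actual data follows the headers.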
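A response header of this shape can be assembled with snprintf, as in the following sketch. The function name build_response_header and its parameters are hypothetical. The caller is assumed to send the returned bytes first and the file contents afterwards.

```c
#include <stdio.h>

/* Writes a status line and two headers, ended by an empty line, into buf.
 * Returns the number of bytes written, or -1 if buf is too small. */
int build_response_header(char *buf, size_t buf_size,
                          int status, const char *reason,
                          const char *mime_type,
                          unsigned long content_length)
{
    int n = snprintf(buf, buf_size,
                     "HTTP/1.1 %d %s\r\n"       /* status line */
                     "Content-Type: %s\r\n"     /* MIME type of the data */
                     "Content-Length: %lu\r\n"  /* size of the data in bytes */
                     "\r\n",                    /* empty line ends the headers */
                     status, reason, mime_type, content_length);
    if (n < 0 || (size_t)n >= buf_size)
        return -1;
    return n;
}
```

Calling it with 200, "OK", "text/html" and 2061 produces the header lines of the example response.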
29.13 File Extension and MIME
File extensions are non-standard across different platforms and cannot be
used to determine the type of contents of any file.
Some common MIME types:
image/gif     GIF image
image/jpeg    JPEG image
text/html     HTML document
text/plain    plain text
In an HTTP response, a Web Server tells the browser the MIME type of the data
being sent. The MIME type is used by the browser to handle the data
appropriately, i.e. show an image, display HTML etc.
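Although extensions are not a reliable indicator, a small server still has to pick a MIME type for each file it sends. A common approximation, sketched below, is a lookup table keyed by extension. The function name mime_type_for, the table entries and the application/octet-stream fallback for unknown files are assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Compares two strings, ignoring upper/lower case. Returns 1 if equal. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Guesses a MIME type from the extension of filename. */
const char *mime_type_for(const char *filename)
{
    static const struct { const char *ext; const char *mime; } table[] = {
        { ".gif",  "image/gif"  },
        { ".jpg",  "image/jpeg" },
        { ".jpeg", "image/jpeg" },
        { ".html", "text/html"  },
        { ".htm",  "text/html"  },
        { ".txt",  "text/plain" },
    };
    const char *dot = strrchr(filename, '.');   /* last '.' in the name */
    if (dot != NULL) {
        for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
            if (equals_ignore_case(dot, table[i].ext))
                return table[i].mime;
    }
    return "application/octet-stream";          /* unknown: raw bytes */
}
```

The returned string can be placed directly in the Content-Type header of the response.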
29.14 MIME Encoding
MIME (Multipurpose Internet Mail Extensions) is a specification for
formatting non-ASCII messages so that they can be sent over the Internet.
It enables us to send and receive graphics, audio, and video files via the
Internet mail system.
There are many predefined MIME types, such as GIF graphics files and
PostScript files. It is also possible to define your own MIME types.
In addition to e-mail applications, Web browsers also support various MIME
types. This enables the browser to display or output files that are not in
HTML format.
MIME was defined in 1992 by the Internet Engineering Task Force (IETF). A new
version, called S/MIME, supports encrypted messages.
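Base64 is one of the transfer encodings that MIME defines for carrying binary data as plain text. The sketch below encodes a byte buffer in Base64. The function name base64_encode is hypothetical, and the sketch leaves out the line wrapping that mail messages apply to long Base64 output.

```c
#include <stddef.h>

/* Encodes len bytes from data as Base64 text in out (with a final '\0').
 * Every 3 input bytes become 4 output characters; '=' pads the last group.
 * Returns 0 on success, -1 if out is too small. */
int base64_encode(const unsigned char *data, size_t len,
                  char *out, size_t out_size)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t needed = 4 * ((len + 2) / 3) + 1;
    if (out_size < needed)
        return -1;

    size_t i = 0, o = 0;
    while (i + 2 < len) {                       /* full 3-byte groups */
        unsigned long v = ((unsigned long)data[i] << 16) |
                          ((unsigned long)data[i + 1] << 8) |
                          (unsigned long)data[i + 2];
        out[o++] = alphabet[(v >> 18) & 63];
        out[o++] = alphabet[(v >> 12) & 63];
        out[o++] = alphabet[(v >> 6) & 63];
        out[o++] = alphabet[v & 63];
        i += 3;
    }
    if (len - i == 1) {                         /* 1 byte left: 2 chars + "==" */
        unsigned long v = (unsigned long)data[i] << 16;
        out[o++] = alphabet[(v >> 18) & 63];
        out[o++] = alphabet[(v >> 12) & 63];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {                  /* 2 bytes left: 3 chars + "=" */
        unsigned long v = ((unsigned long)data[i] << 16) |
                          ((unsigned long)data[i + 1] << 8);
        out[o++] = alphabet[(v >> 18) & 63];
        out[o++] = alphabet[(v >> 12) & 63];
        out[o++] = alphabet[(v >> 6) & 63];
        out[o++] = '=';
    }
    out[o] = '\0';
    return 0;
}
```

The output uses only letters, digits, '+', '/' and '=', so it can pass through channels that accept only text.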
29.15 HTTP Status codes
404 Not Found
- requested document not found on this server
200 OK
- request succeeded, requested object follows later in this message
400 Bad Request
- request message not understood by server
302 Object Moved
- requested document has been moved to some other location
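In a server, the status codes listed above might be kept in one small function that returns the description for a code. The sketch below uses the descriptions from this lecture. The function name status_reason and the "Unknown Status" fallback are assumptions. Note that RFC 2616 itself gives the description of 302 as "Found".

```c
/* Returns the description that goes with an HTTP status code, using the
 * wording from this lecture. Codes not covered here give "Unknown Status". */
const char *status_reason(int code)
{
    switch (code) {
    case 200: return "OK";
    case 302: return "Object Moved";
    case 400: return "Bad Request";
    case 404: return "Not Found";
    default:  return "Unknown Status";
    }
}
```

The result is meant to be placed after the status code in the status line of the response.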
29.16 HTTP Redirection
HTTP/1.1 302 Object Moved
Location: http://www.ku.com
CRLF
Most browsers will send another HTTP request to the new location, i.e.
http://www.ku.com.
This is called Browser Redirection.
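A server could produce such a redirection response as sketched below. The function name build_redirect is hypothetical, and the response carries no body, which is a simplification.

```c
#include <stdio.h>

/* Writes a 302 response that points the browser at location.
 * Returns the number of bytes written, or -1 if buf is too small. */
int build_redirect(char *buf, size_t buf_size, const char *location)
{
    int n = snprintf(buf, buf_size,
                     "HTTP/1.1 302 Object Moved\r\n"
                     "Location: %s\r\n"   /* where the browser should go next */
                     "\r\n",              /* empty line ends the headers */
                     location);
    if (n < 0 || (size_t)n >= buf_size)
        return -1;
    return n;
}
```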
29.17 HTTP Request per 1 TCP/IP Connection
The HTML text is received from the Web Server in response to one HTTP
request. The browser reads the whole HTML web page and paints its client area
according to the HTML tags specified. The browser generates one fresh HTTP
request for each image specified in the HTML file. With HTTP/1.0, each of
these requests uses its own TCP/IP connection.
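As a rough illustration of this counting rule, the sketch below estimates how many HTTP requests a page needs: one for the HTML itself plus one per image tag. The function name count_requests_for_page is hypothetical. The scan is deliberately naive, and it ignores HTML comments, repeated images and browser caching.

```c
#include <ctype.h>

/* Estimates the number of HTTP requests needed to show a page:
 * 1 for the HTML text plus 1 for each <img ...> tag found in it. */
int count_requests_for_page(const char *html)
{
    int requests = 1;                        /* the HTML document itself */
    for (const char *p = html; *p != '\0'; p++) {
        /* The && chain stops at the first mismatch, so the scan never
         * reads past the terminating '\0'. */
        if (p[0] == '<' &&
            tolower((unsigned char)p[1]) == 'i' &&
            tolower((unsigned char)p[2]) == 'm' &&
            tolower((unsigned char)p[3]) == 'g' &&
            (isspace((unsigned char)p[4]) || p[4] == '>' || p[4] == '/')) {
            requests++;                      /* one more request per image */
        }
    }
    return requests;
}
```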
29.18 Server Architecture
Our server architecture will be based upon the following points:
• Ability to serve up to 5 clients simultaneously
• Multi-threaded HTTP Web Server
• 1 thread dedicated to accepting client connections
• 1 thread per client to serve HTTP requests
• 1 thread dedicated to performing termination housekeeping of
  communication threads
• Use of Synchronization Objects
Many WinSock function calls, e.g. accept(), are blocking calls. The server
needs to serve up to 5 clients simultaneously while using accept() and other
blocking WinSock calls, which is why each blocking task gets its own thread.
The server also needs to perform termination tasks for communication
threads, which terminate asynchronously.
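One piece of this design, the limit of 5 simultaneous clients guarded by a synchronization object, might look like the sketch below. It uses a POSIX threads mutex as a stand-in for a Win32 synchronization object, so it is not the lecture's WinSock code. The names MAX_CLIENTS, acquire_client_slot and release_client_slot are hypothetical.

```c
#include <pthread.h>

#define MAX_CLIENTS 5   /* the server serves up to 5 clients at once */

/* The mutex protects slot_in_use, which is shared by the accepting thread
 * and by the thread that cleans up after finished communication threads. */
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;
static int slot_in_use[MAX_CLIENTS];

/* Called by the accepting thread before it starts a client thread.
 * Returns a free slot number (0..4), or -1 if 5 clients are already served. */
int acquire_client_slot(void)
{
    int slot = -1;
    pthread_mutex_lock(&slots_lock);
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (!slot_in_use[i]) {
            slot_in_use[i] = 1;
            slot = i;
            break;
        }
    }
    pthread_mutex_unlock(&slots_lock);
    return slot;
}

/* Called by the housekeeping thread once a client thread has terminated. */
void release_client_slot(int slot)
{
    if (slot < 0 || slot >= MAX_CLIENTS)
        return;
    pthread_mutex_lock(&slots_lock);
    slot_in_use[slot] = 0;
    pthread_mutex_unlock(&slots_lock);
}
```

When acquire_client_slot returns -1, the accepting thread can refuse the new connection or wait until the housekeeping thread frees a slot.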
Summary
In this lecture, we studied some terms and their roles. We studied HTTP
(Hyper Text Transfer Protocol), which is used to transfer text data across
the network. We also studied HTML (Hyper Text Mark-up Language), which is
simply a text script. HTML is loaded in a web browser, and the web browser
interprets the text and carries out the instructions written in it. For
transferring media such as image data and movie data, we had an overview of
MIME.
Note: For examples and more information, connect to the Virtual University
resource online.
Exercises
1. Create a chat application. Using that application, you should be able to
chat with your friend on the network.