29.1 LECTURE GOAL
29.2 UNIFORM RESOURCE LOCATOR (URL)
29.3 HTML
29.4 WEB BROWSER
29.5 HTTP
29.6 MIME
29.7 RFC
29.8 ENCODING AND DECODING
29.9 ENCODING EXAMPLE: ESCAPE SEQUENCE
29.10 VIRTUAL DIRECTORY
29.11 WEB BROWSER FETCHES A PAGE
29.12 HTTP CLIENT REQUEST
29.13 FILE EXTENSION AND MIME
29.14 MIME ENCODING
29.15 HTTP STATUS CODES
29.16 HTTP REDIRECTION
29.17 HTTP REQUEST PER 1 TCP/IP CONNECTION
29.18 SERVER ARCHITECTURE
SUMMARY
EXERCISES
Network Programming Part III 2
29.1 Lecture Goal
The goal of this lecture is to develop a small Web Server.
This Web Server will serve HTTP requests sent by a Web Browser for the following URLs:
http://www.ku.com/default.html
http://www.ku.com/index.asp
http://www.ku.com/win32.html
http://www.ku.com/courses/win32.html
29.2 Uniform Resource Locator (URL)
Anatomy of a URL (Uniform Resource Locator):
http://www.ku.com/courses/win32.html
http://              protocol
www.ku.com           Web Server
courses/win32.html   location of the file on the server
Or http://www.ku.com:80/.../....
:80 specifies the port number to use for the connection
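The anatomy above can be sketched in C. This is a minimal illustration, not the lecture's own code: the struct url_parts type and the parse_url() function are hypothetical names, and the fixed buffer sizes are assumptions chosen for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the URL parts named in the text. */
struct url_parts {
    char protocol[16];   /* e.g. "http" */
    char host[128];      /* e.g. "www.ku.com" */
    int  port;           /* 80 when the URL does not name a port */
    char path[256];      /* e.g. "/courses/win32.html" */
};

/* Splits "protocol://host[:port][/path]" into its parts.
   Returns 0 on success, -1 if the URL is malformed or a part is too long. */
int parse_url(const char *url, struct url_parts *out)
{
    assert(url != NULL && out != NULL);

    const char *sep = strstr(url, "://");
    if (sep == NULL || sep == url) return -1;
    size_t proto_len = (size_t)(sep - url);
    if (proto_len >= sizeof out->protocol) return -1;
    memcpy(out->protocol, url, proto_len);
    out->protocol[proto_len] = '\0';

    const char *host = sep + 3;
    size_t host_len = strcspn(host, ":/");    /* host ends at ':' or '/' */
    if (host_len == 0 || host_len >= sizeof out->host) return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    const char *rest = host + host_len;
    out->port = 80;                           /* default HTTP port */
    if (*rest == ':') {
        char *end;
        long port = strtol(rest + 1, &end, 10);
        if (end == rest + 1 || port < 1 || port > 65535) return -1;
        out->port = (int)port;
        rest = end;
    }

    if (*rest == '\0') {
        strcpy(out->path, "/");               /* no path given: use the root */
    } else if (*rest == '/') {
        if (strlen(rest) >= sizeof out->path) return -1;
        strcpy(out->path, rest);
    } else {
        return -1;                            /* junk after the port number */
    }
    return 0;
}
```

For http://www.ku.com/courses/win32.html this yields the protocol http, the host www.ku.com, the default port 80 and the path /courses/win32.html.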
29.3 HTML
HTML stands for HyperText Markup Language.
This language contains text-formatting information, e.g. font faces, font colors, font sizes and alignment. It also contains hyperlinks: text that can be clicked to go to another HTML document on the Internet. HTML tags are embedded within normal text to make it hypertext.
29.4 Web Browser
A Web Browser is an HTTP client. Examples are:
Microsoft Internet Explorer
Netscape Navigator
These web browsers connect to an HTTP web server, request a document, and display it in their window.
29.5 HTTP
HTTP is a stateless protocol.
• No information or “state” is maintained about previous HTTP requests
• Easier to implement than state-aware protocols
29.6 MIME
MIME stands for Multipurpose Internet Mail Extensions.
MIME encoding features were added to enable the transfer of binary data, e.g. images (GIF, JPEG etc.), via mail. Using MIME encoding, HTTP can also transfer complex binary data, e.g. images and video.
29.7 RFC
RFC is short for Request for Comments, a series of notes about the Internet, started in 1969 (when the Internet was the ARPANET). An Internet document can be submitted to the IETF by anyone, but the IETF decides if the document becomes an RFC. Eventually, if it gains enough interest, it may evolve into an Internet standard.
HTTP version 1.1 is specified in Internet RFC 2616 (Fielding, et al.). Each RFC is designated by an RFC number. Once published, an RFC never changes. Modifications to an original RFC are assigned a new RFC number.
29.8 Encoding and Decoding
HTTP is a text transport protocol.
Transferring binary data over HTTP needs data encoding and decoding because binary characters are not permitted. Similarly, some characters are not permitted in a URL, e.g. SPACE. Here, URL encoding is used: a space, for example, is written as %20.
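URL encoding and decoding can be sketched as follows. This is only an illustration under stated assumptions: url_encode() and url_decode() are hypothetical helper names, and the set of characters copied unchanged (letters, digits, '-', '_', '.', '~' and '/') is a simplifying choice for path strings, not a complete treatment of every URL component.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Percent-encodes src into dst. Letters, digits, '-', '_', '.', '~' and '/'
   are copied unchanged; every other byte becomes %XX.
   Returns 0 on success, -1 if dst is too small. */
int url_encode(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    assert(src != NULL && dst != NULL);

    size_t n = 0;
    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        if (isalnum(c) || strchr("-_.~/", c) != NULL) {
            if (n + 1 >= dst_size) return -1;   /* keep room for '\0' */
            dst[n++] = (char)c;
        } else {
            if (n + 3 >= dst_size) return -1;
            dst[n++] = '%';
            dst[n++] = hex[c >> 4];
            dst[n++] = hex[c & 0x0F];
        }
    }
    if (n >= dst_size) return -1;
    dst[n] = '\0';
    return 0;
}

/* Returns the value of one hexadecimal digit, or -1 if c is not one. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = toupper(c);
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes %XX sequences in place.
   Returns 0 on success, -1 on a malformed escape. */
int url_decode(char *s)
{
    assert(s != NULL);

    char *out = s;
    while (*s != '\0') {
        if (*s == '%') {
            int hi = hex_value((unsigned char)s[1]);
            int lo = (hi < 0) ? -1 : hex_value((unsigned char)s[2]);
            if (hi < 0 || lo < 0) return -1;
            *out++ = (char)(hi * 16 + lo);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
    return 0;
}
```

A web server would decode the resource identifier of each request before mapping it to a file name.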
29.9 Encoding Example: Escape Sequence
Including a new line (Carriage Return / Line Feed) in a string:
printf("Line One\nThis is new line");
Including a character in a string that is not found on our normal keyboards:
printf("The funny character \xB2");
29.10 Virtual Directory
The root URL path / represents the Home Directory of a Web Server.
IIS (Internet Information Server) has c:\inetpub\wwwroot\ as its default Home Directory.
Here, /courses/ corresponds either to a physical directory, c:\inetpub\wwwroot\courses, or to a Virtual Directory.
In a Web Server, we may specify that /courses/ will represent some other physical directory on the Web Server, like D:\MyWeb\. Then /courses/ will be a Virtual Directory.
In Windows 2000 and IIS 5.0 (Internet Information Server), a folder's “Web Sharing…” option is used to create a Virtual Directory for any folder.
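The mapping from a URL path to a physical file can be sketched as a table lookup. This is an assumption about how a small server might do it, not a description of how IIS works internally: struct vdir, the vdirs table and map_url_path() are hypothetical, and the two directories are the ones used as examples above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table entry: a URL prefix and the folder it stands for. */
struct vdir {
    const char *url_prefix;
    const char *physical_dir;
};

/* Example configuration taken from the text above. */
static const char home_dir[] = "c:\\inetpub\\wwwroot\\";
static const struct vdir vdirs[] = {
    { "/courses/", "D:\\MyWeb\\" },
};

/* Translates a URL path such as "/courses/win32.html" into a file name.
   Returns 0 on success, -1 on a bad path or a buffer that is too small. */
int map_url_path(const char *url_path, char *out, size_t out_size)
{
    assert(url_path != NULL && out != NULL);

    if (url_path[0] != '/') return -1;
    if (strstr(url_path, "..") != NULL) return -1;  /* do not leave the web folders */

    const char *base = home_dir;       /* default: below the Home Directory */
    const char *rest = url_path + 1;   /* skip the leading '/' */
    for (size_t i = 0; i < sizeof vdirs / sizeof vdirs[0]; ++i) {
        size_t len = strlen(vdirs[i].url_prefix);
        if (strncmp(url_path, vdirs[i].url_prefix, len) == 0) {
            base = vdirs[i].physical_dir;   /* a Virtual Directory matched */
            rest = url_path + len;
            break;
        }
    }

    int n = snprintf(out, out_size, "%s%s", base, rest);
    if (n < 0 || (size_t)n >= out_size) return -1;

    for (char *p = out; *p != '\0'; ++p)    /* URL separators to Windows ones */
        if (*p == '/') *p = '\\';
    return 0;
}
```

With this table, /courses/win32.html maps to D:\MyWeb\win32.html, while /default.html maps to c:\inetpub\wwwroot\default.html.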
29.11 Web Browser Fetches a Page
To fetch http://www.ku.com/courses/win32.html, the browser performs the following steps:
• Hostname/DNS lookup for www.ku.com to get the IP address
• HTTP protocol uses port 80
• Connect to port 80 of the IP address discovered above
• Request /courses/win32.html from the server
29.12 HTTP Client Request
GET /courses/win32.html HTTP/1.0
The request line consists of:
Method                GET
Resource Identifier   /courses/win32.html
HTTP Version          HTTP/1.0
The request line is followed by 2 Carriage-Return / Line-Feed (CRLF) sequences.
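The request above can be built by a client and taken apart by a server roughly as follows. This is a minimal sketch: build_get_request() and parse_request_line() are hypothetical helper names, and the buffer sizes in the parser are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Client side: writes "GET <resource> HTTP/1.0" followed by two CRLF
   sequences into out. Returns the request length, or -1 if out is too small. */
int build_get_request(const char *resource, char *out, size_t out_size)
{
    assert(resource != NULL && out != NULL);

    int n = snprintf(out, out_size, "GET %s HTTP/1.0\r\n\r\n", resource);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}

/* Server side: splits the request line into method, resource identifier
   and HTTP version. Returns 0 on success, -1 if a part is missing. */
int parse_request_line(const char *line,
                       char method[16], char resource[256], char version[16])
{
    assert(line != NULL && method != NULL && resource != NULL && version != NULL);

    /* %s stops at white space, so the trailing CRLF is not copied. */
    if (sscanf(line, "%15s %255s %15s", method, resource, version) != 3)
        return -1;
    return 0;
}
```

The server would typically check that the method is GET before it looks up the resource.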
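The server replies with an HTTP response:
HTTP/1.1 200 OK            Status line: HTTP version, status code, description
Content-Type: text/html
Content-Length: 2061
CRLF
Headers are delimited by CR/LF sequences. An empty line (a CRLF on its own) ends the headers, and the actual data follows.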
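A server could assemble the status line and headers shown above with a single formatted write. This is a sketch only; build_response_header() is a hypothetical name, and a real response usually carries more headers than these two.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the status line, the Content-Type and Content-Length headers and
   the empty line that ends the headers into out.
   Returns the number of characters written, or -1 if out is too small. */
int build_response_header(int status, const char *description,
                          const char *mime_type, long content_length,
                          char *out, size_t out_size)
{
    assert(description != NULL && mime_type != NULL && out != NULL);

    int n = snprintf(out, out_size,
                     "HTTP/1.1 %d %s\r\n"        /* status line */
                     "Content-Type: %s\r\n"
                     "Content-Length: %ld\r\n"
                     "\r\n",                     /* empty line: data follows */
                     status, description, mime_type, content_length);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

The file contents are sent right after this header block, and Content-Length tells the browser how many bytes to expect.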
29.13 File Extension and MIME
File extensions are non-standard across different platforms and cannot be used to determine the type of contents of any file.
Some common MIME types:
image/gif     GIF image
image/jpeg    JPEG image
text/html     HTML document
text/plain    plain text
In an HTTP response, a Web Server tells the browser the MIME type of the data being sent. The MIME type is used by the browser to handle the data appropriately, i.e. show an image, display HTML etc.
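On the server side, the MIME type to announce has to come from somewhere. One common convention, assumed here rather than taken from the lecture, is a small table that maps the server's own file extensions to the MIME types listed above. The mime_table contents and the mime_type_for() name are hypothetical, and application/octet-stream is used as the fallback for unknown files.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical server-side table: local file extension to MIME type. */
struct mime_entry {
    const char *ext;
    const char *type;
};

static const struct mime_entry mime_table[] = {
    { "gif",  "image/gif"  },
    { "jpg",  "image/jpeg" },
    { "jpeg", "image/jpeg" },
    { "html", "text/html"  },
    { "htm",  "text/html"  },
    { "txt",  "text/plain" },
};

/* Compares two strings without regard to letter case. */
static int equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Returns the MIME type to send for filename, or a generic binary type
   when the extension is missing or not in the table. */
const char *mime_type_for(const char *filename)
{
    assert(filename != NULL);

    const char *dot = strrchr(filename, '.');
    if (dot != NULL) {
        for (size_t i = 0; i < sizeof mime_table / sizeof mime_table[0]; ++i)
            if (equal_ignore_case(dot + 1, mime_table[i].ext))
                return mime_table[i].type;
    }
    return "application/octet-stream";
}
```

The browser never sees the table; it only sees the resulting Content-Type header, which is why it does not need to rely on the file extension.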
29.14 MIME Encoding
MIME, short for Multipurpose Internet Mail Extensions, is a specification for formatting non-ASCII messages so that they can be sent over the Internet. It enables us to send and receive graphics, audio, and video files via the Internet mail system.
There are many predefined MIME types, such as GIF graphics files and PostScript files. It is also possible to define your own MIME types.
In addition to e-mail applications, Web browsers also support various MIME types. This enables the browser to display or output files that are not in HTML format.
MIME was defined in 1992 by the Internet Engineering Task Force (IETF). A new version, called S/MIME, supports encrypted messages.
29.15 HTTP Status Codes
404 Not Found
- requested document not found on this server
200 OK
- request succeeded, requested object follows later in this message
400 Bad Request
- request message not understood by server
302 Object Moved
- requested document has been moved to some other location
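In a server, the four codes above are typically kept in one place so that the status line always carries a matching description. A minimal sketch follows; status_description() is a hypothetical name, and the "Unknown" fallback is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Returns the description used in the status line for the status codes
   covered in this lecture, or "Unknown" for any other code. */
const char *status_description(int status)
{
    assert(status >= 100 && status <= 599);   /* HTTP codes have three digits */

    switch (status) {
    case 200: return "OK";
    case 302: return "Object Moved";
    case 400: return "Bad Request";
    case 404: return "Not Found";
    default:  return "Unknown";
    }
}
```

A server that cannot find /courses/win32.html would use status_description(404) to build the line HTTP/1.1 404 Not Found.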
29.16 HTTP Redirection
HTTP/1.1 302 Object Moved
Location: http://www.ku.com
CRLF
Most browsers will send another HTTP request to the new location, i.e. http://www.ku.com.
This is called Browser Redirection.
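What the browser does with such a response can be sketched as follows. This is an illustration under simplifying assumptions: find_redirect_location() is a hypothetical name, only status 302 is treated as a redirect, and the header name is matched with the exact spelling "Location:" although HTTP header names are not case-sensitive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* If response is a 302 reply, copies the value of its Location header
   into out. Returns 0 on success, -1 if the response is not a 302,
   has no Location header, or out is too small. */
int find_redirect_location(const char *response, char *out, size_t out_size)
{
    assert(response != NULL && out != NULL);

    int status = 0;
    if (sscanf(response, "HTTP/%*d.%*d %d", &status) != 1 || status != 302)
        return -1;

    const char *line = response;
    while (line != NULL && *line != '\0') {
        if (strncmp(line, "\r\n", 2) == 0)
            break;                               /* empty line: headers end */
        if (strncmp(line, "Location:", 9) == 0) {
            const char *value = line + 9;
            while (*value == ' ' || *value == '\t')
                ++value;                         /* skip space after ':' */
            size_t len = strcspn(value, "\r\n");
            if (len == 0 || len >= out_size) return -1;
            memcpy(out, value, len);
            out[len] = '\0';
            return 0;
        }
        const char *eol = strstr(line, "\r\n");  /* move to the next line */
        line = (eol != NULL) ? eol + 2 : NULL;
    }
    return -1;
}
```

The browser would then issue a fresh request for the URL that was copied into out.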
29.17 HTTP Request per 1 TCP/IP Connection
The HTML text is received from the Web Server in one HTTP request. The browser reads the whole HTML web page and paints its client area according to the HTML tags specified. The browser generates one fresh HTTP request for each image specified in the HTML file.
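The arithmetic behind this is simple: one request for the page plus one per image. The sketch below counts image tags in a piece of HTML to estimate that number. It is a rough illustration only: count_page_requests() is a hypothetical name, the scan looks for literal img tags rather than parsing HTML, and it assumes the browser fetches every image separately without caching.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Estimates how many HTTP requests a browser issues for a page:
   one for the HTML text plus one for each <img ...> tag found in it. */
size_t count_page_requests(const char *html)
{
    assert(html != NULL);

    size_t requests = 1;                        /* the HTML document itself */
    for (const char *p = html; *p != '\0'; ++p) {
        /* && stops at the first mismatch, so the scan never reads
           past the terminating '\0'. */
        if (p[0] == '<' &&
            tolower((unsigned char)p[1]) == 'i' &&
            tolower((unsigned char)p[2]) == 'm' &&
            tolower((unsigned char)p[3]) == 'g' &&
            (isspace((unsigned char)p[4]) || p[4] == '>' || p[4] == '/'))
            ++requests;                         /* one more request per image */
    }
    return requests;
}
```

For a page with two images the function returns 3, which is also the number of requests our server should expect from one visit to that page.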
29.18 Server Architecture
Our server architecture will be based upon the following points:
• Ability to serve up to 5 clients simultaneously
• Multi-threaded HTTP Web Server
• 1 thread dedicated to accept client connections
• 1 thread per client to serve HTTP requests
• 1 thread dedicated to perform termination housekeeping of communication threads
• Use of Synchronization Objects
Many WinSock function calls, e.g. accept(), are blocking calls, so one thread is dedicated to accepting connections. The server needs to serve up to 5 clients simultaneously, and each communication thread uses other blocking WinSock calls, so every client gets its own thread. Communication threads terminate asynchronously, so a separate thread is needed to perform their termination tasks.
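The bookkeeping shared by the three kinds of threads can be sketched as a small slot table. This is a single-threaded illustration only: it creates no threads or sockets, and all names (struct client_table, table_claim() and so on) are hypothetical. In the real server, every call would have to be protected by a synchronization object, because the accept thread, the communication threads and the housekeeping thread all touch the table.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CLIENTS 5   /* the server serves up to 5 clients at once */

enum slot_state { SLOT_FREE, SLOT_BUSY, SLOT_FINISHED };

/* One entry per possible communication thread. */
struct client_table {
    enum slot_state slots[MAX_CLIENTS];
};

void table_init(struct client_table *t)
{
    assert(t != NULL);
    for (int i = 0; i < MAX_CLIENTS; ++i)
        t->slots[i] = SLOT_FREE;
}

/* Accept thread: called after accept() returns a new client.
   Returns the slot index for the new communication thread,
   or -1 if 5 clients are already being served. */
int table_claim(struct client_table *t)
{
    assert(t != NULL);
    for (int i = 0; i < MAX_CLIENTS; ++i) {
        if (t->slots[i] == SLOT_FREE) {
            t->slots[i] = SLOT_BUSY;
            return i;
        }
    }
    return -1;
}

/* Communication thread: called just before the thread terminates. */
void table_finish(struct client_table *t, int slot)
{
    assert(t != NULL && slot >= 0 && slot < MAX_CLIENTS);
    assert(t->slots[slot] == SLOT_BUSY);
    t->slots[slot] = SLOT_FINISHED;
}

/* Housekeeping thread: releases the slots of terminated threads (the
   real server would also close their handles and sockets here).
   Returns the number of slots released. */
int table_reap(struct client_table *t)
{
    assert(t != NULL);
    int released = 0;
    for (int i = 0; i < MAX_CLIENTS; ++i) {
        if (t->slots[i] == SLOT_FINISHED) {
            t->slots[i] = SLOT_FREE;
            ++released;
        }
    }
    return released;
}
```

A finished slot is not reused until the housekeeping step has released it, which mirrors the separation between the communication threads and the thread that cleans up after them.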
Summary
In this lecture, we studied some terms and their roles. We studied HTTP (HyperText Transfer Protocol), which is used to transfer text data across the network. We also studied HTML, the HyperText Markup Language, which is simply a text script. HTML is loaded in a web browser, and the web browser interprets the text and carries out the instructions written in it. For transferring media such as image data and movie data, we gave an overview of MIME.
Note: For examples and more information, connect to the Virtual University resource online.
Exercises
1. Create a chat application. Using that application, you should be able to chat with your friend on the network.