Laboratory Week 4 - Implementing a Web Client
Learning Objectives:
- Implement a simple web client that requests URLs and displays the responses it receives.
- Learn the classes and the methods available in Perl's LWP to complete assignment 1 part II.
Required Reading:
Lectures: Week 3 lectures
Unit Reader 3: Web Clients pages 247-260
Access to Software:
In this lab, you will need access to Perl to write your scripts. See week 3's lab part B on how to do this. You will also need Internet access, as well as access to the Apache web server you set up in week 2's lab.
Access to Example Scripts:
The following exercises make use of Perl scripts described in the lectures. These example scripts are available in the unit coordinator's directory (~hiew/examples/web/) on gryphon. Use the "cp" command to copy the files to your work directory on gryphon. Eg.
gryphon:~> mkdir labwk4
gryphon:~> cd labwk4
gryphon:~/labwk4> cp ~hiew/examples/web/* .
Instructions:
Using your favourite web browser, access the URL http://www.murdoch.edu.au. Use your browser's View>Source function to look at the HTML source code for the retrieved page.
Use the get_url.pl script to access the same URL:
perl get_url.pl http://www.murdoch.edu.au
The script should display the contents of the HTML page. Compare the output to the HTML code in (1). You should see that they are the same. If the output from the script scrolls by too fast for you to read, you can add "| more" to the end of the above command. Eg.
perl get_url.pl http://www.murdoch.edu.au | more
Use another example script, getall.pl, to try the same access as in (2). The getall.pl script does exactly what get_url.pl does, except that it displays the whole response message instead of just the successfully retrieved resource. Note the difference between the code (last 2 lines) in the two scripts, and the difference between the output of the two scripts. Identify the status line, headers, and message body of the response.
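As a rough illustration of the three parts of a response message, the sketch below splits a hand-written sample response into its status line, headers and message body using only core Perl. The sample text and the subroutine name split_response are assumptions made up for this illustration. They are not taken from getall.pl, and a real client would normally leave this parsing to LWP.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP or the lecture scripts): split a raw
# HTTP response message into its status line, header lines and body.
sub split_response {
    my ($raw) = @_;

    # The headers end at the first blank line; everything after it is the body.
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $body = '' unless defined $body;

    # The first line of the head is the status line; the rest are headers.
    my ($status_line, @headers) = split /\r?\n/, $head;

    return ($status_line, \@headers, $body);
}

# Made-up sample response, for illustration only.
my $sample = join "\r\n",
    'HTTP/1.1 200 OK',
    'Server: ExampleServer/1.0',
    'Content-Type: text/html',
    '',
    '<html><body>Hello</body></html>';

my ($status_line, $headers, $body) = split_response($sample);

print "Status line: $status_line\n";
print "Header:      $_\n" for @$headers;
print "Body:        $body\n";
```

When you run getall.pl against a real site, the same three parts should be recognisable in its output: one status line, then a block of "Name: value" headers, then a blank line followed by the message body.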
Try using the getall.pl script to access other sites. Eg.
- http://mirror.aarnet.edu.au/
- http://www.microsoft.com.au/
- http://www.netscape.com.au/
Look at the information available in the responses from these sites. Note especially the status line and the headers. Try to interpret some of the basic header information (some of it is described in week 1 lecture 2). For example, can you determine what kind of web server was handling each of those sites?
Exercises:
Start your own Apache web server on gryphon. Use the getall.pl script to access resources from that server's web site (as established in lab week 3 part A exercise 1). Show the responses for each one of the following situations:
- Access a resource that exists.
- Access a resource that doesn't exist (ie. wrong URL).
What is in the response for each request? What is the value that a web client should look at to determine if the resource exists?
Modify getall.pl so that instead of displaying the response it receives, it displays the request that it sends out.
Modify your script from (2) so that you add a header with the name "LittleHeader" and value "ABC".
Modify getall.pl so that the script will only display the name of the server that sent the response, and the MIME-type of the response message body.
Modify getall.pl so that the script will display the response message only if the response came from an Apache web server.
Modify get_url.pl so that after receiving the response, the script displays lines in the content one line at a time, with a line count appearing at the beginning of each line. [Hint: use split to break up the whole response into individual lines.]
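The hint about split can be sketched on its own, away from get_url.pl. The example below works on a made-up string rather than a real response, and the subroutine name number_lines is hypothetical. How the content is obtained in your own script depends on the lecture code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $content, each one prefixed
# with its line number.
sub number_lines {
    my ($content) = @_;
    my @numbered;
    my $count = 0;

    # split breaks the text at each newline (tolerating CRLF line endings).
    for my $line (split /\r?\n/, $content) {
        $count++;
        push @numbered, "$count: $line";
    }
    return @numbered;
}

# Made-up content standing in for a retrieved page.
my $content = "<html>\n<body>\nHello\n</body>\n</html>\n";

print "$_\n" for number_lines($content);
```

Note that split discards trailing empty fields, so a final newline at the end of the content does not produce an extra numbered line.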
Modify get_url.pl so that after receiving the response, the script only displays lines in the content containing the string "http".
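One possible way to filter lines is to combine split with grep. This is a minimal sketch on made-up content. The subroutine name lines_containing is an assumption for illustration, and the match is a plain, case-sensitive substring test.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return only the lines of $content that contain
# the literal string $needle (case-sensitive).
sub lines_containing {
    my ($content, $needle) = @_;

    # index returns -1 when $needle does not occur in the line.
    return grep { index($_, $needle) >= 0 } split /\r?\n/, $content;
}

# Made-up content standing in for a retrieved page.
my $content = join "\n",
    '<html><body>',
    '<a href="http://www.murdoch.edu.au/">Murdoch</a>',
    '<p>No link on this line.</p>',
    '</body></html>';

print "$_\n" for lines_containing($content, 'http');
```

Using index rather than a regular expression avoids having to escape any special characters in the search string.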
Modify get_url.pl so that after receiving the response, the script displays all URLs contained in <a href=...>...</a> tags. Assume the simpler case that the tag will not be over multiple lines. Note however that the tag may contain any other attribute besides "href" appearing in any order.
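A regular expression is one way to approach the simpler single-line case. The sketch below is an assumption-laden illustration, not a full HTML parser: the subroutine name extract_links and the sample content are made up, and the pattern only handles an href value that is double-quoted, single-quoted or unquoted within a tag that sits on one line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the href values of all <a ...> tags in
# $content, assuming no tag is split across lines.
sub extract_links {
    my ($content) = @_;
    my @urls;

    for my $line (split /\r?\n/, $content) {
        # <a\b       : an "a" tag, but not e.g. <abbr>
        # [^>]*?\s   : any other attributes before href, in any order
        # href\s*=   : the attribute name, matched case-insensitively
        # then the value: "double quoted", 'single quoted' or unquoted
        while ($line =~ /<a\b[^>]*?\shref\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/gi) {
            push @urls, defined($1) ? $1 : defined($2) ? $2 : $3;
        }
    }
    return @urls;
}

# Made-up content standing in for a retrieved page.
my $content = join "\n",
    '<p><a class="nav" href="http://www.murdoch.edu.au/">Murdoch</a></p>',
    q{<p><a href='notes.html' target="_blank">Notes</a> <A HREF=index.html>Home</A></p>},
    '<abbr title="not a link">n/a</abbr>';

print "$_\n" for extract_links($content);
```

The /g modifier in the while loop lets the pattern find more than one link on the same line, and /i accepts tags written in upper case.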
Assessable Tasks:
Exercises 1 to 5 above.
Internal students should demonstrate their scripts by executing them with their tutor present. This must be done in week 4 or week 5. No marks will be awarded if the work is demonstrated after that; your tutor has no discretionary power over this deadline.
External students should submit the scripts and sample runs by the deadline indicated in the Study Schedule.
Perl References:
The purpose of this lab is only to get you familiar with the example web client from the lectures. However, it doesn't cover everything that is available from the lecture notes and the Unit Reader. I suggest you try some of the other methods available in HTTP::Request and HTTP::Response modules as listed in Unit Reader 4 to familiarise yourself with the modules.
There will be times when the material supplied in the lectures and Unit Reader is not sufficient to support what you want to do in your scripts. When that situation arises, you may refer to the Perl references as suggested in week 2 lecture 2 Introduction to Perl.
Final Reminder:
Remember to stop your server from Exercise 1 before logging off!
H.L. Hiew
Unit Coordinator
Document author: H.L. Hiew, Unit Coordinator
Last Modified: Wednesday, 25-Feb-2004 03:00:34 MST
Disclaimer & Copyright Notice © 2004 Murdoch University
This document is relevant for semester 1, 2004 only