CJST 4524/6604 – Module 1 Assignment
Web page source code and using online network tools
With the abundance of web sites, there is a possibility of gaining open source information about the individuals behind the site when we are conducting an investigation. In this assignment, you will be learning how to examine the code behind a web page and how to use tools that are available online that can provide us information about a domain.
Web page source code
The first thing to do in this assignment is to find a web page that links to at least five other web pages at other domains. Once you have the selected web page loaded, you need to look at the source code behind the page. All browsers will let you see the source code. The following menu choices open the source code in a separate window in several popular browsers. If you are using a browser not included here, you may have to poke about a bit to find the view source option.
Chrome
View | Developer | View Source
Firefox
Tools | Web Developer | Page Source
Safari
View | View Source
Internet Explorer
Right mouse click | View Source
Once you have the window with the source code open, it should look something like the image below.
Web pages are written in HTML, Hyper Text Markup Language. The information that controls how the text, images, and links are placed on the page is contained in commands called tags. HTML tags are enclosed in angle brackets, < and >. The tag we are interested in for this assignment is the one that contains the address of other web pages that a page links to, the “a href” (anchor hypertext reference) command shown below.
This command contains, within the quotation marks, the address of the web page that will be loaded if the individual viewing the page clicks on the link. In the example above, the address is http://www.newhaven.edu/prospective/. The domain of the link immediately follows the http://. In this example, the domain is www.newhaven.edu. In the first part of the assignment, you should find five different domains on the web page you have chosen.
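Reading the source by eye works, but the same search for “a href” commands can also be automated. The following is a minimal sketch, not a required part of the assignment, that uses Python's standard library to list the domains a page links to. The names LinkCollector and linked_domains, the link text, and the sample HTML string are made up for illustration; only the newhaven.edu address comes from the example above.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def linked_domains(html_text):
    """Return the distinct host names in absolute http(s) links, in page order."""
    collector = LinkCollector()
    collector.feed(html_text)
    domains = []
    for href in collector.hrefs:
        parsed = urlparse(href)
        # Relative links such as "/about/" have no host and are skipped.
        if parsed.scheme in ("http", "https") and parsed.hostname:
            if parsed.hostname not in domains:
                domains.append(parsed.hostname)
    return domains


# Hypothetical page fragment; paste the real page source here instead.
sample = (
    '<p><a href="http://www.newhaven.edu/prospective/">Prospective Students</a> '
    '<a href="/about/">About</a></p>'
)
print(linked_domains(sample))
```

Only links that begin with http:// or https:// are counted, since relative links stay on the same domain as the page being examined.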
Using online network tools
There are many web sites that offer free online tools that assist in investigating Internet domains. One of my favorites is:
Network Tools: The Trusted Free Online Network Tools Provider For 20 Years
but feel free to Google “online whois” and try any of the other sites that you find.
First, when we are looking up domains, we only need the name of the domain and its associated TLD (Top Level Domain). In the example, the domain name is newhaven and the TLD is .edu. We don’t want anything before the domain name, such as www, and we don’t want anything following the TLD.
The example to the left shows what options are available through the network-tools.com web page menu. For this exercise, make sure that “Express” is selected. Using the network-tools web site, we are going to use several different network tools all at the same time.
The image below shows where to type in the domain name that is being investigated. Once you have typed in the domain name followed by the TLD, just click the “GO!” button.
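The trimming rule just described can be sketched in a few lines of Python. This is a simplified illustration under one stated assumption: the TLD is a single label such as .edu or .com. It will give the wrong answer for two-part suffixes such as .co.uk. The function name lookup_name is made up for this example.

```python
from urllib.parse import urlparse


def lookup_name(address):
    """Reduce a URL or host name to 'name.tld' (assumes a one-label TLD)."""
    if "://" in address:
        # Full URL: let urlparse pull out the host part.
        host = urlparse(address).hostname or ""
    else:
        # Bare host, possibly followed by a path: drop everything after the first slash.
        host = address.split("/")[0]
    labels = host.strip(".").lower().split(".")
    if len(labels) < 2 or not all(labels):
        raise ValueError("no domain name and TLD found in: " + repr(address))
    # Keep only the last two labels: the domain name and the TLD.
    return ".".join(labels[-2:])


print(lookup_name("http://www.newhaven.edu/prospective/"))
```

For the example address this drops both the leading www and the trailing /prospective/ and leaves newhaven.edu, which is what gets typed into the lookup box.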
The first result that is returned is the lookup information. This gives us the IPv4 address that is associated with the domain we entered. If we have an IP address instead of a domain name, we can enter the IP address to find the name of the domain associated with it. We will also be given any aliases used by the domain and the geographical location of the IP address.
Next, we see the results from the traceroute. The traceroute shows all of the hops that a packet makes while traveling from our computer to the domain we are investigating. While this information is not always that helpful, it can give us an idea of how well connected a domain is to the Internet. Often, though, the information ends up being unavailable, as the example below shows.
The next information we get tells us the address of the DNS servers used by the domain. Every domain needs to have a DNS server to translate domain names to IP numbers and vice versa.
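The name-to-address and address-to-name parts of this lookup can be reproduced with Python's standard socket module. The sketch below is only an approximation of what a lookup tool does: it does not return a geographical location, and it relies on whatever DNS resolver your own computer is configured to use. The function names is_ip_address and lookup are made up for this example.

```python
import ipaddress
import socket


def is_ip_address(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def lookup(target):
    """Forward lookup for a domain name, reverse lookup for an IP address.

    Needs a working network connection; raises OSError if the lookup fails.
    """
    if is_ip_address(target):
        # IP address in, host name out.
        name, aliases, addresses = socket.gethostbyaddr(target)
    else:
        # Domain name in, IPv4 addresses out.
        name, aliases, addresses = socket.gethostbyname_ex(target)
    return {"name": name, "aliases": aliases, "addresses": addresses}


# Example call (requires network access):
# print(lookup("newhaven.edu"))
```

The returned dictionary carries the same three pieces a lookup tool reports: the primary name, any aliases, and the list of addresses.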
The answer records contain more information about the DNS server as well as what servers the domain hosts. MX indicates a mail server. NS is a name server. “A” is the IP address associated with the domain.
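When a domain returns many answer records, it can help to sort them by type. The sketch below assumes the records are laid out in the common zone-file style of "name TTL class type data", one record per line; the layout a particular web site displays may differ. The sample records are invented and use the reserved example.com name and the 192.0.2.0/24 documentation address range, and the function name group_records is made up for this example.

```python
def group_records(answer_text):
    """Group DNS answer lines of the form 'name TTL class type data' by record type."""
    groups = {}
    for line in answer_text.splitlines():
        fields = line.split()
        # Skip blank lines, comment lines, and anything too short to be a record.
        if len(fields) < 5 or fields[0].startswith(";"):
            continue
        record_type = fields[3]
        data = " ".join(fields[4:])
        groups.setdefault(record_type, []).append(data)
    return groups


# Invented sample records for illustration only.
SAMPLE_ANSWER = """\
example.com. 3600 IN A 192.0.2.10
example.com. 3600 IN NS ns1.example.com.
example.com. 3600 IN NS ns2.example.com.
example.com. 3600 IN MX 10 mail.example.com.
"""

for record_type, values in group_records(SAMPLE_ANSWER).items():
    print(record_type, values)
```

In the MX entry, the number in front of the mail server name is its preference value; lower numbers are tried first.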
Last is the “whois” search. Whois returns the name of the organization to whom the domain is registered. Also returned is information about what DNS servers the domain is using, when the domain was created, the last time there was a change in any information about the domain, and when the domain expires. Very often, the names, snail mail addresses, and e-mail addresses of the individual responsible for the administration of the domain and the person responsible for the technical aspects of the domain are also returned, as in the example.
Where does all of this information reside? There are several organizations responsible for maintaining the databases of information about domains and their associated IP addresses. These organizations cover different geographical areas. ARIN covers the United States, Canada, and parts of the Caribbean; RIPE NCC covers Europe, the Middle East, and parts of Central Asia; and so on. Often, though, the information about a domain can exist on several smaller databases under the control of the main organizations.
Unfortunately, the system is not perfect. Very often, there is no one to check whether the names and addresses of those registering a domain are, in fact, legitimate. There are also companies, such as GoDaddy and Tucows, which register domains for individuals. While the individual still registers the domain name, a whois lookup will return the name of the registration company. This protects the privacy of someone registering a domain, but can make it more difficult for investigators.
Since the information about a domain can reside in more than one database, the investigator should try more than one online tool service to see if there are any differences. We won’t do that in this first exercise.
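Behind the web form, a whois search is a very simple exchange: the client opens a TCP connection to port 43 on a whois server, sends the domain name followed by a line break, and reads text until the server closes the connection. The sketch below shows that exchange and a small helper that pulls "Field: value" lines out of the reply. It assumes whois.iana.org as the starting server, which typically answers with a referral to the server for the TLD rather than the full record, and the sample reply is invented for illustration. The function names whois_query and parse_fields are made up for this example.

```python
import socket


def whois_query(domain, server="whois.iana.org", port=43, timeout=10):
    """Send one whois query and return the raw text reply (requires network access)."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closes the connection when it is done
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_fields(reply_text):
    """Collect 'Field: value' lines into a dict mapping each field to a list of values."""
    fields = {}
    for line in reply_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "%#":
            continue
        key, separator, value = line.partition(":")
        if separator and value.strip():
            fields.setdefault(key.strip(), []).append(value.strip())
    return fields


# Invented reply text for illustration only.
SAMPLE_REPLY = """\
% sample reply, not real data
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Name Server: NS1.EXAMPLE.COM
Name Server: NS2.EXAMPLE.COM
"""

print(parse_fields(SAMPLE_REPLY))

# Example live call (requires network access):
# print(whois_query("newhaven.edu"))
```

Field names and layout vary from one whois server to another, so treat the parsed result as a convenience for reading the reply, not as a replacement for reading it.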
So, to complete your first assignment, investigate the web page you have chosen and the five other domains your page links to. In a Word document, list each domain name and include a screen capture of the information that network-tools returns to you.
Using the information above – complete the following assignment
Assignment:
• Select a fairly good-sized web page and, using an online whois tool of your choice, trace the approximate geographic location and the owner of your chosen site.
• Record all the information you can find about the page and submit the following:
o a) the URL (address) of your chosen web page;
o b) the source code for the page;
o c) the IP address, domain name, approximate geographical address of the server based on the IP address, administrative contact & address, technical contact & address, and other information you might find useful in an investigation.
• Include the answers to the following questions:
o a) if you were not able to obtain direct contact information with the owners of the domain, why was that?
o b) did you uncover any other discrepancies, such as the administrative or technical contacts appearing to be in a different geographic location than the computers hosting the domain? Hypothesize about what this could mean.