- Number of slides: 147
The Internet
Overview § An introduction to HTML § Dynamic HTML § Encryption § Public Key Infrastructure § Development of the Internet § Web Browsers
Top 10 uses of Internet at Work (2000) § 1. E-mail: 73% § 2. Business related research: 35% § 3. Academic Research: 23% § 4. General browsing/surfing: 17% § 5. IT information: 11% § 6. Downloading Software: 11% § 7. News information: 10% § 8. Searching for personal information: 9% § 9. Reading Magazines/Newspapers: 7% § 10. Sports information: 7%
Overall Structure of Internet
How does the World Wide Web work? 1. The user must have a program called a "browser" running on the computer: Internet Explorer (IE) or Netscape. 2. The user establishes a connection with an ISP (Internet Service Provider) via dialup or a LAN (local area network). 3. The user types a URL (Uniform Resource Locator), the address of the target web page, into the browser's address field. For example, http://www.csd.uwo.ca/~cs031
4. (Steps 4-6 happen behind the scenes.) Through the ISP, the host name in the URL is translated into a numerical IP (Internet Protocol) address, e.g. 130.100.11.3. 5. The user's browser uses the IP address to establish a connection, via local, regional, and/or national ISPs, with the target computer (a web server). 6. The web page that the user wants, an HTML page, is sent back to the user's browser. 7. The user's browser interprets the HTML commands and displays the formatted page to the user. HTML pages can have § Formatting information (text formatting, framing, etc.) § Hyperlinks (the user clicks on one and the browser repeats steps 3-6) § Multimedia (pictures, audio, video, animations)
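The steps above can be sketched in a few lines of Python. This is only an illustration of what a browser might do with the URL from step 3: the function name build_get_request is made up for this sketch, and the name lookup and network connection are described in comments rather than performed, so the example runs without Internet access.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Split a plain-HTTP URL and build the request text a browser could send."""
    parts = urlsplit(url)
    host = parts.hostname          # the name that step 4 turns into an IP address
    port = parts.port or 80        # 80 is the default port for HTTP
    path = parts.path or "/"       # the page being asked for
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    )
    return host, port, request


host, port, request = build_get_request("http://www.csd.uwo.ca/~cs031")
# Step 4 would be: socket.gethostbyname(host)            -> numerical IP address
# Step 5 would be: socket.create_connection((host, port)) -> link to the web server
# Step 6: the server replies to the request text with the HTML page.
print(host, port)
print(request)
```

The request text is all the server needs to find the page; everything else (formatting, images, links) is worked out by the browser from the HTML that comes back.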
A Simple Example (simple.html)
- Adventurous travelling around the world
- Watching good movies
- Reading news at CNN
Building Webpages § Writing HTML files directly (using Notepad or other text editors) § Using MS Word and saving as HTML § Using specialized software: MS FrontPage, Dreamweaver, etc. § Adding animations, forms, JavaScript, database functionality, …
Writing Simple HTML pages § Start Notepad and write HTML code directly § Save it as an HTML file (e.g., my.html) § Start a browser (e.g., Internet Explorer) § Click File > Open, then click Browse to locate and open the HTML file (e.g., my.html). You will see how the HTML file is displayed!
HTML § HTML – HyperText Markup Language § A language used to define the content of, and the presentation instructions for, a Web document
§ When a browser presents a Web document, the browser scans the document and applies the presentation instructions to the content § Content that does not have presentation instructions will be presented using default instructions built into the browser
§ HTML documents must employ a simple format so anyone can create documents § HTML documents are stored in text (ASCII) files § This type of document can be created using any editor that allows you to save the document as a text file
§ To combine the content and the presentation instructions in the same file, there must be a way to distinguish between these two components § In HTML, the presentation instructions are inserted as “tags” § Anything that isn’t a presentation instruction is content
§ HTML tags normally occur in pairs § The pair of tags surrounds the content to which they apply § A start tag is indicated with angle brackets, for example <B> § The matching end tag adds a slash, for example </B>
§ HTML has a set of predefined tags § These tags can be used to § Control how the text in the document is displayed § Insert images into the document § Insert links to other documents
Document Tags § HTML documents are enclosed within <HTML> and </HTML> tags § Every HTML document will have a head and a body § The document head is enclosed within the <HEAD> and </HEAD> tags § The body is enclosed within the <BODY> and </BODY> tags § The basic structure of an HTML document is <HTML> <HEAD> … </HEAD> <BODY> … </BODY> </HTML> § The <HEAD> of the document contains information used by the browser § All of the content for the document and the associated presentation instructions are placed inside the <BODY> tags
Formatting Tags § HTML contains tag definitions that allow you to control § Headings § Style § Ordered Lists § Unordered Lists § Definition Lists § etc.
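As a rough illustration of the head/body structure, the sketch below holds a small page in a Python string and walks through its tags with the standard html.parser module. The page text and the TagCollector class are invented for this example; they are not taken from the course material.

```python
from html.parser import HTMLParser

# A made-up page showing the usual skeleton: HTML encloses HEAD and BODY.
PAGE = """<HTML>
<HEAD>
<TITLE>My Page</TITLE>
</HEAD>
<BODY>
<H1>Hello</H1>
<P>Content goes in the body.</P>
</BODY>
</HTML>"""


class TagCollector(HTMLParser):
    """Record every start tag and end tag in the order they appear."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.end_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)   # tag names arrive in lower case

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


collector = TagCollector()
collector.feed(PAGE)
collector.close()
print(collector.start_tags)
print(collector.end_tags)
```

Every start tag in this page has a matching end tag, and the HTML pair is the first to open and the last to close.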
Heading Tags § There are six heading levels § The levels are named H1, H2, H3, … H6, where H1 is the largest and H6 is the smallest § To create a heading, you enclose the text of the heading inside the opening and closing tags for the heading level
Heading Examples
Physical Style Tags § Used to control the display of text § <B> - bold § <I> - italics § <U> - underline § <TT> - typewriter typeface
Physical Style Tag Example
Logical Style Tags § Examples of logical style tags § <EM> - for emphasis § <STRONG> - stronger emphasis § <CITE> - citation § <CODE> - computer code
Logical Style Tag Example
Layout Style Tags § Used to control text layout § <P> - new paragraph § <BR> - break, start a new line § <HR> - horizontal rule, draw a line
Layout Style Tag Example
Lists § Lists of data can be defined using § Ordered List – enumerated lists § Unordered List – bulleted lists § Definition List – lists that are made of terms and their associated definitions
Ordered List § Use the <OL> and <LI> tags § The TYPE parameter controls what enumeration scheme is used § The types are: § 1 – numbers (default) § a – lower case letters § A – upper case letters § i – small Roman numerals § I – large Roman numerals
Ordered List
Unordered List § Use the <UL> and <LI> tags § The TYPE parameter can be used to control the look of the list § The types are: § Disc – a solid disc § Circle – a hollow circle § Square – a square symbol
Unordered List
Definition List § Use the <DL>, <DT> and <DD> tags
Definition List
URL § A URL is a Uniform Resource Locator § A URL contains information about § The address of a document on the Internet § The protocol that will be used to access the document
Protocols § HTTP – HyperText Transfer Protocol § Designed to transmit files on the World Wide Web § FTP – File Transfer Protocol § Designed to transmit files over the Internet (before the Web developed) § ftp://ftp.csd.uwo.ca § Email: mailto:ling@csd.uwo.ca § These protocols are sets of rules that dictate how files are transmitted between computers
URL Example § In the following URL example, the protocol to be used is HTTP (the part before the “://”) § The document is “browse.html” and it is located in the “selected” folder at the World Wide Web site for UWO in Canada
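To make the parts of a URL concrete, here is a small sketch using Python's urllib.parse. The URL string itself is an assumption: it is built from the description above (HTTP, the "selected" folder, the document "browse.html") and the UWO host name used elsewhere in these slides, and may not be the exact address shown on the original slide.

```python
from urllib.parse import urlsplit

# Assumed URL, reconstructed from the slide's description for illustration only.
url = "http://www.uwo.ca/selected/browse.html"

parts = urlsplit(url)
protocol = parts.scheme                          # the part before "://"
site = parts.netloc                              # the Web site (host name)
folder, _, document = parts.path.rpartition("/") # split path at the last "/"

print("protocol:", protocol)
print("site:    ", site)
print("folder:  ", folder)
print("document:", document)
```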
Images § Images are added to documents using the <IMG> tag § A closing </IMG> tag is not required § The SRC parameter is used to indicate the SouRCe of the image
Image Formats § Standard image formats are needed so images can be § stored § retrieved § transmitted over the Web
§ Examples of image formats used on the Web are: § GIF – Graphics Interchange Format § JPG (JPEG) – Joint Photographic Experts Group § PNG – Portable Network Graphics § BMP – Windows Bitmap
Graphics Interchange Format § Uses the Lempel-Ziv-Welch (LZW) compression algorithm § The algorithm looks for repeated sequences of pixel values, such as big blocks of the same color, and replaces them with short codes § This compression reduces the size of the image
§ The algorithm also uses an indexed color scheme, in which a custom color palette for the image is selected using only 256 of the over 16 million available colors § This format is used when the image does not contain a wide range of colors or color shades
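The idea behind LZW can be shown with a simplified sketch. This is a textbook version of the algorithm working on a list of palette indexes; real GIF files add details (variable-width codes, clear and end codes) that are left out here, and the function names are chosen for this example only.

```python
def lzw_compress(data):
    """Replace repeated byte sequences with integer codes (simplified LZW)."""
    table = {bytes([i]): i for i in range(256)}   # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate                   # keep growing the known sequence
        else:
            codes.append(table[current])          # emit code for the known part
            table[candidate] = next_code          # remember the new sequence
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes from the codes, regrowing the same table."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:                   # sequence defined by this very step
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


# A made-up row of 80 pixels: 40 of palette color 7, then 40 of color 3.
pixels = bytes([7] * 40 + [3] * 40)
codes = lzw_compress(pixels)
print(len(pixels), "pixels ->", len(codes), "codes")
```

Because the compression is lossless, decompressing the codes gives back exactly the original pixels; large areas of one color are where the savings come from.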
Joint Photographic Experts Group § Images can contain millions of colors § Uses a lossy compression algorithm § When the image is compressed it permanently loses some of its quality § The algorithm looks for similar colors (like a range of reds) and chooses the same red for very close shades
§ If the original image had 1,000 shades of red, the compressed image may have only 500 shades § The human eye cannot detect all the shades so in general the loss will not be noticed § This format is used when the image contains many colors and many color shades
Portable Network Graphics § Portable Network Graphics format was designed to replace GIF § Uses lossless compression like GIF § Provides better resolution and more colors like JPG § Generates smaller files like GIF § Is not supported by all versions of browsers
Windows Bitmap § Every pixel in the image is represented by a piece of data § The data represents the color of the pixel § Bitmap images are very large § Rarely used on Web pages because of the time required to download the image
Image Tag
Anchors § Anchor tags (<A> and </A>) are used to insert hyperlinks and bookmarks into HTML documents § A hyperlink is a link to another document on the World Wide Web § A bookmark is a named location within an HTML document
No. 1 use of Anchors: Anchors as Hyperlinks § An example of a link to the UWO home page § When the HTML is rendered the document will contain a link to UWO
§ The Link Item is the text or image that you click on to activate the link § The HREF parameter is the Hypertext REFerence parameter § The HREF parameter is used to define the link destination
An Image as a Link
No. 2 use: Anchors as Bookmarks (in the same document) § An example of the definition of an (invisible) bookmark using the NAME parameter (normally in a long html file) …. .
§ An example of a link to a bookmark within the same document (in the same html document) § Note the use of # You can see conclusions here … Back to top … …
In a long html file (say papers. html) …. You can see conclusions here …… ……
No. 3 use: combining 1 and 2: Anchors as Hyperlinks to a bookmark in a different document § The form of the anchor tags used as a hypertext link is <A HREF="URL#bookmark">Link Item</A>
§ An example of a link to a bookmark within another html document Click here to jump to conclusions in that document. § See a real example in http://www.csd.uwo.ca/faculty/ling/cs031/simple.html § If linking to a bookmark in the same document, the URL is omitted
Web Page Example 1 § Create a Web page with § “My First Web Page” as the title § Your name as a level 2 heading § An enumerated list of your three favorite University courses § An image for the University. Try “http://www.uwo.ca/gifs/uwologo4.gif” as the source URL. If this URL doesn’t work, look at the HTML source for the University’s home page to find a URL
Web Page Example 2 § Create a Web Page with § A TV show name as a level 1 heading at the top of the page § A paragraph of text about the show § Bold the stars' names and italicize the night that the show is broadcast within this text § A horizontal line § A link to a Web page for the show. Use the name of the show as the link text § A horizontal line § A link to the heading at the top of the page, using “Top” as the link text
DHTML § Dynamic HTML § Supported by fourth generation and later browsers (Netscape and IE) § DHTML allows the user to interact with a web page § The user can enter values and select buttons
§ The user of a DHTML page can enter data and then have the data sent (posted) to a web site § The computer hosting the web site can then process the data § DHTML Example 1 § DHTML Example 2
Encryption § Encryption involves encoding a message to conceal the meaning § Consider the name NORMA JEAN BAKER § The name has been encrypted as OPSNB!KFBO!CBLFS § What is the encryption algorithm? § How would you decode the message?
§ For encryption to work there must be an algorithm that is applied to the original message § There must also be a way to decode the encrypted message to obtain the original message
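One algorithm that is consistent with the NORMA JEAN BAKER example is to replace every character with the next one in the character set (so a space becomes "!"). The sketch below treats the shift amount as the key; the function names are made up for this illustration, and other algorithms could produce the same output.

```python
def shift_encrypt(message, key=1):
    """Move every character forward by `key` positions in the character set."""
    return "".join(chr(ord(ch) + key) for ch in message)


def shift_decrypt(ciphertext, key=1):
    """Undo the shift: move every character back by `key` positions."""
    return "".join(chr(ord(ch) - key) for ch in ciphertext)


secret = shift_encrypt("NORMA JEAN BAKER")
print(secret)
print(shift_decrypt(secret))
```

Both requirements from the slide are visible here: an algorithm applied to the original message, and a way to reverse it for anyone who knows the key.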
§ Encryption algorithms use a binary “key” to encrypt and decrypt messages § There are two types of encryption algorithms used to secure Internet transmissions § § Symmetric Key Encryption Asymmetric (Public) Key Encryption
Symmetric Key Encryption § Symmetric Key Encryption uses the same key for both encryption and decryption § The sender and the receiver must both know the key § Both must ensure that the key is kept secret § If the key becomes public then others can decrypt valid messages and create fake messages
Key Length § For Symmetric Key Encryption, the typical key lengths are 40, 56 and 128 bits § Key length is one measure of encryption strength § Longer keys provide stronger encryption § Each additional bit in the key doubles the number of possible keys, and so doubles the work of a brute-force search
Data Encryption Standard § The Data Encryption Standard (DES) is the U.S. government’s standard for data encryption § Uses the Data Encryption Algorithm (DEA) to encrypt/decrypt the message § An improvement on the Lucifer algorithm developed by IBM in the early 1970s § Uses a 56 bit key
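The effect of key length is simple arithmetic: a key of n bits can take 2 to the power n different values. The short sketch below just prints those counts for the key lengths mentioned; key_count is a helper name invented here.

```python
def key_count(bits):
    """Number of different keys that fit in the given number of bits."""
    return 2 ** bits


for bits in (40, 56, 128):
    print(f"{bits:3d} bits: {key_count(bits):,} possible keys")

# One extra bit doubles the number of keys an attacker would have to try.
print(key_count(41) // key_count(40))
```

Going from 40 to 56 bits is only 16 more bits, yet it multiplies the number of keys by 65,536.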
Triple DES § Uses a key three times as long as Standard DES § 168 bit key § Used for banks and other organizations that transmit highly sensitive data
Public Key Encryption § Asymmetric Key Encryption § Uses a pair of keys, one public and one private § Key length is at least 512 bits § The public key is published so any sender can obtain it § The private key is kept secret
§ Messages encrypted using the public key can only be decrypted by using the private key § The reverse is also true: messages encrypted using the private key can only be decrypted using the public key § This is one way to generate a digital signature (to sign a message)
§ Rhonda wants to send an email to Rick § Rhonda finds Rick’s Public Key through a Public Key directory § She encrypts the message using Rick’s Public Key and sends the message § Rick uses his Private Key to decrypt the message (his Public Key will NOT decrypt the message) § For Rick to respond, he must use Rhonda’s Public Key to encrypt the message
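The Rhonda and Rick exchange can be imitated with a toy RSA-style key pair. The numbers below are deliberately tiny and chosen only for illustration (real keys are hundreds of digits long), the message is a single number rather than text, and the sketch assumes Python 3.8 or later for the modular inverse.

```python
# Toy key generation for "Rick" (illustrative values, far too small to be secure).
p, q = 61, 53
n = p * q                      # 3233, shared by both keys
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent: inverse of e modulo phi

rick_public_key = (e, n)       # published in the directory
rick_private_key = (d, n)      # kept secret by Rick


def apply_key(number, key):
    """Encrypt or decrypt a number with one half of the key pair."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65                                        # Rhonda's message as a number
ciphertext = apply_key(message, rick_public_key)    # Rhonda encrypts
recovered = apply_key(ciphertext, rick_private_key) # Rick decrypts
wrong = apply_key(ciphertext, rick_public_key)      # the public key does not undo it

# The reverse direction is the basis of signing: private key first, public key to check.
signature = apply_key(message, rick_private_key)
checked = apply_key(signature, rick_public_key)

print(ciphertext, recovered, wrong != message, checked)
```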
Encryption Strength § The strength of an encryption depends on the algorithm used and the length of the key § The algorithms used in most implementations of Public Key Encryption are patented by RSA Data Security Inc.
RSA Algorithms § The RSA Public Key Cryptosystem was developed in 1977 by § Ronald Rivest § Adi Shamir § Leonard Adleman § They have created a number of 128 bit key algorithms § For example, RC2 and RC4
Code Breaking § For Symmetric Key Encryption, the typical key lengths are 40, 56 and 128 bits § Tests have been conducted to determine how long it will take to break messages encoded using various key lengths
Key Length    Broken in …
40 bits       3 hours
48 bits       13 days
56 bits       40 days
128 bits      ???
§ The 128 bit encryption has not been broken yet! § “The sun will burn out first” is a frequent estimate of how long it will take!
US vs International Security § Under current U.S. policy, software manufacturers can only sell 40 bit key encryption systems overseas § Some exceptions can use 56 bit keys § International banks
§ In the U.S., 128 bit keys are recommended to ensure secure communications § Why would the U.S. want to restrict key length in software used in other countries?
Public Key Infrastructure § A Public Key Infrastructure is an encryption and digital certificate delivery system which makes secure electronic transactions possible § The X.509 Standard
§ PKI uses Digital Certificates § A digital signature § Digital Certificates carry the same legal weight as a written signature § Provides a way for others to verify your identity § Uses Public Key Encryption
§ A Digital Certificate relates you to a set of public and private keys § Digital Certificates are used to provide secure transactions through the Secure Sockets Layer Protocol (SSL)
SSL Protocol § Developed by Netscape § Goal is to provide secure and reliable communication between applications § For example, between a Web application (your browser) and a Web site
§ Public Key Encryption is used by each application to establish the identity of the other application § Symmetric Encryption is used for data encryption
§ Public Key Encryption is used to exchange the key used by the Symmetric Encryption of the data § The reliability of the message is ensured by including a Message Authentication Code (MAC) as part of the data
§ SSL takes the message to be transmitted and § fragments the data into manageable blocks § optionally, compresses the data § performs a message integrity check § encrypts the data § transmits the result
§ Received data is § decrypted § verified § decompressed § reassembled § delivered to the client
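The send and receive steps can be sketched with Python's standard library. This is not the real SSL record format: the block size, the shared key and the choice of HMAC with SHA-256 are all assumptions made for the example, and the encryption step is left out because the standard library has no symmetric cipher.

```python
import hashlib
import hmac
import zlib

MAC_KEY = b"shared-secret"   # hypothetical key both sides already share
BLOCK_SIZE = 16              # hypothetical fragment size in bytes
MAC_SIZE = 32                # length of a SHA-256 digest


def send(message):
    """Fragment, compress and add a Message Authentication Code to each block."""
    records = []
    for start in range(0, len(message), BLOCK_SIZE):
        fragment = message[start:start + BLOCK_SIZE]
        compressed = zlib.compress(fragment)
        mac = hmac.new(MAC_KEY, compressed, hashlib.sha256).digest()
        records.append(compressed + mac)      # encryption would happen here
    return records


def receive(records):
    """Verify, decompress and reassemble the blocks in order."""
    fragments = []
    for record in records:
        compressed, mac = record[:-MAC_SIZE], record[-MAC_SIZE:]
        expected = hmac.new(MAC_KEY, compressed, hashlib.sha256).digest()
        if not hmac.compare_digest(mac, expected):
            raise ValueError("integrity check failed")
        fragments.append(zlib.decompress(compressed))
    return b"".join(fragments)


message = b"Buy 100 shares of XYZ for account 42"
records = send(message)
print(len(records), "records;", receive(records) == message)
```

If even one byte of a record is changed in transit, the recomputed code no longer matches and the receiver rejects the data.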
Digital Trust § Public Key Infrastructure (PKI) manages all aspects of Digital Trust § In the digital world, trust requires § Privacy § Integrity § Non-repudiation § Authentication
Privacy § To ensure privacy, messages are encrypted § Encryption ensures that the message cannot be read in transit or by anyone except the recipient
Integrity § Verify the integrity of the message § Ensure that the message that is received is exactly what was sent
Non-repudiation § The sender cannot deny or repudiate a valid message § For example, when a stock broker receives an order for stock trades, the client cannot later claim that they didn’t send the message
Authentication § Verify that the sender is who they claim to be
The Internet § Networks of networks § Tens of thousands of computer networks § Reaches 100’s of millions of people § How did the Internet develop?
§ Started with ARPANET, an experimental project of the U.S. Department of Defense Advanced Research Projects Agency (DARPA) in 1969 § The original purpose was to explore experimental networking technologies for the military
§ How large is the Internet? § Nobody knows for sure! § According to the Internet Society (ISOC), a professional organization of Internet developers, influencers, and users, the Internet reaches more than 170 countries
Internet Growth
Year    Number of Hosts
1969    4
1974    62
1979    188
1984    1,024
1989    80,000
1994    2,217,000
1999    43,230,000
§ One of the reasons the Internet has been so successful is the commitment of its developers to producing “open” standards § The specifications or rules that computers need to communicate are publicly and freely available; published so that everyone can obtain them
TCP/IP § The standards that the Internet uses are known as TCP/IP § Transmission Control Protocol/Internet Protocol suite § Without open standards, only computers from the same vendor could talk to one another
§ Computers and networks that conform to the same communications standards are able to “interoperate”, regardless of the manufacturer § All of the networks and computers act as peers in the exchange of information and communication
Packets § Communication on the Internet revolves around the concept of a packet, a basic building block § All information and communications transmitted on the Internet are broken into packets, each of which is considered to be an independent entity
§ The packets are individually routed from network to network until they reach their destination, where they are reassembled and presented to the user
§ This method of networking is very flexible and robust § It allows diverse computers and systems to communicate by means of network software, not proprietary hardware
§ If a network goes “down” (breaks down), then the packets can be rerouted through other parts of the network of networks § This dynamic alternate routing of information creates a very persistent means of communication
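A small sketch can show why independent packets are so robust. The packet size and the use of a sequence number are assumptions made for the illustration; real Internet packets carry more header information than this.

```python
import random


def to_packets(message, size=8):
    """Break a message into (sequence number, payload) packets."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Put packets back in sequence order and join the payloads."""
    return "".join(payload for _, payload in sorted(packets))


message = "Packets may take different routes to the same destination."
packets = to_packets(message)

# Simulate packets arriving in a different order after taking different routes.
arrived = packets[:]
random.Random(0).shuffle(arrived)

print(len(packets), "packets")
print(reassemble(arrived))
```

Because every packet carries its own position, the order of arrival does not matter to the receiver.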
Internet Development § There have been three generations of Internet development § They characterize the evolution of the Internet
First Generation § There were three main First Generation Tools § Electronic mail § Remote logon § File transfer § These tools are still available on all parts of the Internet
Electronic Mail § Uses Simple Mail Transfer Protocol (SMTP) § Standardized in 1983 § Originally designed to transmit plain text § Printable characters § NOT binary files, graphics or sound
§ Current systems use Multipurpose Internet Mail Extensions (MIME) § MIME allows the email system to transport § Plain text, binary files, graphics and sound § MIME encodes and decodes complex messages into a simpler form that SMTP can transport
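Python's standard email package can show what MIME does to a message with a binary attachment. The addresses (apart from the csdept address used in these slides), the subject and the attachment bytes are made up for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"        # hypothetical sender
msg["To"] = "csdept@csd.uwo.ca"
msg["Subject"] = "Data file attached"
msg.set_content("The plain text part of the message.")

# Binary data that plain SMTP text could not carry directly.
binary_data = bytes(range(256))
msg.add_attachment(
    binary_data,
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)

wire_text = msg.as_string()               # what is handed to SMTP
attachment = next(msg.iter_attachments())

print(msg.get_content_type())
print(attachment["Content-Transfer-Encoding"])
```

The attachment is encoded as printable characters for transport and decoded back to the original bytes when it is read.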
§ Characteristics of email programs § Composition § Response § Read § Delete § Organize § Filter
Email Address § An email address consists of a local part and a host part § For example, csdept@csd.uwo.ca
csdept@csd.uwo.ca § The local part is a user name, mailbox, login name or user id § csdept § The host part is the name of an email server on the Internet § csd.uwo.ca
POP and IMAP § Protocols like the Post Office Protocol (POP) and the Internet Message Access Protocol (IMAP) are used to transmit email from your email server to your computer § Email travelling from your computer to your email server uses SMTP
Simple Mail Transfer Protocol § The Simple Mail Transfer Protocol (SMTP) is used to transmit email between email servers
§ To send an email § Construct the message on your computer § When you click on “Send”, the message is moved using SMTP to your email server § The email server uses the host part of the address to determine where to send the message
§ When the message arrives at the destination email server, it is stored and the recipient is notified of its arrival § When the recipient wants to read the message, it is moved using POP or IMAP to their computer
Remote Logon § Allows you to log on to a computer over the Internet § A utility that handles remote logon is Telnet § To remotely connect to a computer, you must know the address of the computer § For example, mccarthy.csd.uwo.ca
§ On most host computers, you must have an account on the computer § Some host computers allow you to logon as “Anonymous” or “Guest” with your email address as the password § Anonymous logon
File Transfer § The File Transfer Protocol (FTP) § Used to copy (download) files over the Internet
§ FTP was designed to copy plain text files § HTTP was designed to transmit text files, graphics, sound, etc. § FTP is faster than HTTP because FTP doesn't perform as many checks on the data during the download process
§ FTP allows you to § connect to another computer § list the files in a folder on the other computer § copy files back and forth between the two computers § Anonymous FTP allows you to logon as “Anonymous” or “Guest” with your email address as the password
Second Generation § The Second Generation saw large increases in § The amount of data being made public § The number of Internet users § There was an increasing need for tools that would aid users in finding resources
Tools § The first tool was Gopher § Developed at the University of Minnesota, where the mascot is a Golden Gopher!
§ Gopher was a hierarchical system of menus § The top level menu contained general categories § The information became more specific as you drilled down § Looked a lot like Yahoo!
Veronica § Very Easy Rodent-Oriented Net-wide Index to Computerized Archives § The University of Nevada § Gopher allowed you to search through the categories looking for interesting resources § But it was a manual search § Veronica allowed the user to submit keywords and the utility did a search of gopher space
Archie § Archie is derived from the word archive § Developed at the McGill University School of Computer Science § Maintained a database of all the names of files stored at known public FTP sites § Helped find files at FTP sites
Network News - USENET § USENET is a network within the Internet § Divided into newsgroups § Each newsgroup is devoted to a topic § To read or post to a newsgroup you need a news reader application
Newsgroups § More than 80,000 newsgroups § Newsgroups are divided into hierarchies § alt – 10,159 alternate groups § microsoft – 991 groups § bionet – 94 groups § biz – 48 groups § Newsgroups are added daily so these numbers are out of date!
Third Generation § The World Wide Web § Tools § Browsers § Search engines § Directories
World Wide Web § Originally developed at the European Laboratory for Particle Physics (also known as CERN) in Switzerland by Tim Berners-Lee § He developed a system to link together scholarly references § The links from one document to another are imagined to form a web!
§ The World Wide Web is a browsing and searching system § Built on the concept of hypertext and hypermedia
§ The Web is a continuous distributed information construction project § Tens of thousands of people are adding knowledge to it daily by bringing up their own servers or posting documents on existing servers
Browsers § A browser is application software § Browsers use HTML documents as their input § The HTML tags in the document are applied to the content and the result is displayed in the browser
Mosaic § The first popular graphical browser § It was developed at the National Center for Supercomputing Applications (NCSA) in Champaign, Illinois by Marc Andreessen § Allows a user to click on text, graphics, buttons or icons that link to other resources
Netscape § Developed by Netscape Communications Corporation § The company was founded in April of 1994 by Marc Andreessen, creator of the NCSA Mosaic software and Dr. James H. Clark, the founder of Silicon Graphics, Inc.
§ Microsoft’s Web browser is Internet Explorer § All browsers have the same basic functionality; they just have a slightly different “look and feel”
Browser Functionality § Typical functionality § Display HTML documents § Create bookmarks § Send and read email § Read news § Display and create the source HTML for documents § Debug script on DHTML pages
Search Engines § One of the most difficult tasks for a Web browser is to make it easy for the user to find resources § Search engines allow users to do keyword searches § These searches are actually database searches § Search engines keep databases that match keywords to document URLs
Directories § The top level of directories indicates general categories § As the user drills down into a category, they are presented with more specific categories
§ Consider WebCrawler and Google § These two are typical World Wide Web tools § They both provide basic and advanced search capabilities as well as directories
Advanced Searches § Each search engine has its own syntax for describing a search § Most engines AND together keywords § The document must have all of the keywords § The search engine should also support OR, NOT and exact phrases
§ Check out the WebCrawler and Google advanced search pages for examples of typical advanced search strategies
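A keyword database of the kind search engines keep can be sketched as a dictionary that maps each word to the set of document URLs containing it. The documents, URLs and function names below are invented for the illustration; real engines use their own, more elaborate syntax and ranking.

```python
# Hypothetical documents: URL -> text of the page.
DOCUMENTS = {
    "http://example.com/a": "gopher menus and archives",
    "http://example.com/b": "web browsers and search engines",
    "http://example.com/c": "search archives with archie",
}


def build_index(documents):
    """Map every keyword to the set of URLs whose text contains it."""
    index = {}
    for url, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


def search_and(index, *keywords):
    """Documents that contain ALL of the keywords."""
    sets = [index.get(word.lower(), set()) for word in keywords]
    return set.intersection(*sets) if sets else set()


def search_or(index, *keywords):
    """Documents that contain AT LEAST ONE of the keywords."""
    return set().union(*(index.get(word.lower(), set()) for word in keywords))


def search_not(index, documents, keyword):
    """Documents that do NOT contain the keyword."""
    return set(documents) - index.get(keyword.lower(), set())


index = build_index(DOCUMENTS)
print(search_and(index, "search", "archives"))
print(search_or(index, "gopher", "browsers"))
print(search_not(index, DOCUMENTS, "search"))
```

AND narrows a search because a page must appear in every keyword's set, while OR widens it by combining the sets.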
§ You can submit a page to be included in searches and directories § WebCrawler § Google § Search engine databases also get information about documents from programs called robots that explore the Web looking for documents to add to their database
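The way a robot explores the Web can be sketched without any network access by using a tiny pretend Web held in a dictionary. The three pages and their links are made up, and a real robot would download each page over HTTP instead of looking it up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# A hypothetical three-page "Web": URL -> HTML source of the page.
WEB = {
    "http://example.com/": '<A HREF="a.html">A</A> <A HREF="b.html">B</A>',
    "http://example.com/a.html": '<A HREF="b.html">B</A>',
    "http://example.com/b.html": '<A HREF="/">Home</A>',
}


class LinkExtractor(HTMLParser):
    """Collect the HREF value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start):
    """Follow links from the start page and list each page found once."""
    seen = []
    queue = [start]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in WEB:
            continue
        seen.append(url)                    # a robot would index the page here
        extractor = LinkExtractor()
        extractor.feed(WEB[url])
        queue.extend(urljoin(url, link) for link in extractor.links)
    return seen


found = crawl("http://example.com/")
print(found)
```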