HTTP is an object-oriented, application-layer protocol. Because it is simple and fast, it is well suited to distributed hypermedia information systems. It was proposed in 1990 and has been continually improved and extended through years of use and development. At the time of writing, the version used on the WWW is the sixth revision of HTTP/1.0, standardization work on HTTP/1.1 is ongoing, and proposals for HTTP-NG (Next Generation of HTTP) have been made.
The main features of the HTTP protocol can be summarized as follows:
1. Supports the client/server model.
2. Simple and fast: when a client requests a service from the server, it only needs to send the request method and the path. The commonly used request methods are GET, HEAD and POST. Each method specifies a different kind of contact between the client and the server. Because the HTTP protocol is simple, HTTP server programs are small and communication is fast.
3. Flexible: HTTP allows any type of data object to be transmitted. The type being transmitted is marked by Content-Type.
4. Connectionless: connectionless means that each connection handles only one request. Once the server has processed the client's request and received the client's acknowledgement, it disconnects. This saves transmission time.
5. Stateless: HTTP is a stateless protocol, that is, the protocol keeps no memory of earlier transactions. The lack of state means that if later processing needs earlier information, that information must be retransmitted, which may increase the amount of data sent per connection. On the other hand, the server responds faster when it does not need earlier information.
1. HTTP protocol in detail: the URL
HTTP (Hypertext Transfer Protocol) is a stateless, application-layer protocol based on a request/response model. It usually runs over TCP connections, and HTTP/1.1 adds a persistent connection mechanism. The vast majority of web development consists of web applications built on top of the HTTP protocol.
The format of an HTTP URL (a URL is a special kind of URI that contains enough information to locate a resource) is as follows: http://host[":"port][abs_path]
http indicates that the network resource is to be located through the HTTP protocol; host is a legal Internet host name or IP address; port specifies a port number, and if it is empty the default port 80 is used; abs_path specifies the URI of the requested resource. If the URL gives no abs_path, it must be given as "/" when the URL is used as a request URI. The browser usually does this for us automatically.
eg: Input: www.guet.edu.cn
The browser automatically converts it to: http://www.guet.edu.cn/
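The defaulting rules above (port 80 when no port is given, "/" when no abs_path is given) can be sketched in a few lines of Python. This is only an illustration: the function name normalize_http_url is made up for this example, and treating scheme-less input as an http URL is an assumption that mimics what a browser's address bar does.

```python
from urllib.parse import urlsplit

def normalize_http_url(text):
    """Return (host, port, abs_path) for an http URL, applying the defaults."""
    # Assumption: input without a scheme is treated as an http URL,
    # the way a browser's address bar would treat it.
    if "://" not in text:
        text = "http://" + text
    parts = urlsplit(text)
    host = parts.hostname
    port = parts.port if parts.port is not None else 80  # default port
    abs_path = parts.path or "/"  # an empty abs_path becomes "/"
    return host, port, abs_path

print(normalize_http_url("www.guet.edu.cn"))
print(normalize_http_url("http://www.guet.edu.cn:8080/index.html"))
```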
2. HTTP protocol in detail: the request
An HTTP request consists of three parts: the request line, the message headers, and the request body.
1. The request line begins with a method token, followed by the Request-URI and the protocol version, each separated by a space. The format is as follows: Method Request-URI HTTP-Version CRLF
Here Method is the request method; Request-URI is a uniform resource identifier; HTTP-Version is the HTTP protocol version of the request; CRLF is a carriage return followed by a line feed (apart from the terminating CRLF, no separate CR or LF character is allowed).
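As a rough illustration of the request-line format, the following Python sketch splits a request line into its three fields and rejects a line that does not end in CRLF or that contains a stray CR or LF. The function name parse_request_line is hypothetical, and real servers are usually more tolerant than this.

```python
def parse_request_line(line):
    """Split 'Method SP Request-URI SP HTTP-Version CRLF' into its three parts."""
    if not line.endswith("\r\n"):
        raise ValueError("request line must end with CRLF")
    content = line[:-2]
    if "\r" in content or "\n" in content:
        raise ValueError("a separate CR or LF is not allowed in the request line")
    fields = content.split(" ")
    if len(fields) != 3:
        raise ValueError("expected exactly three space-separated fields")
    method, request_uri, version = fields
    if not version.startswith("HTTP/"):
        raise ValueError("HTTP-Version must start with 'HTTP/'")
    return method, request_uri, version

print(parse_request_line("GET /form.html HTTP/1.1\r\n"))
```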
There are many request methods (all method names are upper case). Each is explained below:
GET      Requests the resource identified by the Request-URI
POST     Appends new data to the resource identified by the Request-URI
HEAD     Requests the response message headers for the resource identified by the Request-URI
PUT      Asks the server to store a resource, using the Request-URI as its identifier
DELETE   Asks the server to delete the resource identified by the Request-URI
TRACE    Asks the server to echo back the request it received; used mainly for testing or diagnostics
CONNECT  Reserved for future use
OPTIONS  Asks about the server's capabilities, or about the options and requirements associated with a resource
GET method: when you enter a URL in the browser's address bar to visit a page, the browser uses the GET method to fetch the resource from the server, eg: GET /form.html HTTP/1.1 (CRLF)
The POST method asks the server to accept the data attached to the end of the request. It is commonly used to submit forms.
eg: POST /reg.jsp HTTP/ (CRLF)
Accept: image/gif, image/x-xbit, ... (CRLF)
Host: www.guet.edu.cn (CRLF)
Content-Length: 21 (CRLF)
Connection: Keep-Alive (CRLF)
Cache-Control: no-cache (CRLF)
(CRLF) // This CRLF marks the end of the message headers; everything before it is message headers
user=jeffrey&pwd=1234 // This line is the submitted data
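The layout of such a POST request (request line, headers, an empty line, then the body) can be reproduced with a short Python sketch. Everything not in the example above is an assumption made for illustration: the function name build_post_request, the HTTP/1.1 version on the request line, and the Content-Type header. Content-Length is computed from the body rather than typed by hand.

```python
def build_post_request(path, host, form_body):
    """Assemble a raw POST request; Content-Length is derived from the body."""
    body = form_body.encode("ascii")
    header_lines = [
        f"POST {path} HTTP/1.1",  # assumption: HTTP/1.1
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",  # assumption
        f"Content-Length: {len(body)}",
        "Connection: Keep-Alive",
        "Cache-Control: no-cache",
    ]
    # Every header line ends with CRLF; one empty line ends the header section.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body

raw = build_post_request("/reg.jsp", "www.guet.edu.cn", "user=jeffrey&pwd=1234")
head, _, body = raw.partition(b"\r\n\r\n")
print(head.decode("ascii"))
print(body)
```

Splitting the raw bytes at the first empty line, as the last lines do, is also how a receiver separates the headers from the submitted data.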
The HEAD method is almost identical to the GET method. In the response to a HEAD request, the HTTP headers carry the same information as would be obtained through a GET request. With this method, information about the resource identified by the Request-URI can be obtained without transferring the resource's content. It is often used to test whether a hyperlink is valid, whether it is accessible, and whether it has been updated recently.
2. Request headers are described later.
3. The request body.
3. HTTP protocol in detail: the response
After receiving and interpreting a request message, the server returns an HTTP response message.
An HTTP response also consists of three parts: the status line, the message headers, and the response body.
1. The status line format is as follows:
HTTP-Version Status-Code Reason-Phrase CRLF
Here HTTP-Version is the server's HTTP protocol version; Status-Code is the status code of the response sent back by the server; Reason-Phrase is a textual description of the status code.
The status code consists of three digits. The first digit defines the class of the response and has five possible values:
1xx: Informational - the request has been received and processing continues
2xx: Success - the request has been successfully received, understood and accepted
3xx: Redirection - further action must be taken to complete the request
4xx: Client Error - the request contains a syntax error or cannot be fulfilled
5xx: Server Error - the server failed to fulfil a valid request
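Because the class is determined by the first digit alone, a client can classify any status code with an integer division, even a code it has never seen before. A minimal Python sketch (the names STATUS_CLASSES and status_class are illustrative only):

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code):
    """Map a three-digit status code to its class using the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]

for code in (200, 404, 503):
    print(code, status_class(code))
```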
Common status codes, their reason phrases, and explanations:
200 OK // The client request succeeded
400 Bad Request // The client request has a syntax error and cannot be understood by the server
401 Unauthorized // The request is unauthorized; this status code must be used together with the WWW-Authenticate header field
403 Forbidden // The server received the request but refuses to provide the service
404 Not Found // The requested resource does not exist, eg: a wrong URL was entered
500 Internal Server Error // The server encountered an unexpected error
503 Service Unavailable // The server cannot handle the client's request at present, but may recover after some time
eg: HTTP/1.1 200 OK (CRLF)
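A status line can be taken apart the same way as a request line, with one difference: the reason phrase may itself contain spaces, so only the first two spaces are treated as separators. The sketch below is illustrative, and parse_status_line is a made-up name.

```python
def parse_status_line(line):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF'."""
    if not line.endswith("\r\n"):
        raise ValueError("status line must end with CRLF")
    fields = line[:-2].split(" ", 2)  # the reason phrase may contain spaces
    if len(fields) != 3:
        raise ValueError("expected version, status code and reason phrase")
    version, code, reason = fields
    if len(code) != 3 or not code.isdigit():
        raise ValueError("Status-Code must be three digits")
    return version, int(code), reason

print(parse_status_line("HTTP/1.1 200 OK\r\n"))
print(parse_status_line("HTTP/1.1 500 Internal Server Error\r\n"))
```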
2. Response headers are described later.
3. The response body is the content of the resource returned by the server.
4. HTTP protocol in detail: message headers
HTTP messages consist of requests from client to server and responses from server to client. Both request and response messages consist of a start line (the request line for a request message, the status line for a response message), message headers (optional), a blank line (a line containing only CRLF), and a message body (optional).
HTTP message headers include general headers, request headers, response headers and entity headers.
Each header field consists of name + ":" + space + value. Header field names are case-insensitive.
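The header format lends itself to a simple parser. The Python sketch below stores each field under its lower-cased name, which is one common way of honouring case-insensitivity. It is a simplification: the name parse_headers is hypothetical, and a repeated field simply overwrites the earlier value.

```python
def parse_headers(block):
    """Parse CRLF-separated 'Name: value' lines into a dict keyed by lower-case name."""
    headers = {}
    for line in block.split("\r\n"):
        if not line:
            break  # the blank line ends the header section
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        # Simplification: a repeated field overwrites the earlier value.
        headers[name.strip().lower()] = value.strip()
    return headers

block = "Host: www.guet.edu.cn\r\nCONNECTION: Keep-Alive\r\n\r\n"
headers = parse_headers(block)
print(headers["host"], headers["connection"])
```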
1. General headers: a few header fields apply to all request and response messages. They apply only to the message being transmitted, not to the entity being transferred.
Cache-Control specifies caching directives. Cache directives are unidirectional (a directive that appears in a response need not appear in the request) and independent (a directive in one message does not affect how caching is handled for another message). The similar header field used by HTTP/1.0 is Pragma.
Request cache directives include: no-cache (indicates that a cached copy must not be used to satisfy the request without revalidating it with the origin server), no-store, max-age, max-stale, min-fresh, only-if-cached;
Response cache directives include: public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, max-age, s-maxage.
eg: To instruct the IE browser (client) not to cache a page, the server-side JSP program can be written as follows: response.setHeader("Cache-Control", "no-cache");
// response.setHeader("Pragma", "no-cache"); has an equivalent effect, and the two lines are usually used together. This code sets the following general header field in the response message: Cache-Control: no-cache
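A Cache-Control value is a comma-separated list of directives, some of which carry an argument (for example max-age=3600). The following Python sketch splits such a value into a dictionary. The name parse_cache_control is made up for this example, and quoted arguments that themselves contain commas are not handled.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        # Directive names are compared in lower case; bare directives map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives

print(parse_cache_control("no-cache"))
print(parse_cache_control("private, max-age=3600, must-revalidate"))
```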
The Date general header field gives the date and time at which the message was created.
The Connection general header field allows options to be specified for the connection. For example, it can specify that the connection is persistent, or specify the "close" option to tell the server to close the connection once the response is complete.
2. Request headers: request headers allow the client to pass the server additional information about the request and about the client itself.
Commonly used request headers
Accept: the Accept request header field specifies which types of information the client accepts.
eg: Accept: image/gif indicates that the client wishes to receive resources in GIF image format; Accept: text/html indicates that the client wishes to receive HTML text.
The Accept-Charset request header field specifies the character sets the client accepts.
eg: Accept-Charset: iso-8859-1, gb2312. If this field is not set in the request message, the default is that any character set is acceptable.
The Accept-Encoding request header field is similar to Accept, but it specifies the acceptable content encodings.
eg: Accept-Encoding: gzip, deflate. If this field is not set in the request message, the server assumes that the client accepts all content encodings.
The Accept-Language request header field is similar to Accept, but it specifies a natural language. eg: Accept-Language: zh-cn. If this header field is not set in the request message, the server assumes that the client accepts all languages.
The Authorization request header field is mainly used to prove that the client has the right to view a resource. When a browser visits a page and receives a 401 (Unauthorized) response code from the server, it can send a request containing the Authorization request header field to ask the server to authenticate it.
Host (this header field is required when sending a request)
The Host request header field specifies the Internet host and port number of the requested resource. It is usually extracted from the HTTP URL.
eg: We type the following in the browser: http://www.guet.edu.cn/index.html
The request message sent by the browser will contain the Host request header field, as follows:
Host: www.guet.edu.cn
Here the default port number 80 is used. If a port number is specified, the field becomes: Host: www.guet.edu.cn:specified port number
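Deriving the Host field from a URL can be sketched as follows. The function name host_header is illustrative, and leaving out the port when it is 80 is a choice made to mirror the example above (a client may also send the default port explicitly).

```python
from urllib.parse import urlsplit

def host_header(url):
    """Build the Host request header field from an http URL."""
    parts = urlsplit(url)
    if parts.port is None or parts.port == 80:
        return f"Host: {parts.hostname}"  # default port: host name only
    return f"Host: {parts.hostname}:{parts.port}"

print(host_header("http://www.guet.edu.cn/index.html"))
print(host_header("http://www.guet.edu.cn:8080/index.html"))
```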
When we visit Internet forums, we often see a welcome message that lists the name and version of our operating system and browser. Many people find this remarkable, but in fact the server application simply reads this information from the User-Agent request header field. The User-Agent request header field allows the client to tell the server about its operating system, browser and other attributes. This header field is not required, however: if we write our own browser and do not send the User-Agent request header field, the server cannot learn this information about us.
For example, a set of request headers:
GET /form.html HTTP/1.1 (CRLF)
Accept: image/gif, image/x-xbitmap, image/jpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */* (CRLF)
Accept-Language: zh-cn (CRLF)
Accept-Encoding: gzip, deflate (CRLF)
If-Modified-Since: Fri, 05 Jan 2007 11:21:25 GMT (CRLF)
If-None-Match: W/"80b1a4c018f3c41:8317" (CRLF)
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) (CRLF)
Host: www.guet.edu.cn (CRLF)
Connection: Keep-Alive (CRLF)
3. Response headers
Response headers allow the server to pass additional response information that cannot be placed in the status line, as well as information about the server and about further access to the resource identified by the Request-URI.
Common response headers
The Location response header field redirects the recipient to a new location. It is often used when a domain name has been changed.
The Server response header field contains information about the software the server uses to handle the request. It corresponds to the User-Agent request header field.
The WWW-Authenticate response header field must be included in a 401 (Unauthorized) response message. When the client receives a 401 response and then sends a request with the Authorization header field asking the server to verify it, the server's response contains this header field.
eg: WWW-Authenticate: Basic realm="Basic Auth Test!" // This shows that the server uses the basic authentication scheme for the requested resource.
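The exchange between WWW-Authenticate and Authorization can be illustrated for the Basic scheme, in which the client sends the Base64 encoding of "user:password". In the Python sketch below, the function names and the sample credentials are invented for illustration, and the challenge parser only understands a single realm parameter.

```python
import base64

def parse_challenge(value):
    """Return (scheme, realm) from a WWW-Authenticate value with one realm parameter."""
    scheme, _, rest = value.strip().partition(" ")
    realm = None
    key, sep, val = rest.strip().partition("=")
    if sep and key.strip().lower() == "realm":
        realm = val.strip().strip('"')
    return scheme, realm

def basic_authorization(user, password):
    """Build the Authorization header field for the Basic scheme."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"

print(parse_challenge('Basic realm="Basic Auth Test!"'))
print(basic_authorization("jeffrey", "1234"))  # hypothetical credentials
```

Base64 is an encoding, not encryption, so the Basic scheme on its own does not keep the password secret from anyone who can read the request.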
4. Entity headers: both request and response messages can carry an entity. An entity consists of entity header fields and an entity body, but this does not mean the two must be sent together: the entity header fields can be sent on their own. Entity headers define meta-information about the entity body (eg: whether there is an entity body) and about the resource identified by the request.
Common entity headers
The Content-Encoding entity header field is used as a modifier to the media type. Its value indicates which additional content encodings have been applied to the entity body, and therefore which decoding mechanisms must be applied to obtain the media type referenced by the Content-Type header field. Content-Encoding is mainly used to record the compression method applied to a document.
eg: Content-Encoding: gzip
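The relationship between Content-Encoding and the entity body can be shown with Python's standard gzip module: the receiver must undo the content encoding before the body can be read as the media type named by Content-Type. The function name decode_entity_body and the lower-case header dictionary are assumptions of this sketch, which supports only gzip and the unencoded case.

```python
import gzip

def decode_entity_body(headers, body):
    """Undo the content encoding named by Content-Encoding (gzip only in this sketch)."""
    encoding = headers.get("content-encoding", "identity").lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "identity":
        return body  # no content encoding was applied
    raise ValueError(f"unsupported content encoding: {encoding}")

original = b"<html><body>hello</body></html>"
headers = {"content-encoding": "gzip", "content-type": "text/html"}
decoded = decode_entity_body(headers, gzip.compress(original))
print(decoded == original)
```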
The Content-Language entity header field describes the natural language used by the resource. If this field is not set, the entity content is assumed to be intended for readers of all languages.
eg: Content-Language: da
The Content-Length entity header field specifies the length of the entity body, expressed as a decimal number of bytes.
The Content-Type entity header field specifies the media type of the entity body sent to the recipient. eg:
Content-Type: text/html; charset=ISO-8859-1
Content-Type: text/html; charset=GB2312
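A recipient typically splits the Content-Type value into the media type and its parameters, then uses the charset parameter to decode the body bytes into text. A minimal Python sketch follows; parse_content_type is a hypothetical name and only the charset parameter is extracted.

```python
def parse_content_type(value):
    """Return (media_type, charset or None) from a Content-Type value."""
    media_type, *params = [part.strip() for part in value.split(";")]
    charset = None
    for param in params:
        name, sep, val = param.partition("=")
        if sep and name.strip().lower() == "charset":
            charset = val.strip().strip('"')
    return media_type.lower(), charset

media_type, charset = parse_content_type("text/html; charset=ISO-8859-1")
body = "caf\u00e9".encode("iso-8859-1")  # sample body bytes in that charset
print(media_type, charset, body.decode(charset))
```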
The Last-Modified entity header field indicates the date and time at which the resource was last modified.
The Expires entity header field gives the date and time after which the response expires. To make a proxy server or browser refresh the pages in its cache after a period of time (when a previously visited page is visited again it is loaded directly from the cache, which shortens response time and reduces server load), we can use the Expires entity header field to specify when the page expires.
eg: Expires: Fri, 15 Sep 2006 16:23:12 GMT
HTTP/1.1 clients and caches must treat other invalid date formats (including 0) as already expired. eg: To make the browser not cache a page, we can also set the Expires entity header field to 0. In JSP the code is as follows: response.setDateHeader("Expires", 0);
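The rule that an invalid Expires value counts as already expired can be sketched with the date helpers in Python's standard email.utils module. The function name is_expired and the fixed sample clock are assumptions of this example. A real cache would also take Cache-Control into account.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def is_expired(expires_value, now):
    """Treat an unparseable Expires value (such as "0") as already expired."""
    try:
        expires = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        return True  # invalid date format: consider the response expired
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)
    return expires <= now

now = datetime(2006, 9, 15, 12, 0, 0, tzinfo=timezone.utc)  # sample clock
print(is_expired("Fri, 15 Sep 2006 16:23:12 GMT", now))
print(is_expired("0", now))
print(format_datetime(now, usegmt=True))  # the same date format, generated
```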
1. Basics:
High-level protocols include the File Transfer Protocol (FTP), the e-mail transfer protocol SMTP, the Domain Name System (DNS), the Network News Transfer Protocol (NNTP), HTTP, and others.
There are three kinds of intermediary: proxy, gateway, and tunnel.
A proxy accepts requests in absolute-URI form, rewrites all or part of the message, and forwards the reformatted request to the server identified by the URI. A gateway is a receiving agent that acts as a layer above some other server and, if necessary, translates the request into the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages. Tunnels are often used when the communication must pass through an intermediary (such as a firewall), even when the intermediary cannot understand the contents of the messages.
Proxy: an intermediary program that acts as both a server and a client in order to make requests on behalf of other clients. Requests are serviced internally or passed on, possibly after translation, to other servers. A proxy must interpret a request message and, if necessary, rewrite it before forwarding it. Proxies are often used as client-side portals through network firewalls, and as helper applications that handle requests via protocols the user agent does not implement.
Gateway: a server that acts as an intermediary for some other server. Unlike a proxy, a gateway receives requests as if it were the origin server for the requested resource; the requesting client may not be aware that it is communicating with a gateway.
Gateways are often used as server-side portals through network firewalls, and as protocol translators that give access to resources stored on non-HTTP systems.
Tunnel: an intermediary program that acts as a relay between two connections. Once active, a tunnel is not considered a party to the HTTP communication, although it may have been initiated by an HTTP request. The tunnel ceases to exist when both ends of the relayed connection are closed. Tunnels are often used when a portal is necessary and the intermediary cannot, or should not, interpret the relayed communication.
2. Advantages of protocol analysis: HTTP analyzers detect network attacks. Handling high-level protocols in a modular way will be the future direction of intrusion detection.
The common ports for HTTP and its proxies, 80, 3128 and 8080, are specified in the network section using the port tag.
3. Denial of service through the HTTP Content-Length limit vulnerability: when the POST method is used, Content-Length can be set to define the length of the data to be sent, for example Content-Length: 999999999. Until the transfer completes, the memory is not released. An attacker can exploit this flaw by continuously sending junk data to the web server until its memory is exhausted. This kind of attack leaves almost no trace.
4. Denial-of-service attacks that exploit characteristics of the HTTP protocol: the server is kept busy handling TCP connection requests forged by the attacker and has no time to attend to normal client requests (after all, normal client requests make up only a very small proportion). From the point of view of a normal client, the server has stopped responding. This situation is called a SYN flood attack on the server.
Smurf, TearDrop and similar attacks use ICMP packets for flooding and IP fragmentation. This article describes a denial-of-service attack produced by "normal connections".
Port 19 was used in early Chargen attacks, namely Chargen_Denial_of_Service. The method used there, however, was to create a UDP connection between two Chargen servers so that the servers were overwhelmed with data and went down. To bring down a web server, two conditions must hold: 1. the Chargen service is available; 2. the HTTP service is available. Method: the attacker forges the source IP and sends connection requests (Connect) to N Chargen servers; on receiving the connection, each Chargen server returns a character stream of 72 bytes per second (in practice faster, depending on network conditions) to the server.
5. HTTP fingerprinting
The principle of HTTP fingerprinting is broadly the same as that of other fingerprinting: record the small differences in how different HTTP servers implement the protocol and use them for identification. HTTP fingerprinting is more complicated than TCP/IP stack fingerprinting, because customizing an HTTP server's configuration file or adding plug-ins or components makes it very easy to change the HTTP response information, which makes identification harder. Customizing the behavior of a TCP/IP stack, by contrast, requires modifying the kernel, so the stack is easy to identify.
Making the server return a different Banner is very simple. With an open-source HTTP server such as Apache, the user can modify the Banner information in the source code and restart the HTTP service for the change to take effect. For HTTP servers whose source is not public, such as Microsoft IIS or Netscape, the Banner can be changed in the DLL file that stores it; related articles discuss this, so it is not repeated here. The effect of such a change is, of course, quite good. Another way of obscuring the Banner information is to use a plug-in.
Common test requests:
1: HEAD / HTTP/1.0 sends a basic HTTP request
2: DELETE / HTTP/1.0 sends a request that is normally not allowed, such as a DELETE request
3: GET / HTTP/3.0 sends a request with an illegal HTTP protocol version
4: GET / JUNK/1.0 sends a request with an incorrectly specified protocol
One HTTP fingerprinting tool is Httprint. By combining statistical techniques with fuzzy logic, it can determine the type of an HTTP server very effectively. It can be used to collect and analyze the signatures produced by different HTTP servers.
6. Other: to improve performance for the user, modern browsers also support concurrent access. While browsing a web page they open several connections at the same time so that the many icons on the page are fetched quickly, and the transfer of the whole page completes sooner.
HTTP/1.1 provides this persistent connection mode, and the next-generation HTTP protocol, HTTP-NG, adds session control, richer content negotiation and other features to provide more efficient connections.