Understanding HTTP plug-in failover in a clustered environment

Problem (Abstract)

After setting up the HTTP plug-in for load balancing in a clustered IBM® WebSphere® environment, the HTTP plug-in is not performing failover in a timely manner, or at all, when a cluster member becomes unavailable.

Cause

In most cases, this behavior is observed because of a misunderstanding of how HTTP plug-in failover works, or because of an improper configuration. The type of Web server being used (multi-threaded versus single-threaded) can also affect this behavior.

Resolving the problem

This document is designed to help you understand how HTTP plug-in failover works. It also provides tuning parameters and suggestions that improve the ability of the HTTP plug-in to fail over effectively and in a timely manner.

Note: The following information is written specifically for the IBM HTTP Server. However, in general it also applies to other Web servers that currently support the HTTP plug-in (for example: IIS, SunOne, Domino®, and so on).

Failover Background

In clustered IBM WebSphere Application Server environments, the HTTP plug-in can provide failover when it is no longer able to send requests to a particular cluster member. By default, there are several conditions under which the HTTP plug-in marks a particular cluster member down and fails over client requests to another cluster member that is still able to receive connections. They are listed as follows:

  • The HTTP plug-in is unable to establish a connection to a cluster member's Application Server transport.
  • The HTTP plug-in detects a newly connected socket that was prematurely closed by a cluster member during an active read or write.

There are several configurable settings in the plugin-cfg.xml file that can be tuned to affect how quickly the HTTP plug-in marks a cluster member down and fails over to another cluster member.

ConnectTimeout

The ConnectTimeout attribute of a Server element enables the HTTP plug-in to perform non-blocking connections with a backend cluster member. Non-blocking connections are beneficial when the HTTP plug-in is unable to contact the destination to determine whether the port is available or unavailable for a particular cluster member.

    <Server CloneID="10k66djk2" ConnectTimeout="10" ExtendedHandshake="false" LoadBalanceWeight="1000" MaxConnections="0" Name="Server1_WebSphere_Appserver" WaitForContinue="false">
        <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
    </Server>

If no ConnectTimeout value is specified, the HTTP plug-in performs a blocking connect: it waits until an operating system TCP timeout occurs (as long as 2 minutes, depending on the platform) before it can mark the cluster member unavailable. A value of 0 also causes the HTTP plug-in to perform a blocking connect. A value greater than 0 specifies the number of seconds the HTTP plug-in waits for a successful connection. If a connection does not occur within that interval, the HTTP plug-in marks the cluster member unavailable and fails over to one of the other cluster members defined in the cluster.

Caution: In an environment with a busy workload or a slow network connection, setting this value too low could make the HTTP plug-in mark a cluster member down falsely. Choose the ConnectTimeout value with care.

ServerIOTimeout

The ServerIOTimeout attribute of a Server element enables the HTTP plug-in to set a timeout value, in seconds, for sending requests to and reading responses from a cluster member. If no value is set for the ServerIOTimeout attribute, the HTTP plug-in, by default, uses blocked I/O to write requests to and read responses from the cluster member until the TCP connection times out.
For example, if you specify:

    <Server CloneID="10k66djk2" ServerIOTimeout="120" ConnectTimeout="10" ExtendedHandshake="false" LoadBalanceWeight="1000" MaxConnections="0" Name="Server1_WebSphere_Appserver" WaitForContinue="false">
        <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
    </Server>

In this case, if a cluster member stops responding to requests, the HTTP plug-in waits 120 seconds (2 minutes) before timing out the TCP connection. Setting the ServerIOTimeout attribute to a reasonable value enables the HTTP plug-in to time out the connection sooner and transfer requests to another cluster member when possible. When selecting a value for this attribute, remember that a cluster member might sometimes take a couple of minutes to process a request. Setting the value of the ServerIOTimeout attribute too low could cause the HTTP plug-in to send a false server error response to the client.

ServerIOTimeout is ideal for situations where Keep-Alive connections exist between the WebSphere Application Server and the HTTP plug-in, and the Application Server machine is abruptly disconnected from the network. Without ServerIOTimeout, the HTTP plug-in would take a long time to detect that the connection was closed abruptly on the WebSphere Application Server machine. This is illustrated as follows:

When an application host machine is shut down abruptly, the Keep-Alive connections between the HTTP plug-in and the Application Server might not get closed completely. As a result, when the HTTP plug-in needs to route a request to the host machine, it uses an existing Keep-Alive connection if there is one in the pool. When the plug-in sends the request over such a connection, the HTTP plug-in machine does not receive any TCP packets to close the connection, because the host machine was taken down abruptly.
The write of the request would not return a failure until the connection timed out at the TCP level. The HTTP plug-in would then try to contact the same application server by establishing a new connection, and the connect() call would fail only after the TCP timeout. As a result, depending on the operating system TCP timeout setting, it could take a considerable amount of time for the HTTP plug-in to detect the application server status and mark it down before failing over to another application server. If many requests were sent to the server during this time, this delay would apply to every request.

Note: To avoid the preceding behavior, the ServerIOTimeout attribute was introduced with APAR PQ96015 and included in WebSphere Application Server V5.0.2.10 and 5.1.1.4.

Caution: When both ConnectTimeout and ServerIOTimeout are specified, it could take as long as (ConnectTimeout + ServerIOTimeout) for the HTTP plug-in to detect a failed server and mark it down.

RetryInterval

An integer specifying the length of time that should elapse from the time that a server is marked down to the time that the HTTP plug-in retries a connection. The default is 60 seconds. This setting is specified in the ServerCluster element. An example of this in the plugin-cfg.xml file is as follows:

    <ServerCluster CloneSeparatorChange="false" LoadBalance="Round Robin" Name="Server_WebSphere_Cluster" PostSizeLimit="10000000" RemoveSpecialHeaders="true" RetryInterval="120">

This means that if a cluster member is marked down, the HTTP plug-in does not retry it for 120 seconds. There is no way to recommend one specific value; the value chosen depends on your environment. For example, if you have numerous cluster members, and one cluster member being unavailable does not affect the performance of your application, then you can safely set the value to a very high number.
Alternatively, if your optimum load has been calculated assuming that all cluster members are available, or if you do not have very many cluster members, then you will want your cluster members to be retried more often to maintain the load. Also take into consideration the time it takes to restart your server. If a server takes a long time to boot up and load applications, then you will need a longer retry interval.

PrimaryServers versus BackupServers

The HTTP plug-in can be configured for true failover by using the PrimaryServers and BackupServers elements in the plugin-cfg.xml configuration file. In the following example, the plug-in load balances only between the two servers defined in the PrimaryServers element, Server1_WebSphere_Appserver and Server2_WebSphere_Appserver. However, if both Server1_WebSphere_Appserver and Server2_WebSphere_Appserver become unavailable and are marked down, the HTTP plug-in fails over and starts sending requests to Server3_WebSphere_Appserver, which is defined in the BackupServers element.

    <ServerCluster CloneSeparatorChange="false" LoadBalance="Round Robin" Name="Server_WebSphere_Cluster" PostSizeLimit="10000000" RemoveSpecialHeaders="true" RetryInterval="120">
        <Server CloneID="10k66djk2" ServerIOTimeout="120" ConnectTimeout="10" ExtendedHandshake="false" LoadBalanceWeight="1000" MaxConnections="0" Name="Server1_WebSphere_Appserver" WaitForContinue="false">
            <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
        </Server>
        <Server CloneID="10k67eta9" ServerIOTimeout="120" ConnectTimeout="10" ExtendedHandshake="false" LoadBalanceWeight="999" MaxConnections="0" Name="Server2_WebSphere_Appserver" WaitForContinue="false">
            <Transport Hostname="server2.domain.com" Port="9091" Protocol="http"/>
        </Server>
        <Server CloneID="10k68xtw10" ServerIOTimeout="120" ConnectTimeout="10" ExtendedHandshake="false" LoadBalanceWeight="998" MaxConnections="0" Name="Server3_WebSphere_Appserver" WaitForContinue="false">
            <Transport Hostname="server3.domain.com" Port="9091" Protocol="http"/>
        </Server>
        <PrimaryServers>
            <Server Name="Server1_WebSphere_Appserver"/>
            <Server Name="Server2_WebSphere_Appserver"/>
        </PrimaryServers>
        <BackupServers>
            <Server Name="Server3_WebSphere_Appserver"/>
        </BackupServers>
    </ServerCluster>
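
The interaction of RetryInterval with PrimaryServers and BackupServers can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions, not the plug-in's real logic: the ClusterState class, its method names, and the rule that a member becomes eligible again exactly when the retry interval has elapsed are all hypothetical. Round-robin weighting among eligible members is left out. The worst_case_detection helper only restates the (ConnectTimeout + ServerIOTimeout) caution given earlier.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClusterState:
    """Hypothetical model of one ServerCluster element."""

    primaries: List[str]
    backups: List[str]
    retry_interval: float = 60.0  # seconds; 60 is the documented default
    down_since: Dict[str, float] = field(default_factory=dict)

    def mark_down(self, name: str, now: float) -> None:
        # Remember when the member was marked down.
        self.down_since[name] = now

    def mark_up(self, name: str) -> None:
        # Called after a retry succeeds (assumed behavior for this sketch).
        self.down_since.pop(name, None)

    def is_eligible(self, name: str, now: float) -> bool:
        # A member is eligible if it is up, or if it has been down
        # for at least retry_interval seconds and may be retried.
        marked = self.down_since.get(name)
        return marked is None or now - marked >= self.retry_interval

    def candidates(self, now: float) -> List[str]:
        # Backups are considered only when no primary is eligible.
        primaries = [n for n in self.primaries if self.is_eligible(n, now)]
        if primaries:
            return primaries
        return [n for n in self.backups if self.is_eligible(n, now)]


def worst_case_detection(connect_timeout: float, server_io_timeout: float) -> float:
    # Upper bound, in seconds, from the caution in the ServerIOTimeout section.
    return connect_timeout + server_io_timeout


state = ClusterState(
    primaries=["Server1_WebSphere_Appserver", "Server2_WebSphere_Appserver"],
    backups=["Server3_WebSphere_Appserver"],
    retry_interval=120,
)
state.mark_down("Server1_WebSphere_Appserver", now=0)
state.mark_down("Server2_WebSphere_Appserver", now=10)

print(state.candidates(now=60))   # both primaries are still inside the retry interval
print(state.candidates(now=120))  # the first primary may be retried again
print(worst_case_detection(10, 120))
```

With the values from the example configuration (ConnectTimeout of 10, ServerIOTimeout of 120, RetryInterval of 120), the sketch shows why a long RetryInterval keeps traffic on the backup server for longer after the primaries fail.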

Category: Tech  Date: 2010-03-17