Introduction
This implementation of the SE ASE requires Apache ActiveMQ version 5.5 or later.
Monitoring the Petals SE ASE at ActiveMQ level
The following indicators are of interest:
Monitoring with basic tools
The command lines and configuration files mentioned in the following sub-chapters apply to Ubuntu 11.10.
JVisualVM
As ActiveMQ exposes a JMX API, it is very easy to connect JVisualVM to ActiveMQ's JVM. See http://activemq.apache.org/jmx.html.
Don't forget to install the VisualVM-MBeans plugin into JVisualVM first.
Command line tools of ActiveMQ
ActiveMQ is provided with a command-line tool to get statistics: activemq-admin.
For example, use the following command to get the number of requests waiting to be sent to the target service provider:
activemq-admin query --objname Type=Queue,Destination=testQueue --view QueueSize | grep QueueSize
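The raw output of this query can be turned into a Nagios-style status with a small wrapper. The following is a minimal sketch, not part of the component; the function name is hypothetical, and the queue name and thresholds (10 warning / 50 critical) mirror the examples used later in this chapter:

```shell
#!/bin/sh
# Hypothetical wrapper: map a queue size to a Nagios-style status and exit code.
evaluate_queue_size() {
    size=$1; warn=$2; crit=$3
    if [ "$size" -ge "$crit" ]; then
        echo "CRITICAL - QueueSize=$size"; return 2
    elif [ "$size" -ge "$warn" ]; then
        echo "WARNING - QueueSize=$size"; return 1
    else
        echo "OK - QueueSize=$size"; return 0
    fi
}

# In a real check, the size would come from activemq-admin, e.g.:
#   size=$(activemq-admin query --objname Type=Queue,Destination=testQueue \
#            --view QueueSize | grep QueueSize | cut -d '=' -f 2 | tr -d ' ')
size=3
evaluate_queue_size "$size" 10 50
```

The exit codes 0/1/2 follow the Nagios plugin convention (OK/WARNING/CRITICAL), so such a wrapper could serve as a basis for the command-line integration mentioned in the Nagios section below.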
Monitoring with Nagios
Several options are available to monitor ActiveMQ using Nagios:
- Integrating Munin with Nagios, sending messages from Munin to Nagios: http://munin-monitoring.org/wiki/HowToContactNagios
- Integrating Munin with Nagios, using Nagios active checks: http://exchange.nagios.org/directory/Plugins/Uncategorized/Operating-Systems/Linux/check_munin_rrd/details
- Integrating ActiveMQ with Nagios, using ActiveMQ's command-line tools: TODO
- Integrating ActiveMQ with Nagios, using ActiveMQ's JMX API: TODO
Monitoring with ActiveMQ's JMX API
This chapter is a work in progress.
First and foremost, you need an ActiveMQ instance with JMX correctly configured: you must be able to connect JVisualVM to ActiveMQ remotely.
'check_jmx' installation
First, install the Nagios plugin 'check_jmx' (http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/check_jmx/details).
Next, we recommend defining specific Nagios commands to interact with ActiveMQ:
- activemq_queue_size: gets the number of pending messages in a queue,
- activemq_queue_traffic: gets the number of messages enqueued in a queue.
According to our environment defined above, create the file 'activemq.cfg' in the directory '/etc/nagios-plugins/config' with the following content:
# 'activemq_queue_size' command definition
define command{
        command_name    activemq_queue_size
        command_line    /usr/lib/nagios/plugins/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:$_HOSTJMXPORT$/jmxrmi -O org.apache.activemq:BrokerName=$ARG1$,Type=Queue,Destination=$ARG2$ -A QueueSize -w $ARG3$ -c $ARG4$
        }

# 'activemq_queue_traffic' command definition
define command{
        command_name    activemq_queue_traffic
        command_line    /usr/lib/nagios/plugins/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:$_HOSTJMXPORT$/jmxrmi -O org.apache.activemq:BrokerName=$ARG1$,Type=Queue,Destination=$ARG2$ -A EnqueueCount -w $ARG3$ -c $ARG4$
        }
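Before wiring these commands into Nagios, it helps to see the concrete command line the macros expand to, so that check_jmx can be tested by hand. The following is a sketch with a hypothetical helper function; the host, port, broker name and queue reuse the example values of this chapter:

```shell
#!/bin/sh
# Hypothetical helper: expand the Nagios macros of the activemq_queue_size
# command into the concrete check_jmx invocation, for manual testing.
expand_activemq_queue_size() {
    host=$1; jmxport=$2; broker=$3; queue=$4; warn=$5; crit=$6
    echo "/usr/lib/nagios/plugins/check_jmx" \
         "-U service:jmx:rmi:///jndi/rmi://$host:$jmxport/jmxrmi" \
         "-O org.apache.activemq:BrokerName=$broker,Type=Queue,Destination=$queue" \
         "-A QueueSize -w $warn -c $crit"
}

# Print the command line Nagios would run for the example service below:
expand_activemq_queue_size 127.0.0.1 1099 localhost testQueue 10 50
```

Running the printed command against a live ActiveMQ broker should return an OK/WARNING/CRITICAL line; if it does not, fix the JMX connectivity before going further with Nagios.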
ActiveMQ host template
A best practice for ActiveMQ nodes is to create a template 'activemq-host' that inherits from the 'jvm-host' template.
According to our environment defined above, create the file 'activemq-nagios2.cfg' in the directory '/etc/nagios3/conf.d' with the following content:
define host{
        use                           jvm-host
        name                          activemq-host    ; The name of this host template
        notifications_enabled         1    ; Host notifications are enabled
        event_handler_enabled         1    ; Host event handler is enabled
        flap_detection_enabled        1    ; Flap detection is enabled
        failure_prediction_enabled    1    ; Failure prediction is enabled
        process_perf_data             1    ; Process performance data
        retain_status_information     1    ; Retain status information across program restarts
        retain_nonstatus_information  1    ; Retain non-status information across program restarts
        check_command                 check-host-alive
        max_check_attempts            10
        notification_interval         0
        notification_period           24x7
        notification_options          d,u,r
        contact_groups                admins
        register                      0    ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        # Specific attributes
        _jmxport                      1099 ; Listening port of the JVM JMX agent
        }

define hostextinfo{
        name             activemq-node
        notes            Petals ESB - SE ASE - ActiveMQ node
        icon_image       base/activemq.jpg
        icon_image_alt   Petals ESB/Node
        vrml_image       base/activemq.jpg
        statusmap_image  base/activemq.jpg
        }
Defining your ActiveMQ host
For the ActiveMQ node of your Petals ESB topology, create an instance of the template 'activemq-host'.
According to our environment defined above, create the file 'activemq-host-node1.cfg' in the directory '/etc/nagios3/conf.d' with the following content:
define host{
        use        activemq-host    ; Name of host template to use
        host_name  activemq-node
        alias      Petals ESB - SE ASE - ActiveMQ node
        address    127.0.0.1
        _jmxport   1099    ; This value should be set with the JMX agent listener port of your ActiveMQ node.
        }
Adding your ActiveMQ host to the Petals ESB host group
According to our environment defined above, update the file 'petals-esb-hostgroup.cfg' in the directory '/etc/nagios3/conf.d' to add the member 'activemq-node':
define hostgroup {
        hostgroup_name  petals-esb
        alias           Petals ESB
        members         petals-esb-node-1, petals-esb-node-2, activemq-node
        }
ActiveMQ host services
According to our environment defined above, create the file 'activemq-services.cfg' in the directory '/etc/nagios3/conf.d' with the following content:
# Define a service to check the queue size of an ActiveMQ queue used by the SE ASE
define service{
        host_name            activemq-node
        service_description  se-ase-queue-size
        check_command        activemq_queue_size!localhost!testQueue!10!50
        use                  generic-service
        }

# Define a service to check the traffic of an ActiveMQ queue used by the SE ASE
define service{
        host_name            activemq-node
        service_description  se-ase-traffic
        check_command        activemq_queue_traffic!localhost!testQueue!500!1000
        use                  generic-service
        }
Monitoring with Cacti
This solution is based on an article by R.I. Pienaar.
Monitoring with Munin
An ActiveMQ plugin for Munin exists: http://munin-activemq.sourceforge.net. It is very easy to install on a Debian-based system using the Debian package. Don't forget to install Munin first.
The downloaded package can be installed with the following command:
sudo dpkg -i munin-java-activemq-plugins_0.0.4_i386.deb
Pre-requisites
The ActiveMQ plugin for Munin requires a remote JMX connection to the ActiveMQ server, so you need to configure ActiveMQ to enable the JMX connector:
<beans ... >
    <broker xmlns="http://activemq.apache.org/schema/core" ... >
        ...
        <managementContext>
            <managementContext createConnector="true"/>
        </managementContext>
        ...
    </broker>
    ...
</beans>
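Depending on your setup, you may instead (or additionally) enable the standard JVM remote JMX agent for the ActiveMQ process. A sketch using the ACTIVEMQ_SUNJMX_START variable read by the ActiveMQ 5.x 'bin/activemq' start script; port 1099 matches the examples of this page, and disabling authentication and SSL, as below, is only acceptable on a trusted network:

```shell
# Sketch: enable the plain JVM JMX agent for the ActiveMQ process.
# ACTIVEMQ_SUNJMX_START is honored by the ActiveMQ 5.x start script.
# Authentication and SSL are disabled here for simplicity; do this only
# on a trusted network.
export ACTIVEMQ_SUNJMX_START="-Dcom.sun.management.jmxremote.port=1099 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false"
```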
Configuration
Edit the file /etc/munin/plugin-conf.d/activemq_ to add the queues to monitor in the env.DESTINATIONS parameter of the [activemq_*] section:
[activemq_*]
## The hostname to connect to.
## Default: localhost
#env.JMX_HOST localhost

## The port where the JMX server is listening
## Default: 1099
#env.JMX_PORT 1099

## The username required to authenticate to the JMX server.
## When enabling JMX for a plain ActiveMQ install, no authentication is needed.
## The default username for JMX run by ServiceMix is 'smx'
## Default:
#env.JMX_USER smx

## The password required to authenticate to the JMX server.
## The default password for JMX run by ServiceMix is 'smx'
## Default:
#env.JMX_PASS smx

## Space separated list of destinations to create graphs for.
## Default:
env.DESTINATIONS Queue:foo Queue:bar

## You can override certain configuration variables for specific plugins
#[activemq_traffic]
#env.DESTINATIONS Topic:MyTopic Queue:foo
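Each env.DESTINATIONS entry follows a Type:Name convention (Queue or Topic, then the destination name). The hypothetical helper below only illustrates that convention; once the plugin is configured, it can also be exercised outside the Munin scheduler with the standard munin-run utility:

```shell
#!/bin/sh
# Manual test of the configured plugin, outside the Munin scheduler
# (requires a live Munin installation, so it is only shown as a comment):
#   sudo munin-run activemq_size

# Hypothetical helper illustrating the Type:Name convention of the
# env.DESTINATIONS entries (e.g. "Queue:foo").
parse_destination() {
    entry=$1
    echo "type=${entry%%:*} name=${entry#*:}"
}

for d in Queue:foo Queue:bar; do
    parse_destination "$d"
done
```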
Integrating Munin with Nagios using Nagios active checks
This chapter is based on information available here.
Installation of the Nagios plugin for Munin
On your Nagios host:
- Download the Perl script check_munin_rrd.pl into the Nagios plugins directory (under Ubuntu: /usr/lib/nagios/plugins),
- Check that the file owner and permissions are the same as for the other plugins (root, and 755). Fix them if needed.
Nagios commands definition to interact with a Munin agent
A specific Nagios command to interact with the Munin agent must be defined on your Nagios host:
- create the file munin.cfg in the directory /etc/nagios-plugins/config (path for Ubuntu; adapt the directory name to your operating system),
- check that the file owner and permissions are the same as for the other files (root, and 644). Fix them if needed,
- edit the previous file with the following content:
define command{
        command_name  check_munin
        command_line  /usr/lib/nagios/plugins/check_munin_rrd.pl -H $HOSTALIAS$ -M $ARG1$ -w $ARG2$ -c $ARG3$
        }
Nagios template service to interact with a Munin agent
A specific service template to interact with the Munin agent must be defined on your Nagios host:
- create the file generic-munin-service.cfg in the directory /etc/nagios3/conf.d (path for Ubuntu; adapt the directory name to your operating system),
- check that the file owner and permissions are the same as for the other files (root, and 644). Fix them if needed,
- edit the previous file with the following content:
define service{
        name                          generic-munin-service    ; The 'name' of this service template
        active_checks_enabled         1    ; Active service checks are enabled
        passive_checks_enabled        0    ; Passive service checks are disabled
        parallelize_check             1    ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service           1    ; We should obsess over this service (if necessary)
        check_freshness               0    ; Default is to NOT check service 'freshness'
        notifications_enabled         1    ; Service notifications are enabled
        event_handler_enabled         1    ; Service event handler is enabled
        flap_detection_enabled        1    ; Flap detection is enabled
        failure_prediction_enabled    1    ; Failure prediction is enabled
        process_perf_data             1    ; Process performance data
        retain_status_information     1    ; Retain status information across program restarts
        retain_nonstatus_information  1    ; Retain non-status information across program restarts
        notification_interval         0    ; Only send notifications on status change by default.
        is_volatile                   0
        check_period                  24x7
        normal_check_interval         5    ; Number of "time units" to wait before scheduling the next "regular" check of the service.
        retry_check_interval          3    ; Number of "time units" to wait before scheduling a re-check of the service.
        max_check_attempts            2    ; Number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.
        notification_period           24x7
        notification_options          w,u,c,r
        contact_groups                admins
        register                      0    ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
Define ActiveMQ check as service of a Petals node
See Monitoring Petals ESB with Nagios to configure Nagios to monitor Petals ESB |
In most use cases, the ActiveMQ server is co-located with the Petals ESB node running the SE ASE. So it is good practice to define the ActiveMQ checks as services of the Petals node running the SE ASE:
- edit the Nagios configuration file of your Petals ESB node (for example: /etc/nagios3/conf.d/petals-esb-host-node1.cfg, following the monitoring Petals ESB sample),
- and add the following content:
# Define a service to check the queue size of an ActiveMQ queue used by the SE ASE
define service{
        host_name            petals-esb-node-1
        service_description  se-ase-queue-size
        check_command        check_munin!activemq_size!10!50
        use                  generic-munin-service
        }

# Define a service to check the traffic of an ActiveMQ queue used by the SE ASE
define service{
        host_name            petals-esb-node-1
        service_description  se-ase-queue-traffic
        check_command        check_munin!activemq_traffic!500!1000
        use                  generic-munin-service
        }
In our example:
- under nominal load, we should not have more than 10 pending messages; above 50 pending messages, a critical alert is raised,
- and according to our volume estimates, we should not have more than 500 messages per 5 minutes; we accept up to twice that estimate: 1000 messages per 5 minutes.
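The traffic thresholds above are rates, while the underlying EnqueueCount counter only grows. The following sketch (sample values are made up, the function is hypothetical) shows how a 5-minute rate can be derived from two counter samples, which is essentially what the RRD-based check does:

```shell
#!/bin/sh
# Sketch: derive a 5-minute message rate from two samples of the
# ever-growing EnqueueCount counter (sample values are made up).
rate_per_5min() {
    prev=$1; curr=$2; interval_s=$3
    # Scale the observed delta to a 5-minute (300 s) window.
    echo $(( (curr - prev) * 300 / interval_s ))
}

# Two samples taken 300 s apart: 12000, then 12450 enqueued messages.
rate=$(rate_per_5min 12000 12450 300)
echo "rate=$rate msgs / 5 min"    # 450: below the 500 warning threshold
```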
Next, restart Nagios, then start Petals ESB and ActiveMQ, and go to the Nagios console.
On a network domain configuration error, the message "I can't guess your domain, please add the domain manually" can appear on the services associated with the queue size and the queue traffic. In that case, update:
- the command check_munin to force the domain name, for example:
define command{
        command_name  check_munin
        command_line  /usr/lib/nagios/plugins/check_munin_rrd.pl -H localhost.localdomain -d localdomain -M $ARG1$ -w $ARG2$ -c $ARG3$
        }
Screenshots
Nagios screenshots
Munin screenshots
Queue size sample
Traffic sample
Nagios screenshots
Monitoring the component internals
Using metrics
Several probes providing metrics are included in the component, and are available through the JMX MBean 'org.ow2.petals:type=custom,name=monitoring_<component-id>', where <component-id> is the unique JBI identifier of the component.
Common metrics
The following metrics are provided through the Petals CDK, and are common to all components:
Metrics, as MBean attribute | Description | Detail of the value | Configurable
---|---|---|---
MessageExchangeAcceptorThreadPoolMaxSize | The maximum number of threads of the message exchange acceptor thread pool | integer value, since the last startup of the component | yes, through acceptor-pool-size
MessageExchangeAcceptorThreadPoolCurrentSize | The current number of threads of the message exchange acceptor thread pool. Should always be equal to MessageExchangeAcceptorThreadPoolMaxSize. | instant integer value | no
MessageExchangeAcceptorCurrentWorking | The current number of working message exchange acceptors | instant long value | no
MessageExchangeAcceptorMaxWorking | The maximum number of working message exchange acceptors | long value, since the last startup of the component | no
MessageExchangeAcceptorAbsoluteDurations | The aggregated durations of the working message exchange acceptors since the last startup of the component | n-tuple value, in nanoseconds | no
MessageExchangeAcceptorRelativeDurations | The aggregated durations of the working message exchange acceptors on the last sample | n-tuple value, in nanoseconds | no
MessageExchangeProcessorAbsoluteDurations | The aggregated durations of the working message exchange processors since the last startup of the component | n-tuple value, in milliseconds | no
MessageExchangeProcessorRelativeDurations | The aggregated durations of the working message exchange processors on the last sample | n-tuple value, in milliseconds | no
MessageExchangeProcessorThreadPoolActiveThreadsCurrent | The current number of active threads of the message exchange processor thread pool | instant integer value | no
MessageExchangeProcessorThreadPoolActiveThreadsMax | The maximum number of threads of the message exchange processor thread pool that were active | integer value, since the last startup of the component | no
MessageExchangeProcessorThreadPoolIdleThreadsCurrent | The current number of idle threads of the message exchange processor thread pool | instant integer value | no
MessageExchangeProcessorThreadPoolIdleThreadsMax | The maximum number of threads of the message exchange processor thread pool that were idle | integer value, since the last startup of the component | no
MessageExchangeProcessorThreadPoolMaxSize | The maximum size, in threads, of the message exchange processor thread pool | instant integer value | yes, through http-thread-pool-size-max
MessageExchangeProcessorThreadPoolMinSize | The minimum size, in threads, of the message exchange processor thread pool | instant integer value | yes, through http-thread-pool-size-min
MessageExchangeProcessorThreadPoolQueuedRequestsCurrent | The current number of enqueued requests waiting to be processed by the message exchange processor thread pool | instant integer value | no
MessageExchangeProcessorThreadPoolQueuedRequestsMax | The maximum number of enqueued requests waiting to be processed by the message exchange processor thread pool since the last startup of the component | integer value, since the last startup of the component | no
ServiceProviderInvocations | The number of service provider invocations, grouped by several criteria | integer counter value, since the last startup of the component | no
ServiceProviderInvocationsResponseTimeAbs | The aggregated response times of the service provider invocations since the last startup of the component, grouped by several criteria | n-tuple value, in milliseconds | no
ServiceProviderInvocationsResponseTimeRel | The aggregated response times of the service provider invocations on the last sample, grouped by several criteria | n-tuple value, in milliseconds | no
Dedicated metrics
No dedicated metric is available.
Receiving alerts
Several alerts are emitted by the component as notifications of the JMX MBean 'org.ow2.petals:type=custom,name=monitoring_<component-id>', where <component-id> is the unique JBI identifier of the component.
To integrate these alerts with Nagios, see Receiving Petals ESB defects in Nagios.
Common alerts
Defect | JMX Notification
---|---
A message exchange acceptor thread is dead |
No more thread is available in the message exchange acceptor thread pool |
No more thread is available to run a message exchange processor |
Dedicated alerts
No dedicated alert is available.