Petals SE-ASE 1.0.2+

This version must be installed on Petals ESB 5.1.0+

Introduction

This implementation of the SE ASE requires Apache ActiveMQ version 5.5+.

Monitoring the Petals SE ASE at ActiveMQ level

In this version of the Petals SE ASE, monitoring mainly relies on the monitoring features of ActiveMQ itself.

The following indicators are interesting:

  • number of requests processed with fault in the persistence area: a fast increase of this value may indicate that:
    • the target service provider or its back-end is overloaded or down,
    • or that the client of the ASE service provider is performing a denial of service (DoS).
  • number of retried requests: an increase of this value may indicate that:
    • the target service provider or its back-end is overloaded or down,
    • or that the client of the ASE service provider doesn't respect the SLA.

Monitoring with basic tools

The command lines and configuration files mentioned in the following sub-chapters have been validated on Ubuntu 11.10.

JVisualVM

As ActiveMQ exposes a JMX API, it is very easy to connect JVisualVM to the ActiveMQ JVM. See http://activemq.apache.org/jmx.html.

Don't forget to first install the VisualVM-MBeans plugin into JVisualVM.
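To connect remotely, the ActiveMQ JVM must expose its JMX connector. As a sketch (assuming a standard ActiveMQ installation whose 'bin/activemq' start script honours the ACTIVEMQ_SUNJMX_START variable; check your own installation), an unauthenticated remote JMX connector on port 1099 can be enabled like this:

```shell
# Sketch: JVM options enabling a remote JMX connector on port 1099.
# ACTIVEMQ_SUNJMX_START is an assumption about the start script of a
# standard ActiveMQ install, not something provided by the SE ASE itself.
export ACTIVEMQ_SUNJMX_START="-Dcom.sun.management.jmxremote.port=1099 \
 -Dcom.sun.management.jmxremote.authenticate=false \
 -Dcom.sun.management.jmxremote.ssl=false"
```

Disabling authentication and SSL as above is only acceptable for testing on a trusted network.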

Command line tools of ActiveMQ

ActiveMQ is provided with a command-line tool to get statistics: activemq-admin.

For example, use the following command to get the number of the requests waiting to be sent to the target service provider:

activemq-admin query --objname Type=Queue,Destination=testQueue --view QueueSize | grep QueueSize
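Assuming the query output has the usual 'QueueSize = <n>' form, the numeric value can be extracted for scripting. A minimal sketch, where 'sample' stands in for the real activemq-admin pipeline above:

```shell
# Sketch: extract the numeric value from activemq-admin output.
# 'sample' stands in for:
#   activemq-admin query --objname Type=Queue,Destination=testQueue --view QueueSize | grep QueueSize
sample="QueueSize = 12"
queue_size=$(printf '%s\n' "$sample" | awk -F' = ' '/QueueSize/ {print $2}')
echo "$queue_size"
```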

Monitoring with Nagios

Several options are available to monitor ActiveMQ using Nagios:

Monitoring with ActiveMQ's JMX API

In progress

First and foremost, you must have an ActiveMQ instance with JMX correctly configured: you must be able to connect JVisualVM to ActiveMQ remotely.

'check_jmx' installation

First, install the Nagios plugin 'check_jmx' (http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/check_jmx/details).
Next, we recommend defining specific Nagios commands to interact with ActiveMQ:

  • activemq_queue_size: to get the number of pending messages in a queue,
  • activemq_queue_traffic: to get the number of messages transacted in the queue.

According to our environment defined above, create the file 'activemq.cfg' in the directory '/etc/nagios-plugins/config' with the following content:

# 'activemq_queue_size' command definition
define command{
        command_name    activemq_queue_size
        command_line    /usr/lib/nagios/plugins/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:$_HOSTJMXPORT$/jmxrmi -O org.apache.activemq:BrokerName=$ARG1$,Type=Queue,Destination=$ARG2$ -A QueueSize -w $ARG3$ -c $ARG4$
        }

# 'activemq_queue_traffic' command definition
define command{
        command_name    activemq_queue_traffic
        command_line    /usr/lib/nagios/plugins/check_jmx -U service:jmx:rmi:///jndi/rmi://$HOSTADDRESS$:$_HOSTJMXPORT$/jmxrmi -O org.apache.activemq:BrokerName=$ARG1$,Type=Queue,Destination=$ARG2$ -A EnqueueCount -w $ARG3$ -c $ARG4$
        }
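The -w and -c arguments of check_jmx carry the warning and critical thresholds. As a sketch of the usual Nagios convention (value above warning gives WARNING, above critical gives CRITICAL; the exact boundary handling of check_jmx may differ):

```shell
# Sketch of typical Nagios threshold semantics behind -w/-c:
# value <= warn -> OK, warn < value <= crit -> WARNING, value > crit -> CRITICAL.
jmx_state() {
  value=$1; warn=$2; crit=$3
  if [ "$value" -gt "$crit" ]; then echo CRITICAL
  elif [ "$value" -gt "$warn" ]; then echo WARNING
  else echo OK
  fi
}
jmx_state 8 10 50    # -> OK
jmx_state 30 10 50   # -> WARNING
jmx_state 60 10 50   # -> CRITICAL
```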

ActiveMQ host template

A best practice for ActiveMQ nodes is to create a template 'activemq-host' that inherits from the 'jvm-host' template.

According to our environment defined above, create the file 'activemq-nagios2.cfg' in the directory '/etc/nagios3/conf.d' with the following content:

define host{
        use                             jvm-host
        name                            activemq-host   ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        failure_prediction_enabled      1               ; Failure prediction is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        check_command                   check-host-alive
        max_check_attempts              10
        notification_interval           0
        notification_period             24x7
        notification_options            d,u,r
        contact_groups                  admins
        register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!

# Specific attributes
        _jmxport                        1099            ; Listening port of the JVM JMX agent
        }

define hostextinfo{
        name             activemq-node
        notes            Petals ESB - SE ASE - ActiveMQ node
        icon_image       base/activemq.jpg
        icon_image_alt   Petals ESB/Node
        vrml_image       base/activemq.jpg
        statusmap_image  base/activemq.jpg
        }

Defining your ActiveMQ host

For the ActiveMQ node of your Petals ESB topology, create an instance of the template 'activemq-host'.

According to our environment defined above, create the file 'activemq-host-node1.cfg' in the directory '/etc/nagios3/conf.d' with the following content:

define host{
        use                     activemq-host            ; Name of host template to use
        host_name               activemq-node
        alias                   Petals ESB - SE ASE - ActiveMQ node
        address                 127.0.0.1
        _jmxport                1099                     ; This value should be set with the JMX
                                                         ; agent listener port of your ActiveMQ node.
        }

Adding your ActiveMQ host to the Petals ESB host group

According to our environment defined above, update the file 'petals-esb-hostgroup.cfg' in the directory '/etc/nagios3/conf.d' to add the member 'activemq-node':

define hostgroup {
        hostgroup_name   petals-esb
        alias            Petals ESB
        members          petals-esb-node-1, petals-esb-node-2, activemq-node
        }

ActiveMQ host services

According to our environment defined above, create the file 'activemq-services.cfg' in the directory '/etc/nagios3/conf.d' with the following content:

# Define a service to check the queue size of an ActiveMQ queue used by the SE ASE
define service{
       host_name                       activemq-node
       service_description             se-ase-queue-size
       check_command                   activemq_queue_size!localhost!testQueue!10!50
       use                             generic-service
     }

# Define a service to check the traffic of an ActiveMQ queue used by the SE ASE
define service{
       host_name                       activemq-node
       service_description             se-ase-traffic
       check_command                   activemq_queue_traffic!localhost!testQueue!500!1000
       use                             generic-service
     }

Monitoring with Cacti

A solution based on an article by R.I. Pienaar.

Monitoring with Munin

An ActiveMQ plugin for Munin exists: http://munin-activemq.sourceforge.net. It is very easy to install on a Debian-based system using the Debian package. Don't forget to install Munin first.
The downloaded package can be installed with the following command:

sudo dpkg -i munin-java-activemq-plugins_0.0.4_i386.deb

Pre-requisites

The ActiveMQ plugin for Munin requires a remote JMX connection to the ActiveMQ server, so you need to configure ActiveMQ to enable the JMX connector:

<beans ... >
  <broker xmlns="http://activemq.apache.org/schema/core" ... >
    ...
    <managementContext>
      <managementContext createConnector="true"/>
    </managementContext>
    ...
  </broker>
  ...
</beans>

Configuration

Edit the file /etc/munin/plugin-conf.d/activemq_ to add the queues to monitor in the parameter env.DESTINATIONS of the section [activemq_*]:

[activemq_*]
## The hostname to connect to.
## Default: localhost
#env.JMX_HOST localhost

## The port where the JMX server is listening
## Default: 1099
#env.JMX_PORT 1099

## The username required to authenticate to the JMX server.
## When enabling JMX for a plain ActiveMQ install, no authentication is needed.
## The default username for JMX run by ServiceMix is 'smx'
## Default:
#env.JMX_USER smx

## The password required to authenticate to the JMX server.
## The default password for JMX run by ServiceMix is 'smx'
## Default:
#env.JMX_PASS smx

## Space separated list of destinations to create graphs for.
## Default:
env.DESTINATIONS Queue:foo Queue:bar

## You can override certain configuration variables for specific plugins
#[activemq_traffic]
#env.DESTINATIONS Topic:MyTopic Queue:foo

Integrating Munin with Nagios using Nagios active checks

This chapter is based on information available here

Installation of the Nagios plugin for Munin

On your Nagios host:

  1. Download the Perl script check_munin_rrd.pl into the Nagios plugins directory (under Ubuntu: /usr/lib/nagios/plugins),
  2. Check that the file owner and permissions are the same as for the other plugins (root, and 755). Fix them if needed.

Nagios commands definition to interact with a Munin agent

A specific Nagios command to interact with the Munin agent must be defined on your Nagios host:

  1. create the file munin.cfg in the directory /etc/nagios-plugins/config (path for Ubuntu; adapt the directory name to your operating system),
  2. check that the file owner and permissions are the same as for the other files (root, and 644). Fix them if needed,
  3. edit the previous file with the following content:
    define command{
         command_name check_munin
         command_line /usr/lib/nagios/plugins/check_munin_rrd.pl -H $HOSTALIAS$ -M $ARG1$ -w $ARG2$ -c $ARG3$
         }
    

Nagios template service to interact with a Munin agent

A specific service template to interact with the Munin agent must be defined on your Nagios host:

  1. create the file generic-munin-service.cfg in the directory /etc/nagios3/conf.d (path for Ubuntu; adapt the directory name to your operating system),
  2. check that the file owner and permissions are the same as for the other files (root, and 644). Fix them if needed,
  3. edit the previous file with the following content:
    define service{
           name                            generic-munin-service ; The 'name' of this service template
           active_checks_enabled           1       ; Active service checks are enabled
           passive_checks_enabled          0       ; Passive service checks are disabled
           parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
           obsess_over_service             1       ; We should obsess over this service (if necessary)
           check_freshness                 0       ; Default is to NOT check service 'freshness'
           notifications_enabled           1       ; Service notifications are enabled
           event_handler_enabled           1       ; Service event handler is enabled
           flap_detection_enabled          1       ; Flap detection is enabled 
           failure_prediction_enabled      1       ; Failure prediction is enabled
           process_perf_data               1       ; Process performance data
           retain_status_information       1       ; Retain status information across program restarts
           retain_nonstatus_information    1       ; Retain non-status information across program restarts
           notification_interval           0       ; Only send notifications on status change by default.
           is_volatile                     0
           check_period                    24x7
           normal_check_interval           5       ; This directive is used to define the number of "time units" to wait before scheduling the next "regular" check of the service.
           retry_check_interval            3       ; This directive is used to define the number of "time units" to wait before scheduling a re-check of the service.
           max_check_attempts              2       ; This directive is used to define the number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.
           notification_period             24x7
           notification_options            w,u,c,r
           contact_groups                  admins
           register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
           }
    

Define ActiveMQ check as service of a Petals node

See Monitoring Petals ESB with Nagios to configure Nagios to monitor Petals ESB

In most use cases, the ActiveMQ server is co-located with the Petals ESB node running the SE ASE. So, it is good practice to define the ActiveMQ checks as services of the Petals node running the SE ASE:

  1. edit the Nagios configuration file of your Petals ESB node (for example: /etc/nagios3/conf.d/petals-esb-host-node1.cfg, following the monitoring Petals ESB sample),
  2. and add the following content:
    # Define a service to check the queue size of an ActiveMQ queue used by the SE ASE
    define service{
           host_name                       petals-esb-node-1
           service_description             se-ase-queue-size
           check_command                   check_munin!activemq_size!10!50
           use                             generic-munin-service
         }
    
    # Define a service to check the traffic of an ActiveMQ queue used by the SE ASE
    define service{
           host_name                       petals-esb-node-1
           service_description             se-ase-queue-traffic
           check_command                   check_munin!activemq_traffic!500!1000
           use                             generic-munin-service
         }
    

In our example:

  • under nominal conditions, we should not have more than 10 pending messages; over 50 pending messages, an error is raised,
  • and according to our volumetric estimations, we should not have more than 500 messages per 5 minutes; we accept up to twice our estimation: 1000 messages per 5 minutes.
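The traffic thresholds used above can be derived mechanically from the volumetric estimation, as this sketch shows (the '2x' factor is our acceptance margin from the example):

```shell
# Sketch: deriving the Nagios check arguments from the volumetric estimation.
estimated_per_period=500            # expected messages per 5-minute check period
warn=$estimated_per_period          # warn as soon as the estimation is exceeded
crit=$((estimated_per_period * 2))  # critical at twice the estimation
echo "check_munin!activemq_traffic!${warn}!${crit}"
```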

Next, restart Nagios, then start Petals ESB and ActiveMQ, and go to the Nagios console.

If the network domain is misconfigured, the message "I can't guess your domain, please add the domain manually" can appear on the services associated with the queue size and the queue traffic. In that case, update:

  • the command check_munin to force the domain name, example:
    define command{
         command_name check_munin
         command_line /usr/lib/nagios/plugins/check_munin_rrd.pl -H localhost.localdomain -d localdomain -M $ARG1$ -w $ARG2$ -c $ARG3$
         }
    

Screenshots

Nagios screenshots

Munin screenshots

Queue size sample

Traffic sample


Monitoring the component internals

Using metrics

Several probes providing metrics are included in the component, and are available through the JMX MBean 'org.ow2.petals:type=custom,name=monitoring_<component-id>', where <component-id> is the unique JBI identifier of the component.
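The MBean name pattern above can be assembled like this ('petals-se-ase' is a hypothetical component identifier; replace it with the JBI identifier of your own component):

```shell
# Sketch: assembling the monitoring MBean name from a component identifier.
component_id="petals-se-ase"   # hypothetical JBI component identifier
mbean="org.ow2.petals:type=custom,name=monitoring_${component_id}"
echo "$mbean"
```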

Common metrics

The following metrics are provided through the Petals CDK, and are common to all components:

Each metric is exposed as an MBean attribute. For each one, we give its description, the detail of its value, and whether it is configurable:

MessageExchangeAcceptorThreadPoolMaxSize
  • description: the maximum number of threads of the message exchange acceptor thread pool,
  • value: integer value, since the last startup of the component,
  • configurable: yes, through acceptor-pool-size.

MessageExchangeAcceptorThreadPoolCurrentSize
  • description: the current number of threads of the message exchange acceptor thread pool; should always be equal to MessageExchangeAcceptorThreadPoolMaxSize,
  • value: instant integer value,
  • configurable: no.

MessageExchangeAcceptorCurrentWorking
  • description: the current number of working message exchange acceptors,
  • value: instant long value,
  • configurable: no.

MessageExchangeAcceptorMaxWorking
  • description: the maximum number of working message exchange acceptors,
  • value: long value, since the last startup of the component,
  • configurable: no.

MessageExchangeAcceptorAbsoluteDurations
  • description: the aggregated durations of the working message exchange acceptors since the last startup of the component,
  • value: n-tuple containing, in nanoseconds, the maximum, average and minimum durations,
  • configurable: no.

MessageExchangeAcceptorRelativeDurations
  • description: the aggregated durations of the working message exchange acceptors on the last sample,
  • value: n-tuple containing, in nanoseconds, the maximum, average and minimum durations, and the 10-, 50- and 90-percentile durations (N% of the durations are lower than the N-percentile value),
  • configurable: no.

MessageExchangeProcessorAbsoluteDurations
  • description: the aggregated durations of the working message exchange processors since the last startup of the component,
  • value: n-tuple containing, in milliseconds, the maximum, average and minimum durations,
  • configurable: no.

MessageExchangeProcessorRelativeDurations
  • description: the aggregated durations of the working message exchange processors on the last sample,
  • value: n-tuple containing, in milliseconds, the maximum, average and minimum durations, and the 10-, 50- and 90-percentile durations (N% of the durations are lower than the N-percentile value),
  • configurable: no.

MessageExchangeProcessorThreadPoolActiveThreadsCurrent
  • description: the current number of active threads of the message exchange processor thread pool,
  • value: instant integer value,
  • configurable: no.

MessageExchangeProcessorThreadPoolActiveThreadsMax
  • description: the maximum number of threads of the message exchange processor thread pool that were active,
  • value: integer value, since the last startup of the component,
  • configurable: no.

MessageExchangeProcessorThreadPoolIdleThreadsCurrent
  • description: the current number of idle threads of the message exchange processor thread pool,
  • value: instant integer value,
  • configurable: no.

MessageExchangeProcessorThreadPoolIdleThreadsMax
  • description: the maximum number of threads of the message exchange processor thread pool that were idle,
  • value: integer value, since the last startup of the component,
  • configurable: no.

MessageExchangeProcessorThreadPoolMaxSize
  • description: the maximum size, in threads, of the message exchange processor thread pool,
  • value: instant integer value,
  • configurable: yes, through http-thread-pool-size-max.

MessageExchangeProcessorThreadPoolMinSize
  • description: the minimum size, in threads, of the message exchange processor thread pool,
  • value: instant integer value,
  • configurable: yes, through http-thread-pool-size-min.

MessageExchangeProcessorThreadPoolQueuedRequestsCurrent
  • description: the current number of enqueued requests waiting to be processed by the message exchange processor thread pool,
  • value: instant integer value,
  • configurable: no.

MessageExchangeProcessorThreadPoolQueuedRequestsMax
  • description: the maximum number of enqueued requests waiting to be processed by the message exchange processor thread pool,
  • value: integer value, since the last startup of the component,
  • configurable: no.

ServiceProviderInvocations
  • description: the number of service provider invocations, grouped by interface name (as QName) of the invoked service provider, service name (as QName) of the invoked service provider, invoked operation (as QName), message exchange pattern, and execution status (PENDING, ERROR, FAULT, SUCCEEDED),
  • value: integer counter value, since the last startup of the component,
  • configurable: no.

ServiceProviderInvocationsResponseTimeAbs
  • description: the aggregated response times of the service provider invocations since the last startup of the component, grouped as for ServiceProviderInvocations,
  • value: n-tuple containing, in milliseconds, the maximum, average and minimum response times,
  • configurable: no.

ServiceProviderInvocationsResponseTimeRel
  • description: the aggregated response times of the service provider invocations on the last sample, grouped as for ServiceProviderInvocations,
  • value: n-tuple containing, in milliseconds, the maximum, average and minimum response times, and the 10-, 50- and 90-percentile response times (N% of the response times are lower than the N-percentile value),
  • configurable: no.

Dedicated metrics

No dedicated metric is available.

Receiving alerts

Several alerts are notified by the component through JMX notifications of the MBean 'org.ow2.petals:type=custom,name=monitoring_<component-id>', where <component-id> is the unique JBI identifier of the component.

To integrate these alerts with Nagios, see Receiving Petals ESB defects in Nagios.

Common alerts

Defect: a message exchange acceptor thread is dead
  • JMX notification type: org.ow2.petals.component.framework.process.message.acceptor.pool.thread.dead,
  • no user data.

Defect: no more thread is available in the message exchange acceptor thread pool
  • JMX notification type: org.ow2.petals.component.framework.process.message.acceptor.pool.exhausted,
  • no user data.

Defect: no more thread is available to run a message exchange processor
  • JMX notification type: org.ow2.petals.component.framework.process.message.processor.thread.pool.exhausted,
  • no user data.

Dedicated alerts

No dedicated alert is available.
