Build service dashboard using Shinken






Rohit Gupta

Developer @plivo

@rohit01

Why are we here?

  • Monitor everything: Shinken
  • Shinken Architecture
  • NRPE Plugins
  • Write your own NRPE plugin
  • Automated service checks
  • Integrate your plugins
  • API based User Interface

What is Shinken?

Shinken Architecture

NRPE Plugins

Write your own NRPE plugin!

Python NRPE Example

import requests
import sys
import time

# Nagios/Shinken plugin exit codes
ST_OK = 0  # OK
ST_WR = 1  # Warning
ST_CR = 2  # Critical
ST_UK = 3  # Unknown

url = 'http://plivo.com'
try:
    # Time the request so latency can be checked afterwards
    start_time = time.time()
    response = requests.get(url, timeout=2)
    end_time = time.time()
except Exception:
    print('Critical - Exception occurred while fetching url: %s' % url)
    sys.exit(ST_CR)

Python NRPE Example (continued)

latency = end_time - start_time
if response.status_code < 200 or response.status_code >= 300:
    # Any non-2xx response is a critical failure
    print('Critical - Response code: %s for url: %s'
          % (response.status_code, url))
    sys.exit(ST_CR)
elif latency > 1:
    # The page loaded, but took longer than 1 second
    print('Warning - latency: %s secs for url: %s' % (latency, url))
    sys.exit(ST_WR)
print('Ok - %s tested successfully' % url)
sys.exit(ST_OK)
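
One possible refinement of the example above: move the status decision into a pure function, so the thresholds can be tested without a network call. This is only a sketch; the name classify and the warn_after parameter are illustrative and not part of the original plugin.

```python
# Standard plugin exit codes (OK, Warning, Critical).
ST_OK, ST_WR, ST_CR = 0, 1, 2


def classify(status_code, latency, warn_after=1.0):
    """Map an HTTP status code and a latency in seconds to (exit_code, message)."""
    if status_code < 200 or status_code >= 300:
        return ST_CR, 'Critical - Response code: %s' % status_code
    if latency > warn_after:
        return ST_WR, 'Warning - latency: %.2f secs' % latency
    return ST_OK, 'Ok - latency: %.2f secs' % latency
```

The plugin body then shrinks to: fetch the URL, call classify(response.status_code, latency), print the message and exit with the returned code.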

Things to Consider

  • Don't reinvent the wheel!
  • Exit values
  • Setup and Teardown
  • Locks
  • Exceptions
  • Logs
  • Be verbose, but you only get a single line of output
  • Graph everything
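
Several of the points above (exit values, exceptions, single-line output, graphing) can be sketched together. The helper below is an assumption-laden illustration, not a standard library: format_result, run_check and example_check are hypothetical names, and the latency value is hard-coded. It builds one status line, appends performance data after a `|` so the value can be graphed, and maps unexpected exceptions to the Unknown exit code.

```python
# Standard Nagios/Shinken plugin exit codes.
ST_OK, ST_WR, ST_CR, ST_UK = 0, 1, 2, 3
LABELS = {ST_OK: 'OK', ST_WR: 'WARNING', ST_CR: 'CRITICAL', ST_UK: 'UNKNOWN'}


def format_result(status, message, perfdata=None):
    """Build the single status line, with optional perfdata after '|'."""
    line = '%s - %s' % (LABELS[status], message)
    if perfdata:
        # Perfdata is written as label=value pairs; sorted for stable output.
        pairs = ' '.join('%s=%s' % (k, perfdata[k]) for k in sorted(perfdata))
        line = '%s | %s' % (line, pairs)
    return line


def run_check(check):
    """Run check() and turn any unexpected exception into UNKNOWN."""
    try:
        return check()
    except Exception as exc:
        return ST_UK, 'check raised %s: %s' % (type(exc).__name__, exc), None


def example_check():
    # Hypothetical measurement; a real check would time an actual request.
    latency = 0.42
    status = ST_WR if latency > 1 else ST_OK
    return status, 'latency %.2f secs' % latency, {'latency': '%.2fs' % latency}


def main():
    status, message, perfdata = run_check(example_check)
    print(format_result(status, message, perfdata))
    return status
```

In a real plugin the script would end with sys.exit(main()), so that the monitoring daemon receives the exit value.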

Integrate your plugins

  • Configuration directory: /usr/local/shinken/etc/
  • resource.cfg
$PYTHON_VENV$=/usr/src/python_env
$PLUGINSDIR$=/usr/local/nagios/libexec
  • commands.cfg
define command{
    command_name    my_web_check
    command_line    $PYTHON_VENV$/bin/python $PLUGINSDIR$/my_web_check.py
}
  • contacts.cfg
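
A command only runs once a service references it. The fragment below is a minimal sketch of such a service definition; the host name web01, the service description, and the generic-service template are assumptions and need to match objects that exist in your own configuration.

```cfg
# Hypothetical service that schedules the my_web_check command
define service{
    use                    generic-service   ; assumed template name
    host_name              web01             ; assumed host
    service_description    Website check
    check_command          my_web_check
}
```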

Integrate your plugins (continued)

  • templates.cfg
# Notification way for critical alerts: phone call plus email, sent 24x7.
define notificationway{
       notificationway_name                 email_and_call_alerts
       service_notification_period          24x7
       host_notification_period             24x7
       service_notification_options         c
       host_notification_options            d
       host_notification_commands           host_email_alerts
       service_notification_commands        service_email_and_call_alerts
}
  • pip install pyro

Some Good Practices

  • Don't write only for the success scenario
  • Atomic checks
  • Avoid mutual dependency
  • Stats collection
  • Avoid many hops
  • Tricky for long running tests
  • Be more verbose
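
For long-running tests, one common workaround is to let a separate job (for example from cron) run the slow test and store its result, while the plugin only reads that result and checks that it is fresh. The sketch below assumes this split; write_result, read_result, the JSON layout and the 600 second max_age default are all illustrative choices, not part of Shinken.

```python
import json
import os
import tempfile
import time

ST_OK, ST_WR, ST_CR, ST_UK = 0, 1, 2, 3


def write_result(path, status, message):
    """Called by the long-running job: store its result atomically."""
    payload = {'status': status, 'message': message, 'time': time.time()}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, 'w') as handle:
        json.dump(payload, handle)
    # Replace in one step so the plugin never reads a half-written file.
    os.replace(tmp_path, path)


def read_result(path, max_age=600, now=None):
    """Called by the plugin: return (exit_code, message) without running the test."""
    now = time.time() if now is None else now
    try:
        with open(path) as handle:
            payload = json.load(handle)
        status = payload['status']
        message = payload['message']
        written = payload['time']
    except (OSError, ValueError, KeyError, TypeError):
        return ST_UK, 'Unknown - no readable result at %s' % path
    age = now - written
    if age > max_age:
        # A stale result usually means the long-running job itself is broken.
        return ST_UK, 'Unknown - result is %d secs old' % age
    return status, message
```

The plugin stays fast and atomic, and a stale file surfaces as Unknown instead of silently reporting an old success.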

API based User Interface

Questions?

Thank You


Rohit Gupta

@rohit01

Developer @plivo

Email: [email protected]

Ph: (+65) 31583923

Slide link: http://bit.ly/plivosg
