CMS Features

Problem Specification

Original Problem

This is the original specification given to us when we started the project. The i-scream central monitoring system meets this specification, and aims to extend it further. This is, however, where it all began.

Centralised Machine Monitoring

The Computer Science department has a number of different machines running a variety of different operating systems. One of the tasks of the systems administrators is to make sure that the machines don't run out of resources. This involves watching processor loads, available disk space, swap space, etc.

It isn't practical to monitor a large number of machines by logging on and running commands such as 'uptime' on the Unix machines, or by using Performance Monitor for NT servers. Thus this project is to write monitoring software for each supported platform which reports resource usage back to one centralised location. System administrators would then be able to monitor all machines from this centralised location.

Once this basic functionality is implemented it could usefully be expanded to include logging of resource usage to identify long-term trends and problems, and alerter services which can directly contact sysadmins (or even the general public) to bring attention to problem areas. Ideally it should be possible to run multiple instances of the reporting tool (with all instances being updated in real time) and to be able to run the reporting tool both as a stand-alone application and embedded in a web page.

This project will require you to write code for the Unix and Win32 APIs using C, and knowledge of how the underlying operating systems manage resources. It will also require some network/distributed systems code and a GUI front end for the reporting tool. It is important for students undertaking this project to understand the importance of writing efficient and small code, as the end product will be most useful when machines start to run out of processing power, memory or disk.

John Cinnamond (email jc), whose idea this is, will provide technical support for the project.

Features

Key Features of The System

A centrally stored, dynamically reloaded, system-wide configuration system.
A totally extendable monitoring system: nothing except the Hosts (which generate the data) and the Clients (which view it) know any details about the data being sent, allowing the data to be modified without changes to the server architecture.
Central server and reporting tools all Java based for multi-platform portability.
Distribution of core server components over CORBA, allowing appropriate components to run independently and new components to be written to conform with the CORBA interfaces.
Use of CORBA to create a hierarchical set of data entry points to the system, allowing the system to handle event storms and remote office locations.
One location for all system messages, despite the system being distributed.
XML data protocol used to make data processing and analysis easily extendable.
A stateless server which can be moved and restarted at will, while Hosts, Clients and reporting tools are unaffected and simply reconnect when the server is available again.
Simple and open end protocols to allow easy extension and platform porting of Hosts and Clients.
Self monitoring: all data queues within the system can be monitored and can raise alerts to warn of event storms and impending failures (should any occur).
A variety of web-based information displays based on Java/SQL reporting and PHP on-the-fly page generation to show the latest alerts and data.
Large overhead-monitor, Helpdesk-style displays for the latest alerting information.
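
As a rough illustration of the point that only Hosts and Clients need to understand the data, the sketch below builds an XML packet on the Host side, routes it on the server side by reading a single attribute, and interprets it on the Client side. The element and attribute names (packet, machine_name, cpu, disk) and the function names are hypothetical and are not taken from the i-scream protocol. Python is used purely for illustration.

```python
import xml.etree.ElementTree as ET


def build_packet(machine_name, readings):
    """Host side: wrap arbitrary readings in an XML packet (hypothetical layout)."""
    root = ET.Element("packet", {"machine_name": machine_name})
    for name, value in readings.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def route_packet(xml_text):
    """Server side: read only the routing attribute and pass the text on untouched."""
    root = ET.fromstring(xml_text)
    return root.get("machine_name"), xml_text


def read_packet(xml_text):
    """Client side: interpret whatever readings the Host chose to send."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```

Under these assumptions, a Host could start sending an extra reading (say, swap) and only the Clients that care about it would need to change, since route_packet never looks at the child elements.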

An Overview of the i-scream Central Monitoring System

The i-scream system monitors status and performance information obtained from machines feeding data into it and then displays this information in a variety of ways.

This data is obtained by running small applications on the reporting machines. These applications are known as "Hosts". The i-scream system provides a range of Hosts which are designed to be small and lightweight in their configuration and operation. See the website and appropriate documentation to locate currently available Host applications. These Hosts are simply told where to contact the server, at which point they are totally autonomous. They are able to obtain configuration from the server, detect changes in their configuration, send data packets (via UDP) containing monitoring information, and send so-called "Heartbeat" packets (via TCP) periodically to indicate to the server that they are still alive.
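
The split between UDP data packets and TCP heartbeats can be sketched as follows. This is a minimal illustration and not the i-scream Host implementation: the function names, the heartbeat message text and the interval handling are all assumptions, and Python is used only for brevity.

```python
import socket


def open_udp_socket():
    """Create the datagram socket a Host would reuse for its data packets."""
    return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


def send_data(sock, server_addr, payload):
    """Fire-and-forget UDP send: no connection and no delivery guarantee."""
    sock.sendto(payload, server_addr)


def send_heartbeat(server_addr, machine_name, timeout=5.0):
    """Open a short TCP connection so the server can tell the Host is alive.

    The message text is hypothetical; an unreachable server raises OSError.
    """
    with socket.create_connection(server_addr, timeout=timeout) as conn:
        conn.sendall(f"HEARTBEAT {machine_name}\n".encode("utf-8"))


def heartbeat_due(last_sent, now, interval):
    """Return True once `interval` seconds have passed since the last heartbeat."""
    return now - last_sent >= interval
```

One reading of this design is that a lost UDP data packet costs only a single sample, whereas the heartbeat uses TCP so that a failed connection is noticed on the Host and a missing heartbeat is noticed on the server.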

The data is then fed into the i-scream server. The server then splits the data two ways. First, it places the data in a database system, typically MySQL based, for later extraction and processing by the i-scream report generation tools. Second, it passes the data on to real-time "Clients" which handle the data as it enters the system. The system itself has an internal real-time client called the "Local Client", which runs a series of Monitors that can analyse the data. One of these Monitors also feeds the data off to a file repository, which is updated as new data comes in for each machine; this data is then read and displayed by the i-scream web services to provide a web interface to the data. The system also allows TCP connections by non-local clients (such as the i-scream supplied Conient); these applications provide a real-time view of the data as it flows through the system.
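
The two-way split described above can be sketched as a small fan-out step. This is an illustrative sketch under stated assumptions, not the server's actual code: the Distributor class, its method names and the use of in-memory queues in place of CORBA or TCP links are all hypothetical.

```python
import queue


class Distributor:
    """Pass each incoming packet to a store and to every real-time client."""

    def __init__(self, store):
        self._store = store   # callable standing in for the database insert
        self._clients = []    # one queue per real-time client

    def add_client(self):
        """Register a client and return the queue it should read from."""
        client_queue = queue.Queue()
        self._clients.append(client_queue)
        return client_queue

    def handle(self, packet):
        """Persist the packet first, then hand it to each registered client."""
        self._store(packet)
        for client_queue in self._clients:
            client_queue.put(packet)
```

In this sketch the Local Client and any non-local clients would each hold one queue from add_client, so a slow client delays only its own queue and not the database write.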

The final section of the system links the Local Client Monitors to an alerting system. These Monitors can be configured to detect changes in the data past threshold levels. When a threshold is breached an alert is raised. As the alert persists, it is escalated through four live levels: NOTICE, WARNING, CAUTION and CRITICAL. The alerting system keeps an eye on the level and, when a certain level is reached, the alerting mechanisms configured for that level fire through whatever medium they are configured to use.
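
A threshold Monitor with escalation might look roughly like the sketch below. The four live level names come from the description above. The resting "OK" level, the single escalate_after delay, the class name and its update method are assumptions made for illustration, and the real Monitors may escalate on different rules.

```python
# "OK" is a hypothetical resting state; the other four are the live levels.
LEVELS = ["OK", "NOTICE", "WARNING", "CAUTION", "CRITICAL"]


class ThresholdMonitor:
    """Raise an alert when a value exceeds a threshold and escalate it over time."""

    def __init__(self, threshold, escalate_after):
        self.threshold = threshold
        self.escalate_after = escalate_after  # seconds spent at a level before escalating (assumed)
        self.level = 0
        self._since = None                    # time at which the current level was entered

    def update(self, value, now):
        """Feed one reading taken at time `now` and return the resulting level name."""
        if value <= self.threshold:
            # Back under the threshold: clear the alert.
            self.level, self._since = 0, None
        elif self.level == 0:
            # Threshold newly breached: raise the alert at the first live level.
            self.level, self._since = 1, now
        elif self.level < len(LEVELS) - 1 and now - self._since >= self.escalate_after:
            # Alert has persisted long enough: move up one level.
            self.level, self._since = self.level + 1, now
        return LEVELS[self.level]
```

An alerting mechanism could then compare the returned level name against the level it is configured to fire at.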
