root/i-scream/web/www/cms/features.xhtml
Revision: 1.5
Committed: Tue Mar 23 23:43:26 2004 UTC (20 years, 8 months ago) by tdb
Branch: MAIN
Changes since 1.4: +181 -163 lines
Log Message:
Another biggish commit.

All pages are now XHTML 1.1 compliant. I've also tidied (with the help of
the tidy tool) all the pages, so they're neater.

There are still parts of the site that won't validate - such as the CGI
scripts, and the CVS stuff - but I'll get to them tomorrow.

File Contents

1 <!--#include virtual="/doctype.inc" -->
2 <head>
3 <title>
4 CMS Features
5 </title>
6 <!--#include virtual="/style.inc" -->
7 </head>
8 <body>
9 <div id="container">
10 <div id="main">
11 <!--#include virtual="/header.inc" -->
12 <div id="contents">
13 <h1 class="top">
14 CMS Features
15 </h1>
16 <h2>
17 Problem Specification
18 </h2>
19 <h3>
20 Original Problem
21 </h3>
22 <p>
23 This is the original specification given to us when we
24 started the project. The i-scream central monitoring system
25 meets this specification, and aims to extend it further.
26 This is, however, where it all began.
27 </p>
28 <h3>
29 Centralised Machine Monitoring
30 </h3>
31 <p>
32 The Computer Science department has a number of different
33 machines running a variety of different operating systems.
34 One of the tasks of the systems administrators is to make
35 sure that the machines don't run out of resources. This
36 involves watching processor loads, available disk space,
37 swap space, etc.
38 </p>
39 <p>
40 It isn't practical to monitor a large number of machines by
41 logging on and running commands such as 'uptime' on the
42 Unix machines, or by using Performance Monitor for NT
43 servers. This project is therefore to write monitoring
44 software for each supported platform which reports resource
45 usage back to one centralised location. System
46 administrators would then be able to monitor all machines
47 from this centralised location.
48 </p>
49 <p>
50 Once this basic functionality is implemented, it could
51 usefully be expanded to include logging of resource usage
52 to identify long-term trends/problems, and alerter services
53 which can directly contact sysadmins (or even the general
54 public) to bring attention to problem areas. Ideally it
55 should be possible to run multiple instances of the
56 reporting tool (with all instances being updated in
57 real time) and to be able to run the reporting tool
58 both as a stand-alone application and embedded in a web page.
59 </p>
60 <p>
61 This project will require you to write code for the Unix
62 and Win32 APIs using C, and knowledge of how the underlying
63 operating systems manage resources. It will also require
64 some network/distributed systems code and a GUI front end
65 for the reporting tool. It is important for students
66 undertaking this project to understand the importance of
67 writing efficient and small code as the end product will
68 really be most useful when machines start run out of
69 processing power/memory/disk.
70 </p>
71 <p>
72 John Cinnamond (email jc), whose idea this is, will provide
73 technical support for the project.
74 </p>
75 <h2>
76 Features
77 </h2>
78 <h3>
79 Key Features of The System
80 </h3>
81 <ul>
82 <li>A centrally stored, dynamically reloaded, system-wide
83 configuration system.
84 </li>
85 <li>A totally extendable monitoring system: nothing except
86 the Hosts (which generate the data) and the Clients (which
87 view it) knows any details about the data being sent,
88 allowing data to be modified without changes to the server
89 architecture.
90 </li>
91 <li>Central server and reporting tools are all Java-based
92 for multi-platform portability.
93 </li>
94 <li>Distribution of core server components over CORBA to
95 allow appropriate components to run independently and to
96 allow new components to be written to conform with the
97 CORBA interfaces.
98 </li>
99 <li>Use of CORBA to create a hierarchical set of data entry
100 points into the system, allowing it to handle event
101 storms and remote office locations.
102 </li>
103 <li>One location for all system messages, despite the
104 system being distributed.
105 </li>
106 <li>XML data protocol used to make data processing and
107 analysis easily extendable.
108 </li>
109 <li>A stateless server which can be moved and restarted at
110 will, while Hosts, Clients, and reporting tools are
111 unaffected and simply reconnect when the server is
112 available again.
113 </li>
114 <li>Simple and open protocols to allow easy extension
115 and platform porting of Hosts and Clients.
116 </li>
117 <li>Self-monitoring, as all data queues within the system
118 can be monitored and can raise alerts to warn of event
119 storms and impending failures (should any occur).
120 </li>
121 <li>A variety of web-based information displays based on
122 Java/SQL reporting and PHP on-the-fly page generation to
123 show the latest alerts and data.
124 </li>
125 <li>Large overhead-monitor, Helpdesk-style displays for
126 the latest alerting information.
127 </li>
128 </ul>
129 <h3>
130 An Overview of the i-scream Central Monitoring System
131 </h3>
132 <p>
133 The i-scream system monitors status and performance
134 information obtained from the machines feeding data into
135 it, and then displays this information in a variety of ways.
136 </p>
137 <p>
138 This data is obtained by running small applications on
139 the reporting machines. These applications are known as
140 "Hosts". The i-scream system provides a range of hosts
141 which are designed to be small and lightweight in their
142 configuration and operation. See the website and
143 appropriate documentation to locate currently available
144 Host applications. These hosts are simply told where to
145 contact the server, at which point they are totally
146 autonomous. They are able to obtain configuration from the
147 server, detect changes in their configuration, send data
148 packets (via UDP) containing monitoring information, and
149 send so-called "Heartbeat" packets (via TCP) periodically
150 to indicate to the server that they are still alive.
151 </p>
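The host behaviour described above (building a monitoring-data packet and sending it to the server over fire-and-forget UDP) can be sketched as follows. This is a minimal illustration only: the packet fields, XML layout, and port number here are invented, and the real i-scream hosts use their own XML protocol and configured ports.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class HostSketch {
    // Hypothetical payload layout; the real i-scream XML protocol differs.
    static String buildPacket(String host, double load, long freeDiskMb) {
        return "<packet machine_name=\"" + host + "\">"
             + "<load1>" + load + "</load1>"
             + "<disk_free_mb>" + freeDiskMb + "</disk_free_mb>"
             + "</packet>";
    }

    public static void main(String[] args) throws Exception {
        String payload = buildPacket("raptor", 0.42, 1024);
        byte[] data = payload.getBytes(StandardCharsets.UTF_8);
        // UDP is connectionless: the send succeeds even if no server is
        // listening, which is why hosts stay autonomous when the server
        // is down. Port 4589 is an arbitrary placeholder.
        try (DatagramSocket socket = new DatagramSocket()) {
            DatagramPacket packet = new DatagramPacket(
                data, data.length, InetAddress.getLoopbackAddress(), 4589);
            socket.send(packet);
        }
        System.out.println(payload);
    }
}
```

A real host would repeat this on a timer, alongside a periodic TCP "Heartbeat" connection to tell the server it is still alive.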
152 <p>
153 This data is then fed into the i-scream server, which
154 splits it two ways. First it places the data in a
155 database system, typically MySQL based, for later
156 extraction and processing by the i-scream report generation
157 tools. It then passes the data on to real-time "Clients"
158 which handle it as it enters the system. The system itself
159 has an internal real-time client called the "Local Client",
160 which has a series of Monitors running which can analyse
161 the data. One of these Monitors also feeds the data off to
162 a file repository, which is updated as new data comes in
163 for each machine; this data is then read and displayed by
164 the i-scream web services to provide a web interface to the
165 data. The system also allows TCP connections by non-local
166 clients (such as the i-scream supplied Conient); these
167 applications provide a real-time view of the data as it
168 flows through the system.
169 </p>
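The two-way split described above is essentially a fan-out: every incoming packet is delivered both to the database writer and to any connected real-time clients. A simplified single-process sketch (the real server distributes its components over CORBA, and the handler names here are invented):

```java
import java.util.ArrayList;
import java.util.List;

public class ServerSketch {
    interface DataHandler { void handle(String packet); }

    // Fan-out: every incoming packet goes to all registered handlers.
    static class Distributor {
        private final List<DataHandler> handlers = new ArrayList<>();
        void register(DataHandler h) { handlers.add(h); }
        void receive(String packet) {
            for (DataHandler h : handlers) h.handle(packet);
        }
    }

    static int[] demo() {
        Distributor d = new Distributor();
        List<String> dbRows = new ArrayList<>();
        List<String> clientFeed = new ArrayList<>();
        d.register(dbRows::add);      // stands in for the MySQL writer
        d.register(clientFeed::add);  // stands in for a real-time client
        d.receive("<packet machine_name=\"raptor\"/>");
        return new int[]{dbRows.size(), clientFeed.size()};
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("db=" + counts[0] + " clients=" + counts[1]);
    }
}
```

Because handlers are independent, a slow or absent client does not stop data reaching the database, which matches the stateless, reconnect-at-will behaviour the page describes.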
170 <p>
171 The final section of the system links the Local Client
172 Monitors to an alerting system. These Monitors can be
173 configured to detect changes in the data past threshold
174 levels. When a threshold is breached, an alert is raised.
175 This alert is then escalated, as it persists, through
176 four levels: NOTICE, WARNING, CAUTION and CRITICAL.
177 The alerting system tracks the level and, when a given
178 level is reached, fires the corresponding alerting
179 mechanisms through whatever medium they are configured to use.
180 </p>
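The escalation rule above (an alert climbs through the four levels the longer it persists) can be sketched as a simple mapping from persistence time to level. The level names come from the page; the time thresholds below are invented for illustration, as in the real system the escalation intervals are configurable.

```java
public class AlertEscalation {
    // The four levels named on this page, in ascending severity.
    enum Level { NOTICE, WARNING, CAUTION, CRITICAL }

    // Hypothetical thresholds: real i-scream intervals are configured,
    // not hard-coded. An alert escalates the longer it persists.
    static Level levelFor(long secondsPersisted) {
        if (secondsPersisted >= 1800) return Level.CRITICAL;
        if (secondsPersisted >= 600)  return Level.CAUTION;
        if (secondsPersisted >= 120)  return Level.WARNING;
        return Level.NOTICE;
    }

    public static void main(String[] args) {
        for (long s : new long[]{0, 200, 700, 3600}) {
            System.out.println(s + "s -> " + levelFor(s));
        }
    }
}
```

An alerting mechanism (email, web display, and so on) would then be configured to fire only once a given level is reached.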
181 </div>
182 <!--#include virtual="/footer.inc" -->
183 </div>
184 <!--#include virtual="/menu.inc" -->
185 </div>
186 </body>
187 </html>