<!--#include virtual="/doctype.inc" -->
<head>
<title>
CMS Features
</title>
<!--#include virtual="/style.inc" -->
</head>
<body>
<div id="container">
<div id="main">
<!--#include virtual="/header.inc" -->
<div id="contents">
<h1 class="top">
CMS Features
</h1>
<h2>
Problem Specification
</h2>
<h3>
Original Problem
</h3>
<p>
This is the original specification given to us when we
started the project. The i-scream central monitoring system
meets this specification, and aims to extend it further.
This is, however, where it all began.
</p>
<h3>
Centralised Machine Monitoring
</h3>
<p>
The Computer Science department has a number of machines
running a variety of operating systems. One of the tasks of
the systems administrators is to make sure that the machines
don't run out of resources. This involves watching processor
loads, available disk space, swap space, etc.
</p>
<p>
It isn't practical to monitor a large number of machines by
logging on and running commands such as 'uptime' on the
Unix machines, or by using Performance Monitor on NT
servers. This project is therefore to write monitoring
software for each supported platform which reports resource
usage back to one centralised location, from which the
systems administrators can monitor all machines.
</p>
<p>
Once this basic functionality is implemented, it could
usefully be expanded to include logging of resource usage
to identify long-term trends/problems, and alerter services
which can directly contact sysadmins (or even the general
public) to bring attention to problem areas. Ideally it
should be possible to run multiple instances of the
reporting tool (with all instances being updated in real
time), and to run the reporting tool both as a standalone
application and embedded in a web page.
</p>
<p>
This project will require you to write code for the Unix
and Win32 APIs using C, and knowledge of how the underlying
operating systems manage resources. It will also require
some network/distributed systems code and a GUI front end
for the reporting tool. It is important for students
undertaking this project to understand the importance of
writing efficient and small code, as the end product will
really be most useful when machines start to run out of
processing power/memory/disk.
</p>
<p>
John Cinnamond (email jc), whose idea this is, will provide
technical support for the project.
</p>
<h2>
Features
</h2>
<h3>
Key Features of the System
</h3>
<ul>
<li>A centrally stored, dynamically reloaded, system-wide
configuration system
</li>
<li>A totally extendable monitoring system: nothing except
the Host (which generates the data) and the Clients (which
view it) knows any details about the data being sent,
allowing the data to be modified without changes to the
server architecture.
</li>
<li>Central server and reporting tools, all Java-based for
multi-platform portability
</li>
<li>Distribution of core server components over CORBA to
allow appropriate components to run independently and to
allow new components to be written to conform to the
CORBA interfaces.
</li>
<li>Use of CORBA to create a hierarchical set of data entry
points to the system, allowing the system to handle event
storms and remote office locations.
</li>
<li>One location for all system messages, despite the
system being distributed.
</li>
<li>An XML data protocol used to make data processing and
analysis easily extendable (a hypothetical example follows
this list)
</li>
<li>A stateless server which can be moved and restarted at
will, while Hosts, Clients, and reporting tools are
unaffected and simply reconnect when the server is
available again.
</li>
<li>Simple and open protocols to allow easy extension and
porting of Hosts and Clients to other platforms.
</li>
<li>Self-monitoring: all data queues within the system
can be monitored and raise alerts to warn of event storms
and impending failures (should any occur).
</li>
<li>A variety of web-based information displays based on
Java/SQL reporting and PHP on-the-fly page generation to
show the latest alerts and data
</li>
<li>Large Helpdesk-style displays for overhead monitors,
showing the latest alerting information
</li>
</ul>
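<p>
As a purely hypothetical illustration of the sort of packet
the XML data protocol carries (the element names and values
below are invented for illustration, and are not the actual
i-scream packet format), a Host might report:
</p>
<pre>
&lt;packet machine_name="raptor" date="1014000000"&gt;
  &lt;cpu&gt;
    &lt;load&gt;0.35&lt;/load&gt;
  &lt;/cpu&gt;
  &lt;disk&gt;
    &lt;free&gt;10240&lt;/free&gt;
  &lt;/disk&gt;
&lt;/packet&gt;
</pre>
<p>
Because only the Hosts and Clients interpret this data, a
new element can be added without any change to the server.
</p>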
<h3>
An Overview of the i-scream Central Monitoring System
</h3>
<p>
The i-scream system gathers status and performance
information from the machines feeding data into it, and
then displays this information in a variety of ways.
</p>
<p>
This data is obtained by running small applications on the
reporting machines. These applications are known as
"Hosts". The i-scream system provides a range of Hosts
which are designed to be small and lightweight in their
configuration and operation. See the website and
appropriate documentation to locate currently available
Host applications. These Hosts are simply told where to
contact the server, at which point they are totally
autonomous. They are able to obtain configuration from the
server, detect changes in their configuration, send data
packets (via UDP) containing monitoring information, and
periodically send so-called "Heartbeat" packets (via TCP)
to indicate to the server that they are still alive.
</p>
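<p>
A minimal sketch of this reporting loop, written in Java
for brevity (real Hosts are built natively for each
platform, and the server name, ports, payload, and interval
below are invented; a real Host takes all of these from the
configuration it fetches from the server):
</p>
<pre>
import java.net.*;

public class HostSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical server location and ports.
        InetAddress server = InetAddress.getByName("monitor.example.org");
        int dataPort = 4589;      // UDP data packets
        int heartbeatPort = 4590; // TCP heartbeats

        DatagramSocket udp = new DatagramSocket();
        while (true) {
            // Send the latest readings as a UDP data packet.
            byte[] data = "&lt;packet&gt;...&lt;/packet&gt;".getBytes("UTF-8");
            udp.send(new DatagramPacket(data, data.length, server, dataPort));

            // Open a short TCP connection to say "still alive".
            Socket heartbeat = new Socket(server, heartbeatPort);
            heartbeat.getOutputStream().write("HEARTBEAT\n".getBytes("UTF-8"));
            heartbeat.close();

            Thread.sleep(60000); // interval would come from server config
        }
    }
}
</pre>
<p>
Sending the heartbeat over a separate TCP connection lets
the server tell a dead Host apart from lost UDP packets.
</p>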
<p>
This data is then fed into the i-scream server. The server
splits the data two ways. First, it places the data in a
database system, typically MySQL based, for later
extraction and processing by the i-scream report generation
tools. It then passes the data on to real-time "Clients"
which handle it as it enters the system. The system itself
has an internal real-time client called the "Local Client",
which runs a series of Monitors that can analyse the data.
One of these Monitors also feeds the data to a file
repository, which is updated as new data comes in for each
machine; this data is then read and displayed by the
i-scream web services to provide a web interface to the
data. The system also allows TCP connections by non-local
clients (such as the i-scream supplied Conient); these
applications provide a real-time view of the data as it
flows through the system.
</p>
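<p>
A minimal sketch of this two-way split (the DatabaseWriter
and Client types here are hypothetical stand-ins; in the
real system the components are distributed over CORBA):
</p>
<pre>
import java.util.Enumeration;
import java.util.Vector;

public class DataFanout {
    private final DatabaseWriter db = new DatabaseWriter();
    private final Vector clients = new Vector(); // connected real-time Clients

    // A new Client (e.g. a Conient connecting over TCP) registers here.
    public void addClient(Client c) { clients.addElement(c); }

    // Called for every data packet arriving from a Host.
    public void dispatch(String xmlPacket) {
        db.store(xmlPacket); // first path: the database, for reports
        // Second path: every real-time client, including the
        // internal Local Client and any connected Conients.
        for (Enumeration e = clients.elements(); e.hasMoreElements();) {
            ((Client) e.nextElement()).deliver(xmlPacket);
        }
    }
}

interface Client { void deliver(String xmlPacket); }

class DatabaseWriter {
    void store(String xmlPacket) { /* e.g. INSERT the packet into MySQL */ }
}
</pre>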
<p>
The final section of the system links the Local Client
Monitors to an alerting system. These Monitors can be
configured to detect changes in the data past threshold
levels. When a threshold is breached, an alert is raised.
This alert is then escalated, for as long as it persists,
through four levels: NOTICE, WARNING, CAUTION and CRITICAL.
The alerting system watches the level, and when a certain
level is reached the relevant alerting mechanisms fire,
sending notification through whatever medium they are
configured to use.
</p>
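<p>
A minimal sketch of this escalation logic for a single
monitored value (the threshold and the one-level-per-reading
escalation step are invented for illustration; the real
Monitors take their settings from the central
configuration):
</p>
<pre>
public class LoadMonitor {
    private static final String[] LEVELS =
        { "NOTICE", "WARNING", "CAUTION", "CRITICAL" };
    private int level = -1; // -1 means no active alert

    // Called each time a new reading arrives for a machine.
    public void check(double cpuLoad) {
        if (cpuLoad &lt; 0.9) { // hypothetical threshold
            level = -1;      // value back in range: clear the alert
            return;
        }
        if (level &lt; LEVELS.length - 1) {
            level++;         // the alert persists, so escalate one level
        }
        fireAlert(LEVELS[level], cpuLoad);
    }

    private void fireAlert(String severity, double value) {
        // The real system hands this to its configured alerting
        // mechanisms; here we simply print the alert.
        System.out.println("ALERT " + severity + ": cpu load " + value);
    }
}
</pre>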
</div>
<!--#include virtual="/footer.inc" -->
</div>
<!--#include virtual="/menu.inc" -->
</div>
</body>
</html>