root/i-scream/web/www/cms/features.xhtml

Comparing web/www/cms/features.xhtml (file contents):
Revision 1.4 by tdb, Tue Mar 23 20:22:31 2004 UTC vs.
Revision 1.6 by tdb, Thu Mar 25 16:13:24 2004 UTC

# Line 1 | Line 1
1   <!--#include virtual="/doctype.inc" -->
2 <
3 < <head>
4 < <title>CMS Features</title>
2 >  <head>
3 >    <title>
4 >      CMS Features
5 >    </title>
6   <!--#include virtual="/style.inc" -->
7 < </head>
8 <
9 < <body>
10 <
10 < <div id="container">
11 <
12 < <div id="main">
13 <
7 >  </head>
8 >  <body>
9 >    <div id="container">
10 >      <div id="main">
11   <!--#include virtual="/header.inc" -->
12 <
13 < <div id="contents">
14 <
15 <  <h1 class="top">CMS Features</h1>
16 <
17 <  <h2>Problem Specification</h2>
18 <
19 <       <h3>Original Problem</h3>
20 <
21 <       <p>
22 <        This is the original specification given to us when we
23 <        started the project. The i-scream central monitoring
24 <        system meets this specification, and aims to extend it
25 <        further. This is, however, where it all began.
26 <       </p>
27 <      
28 <       <h3>Centralised Machine Monitoring</h3>
29 <
30 <       <p>
31 <        The Computer Science department has a number of different machines
32 <        running a variety of different operating systems. One of the tasks
33 <        of the systems administrators is to make sure that the machines
34 <        don't run out of resources. This involves watching processor loads,
35 <        available disk space, swap space, etc.
36 <       </p>
37 <      
38 <       <p>
39 <        It isn't practicle to monitor a large number of machines by logging
40 <        on and running commands such as 'uptime' on the unix machines, or
41 <        by using performance monitor for NT servers. Thus this project is
42 <        to write monitoring software for each platform supported which
43 <        reports resource usage back to one centralized location. System
44 <        Administrators would then be able to monitor all machines from this
45 <        centralised location.
46 <       </p>
47 <
48 <       <p>
49 <        Once this basic functionality is implemented it could usefully be
50 <        expanded to include logging of resource usage to identify longterm
51 <        trends/problems, alerter services which can directly contact
52 <        sysadmins (or even the general public) to bring attention to problem
53 <        areas. Ideally it should be possible to run multiple instances of
54 <        the reporting tool (with all instances being updated in realtime)
55 <        and to to be able to run the reporting tool as both as stand alone
56 <        application and embeded in a web page.
57 <       </p>
58 <
59 <       <p>
60 <        This project will require you to write code for the unix and Win32
61 <        APIs using C and knowledge of how the underlying operating systems
62 <        manage resources. It will also require some network/distributed
63 <        systems code and a GUI front end for the reporting tool. It is
64 <        important for students undertaking this project to understand the
65 <        importance of writing efficient and small code as the end product
66 <        will really be most useful when machines start run out of processing
67 <        power/memory/disk.
68 <       </p>
69 <
70 <       <p>
71 <        John Cinnamond (email jc) whose idea this is, will provide technical
72 <        support for the project.
73 <       </p>
74 <
75 <  <h2>Features</h2>
76 <
77 <       <h3>Key Features of The System</h3>
78 <      
79 <       <ul>
80 <        <li>A centrally stored, dynamically reloaded, system wide configuration system</li>
81 <        <li>A totally extendable monitoring system, nothing except the Host (which
82 <          generates the data) and the Clients (which view it) know any details about
83 <          the data being sent, allowing data to be modified without changes to the
84 <          server architecture.</li>
85 <        <li>Central server and reporting tools all Java based for multi-platform portability</li>
86 <        <li>Distribution of core server components over CORBA to allow appropriate components
87 <          to run independently and to allow new components to be written to conform with the
88 <          CORBA interfaces.</li>
89 <        <li>Use of CORBA to create a hierarchical set of data entry points to the system
90 <          allowing the system to handle event storms and remote office locations.</li>
91 <        <li>One location for all system messages, despite being distributed.</li>
92 <        <li>XML data protocol used to make data processing and analysing easily extendable</li>
93 <        <li>A stateless server which can be moved and restarted at will, while Hosts,
94 <          Clients, and reporting tools are unaffected and simply reconnect when the
95 <          server is available again.</li>
96 <        <li>Simple and open end protocols to allow easy extension and platform porting of Hosts
97 <          and Clients.</li>
98 <        <li>Self monitoring, as all data queues within the system can be monitored and raise
99 <          alerts to warn of event storms and impending failures (should any occur).</li>
100 <        <li>A variety of web based information displays based on Java/SQL reporting and
101 <          PHP on-the-fly page generation to show the latest alerts and data</li>
102 <        <li>Large overhead monitor Helpdesk style displays for latest Alerting information</li>
103 <       </ul>
104 <      
105 <       <h3>An Overview of the i-scream Central Monitoring System</h3>
106 <
107 <       <p>
108 <        The i-scream system monitors status and performance information
109 <        obtained from machines feeding data into it and then displays
110 <        this information in a variety of ways.
111 <       </p>
112 <      
113 <       <p>
114 <        This data is obtained through the running of small applications
115 <        on the reporting machines.  These applications are known as
116 <        "Hosts".  The i-scream system provides a range of hosts which are
117 <        designed to be small and lightweight in their configuration and
118 <        operation.  See the website and appropriate documentation to
119 <        locate currently available Host applications.  These hosts are
120 <        simply told where to contact the server at which point they are
121 <        totally autonomous.  They are able to obtain configuration from
122 <        the server, detect changes in their configuration, send data
123 <        packets (via UDP) containing monitoring information, and send
124 <        so called "Heartbeat" packets (via TCP) periodically to indicate
125 <        to the server that they are still alive.
126 <       </p>
127 <      
128 <       <p>
129 <        It is then fed into the i-scream server.  The server then splits
130 <        the data two ways.  First it places the data in a database system,
131 <        typically MySQL based, for later extraction and processing by the
132 <        i-scream report generation tools.  It then passes it onto to
133 <        real-time "Clients" which handle the data as it enters the system.
134 <        The system itself has an internal real-time client called the "Local
135 <        Client" which has a series of Monitors running which can analyse the
136 <        data.  One of these Monitors also feeds the data off to a file
137 <        repository, which is updated as new data comes in for each machine,
138 <        this data is then read and displayed by the i-scream web services
139 <        to provide a web interface to the data.  The system also allows TCP
140 <        connections by non-local clients (such as the i-scream supplied
141 <        Conient), these applications provide a real-time view of the data
142 <        as it flows through the system.
143 <       </p>
144 <      
145 <       <p>
146 <        The final section of the system links the Local Client Monitors to
147 <        an alerting system.  These Monitors can be configured to detect
148 <        changes in the data past threshold levels.  When a threshold is
149 <        breached an alert is raised.  This alert is then escalated as the
150 <        alert persists through four live levels, NOTICE, WARNING, CAUTION
151 <        and CRITICAL.  The alerting system keeps an eye on the level and
152 <        when a certain level is reached, certain alerting mechanisms fire
153 <        through whatever medium they are configured to send.
154 <       </p>
155 < </div>
156 <
12 >        <div id="contents">
13 >          <h1 class="top">
14 >            CMS Features
15 >          </h1>
16 >          <h2>
17 >            Problem Specification
18 >          </h2>
19 >          <h3>
20 >            Original Problem
21 >          </h3>
22 >          <p>
23 >            This is the original specification given to us when we
24 >            started the project. The i-scream central monitoring system
25 >            meets this specification, and aims to extend it further.
26 >            This is, however, where it all began.
27 >          </p>
28 >          <h3>
29 >            Centralised Machine Monitoring
30 >          </h3>
31 >          <p>
32 >            The Computer Science department has a number of different
33 >            machines running a variety of different operating systems.
34 >            One of the tasks of the systems administrators is to make
35 >            sure that the machines don't run out of resources. This
36 >            involves watching processor loads, available disk space,
37 >            swap space, etc.
38 >          </p>
39 >          <p>
40 >            It isn't practicle to monitor a large number of machines by
41 >            logging on and running commands such as 'uptime' on the
42 >            unix machines, or by using performance monitor for NT
43 >            servers. Thus this project is to write monitoring software
44 >            for each platform supported which reports resource usage
45 >            back to one centralised location. System Administrators
46 >            would then be able to monitor all machines from this
47 >            centralised location.
48 >          </p>
49 >          <p>
50 >            Once this basic functionality is implemented it could
51 >            usefully be expanded to include logging of resource usage
52 >            to identify longterm trends/problems, alerter services
53 >            which can directly contact sysadmins (or even the general
54 >            public) to bring attention to problem areas. Ideally it
55 >            should be possible to run multiple instances of the
56 >            reporting tool (with all instances being updated in
57 >            realtime) and to to be able to run the reporting tool as
58 >            both as stand alone application and embeded in a web page.
59 >          </p>
60 >          <p>
61 >            This project will require you to write code for the unix
62 >            and Win32 APIs using C and knowledge of how the underlying
63 >            operating systems manage resources. It will also require
64 >            some network/distributed systems code and a GUI front end
65 >            for the reporting tool. It is important for students
66 >            undertaking this project to understand the importance of
67 >            writing efficient and small code as the end product will
68 >            really be most useful when machines start run out of
69 >            processing power/memory/disk.
70 >          </p>
71 >          <p>
72 >            John Cinnamond (email jc) whose idea this is, will provide
73 >            technical support for the project.
74 >          </p>
75 >          <h2>
76 >            Features
77 >          </h2>
78 >          <h3>
79 >            Key Features of The System
80 >          </h3>
81 >          <ul>
82 >            <li>A centrally stored, dynamically reloaded, system wide
83 >            configuration system
84 >            </li>
85 >            <li>A totally extendable monitoring system, nothing except
86 >            the Host (which generates the data) and the Clients (which
87 >            view it) know any details about the data being sent,
88 >            allowing data to be modified without changes to the server
89 >            architecture.
90 >            </li>
91 >            <li>Central server and reporting tools all Java based for
92 >            multi-platform portability
93 >            </li>
94 >            <li>Distribution of core server components over CORBA to
95 >            allow appropriate components to run independently and to
96 >            allow new components to be written to conform with the
97 >            CORBA interfaces.
98 >            </li>
99 >            <li>Use of CORBA to create a hierarchical set of data entry
100 >            points to the system allowing the system to handle event
101 >            storms and remote office locations.
102 >            </li>
103 >            <li>One location for all system messages, despite being
104 >            distributed.
105 >            </li>
106 >            <li>XML data protocol used to make data processing and
107 >            analysing easily extendable
108 >            </li>
109 >            <li>A stateless server which can be moved and restarted at
110 >            will, while Hosts, Clients, and reporting tools are
111 >            unaffected and simply reconnect when the server is
112 >            available again.
113 >            </li>
114 >            <li>Simple and open end protocols to allow easy extension
115 >            and platform porting of Hosts and Clients.
116 >            </li>
117 >            <li>Self monitoring, as all data queues within the system
118 >            can be monitored and raise alerts to warn of event storms
119 >            and impending failures (should any occur).
120 >            </li>
121 >            <li>A variety of web based information displays based on
122 >            Java/SQL reporting and PHP on-the-fly page generation to
123 >            show the latest alerts and data
124 >            </li>
125 >            <li>Large overhead monitor Helpdesk style displays for
126 >            latest Alerting information
127 >            </li>
128 >          </ul>
129 >          <h3>
130 >            An Overview of the i-scream Central Monitoring System
131 >          </h3>
132 >          <p>
133 >            The i-scream system monitors status and performance
134 >            information obtained from machines feeding data into it and
135 >            then displays this information in a variety of ways.
136 >          </p>
137 >          <p>
138 >            This data is obtained through the running of small
139 >            applications on the reporting machines. These applications
140 >            are known as "Hosts". The i-scream system provides a range
141 >            of hosts which are designed to be small and lightweight in
142 >            their configuration and operation. See the website and
143 >            appropriate documentation to locate currently available
144 >            Host applications. These hosts are simply told where to
145 >            contact the server at which point they are totally
146 >            autonomous. They are able to obtain configuration from the
147 >            server, detect changes in their configuration, send data
148 >            packets (via UDP) containing monitoring information, and
149 >            send so called "Heartbeat" packets (via TCP) periodically
150 >            to indicate to the server that they are still alive.
151 >          </p>
152 >          <p>
153 >            It is then fed into the i-scream server. The server then
154 >            splits the data two ways. First it places the data in a
155 >            database system, typically MySQL based, for later
156 >            extraction and processing by the i-scream report generation
157 >            tools. It then passes it onto to real-time "Clients" which
158 >            handle the data as it enters the system. The system itself
159 >            has an internal real-time client called the "Local Client"
160 >            which has a series of Monitors running which can analyse
161 >            the data. One of these Monitors also feeds the data off to
162 >            a file repository, which is updated as new data comes in
163 >            for each machine, this data is then read and displayed by
164 >            the i-scream web services to provide a web interface to the
165 >            data. The system also allows TCP connections by non-local
166 >            clients (such as the i-scream supplied Conient), these
167 >            applications provide a real-time view of the data as it
168 >            flows through the system.
169 >          </p>
170 >          <p>
171 >            The final section of the system links the Local Client
172 >            Monitors to an alerting system. These Monitors can be
173 >            configured to detect changes in the data past threshold
174 >            levels. When a threshold is breached an alert is raised.
175 >            This alert is then escalated as the alert persists through
176 >            four live levels, NOTICE, WARNING, CAUTION and CRITICAL.
177 >            The alerting system keeps an eye on the level and when a
178 >            certain level is reached, certain alerting mechanisms fire
179 >            through whatever medium they are configured to send.
180 >          </p>
181 >        </div>
182   <!--#include virtual="/footer.inc" -->
183 <
162 < </div>
163 <
183 >      </div>
184   <!--#include virtual="/menu.inc" -->
185 <
186 < </div>
167 <
168 < </body>
185 >    </div>
186 >  </body>
187   </html>

Diff Legend

(no marker) Unchanged lines, present in both revisions
< Lines as they appear in revision 1.4
> Lines as they appear in revision 1.6