HPCN at ENEA
PQE-1 - Operating System
The operating system is based on SUN-Solaris 2.5™ technology. Some
customizations were made to match the PQE1's characteristics.
Some features are:
Parallel File System (PFS);
Parallel I/O (PI/O);
Resource Management (scheduling and load balancing within multiuser operation);
Single System Image;
File System Availability;
Node Status Monitor;
Communication Network.
Parallel File System (PFS) and Parallel I/O (PI/O)
In the PFS, all operations on files are parallelized, from name management to
reading and writing.
The PI/O subsystem ensures that input/output is not a bottleneck for the operating
system; in particular, it makes it possible to take full advantage of the parallelism
offered by the PFS.
A process checkpointing feature integrated with the PFS and PI/O is also provided.
Single System Image
The Single System Image feature presents the hybrid MIMD-SIMD system to both the user and
the system manager as a single platform.
File System Availability
In PQE1 the filesystems are controlled by two different nodes. If one of them becomes
unavailable, the other one works for both, so the filesystems remain accessible and the
applications are not blocked (as happened on the CS2 in the past).
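
The failover behaviour described above can be illustrated with a minimal sketch. It is not the PQE1 implementation: the class name FileSystemService, the node names and the method names are hypothetical and chosen only for illustration. It models a filesystem served by two controller nodes, where a request falls back to the surviving node when the preferred one is down.

```python
class FileSystemService:
    """Hypothetical model of a filesystem controlled by two nodes."""

    def __init__(self, controllers):
        # Track whether each controller node is currently available.
        self.available = {name: True for name in controllers}

    def mark_failed(self, name):
        """Record that a controller node has become unavailable."""
        self.available[name] = False

    def serving_node(self, preferred):
        """Return the node that serves a request, falling back to a live peer."""
        if self.available.get(preferred):
            return preferred
        for name, up in self.available.items():
            if up:
                return name
        raise RuntimeError("no controller node available")


fs = FileSystemService(["node_a", "node_b"])  # node names are assumptions
before = fs.serving_node("node_a")
fs.mark_failed("node_a")
after = fs.serving_node("node_a")
```

In this sketch the request still succeeds after node_a fails, because node_b takes over its work; only the loss of both controllers raises an error.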
Node Status Monitor
The Node Status Monitor can analyze the cause of a node failure and reboot the node.
Communication Network
The CS2's interconnection network has multiple communication paths between nodes.
Point-to-point communications can work effectively even when some links have problems,
provided that these links are identified and removed from the configuration.
A problem nevertheless remained with broadcast communications, which use a common tree
specified by a top switch. If the tree fails, a new top switch must be selected, and
the top switch selection and the map of the available nodes must be distributed
to the different system kernels.
In PQE1, a network monitor automatically detects errors and extends the
intra-kernel communications to support the top-switch change.
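
The two steps involved in a top-switch change (selecting a new top switch, then giving every kernel the same selection and node map) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the switch and kernel names, and the "first healthy candidate" selection rule are all hypothetical and are not taken from the PQE1 network monitor.

```python
def select_top_switch(candidates, failed):
    """Return the first candidate top switch that is not reported as failed."""
    for switch in candidates:
        if switch not in failed:
            return switch
    raise RuntimeError("no healthy top switch left")


def distribute(kernels, top_switch, available_nodes):
    """Give every kernel its own copy of the same top switch and node map."""
    view = {"top_switch": top_switch, "nodes": sorted(available_nodes)}
    return {kernel: dict(view) for kernel in kernels}


# Hypothetical configuration: three candidate top switches, one of them failed.
new_top = select_top_switch(["sw0", "sw1", "sw2"], failed={"sw0"})
views = distribute(["kernel_a", "kernel_b"], new_top, {"n3", "n1", "n2"})
```

The point of the second step is consistency: a broadcast tree only works if all kernels agree on which switch is at the top and which nodes are reachable.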
Resource Management
The high-level scheduling subsystem partitions the processing nodes among the waiting
programs, using a set of algorithms based on the length of the program queue.
In addition, the scheduling algorithms decide which partition of the SIMD
nodes is associated with each MIMD node, on the basis of the computation each MIMD node
is carrying out. To allow a set of nodes to be shared dynamically among several
applications, the scheduling subsystem uses the checkpointing features provided by the
PFS.
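
As an illustration of partitioning that depends on the length of the program queue, the sketch below splits the nodes as evenly as possible among the waiting programs, so a shorter queue yields larger partitions. The even-split rule and the function name partition_nodes are assumptions made for this example; the source does not specify the actual PQE1 scheduling algorithms.

```python
def partition_nodes(nodes, queue):
    """Split nodes among waiting programs; a shorter queue gives larger partitions."""
    if not queue or not nodes:
        return {}
    # Assumption: at most one program per node is admitted at a time.
    admitted = queue[:len(nodes)]
    share, extra = divmod(len(nodes), len(admitted))
    partitions = {}
    start = 0
    for index, program in enumerate(admitted):
        # The first `extra` programs receive one additional node.
        size = share + (1 if index < extra else 0)
        partitions[program] = nodes[start:start + size]
        start += size
    return partitions


# Hypothetical example: 8 nodes shared by 3 waiting programs.
parts = partition_nodes(list(range(8)), ["prog_a", "prog_b", "prog_c"])
```

With eight nodes and three waiting programs, the first two programs get three nodes each and the third gets two; with a single waiting program, that program would get all eight.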
In PQE1 the Resource Manager, based on "Pandora™",
has been extended with new features such as "gang schedule" and "restart".
Parallel Debugger
The parallel debugger is based on "Totalview", which is a de facto standard for
parallel debuggers.
It has been updated to work with the new version of the operating system.
For more details, see the Documentation and the References.
