Honeywell-Bull     CII-Honeywell-Bull     Bull

Operating System GCOS-64 GCOS-7

1967-200x

 

ARCHITECTURAL CONCEPTS
 

The basic concept behind the GCOS operating system is an architecture based on threads (then named processes, a terminology borrowed from Multics). A thread is a sequence of instructions executed by a processor on a set of data. More recently, several types of threads have appeared in the technical literature: kernel threads, user threads and lightweight processes. The most common term to express parallelism is "thread"; "process" is associated with a unique address space that can be shared between threads. GCOS uses a specific address space for each thread and also relies on a large amount of sharing with other threads. For the purpose of this document, let's call "GCOS processes" "threads".

The sequence of instructions executed in a thread is gathered in a structure named a procedure. A procedure is generally derived from the output of a program.
Concepts that emerged at the time of Algol 60 emphasized a nested space of identifiers: variables are declared inside a syntactic block. Most recent programming languages use that concept, although it was absent from the first versions of COBOL and Fortran. The compilers are responsible for building the space of variables, including the handling of the block structure. They do so by allocating "automatic" variables inside a "stack frame" that is pushed or popped either by compiler-generated instructions, for internal procedures, or by the operating system, for external procedures.
External procedure calls and returns are architected through a procedure descriptor that contains pointers to the entry point(s) of the procedure, the privileges of the procedure, descriptions of the parameters, and the return address. Those procedure descriptors are handled by special instructions (PREPARE_STACK, ENTER and RETURN) implemented in firmware as part of the micro-kernel. The procedure call mechanism checks the access rights of the calling and the called procedures, to prohibit access to unauthorized data and to forbid the passing of forged procedure descriptors as return parameters. In summary, inward calls (i.e. calls to a procedure in a more privileged ring) were authorized, while outward calls were normally forbidden (a special software mechanism allowed the system to call user-supplied escape procedures).
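The ring rule summarized above can be sketched in a few lines of Python. This is only an illustration, not the firmware logic: the ProcedureDescriptor class, its fields and the enter function are hypothetical names, and the acceptance of same-ring calls is an assumption. Lower ring numbers are taken to be more privileged, as in the rest of this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcedureDescriptor:
    """Hypothetical, much simplified procedure descriptor."""
    name: str
    ring: int  # lower number = more privileged


class RingViolation(Exception):
    """Raised when a call crosses rings in a forbidden direction."""


def enter(caller_ring: int, target: ProcedureDescriptor) -> int:
    """Return the ring in which the callee runs, or raise RingViolation.

    Inward calls (towards a lower, more privileged ring) are accepted.
    Same-ring calls are accepted too (an assumption of this sketch).
    Outward calls are rejected.
    """
    if target.ring > caller_ring:
        raise RingViolation(
            f"outward call from ring {caller_ring} to ring {target.ring}"
        )
    return target.ring


def call_allowed(caller_ring: int, target: ProcedureDescriptor) -> bool:
    """Boolean form of the same check."""
    try:
        enter(caller_ring, target)
    except RingViolation:
        return False
    return True
```

With these definitions, an application in ring 3 may enter a ring 1 service, while a ring 1 procedure is refused a direct call to a ring 3 procedure. The special escape mechanism mentioned above is not modelled.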
Mapping a language such as COBOL onto this model has not always been easy. COBOL originated with the second generation of computers and had features derived from tape-oriented operating systems. While the CALL verb somewhat resembled a procedure call, the CANCEL verb had no equivalent in modern programming languages. At the price of encapsulating those verbs (not too expensive for such I/O-intensive programs), a common architecture was maintained for all the compiled languages.

The program is usually designed by the programmer as a sequence of statements executed sequentially. However, some of those statements in fact reference actions (services) to be executed on behalf of the program by the operating system. Those actions are of two kinds: synchronous services, which may be bound to the thread (e.g. GET_TIME) or be calls to services (e.g. SET_TIMER), and asynchronous services, which may cause the thread to be put in a WAIT state (because they rely on a resource with a response time much longer than a few processor cycles, or because they make the thread wait for a resource locked by another thread).

The GCOS architecture was designed to provide maximum efficiency in both cases, whatever the number of physical processors in the hardware.
The architecture implements the concept of semaphores to synchronize threads with each other AND with hardware. Semaphores are not just lock bits: they contain a count and a queue of threads. In addition, messages can be posted to a semaphore and picked up by the unblocked thread.
Synchronous services are offered by system procedures that have direct access to the application address space. No thread switching time is spent in that case.
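The semaphore described above (a count, a queue of threads and optional messages) can be modelled by a small single-threaded Python simulation. It is a toy sketch, not the GCOS implementation: the class and method names, the FIFO ordering of the queues and the way a pending message is paired with the next P operation are all assumptions made for the illustration.

```python
from collections import deque


class Semaphore:
    """Toy model: a count, a queue of waiting threads, a queue of messages."""

    def __init__(self, count: int = 0):
        self.count = count
        self.waiting = deque()   # names of blocked threads (FIFO, assumed)
        self.messages = deque()  # messages posted while nobody was waiting

    def p(self, thread: str):
        """P operation; returns (proceeds, message)."""
        if self.count > 0:
            self.count -= 1
            message = self.messages.popleft() if self.messages else None
            return True, message
        # Nothing available: the thread goes to the WAIT state.
        self.waiting.append(thread)
        return False, None

    def v(self, message=None):
        """V operation; returns (unblocked_thread, message), or None."""
        if self.waiting:
            # Hand the message directly to the first blocked thread.
            return self.waiting.popleft(), message
        self.count += 1
        if message is not None:
            self.messages.append(message)
        return None
```

A thread that issues P on an empty semaphore is queued. A later V, issued by another thread or on behalf of a device, unblocks it and delivers the message in the same step, which is what allows the same object to synchronize threads with hardware events.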

The procedures and data referenced by a thread constitute what is called the address space of the thread. The synchronous services of the operating system are thus included in this address space, together with their private data resources, which are invisible to the application programmer.

 

A segmented address space was chosen under the influence of MULTICS. The main advantage of a segmented address space was to allow the sharing of common procedures and data with minimal overhead: the same object, or copies of the same object, may have the same address in several threads. Such a feature is not necessarily inherent to segmentation (Multics, which allocated segment numbers dynamically, did not guarantee such uniqueness), but GCOS chose this design to reduce overhead. In a system with limited memory and a slow processor, it would have been prohibitive to perform dynamic binding of common resources continually. It is true that segmentation has had a bad reputation since the messy implementation of the Intel 286 and early versions of Windows. The GCOS design preceded Intel's by twelve years.
It is true that paging, à la IBM S/370, also allowed a segmentation of the address space, but it caused an excessive loss of memory and, consequently, memory thrashing, due to the internal fragmentation caused by the 4KB page size. A simulation of OS/MFT behavior in tight real memory was undertaken on CP/67 at Grenoble University in 1969.
Anyway, a provision for adding paging to the architecture was made from the original systems onwards.

The GCOS address space.

Contrary to several operating systems, every GCOS thread has its own (potentially shared) address space. The address space was specified in a control structure named the "process control block", which pointed to shared and private tables of segments.

The address space of the thread contains procedure segments and data segments. Procedure segments are normally NOT modified during execution, a strong departure from the habits of the time, when clever programmers modified instructions during execution to save space. A special system service had to be requested to change the state of a segment when needed, for example by just-in-time compilers.

Data segments were often static in the usual programming languages derived from early computers (COBOL, FORTRAN). Block-structured languages derived from ALGOL, however, relied on dynamic allocation of memory in a stack specifically allocated to the thread.

Segments existing in the system address space are identified uniquely by their process-group id (J#), their process id within the process group (P#), a segment table number (STN#), and a segment number (STE#). However, by GCOS convention, one part of the segment table numbers is common to all threads and another part is common to all threads in a process group.

Some procedure segments included privileged instructions and performed services that could alter the whole system (by accessing system tables...). That privileged address space had to be isolated from application programs. A ring attribute was given to the thread during the execution of such services. Three levels of privilege were used in GCOS systems: ring -1 (the firmware ring, not visible to software) during the execution of micro-kernel services, ring 0, which allowed the "privileged" instructions, and ring 1, which allowed references to system-wide tables. Application programs ran in ring 3. Ring 2 was reserved for the implementation of subsystems requiring some internal protection between their components.

Multi-threading

GCOS has implemented multi-threading since its origin. While threads could not be defined in user languages, it was possible at application linkage time to specify that several threads had to be run, and subsystems such as emulators and, later, transaction processing had to be multi-threaded. Such an architecture could be ported almost without change to multi-processor configurations. A single GCOS system eventually ran on up to 24 physical processors.
While there was no notion of a "multi-processor safe" GCOS obtained by implementing a wide lock in front of system services, the efficiency of multi-processor configurations was progressively improved by distributing some services in separate threads (e.g. databases) and by increasing the locality of data in hardware caches, constraining threads not to move freely across processors.

Segment Types

Memory space requirements and multiprogramming made it necessary to avoid keeping several copies of the system code in memory for different threads. So the address space has been divided into four classes (named types of segments).
Type 0 segments were common to all threads, much as recent systems allocate half of the address space to the system.
Type 1 segments were segment numbers reserved for the implementation of dynamically loaded modules (more on them later).
Type 2 segments were allocated to segments shared between all threads of the same process (named here process group). They could be procedure segments (read/execute by default) or shared data.
Type 3 segments were segments private to the thread (a process control segment and a stack segment at the minimum).
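The sharing scope of the four segment types can be illustrated by a short Python sketch that computes which part of a full address (J, P, STN, STE) actually names a segment system-wide. The names SegmentType, SegmentAddress and sharing_key are hypothetical. The segment type is passed explicitly because the ranges of segment table numbers assigned to each type are not given here, and Type 1 is deliberately left unmodelled.

```python
from enum import IntEnum
from typing import NamedTuple, Optional, Tuple


class SegmentType(IntEnum):
    SYSTEM_SHARED = 0    # common to all threads
    DYNAMIC_MODULE = 1   # numbers reserved for dynamically loaded modules
    GROUP_SHARED = 2     # shared by the threads of one process group
    THREAD_PRIVATE = 3   # private to one thread


class SegmentAddress(NamedTuple):
    j: int     # process-group id (J#)
    p: int     # process id within the group (P#)
    stn: int   # segment table number
    ste: int   # segment number within the table


def sharing_key(addr: SegmentAddress,
                seg_type: SegmentType) -> Optional[Tuple[int, ...]]:
    """Return the part of the address that identifies the segment."""
    if seg_type == SegmentType.SYSTEM_SHARED:
        return (addr.stn, addr.ste)                  # J and P irrelevant
    if seg_type == SegmentType.GROUP_SHARED:
        return (addr.j, addr.stn, addr.ste)          # P irrelevant
    if seg_type == SegmentType.THREAD_PRIVATE:
        return (addr.j, addr.p, addr.stn, addr.ste)  # fully qualified
    return None  # Type 1 is not modelled in this sketch


def same_segment(a: SegmentAddress, b: SegmentAddress,
                 seg_type: SegmentType) -> bool:
    """True when both addresses designate the same segment."""
    key = sharing_key(a, seg_type)
    return key is not None and key == sharing_key(b, seg_type)
```

Two threads of different process groups that use the same Type 0 table and segment numbers therefore reach the same object, whereas the same numbers in a Type 3 table designate distinct private segments.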

Note that segments normally do not overlap, i.e. a word of memory has only one segmented address. There are exceptions to this rule. Some system tables, such as process group control tables and process control segments, overlap. A large segment describes main memory. And dynamically linked load modules have different segment numbers (but the same displacements).

Native mode Addressing

The GCOS native decor is known through an internal Bull document named PA800, an evolution of the original reference document BL011, published in 1971. Those documents describe the architecture visible to the programmer, not the actual hardware features of a given processor. Several families of processors have been built for that architecture by Bull and by NEC in Japan.

Instructions contain a type of operation (op-code), one or two addresses and/or several registers. The architecture offers the programmer up to 16 general-purpose 32-bit registers that may be used either to hold data or as index registers. Addresses are made of two parts: the number of a base register and a displacement within the segment referenced through the base register. There is an additional feature in the address: an "indirect addressing" flag that indicates that the location is actually a pointer containing another address. There are 8 base registers available. A base register specifies a 16-bit segment number and an offset within that segment.

Outside the programmer-visible space are the segment tables, which contain for each entry the physical location of the segment in real memory and the length of the segment (actually a boundary for accesses). Segments are allocated on 16-byte boundaries (in BL011).
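The addressing scheme described above (base register, displacement, segment table entry with a location and a boundary) can be sketched as follows in Python. This is a minimal illustration under stated assumptions: the names SegmentDescriptor, SegmentFault, BoundsViolation and translate are hypothetical, the segment table is a plain dictionary indexed by the 16-bit segment number, and indirect addressing and access rights are left out.

```python
from dataclasses import dataclass


@dataclass
class SegmentDescriptor:
    location: int        # physical address of the first byte of the segment
    length: int          # boundary: valid offsets are 0 .. length - 1
    present: bool = True  # False when the segment is not in real memory


class SegmentFault(Exception):
    """The segment is not present; a virtual memory manager would run."""


class BoundsViolation(Exception):
    """The access falls outside the segment boundary."""


def translate(base_registers, segment_table, base_no, displacement):
    """Map (base register number, displacement) to a physical address."""
    # A base register holds a segment number and an offset in the segment.
    segment, offset = base_registers[base_no]
    descriptor = segment_table[segment]
    if not descriptor.present:
        raise SegmentFault(segment)
    effective = offset + displacement
    if not 0 <= effective < descriptor.length:
        raise BoundsViolation(f"offset {effective} in segment {segment:#06x}")
    return descriptor.location + effective
```

For example, with base register 0 pointing at offset 16 of a 256-byte segment located at 0x4000, a displacement of 8 yields physical address 0x4018, while a displacement of 240 falls on the boundary and is refused.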

An additional feature of the architecture is the existence of procedure descriptors for each code segment. The procedure descriptor, built by the compilers, specifies the size of the stack section to be pulled down at each entry into the procedure, a mask for the registers passed as parameters, and the size of the parameter area passed through the stack. A given code segment may have several entry points, the list of which is part of the procedure descriptor structure.

When the referenced object is not present in physical memory, a fault occurs in the thread. If the fault occurs in the segment table, the virtual memory manager is entered and the segment descriptor is updated once the data are actually available. When the fault occurs in a pointer, it may be a programming error or an unresolved reference, which is interpreted by the debugger or the dynamic linker.

Virtual memory using segmentation was helped by the fact that the full logical address (J, P, STN, STE) referenced, by software convention, a unique piece of information (data or code). However, that convention was violated in the case of dynamically shared load modules, where the segment name (STE) space was overlaid. That part of the address space was easily identified by its reserved table numbers (STN).

A significant difference from the S/360 Principles of Operation was the inclusion of indirect addressing in the instructions. Patterned after Multics, it made it possible to identify data descriptors and pointers as objects of the architecture and to avoid the frequent reloading of base registers required by the IBM architecture.

Interrupts and Fault Traps

Interrupts are signals that asynchronously interrupt normal processing. The most frequent are related to the synchronization of input-output devices and to signaling by the timers. Abnormal environment conditions and operator interventions also cause interrupts.

Fault traps are diversions from normal processing caused by the processor's reaction to the instructions of the process. The most frequent cause is the absence of an operand from physical memory. Other causes are protection violations, illegal operand formats and process termination.

Each thread is associated with a "fault vector" specifying the procedure descriptors to be called for each type of fault trap and for some interrupts that are caused by the action of the thread. It is possible to specify a specific action for a given process through a service call that updates the fault vector.
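The per-thread fault vector can be pictured as a table of handlers, one per fault type, that a service call may update for one thread without affecting the others. The Python sketch below is only an illustration: the Thread class, the method names and the fault type strings are hypothetical, and plain functions stand in for procedure descriptors.

```python
from typing import Callable, Dict

FaultHandler = Callable[[str], str]


def default_handler(detail: str) -> str:
    """System default used in this sketch: abort the thread."""
    return f"abort: {detail}"


# Fault types taken from the causes listed above; the keys are illustrative.
SYSTEM_DEFAULTS: Dict[str, FaultHandler] = {
    "missing_operand": default_handler,
    "protection_violation": default_handler,
    "illegal_operand_format": default_handler,
    "process_termination": default_handler,
}


class Thread:
    def __init__(self, name: str, defaults: Dict[str, FaultHandler]):
        self.name = name
        # Each thread owns a copy, so updates do not leak to other threads.
        self.fault_vector = dict(defaults)

    def set_fault_handler(self, fault_type: str,
                          handler: FaultHandler) -> None:
        """Stand-in for the service call that updates the fault vector."""
        if fault_type not in self.fault_vector:
            raise KeyError(fault_type)
        self.fault_vector[fault_type] = handler

    def raise_fault(self, fault_type: str, detail: str) -> str:
        """Dispatch a fault trap to the handler registered for its type."""
        return self.fault_vector[fault_type](detail)
```

A thread that installs its own handler for illegal operand formats recovers from that fault, while another thread that kept the defaults is still aborted on the same condition.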

Virtual Memory Management

The physical memory of the first Level 64 systems was limited to 512 KB, and there was a marketing requirement to support systems with only 64 KB. This goal was not achieved, and was probably unachievable: the absolute minimum physical memory in which GCOS ran was 256 KB. In fact, an additional chunk of 256 KB was delivered to customers ordering 64 KB or 128 KB of memory.
Virtual memory was used throughout GCOS, except in the bootstrap (ISL, initial system load). The preferred solution for implementing virtual memory was demand segmentation. Demand segmentation could easily be mapped onto planned overlays, which had been the common way of managing the programming space in the 1960s. It was more flexible than planned overlays and, for those small amounts of memory, more efficient than demand paging.
The optimum size for virtual memory elements was 4KB in a disk configuration where buffering was minimal. Allocating memory in multiples of 4KB led to more waste than mapping segments on 16-byte boundaries. The processor architecture optimized MOVEs, and compaction by moving segments in memory was relatively less expensive than paging. Such an implementation was always expected to evolve with the advent of larger memories, and a paging option had been under consideration since 1975. Paging finally became the way of handling memory in the late 1980s, when hardware supporting paging became available across the range. It was first introduced in 1987 to support a processor originating from NEC (DPS-7-10x7 Aquila).
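The difference in waste between the two allocation granularities is simple arithmetic, which the following Python sketch makes explicit. The segment sizes are invented for the illustration and do not come from any GCOS measurement; the function names are hypothetical as well.

```python
def allocated_size(size: int, granule: int) -> int:
    """Round a size in bytes up to the next multiple of the granule."""
    return -(-size // granule) * granule


def wasted_bytes(sizes, granule: int) -> int:
    """Total internal fragmentation for a list of segment sizes."""
    return sum(allocated_size(s, granule) - s for s in sizes)


# Hypothetical segment sizes in bytes, chosen only for the example.
SEGMENT_SIZES = [300, 1200, 5000, 9000]

PAGE_WASTE = wasted_bytes(SEGMENT_SIZES, 4096)    # 4KB allocation unit
SEGMENT_WASTE = wasted_bytes(SEGMENT_SIZES, 16)   # 16-byte boundaries
```

For these four sizes, which total 15500 bytes, rounding up to 4KB units wastes 13172 bytes, against 20 bytes with 16-byte boundaries. In a machine with 256 KB of real memory, that kind of difference was significant.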

Paged virtual memory did not replace segmented virtual memory but was a layer exercised first, before segments were moved out. In addition, segmentation was a basic feature of the architecture and not just a trick to implement a virtual memory larger than physical memory.

Memory protection

Memory protection is achieved by the combination of access rings and segmentation. A user thread cannot reference segments outside its system-created address space. Absolute addressing of memory is only performed within ring 0 procedures (physical I/O handling and the virtual memory manager).
A user program is unable to forge a fake address when calling system services. The ring indication of parameters passed to an external procedure is checked by the micro-kernel before entering the system service.
I/O transfers are performed from or to data segments. Actual channel programs are not under the control of the user program.
The access rights to a segment are READ, WRITE and EXECUTE. Compilers generate READ/EXECUTE code segments by default. To save address space, constants are allocated within the code segments. The few cases where the system allows a change in segment access rights are those where code is generated dynamically (such as when the parameters of a utility are only known at run-time, e.g. the SORT verb of COBOL, which may be handled as a sort within main memory, if small enough, or as the spin-off of a utility program).
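The three access rights and the default given to compiled code can be expressed with a flag set, as in the Python sketch below. The names Access, CODE_DEFAULT and check_access are hypothetical, and the check only illustrates the rule. In GCOS it is performed by the hardware and firmware on each reference, in combination with the ring checks.

```python
from enum import Flag, auto


class Access(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


# Default rights of a compiled code segment, as stated above.
CODE_DEFAULT = Access.READ | Access.EXECUTE


def check_access(segment_rights: Access, requested: Access) -> bool:
    """True when every requested right is granted on the segment."""
    return (segment_rights & requested) == requested
```

Constants stored in a code segment can be read because READ is part of the default, but any attempt to write into the segment fails the check until a system service has explicitly added WRITE to its rights.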

Process Group

We will use here the original GCOS name of process group for the entity made of a group of threads sharing a large part of their address space and being "loaded" simultaneously. A process group is the entity of scheduling. Application process groups are also called Job Steps. They are characterized by an identifier J (for job step), a process group control segment containing the threads' "process control blocks", and the segment tables specifying their private address spaces (located in real and virtual memory).
This entity is somewhat different from the concept of process space used by UNIX and others, because the threads sharing such a process space share the whole static process space and are not protected against each other.

2001-2003 Jean Bellec