Overview of GCOS 64 Operating System architecture
SECTION 1
Architectural Concepts

SECTION 2
The GCOS computer model

SECTION 3
Structure of the Operating System

SECTION 4
Modes and Emulators

SECTION 5
Interactive Modes of operation

SECTION 6
Miscellaneous


© Jean Bellec 2001

Thirty-one years after writing, with Claude Carré, André Bensoussan and Axel Kvilekvaal, a first draft of the "APL OS" architecture, and seven years after leaving Bull, I felt it worthwhile to revisit the past by writing a description of what that operating system eventually became. The description is written here assuming that the reader has more knowledge of modern operating systems than familiarity with the pre-1970 state of the art.

Since 1974, we have been somewhat frustrated by the secrecy that enveloped the inside of this project. Customers were given only a sketchy idea of their OS, and university people, even French computer scientists, consequently remained unaware of the work of the Bull engineers. Our nostalgia does not mean that we believe the industry missed a great opportunity by building on the more popular systems that have since invaded our personal computers and servers rather than on GCOS. The architectural concepts of our system were designed when modern computer usage was just starting to appear, and well before hardware became cheaper by several orders of magnitude. While the same consideration also applies to IBM mainframes and to UNIX, the fact that GCOS 64 was a proprietary system owned by what was then "the other computer company" deprived it of the acknowledgement that Open Systems were able to provide to its competitors.

Once the concepts behind GCOS 64/GCOS 7 are understood, it will be seen that GCOS incorporated many features of systems that appeared long after its introduction. This paper explains the reasons for diverging from more celebrated operating systems and attempts to explain the reasons behind the complexity of some mechanisms.

This paper does not describe the "history" of GCOS's birth and evolution, which is covered in other documents on this site.


GCOS is used here for GCOS 64 and GCOS 7, two successive names for essentially the same product. The design of GCOS 7 is completely different from that of GECOS III or GCOS 8, as well as from GCOS 4 or GCOS 6.

SECTION I
Architectural Concepts

The basic concept behind the GCOS operating system is an architecture based on threads (then named processes, a terminology borrowed from Multics). A thread is a sequence of instructions executed on a processor against a set of data. More recently, several types of threads have appeared in the technical literature: kernel threads, user threads and light-weight processes. The most common term for expressing parallelism is "thread"; "process" is associated with a single address space that can be shared between threads. GCOS gives each thread its own address space, while also sharing a large part of it with other threads. For the purpose of this document, let us call "GCOS processes" "threads".

The sequence of instructions executed in a thread is gathered in a structure named a procedure. A procedure is generally derived from the compilation of a program.
Concepts that emerged around the time of Algol 60 emphasized a nested space of identifiers: variables are declared inside a syntactical block. Most recent programming languages use that concept, which was however absent from the first versions of COBOL and FORTRAN. The compilers are responsible for building up the space of variables, including handling the block structure. They do so by allocating "automatic" variables inside a "stack frame" that can be pushed or popped, either by instructions generated by the compiler for internal procedures or by the operating system for external procedures.
Calls to, and returns from, external procedures are architectured through a procedure descriptor that contains pointers to the entry point(s) of the procedure, the privileges of the procedure, descriptions of the parameters, and the return address. Those procedure descriptors are handled through special instructions (PREPARE_STACK, ENTER and RETURN) implemented in firmware and part of the micro-kernel. The procedure-call mechanism checks the access rights of the calling and the called procedures to prohibit access to unauthorized data and to forbid the passing of forged procedure descriptors as return parameters. In summary, inward calls (i.e. calls to a more privileged ring) were authorized, while outward calls were normally forbidden (a special software mechanism allowed the system to call user-supplied escape procedures).
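As a purely illustrative sketch in C, here is what such a descriptor and the inward-call check might look like; the field names, widths and ring comparison are assumptions made for the example, not the actual PA800/BL011 layout.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical sketch of a procedure descriptor; field names and sizes
   are illustrative, not the actual PA800/BL011 layout. */
typedef struct proc_descriptor {
    uint16_t segment;        /* code segment holding the procedure         */
    uint16_t entry_offset;   /* offset of the entry point in that segment  */
    uint8_t  ring;           /* privilege ring in which the body executes  */
    uint8_t  stack_section;  /* size of the stack section pulled at ENTER  */
    uint8_t  param_count;    /* number of parameters passed on the stack   */
} proc_descriptor;

/* Inward calls (toward a more privileged, lower-numbered ring) are allowed;
   outward calls are normally refused, as described above. */
bool enter_allowed(uint8_t caller_ring, const proc_descriptor *callee)
{
    return callee->ring <= caller_ring;   /* ring 0 is the most privileged */
}
```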
Mapping a language such as COBOL onto this model has not always been easy. COBOL originated in the second generation of computers and had features derived from tape-oriented operating systems. While the CALL verb somewhat resembled a procedure call, the CANCEL verb had no equivalent in modern programming languages. At the price, not too expensive for such I/O-intensive programs, of encapsulating those verbs, a common architecture was maintained for all the compiled languages.

A program is usually designed by the programmer as a sequence of statements executed sequentially. However, some of those statements in fact reference actions (services) to be executed on behalf of the program by the operating system. Those actions may be divided into two kinds: synchronous services that are bound to the thread (e.g. GET_TIME) or are simple calls to services (e.g. SET_TIMER), and asynchronous services that may cause the thread to be put in a WAIT state (because they rely on a resource with a response time much longer than a few processor cycles, or because they cause the thread to wait for a resource locked by another thread).

The choices made in the GCOS architecture were to provide the maximum efficiency in both cases, whatever the number of physical processors the hardware includes.
The architecture implements the concept of semaphores to synchronize threads with other threads AND with hardware. Semaphores are not just lock bits: they contain a count and a queue of threads. In addition, messages can be posted to a semaphore and picked up by the unblocked thread.
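A minimal C sketch of such a semaphore follows, assuming a firmware dispatcher that is not shown; the names and the simplified LIFO queue are illustrative, not the actual GCOS structures.

```c
#include <stddef.h>

typedef struct thread  thread;          /* opaque thread control block */

typedef struct message {
    struct message *next;
    void           *body;
} message;

typedef struct waiter {
    struct waiter *next;
    thread        *t;
} waiter;

typedef struct {
    int      count;                     /* a count, not a mere lock bit      */
    waiter  *queue;                     /* threads blocked on the semaphore  */
    message *messages;                  /* messages posted for the unblocked */
} semaphore;

/* P operation: take a unit if available, otherwise block the caller
   (enqueued here in LIFO order for brevity). */
void sem_p(semaphore *s, thread *self, waiter *node)
{
    if (s->count > 0) {
        s->count--;
        return;
    }
    node->t = self;
    node->next = s->queue;
    s->queue = node;
    /* dispatcher_block(self); -- elided, handled by the micro-kernel */
}

/* V operation: post an optional message and unblock one waiter, if any. */
void sem_v(semaphore *s, message *msg)
{
    if (msg) {
        msg->next = s->messages;
        s->messages = msg;
    }
    if (s->queue) {
        waiter *w = s->queue;
        s->queue = w->next;
        /* dispatcher_unblock(w->t); -- elided */
    } else {
        s->count++;
    }
}
```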
Synchronous services are offered by system procedures that have direct access to the application address space. No thread-switching time is spent in that case.

The procedures and data referenced by a thread constitute what is called the address space of the thread. Note that the synchronous services of the operating system are included in this address space, together with their private data resources, which are invisible to the application programmer.

 

The choice of a segmented address space was made under the influence of MULTICS. The main advantage of a segmented address space was to allow the sharing of common procedures and data with minimal overhead: the same object, or copies of the same object, may have the same address in several threads. Such a feature is not necessarily inherent to segmentation (Multics, which allocated segment numbers dynamically, did not guarantee such uniqueness), but GCOS chose that design to reduce overhead. In a system with limited memory and slow processor speed, it would have been prohibitive to perform dynamic binding of common resources continually. It is true that segmentation has had a bad reputation since the messy implementation of the Intel 286 and early versions of Windows; the GCOS design preceded Intel's by twelve years.
It is true that paging, à la IBM S/370, also allowed a segmentation of the address space, but it caused an excessive loss of memory and, consequently, memory thrashing due to the internal fragmentation caused by the 4KB page size. A simulation of OS/MFT behavior in tight real memory had been undertaken on CP/67 at Grenoble University in 1969.
In any case, provision for adding paging to the architecture was made from the original systems onward.

The GCOS address space

Contrary to several operating systems, all GCOS threads have their own (potentially shared) address space. The address space was specified in a control structure named the "process control block", which pointed to shared and private segment tables.

The address space of the thread contains procedure segments and data segments. Procedure segments are NOT normally to be modified during execution, a strong departure from the habits of the time, when clever programmers modified instructions during execution to save space. A special system service had to be requested to change the state of a segment when needed, by just-in-time compilers for example.

Data segments were often static in the usual programming languages derived from early computers (COBOL, FORTRAN). But ALGOL-derived block languages relied on dynamic allocation of memory in a stack specifically allocated to the thread.

Segments existing in the system address space are uniquely identified by their process-group id (J#), their process id within that group (P#), a segment table number (STN#), and a segment table entry number (STE#). However, by GCOS convention, one range of segment table numbers is common to all threads and another range is common to all threads of a process group.
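For illustration, the full identification of a segment and a virtual address might be pictured as the following C structures; the field widths are assumptions, not the documented formats.

```c
#include <stdint.h>

typedef struct {
    uint16_t j;      /* J#  : process group (job step)           */
    uint16_t p;      /* P#  : process (thread) within the group  */
    uint8_t  stn;    /* STN#: segment table number               */
    uint8_t  ste;    /* STE#: entry (segment) within that table  */
} gcos_segment_id;

/* An address within a thread then names a segment plus a displacement. */
typedef struct {
    uint8_t  stn;
    uint8_t  ste;
    uint32_t displacement;
} gcos_virtual_address;
```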

Some procedure segments included privileged instructions and performed services that could alter the whole system (by accessing system tables...). That privileged address space had to be isolated from application programs. A ring attribute was given to the thread during the execution of such services. Three levels of privilege were used in GCOS systems: a ring -1 (or firmware ring, not visible to software) during the execution of micro-kernel services, a ring 0 allowing the "privileged" instructions, and a ring 1 allowing references to system-wide tables. Application programs ran in ring 3. Ring 2 was reserved for the implementation of subsystems requiring some internal protection between their components.

Multi-threading

GCOS has implemented multi-threading since its origin. While the definition of threads was absent from user languages, it was possible at application linkage time to specify that several threads were to be run, and subsystems such as emulators and later transaction processing had to be multi-threaded. Such an architecture could be ported to multi-processor configurations with almost no change. Up to 24 physical processors eventually supported the same GCOS.
Obviously, while GCOS was not made "multi-processor safe" merely by placing a single wide lock in front of the system services, efficiency in multi-processor configurations was progressively improved by distributing some services into separate threads (e.g. databases) and by increasing the locality of data in hardware caches, by constraining the thread environments not to move freely across processors.

Segment Types

Memory space requirements and multiprogramming led to avoiding several copies of the system code being present in memory for different threads. So the address space was divided into four classes, named segment types (summarized in the sketch after this list).
Type 0 segments were common to all threads, as is done in recent systems by allocating half of the address space to the system.
Type 1 segments were segment numbers reserved for the implementation of dynamically loaded modules (more on them later).
Type 2 segments were allocated for segments shared by all threads of the same process (named here a process group). They could be procedure segments (read/execute by default) or shared data.
Type 3 segments were private to the thread (a process control segment and a stack segment at a minimum).
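The four classes can be summarized in the following sketch; the enumerator names are merely descriptive, not taken from GCOS documentation.

```c
enum segment_type {
    SEG_TYPE_0 = 0,   /* common to all threads (system-wide)                 */
    SEG_TYPE_1 = 1,   /* numbers reserved for dynamically loaded modules     */
    SEG_TYPE_2 = 2,   /* shared by all threads of the same process group     */
    SEG_TYPE_3 = 3    /* private to the thread (process control, stack, ...) */
};
```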

Note that segments do not normally overlap, i.e. a word of memory has only one segmented address. There are exceptions to this rule: some system tables, such as the process group control tables and the process control segments, overlap; a large segment describes main memory; and dynamically linked load modules have different segment numbers (but the same displacements).

Native mode Addressing

The GCOS native decor is known through an internal Bull document named PA800, an evolution of the original reference document BL011 published in 1971. Those documents describe the architecture visible to the programmer, not the actual hardware features of a given processor. Several families of processors were built by Bull and by NEC of Japan for that architecture.

Instructions contain an operation type (op-code), one or two addresses and/or several registers. The architecture offers the programmer up to 16 general-purpose 32-bit registers that may be used either to hold data or as index registers. Addresses are made of two parts: the number of a base register and a displacement within the segment referenced through that base register. There is an additional feature in the address: an "indirect addressing" flag indicating that the location is actually a pointer containing another address. There are 8 base registers available. A base register specifies a 16-bit segment number and an offset within that segment.

Outside the programmer-visible space are the segment tables, each entry of which contains the physical location of the segment in real memory and the length of the segment (actually a bound for accesses). Segments are allocated on 16-byte boundaries (in BL011).
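A minimal sketch of that address resolution follows, assuming simplified structures; the names, field widths and the fault representation are illustrative only.

```c
#include <stdint.h>

typedef struct {
    uint16_t segment;        /* 16-bit segment number                       */
    uint32_t offset;         /* offset within that segment                  */
} base_register;

typedef struct {
    int      present;        /* segment present in real memory?             */
    uint32_t physical_base;  /* location in real memory (16-byte multiple)  */
    uint32_t length;         /* bound checked on every access               */
} segment_table_entry;

/* Resolve "base register + displacement" to a physical address, checking
   the segment bound; (uint32_t)-1 stands in for a fault here. */
uint32_t resolve(const base_register *br, uint32_t displacement,
                 const segment_table_entry *table)
{
    const segment_table_entry *ste = &table[br->segment];
    uint32_t off = br->offset + displacement;

    if (!ste->present)         /* missing segment: virtual memory manager */
        return (uint32_t)-1;
    if (off >= ste->length)    /* out of bounds: protection violation     */
        return (uint32_t)-1;
    return ste->physical_base + off;
}
```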

An additional feature of the architecture is the existence of procedure descriptors for each code segment. The procedure descriptor, built by the compilers, specifies the size of the stack section to be pulled down at each entry into the procedure, a mask for registers passed as parameters, and the size of the parameter area passed through the stack. A given code segment may have several entry points, the list of which is part of the procedure descriptor structure.

When the referenced object is not present in physical memory, a fault occurs in the thread. If the fault occurs in the segment table, the virtual memory manager is entered and the segment descriptor is updated when the data become available. When the fault occurs in a pointer, it may be a programming error or an uncompleted reference, which is interpreted by the debugger or the dynamic linker.

Virtual memory using segmentation was helped by the fact that the full logical address (J, P, STN, STE) referenced, by software convention, a unique piece of information (data or code). However, that convention was violated in the case of dynamically shared load modules, where the segment name (STE) space was overlaid. That part of the address space was easily identified by its reserved table numbers (STN).

A significant difference from the S/360 Principles of Operation was the inclusion of indirect addressing in the instructions. Patterned after Multics, it allowed data descriptors and pointers to be identified as objects of the architecture and avoided the frequent reloading of base registers found in the IBM architecture.

Interrupts and Fault Traps

Interrupts are signals that asynchronously interrupt normal processing. The most frequent are related to the synchronization of input-output devices and to signaling by the timers. Abnormal environmental conditions and operator interventions also cause interrupts.

Fault traps are derailments of normal processing caused by the processor's reaction to the instructions of the process. The most frequent cause is the absence of an operand in physical memory. Other causes are protection violations, illegal operand formats and process termination.

Each thread is associated with a "fault vector" specifying the procedure descriptors to be called for each type of fault trap and for some interrupts caused by the actions of the thread. It is possible to specify a specific action for a given process through a service call that updates the fault vector.
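As a sketch only, a per-thread fault vector can be pictured as an array of procedure descriptors indexed by fault class; the class names and the service name below are assumptions, taken loosely from the fault causes listed above.

```c
typedef struct proc_descriptor proc_descriptor;   /* opaque here */

enum fault_class {
    FAULT_MISSING_SEGMENT,     /* operand absent from physical memory */
    FAULT_PROTECTION,          /* ring or bound violation             */
    FAULT_ILLEGAL_FORMAT,      /* illegal operand format              */
    FAULT_TERMINATION,         /* process termination                 */
    FAULT_CLASS_COUNT
};

typedef struct {
    const proc_descriptor *handler[FAULT_CLASS_COUNT];
} fault_vector;

/* Hypothetical service updating one entry of the caller's fault vector. */
void set_fault_handler(fault_vector *fv, enum fault_class c,
                       const proc_descriptor *handler)
{
    fv->handler[c] = handler;
}
```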

Virtual Memory Management

The physical memory of the first Level 64 systems was limited to 512 KB, and there was a marketing requirement to support systems with only 64 KB. This goal was not achieved (it was probably unachievable), and the absolute minimum physical memory in which GCOS ran was 256 KB. In fact, an additional chunk of 256 KB was delivered to customers ordering 64 KB or 128 KB of memory.
Virtual memory was used in GCOS everywhere except in the bootstrap ISL (initial system load). The preferred solution for implementing virtual memory was demand segmentation. Demand segmentation could easily be mapped onto the planned overlays that had been the common way of managing program space in the 1960s. It was more flexible than planned overlays and, for those small amounts of memory, more efficient than demand paging.
The optimum size for virtual memory elements was 4KB in a disk configuration where buffering was minimal. Allocating memory modulo 4KB led to more waste than mapping segments on a 16-byte boundary. The hardware processor architecture optimized MOVEs, and compaction by moving segments in memory was relatively less expensive than paging. Such an implementation was always expected to evolve with the advent of larger memories, and a paging option had been under consideration since 1975. Paging finally became the way of handling memory in the late 1980s, when hardware supporting paging became available across the range. It was first introduced in 1987 in support of a processor originated by NEC (DPS-7-10x7 Aquila).

Paged virtual memory did not replace segmented virtual memory but was a layer exercised first, before segments were moved out. In addition, segmentation was a basic feature of the architecture and not just a trick to implement a virtual memory larger than the physical one.

Memory protection

Memory protection is achieved by the combination of access rings and segmentation. A user thread cannot reference segments outside its system-created address space. Absolute memory addressing is done only within ring 0 procedures (physical I/O handling and the virtual memory manager).
A user program cannot forge a fake address when calling system services: the ring indication of parameters passed to an external procedure is checked by the micro-kernel before the system service is entered.
I/O transfers are performed from or to data segments. Actual channel programs are not under the control of the user program.
The access rights to a segment are READ, WRITE and EXECUTE. Compilers generate READ/EXECUTE code segments by default. To save address space, constants are allocated within the code segments. The few cases where the system allows a change in segment access rights are those where code is generated dynamically (such as when the parameters of a utility are only known at run time, e.g. the SORT verb of COBOL, which may be handled as a sort within main memory, if small enough, or as the spin-off of a utility program).

Process Group

We will use here the original GCOS name of process group for the entity made of a group of threads sharing a large part of their address space and being "loaded" simultaneously. A process group is the unit of scheduling. Application process groups are also called job steps. They are characterized by an identifier J (for job step), a process group control segment containing the threads' "process control blocks", and the segment tables specifying their private address spaces (located in real and virtual memory).
This entity is somewhat different from the concept of a process space, used by UNIX and others, in which the threads sharing the process space share the whole static process space and are not protected against each other.

SECTION II
GCOS computer model

File system and computer address space

The first computing systems, derived from punched-card processing, completely dissociated the concept of files from the processing instructions. The von Neumann architecture and the introduction of time-sharing interactive systems created a new approach that merged the concept of files and the computational space. Files were user creations, just as programs were, and the most efficient view was to consider them similarly. MULTICS added to this new view the concept of sharing, which had been ignored by many of its time-sharing predecessors.

GCOS did NOT adhere to this MULTICS view of files. Instead, it considered files as independent from the users. Very often, files were created well before the introduction of a new system, and they essentially persist beyond the life of that system. Files are handled by GCOS as they were in its business processing predecessors (Gamma 60, IBM S/360, GECOS III).
That choice meant that files are not just a succession of bytes (defined by a length or terminated by 0x00), as in MULTICS or some of its successors, but may have an intrinsic structure and are never mapped into the address space of a computation. Files, in GCOS, are OPENed and CLOSEd, and a program has to work in windows of the file through the appropriate access methods, respecting the structure of the file.
The application program's windows into the file are buffer segments protected by the segmentation mechanism. The syndrome of violating system protection through buffer overflow was foreign to the GCOS architecture.

Time-sharing systems generally used the concept of one process per user and ran all programs requested by the user inside this process address space. This paradigm was NOT chosen for GCOS. One reason was that batch applications were generally submitted by one user, the system operator. Another was that the amount of system resources varies significantly according to the type of computation: the resources needed by a compilation are different from those required for sorting tape files... There was also a suspicion of a lack of reliability of hardware and application software, which led to the view that a job step should be associated with a stable checkpoint of the file system. By moving the file system (i.e. the volumes that contained the files) to another system, it would be possible to restart a job. It should be remembered that the duration of a batch-processing job at that time was frequently measured in hours rather than in seconds.
Consequently, a job was divided into job steps. For each job step, resources were allocated and a new process group environment was created. At the end of the process group (normal, caused by errors, or killed by the operator), the environment was destroyed and the resources still allocated were relinquished.

Compatibility

GCOS ran on a family of computers: Level 64, DPS-7 and DPS-7000. The design of this family started in 1967 as a new product line within General Electric. The decision to build the systems was reached at the end of 1970, after the merger of the computer operations of Honeywell and GE. It was then contemplated to replace all current product lines with the new products, named at that time NPL (New Product Line).
However, the development cost of replacing all the existing products exceeded Honeywell's resources, and the replacement of GECOS III systems was delayed indefinitely.
Relatively early in the design of NPL, in 1969, it was decided that the smallest member of the family, designed in Italy, would not be fully compatible with the main line. Segmentation was reduced to support only a code segment and a data segment per process. What was eventually introduced as GCOS 62 shared only data types and non-privileged instruction compatibility with GCOS 64.

Honeywell decided to introduce the NPL products (under GCOS 62 and GCOS 64) and to reintroduce GECOS III as GCOS 66 under a weak umbrella named the Honeywell computational theater, which emphasized source-language compatibility between members and neglected actual commonalities between operating systems. While such a concept made some sense to users not concerned with the inside of their system, it increased development costs for the manufacturer, which had not only to develop three operating systems but also to develop features to ease migration between them. A technical solution of adopting an interpretive common object language, such as Java became in the 1990s, was not conceivable in the 1970s, because of the "fait accompli" of binary programs available on GECOS III and the low performance of the processors of that time. JIT (Just in Time) compilers were out of the question then: the time to compile even a small program was measured in minutes!
So the policy that Honeywell adopted when it decided not to phase out GECOS III in 1973 could not be considered successful, and many successors of Level 64 and Level 66 had to be built at high cost before mainframes started their decline.

As all the emulation strategies had been technical successes, it may seem surprising that Honeywell and Bull were not able to adopt one internally. That was considered, at least by engineers, from Level 64 to Level 66 in the 6XXX project in 1972, from Level 62 to Level 64 in the Gemini project in 1979, and by NEC for its ACOS2/ACOS4 lines in 1985. The changing multinational organization structure was the most obvious cause of this failure, and the "Not Invented Here" syndrome in engineering organizations did the rest.

 

SECTION III
Structure of the GCOS Operating System

The descriptions of operating systems written in the 1970s used a terminology that is not easy for 2000s readers to grasp. The concept now known as thread, or multi-threading, was then named, not without rationale, process. Since then, UNIX fans have assimilated a process to an address space more than to a chain of control (though the two concepts are frequently mixed).
We used to call procedures executed within the user thread "distributed procedures", whereas since the late 1980s distribution has been assimilated to the client/server model, where services run in separate processes.
The I/O architecture of GCOS is also quite different from that of most open systems. It followed the concept of the "channel" introduced by IBM in the late 1950s and pushed the architecture of its "channel programs" to a point close to real process distribution.
Distribution of threads within a shared-memory multiprocessor system was standard in the architecture, even within the nucleus of the OS, thanks to the implementation of basic functions in "firmware", which we can claim to be a micro-kernel.

Micro-kernel

GCOS includes a micro-kernel implemented in "firmware", a set of shared procedures that implement "synchronous services", a system process group that gathers services implemented by threads and event handling, and, in addition to those services, a variable number of servers (process groups) that may be divided into standard system servers and special subsystem monitors or servers.

We are here calling micro-kernel those basic functions that implement the architectural objects: threads, semaphores (including I/O) and procedure calls. They are loaded at initialization time and really constitute a foolproof kernel layer. The term micro-kernel may disturb purists, because the services of that level are not all implemented by micro-threads communicating with the software only by messages. Some services are synchronous, operating as procedures in the calling thread (P and V operations); others are implemented through micro-processes polling the I/O channels or dispatching the visible threads that are candidates for execution.
The way the micro-kernel is implemented is not part of the architecture and varies slightly according to the processor model. That microcode is implemented as a subset of the software-visible instruction set (so it is interpreted by the execution micro-programs), bypassing error checks and directly using hardware features that are not part of the software-visible architecture. It operates faster than the same algorithms written in software.

The system process group operates within its own address space (J=0). It gathers the system threads implementing asynchronous services or, more exactly, the asynchronous portion of those services. Among them are the I/O termination thread(s), interrupt handling, and the attention handler.

In addition to that system process group, which has to stay permanently alive, several process groups are needed for general-purpose operation: the Job Scheduler, the Input Reader, and the Output Writer. Reader and Writer are multi-threaded process groups that can handle several reader devices and several printer devices. Other process groups are also present in a full configuration: FNPS, providing the network interface, and a variety of transaction and database servers.

Some recent systems (e.g. CHORUS) claim that a well-architectured micro-kernel should pass parameters by messages, so that an operating system could be distributed over processors without shared memory. Such configurations were not designed for GCOS, so GCOS does not pretend to include a well-architectured micro-kernel. The word itself was not used by developers, who often considered the primitives as instructions implemented by firmware and operating in a mysterious area of memory called under-BAR (for Barricade Register) memory.

Program execution

The external specifications of GCOS were not those of a workstation operating system such as Windows, nor of a time-sharing server like UNIX (still in limbo at GCOS design time), but essentially those of a batch operating system handling sequential files and, progressively, direct access files. Programs were initiated from a batch input file, initially a spooled card reader, later extended to multiple sources, some of them being terminals. Programs were scheduled according to their physical device resource requirements (many initially used magnetic tapes). They were then loaded, performing some last-minute binding with resources and services. Their threads were then given control and dispatched as efficiently as possible; eventually they terminated and an execution report was printed.

Program preparation

The overall structure of program preparation was already available in the previous generation of computers (for Bull: Gamma M-40, GE-140). A job is submitted to the system as a sequential file of commands (JCL) and data (source programs, subcommands for processors, and possibly user data to be processed). In general, since the system was used mostly for business applications, the program preparation job ended with an update of libraries for subsequent processing of files.
Program preparation includes an optional step of macro processing, a step of language compilation, and a linkage step. Utilities such as the binding of several compile units or library maintenance could be intermixed in program preparation jobs.

Compilers

The languages used in the majority of GCOS programs were "compiled" into machine format prior to their execution. The exceptions were JCL/GCL, BASIC, and APL.
One of the goals pursued in the compiler design was to achieve a common object format for all compiled languages, a goal that IBM and GE had not been able to achieve before 1970. It would thus be possible to link together procedures written in FORTRAN, COBOL, PL/1, HPL (a PL/1 subset used as the system implementation language), MLP, the NAL assembler, and later C, and have them call the operating system services through common sequences. Sometimes a few language idiosyncrasies required the compiler to insert a stub for compatibility, but the goal was achieved.
The interfaces between languages and services traditionally included two portions that were kept separate for efficiency purposes: the declarative statements defining system tables, and the action statements. In the implementation languages (specific to the system), those statements were handled through macros: declaration macros were translated into data declarations; action macros were usually a procedure call (but in-line code was sometimes substituted without changing the macro call) or an asynchronous service request. That allowed adapting the implementation of the services while limiting the impact on the client. Macros were expanded by a relatively sophisticated Macro-Generator with string processing derived from MIT's TRAC. The initial JCL implementation was performed in two passes: the command string was parsed by the Macro-Generator before being interpreted by a more compact module. Most JCL errors were detected by the Macro-Generator parser. That implementation was later replaced by a more conventional dedicated processor.
A goal was to have the compilers strictly obey the ANSI and CODASYL standards. A compiler accepted the whole standard (even in a controversial stage, such as CTG -COBOL Communication Task Group- or DBTG -Data Base Task Group- in COBOL), and only the standard. Standard features that did not map nicely onto the architecture had to be accommodated by special handling at the post-processor and linkage level (e.g. by "forking" a service into a separate thread or even requesting a service from a special server) or by a pre-processor in front of the compiler (e.g. translating transaction processing verbs such as COMMIT into a CALL accepted by the standard compiler).

The first versions of the compilers were designed by separate teams in Waltham, MA and in France. RPG was even derived from a British compiler for Level 62. So they had their own components, including code generators; the only common point was the common compile-unit format. After 1975, compiler maintenance was centralized in a single team in Paris, and the compilers were redesigned to use common tools and modules. FORTRAN, HPL, C and others (with the exception of COBOL, which was not maintained by the same team) produced a system-independent format (OIL, object intermediary language) and shared a common code generator.

In addition to the programming languages offered to customers, Honeywell and Honeywell-Bull used three "implementation languages" to develop the system. An assembler, named NAL for NPL Assembly Language, using the same macro facility (MacGen) as the other implementation languages and, more importantly, the same declaration statements for data, gave system programmers access to all the facilities of the machine. Its usage was limited to writing specific system procedures that depended on hardware or were estimated to be critical in terms of performance.
HPL, eventually the language par excellence for implementing GCOS, was a PL/1 subset without floating point or multi-threading facilities and with more stringent data typing. HPL used the MacGen macro facility. It was initially developed under Multics on the GE-645. It was eventually delivered to major customers under the name GPL, for GCOS Programming Language.
MLP was initially a macro-assembler developed under the GE-635 to implement the GE-100 emulator and other hardware-oriented products. It never evolved into a fully general-purpose compiler, although its usage was pushed by the lack of machine time on the GE-645. One of its advantages was that it gave an in-line escape mechanism to assembly language. Progressively, MLP was phased out in favor of HPL as complete reimplementations of old modules occurred. Note that NEC called its version of MLP GPL.

PL/1 was seen in early 1975 as a potential universal language, backed by the power of IBM. GCOS HPL was nevertheless not delivered openly to customers because it did not implement all PL/1 features (floating point, tasking...). Largely because it did not support the default typing of PL/1, a new parser was needed. A "standard"-compliant PL/1 compiler was implemented in the late 1970s by a team originating from CII. But the success of PL/1 was doomed.
A bigger issue in CII-Honeywell-Bull was whether to implement ADA, or even to switch to ADA as an implementation language. The winner of the ADA design competition was Jean Ichbiah, a researcher from CII. The effort to make such a move was discussed and finally abandoned at the beginning of the 1980s, as Bull's management was intending to disengage from proprietary systems. So ADA was not a "prophet in its own country".
C's ascension came too late to significantly influence GCOS products. As a language, it has few advantages compared to HPL, except, of course, its contribution to the porting of externally developed products. In the early 1980s, Bull considered porting a relational database system; almost all of them were available in C. The port of Oracle was initially made using a quick port of GNU C. A proprietary C compiler producing OIL output was also made by Bull and used for Oracle and other application ports.

Binding names into addresses

The output of the compilers was named "compile units" (often called object format). The unit of compilation is usually an external procedure. All local names appearing in that external procedure are resolved by the compiler into tentative binary addresses. According to the scope of the data, as recognized by the compiler, the data are allocated to one or several data segments. Variables with AUTOMATIC scope and the parameters of procedure calls are allocated relative addresses in a stack segment and gathered into a "stack section". Names not resolved by the compiler are kept in a "linkage section".

The linker is activated with the name of an external procedure as its main parameter. It searches the "compile unit libraries" (following search rules defined by the JCL -command- author) and starts resolving the names contained in the linkage section. The linker continues that quest until the only names left to resolve are system names (conventionally of the form H_xxxxxxx). The linkage sections are merged into a "linkage segment" and the initial stack structure is built. The linker is also responsible for allocating segment numbers to the procedure and data segments (according to conventions set in JCL).
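A very small sketch of that resolution loop follows, with hypothetical structures and a stub in place of the library search; the real linker repeats the loop as newly found compile units bring in new linkage sections.

```c
#include <string.h>
#include <stdbool.h>

typedef struct extern_ref {
    const char        *name;       /* unresolved name from a linkage section */
    bool               resolved;
    struct extern_ref *next;
} extern_ref;

/* Stub standing in for the search of the compile unit libraries. */
static bool search_libraries(const char *name)
{
    (void)name;
    return true;
}

/* Resolve everything except system names (H_xxxxxxx), which are left for
   the loader to bind at load time. */
void resolve_linkage_section(extern_ref *refs)
{
    for (extern_ref *r = refs; r != NULL; r = r->next) {
        if (strncmp(r->name, "H_", 2) == 0)
            continue;                       /* deferred to load time */
        r->resolved = search_libraries(r->name);
    }
}
```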

The last binding of names is made at load time. The loader binds the system names remaining in the load module to segment numbers (and entry points) set up at system generation time.

The preceding scheme implies that each external procedure corresponds to at least one segment. To avoid a fast exhaustion of the segment number resource, another facility named the "Binder" was introduced; it merges several compile units into a single one. The bound compile units must have the same privileges and a good chance of being used together. For some time the Binder was only used internally, but it was eventually delivered to large users, especially FORTRAN customers.

Building of the load module

The linker had the responsibility of binding together user procedures and library subroutines and of preparing a "program" for execution, trying to perform as much binding as possible before execution without compromising device independence. Linked programs were named "load modules" and stored in load module libraries. Load modules are completely relocatable, thanks to segmentation, and they can be moved from system to system and usually from release to release.

The linker was somewhat controversial from the start, because it built the thread structure of the job step by walking across references. The linker was the builder of the address space of the set of threads making up the job step. Later, the static linker was complemented by a dynamic linker binding dynamically linked load modules. FORK was not a basic GCOS primitive; the linker set a maximum number of threads for the job step.

File system

Level 64 specifications called for file-level compatibility with contemporary systems: Honeywell H-200, GE-100 and IBM S/360. In those systems, files might even be unlabeled and known only by their physical position. There was a requirement that GCOS support the IBM DOS disc format. Actually, the S/360 disc pack format was robust, contrary to those of its competitors, which were inherited from a world of non-removable discs. GCOS adopted the physical and logical format of the S/360 in terms of labels and data formats.

The files needed by the operating system were the backing store (the location of the virtual memory, containing all the address spaces of running system and user threads), the bootload file, spool files for input and output, and program libraries containing object code (compile units to be linked), load modules...
Those files were contained in a system volume and labeled with conventional names. Compile units and load modules were not strictly files but subfiles, stored in containers that were visible to utilities as separate files. There was some controversy about that concept of subfiles, but as conventional files were allocated on track boundaries, there would have been a storage loss if they had been files. Another consideration was to avoid eventually implementing an ACL (access control list) for each program element. So subfile libraries were implemented as "members" of files formatted in QAM (queued access method), allowing the recovery of space inside tracks.

User files, on tapes or discs, were labeled and known by the association of a volume name and a file name. Labels were gathered into one special file per volume, in an IBM-compatible format. User files are allocated an integer number of tracks and are addressed as CCCTTRR at the physical record level. The size of records is the file creator's responsibility; it could be constant at file level or variable, each record being preceded by a header (count) containing its length (in binary). In addition to that formatting for sequential files, GCOS supported the IBM indexed-sequential format (CKD, count key data), which adds two physical records to the actual data records and allows an off-line search of the key field by the disc controller. This ISAM formatting progressively became obsolete as main memory size grew faster than controller processing power.
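For illustration, the device-dependent CCCTTRR form of a physical record address might be pictured as follows; the field widths are assumptions made for the sketch.

```c
#include <stdint.h>

typedef struct {
    uint16_t cylinder;   /* CCC: cylinder number                  */
    uint8_t  track;      /* TT : track within the cylinder        */
    uint8_t  record;     /* RR : physical record within the track */
} ccc_tt_rr_address;
```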
In GCOS, ISAM formatting was replaced by the UFAS (Unified File Access System) organization, which presented a data zone formatted in blocks of fixed size (at file level). UFAS was used for the sequential, direct and indexed sequential access methods (the latter being modeled on IBM VSAM). IDS-2 database files were mapped on direct access UFAS files. An important peculiarity of UFAS files was that, in the mid-1980s, they became supported by a centralized software cache and by a centralized locking facility (GAC) that allowed file locking at block level and that could serve batch as well as transaction processing applications. In addition, that facility was later used to implement dual-copy databases on coupled systems.

A catalog was implemented in the following years to handle more conveniently the profusion of files on discs and to spare the operations staff writing JCL a knowledge of the contents of all off-line volumes. The catalog associates a long name (44 characters) with the volume name, and the combination of VTOC and catalog allows the system to know the physical address of the beginning of a file. A one-bit flag in the VTOC requested the system to look in the catalog to check access rights. Access rights were given to accounts (typically a set of users). No naming restriction was required, but eventually the creator name was usually included in the file name (to maintain upward compatibility with GCOS 8). The use of the catalog remained an installation option; some customers did not want to modify their tape-inspired operation.
The catalog organization, which originally took advantage of the CKD format, was reimplemented in the 1980s to remove that dependency by using a version of UFAS files with the same basic functionality as before. The history of data management in GCOS has been somewhat hectic, because of the need to support externally defined files and because the new methods were separately licensed to customers, disallowing their use for system purposes.

With the advent of non-removable discs, cataloged operations became more popular and features like disc quotas limiting the space allocated to each account were requested and eventually implemented.

Discs and files could be shared between two distinct GCOS systems, usually for back-up reasons, thanks to a locking mechanism on the VTOC entry (label). A read-write channel program was able to seize control of the volume without being interrupted. Those "semaphores" represented too much overhead to be used efficiently for anything but exclusive access to a file (or table) and were limited to two systems.

The basic disc format was modified with the advent of the fixed block format in the late 1980s. CKD functionality was discarded in favor of fixed sectors, and previous discs were reformatted in fixed blocks. Some concerns were raised about the integrity of logical records spanning several blocks, which could have been compromised by a system failure in the gaps between block WRITEs. Critical files were given heading/trailing time stamps to close that low-probability risk. Customer migration to the new formats took time and both formats remained supported for a long time.

A disc mirroring facility was implemented in the late 1980s. On specific volumes supporting the fixed disc format, files were maintained in two copies. This allowed immediate recovery from media errors and slightly faster read access. The facility allowed mirror discs to be shared between two systems and provided dynamic reconstruction of damaged files without stopping operation. During the period of reconstruction, the affected part of the file system was journalized to secure the database, as in the absence of the dual copy.

Data management 

As said above, files were never accessed as a "contiguous string of bytes" but as "logical records" through an access method. The access method itself contained three parts:
the first, at OPEN time, set up the control features of the file (filling the file control block) and bound the file index or "file code" (known to the user program) to the file control block;
the second, dynamically invoked by the application program, performed the data transfers between the program address space and the buffer area;
the third handled the buffer pool and issued the physical I/O requests to the media contained in the supporting device.

The first function often required many activities and was usually performed in several steps. Binding a file code to a real file was specified in JCL. JCL interpretation might cause a request to the operator to mount media, while the job step stayed in limbo. Mounting caused the invocation of volume recognition, which bound the file name to the device and allowed the retrieval of the label or catalog attributes. Finally, the OPEN procedure verified the coherence of all that information and performed the initial positioning of the file. The OPEN and CLOSE services were implemented as system procedures in the user thread; other functions were packaged within system process groups. The OPEN/CLOSE services had to retrieve information prepared by the JCL interpretation of ASSIGN and ALLOCATE statements, binding the generic data management calls to the appropriate data management routines, the channel program templates, and the physical channel and logical channel/device characteristics.

The GET and PUT services (called READ/WRITE/UPDATE in some programming languages) essentially performed a MOVE between the buffer area and the program working area. For sequential file processing, the number of those statements far exceeded the number of I/Os. They were implemented as procedures within the address space of the program. After a few releases, it was decided to shortcut the protection mechanism that originally surrounded those services and to implement them as in-line macros that are part of the application code. They had access only to a part of the file control block (the current address) and their transfer into the user address space did not compromise the integrity of the file system.
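A minimal sketch of a sequential GET, assuming fixed-length records and hypothetical structure and function names; the real access methods and buffer manager were of course far richer.

```c
#include <string.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    char   *buffer;        /* current buffer, a window into the file       */
    size_t  block_size;    /* block size, constant at file level           */
    size_t  record_size;   /* fixed logical record size for this sketch    */
    size_t  next_offset;   /* "current address" within the buffer          */
} file_control_block;

/* Stub standing in for the buffer pool manager and the physical I/O. */
static bool refill_buffer(file_control_block *fcb)
{
    (void)fcb;
    return false;          /* pretend end of file */
}

/* GET: move the next logical record into the program working area; most
   calls are a simple MOVE, a physical read only when the buffer is spent. */
bool get_record(file_control_block *fcb, void *work_area)
{
    if (fcb->next_offset + fcb->record_size > fcb->block_size) {
        if (!refill_buffer(fcb))
            return false;  /* end of file or irrecoverable error */
        fcb->next_offset = 0;
    }
    memcpy(work_area, fcb->buffer + fcb->next_offset, fcb->record_size);
    fcb->next_offset += fcb->record_size;
    return true;
}
```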

The reading and writing of buffers worked in non-privileged mode (ring 1) and did not have access to structures like the logical channel tables. However, it built "tentative channel programs" that set out the sequence of channel commands required for an I/O and filled them with virtual memory addresses and relative addresses inside the secondary storage space of the file. It was not able to compromise the integrity of other files or of other programs.

The last function was the responsibility of the Physical I/O services, which absolutized the contents of the channel program and sent a connect signal to the IOC. The subsequent functions were performed by the IOC and the peripheral controller: they dequeued the requests on the logical channel and initiated the transfer between the peripheral and the buffer(s).

All the services listed above were initially performed inside the application thread. Ring 1 buffer management and ring 0 physical I/O management implied ring crossings, but not process dispatching, which would have been more expensive.

The first generation of GCOS data management focused on data organizations that used media-dependent information present in older systems. Those organizations were collectively named BFAS (Basic File Access System). A peculiar variant of them was HFAS (Honeywell File Access System). While pure sequential files did not include device-dependent data, BFAS assumed that physical record boundaries had a meaning for the application data. Direct access and indexed sequential files contained relative addresses within the file expressed as CCCTTRR (cylinder, track, physical record number). Indexed sequential files used the key field of the CKD format for retrieving data inside a disc cylinder by the disc controller firmware.

A second generation, designed soon after GCOS's first delivery, was named UFAS, Unified File Access System ["Unified" stood for a compatible organization between GCOS 66 and GCOS 64]. UFAS is characterized by the removal of device-dependent data from within files and by the use of constant-size blocks inside a file (although the customer was allowed to attribute different block sizes to different files). UFAS included a re-implementation of disc sequential files with functionality similar to BFAS but using a constant block size for each file. UFAS's main improvement was a complete re-implementation of indexed sequential access modeled on IBM VSAM. Indexes were stored in a special region of the files, and record addresses were defined by a device-independent value: a block (renamed control interval, after IBM) number and a relative record address within the block. Indexed UFAS provided a control interval splitting facility that eased the insertion of new data in the file. A heavy insert load might require a dynamic UFAS file reorganization, which could be done in a separate thread, in parallel with normal file access and updates. Control interval reorganization was just another multi-threading activity on the file.
UFAS control structures were also used to build databases accessed through IDS-II (Integrated Data Store), the Honeywell implementation of the CODASYL DBTG feature, also available on GCOS 55 and GCOS 6 (and offered by Cullinet on IBM and by ICL on their own systems). IDS-II supported multiple record types stored in a common data structure privileging the locality of related data [instead of storing data in separate tables and multiplying the file control structures and buffers, as in relational databases]. Relations between IDS-II data records were implemented through explicit chaining pointers expressed in UFAS form. IDS-II primary records were retrieved either through explicit indexing or via "computed" indexing, where the value of a field was hashed to produce the UFAS address. When two addresses collided, a new control interval, chained to the control interval designated by the hashed index, was created to store the new record.
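A small sketch of that "computed" placement follows, with an illustrative hash and control-interval chaining; none of the names, field widths or the hash come from the actual IDS-II implementation.

```c
#include <stdint.h>
#include <stddef.h>

/* A UFAS-style device-independent address: control interval number plus a
   relative record within it. */
typedef struct {
    uint32_t control_interval;
    uint16_t record;
} ufas_address;

typedef struct control_interval {
    uint32_t                 number;     /* UFAS control interval number    */
    struct control_interval *overflow;   /* interval chained on collision   */
    /* ... records stored in the interval ... */
} control_interval;

/* Hypothetical hash of a key field onto the primary control intervals. */
uint32_t calc_hash(const char *key, size_t len, uint32_t n_intervals)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = h * 31u + (uint8_t)key[i];
    return h % n_intervals;
}

/* On collision with a full interval, a new control interval is chained to
   the one designated by the hashed index. */
void chain_overflow(control_interval *hashed_ci, control_interval *new_ci)
{
    new_ci->overflow = hashed_ci->overflow;
    hashed_ci->overflow = new_ci;
}
```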
[Charlie Bachman, the designer of IDS, then at Honeywell, had dreamt that the whole system could be based on an IDS database, with the IDS primitives and the related storage management coded in the micro-kernel. In fact, several obstacles stood in the way of that option: it was almost impossible to freeze a durable schema for all the operating system functions, and the CODASYL approach required such a freezing; and marketing saw IDS as an important asset against IBM's DL/1 and wished to unbundle the IDS program.]
From an operating system point of view, the initial implementation of UFAS and IDS followed the BFAS design. Data access methods were services implemented as shared procedures within the user threads, and dispatches were not needed before waiting for the next records (when record feeding lagged behind the processing of logical records). Several buffers were allocated to each file, providing a primitive file cache in virtual memory. As virtual memory used the same devices as the file system, it was not really useful to "page" buffers to secondary storage. Multiplying the number of buffers was also a drain on the number of segments, a critical resource in GCOS. Files private to an application were assigned local (type 2) segments for their buffers, while files dynamically shared by several process groups had their buffers in shared (type 0) segments.

The constraints of such a requirement led to implementing several new functions as servers. The back-end part of general data management, the buffer pool management, was the most spectacular. In that case, the server implementation helped to overcome some architectural limits (the number of buffers in the application address space) already suspected at the time of the original design.

MLDS (Multiple Level Data Store) was another data organization, introduced in the 1980s for compatibility with the database capability of GCOS 62/4 (and Level 61). MLDS was a multi-indexed data organization complemented by the idea of complementary records chained to primary records.

Relational databases started to become competitive in the early 1980s. Bull decided to port the Oracle database system to GCOS rather than port the MULTICS or GCOS 8 relational facilities that had just been introduced. The Oracle implementation was different from the old access methods. An Oracle server (a GCOS process group) received SQL requests from its clients (batch or IOF programs, or the TDS transaction server). The source code used by the port remained Oracle Corp. property, and Bull was only responsible for the compilation and for coding the environment of the server. Several successive Oracle releases were ported; the first used the GNU C compiler, the others used Bull's own compiler.

Data compression, a technique appearing in the mid-1980s and popularized in the personal computer world, was absent from the GCOS operating system itself and limited to the Open 7 world. Apart from the issue of proprietary rights in the initial algorithms, the file organizations in GCOS usually mixed control information and user data, and it would have been too expensive in terms of real memory at that time to expand a compressed file in the buffer manager.

Security of Data

GCOS did not include integrated encryption technology. In fact, Bull emphasized the encryption of data communications using a low-security secret key provided by CP8 smart cards. But the encryption of files (by a PGP-like mechanism, for instance) was absolutely prohibited by French law until 1997, French law having been the most restrictive in the world.
In that context, no attention was given to the implementation of an encrypt/decrypt co-processor inside the CPU, à la IBM S/390, that could have been used to keep data secret in the file system, including the buffer pool.

Bull hoped to take advantage of the smart-card technology that it pioneered to improve security on GCOS. Smart cards could be used as a convenient way to store the user-id and password, limiting the risk of exposing passwords to external view. Actually, smart cards allowed "secret keys" to be conveyed over communications lines invisibly to an intruder. But as the encrypted data also hid control information, the Secure Access 7 feature was of limited use in network communications.

I/O in GCOS

Contrary to many systems, the I/O interface with software was architectured in a compatible manner. Peripheral devices were seen as logical channels. The logical channel was seen as a special processor with a specific instruction set, the channel program, made of sequential instructions named channel control entries (CCE). In addition to READ and WRITE instructions to the peripheral, CCEs included a SENSE instruction, a BRANCH instruction and synchronization instructions operating on semaphores.
From the software point of view, each logical channel was seen as an independent processor, while several levels of multiplexing-demultiplexing might occur inside the I/O part of the hardware system.
Actually, channel program execution was distributed between the main processor, its IOC and the peripheral controller.
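For illustration only, a CCE and a tiny channel program might be sketched as follows; the opcode names, fields and the semaphore reference are assumptions, not the documented CCE format.

```c
#include <stdint.h>

enum cce_opcode { CCE_READ, CCE_WRITE, CCE_SENSE, CCE_BRANCH, CCE_SIGNAL };

typedef struct {
    enum cce_opcode op;
    uint32_t        address;   /* real memory address, absolutized by the   */
                               /* Physical I/O service before activation    */
    uint32_t        count;     /* byte count, branch target or semaphore id */
} cce;

/* A hypothetical three-entry channel program: sense the device, read one
   block, then signal the semaphore awaited by the I/O termination thread. */
static const cce read_one_block[] = {
    { CCE_SENSE,  0u,        0u    },
    { CCE_READ,   0x040000u, 4096u },
    { CCE_SIGNAL, 0u,        42u   },   /* 42: an arbitrary semaphore id */
};
```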

Channel programs used physical (real) memory addresses. Some consideration was given to a dynamic translation of virtual addresses in the IOC channel. However, as buffers had anyway to be fixed in memory during data transfers and the dynamic translation would have caused a synchronization burden between the virtual memory management and the IOC, it was decided to let the IOC work with physical addresses. A firmware-assisted translation of channel program addresses was performed by the I/O initialization services before the channel program activation.

GCOS I/O buffers had to be locked in real memory. However, the data chaining mechanism made it possible to transfer a single data block (in the peripheral) into or from several main memory areas. This "data chaining" mechanism was used to isolate control information present in the data block from the buffer accessible by the application program. It was used intensively by communications I/O.
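The following small C sketch illustrates the data chaining idea under the same caveat: the chain_elem structure and the 16-byte control area are invented for the example; only the principle -one peripheral block scattered over several memory areas so that control information stays out of the application buffer- comes from the text above.

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* One element of a hypothetical data chain: where the next 'count' bytes
       of the single peripheral block should land in real memory. */
    struct chain_elem {
        void    *addr;
        uint16_t count;
    };

    /* Simulate the arrival of one 512-byte block from the device, scattered
       over the chain so that the leading control bytes never reach the
       application buffer. */
    static void scatter_block(const uint8_t *block, size_t len,
                              const struct chain_elem *chain, int n)
    {
        size_t off = 0;
        for (int i = 0; i < n && off < len; i++) {
            size_t take = chain[i].count;
            if (take > len - off)
                take = len - off;
            memcpy(chain[i].addr, block + off, take);
            off += take;
        }
    }

    int main(void)
    {
        uint8_t device_block[512];    /* block as read from the peripheral   */
        uint8_t control_info[16];     /* kept out of the application's sight */
        uint8_t user_buffer[496];     /* what the application actually sees  */
        memset(device_block, 0xAB, sizeof device_block);

        struct chain_elem chain[] = {
            { control_info, sizeof control_info },
            { user_buffer,  sizeof user_buffer  },
        };
        scatter_block(device_block, sizeof device_block, chain, 2);
        printf("control bytes: %u, user bytes: %u\n",
               (unsigned)sizeof control_info, (unsigned)sizeof user_buffer);
        return 0;
    }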

Peripheral error recovery was assured by PIAR (Peripheral Integrity Assurance Routines), called on abnormal termination of I/O instructions. Those device-dependent programs filtered such events into application-significant ones (e.g. end-of-file on sequential files), irrecoverable errors (that led to application "check pointing" or termination) and recoverable errors, where the PIAR either was able to auto-correct the I/O (e.g. by using ECC) or reinitiated the channel program.
Disc and console PIAR were resident in main memory; other PIAR were swappable to backing store.

 

Resources allocation

The goal pursued initially was to schedule an application only when the allocation of the resources needed for its successful termination was assured. That avoided a deadly embrace (deadlock) between job steps fighting for the same resources. Needed resources, with the exception of real memory and of files stored on resident volumes, had to be reserved by the job scheduler from indications present in the JCL. Requested resources were compared to the available resources in the same fixed order and relinquished in totality if a case of non-availability was encountered. Of course, in such a case, the scheduling priority of the step was raised to increase its chances of being scheduled as soon as possible.
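A minimal sketch of that all-or-nothing, fixed-order reservation policy follows, assuming an invented table of resource counts; the function names and the resource classes are not the actual GCOS structures.

    #include <stdbool.h>
    #include <stdio.h>

    #define NRES 4  /* resources compared and reserved in one fixed, system-wide order */

    static int available[NRES] = { 2, 1, 1, 3 };   /* e.g. tape drives, disc packs ... */

    /* Try to reserve everything a job step asked for in its JCL.
       If any resource is short, release all reservations already taken:
       the step holds either all of its resources or none of them. */
    static bool reserve_step(const int request[NRES])
    {
        int taken[NRES] = { 0 };
        for (int r = 0; r < NRES; r++) {
            if (request[r] > available[r]) {
                for (int k = 0; k < r; k++)      /* relinquish in totality */
                    available[k] += taken[k];
                return false;
            }
            available[r] -= request[r];
            taken[r] = request[r];
        }
        return true;
    }

    int main(void)
    {
        int step1[NRES] = { 1, 1, 0, 1 };
        int step2[NRES] = { 2, 0, 2, 0 };   /* asks for more than remains */

        printf("step1 scheduled: %s\n", reserve_step(step1) ? "yes" : "no (priority raised)");
        printf("step2 scheduled: %s\n", reserve_step(step2) ? "yes" : "no (priority raised)");
        return 0;
    }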

Virtual Memory

Most of the virtual memory is known and allocated at linkage time. The scheduler has just to be sure that the sum of the virtual space does not exceed the size of the swapping backing store. However, there are segments whose length is not known statically, notably the user stack segment. Stack overflow was initially handled by letting the stack segment grow up to 64KB and, if that size was exhausted, by "paging" the stack, allocating a new segment out of the thread private segment table.

Real Memory

The virtual memory does not mask the need for real memory. Some operations (I/O transfers) require segments to be nailed down in real memory. It was generally inefficient to allow buffers to be swapped to backing store even when no transfer was currently being performed on them. Control structures used by the dispatcher and by memory management must be resident. Some instructions dealing with a pointer chain might involve several segments and might lock up the processor if they caused several missing-segment faults. More generally, each step required a certain amount of real memory for its working set. JCL did specify a minimum amount of real memory and an optimum amount of memory for its operation. Those numbers were used by the job step scheduler to start the step loading or to delay its initiation.

Bull did not use the "extended memory" that was available on NEC systems. That extended memory was cheaper, but had a longer access time and was not addressed at byte boundary but in chunks of 64 or 4K bytes. It was in effect non-uniformly addressable memory (NUMA) and was used as a drum substitute. It could be used synchronously by MOVE instructions or asynchronously using DMA via the IOC. Its usage was more judicious in NEC high-end machines, where "main memory" was implemented in expensive SRAM, than in Bull machines, where SRAM was not used and DRAM became progressively cheaper and cheaper.

Memory management

Paging was introduced only in the mid-1980s; until then memory management relied only on segmentation. At loading time, segments from the load module were loaded into a large file named the Backing Store. To each loaded segment was appended a header containing the access rights and the segment type. The header also included the backing store relative address of the segment frame (the aggregate composed of the header and of the addressable segment).

Problems usually encountered in memory management are those of fragmentation: the waste of memory caused by the alignment of programs and data at a memory block (segment or page frame) boundary (internal fragmentation) and the waste of memory caused by variable-size segments having to be allocated in large enough holes (external fragmentation). The initial implementation of virtual memory, with segments of variable size aligned to a 16-byte (later 64-byte) boundary, reduced internal fragmentation considerably. The variable segment size was harder to handle and sometimes required triggering a "compaction" of memory (i.e. moves of segments to increase the size of the "holes"). Hardware was optimized for efficient moves, decreasing the load of such a procedure.

The loader stored the process group control block segment in physical main memory. That segment included the segment tables for the threads of the process group. The loader also loaded some segment frames in main memory according to residency criteria defined at linkage time.

When memory management (VMM) was called -by a missing-segment fault in a segment table- VMM looked for free main memory to load the requested segment frame. It selected the smallest memory area able to accommodate the segment, updated the segment table, loaded the segment frame and reorganized the list of memory "holes". If the "holes" were not large enough to store the segment frame, VMM attempted to compact main memory (moving segments and updating segment tables), a process that was faster than backing store I/O in Level 64 configurations. If no fitting holes were found, swapping out segments was needed: non-modifiable segments were never written back to backing store, so it was cheaper to just mark code segments as "missing" and to overlay their segment frames with the new one. If VMM had to swap out constant-size data segments, it retrieved the backing store address of the segment from the segment header and copied the segment to backing store. If the segment size had increased, the backing store address had to be altered.
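The decision path just described can be condensed into the following toy model; hole bookkeeping, sizes and names are assumptions, and the backing store I/O is reduced to print statements.

    #include <stdio.h>

    /* Toy model of the VMM decision path on a missing-segment fault.
       Hole/segment bookkeeping is reduced to sizes only; all names are invented. */

    #define MAX_HOLES 8
    static int hole[MAX_HOLES] = { 12, 40, 8, 0, 0, 0, 0, 0 };  /* free areas, in KB */

    static int best_fit(int need)
    {
        int best = -1;
        for (int i = 0; i < MAX_HOLES; i++)
            if (hole[i] >= need && (best < 0 || hole[i] < hole[best]))
                best = i;
        return best;
    }

    /* Compaction: moving segments gathers all free space into one hole. */
    static void compact(void)
    {
        int total = 0;
        for (int i = 0; i < MAX_HOLES; i++) { total += hole[i]; hole[i] = 0; }
        hole[0] = total;
        puts("  compacted main memory (segments moved, segment tables updated)");
    }

    /* Swap-out stub: pretend a victim segment frame of 'size' KB was freed.
       Pure code segments would simply be marked missing; modified data
       segments would first be copied back to backing store. */
    static void swap_out(int size)
    {
        hole[0] += size;
        printf("  swapped out a %d KB victim segment\n", size);
    }

    static void segment_fault(int need)
    {
        printf("fault: need a %d KB segment frame\n", need);
        int i = best_fit(need);
        if (i < 0) { compact(); i = best_fit(need); }
        if (i < 0) { swap_out(need); i = best_fit(need); }
        hole[i] -= need;
        printf("  loaded segment frame from backing store, %d KB left in that hole\n", hole[i]);
    }

    int main(void)
    {
        segment_fault(10);
        segment_fault(48);   /* forces compaction  */
        segment_fault(64);   /* forces a swap-out  */
        return 0;
    }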

The size of the backing store was much larger than the physical memory, allowing many programs to be loaded in backing store in a dormant state. Thrashing, i.e. a high rate of segment swapping, could occur due to a large "working set" of segments; GCOS reacted by reducing the priority of the offending threads.

When main memory sizes reached several megabytes and backing store performance was not improved in parallel, virtual memory management based on segmentation became more expensive, especially memory compaction. With the Aquila project, which introduced a NEC-designed processor at the top of the DPS-7 product line, GCOS VMM was modified to rely chiefly on paging for memory management. Segment swapping was still maintained for "overlay" management of well-behaved applications.

Files resources

Files are allocated to a job step; they can be passed from job step to job step. The ASSIGN command performs the binding of a file symbolic name (file code) and the real name of the file in the installation. The real name is either a catalogued name, or a name specified by the volume name and label, or even an external name known by the operator (unlabeled tapes). For sequential files, the program ignores the device type until the processing of the ALLOCATE command. The ALLOCATE command requests the mounting of files stored on removable devices (tapes, disc packs) after verifying that they are not already mounted and available. The system relies on automatic volume recognition (AVR), letting the operator use any of the empty drives.
The type of access (READ, READ EXCLUSIVE, WRITE, UPDATE, WRITE EXCLUSIVE) has to be specified in the JCL command.

The strategy is to allocate all resources needed for the execution of a step before starting the process group representing its execution. This does not apply to the libraries that are created dynamically nor to system files created on behalf of the step.

A usage count is used for files opened simultaneously by several steps. Files may be dismounted only when unused. The operator may have to kill offending steps monopolizing a file (or device) resource.

As some steps, such as transaction processing (TDS), run almost permanently, the resources they hold have to be managed by the customer's operations staff in terms of access rights and concurrent access with other steps.

Device resources

Some resources (essentially discs) are shared between several process groups, but most devices are used exclusively by application programs or by system utilities. So, most data from or to the devices are spooled by system process groups (input reader, output writer).

Inter process Communication

An important element of the GCOS architecture is inter-process communication and inter-process-group communication. The first is architectured using micro-kernel features. Communication between the system process group (J=0) and the others also uses dedicated semaphores and shared memory. Communication between not-necessarily-resident process groups used a mechanism going through the file system (queued subfiles).

Semaphores

GCOS semaphores were designed after Dijkstra's primitives. A semaphore is an architectured entity containing a lock and a count. The creation of a semaphore is usually static (either as a system semaphore, part of the application program interface, or as a local semaphore set up by the linker). The operations available on semaphores are the P-operation, the V-operation and a P-Test operation.
The lock is set by a P-operation. If the lock is already set, the P-operation increments the count, puts the thread in wait state and links its id into the semaphore queue.
The V-operation decrements the count and, if it becomes null, reschedules the first thread of the waiting queue (i.e. puts it in the ready queue).
The P-Test is a complementary operation that allows the thread not to be stopped when the semaphore is locked. It is used to implement a WAIT on multiple events and similar features.

Semaphores are handled by the micro-kernel in relation with the thread dispatcher.
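For readers more familiar with modern threading interfaces, here is a conventional Dijkstra-style P/V sketch written with POSIX threads. It only illustrates the contract of the operations; the GCOS micro-kernel kept its own count and queue conventions (described above) and ran in firmware, and all names here are assumptions.

    #include <pthread.h>
    #include <stdio.h>

    /* A conventional Dijkstra counting semaphore built on POSIX threads. */
    struct sema {
        pthread_mutex_t lock;
        pthread_cond_t  queue;   /* stands in for the semaphore's waiting queue */
        int             count;
    };

    static void sema_init(struct sema *s, int initial)
    {
        pthread_mutex_init(&s->lock, NULL);
        pthread_cond_init(&s->queue, NULL);
        s->count = initial;
    }

    static void P(struct sema *s)                 /* may block the calling thread */
    {
        pthread_mutex_lock(&s->lock);
        while (s->count == 0)
            pthread_cond_wait(&s->queue, &s->lock);   /* thread put in wait state */
        s->count--;
        pthread_mutex_unlock(&s->lock);
    }

    static int P_test(struct sema *s)             /* non-blocking variant (P-Test) */
    {
        pthread_mutex_lock(&s->lock);
        int ok = (s->count > 0);
        if (ok) s->count--;
        pthread_mutex_unlock(&s->lock);
        return ok;
    }

    static void V(struct sema *s)                 /* reschedules one waiting thread */
    {
        pthread_mutex_lock(&s->lock);
        s->count++;
        pthread_cond_signal(&s->queue);
        pthread_mutex_unlock(&s->lock);
    }

    static struct sema sem;

    static void *worker(void *arg)
    {
        P(&sem);
        printf("worker %ld entered the critical region\n", (long)arg);
        V(&sem);
        return NULL;
    }

    int main(void)
    {
        pthread_t t[3];
        sema_init(&sem, 1);
        printf("P-Test before any contention: %d\n", P_test(&sem));
        V(&sem);
        for (long i = 0; i < 3; i++) pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < 3; i++)  pthread_join(t[i], NULL);
        return 0;
    }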

Semaphores with Messages

Many semaphores concern system services: server threads wait for new requests on a system semaphore. Such semaphores may carry messages (16-byte strings) that are queued by client threads performing V-operations. The issuing by the server thread of a P-operation on a semaphore with messages blocks the server if no request (V-op) is pending, and returns the first message in the queue if some have been notified.

Messages are usually pointers to more detailed data structures. The semaphore messages are stored in a special system segment used only by the micro-kernel.
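A sketch of the same idea as a blocking message queue, again with POSIX threads; the 16-byte message size comes from the text above, the rest (queue depth, names) is assumed for the example.

    #include <pthread.h>
    #include <string.h>
    #include <stdio.h>

    #define MSG_SIZE  16     /* messages are short fixed-size strings (16 bytes) */
    #define QUEUE_LEN 32     /* illustrative depth only                          */

    struct msg_sema {
        pthread_mutex_t lock;
        pthread_cond_t  nonempty;
        char            q[QUEUE_LEN][MSG_SIZE];
        int             head, tail, count;
    };

    /* Client side: V-operation with a message (typically a pointer or a small
       request descriptor) notifying the server that work is pending. */
    static void V_msg(struct msg_sema *s, const char msg[MSG_SIZE])
    {
        pthread_mutex_lock(&s->lock);
        memcpy(s->q[s->tail], msg, MSG_SIZE);
        s->tail = (s->tail + 1) % QUEUE_LEN;
        s->count++;
        pthread_cond_signal(&s->nonempty);
        pthread_mutex_unlock(&s->lock);
    }

    /* Server side: P-operation that blocks while no request is pending and
       returns the oldest queued message otherwise. */
    static void P_msg(struct msg_sema *s, char out[MSG_SIZE])
    {
        pthread_mutex_lock(&s->lock);
        while (s->count == 0)
            pthread_cond_wait(&s->nonempty, &s->lock);
        memcpy(out, s->q[s->head], MSG_SIZE);
        s->head = (s->head + 1) % QUEUE_LEN;
        s->count--;
        pthread_mutex_unlock(&s->lock);
    }

    static struct msg_sema requests = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER };

    static void *server(void *arg)
    {
        char m[MSG_SIZE];
        (void)arg;
        for (int i = 0; i < 2; i++) {
            P_msg(&requests, m);
            printf("server thread handling request: %.16s\n", m);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t srv;
        pthread_create(&srv, NULL, server, NULL);
        V_msg(&requests, "OPEN FILE A     ");
        V_msg(&requests, "READ BLOCK 12   ");
        pthread_join(srv, NULL);
        return 0;
    }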

Shared memory

Type 0 data segments are within the address space of all process groups. Most of them have restricted access (Read only and/or ring number). Type 1 data segments are not directly addressable because that address space is dynamically allocated. However, the system maintains a table of shared data segments allowing the sharing of the Type 1 space.
Type 2 data segments are statically assigned to all the threads of the same process-group.
Type 1 and 2 writable segments have to be protected against concurrent access by a locking mechanism, usually locking semaphores.

Event management

An additional mechanism had been introduced in GCOS initially, but its role decreased with time because of its lower efficiency. Event Management was a software-supported (not micro-kernel) mechanism that stored messages (like message semaphores) in the file system (in queued files). The facility did not require the server to be active when the client posted its messages and extended the limits on the size and the number of messages.
However, the processing and elapsed time of operations on events caused this mechanism to be of limited use.

Dispatching of threads

Threads are either stopped, blocked or ready for execution. When a thread has been started, it passes under the control of the micro-kernel. The dispatching of the ready threads is made by the micro-kernel according to the thread priority; threads with the same priority are allocated slices on a round-robin basis.
Threads may be allocated a time slice after which they are interrupted and put at the end of the queue of the ready threads with the same priority. That feature was made available after a few years, when interactive applications became predominant.
Threads block themselves when they issue a P-operation on a semaphore already set by another thread. They become ready when another thread issues a V-operation on that semaphore.
So all normal dispatching was made under the efficient control of the micro-kernel, leaving to the GCOS scheduler only the initiation and the killing of threads.
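The priority-plus-round-robin policy can be illustrated by the following toy dispatcher loop; the ready-queue layout, the number of priority levels and the names are assumptions.

    #include <stdio.h>

    #define NPRIO    4      /* priority levels, 0 = highest */
    #define MAXREADY 8

    /* One ready queue per priority; each entry is just a thread id here. */
    static int queue[NPRIO][MAXREADY];
    static int qlen[NPRIO];

    static void make_ready(int prio, int tid)          /* V-operation side effect */
    {
        queue[prio][qlen[prio]++] = tid;
    }

    /* Pick the first thread of the highest-priority non-empty queue, "run" it
       for one time slice, then requeue it behind its equal-priority peers. */
    static void dispatch_one(void)
    {
        for (int p = 0; p < NPRIO; p++) {
            if (qlen[p] == 0)
                continue;
            int tid = queue[p][0];
            for (int i = 1; i < qlen[p]; i++)          /* pop the head            */
                queue[p][i - 1] = queue[p][i];
            qlen[p]--;
            printf("running thread %d (priority %d) for one slice\n", tid, p);
            make_ready(p, tid);                        /* round-robin requeue     */
            return;
        }
        puts("no ready thread: idle");
    }

    int main(void)
    {
        make_ready(1, 10);
        make_ready(1, 11);
        make_ready(3, 30);
        /* threads 10 and 11 alternate; thread 30 never runs while
           higher-priority work stays ready */
        for (int slice = 0; slice < 5; slice++)
            dispatch_one();
        return 0;
    }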

Initialization 

The hardware initialization of the system evolved somewhat over time, but the software bootload did not change deeply. The initialization of the hardware configuration (especially peripheral controllers) was completed under SIP (System Initialization Program), which was loaded by firmware from the boot device. Note that the modification of the boot device was made through operator commands to the service processor.

SIP's role was to read the system configuration table, to update it according to the actual hardware availability and to load the optional firmware components. After that phase, control was given to GCOS ISL, a real-memory version of the operating system that verified or built the resources needed by GCOS.

As data integrity was crucial to GCOS, all system malfunctions were handled by special machine integrity assurance routines (MIAR) that attempted to recover from possible memory, IOC and processor errors. In fact, the reliability of the hardware was much higher than on previous systems and there were not many occurrences of MIAR. MIAR operated as fault procedures called in the user threads (when the error was detected during the execution of processor instructions) or in asynchronous threads (when it occurred in I/O). In SMP configurations, MIAR had the possibility to remove a processor from the configuration and to siphon the cache of that processor. The dynamic reconnection of a processor was also possible from the service processor. As main memory had auto-correction capability (ECC), MIAR able to isolate blocks of memory were infrequently called. GCOS did not implement in software a relocation of the operating system control structures, so non-corrected memory errors usually caused a system termination. Memory checking initiated by the service processor led to the isolation of the faulty memory block.

Multi-processor, Partitioning and Multi-computer configurations

The first generation of GCOS systems supported only single-processor configurations; the architecture had been designed to support several processors (initially up to 8, later -in Artemis- up to 24) in a symmetric multiprocessor (SMP) configuration. Hardware became available in 1981. According to the architecture, the function of allocating CPUs to threads was under the responsibility of the micro-kernel. The operating system ran unchanged in single-processor or multiprocessor configurations. A few changes were needed just to improve performance by shortening the path of critical sections of the kernel protected by a lock semaphore.
The micro-kernel itself ran in all the processors, and when a thread relinquished control the firmware dispatcher looked in the ready queue to select the new thread to be given control. The ready queue was protected by a hardware lock that inhibited access to the related memory bank.
To avoid such dispatches contributing to flushing the L1 processor caches, some "affinity" between processor and thread was recorded in the thread status to give preference to a thread that might still have data in the cache. A similar procedure was used for dispatches on Artemis, which shared an L2 cache within each cluster of four processors.

The architecture provided a way to run specific threads on specific processors as a provision for specific features available in some processors. Such an asymmetry was unused until the mid-1990s, when different pricing was applied to the Artemis processors, accelerating the DB and Unix processors and slowing down the GCOS application processors.

With the availability of SMP on P7G, Siris 8 emulation coexisted with the first tests of GCOS 7 by converted customers. A facility was offered to partition the SMP hardware into two systems. IBM offered this feature on S/390 too, but several years after Bull. The same system could be operated during some shifts in a dual-processor configuration, and in others as two independent GCOS systems.

With Artemis that could support up to 24 processors, the partitioning facility was extended.

Distributed Processing & Networking

Although the hardware systems on which GCOS has run since 1979 had Distributed Systems in their name, GCOS is NOT in itself a distributed operating system. Its usage of heterogeneous peripheral processors was the pretext for this commercial name. But all the architectural entities of GCOS, with the exception of files, have a meaning only inside one system. Even when an MP hardware system was partitioned into several software systems, no mechanism existed to provide a direct communication between GCOSes. Studies were made in the 1980s to attempt to architecture Distributed Processing more efficiently in clusters of homogeneous systems; however those studies did not materialize. They faced the issue of implementing distributed processing across Honeywell and Bull heterogeneous systems under evolving standards (Bull-Honeywell DSA, ISO and finally TCP/IP). Even a standardized function like RPC -remote procedure call- was never architectured in GCOS.

Telecommunications

During the life of GCOS, many changes occurred in the field of telecommunications. In the 1960s, excluding the time-sharing applications that were not yet considered a market for general-purpose computers, two main usages were considered: remote batch operation, to allow the decentralization of input/output, and direct access to terminals, to allow data collection and other real-time processing. The idea of formalizing data collection in the programming system through a message queuing facility emerged in the early 1970s via the CTG of Codasyl. However, that approach hardly satisfied the operation of multiprogrammed systems and had to be integrated in the OS.
A major transformation was the universal acceptance of transaction processing, which revolutionized the industry. TP applications became the heart of business applications in the computer, batch processing retreating to ancillary activities.
Another revolution occurred in the late 1970s: the recognition of networks of computers. The illusion that a single system could cope with all the needs of a company or an administration faded, and networking became a key point in the architecture of a system. Until the mid-1980s, most manufacturers dreamed of dominating the customer's computing world and proprietary systems were developed. Finally, all ended up accepting open networks that allowed all types and makes of computers to be connected to a worldwide network.

GCOS basic architecture was used to build several implementations of products corresponding to that evolution of needs.

To support the message queuing facility of CTG, the GCOS COBOL compiler generated not only the calls to the SEND and RECEIVE procedures, but also the parameters of the process group handling the message processing. The launching of the COBOL application also triggered that of the message processing. Message queuing was done through access to a set of queued files that were filled or emptied by the Message Control Program.

Direct access was supported by a BTNS (Basic Terminal Network Support) process group handling terminals and accessed by the user program through shared memory (synchronized by semaphores). BTNS supported the communications lines placed under its control by JCL. All terminals located on the same communication line had to be allocated to BTNS.

After the CII merger, CII-HB and Honeywell embarked on the definition of a competitive solution against IBM's SNA: HDSA, the Honeywell Distributed Systems Architecture. HDSA differed from SNA by its orientation around a Network Processor instead of mainframes. HDSA was designed at the same time as the ISO OSI model and adhered somewhat dogmatically to that layered model. The presentation and application levels were to be implemented in terminals and hosts, while the interface between the NCP and the hosts would be the session layer. HDSA implemented only the "connected" mode, following the tradition of the PTTs that were already offering X.25 protocols. While many applications took advantage of that "connected mode" (Telnet, file transfer...), establishing a session for each transaction caused much overhead on the system.

The Network Processor was a Level 6 minicomputer used either as a front-end processor (FEP) of a GCOS mainframe or as a remote Network Processor. The interconnection between a DPS-7 and the FEP was via a PSI channel. It somewhat violated the HDSA architecture concepts, because it did not offer strictly a session interface but multiplexed the data exchanged over the opened sessions as packets transmitted between the FEP and a GCOS-resident program named FNPS. That FNPS program controlled the FEP PSI interface on a couple of logical channels on which it read and wrote data continually on looping channel programs. In addition to that, FNPS performed the bootloading and the initialization of the FEP when the FEP was disk-less -the general case for DPS-7 systems.

Distributed Applications

DSA, as HDSA was named after the nationalization of CII-HB, offered the following functions:

  • Terminal control for all direct access applications
  • File transfer and remote batch processing
  • Security
  • Cooperative transaction processing (mapped on SNA LU6.2)
  • Access to IBM SNA terminals -through Janus options in FEP-

HDSA had always been considered a stop-gap and a learning phase before a manufacturer-independent ISO standard was adopted. Unhappily for Bull and Honeywell, their intense lobbying to have the HDSA choices adopted by ISO was hurting not only IBM and Digital, which had their own standards (IBM improving SNA, and DEC adopting a "non-connected" mode), but also the scientific and university worlds, which were adopting en masse the TCP/IP and UDP/IP architectures. Some ISO standards were adopted, functionally close to DSA, but requiring new implementations. Bull and some of its customers speculated that ISO would take over eventually, but had their UNIX systems implementing TCP/IP. So, a progressive conversion to the world of Open Systems occurred in the late 1980s.
The DPS-6 minicomputer was discontinued and a new front-end processor for DSA was based on the 68000 architecture. The evolution started on the low-end DPS-7000 with a micro-FEP directly connected to the Multibus II I/O bus of those systems; however, firmware emulated a PSI interconnection. Only a part of the NCP software was ported from Level-6 to the 68000 architecture. In a second phase, the DPS-6 FEP and Network Processor were replaced by 68000 systems supporting DSA. In addition, similar hardware, now under the UNIX (SPIX) operating system, ensured the interconnection to the TCP/IP world.

GCOS did not initially support the TCP/IP protocols and applications natively. Instead, the TCP/IP world was attached to a UNIX subsystem, OPEN 7, operating under GCOS, which then communicated with the GCOS world.

Later, in 1996, a TCP/IP stack was implemented natively under GCOS, progressively reducing the role of DSA to the support of legacy terminals.

SECTION IV
Modes and Emulators

A specific feature of GCOS was its capability to operate in foreign modes, providing to the user an environment identical to the one he enjoyed (?) on his previous machine. Those modes ran simultaneously with GCOS native applications. While such simulators are now available for free -as simulators for 8-bit microcomputers on our PCs- the performance ratio between a today's 32-bit processor and the emulated machine is more than 10 to 1. In the case of GCOS emulators, the margin was significantly lower and the realization of emulators needed more features and more talent. The marketing objective was to emulate, with a positive margin, systems whose raw computing power was not inferior to 1/3 of the raw power of the GCOS hardware.
That hardware was based upon a 32-bit architecture and used a bank of registers. The strategy adopted for GCOS emulators was to use that hardware in conjunction with a dedicated firmware to interpret the instruction sequences of the emulated machine. No particular effort was made to reuse the native firmware. This architecture is similar to that used by the lower models of the IBM S/360, as opposed to the strategy used on the emulators of the S/360-65, where native instructions were monitoring each emulated instruction.
However, some emulated machines had features that could not be well emulated by the native hardware. In such cases, an additional feature was added to the hardware, either as standard (like op-code decoding in the P7G system) or as an option (like flag processing in the P7 system). Those features were ignored in native mode.
The emulator firmware processed all instructions, with some exceptions. The exceptions concerned features that had to be shared with the native programs operating simultaneously with the emulators (timers, I/O ...).
The general architecture of the emulator was to encapsulate the emulated program into a native-mode program that monitored the emulator launching and processed the functions not executed by the emulator firmware. The emulated memory was included as a native segment in the "emulator" process group. That "emulator" was in general multi-threaded, allowing a more effective processing of I/O than on the emulated machine (for instance by increasing the buffer capacity, or by relinquishing control to the emulated program without waiting for I/O completion). Mapping the I/O devices of the emulated machine onto faster native devices was also a way to achieve better overall performance.
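The encapsulation idea can be illustrated by a schematic fetch-decode loop: ordinary emulated instructions are interpreted (in GCOS, by firmware), while I/O is trapped to the native support code. The emulated machine below, its opcodes included, is entirely invented for the example.

    #include <stdint.h>
    #include <stdio.h>

    /* Invented emulated machine: a handful of opcodes, a tiny memory image.
       In GCOS the inner loop was firmware; only the traps reached software. */
    enum { OP_HALT = 0, OP_LOAD, OP_ADD, OP_STORE, OP_IO };

    static uint8_t mem[256];      /* emulated memory, held in a native data segment */
    static uint8_t acc;           /* emulated accumulator                           */

    /* Trap handler standing in for the native support threads of the emulator
       process group (buffered I/O on native devices, device mapping, etc.). */
    static void trap_to_support(uint8_t device)
    {
        printf("trap: I/O request for emulated device %u handled natively\n", device);
    }

    static void emulate(void)
    {
        uint8_t pc = 0;
        for (;;) {
            uint8_t op  = mem[pc];
            uint8_t arg = mem[pc + 1];
            pc += 2;
            switch (op) {
            case OP_LOAD:  acc = mem[arg];                   break;
            case OP_ADD:   acc = (uint8_t)(acc + mem[arg]);  break;
            case OP_STORE: mem[arg] = acc;                   break;
            case OP_IO:    trap_to_support(arg);             break;  /* exception path */
            case OP_HALT:  return;
            }
        }
    }

    int main(void)
    {
        /* small emulated program: mem[32] = mem[30] + mem[31]; then an I/O */
        mem[30] = 4; mem[31] = 5;
        uint8_t prog[] = { OP_LOAD, 30, OP_ADD, 31, OP_STORE, 32, OP_IO, 1, OP_HALT, 0 };
        for (unsigned i = 0; i < sizeof prog; i++) mem[i] = prog[i];
        emulate();
        printf("emulated result: %u\n", mem[32]);
        return 0;
    }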

GE-100 Emulator

The GE-100 system emulated on Level-64 was a batch second-generation system with card input, printer output and files located on tapes or discs. Sequential processing was the rule, but optional discs were often used in "direct access", by their physical addresses.

The system used byte addressing, as did the Level-64, so there was no need for special hardware for the emulation. GE-100 instruction sequences were interpreted by the emulator micro-programs, with the exception of a few instructions, notably I/O, that were trapped to the emulator support software.

The emulator software was a native GCOS process group including several threads dealing with I/O and exceptions and addressing a data segment image of GE-100 main memory. 
I/Os were buffered inside the emulator process group, allowing the definition of larger data blocks on the native devices than the usual 80 characters of the emulated machine.

GE-100 discs were reconnected to the Level-64, but the emulator software also offered access to the GE-100 disc image on native discs (GE-100 images being considered as native files of the direct access method).

 

H-200 Emulator

Honeywell H-200/2000 had an architecture quite different from Level 64. The H-200 was derived from the IBM 1401 architecture. It handled data as 7-bit bytes, using a flag (an additional bit in the last byte) to delineate character strings, whereas the native machine processed 8-bit bytes defined by a length in the instruction. A hardware feature was added to detect the flag by hardware instead of checking it by firmware for each data byte. The emulator was thus able to benefit from the 4-byte word mover capability offered by the native hardware, instead of performing moves one byte at a time. That flag detection was the main feature of the special hardware card cage required by the emulator.

The software side of the emulator had an architecture similar to that of the GE-100 emulator, although it was designed in Billerica and initially delivered under the 4A Back-up environment (a back-up operating system designed in Boston) instead of GCOS 64. In 1975, it was integrated within GCOS 64.

H-200 discs were also supported, for conversion purposes, by the GCOS operating system through the HFAS access method. Few programs apart from a conversion utility used that native method. The H-200 file organization differed from the others in allowing a hardware search on the data field of a record. HFAS was also available on native discs, allowing the emulator to take advantage of higher capacity and performance.

The Honeywell policy evolved during the 1970s, attempting to move H-2000 customers to DPS-8 by attaching a co-processor to the DPS-8. So, H-200 emulation was abandoned at the end of the 1970s on the different versions of DPS-7.

S/360 Emulator

Honeywell placed hopes in a S/360 emulator and a running system was designed by Billerica's engineering.

As the Level 64 basic instructions and registers had been derived (as a limited subset) from the S/360, and as the I/O formats were a superset of the IBM offering, the emulation of that machine was relatively easy. The S/360 emulator was a specific piece of firmware reusing native firmware primitives. CSV (call supervisor) statements were translated by software into native calls.

The delivery of the S/370 and the ambiguous position of IBM on the 360/20 put the technically successful program in marketing jeopardy. Paging would have required the implementation of that feature on Level-64 to maintain the efficiency of programs, and the IBM unbundling policy seemed to forbid a customer to use a DOS/VS license on a foreign system.

Siris 3 Emulator

Siris 3 emulation was developed to maintain the CII-HB commitment to Iris-5x/6x customers. As was the case with GE-100 and S/360, no special hardware was required. Some features of the Iris-65, like topographic memory, were either not supported by software or could be supported through the paging hardware of the DPS-7/8x.

A special firmware was developed for the P7G processor and was supported by a native software process group performing I/O and sharing resources with the GCOS software. A few changes were made in the Siris 3 software to minimize the interpretation work, especially for supporting the DSA front-end processor. Support of CII-specific data communications protocols (TMM-VU and others) was also added to the FEP and handled via FNPS by the Siris 3 emulator software.

Siris 8 Emulator

The CII Iris 80 architecture was an extension of the SDS Sigma 7 32-bit architecture. It essentially added an MP capability to the Sigma 7. The main difference from the native mode was the position of the op-code, which was inverted inside the word. As op-code detection was the first thing interpreted by the emulator (as a branch vector), it required that the instruction decoding part of the P7G be able to reference the last part of the word instead of the first. The cost of that modification was low and the feature stayed unused in the majority of systems.

The emulator software interpreted CSV (call supervisor) instructions, bypassing the inner layers of Siris 8 for efficient data management. It was the first program to take advantage of the SMP operation of the DPS-7.

 

SECTION V 
Interactive Modes of operation

Initial deliveries of GCOS systems were for the batch-processing operation that was the main objective of the business data processing served by Honeywell and Honeywell-Bull. Engineers designing the above features were told that they were more ambitious than the market required. However, the market changed progressively during the 1970s. The traditional way of planned batch processing was to be replaced by real-time transaction processing. Such a change was already perceptible in the mid-1960s, but had been met with ad-hoc solutions on systems like the GE-400 or 600 and the IBM S/360. The challenge was to incorporate those new requirements in a general-purpose operating system.

Interactive Operator Facility

Around 1980, there was a requirement to build limited "time-sharing" facilities within GCOS. The goal was essentially to provide interactive debugging facilities to programmers and also to port some large applications needing interaction. There were alternative solutions to the one that was chosen. The TDS subsystem was already available, but the nature of TP operations was specific and too restrictive: specific programming conventions, limited working set of TPs, etc. Building a new subsystem had already been done with IBM TSO or GCOS III TSS. It was preferred to extend the operator facilities and the JCL to build the interactive facilities, and to use the standard system for the rest of the environment (programming facilities, dispatching...). So the IOF environment could be extended almost ad infinitum.

The architectural implementation was to allocate a process group (one J) per user and to spin off Js as required. The total number of Js was limited to 255, that number being an architectural limit on the number of IOF users. Later, slight modifications to the architecture introduced by the Japanese partner somewhat removed that limitation. Anyway, the number of IOF users rarely exceeded 10 on a GCOS 7 system.

JCL, which already had some powerful capabilities, was extended to be used interactively and renamed GCL (Generalized Control Language); facilities built into JCL that had been of limited use (for GCOS compatibility reasons) were revived, giving GCL most of the capabilities of a shell language. However, the heritage of a batch language restricted the primitives to the commands already available. Instead of developing features like regular expressions and pipes, it was chosen to augment GCL with a MENU processor offering a formatted menu to the user and including default options for most of the parameters. The menu processor was run in a listener thread allocated to each display terminal. That thread passed a GCL stream for interpretation to the job scheduler server; if the GCL included the launching of an interpretive processor, such as the editor or BASIC, or of a user interactive program, the remote terminal was allocated to that processor until its termination. IOF kept listening for a BREAK signal that could interrupt the operation of the interactive program if the operator wished to get out of a loop or some other faulty behavior.

Another layer was added to the console mode of IOF operation initially available. It was a menu handler based on VIP alphanumeric display terminals, allowing the application programmer to modify the terminal interface. A default system menu was substituted for the command interface.

IOF was the environment for several programming languages: BASIC, APL. The environment of those languages was quite different from that of standard programming languages. There was an interactive editor/parser that spun off an execution job step (the byte code interpreter).

IOF was also used with a text editor, essentially for entering programming language source code and documentation (the latter during the period when PCs were not yet a common tool, i.e. the early 1980s). The editor was complemented by a formatter named WordPro that worked in a way similar to troff.

Although Bull made extensive use of IOF for developing HPL programs for GCOS itself, specific language processors were developed for IOF. Most of them featured an interpretive execution.
BASIC was the first language implemented, in the late 1970s. As soon as a "line" was entered, it was parsed and translated into byte code.
APL was also implemented under IOF in a similar environment. Special keyboards were supported. APL was used by some standard applications developed by software houses and ported from the IBM VM/370 environment. Such a combination consumed much CPU resource and an attempt to microprogram an APL interpreter was undertaken, but finally cancelled. A few application programs were written in APL by a French software house.
A LISP interpreter was also written under GCOS. While the interpreter itself did not require special features, the "artificial intelligence" mood of the early 1980s caused several projects to consider LISP as a hub for many interactive applications (one of which being the famous "automatic configurator" publicized by DEC and seen as "the" solution for assembling complex systems). The configurator was written in KOOL, which generated large LISP data sets, regrouping in the same set "procedures" and "user data". GCOS offered a large 4MB segment for storing that text, but processing a dozen "configurators" in parallel led to an excessive working set in the system (in terms of paging misses and TLB flushing). GCOS had nothing to solve that challenge, nor did many other systems, which was probably at the origin of some discredit of AI languages.

IOF lacked the interactive features invented on workstations, and X-Windows became popular too late to have influenced the GCOS operating system. GCOS never had them and would have had to be deeply modified to become a windowing system. The market of GCOS, after having been in batch processing, definitively moved to a role of database / transaction processing server.

Transaction Driven System

The basic architectural constructs of GCOS did not directly match the requirements of transaction processing. The number of terminals (several thousands) connected to such an application was to exceed the architectural dimensions. The overhead implied by the basic GCOS model (i.e. associating a job step to each transaction) had already proven unacceptable in GCOS III TP-II.
In most transaction systems operating under a general-purpose operating system, like GCOS III, GCOS 8 or IBM OS, a transactional subsystem reimplementing most functions of the OS was built, originally developed by sales support (e.g. the Customer Information Control System). Instead, GCOS TDS was developed by engineering and took advantage of the basic OS and of provisions reserved for that purpose from the initial design.

The model of TDS (Transaction Driven System) was to gather a library of application-specific commands that could be "called" by users (almost exclusively clerks dialoguing with their own customers by telephone or at an office window). The eventual purpose of those commands was to update databases and optionally to deliver a printed ticket or receipt to the customer. The database was used to retrieve information, to create new information records and to update existing data. Frequently, in addition to the dialog with on-line customers, other transactions or printouts could be triggered following thresholds recorded in the database, or on timing events. The transaction commands were named TPR (Transaction Processing Routines). They were stored in binary format as shared modules in a library.
TPRs were written in a special COBOL dialect. They were preprocessed, compiled and linked as "shared modules" (type 2 segments), using an option of the static linker. They were processed as re-entrant modules to be executed in a thread initiated at each "transaction". The loading time by the dynamic linker was minimal. A cache of loaded TPRs was maintained, so no additional I/O was needed for the most frequent transactions.

The working area of the transaction thread was the stack, but in addition some data segments (protected by their read-only status or by semaphores) could be specified by the programmer. A TPR could SEND/RECEIVE additional messages to the clerk and could access one or several records of one or several databases. There was no specific restriction in database usage and several access methods could be used in a transaction. The TPR however had to issue COMMITment statements when it was ready to free the databases and had to TERMINATE, relinquishing all other resources not stored in the file system.

TDS maps its own architectural concepts onto the GCOS mechanisms.
First, terminals are not permanently on-line with the GCOS system. They are ignored until they log in.
Second, terminals are not architectural entities but only sources and destinations of messages. So, a transaction could involve one or several terminals. Even a logged-in terminal has just been made known to the TDS subsystem; its user may send messages.
Third, when the terminal user sends a message specifying that it begins a transaction (by sending a command recognized as such by the TDS overseer), a "virtual process" is created within the file system for that transaction.
Fourth, this "virtual process" is mapped on one of the threads of the TDS thread pool. This mapping may be immediate if there are free entries available in the pool, or it might be delayed.
Fifth, the mapping remains effective until the thread has to suspend itself because of long-duration waits such as exchanges of messages with the terminal (that imply relatively long transmission times and longer user "think time"). In that case, the virtual process is unmapped and its context is stored in the file system (TDS swap file) until the terminal answer is received. A programmable time-out may cancel the transaction.
Sixth, the transaction may just read the database(s), in which case the termination (normal or caused by time-out) requires no special operation. Alternatively it could alter the contents of the database. Modifications of the database are journalized by copying the concerned block before modification (before journal) and storing a copy of the modified record (after journal).
The purpose of the before journal is to be able to cancel the modification if the transaction terminates before a COMMITment has been taken for the update. The after journal has the purpose of reconstructing the database if a problem (hardware failure, system crash) requires restoring the database before restarting the processing of transactions. A minimal sketch of this journaling protocol is given after this discussion.
In fact, the "before journal" was frequently replaced (at the customer's wish) by the mechanism of "deferred updates", where the database was not updated before the end of the transaction. That mechanism, in liaison with the control interval buffer pool and a General Access Control (GAC) implemented simultaneously, provided a database cache with all the coherency mechanisms needed for an efficient processing of transactions. When the Oracle server was included in TDS, this cache became distributed, part in GCOS, part in the Oracle database, and possibly also in cooperative TP.
Among events that may cause the termination of transactions is the mutual interlocking of transactions in concomitant accesses to several records. The strategy applied in that case is implemented in the GAC server.
The after journal gave a way to reconstruct the database in the event of a system malfunction. Another solution, optionally used at the customer's wish, consisted in keeping a log of transaction requests and replaying them after a crash. A logging of messages was often done in a TP system for arbitrating conflicts between end-users and clerks. However, the simultaneous processing of transactions would not guarantee the same result for on-line processing and for batched logged transactions; replaying logged transaction messages might cause problems when end-users hold guaranteed outputs of transactions that are not identical to the definitive update of the database. So the journalized file system was the more recommended solution.
Journals and the swapping file (containing transaction contexts) were the object of special care against hardware failures. Header and trailer time stamps were used to guarantee the integrity of those files.
Dual copies of databases were introduced essentially to decrease the recovery time in case of media failures, and secondarily to improve the latency of media accesses. Dual copy did not replace the existing mechanisms of deferred updates and after journals, which remained needed for 24/7 continuous operation.
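As announced above, here is a minimal sketch of the before/after journal protocol: the before image allows cancelling an uncommitted transaction, the after image allows reconstructing the database after a crash. The flat-array "database", the block size and all names are assumptions.

    #include <stdio.h>
    #include <string.h>

    #define NBLOCKS 4
    #define BLKSIZE 8

    static char db[NBLOCKS][BLKSIZE];            /* the "database"                  */
    static char before_journal[NBLOCKS][BLKSIZE];
    static int  before_used[NBLOCKS];
    static char after_journal[NBLOCKS][BLKSIZE]; /* replayed after a crash          */
    static int  after_count;

    /* Update one block on behalf of a transaction: save the before image first,
       apply the change, then record the after image. */
    static void tx_update(int blk, const char *newval)
    {
        if (!before_used[blk]) {                 /* copy the block before modifying */
            memcpy(before_journal[blk], db[blk], BLKSIZE);
            before_used[blk] = 1;
        }
        strncpy(db[blk], newval, BLKSIZE - 1);
        memcpy(after_journal[after_count++], db[blk], BLKSIZE);
    }

    /* COMMIT: the after images are now definitive; before images can be dropped. */
    static void tx_commit(void)
    {
        memset(before_used, 0, sizeof before_used);
        printf("committed, %d after-journal records kept for recovery\n", after_count);
    }

    /* Abort (or crash before commit): undo from the before journal. */
    static void tx_rollback(void)
    {
        for (int b = 0; b < NBLOCKS; b++)
            if (before_used[b]) {
                memcpy(db[b], before_journal[b], BLKSIZE);
                before_used[b] = 0;
            }
        puts("rolled back from before journal");
    }

    int main(void)
    {
        strncpy(db[0], "ACCT=10", BLKSIZE - 1);
        tx_update(0, "ACCT=25");
        tx_rollback();                           /* transaction ended before COMMIT */
        printf("block 0 after rollback: %s\n", db[0]);

        tx_update(0, "ACCT=40");
        tx_commit();
        printf("block 0 after commit:   %s\n", db[0]);
        return 0;
    }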


The behavior of a transaction system depends upon a correct planning of the transaction programs. Whereas batch applications and IOF applications may accept runaway programs, that is not the case for a normal operational transaction system. However, more protection is offered by GCOS than in competing systems where all running transactions operate in the same address space. All accesses to the databases and all modifications of the work files of the subsystem are monitored by TDS procedures. The execution of a runaway program inside a transaction is not likely to alter the integrity of the database as seen by other transactions and may not even be noticeable to other transaction users. For instance, using a transaction program recursively is likely to cause a stack overflow in the private thread space of the transaction or to be subject to a transaction time-out. The multi-threading of TDS is pre-emptive and a transaction program cannot monopolize a processor.

Although the initial specifications called for a single TDS subsystem in a GCOS system, there was no barrier to operating several TDS subsystems, with the same or different databases and with the same or different access rights, in the same system.

When DSA was introduced in the early 1980s, the following options were taken for transaction processing. Terminals not attended by clerks were not connected to the network. Attended terminals had an open session to the TDS server after the clerk had logged in and had been recognized by his terminal-id and his password. The network processor had no knowledge of the transaction concept and was totally transparent to the transaction protocols and commitments.
When cooperative TP was considered, an issue was raised about opening "communications sessions" for each distributed transaction, as the "connection" network architecture would have required. The overhead penalty was high enough to justify establishing permanent sessions between distributed transaction systems and using them as a pool of data pipes on top of which the CTP protocols would apply, realizing an "emulation" of "connection-less" protocols. TDS did not directly support "very long transactions" that would require the context of the end-user to be transported to another terminal or to be maintained over days. Specific protocols to ensure the cancellation of commitments in the database had to be established at application level (for instance by separating the concepts of reservation and purchase, or by programming cancellation TPRs knowing how to undo commitments). The model where the end-user directly performs transactions from his own PC (using cookies) had not yet appeared in the 1980s, and in the era of mobile computing that model has its own limits.

In the mid-1980s, transactions that could be distributed between different TP systems were introduced using the CTP -cooperative transaction processing- protocol, closely mapped on IBM SNA LU-6.2 (using DSA or DSA-mapped SNA networks). Distributed TP between GCOS TDS and IBM CICS was becoming a reality.

A common characteristic of the TP model exemplified by TDS was that the transaction system kept the state of the whole transaction on behalf of the users. This was a heritage of the era of dumb terminals (Teletypes and display terminals without programming capability). When PCs were substituted for those dumb terminals, there remained a lack of confidence in storing important enterprise-level data in the memory of PCs. Many customers jeopardized the integrity of their databases by moving unconsciously to the client-server model, where the state of the transactions was distributed, partly in the user workstations, partly in the server(s). The centralized state model (in conjunction with CTP protocols) was rugged and robust; however, it presented a bottleneck when a transaction processing application became open to millions of Internet users. It also had problems accommodating very long duration transactions or hierarchies of transactions.

Data Base server

While Data Management had been handled as a service operating in the thread environment of its caller (the TDS threads for transaction processing), the port of Oracle by Bull engineers in the early 1980s marked a change in the architecture. Essentially, to minimize changes in the Oracle source code, the Oracle port was implemented as a separate J server (a specialized process group) receiving SQL requests from other client process groups (batch, IOF or TDS).

Those implementations helped the introduction of a larger number of processors. The Auriga hardware architecture was characterized by the sharing of an L2 cache within a group of 4 processors. While that feature was masked to programmers and to users, it had a significant performance impact, introducing a degradation due to address space migration between L2 caches. In the mid-1990s, GCOS systems were sold with different prices attributed to the processors, decreasing the price for the more computation-intensive applications such as the Oracle and buffer management servers, and keeping it high for standard GCOS applications.

Open 7 System Facility (UNIX environment)

From 1975 to 1982, GCOS was THE operating system of CII-Honeywell-Bull, and those responsible for it had a tendency to ignore two important factors that would change the world of software: the advent of the personal computer and the penetration of UNIX, an operating system developed essentially outside the industry, in non-profit organizations. The new Bull management was not at all biased in favor of the CII-HB product lines and attempted to convert the company to the world of Open Systems. It became obvious that opening the world of GCOS software would allow new applications to be integrated at a low cost, especially those developed for a direct interaction between the user and his (or her) program. The IOF environment required a high porting cost for application developers who had developed their applications in minicomputer environments or other incompatible systems.

Several solutions to offer a UNIX-compatible environment were considered: a software solution, and a hardware solution where a processor supporting UNIX, such as the Motorola 68000, would have been attached to the GCOS system, as the DPS-6 front-end processor had already been. The hardware solution raised the issue of providing scalability across the whole range of DPS-7000. Its implementation was initiated in the early 1990s on GCOS 8 systems and was adopted only in the late 1990s on GCOS 7 (Diane project).
The software solution, consisting of building a UNIX environment the way emulators were integrated inside GCOS, was the first to be initiated. It received little attention because Bull management wanted to orient customers towards genuine UNIX systems. Its perception was limited to the port of typical UNIX applications, the most important being the TCP/IP stack.

In fact it was a port of UNIX to the DPS-7000 instruction set. This was done using the GNU C compiler with the DPS-7000 assembler, generating native code. The UNIX supervisor ported to the native decor was linked to service functions that were calls to GCOS services, providing access to the UNIX resources (files) AND to the GCOS resources. This port of UNIX was multi-threaded using the micro-kernel support and could take advantage of the DPS-7000 multiprocessors, not only to have UNIX and GCOS coexist, but even to run several UNIX processes simultaneously.

Open 7, as the UNIX port was called, used the services of the GCOS operating system as the emulators did, sharing devices and system resources (timer, input and output). GCOS allocated to UNIX a large GCOS file that was mounted as a UNIX file system. All device I/O was handled by GCOS. UNIX benefited from the shared buffer pool of GCOS and did not need its own peripherals.
But it was able to control its own front-end processor (a real UNIX system) through a port on the Ethernet (or FDDI) local network and to perform TCP/IP networking on the same hardware resources as the GCOS system. Conversely, Open 7 implemented a TCP/IP server on behalf of GCOS 7 programs.

When it was planned -around 1995- to discontinue the manufacturing of higher-performance DPS-7000 processors, this software solution lost its interest and a return to the hardware solution was re-envisioned, using the Intel platform instead of the IBM/Motorola one. The TCP/IP stack was moved to native GCOS and was the base of the interconnection of the two worlds (by RPC instead of direct calls, as was the case in Open 7).

Finally a DPS-7000 emulator (Diane 2) was developed on the IA-32 and IA-64 hardware architectures. Windows NT was used as a loader and a supervisor for the GCOS applications, which finally stayed, with most of the GCOS code, on top of the most popular architecture.

 

 

SECTION VI 
Miscellaneous

In this section, several aspects of GCOS covering features encountered within its development will be examined: 
 - the limits of the architecture that were originally suspected or that were discovered over the years;
 - the constraints and decisions related to software engineering techniques in an industrial product;
 - the divergences (and their causes) with our original partner NEC.


Limits

Physical Memory

The Level 64 architecture could address up to 256MB (28-bit byte address). As the Level 64 hardware was limited to 1MB of physical memory by physical constraints (9000 chips on prototypes, 2250 chips on the production model), the limit appeared reasonable and an extension to more than 256 MB could have been designed with limited impact. The initial limit was 16 times that of the S/360 designed 6 years before. The production Level 64 got RAM cards of 16KB. A memory size of 256KB meant one card cage of 16 cards and a fully configured 1MB memory represented a complete cabinet (with electronics and power supply). Level-64-DPS were shipped in 1978 with memory chips of 16K bits, decreasing by a factor of 4 the size of the memory, which was integrated in the CPU cabinet. The P7G was delivered in 1982 with 64Kbit memory chips -the 64Kbit chip was the most difficult step for semiconductor manufacturers to pass- and the physical memory was extended to 2x4 MB.

Number of J

The limit of 256 job steps was relatively comfortable until many interactive users worked on the system. The limit was initially encountered by NEC, which had made the choice of allocating a J per interactive user. They kept some Js with up to 255 threads while limiting the others to 16 threads. So, they succeeded in having around 1000 Js. Their design was adopted by Bull in 1986 but had very little impact on GCOS.

Virtual Address Space

The address space, both in terms of number of segments and of size of individual segments, appeared as a significant restriction as early as the 1970s. Two "escape" mechanisms were implemented: the large 4MB segments and the type 1 segments. In addition, the limited address space inherent to a 42-bit machine was one of the reasons for the destruction of the address space at each step of a batch or an interactive job. Actually, systems like Multics were forced to rely on a "flushing" of the segment tables and a reconstruction of the dynamically linked space when limits were encountered. The decision of considering segments as buffers and not as part of the file system helped a little bit. It was also possible to segregate operating system functions as separate process spaces to increase the total address space.

Changes in that area were very difficult to make, as other product lines had experienced. New software releases had to work on various "old" processors for customer convenience and to decrease maintenance costs. New hardware systems were introduced only at both ends of the product line.

For technical and "political" reasons, an extended architecture, XSA, was designed in the late 1980s jointly by Bull and NEC. When implemented, it would have allowed the removal of the constraints of the limited address space. However, only some parts, like the extended physical memory space, were eventually implemented by Bull; NEC's last processors were not marketed by Bull, nor were the corresponding extensions made to GCOS.

Segment size

Most of the address space was available as segments limited to 64KB. This was not a strong limitation for procedure segments -it forced system and application programmers toward modular components- but it was one for system tables, which had to fit in less than 64KB. As tables might include several hundred bytes per item, that segment size was a limit frequently encountered. The way around it was either to allocate a chain of segments, or to undertake a more significant redesign regrouping several tables in one large segment, as in non-segmented systems, while taking care not to introduce security flaws in doing so.
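
As a minimal sketch of the chained-segment approach (written in C, with invented names and sizes; this is not actual GCOS code), a table larger than one 64KB segment can be spread over a linked chain of segments, the chain being walked -and extended- when an item is addressed:

/* Hypothetical sketch (not GCOS code): a large system table split across
   a chain of segments, each kept under the 64KB architectural limit. */
#include <stddef.h>
#include <stdlib.h>

#define SEG_SIZE      (64 * 1024)                      /* segment size limit       */
#define ITEM_SIZE     512                              /* "several hundred bytes"  */
#define ITEMS_PER_SEG ((SEG_SIZE - sizeof(void *)) / ITEM_SIZE)

struct table_segment {
    struct table_segment *next;                        /* link to the next segment */
    unsigned char items[ITEMS_PER_SEG][ITEM_SIZE];
};

/* Return a pointer to item 'index', extending the chain of segments as
   needed; returns NULL if a new segment cannot be allocated. */
static unsigned char *table_item(struct table_segment **head, size_t index)
{
    struct table_segment **link = head;
    size_t target = index / ITEMS_PER_SEG;

    for (size_t seg = 0; ; seg++) {
        if (*link == NULL) {
            *link = calloc(1, sizeof(struct table_segment));
            if (*link == NULL)
                return NULL;
        }
        if (seg == target)
            break;
        link = &(*link)->next;
    }
    return (*link)->items[index % ITEMS_PER_SEG];
}

The price of the chain is one indirection per segment crossed, which illustrates why regrouping tables into a larger segment could be preferred for heavily used tables.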

Methodology, Integration and Delivery

The size of the project required intensive work on integration and delivery methods; the architecture, per se, was not enough to maintain a good product. The development was divided into working groups producing system integration units, each a collection of "debugged" (alpha-tested) source code and of documentation for a functional area.

Source code was originally written, compiled and run in simulation mode under Multics on a GE-645. When Level 64 hardware became available, the role of Multics became more that of a repository for the source code. Flow charts, when used, had to stay off-line.
The situation was aggravated by the regrouping in 1975 of all the development in Paris and the integration of additional teams. A single 645 was not able to cope with the programmers' needs, and a return to the old world of punched cards was undertaken. 
Eventually, in the early 1980s, PCs connected to a GCOS system became the standard equipment of programmers. Data entry, code corrections and documentation were made on PCs, while compilations and execution runs were processed under IOF on GCOS systems.

System calls, whether visible to applications or internal to the system, were defined through HPL and/or MLP (and later C) macros. Those macros were expanded, along with local programmer macros, by a macro-generator pass before compilation (a sketch of the idea is given after this discussion). That procedure worked well as long as all modifications were made in source form. However, releases of the system were made on a yearly basis and corrections were required more often. While some corrections were done in source form and led to a large number of patches, frequently a correction was delivered only as a binary (i.e. hex) patch and also applied separately to the programmer's source code for the next release.
Additionally, patches correcting a functional problem were applied wherever it was fastest to do so, i.e. in areas where programmers had the time or the motivation to do it.
Consequently, the operating system became progressively more monolithic, and minor changes became more expensive to implement. Redesigns of large areas were the only way to clean up the problem and to start on new bases.
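
As a purely illustrative sketch in C (the HPL/MLP macro syntax, the entry point name and the parameter block below are invented, not the GCOS interfaces), the idea is that the macro hides the construction of the parameter block and the calling convention of the privileged entry point, so that callers only need to be regenerated -not rewritten- when the convention changes:

/* Hypothetical sketch of a macro-defined system call wrapper; all names
   and the calling convention are invented for illustration. */
#include <stdio.h>

struct io_request {             /* parameter block passed to the entry point */
    int   file_id;
    void *buffer;
    long  length;
};

/* Stand-in for the privileged entry point reached by the real call. */
static int kernel_entry(int function_code, void *parameter_block)
{
    (void)function_code;
    (void)parameter_block;
    return 0;                   /* pretend success */
}

/* The "macro" seen by programmers: it expands into the construction of
   the parameter block and the call itself, keeping callers independent
   of the underlying convention. */
#define SYS_READ_RECORD(rc, fid, buf, len)                      \
    do {                                                        \
        struct io_request rq_ = { (fid), (buf), (len) };        \
        (rc) = kernel_entry(42 /* hypothetical code */, &rq_);  \
    } while (0)

int main(void)
{
    char record[80];
    int rc;

    SYS_READ_RECORD(rc, 3, record, sizeof record);
    printf("rc = %d\n", rc);
    return rc;
}

A binary patch applied to the expanded code bypasses this mechanism entirely, which is why the same correction had to be redone in source form for the following release.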

The documentation for customer operators and programmers was produced by technical writers from draft documents written in English by Bull programmers. That documentation was not always available for beta testing by Quality Assurance people. Beta testing was never officially done at customer sites, although several products were introduced as "RPQ" for specific customers supported directly by Bull personnel. GCOS internal descriptions included flowcharts, table descriptions, and internal interfaces. Those documents were written by developers, edited by support people and distributed as microfiches to support in Field Engineering. 
Specifications for duly commenting the source code were set up, but they were somewhat relaxed in the "punched card era", and the usage of English also gave way to French in some products developed by CII-originated development teams and delivered in France. In any case, all that technical documentation was only distributed internally and was not made available to customers. Customers had to see the computer and its operating system as a "black box" running their applications. That avoided dysfunctions created by external use of interfaces not documented as public (in the programmer's manuals), but it also decreased the attachment of programmers to their system and led to STARs (System Technical Anomaly Reports) that were insufficiently documented and mixed with improvement requests. The situation improved in the late 1980s when customers and GCOS technical support, sharing an under-siege mentality, collaborated in CUBE user commissions to ensure their jobs' survival.

 

NEC ACOS systems

NEC acquired a full development license for GCOS as part of an agreement with Honeywell made in the 1960s and renewed in 1972. A few NEC engineers participated in the GCOS 64 project as developers, and the first ACOS 4 release was almost identical to GCOS 64. It differed only by its support of the Japanese language, while the reference documentation stayed in English. NEC continued to have access to GCOS products in the 1970s, with the exception of HDSA products.

NEC used ACOS 4 on their high-performance systems, including, until it was replaced by UNIX in 1985, on their SX supercomputers. They supported their own data communications systems and their own line of peripherals. They implemented paging, based on a 1974 common design with CHB, on their systems starting in 1980. 

NEC chose to design its interactive operation in a way inspired by IBM MVS. Networking adopted a mainframe-centric architecture similar to SNA. Interactive processing was mapped onto the TSO architecture. Transaction processing was inspired by IMS/VS, where a queuing server distributed transactions to specific servers organized by transaction type.
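
A minimal sketch of that queuing model, in C (the names and interfaces below are hypothetical, not those of ACOS or IMS/VS): transactions carry a type code, and a dispatcher hands each one to the server registered for that type.

/* Hypothetical sketch of a queuing dispatcher: each transaction is routed
   to the server registered for its type. Names are illustrative only. */
#include <stdio.h>
#include <string.h>

#define MAX_TYPES 8

struct transaction {
    char type[9];               /* transaction code, e.g. "ORDER" */
    const char *payload;
};

typedef void (*server_fn)(const struct transaction *);

static struct { char type[9]; server_fn server; } registry[MAX_TYPES];
static int registered;

static void register_server(const char *type, server_fn server)
{
    if (registered < MAX_TYPES) {
        strncpy(registry[registered].type, type, 8);
        registry[registered].server = server;
        registered++;
    }
}

/* The queuing server's job: route one transaction to its server class. */
static void dispatch(const struct transaction *tx)
{
    for (int i = 0; i < registered; i++)
        if (strcmp(registry[i].type, tx->type) == 0) {
            registry[i].server(tx);
            return;
        }
    printf("no server for transaction type %s\n", tx->type);
}

static void order_server(const struct transaction *tx)
{
    printf("ORDER server handling: %s\n", tx->payload);
}

int main(void)
{
    struct transaction tx = { "ORDER", "item 42, quantity 3" };
    register_server("ORDER", order_server);
    dispatch(&tx);
    return 0;
}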

In 1985-1986, Bull renewed the cooperation with NEC, borrowing the design of the paging mechanism from them. However, both companies pursued their own way with their software, exchanging presentations and ideas on innovative areas.

Conclusion

GCOS 64 and GCOS 7 were successive names for the same continuous design. While other software systems showed a similar history, it is the only European software that had a comparable life: from old card-oriented batch applications to powerful transaction servers.

It suffered from the complexity introduced by the layers of functions added over time, from limits due to the high cost of hardware at the origin, and from some dogmatism in areas that were circumvented rather than corrected. GCOS was, overall, designed and managed on the model of manufacturer-designed software. It respected de facto industry interfaces (i.e. IBM's) more than other Honeywell and Bull systems did, but it ignored the world of open systems for too long.

It will survive the abandonment of its original hardware designs for some time by being hosted on an SMP Intel architecture (initially IA-32, eventually IA-64), coexisting in the same enclosure with the operating systems of other servers, and serving applications that are now becoming obsolete or that no longer perform well enough under GCOS in an open networking environment.

 

 
