«Medium-Range Weather Prediction Austin Woods Medium-Range Weather Prediction The European Approach The story of the European Centre for Medium-Range ...»
The Centre’s own CRAY-1A, serial number 9, was installed in Shinfield Park on 24 October 1978. This was the first export order for a Cray computer.
Chapter 16
Provisional acceptance was completed on 10 November. A full computer service started on a trial basis in December, despite the limited staff then employed. The Rutherford service ceased. Serial Number 1 had some hardware modifications made to it to make it more suitable for crypto-analysis work. It was then shipped to a site belonging to the UK Ministry of Defence, prior to their installation of a CRAY-1 in March 1979.
[Figure: The computer system in 1978.]
The computer system: CDC, Cray, Fujitsu, IBM
The CRAY-1A was a single processor computer with a memory of 8 Mbytes and a disk subsystem totaling 2.4 Gbytes. With a clock cycle time of 12.5 nanoseconds (equivalent to 80 MHz) and the ability to produce two results per cycle, the system had a theoretical peak performance of 160 Megaflops. Running the operational forecast model, the machine was capable of a sustained performance of 50 Megaflops (50 million floating-point arithmetic calculations per second). Its reliability was over 99% at the time of its final acceptance on 6 February 1979. The mean time between hardware faults was 94 hours during its first year.
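The peak figure follows directly from the clock period and the number of results per cycle. A minimal sketch of that arithmetic in Python; the function name and its parameters are illustrative and do not come from the source.

```python
def peak_megaflops(cycle_ns: float, results_per_cycle: int, cpus: int = 1) -> float:
    """Theoretical peak in Megaflops, given the clock period in nanoseconds."""
    # A period of cycle_ns nanoseconds corresponds to 1000 / cycle_ns MHz.
    clock_mhz = 1_000.0 / cycle_ns
    return clock_mhz * results_per_cycle * cpus


# CRAY-1A figures quoted in the text: 12.5 ns cycle, two results per cycle.
cray_1a_peak = peak_megaflops(cycle_ns=12.5, results_per_cycle=2)

# Fraction of the peak reached by the forecast model (50 Megaflops sustained).
sustained_fraction = 50.0 / cray_1a_peak

print(f"peak: {cray_1a_peak:.0f} Megaflops, sustained: {sustained_fraction:.1%} of peak")
```

On these numbers the operational model sustained a little under a third of the theoretical peak.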
At a meeting with ECMWF in 1976, Seymour Cray was asked why his machine used only parity error detection on its memory subsystem rather than SECDED: Single Error Correction Double Error Detection. His response was “Speed!” — SECDED would add an extra clock cycle to every memory reference. His questioner commented that parity errors were the single most common cause of system crashes on the CDC 7600 at the University of London Computer Centre, a computer system that Cray himself had designed while at Control Data Corporation. Cray made no response, but he obviously took note of the comment; all Cray machines apart from Serial Number 1 of the Cray-1A used SECDED.
When the Centre delivered the first operational medium-range forecast to its Member States on 1 August 1979, a ten-day forecast required about five hours of CPU time, a reduction by a factor of 50 in the time required on the CDC 6600.
Compared to today’s systems, the Cray Operating System (COS) was fairly rudimentary. New versions were released regularly, in the very early days weekly, even daily. These were tested as thoroughly as possible, given the need to take dedicated “system sessions” lasting up to three hours. Although some critical problems were indeed isolated during the testing phase, it could and did happen that the new system would be put into production only to be withdrawn the same day due to bugs being discovered. Peter Gray, then the Head of Computer Operations Section, was well remembered for asking “Why wasn’t this found in testing?” He knew of course that no matter how much testing was done in the limited time available, this did not compare to running a full and varied production workload on the machine. Reverting to an earlier level of the system was in general fairly easy: loading a different removable disk pack on the disk drive of the Data General Eclipse control workstation and re-booting the CRAY machine from that software.
Member State use of the system was limited in the beginning due to the lack of high-speed telecommunications links. Council had a lengthy discussion in May 1978 on the Report of its “Advisory Committee on the Use of the Centre’s Computer System by the Member States”, presented by its Chairman, Fred Bushby of the UK. The Committee had recommended that not less than 25% of the time should be available for the Member States. In December 1978, Council agreed with this proposal, with an allocation of 10% for “Special Projects” to be approved by the Council, the remainder to be split: 35% equally among the States and 65% allocated according to the financial contributions.
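The allocation rule can be written out as a short calculation. The sketch below assumes one reading of the text: that the 10% for Special Projects is taken out of the 25% Member State allocation, and that the rest is split 35% equally and 65% in proportion to contributions. The State names and contribution figures are hypothetical, chosen only to show the arithmetic.

```python
def member_state_shares(contributions, member_fraction=0.25,
                        special_fraction=0.10, equal_fraction=0.35):
    """Return each State's share of total machine time under the assumed reading."""
    # Time left for the States once Special Projects have been set aside.
    pool = member_fraction * (1.0 - special_fraction)
    n_states = len(contributions)
    total_contribution = sum(contributions.values())
    shares = {}
    for state, contribution in contributions.items():
        equal_part = equal_fraction / n_states
        weighted_part = (1.0 - equal_fraction) * contribution / total_contribution
        shares[state] = pool * (equal_part + weighted_part)
    return shares


# Hypothetical contribution percentages, for illustration only.
example = member_state_shares({"State A": 50.0, "State B": 30.0, "State C": 20.0})
for state, share in example.items():
    print(f"{state}: {share:.2%} of total machine time")
```

Under this reading the individual shares always add up to 22.5% of total machine time, whatever the contribution scale.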
Although the UK telecommunications link was installed in March 1979, delays in establishing it meant that data were still being transferred by magnetic tape in November. The link became fully operational only at the end of April 1980. The link with Sweden was installed and working in October 1979, followed by Germany in November 1979. Member State visitors to the Centre also used the system. In all they used only 6% in 1979. The operational suite used 34% while the remaining 60% was taken by the Research Department, including the FGGE project.
In October 1979 the Centre hosted the fourth Cray User Group (CUG) meeting. Most of the sites that had installed CRAY-1 systems were represented; Cray Research sent a large proportion of their development team to the meeting. Los Alamos Scientific Laboratory, NCAR and the National Magnetic Fusion Energy Computer Centre had hosted the three previous CUG meetings. As was already customary, a social evening was arranged, in this case at a country pub in the Chilterns to the northwest of Reading. A coach was arranged to take the delegates and Cray employees to the pub, at the top of a steep hill. It was a lively evening. The locals taught anyone willing to learn how to play the pub games of darts, dominoes, shove-ha’penny and cribbage. Peter Gray, who had organised the evening, was horrified when the coach driver quietly took him aside after they had arrived at the pub, and informed him that the brakes on the coach had failed as they were climbing the hill. If the pub had been at the bottom of that very steep hill instead of at the top, Cray could have ended up without a software development team, and the future of the company could have been very different!
Geerd-R. Hoffmann from Germany succeeded Rob Brinkhausen as Head of Computer Division in 1980, and held the post until 1997. Hoffmann was renowned as a skilled negotiator in the many complex discussions with manufacturers over the years. The continuing success of the Centre in acquiring the best computer equipment available at the time is in no small way attributable to him and to his successor Walter Zwieflhofer.
In the following years, use of the system steadily increased. Hardware and software were upgraded to meet requirements. In 1981, the CRAY online disk capacity was increased by 75% and that of the CYBER doubled.
By 1981 there was a terminal in the office of each scientist and programmer.
At the end of that year, a CYBER 730E — later renamed the 835 — was installed to ease the interactive workload on the 175.
In 1982, the Centre issued an Invitation to Tender for a data handling subsystem and a local computer interconnection sub-system. At the end of that year, a VAX 11/750 mini-computer was installed for graphical applications.
In spring 1983, it was decided that a Loosely Coupled Network (LCN) would be acquired from Control Data Ltd to provide high-speed file transfer between the different parts of the system. At the end of the year, a high-speed coaxial trunk was delivered as the first phase of the LCN.
Installation of more components continued in the following year.
During 1983, time was rented on the CRAY-1S computer at the Atomic Energy Research Establishment at Harwell, 50 km from the Centre. A smooth-running and efficient procedure was developed to enable this remote machine to be used. Data were transferred on magnetic tape. In all 285 research forecasts to ten days were run on this machine.
Cray was an impressively successful company; it had grown from 50 employees in 1976 to more than 1,300 in 1983. From its contacts with Cray, the Centre was made aware of the development of a new kind of machine, the dual processor CRAY X-MP, MP standing for “Multi-Processor”.
Benchmarking exercises during the second half of 1982 confirmed that this machine was fully compatible with the CRAY-1A, and contract negotiations with Cray were begun. One very advantageous aspect of the contract was the lower maintenance charges that the Centre negotiated. Ambitious plans were made for development of the ECMWF computer system, in effect, the replacement of all of the Centre’s first-generation system. The replacement was completed by mid-1984.
In November 1983, a dual processor CRAY X-MP/22 was installed, which entered service on 13 March 1984. This had two CPUs and two million (8-byte) words of main memory, thus “22” — 2 CPUs, 2 million words (16 Megabytes). It had 128 Megabytes of secondary memory supplied as a Solid-state Storage Device (SSD). Its clock cycle was 9.5 nanoseconds (105 MHz), with a theoretical peak performance of 400 Megaflops. Its reliability was better than that of the already reliable CRAY-1A, with a mean time between hardware failures double that of the 1A. Its throughput was 3.3 times that of the CRAY-1A, exceeding the criterion laid down at the time of acquisition. Although the CRAY-1A was retained for three months as a back up, it was never required to fill this role.
Financing of the purchase of the CRAY X-MP was rather interesting. In November 1983, Council authorized the Director to purchase the dollars in stages in advance by means of forward purchase contracts. About US$9 million was due in May 1984, the remaining US$1 million in August.
Although it was planned to purchase the May requirement in five equal amounts, one in each of the months January to May, the Director decided to wait until March, after the acceptance tests had been passed, before starting the purchase. As it happened, the delay was to the Centre’s advantage, because the pound-to-dollar exchange rate rose from US$1.40 to US$1.48 between January and March. However, rapid and significant exchange rate fluctuations at this time made the experience nerve-wracking for those involved; they were more used to dealing with scientific and technical problems than with currency problems!
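To see why a higher dollar-per-pound rate helps a buyer of dollars, the sterling cost of the May requirement can be compared at the two quoted rates. This is only an illustrative bound: the source does not give the rates actually achieved on the staged purchases, and the function name is hypothetical.

```python
def cost_in_pounds(dollars: float, dollars_per_pound: float) -> float:
    """Sterling needed to buy a given dollar amount at a given rate."""
    return dollars / dollars_per_pound


may_requirement_usd = 9_000_000  # "About US$9 million" due in May 1984

cost_january = cost_in_pounds(may_requirement_usd, 1.40)
cost_march = cost_in_pounds(may_requirement_usd, 1.48)
saving = cost_january - cost_march

print(f"at 1.40: £{cost_january:,.0f}")
print(f"at 1.48: £{cost_march:,.0f}")
print(f"difference: £{saving:,.0f}")
```

Had the whole amount been bought at the March rate and not the January one, the difference would have been roughly a third of a million pounds.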
The Centre used the system to pioneer the operational use of multitasking, by having two separate tasks running, one on each processor. One task handled the Northern Hemisphere, the other the Southern Hemisphere, giving a speed-up of almost a factor of two over the single-tasked code.
The approach was generalized so that any even number of processors could be used, processing several rows simultaneously. Small inefficiencies arose, since the concurrent tasks required slightly different amounts of computation time — mainly because convective activity differed over the globe — but overall, a high average Central Processing Unit (CPU) utilization was achieved.
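The structure of the two-task split, and the reason the speed-up falls slightly short of two, can be sketched as follows. This is not the Centre's code: the row ordering, the stand-in computation and all function names are assumptions, and Python threads here show only the task structure, not true parallel execution.

```python
from concurrent.futures import ThreadPoolExecutor


def process_rows(rows):
    """Stand-in for the per-row model computation (here a sum of squares)."""
    return sum(r * r for r in rows)


def split_hemispheres(n_rows):
    """Rows 0..n_rows-1 are assumed ordered north to south; return both halves."""
    half = n_rows // 2
    return list(range(half)), list(range(half, n_rows))


def run_two_tasks(n_rows):
    """Run one task per hemisphere and combine the partial results."""
    north, south = split_hemispheres(n_rows)
    with ThreadPoolExecutor(max_workers=2) as pool:
        partial = list(pool.map(process_rows, [north, south]))
    return sum(partial)


def speedup(task_times):
    """Speed-up over running the tasks back to back; the slowest task sets the pace."""
    return sum(task_times) / max(task_times)


print(run_two_tasks(8))
# Hypothetical timings: one hemisphere needs 5% more time, e.g. more convection.
print(round(speedup([10.0, 10.5]), 2))
```

Because the run finishes only when the slower task finishes, any imbalance between the tasks leaves one processor idle for part of the time, which is the small inefficiency the text describes.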
Additional improvements introduced with the X-MP system included an I/O (Input-Output) Subsystem, which allowed the disks and network devices to be handled more efficiently, and the SSD, which provided facilities for I/O at speeds substantially faster than those achieved using disk. While greatly improving program performance, the SSD complicated the scheduling of jobs on the system. The Centre’s analysts had to develop code, incorporated into the Cray Operating System (COS), to checkpoint the SSD memory and so ensure that the SSD was available for use when the operational suite of jobs needed to run. This code was then handed over to Cray for inclusion in the next official release of COS.
Graphical applications were vitally important. Internal and external workshops were held to consider the need for a unified graphical system for the Centre. The basic graphical software would be proprietary, while contouring, observation plotting and so on would be developed within the Centre.
The first graphics hardware and software at the Centre was developed in the earliest years, and proved itself an excellent tool. A Graphics Project Group was established in 1984 to design and implement a second-generation system. This led to development of the Meteorological Application Graphics Integrated Colour System or MAGICS, which provided the basis for the Centre’s future graphics developments for the coming decades.
Although in 1982 only 40% of the computer resources allocated to Member States were actually used by them, their use of the system continued to increase rapidly. In 1984, usage doubled compared to the previous year.
The telecommunications links were now coming under strain; they were unable to handle the requirements for remote use of the system. This, together with the increasing demand for more of the ECMWF forecast products, led to Council approval of an earlier than planned replacement of the telecommunications system the following year. The Technical Advisory Committee (TAC) set up an ad-hoc sub-group to follow the work leading to the replacement.
In 1985, Council began discussions on the next mainframe computer.
Budgetary considerations dominated the discussion. The ECMWF budget was moving towards one of “zero growth in real terms”, a principle adopted by Council in May 1986. Council decided to finance the acquisition by a combination of a bank loan of £5 million and an overdraft for the remainder. Favourable interest rates were negotiated, and the loan was repaid in installments of £1 million in each of the following five years. The Head of Research, David Burridge, developed a cash flow model to project the cash position for each month up to 1992. The model was based on continually updated bank base rates, exchange rates, budget projections and other factors.
The Centre continued its interesting and perhaps even adventurous financial activities by acquiring the US dollars required for the next computer on the forward currency market. During 1985, almost US$3 million was acquired in several installments at an average rate of US$1.3286 to the £1.
In December, the Council authorised the Director to purchase the remaining US$1.4 million at once if the spot rate reached US$1.40 to the £1.
In December 1985, a four processor CRAY X-MP/48 was installed. It replaced the CRAY X-MP/22 after passing its final acceptance test on 11 February 1986. This system had 4 CPUs with a cycle time of 9.5 nanoseconds (105 MHz), 64 Megabytes of memory, 256 Megabytes of SSD and 13 Gigabytes of disk space, with a theoretical peak performance of 800 Megaflops.