The Year 2000 problem (also known as the Y2K problem, the millennium bug, the Y2K bug, or simply Y2K) was a notable computer bug resulting from the practice in early computer program design of representing the year with two digits. This caused some date-related processing to operate incorrectly for dates and times on and after January 1, 2000, and on other critical dates that were billed "event horizons". People recognized that long-running systems could break down when the "...97, 98, 99..." ascending numbering assumption suddenly became invalid, and this fear was fueled by press coverage and other media speculation, as well as corporate and government reports. Companies and organizations worldwide checked and upgraded their computer systems.
While no computer failures of global significance occurred when the clocks rolled over into 2000, preparation for the Y2K bug had a significant effect on the computer industry. The fact that countries where very little was spent on tackling the Y2K bug (such as Italy and South Korea) fared just as well as those that spent much more (such as the United Kingdom and the United States) has generated debate on whether the absence of computer failures was the result of the preparation undertaken or whether the significance of the problem had been overstated.

Y2K was the common abbreviation for the year 2000 software problem. The abbreviation combines the letter Y for "year" and k for the Greek-derived prefix kilo meaning 1000; hence, 2K signifies 2000. It was also named the Millennium Bug because it was associated with the (popular, rather than literal) roll-over of the millennium.
The Year 2000 problem was the subject of the early book "Computers in Crisis" by Jerome and Marilyn Murray (Petrocelli, 1984; reissued by McGraw-Hill under the title "The Year 2000 Computing Crisis" in 1996). The first recorded mention of the Year 2000 problem on a Usenet newsgroup occurred on Saturday, January 19, 1985, in a post by Spencer Bolles.[2]
The acronym Y2K has been attributed to David Eddy, a Massachusetts programmer,[3] in an e-mail sent on June 12, 1995. He later said, "People were calling it CDC (Century Date Change) and FADL (Faulty Date Logic). There were other contenders. It just came off my COBOL calloused fingertips."[citation needed]
It was speculated that computer programs could stop working or produce erroneous results because they stored years with only two digits and that the year 2000 would be represented by 00 and would be interpreted by software as the year 1900. This would cause date comparisons to produce incorrect results. It was also thought that embedded systems, making use of similar date logic, might fail and cause utilities and other crucial infrastructure to fail.
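The failure mode is easy to demonstrate. The following is a minimal C sketch (the variable names are hypothetical) of how arithmetic on two-digit years effectively treats 2000 as 1900:

```c
#include <stdio.h>

/* Minimal sketch: a record keeps only the last two digits of each year,
 * so "00" is indistinguishable from 1900. */
int main(void) {
    int birth_yy   = 65;   /* meant as 1965 */
    int current_yy = 0;    /* meant as 2000 */

    int age = current_yy - birth_yy;          /* two-digit arithmetic   */
    printf("computed age: %d\n", age);        /* prints -65, not 35     */

    int age_fixed = 2000 - (1900 + birth_yy); /* four-digit years       */
    printf("correct age:  %d\n", age_fixed);  /* prints 35              */
    return 0;
}
```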
"The Y2K problem is the electronic equivalent of the El Niño and there will be nasty surprises around the globe." – John Hamre, Deputy Secretary of Defense[4]
Special committees were set up by governments to monitor remedial work and contingency planning, particularly by crucial infrastructures such as telecommunications, utilities and the like, to ensure that the most critical services had fixed their own problems and were prepared for problems with others. It was only the safe passing of the main "event horizon" itself, January 1, 2000, that fully quelled public fears.
In North America, the actions taken to remedy the possible problems had unexpected benefits. Many businesses installed computer backup systems for critical files. The Y2K preparations also had an impact during the Northeast Blackout of August 14, 2003: the earlier work had included the installation of new electrical generation equipment and systems, which allowed for a relatively rapid restoration of power in some areas.
The practice of using two-digit dates for convenience long predates computers, notably in artwork. Abbreviated dates do not pose a problem for humans, as works and events pertaining to one century are sufficiently different from those of other centuries. Computers, however, are unable to make such distinctions.
"I’m one of the culprits who created this problem. I used to write those programs back in the 1960s and 1970s, and was proud of the fact that I was able to squeeze a few elements of space out of my program by not having to put a 19 before the year. Back then, it was very important. We used to spend a lot of time running through various mathematical exercises before we started to write our programs so that they could be very clearly delimited with respect to space and the use of capacity. It never entered our minds that those programs would have lasted for more than a few years. As a consequence, they are very poorly documented. If I were to go back and look at some of the programs I wrote 30 years ago, I would have one terribly difficult time working my way through step-by-step." – Alan Greenspan[5]
In the 1960s, computer memory was scarce and expensive, and most data processing was done on punch cards which represented text data in 80-column records. Programming languages of the time, such as COBOL and RPG, processed numbers in their ASCII or EBCDIC representations. They occasionally used an extra bit called a "zone punch" to save one character for a minus sign on a negative number, or compressed two digits into one byte in a form called binary-coded decimal, but otherwise processed numbers as straight text. Over time the punch cards were converted to magnetic tape and then disk files and later to simple databases like ISAM, but the structure of the programs usually changed very little. Popular software like dBase continued the practice of storing dates as text well into the 1980s and 1990s.
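As an illustration of the packing mentioned above, here is a minimal C sketch (assuming a two-digit year value) of how two decimal digits fit into a single byte as packed binary-coded decimal; the century digits simply were not stored:

```c
#include <stdio.h>

/* Sketch: packing the two-digit year "85" into one byte of packed BCD,
 * one decimal digit per nibble. */
int main(void) {
    int year_two_digit = 85;                               /* meant as 1985 */
    unsigned char bcd = (unsigned char)(((year_two_digit / 10) << 4)
                                        | (year_two_digit % 10));  /* 0x85 */
    printf("packed:   0x%02X\n", bcd);

    int unpacked = ((bcd >> 4) * 10) + (bcd & 0x0F);
    printf("unpacked: %d\n", unpacked);   /* 85 -- the century is lost */
    return 0;
}
```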
Saving two characters for every date field was significant in the 1960s. Since programs at that time were mostly short-lived affairs written to solve a specific problem or control a specific hardware setup, neither managers nor programmers expected their programs to remain in use for many decades. The realization that databases were a new type of program with different characteristics had not yet come, and hence most did not consider the two-digit year a significant problem. There were exceptions, of course; the first person known to publicly address the problem was Bob Bemer, who had noticed it in 1958 as a result of work on genealogical software. He spent the next twenty years trying to make programmers, IBM, the US government and the ISO aware of the problem, with little result. This included the recommendation that the COBOL PICTURE clause be used to specify four-digit years for dates. This could have been done by programmers at any time from the initial release of the first COBOL compiler in 1961 onwards. However, lack of foresight, the desire to save storage space, and overall complacency prevented this advice from being followed. Despite magazine articles on the subject from 1970 onwards, the majority of programmers only started recognizing Y2K as a looming problem in the mid-1990s, but even then inertia and complacency caused it to be mostly ignored until the last few years of the decade.
Storage of a combined date and time within a fixed binary field is often considered a solution, but the possibility for software to misinterpret dates remains, because such date and time representations must be relative to a defined origin. Rollover of such systems is still a problem but can happen at varying dates and can fail in various ways. For example:
- The typical Unix timestamp (time_t) stores a date and time as a 32-bit signed integer representing, roughly speaking, the number of seconds since January 1, 1970. It will overflow that signed 32-bit range in 2038, causing the Year 2038 problem. To address this, many systems and languages have switched to, or offer, a 64-bit version of time_t.
- The Microsoft Excel spreadsheet program had a very elementary Y2K problem: Excel (in both its Windows and Mac versions, when set to start at 1900) incorrectly treated the year 1900 as a leap year, for compatibility with Lotus 1-2-3.[6] In addition, the years 2100, 2200 and so on were regarded as leap years. This bug was fixed in later versions, but since the epoch of the Excel timestamp was set to the meaningless date of January 0, 1900 in previous versions, the year 1900 is still regarded as a leap year to maintain backward compatibility.
- In the C programming language, the standard library's broken-down time representation originally returned only the year number within the 20th century, and for compatibility's sake it still represents the year as the number of years since 1900. Many programmers in C, and in Perl and Java, two programming languages widely used in web development that reuse the C functions, incorrectly treated this value as the last two digits of the year. On the web this was usually a harmless bug, but it did cause many dynamically generated webpages to display January 1, 2000, as "1/1/19100", "1/1/100", or variations of that depending on the format (see the sketch after this list).
- Older applications written for the commonly used UNIX source code control system SCCS failed to handle years that began with the digit "2".
- In the Windows 3.1 file manager, dates were shown as 1/1/19:0 for 1/1/2000. An update was available.
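Two of the pitfalls above can be reproduced with the standard C library. The following minimal sketch shows how the years-since-1900 convention produces the "19100" display bug, and how far a signed 32-bit seconds counter can reach before the Year 2038 rollover:

```c
#include <stdio.h>
#include <stdint.h>
#include <time.h>

int main(void) {
    /* struct tm stores the year in tm_year as "years since 1900". */
    struct tm y2k = {0};
    y2k.tm_year = 2000 - 1900;   /* 100 */
    y2k.tm_mon  = 0;             /* January */
    y2k.tm_mday = 1;

    /* Naively prefixing the field with "19" reproduces the famous bug. */
    printf("buggy:   1/1/19%d\n", y2k.tm_year);       /* 1/1/19100 */
    printf("correct: 1/1/%d\n", 1900 + y2k.tm_year);  /* 1/1/2000  */

    /* Largest value a signed 32-bit time_t can hold: 2147483647 seconds
     * past 1970-01-01, i.e. 03:14:07 UTC on January 19, 2038. */
    printf("32-bit time_t overflows after %ld seconds\n", (long)INT32_MAX);
    return 0;
}
```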
Even before January 1, 2000 arrived, there were also some worries about September 9, 1999 (albeit lesser than those generated by Y2K). This date could be written in the numeric format 9/9/99, a value frequently used to mark an unknown date; it was thus possible that programs would act on records containing unknown dates on that day.[7] The value is also somewhat similar to the end-of-file code 9999 used in old programming languages, and it was feared that some programs might unexpectedly terminate on that date. The bug, however, was more likely to confuse computer operators than machines.
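A minimal C sketch (with hypothetical field values) of how such a sentinel "unknown date" collides with the real date September 9, 1999:

```c
#include <stdio.h>
#include <string.h>

/* Sketch: "9/9/99" doubles as a sentinel meaning "date unknown", so a record
 * legitimately dated September 9, 1999 looks identical to one whose date was
 * never filled in. */
int main(void) {
    const char *unknown_sentinel = "9/9/99";
    const char *record_date = "9/9/99";   /* a real date entered on that day */

    if (strcmp(record_date, unknown_sentinel) == 0)
        printf("record treated as having no date -- possible misprocessing\n");
    return 0;
}
```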
Another potential problem for calculations involving the year 2000 was that it was a leap year even though years ending in "00" are normally not leap years. A year is a leap year if it is divisible by 4 and not divisible by 100, or if it is divisible by 400. For example, 1600 was a leap year, but 1700, 1800 and 1900 were not. Fortunately, most programs relied on the oversimplified rule that any year divisible by 4 is a leap year. This method happens to give the right answer for the year 2000 and will not become a problem until 2100, by which time these older legacy programs will hopefully have long since been replaced.
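The two rules can be compared directly. This minimal C sketch shows that they agree for 2000 but diverge for century years such as 1900 and 2100:

```c
#include <stdio.h>

/* Full Gregorian rule: divisible by 4, except century years, which must be
 * divisible by 400. */
static int is_leap(int year) {
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Oversimplified rule many legacy programs relied on. */
static int is_leap_naive(int year) {
    return year % 4 == 0;
}

int main(void) {
    int years[] = {1900, 2000, 2100};
    for (int i = 0; i < 3; i++)
        printf("%d: full rule=%d, naive rule=%d\n",
               years[i], is_leap(years[i]), is_leap_naive(years[i]));
    /* The rules agree for 2000 (a leap year) but the naive rule wrongly
     * treats 1900 and 2100 as leap years. */
    return 0;
}
```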
The problem was compounded by the need of many systems, especially in the financial services sector, to calculate expiration and renewal dates in the future. For example, a company tracking five-year bonds would have experienced Y2K problems as early as 1995, when its systems needed to calculate an expiration date of 2000: with two-digit years, the "00" expiration year would seem to be earlier than the "95" of the issue date.
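A minimal C sketch (hypothetical variable names) of this look-ahead failure: adding five years to a two-digit issue year of 95 wraps around to 00, which then compares as earlier than the issue date.

```c
#include <stdio.h>

/* Sketch: a five-year bond issued in 1995 and maturing in 2000.  With
 * two-digit years the maturity appears to precede the issue. */
int main(void) {
    int issue_yy    = 95;                    /* 1995 */
    int maturity_yy = (issue_yy + 5) % 100;  /* wraps to 0, i.e. "00" */

    if (maturity_yy < issue_yy)
        printf("BUG: maturity '%02d' sorts before issue '%02d'\n",
               maturity_yy, issue_yy);

    /* With four-digit years the comparison is correct. */
    int issue = 1995, maturity = issue + 5;
    if (maturity > issue)
        printf("OK: %d correctly follows %d\n", maturity, issue);
    return 0;
}
```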