A stark warning about the upcoming Epochalypse, also known as the “Year 2038 problem,” has come from the past, as National Museum of Computing system restorers have discovered an unsettling issue while working on ancient systems.
Robin Downs, a volunteer who was recently involved in an exhibition of Digital Equipment Corporation (DEC) gear at the museum, was on hand to demonstrate the problem to The Register in the museum’s Large Systems Gallery, which now houses a running PDP-11/73.
The machine’s software had already been patched for the Y2K problem, where using two digits to store the year caused headaches when the century rolled around. “Y2K”, Downs explained, “was mainly an application programming issue … mostly it was application programmers not taking into account two digits.”
The Year 2038 problem is a different beast. Indicating the PDP-11/73, Downs said, “This machine isn’t running Unix, but we have a C compiler on it, and the C compiler is from 1982, so it has … various issues.”
According to Downs, the operating system was patched for Y2K in the late 1990s, but doesn’t use the same time structure for its internal date and time.
“So, the C compiler on this, already now, when you ask it what the time and date are, it gets it wrong. It returns the correct time, but the wrong date.”
Annoying, but solvable. The team worked around the issue. However, when Downs was testing it by moving the system clock forward, something unexpected happened. He moved the clock forward to 2036, and everything seemed fine.
Then, in 2037 – a year before the Epochalypse is due – the program crashed. “It turns out,” said Downs, “the time function has another bug. Undocumented, unknown, where at the start of 2037, any program that calls the time function just crashes.”
“So we found bugs that exist, pre-2038, in writing this that we didn’t know about.”
The Year 2038 problem occurs in systems that store Unix time – the number of seconds since the Unix epoch (00:00:00 UTC on January 1, 1970) – in a signed 32-bit integer (64-bit is one modern approach, but legacy systems have a habit of lingering).
One second after 03:14:07 UTC on January 19, 2038, the seconds counter will overflow. In theory, this will result in a time and date being returned before the epoch – 20:45:52 UTC on December 13, 1901 – but that didn’t happen for Downs.
He said, “What we expected was that the local time function should return 1901. That’s what we thought would happen.”
Instead, it went back to 1970.
“Ok,” said Downs, “So the local time function has got a bug in it where it goes back 68 years instead of to -68 years…”
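The wraparound Downs expected can be reproduced with a little arithmetic. The following is a minimal sketch in Python, not the code running on the museum's PDP-11/73: it assumes a two's-complement signed 32-bit counter, and the helper names `wrap_int32` and `to_utc` are hypothetical ones made up for this illustration. It shows only the textbook 1901 result, not the 1970 behaviour Downs actually observed.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch: 00:00:00 UTC on January 1, 1970.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reduce an integer to signed 32-bit two's-complement range (assumed model)."""
    return (value + 2**31) % 2**32 - 2**31


def to_utc(seconds: int) -> datetime:
    """Convert a count of seconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


# The last second a signed 32-bit counter can represent.
last_good = to_utc(INT32_MAX)

# One tick later the counter wraps to the most negative value.
after_wrap = to_utc(wrap_int32(INT32_MAX + 1))

print(last_good.isoformat())   # 2038-01-19T03:14:07+00:00
print(after_wrap.isoformat())  # 1901-12-13T20:45:52+00:00
```

Python's own integers do not overflow, which is why the wrap has to be modelled explicitly here; on a real system the outcome depends on the compiler and library in use, as Downs found.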
So, there could be problems in the compiler. Problems with how code handles the issue. Problems with what machines might actually do. And so it goes on.
Former Microsoft engineer Dave Plummer is optimistic that the problem will be solved in time. He told The Register, “Since the counter starts from current time, anything that is running when it rolls over in 2038 will be suspect. ie: it doesn’t have to have been running for long.
“While it’s conceivable there are important things that still rely on GetTickCount() or similar, I’d wager the intervening 13 years will be enough to find them!”
Downs is, however, concerned, and noted that children on school trips to the museum today could well be starting a career as engineers in 12 or 13 years and be given some legacy code to learn from. He’s met professional C programmers who are unaware of the breadth of potential problems. “And then you’ve got the other issue,” he said, “of things that we’re building now that we expect to last more than 12 years.”
The prognosis is not great.
“There’s no answer,” Downs concludes, “because unless you test each individual device and potentially software version … they can behave differently.
“There’s a hugely greater scope for things going wrong to a lesser or greater extent than they did for Y2K.” ®