Re: [ljc] JVM is messed up when system clock is rolled backward

From: Martijn V.
Sent on: Monday, September 2, 2013 11:59 AM
Hi Bruno,

From the 7u lead:

===================

Looks like an old/known issue. I've seen varying reports on whether this is a Linux kernel issue or a JVM issue. I'd suggest that Bruno follows up with a question on the [address removed] mailing list. It's best if he includes more information about the Linux version, virtualization environment, test case etc.

I'm not sure why [masked] isn't public (I'll follow up with the owner of bugs.sun.com). Reading the bug, it's targeted for JDK 9 and is a P4. The dev team doesn't regard it as important. [masked] is a continuation of http://bugs.sun.com/view_bug.do?bug_id=6311057. [masked] suggests that "the problem exists because Linux itself is operating incorrectly in this area".

====================

Hope that helps!

Cheers,
Martijn


On 2 September [masked]:15, Martijn Verburg <[address removed]> wrote:
Hi Bruno,

Thanks for your detailed and thoughtful reply - it certainly helps narrow down the options! One last quick question: when you say this occurs on the latest JVM, I assume you mean 7u25? Out of curiosity, have you tried the latest Java 8 betas? The reason I'm asking is two-fold. If it's fixed in 8, then it will likely be easy to get the Oracle/OpenJDK folks to back-port that fix to 7u40, which is coming out next. Also, if time is of the essence, you may need to run a patched version of OpenJDK as opposed to Oracle's JVM (not as terrifying as you think).

This full thread really needs to go over to the OpenJDK mailing lists. I'll ping Dalibor (the community manager) to see if he can suggest which one it should go on (I suspect this is a hotspot-dev issue though).

Cheers,
Martijn


On 2 September [masked]:23, Bruno Bossola <[address removed]> wrote:
Hi Martijn,

Thanks for your answer, I am happy to answer your questions. I was just thinking about all these Java daemons in the world, running on a 64-bit VM + Linux, and what happens when the daylight saving switch is turned on or off... funny eh?


> 1.) I assume your system time is jumping backwards due to NTP syncing your servers?
>
I would not focus on this: the cause of the change in system time is orthogonal to this issue. Anyway, the system clock jump is something that happens in the virtualized environments of one of the clouds we are using: every 2-3 weeks the clock might somehow be reset wrongly. For that reason we already put an NTP sync in place, which tries to circumvent the issue by syncing veeeeery slowly.

Please note that this will certainly be a blocking issue when the system is deployed at a customer site, where we have no control over operations.


> 2.) Are your server architectures homogenous?
>
Orthogonal. At the moment, on the target system, yes.


> 3.) Do your CPUs support monotonically increasing clocks?
>
I see your point. This is consistently happening on a 64-bit VM when used on a 64-bit Linux system, regardless (at least apparently) of whether the underlying OS provides a monotonic clock.

This should not happen for primitives built on System.nanoTime() (such as the queue used internally by ScheduledExecutor), which should work correctly on a system with a monotonic clock:
jlong os::javaTimeNanos() {
  if (Linux::supports_monotonic_clock()) {
    struct timespec tp;
    int status = Linux::clock_gettime(CLOCK_MONOTONIC, &tp);
    assert(status == 0, "gettime error");
    jlong result = jlong(tp.tv_sec) * (1000 * 1000 * 1000) + jlong(tp.tv_nsec);
    return result;
  } else {
    timeval time;
    int status = gettimeofday(&time, NULL);
    assert(status != -1, "linux error");
    jlong usecs = jlong(time.tv_sec) * (1000 * 1000) + jlong(time.tv_usec);
    return 1000 * usecs;
  }
}
Unfortunately, for some reason, this is not the case on a 1.6+ 64-bit VM on 64-bit Linux.
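
A minimal Java sketch of the distinction at stake here, assuming a platform where System.nanoTime() is backed by a monotonic clock as in the first branch above. It times the same interval with the wall clock and with nanoTime(). The class and method names (ElapsedTime, elapsedMillis) are made up for illustration.

```java
import java.util.concurrent.TimeUnit;

public class ElapsedTime {

    /**
     * Elapsed milliseconds between two System.nanoTime() readings.
     * nanoTime() values are only meaningful as differences, so subtract
     * first and convert afterwards; the subtraction also survives overflow.
     */
    public static long elapsedMillis(long startNanos, long endNanos) {
        return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
    }

    public static void main(String[] args) throws InterruptedException {
        long wallStart = System.currentTimeMillis(); // follows system clock changes
        long monoStart = System.nanoTime();          // not tied to the time of day

        Thread.sleep(200);

        long wallElapsed = System.currentTimeMillis() - wallStart;
        long monoElapsed = elapsedMillis(monoStart, System.nanoTime());

        System.out.println("wall clock elapsed ms: " + wallElapsed);
        System.out.println("nanoTime elapsed ms:   " + monoElapsed);
    }
}
```

If the system clock is stepped backward while this runs, the first figure can come out negative, while the second should not, provided the clock behind nanoTime() really is monotonic. On an affected JVM the sleep itself may also take longer than requested.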


> 4.) What timeframe are your servers expected to be in time sync by
> (i.e. Is it OK if it takes a little extra elapsed time to synch the machines or does it have to be ASAP)

>
Orthogonal. The sync between servers is not an issue at the moment. This also happens on my laptop, and it's super easy to reproduce. This issue is just about a JVM running on a system where the system clock changes; my requirement is that IF the system time changes THEN the JVM does not malfunction.
Anyway, at the moment we are using a sloooow NTP syncing cycle to keep the issue from appearing.


> One last thought --> wait() and sleep() - have you considered coding/architectural
> changes to avoid these constructs?
>
To be honest I am not very interested in sleep() or wait(), as this was just the way I discovered the issue, and hopefully everywhere in our codebase we are using primitives of the java.util.concurrent package. However, just for the sake of argument, this problem also affects scheduled executors (which are part of that package) and I do not know what else... it's very scary. So at the moment the only coding/architectural changes I see to avoid these conflicts are:
- rewrite everything in, e.g., Python
- deploy it on Windows 64 (hey, it works there!)
Please also take into consideration not only our code but the millions of existing libraries (including the Java core libraries themselves) which use these primitives, based on documentation that does not even slightly mention the problem.
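
To make the scheduled-executor case concrete, here is a small probe, offered only as a sketch: it runs a fixed-rate task a few times and reports the gap between consecutive runs, measured with System.nanoTime(). The names SchedulerDriftProbe and measureGaps, and the 200 ms period, are assumptions chosen for illustration. Anonymous classes are used instead of lambdas so that it also compiles on Java 7.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SchedulerDriftProbe {

    /** Runs a fixed-rate task `ticks` times; returns the gaps (ms) between consecutive runs. */
    public static List<Long> measureGaps(int ticks, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        final List<Long> stamps = new ArrayList<Long>();
        final CountDownLatch done = new CountDownLatch(ticks);
        try {
            scheduler.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    // Stop recording once enough ticks were seen, so the list
                    // is never modified after the latch has been released.
                    if (done.getCount() > 0) {
                        stamps.add(System.nanoTime());
                        done.countDown();
                    }
                }
            }, 0, periodMillis, TimeUnit.MILLISECONDS);
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for ticks", e);
        } finally {
            scheduler.shutdownNow();
        }
        List<Long> gaps = new ArrayList<Long>();
        for (int i = 1; i < stamps.size(); i++) {
            gaps.add(TimeUnit.NANOSECONDS.toMillis(stamps.get(i) - stamps.get(i - 1)));
        }
        return gaps;
    }

    public static void main(String[] args) {
        // With a healthy scheduler every gap should be close to the period.
        System.out.println("gaps in ms: " + measureGaps(5, 200));
    }
}
```

With a larger tick count, rolling the system clock back while the probe runs would show up as one gap far larger than the period if the scheduler is affected.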

Furthermore, to be clearer about the issue, its extent, and how it touches the concurrency library, let me introduce this very simple program:
import java.util.concurrent.locks.LockSupport;

public class Main {

    public static void main(String[] args) {
        // Count down from 100, parking the thread for one second per step.
        for (int i = 100; i > 0; i--) {
            System.out.println(i);
            LockSupport.parkNanos(1000L * 1000L * 1000L);
        }
        System.out.println("Done!");
    }
}

While running it with a 64-bit 1.6+ JVM on 64-bit Linux, turn the clock back one hour and wait until the counter stops... magic! Then try to recover: most of the time it works :)
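
A variation on the program above, sketched here only as an aid for reproducing the problem: it measures how long each park really took, so a stall is visible in the output rather than just as a counter that stops. The names ParkProbe, parkAndMeasure and overshot, the default of 5 steps, and the "twice the requested time" threshold are all arbitrary choices for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

public class ParkProbe {

    /** Parks for up to `nanos` and returns the time actually spent, in nanoseconds. */
    public static long parkAndMeasure(long nanos) {
        long start = System.nanoTime();
        // parkNanos may also return early (spuriously); only long overshoots matter here.
        LockSupport.parkNanos(nanos);
        return System.nanoTime() - start;
    }

    /** True when a park took more than `factor` times the requested duration. */
    public static boolean overshot(long requestedNanos, long actualNanos, long factor) {
        return actualNanos > requestedNanos * factor;
    }

    public static void main(String[] args) {
        // Number of one-second steps; pass a larger value (e.g. 100) to leave
        // enough time for changing the system clock while the loop runs.
        int steps = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        long oneSecond = TimeUnit.SECONDS.toNanos(1);
        for (int i = steps; i > 0; i--) {
            long actual = parkAndMeasure(oneSecond);
            long millis = TimeUnit.NANOSECONDS.toMillis(actual);
            String flag = overshot(oneSecond, actual, 2) ? "  <-- stalled?" : "";
            System.out.println(i + ": parked for " + millis + " ms" + flag);
        }
        System.out.println("Done!");
    }
}
```

Run it as `java ParkProbe 100` and roll the clock back during the countdown; a flagged line (or a countdown that never resumes) would match the behaviour described above.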


Let's say that, at least in my humble opinion, you should not release software with such a regression: it's just ridiculous (in Java 1.4 this does not happen), and if you do so, you should put a BIG label on the front. And I'd really like to know where this bug
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6900441 ended up, as all the others are marked as duplicates of it: if I had the bug I could see what parts are affected and, as you are suggesting, work around it (at least) in my code.


Let me know your further thoughts, and thanks again for your answer.
Cheers,

    Bruno



On Sun, Sep 1, 2013 at 9:32 AM, Martijn Verburg <[address removed]> wrote:
Hi Bruno,

A couple of questions:

1.) I assume your system time is jumping backwards due to NTP syncing your servers?
2.) Are your server architectures homogenous?
3.) Do your CPUs support monotonically increasing clocks?
4.) What timeframe are your servers expected to be in time sync by (i.e. Is it OK if it takes a little extra elapsed time to synch the machines or does it have to be ASAP) 

One last thought --> wait() and sleep() - have you considered coding/architectural changes to avoid these constructs?


Cheers,
Martijn


On 31 August [masked]:44, Bruno Bossola <[address removed]> wrote:
Hi all,
 
These days my teams are hitting a bug in the 64-bit JVM on 64-bit Linux: "...there is bug in JVM for overall scheduling during System time changes backward, which also impacts very basic Object.wait & Thread.sleep methods. It becomes too risky to keep Java App running when system time switches back by even certain seconds. You never know what your Java App will end up to." (source: stackoverflow.com)

These are some of the consequences:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7139684
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6311057
 
If you want to see something saucy, the source bug is now unavailable:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6900441
 
See also here for a stackoverflow drill:
http://stackoverflow.com/questions/9044423/java-scheduler-which-is-completely-independent-of-system-time-changes

Unfortunately this bug is NOT fixed in the latest JVM, so the recommended course of action is to restart the VM if a big time jump happens (on small jumps the JVM will catch up).

Has anybody experienced this issue and found a viable solution, apart from a non-Java monitor program?
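
For reference, a rough sketch of what detecting a time jump from inside the JVM could look like: compare how far the wall clock moved with how far System.nanoTime() moved between two samples. This assumes nanoTime() is monotonic on the platform. The class ClockJumpDetector, its methods and the 500 ms tolerance are hypothetical. Since Thread.sleep is reported as affected, the polling loop may itself be delayed on an affected JVM, so this is illustrative rather than a dependable monitor.

```java
import java.util.concurrent.TimeUnit;

public class ClockJumpDetector {

    private long lastWallMillis;
    private long lastMonoNanos;

    public ClockJumpDetector(long wallMillis, long monoNanos) {
        this.lastWallMillis = wallMillis;
        this.lastMonoNanos = monoNanos;
    }

    /**
     * Records a new sample and returns the skew in milliseconds since the
     * previous one: wall-clock delta minus monotonic delta.
     * A negative value means the wall clock fell behind (moved backward).
     */
    public long sample(long wallMillis, long monoNanos) {
        long wallDelta = wallMillis - lastWallMillis;
        long monoDelta = TimeUnit.NANOSECONDS.toMillis(monoNanos - lastMonoNanos);
        lastWallMillis = wallMillis;
        lastMonoNanos = monoNanos;
        return wallDelta - monoDelta;
    }

    /** True when the skew exceeds the tolerance in either direction. */
    public static boolean isJump(long skewMillis, long toleranceMillis) {
        return Math.abs(skewMillis) > toleranceMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        ClockJumpDetector detector =
                new ClockJumpDetector(System.currentTimeMillis(), System.nanoTime());
        for (int i = 0; i < 5; i++) {
            Thread.sleep(1000);
            long skew = detector.sample(System.currentTimeMillis(), System.nanoTime());
            // 500 ms tolerance is an arbitrary value for this demo.
            System.out.println("skew ms: " + skew + (isJump(skew, 500) ? "  <-- clock jump" : ""));
        }
    }
}
```

What to do once a jump is flagged (log it, alert, restart the VM) is left open here.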

Cheers,

    Bruno

 




--
Please Note: If you hit "REPLY", your message will be sent to everyone on this mailing list ([address removed])
This message was sent by Bruno Bossola ([address removed]) from LJC - London Java Community.
To learn more about Bruno Bossola, visit his/her member profile

Meetup, POB 4668 #37895 NY NY USA 10163 | [address removed]





