The Dutch Prutser's Blog

By: Harald van Breederode

  • Disclaimer

    The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Why does my Linux virtual machine lose time?

Posted by Harald van Breederode on February 8, 2009

Although I am a real Unix adept, I run Windows XP on my laptop because it is much more accessible to the blind: there are more screen readers available for it, and I find them more feature rich than Unix screen readers (if those exist at all). That doesn’t mean I can’t run Linux on my laptop as well. As a matter of fact, I prefer to run my Oracle demos in a VMware Server virtual machine running Oracle Enterprise Linux, which in turn runs my Oracle10g and Oracle11g databases.

The problem I encountered with this setup is that my Linux virtual machines lose time: the clock falls behind rather quickly. After a bit of research I discovered that the cause is that the default Linux kernel runs at a 1000Hz internal clock frequency, and VMware is unable to deliver all clock interrupts on time. Some interrupts are lost without notice to the Linux kernel, which assumes that each interrupt marks 1/1000th of a second. So every lost clock interrupt makes the clock fall behind by 1/1000th of a second.

Although you can let VMware synchronize the guest O/S clock to the host O/S clock, I don’t recommend this because it makes your Linux clock very bumpy. As I understand it, with this clock synchronization enabled VMware sets the Linux clock equal to the host clock every minute. So if my clock falls 3 seconds behind every minute, it will jump forward 3 seconds each time VMware does its synchronization thing. You can imagine what this means for the Oracle database instrumentation. I also tried to keep the clock synchronized using the Network Time Protocol (NTP), but that didn’t work because the time loss is too unpredictable and NTP gave up. Nothing else I tried solved the problem either. The solution is to recompile the Linux kernel with a 100Hz internal clock frequency.
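To put numbers on the drift described above, here is a small sketch of the arithmetic. The function names drift_ms and ticks_per_minute are made up for this illustration; the only inputs taken from the text are the 1000Hz and 100Hz tick rates and the example of 3 seconds lost per minute.

```shell
#!/bin/sh
# Clock drift caused by lost timer interrupts.
# Each lost tick costs 1/HZ seconds; the result is in milliseconds.
drift_ms() {
    lost_ticks=$1   # number of timer interrupts that never arrived
    hz=$2           # kernel tick rate in Hz
    echo $(( lost_ticks * 1000 / hz ))
}

# Number of ticks the kernel expects per minute at a given tick rate.
ticks_per_minute() {
    echo $(( $1 * 60 ))
}

# Falling 3 seconds behind per minute at 1000Hz corresponds to
# 3000 lost ticks out of the 60000 expected in that minute.
echo "expected ticks/min at 1000Hz: $(ticks_per_minute 1000)"
echo "drift for 3000 lost ticks:    $(drift_ms 3000 1000) ms"
echo "expected ticks/min at 100Hz:  $(ticks_per_minute 100)"
```

At 100Hz the host has to deliver ten times fewer interrupts per minute, which is why lowering the tick rate helps.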

Recompiling the Linux kernel

Note: The following procedure is only applicable to Oracle Enterprise Linux 5. If there is enough demand I will explain the procedure for Oracle Enterprise Linux 4 in a future posting.

To recompile the Linux kernel I first need to know which kernel I am running and second I need to get the kernel source code for that kernel. I can get the kernel release with the uname command as shown below:

# uname -r
2.6.18-128.0.0.0.2.el5

Next I can download the kernel source code from the Oracle Open Source website. In my case I need to download the kernel-2.6.18-128.0.0.0.2.el5.src.rpm file. Once downloaded I can install this kernel source RPM with the rpm command as follows:

# rpm -i kernel-2.6.18-128.0.0.0.2.el5.src.rpm

Note: The ‘#’ prompt indicates that I ran this as the root user. Also, there will be warnings, which can be ignored.
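The source RPM file name above is simply the kernel release wrapped in a fixed prefix and suffix, so it can be derived instead of typed. A minimal sketch, assuming the naming pattern seen in my case holds for other releases too; the helper name srpm_name is made up for this example:

```shell
#!/bin/sh
# Build the kernel source RPM file name for a given kernel release.
# Assumption: the file is always named kernel-<release>.src.rpm.
srpm_name() {
    echo "kernel-$1.src.rpm"
}

# Name of the source RPM matching the kernel that is running right now.
srpm=$(srpm_name "$(uname -r)")
echo "source RPM to download: $srpm"

# Only print the install command; running it requires root and the file.
echo "install with: rpm -i $srpm"
```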

The kernel sources are now installed in the /usr/src/redhat/SOURCES directory, and a so-called SPEC file, which will be used to build the kernel RPM, is installed in /usr/src/redhat/SPECS. Before recompiling the kernel I first need to change the internal clock frequency from 1000Hz to 100Hz. This is done by changing a setting in a configuration file. The name of this configuration file depends on the hardware architecture, so I first need to get my machine type with the uname command as follows:

# uname -m
i686

The configuration file is located in /usr/src/redhat/SOURCES and the name is kernel-2.6.18-i686.config. In this file I need to change the line with CONFIG_HZ_1000=y into CONFIG_HZ_100=y and I am ready to compile the kernel with the rpmbuild command given the SPEC file as its input as shown below:

# cd /usr/src/redhat/SPECS
# rpmbuild --target=i686 -bb kernel-2.6.spec

This will run for an hour or more, generating lots of output. Once finished, the compiled kernel RPM is in /usr/src/redhat/RPMS/i686 with the name kernel-2.6.18-128.0.0.0.2.el5.i686.rpm, waiting to be installed.
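The configuration change that has to be made before running rpmbuild can also be scripted rather than done in an editor. The following is only a sketch: the function name set_hz_100 is made up, it touches only the one line named above, and the demonstration runs on a scratch file instead of the real kernel-2.6.18-i686.config.

```shell
#!/bin/sh
# Switch a kernel config file from a 1000Hz to a 100Hz tick.
# A backup of the original is kept as <file>.orig.
set_hz_100() {
    cfg=$1
    cp "$cfg" "$cfg.orig"
    sed -e 's/^CONFIG_HZ_1000=y$/CONFIG_HZ_100=y/' "$cfg.orig" > "$cfg"
}

# Demonstration on a scratch file (made-up contents, not the real config).
demo=$(mktemp)
printf 'CONFIG_X86=y\nCONFIG_HZ_1000=y\n' > "$demo"
set_hz_100 "$demo"
grep '^CONFIG_HZ' "$demo"    # show the tick rate line after the change
rm -f "$demo" "$demo.orig"
```

On the real system the call would be set_hz_100 /usr/src/redhat/SOURCES/kernel-2.6.18-i686.config, run as root.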

Installing the new kernel

The new kernel can be installed with the rpm command, but the same kernel is currently running, so a reboot with a different kernel is required before continuing. After the reboot I recommend removing the currently installed kernel with rpm before installing the newly compiled one as follows:

# rpm -e kernel-2.6.18-128.0.0.0.2.el5
# rpm -i kernel-2.6.18-128.0.0.0.2.el5.i686.rpm

Note: It is possible that the kernel remove fails due to dependencies which have to be removed before the kernel is removed, and reinstalled afterwards.

A final reboot is required. After setting the clock it should never run behind again, but setting up NTP is still a wise thing to do.
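After the final reboot it is worth confirming that the running kernel really was built with the lower tick rate. The sketch below assumes that the kernel RPM installs its build configuration as /boot/config-<release>, which is common on Red Hat style systems but is an assumption here; the helper name show_hz is made up.

```shell
#!/bin/sh
# Print the tick rate settings recorded in a kernel config file.
# Matches lines such as CONFIG_HZ_100=y or CONFIG_HZ=100.
show_hz() {
    grep -E '^CONFIG_HZ(_[0-9]+)?=' "$1"
}

# Assumed location of the running kernel's build configuration.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    show_hz "$cfg" || echo "no CONFIG_HZ line found in $cfg"
else
    echo "no readable config at $cfg"
fi
```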

Warning: Recompiling (and installing) the Linux kernel yourself makes your environment unsupported by Oracle and should never be done on a production environment.

-Harald

8 Responses to “Why does my Linux virtual machine lose time?”

  1. Ilmar Kerm said

    In VMWare KB there is an article about timekeeping in Linux.
    No kernel recompilation is required.

    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427

  2. Paul van Eldijk said

    @ Ilmar:
    Have you tried VMWare’s solutions? I’ve never gotten them to work, but OK, YMMV.

    @ Harald:
    It seems that more recent Linux-releases (e.g. RH-ES 5, CentOS 5) actually do work properly under VMWare

    regards,
    Paul

  3. Ilmar Kerm said

    Yes, I have used these parameters with RHEL 3, 4 and 5 and they work. But I use only VMWare Infrastructure, not VMWare Server.

  4. Adrian Hollay said

    Hello,

    for Suse SLES running 32-bit and 64-bit kernels I am using the “clock=pit” kernel setting and just crontab to set the clock every minute using ntpdate:
    * * * * * /usr/sbin/ntpdate 192.168.1.1 >> /var/log/ntpdate.log 2>&1

    The offsets are very small:


    12 Feb 12:44:06 ntpdate[24789]: adjust time server 192.168.1.1 offset 0.000422 sec
    12 Feb 12:45:06 ntpdate[26079]: adjust time server 192.168.1.1 offset -0.000209 sec
    12 Feb 12:46:06 ntpdate[27386]: adjust time server 192.168.1.1 offset 0.000134 sec
    12 Feb 12:47:06 ntpdate[28688]: adjust time server 192.168.1.1 offset 0.007115 sec
    12 Feb 12:48:06 ntpdate[29965]: adjust time server 192.168.1.1 offset -0.010231 sec
    12 Feb 12:49:06 ntpdate[31267]: adjust time server 192.168.1.1 offset 0.005129 sec

    This config has worked for me for a couple of years already, and it is also suitable for running Oracle Clusterware.

    Adrian Hollay

  5. Harald van Breederode said

    Hi Ilmar,

    Many, many thanks for pointing me to this KB article. Yesterday I installed a 1000Hz kernel in both an EL4 and an EL5 VMware Server VM, and they both run on time with the mentioned kernel parameters, even without NTP. This saves me quite a few kernel compilations.

    If I had known this a week ago, I wouldn’t have had to write this posting ;-)
    -Harald

  6. […] then me Marco Gralike • Prutser for breaking open the kernel discussion, good article there. http://prutser.wordpress.com/2009/02/08/why-does-my-linux-virtual-machine-lose-time/ •  VMware for maintaining there KB so well • You for taking the time to read this […]

  7. dikkiedick said

    Harald,

    I noticed during the training you gave last week that the time on your VMware servers was falling further and further behind your Windows laptop’s time. Why was that, given that you’ve found a solution for the problem?

    Greetings, Dick

  8. Harald van Breederode said

    Hi Dick,

    I recently installed a new kernel and somehow the extra kernel arguments “clock=pmtmr divider=10” from /etc/grub.conf were lost in this process. I re-added them and time keeping is back to normal.
    -Harald
