r37 - 23 Sep 2008 - 22:07:23 - RuneMagnussen
REFACTOR See KnownOsIssuesDev for discussion of this topic.

9.2. Known Operating System Issues

9.2.1. Lost Interrupts

There are several ways for a system to miss timer interrupts. All of them cause trouble for timekeeping.

9.2.1.1. Scheduler HZ too high

See below for more discussion of several cases.

9.2.1.2. Disk drivers using non-DMA

Some early Linux distributions shipped with DMA for IDE disks disabled by default. Heavy disk activity would then provoke lost interrupts.

See man hdparm for info on how to change this setting.

http://www.megapathdsl.net/~hmurray/hacks/read.c is a program that will cause lots of disk activity to test this case.

9.2.2. Xen, VMware, and Other Virtual Machine Implementations

NTP was not designed to run inside of a virtual machine. It requires a high-resolution system clock and clock interrupts that are serviced promptly and consistently. No known virtual machine is capable of meeting these requirements.

Run NTP on the base OS of the machine, and then have your various guest OSes take advantage of the good clock that is created on the system. Even that may not be enough, as there may be additional tools or kernel options that you need to enable so that virtual machine clients can adequately synchronize their virtual clocks to the physical system clock.

9.2.2.1. VMware

In the specific case of VMware, you also need to install VMware Tools (or at least the headers from the VMware Tools), and the additional kernel options you need are probably going to be something like "clock=pit nosmp noapic nolapic", although you may need to do some experimenting to see which particular options work best for you. Once you've got these changes in place, you will need to set "tools.syncTime" to "true" in the vmx file. See also the VMware knowledgebase article Clock in a Linux Guest Runs More Slowly or Quickly Than Real Time.

When running VMware with Red Hat Linux, some additional things need to be done. I quote:

    With VMware 2.5.x and RHEL4 you need to go into the MUI under Advanced
    Options and set "Misc.TimerHardPeriod" to 333.  The default value is
    1000.  You always have to set the host rate faster than the guest.
    That's pretty much the culprit behind all the clock problems is that
    single setting.  By default the host isn't able to keep up with the
    guests requests, thus the guests lose time.

See also: http://www.vmware.com/pdf/vmware_timekeeping.pdf

9.2.2.2. Xen

It appears that Xen just passes time-related system calls to the underlying master domain, and does not require any additional changes to support time sync into the guest domains.

9.2.2.3. Final notes

If your management or your client insists on running an ntpd instance inside of a VM client, even in the face of all the above information, there is a solution. Simply add a "noselect" keyword to each of the "server" definitions, and your VM client ntpd will monitor the defined upstream servers, but it won't actually try to sync with any of them -- leaving that job to the copy of ntpd running on the base hardware, as defined above. This will allow your client applications to confirm that they have good time sync, through the use of the ntpq program and by looking at the offsets reported, while avoiding the problems of actually trying to set the system time on the VM client.
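
The "noselect" change described above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name add_noselect is hypothetical, the sample lines are borrowed from the Mac OS X example further down this page, and the helper only recognizes lines whose first word is "server".

```python
def add_noselect(conf_text: str) -> str:
    """Return conf_text with 'noselect' appended to every 'server' line.

    Hypothetical helper for illustration; it is not part of the NTP distribution.
    """
    out = []
    for line in conf_text.splitlines():
        # Split off any trailing comment so the keyword lands before it.
        body, sep, comment = line.partition("#")
        words = body.split()
        if words and words[0] == "server" and "noselect" not in words:
            body = body.rstrip() + " noselect"
            if sep:
                body += " "
        out.append(body + sep + comment)
    return "\n".join(out)


sample = "server de.pool.ntp.org iburst\nstatsdir /var/ntp/ntpstats"
print(add_noselect(sample))
```

Lines that are not "server" definitions pass through unchanged, so the result can be reviewed by hand before it replaces the ntp.conf of the VM client.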

However, do keep in mind that the kinds of additional/alternative kernel options you need to enable good time sync within your virtualization system may interfere with the proper operation of certain other types of programs. In that case, you need to make a decision -- do you run those applications under virtualization without good clock sync, or do you run them on a separate non-virtual machine that does have good clock sync?

Our thanks to Seph and Doug Hanks.

-- BradKnowles - 22 Feb 2007

9.2.3. Windows and Sun's Java Virtual Machine

Sun's Java Virtual Machine needs to be started with the -XX:+ForceTimeHighResolution parameter to avoid losing interrupts.

See http://www.macromedia.com/support/coldfusion/ts/documents/createuuid_clock_speed.htm for more information.

9.2.4. Linux

9.2.4.1. Kernel 2.4 (and Earlier)

9.2.4.1.1. Using a Local Refclock

  • First, you need to make sure that the PPSKit mods have been applied to your kernel. See PPSKit Implementation Status to see which is the right version of the kit for your system.
  • Second, if you're still having problems, make sure that the HZ= setting in your kernel configuration is set to 100. Some newer systems have come with this value set to 1000 instead, and that has tended to cause a lot of problems for some people by losing too many interrupts.

9.2.4.1.2. Without a Local Refclock

  • The PPSKit mods may not be necessary if you do not have a locally attached refclock (presumably over a serial line).
    • The issue seems to be primarily one of losing interrupts over a serial line that is very sensitive to delays.
  • You may still find that the PPSKit mods will make your ntpd server considerably more accurate and precise, even without a local refclock, due to the decrease in lost interrupts.
    • If you have a poorly performing ntpd which is not keeping good time on your system, you should seriously consider applying the PPSKit mods, or confirming that they are already applied, before you start assuming more serious hardware problems.

9.2.4.2. Kernel 2.6

9.2.4.2.1. Using a Local Refclock

  • The PPSKit mods have not been ported to kernel 2.6. There is currently no clear indication that the functionality provided by these mods has been subsumed into kernel 2.6.
  • You still have the same HZ= issue as shown above for kernel 2.4.
    • This is a bigger issue with kernel 2.6, since many distributions based on 2.6 are shipping with HZ= defaulting to a value of 1000.
  • Kernel 2.6 is still having problems working correctly with APIC and ACPI on many machines. You may need to disable APIC and/or ACPI at boot time before loading the OS, in order to get anything remotely resembling decent timekeeping.
  • See also the Dev Issues topic LinuxImplementationLinuxPPS
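
To check which HZ= value a 2.6 kernel was built with, one option is to look for CONFIG_HZ in the kernel build configuration. The following Python sketch assumes that the configuration is available as text (its location varies by distribution) and that the kernel exposes a CONFIG_HZ entry at all; the function name configured_hz is hypothetical.

```python
import re
from typing import Optional


def configured_hz(kernel_config_text: str) -> Optional[int]:
    """Return the CONFIG_HZ value found in kernel config text, or None if absent."""
    # Matches "CONFIG_HZ=1000" but not "CONFIG_HZ_1000=y".
    match = re.search(r"^CONFIG_HZ=(\d+)\s*$", kernel_config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


sample_config = "CONFIG_HZ_1000=y\nCONFIG_HZ=1000\n"
print(configured_hz(sample_config))
```

A result of 1000 suggests the HZ= issue described above may apply; None only means the entry was not found in the text that was supplied.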

9.2.4.2.2. Without a Local Refclock

  • If you do not have a local refclock, you may find that kernel 2.6 works adequately for you, once any HZ= and APIC/ACPI issues are dealt with.
    • Otherwise, stick with kernel 2.4 until these issues have been resolved.

9.2.4.2.3. Lost ticks causing clock instability

From http://gossamer-threads.com/lists/linux/kernel/494604

In 2.6, some code has been added to watch for "lost ticks" and increment the jiffies counter to compensate for them. A "lost tick" is when timer interrupts are masked for so long that ticks pile up and the kernel doesn't see each one individually, so it loses count.

Lost ticks are a real problem, especially in 2.6 with the base interrupt rate having been increased to 1000 Hz, and it's good that the kernel tries to correct for them. However, detecting when a tick has truly been lost is tricky. The code that has been added (both in timer_tsc.c's mark_offset_tsc and timer_pm.c's mark_offset_pmtmr) is overly simplistic and can get false positives. Each time this happens, a spurious extra tick gets added in, causing the kernel's clock to go faster than real time.

9.2.4.2.4. A problem with the Reiser file system

The addition of the Reiser file system to the kernel caused a problem with ntpd. It was unable to stay synchronized, and the clock lost more than 10 minutes per day if allowed to run freely. The stock 2.6.18 kernel from CentOS 5 had no problem. When the kernel interrupt rate (HZ) was reduced from 1000 to 250, the problem was solved. Apparently the Reiser FS produces enough interrupts to break the kernel clock at 1000 Hz. This occurred on a machine with a 2.4 GHz Intel Core Duo CPU.
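
To put the reported loss in perspective, it helps to express it in parts per million (ppm). The sketch below is plain arithmetic on the figure quoted above; the 500 ppm limit is ntpd's documented maximum slew rate and does not come from the original report.

```python
def drift_ppm(seconds_lost: float, interval_seconds: float) -> float:
    """Express a clock error accumulated over an interval in parts per million."""
    return seconds_lost / interval_seconds * 1_000_000


# 10 minutes lost per day, the lower bound given in the report above.
reiser_drift = drift_ppm(10 * 60, 24 * 60 * 60)

# Maximum rate at which ntpd will slew the clock (from the ntpd documentation).
NTPD_MAX_SLEW_PPM = 500

print(f"{reiser_drift:.0f} ppm, within slew limit: {reiser_drift <= NTPD_MAX_SLEW_PPM}")
```

A drift this far above the slew limit cannot be absorbed by gradual adjustment, which is consistent with ntpd being unable to stay synchronized.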

9.2.4.2.5. Running ntpd without root privileges

The Linux Capabilities mechanism allows ntpd to drop all root privileges, except for the one it actually needs (the privilege to set the system clock). How to use this feature:

  • You need the Default Linux Capabilities in your kernel, either as a module (modprobe capability), or statically (under Security Options in the kernel configuration menu)
  • You need a working(!) version of libcap.so (http://www.kernel.org/pub/linux/libs/security/linux-privs)
    If you get cap_set_proc(): failed to drop root privileges errors after a kernel upgrade, you may need to recompile this library!
  • ntpd must be configured with --enable-linuxcaps
  • ntpd must be started as root, but with a -u argument to give it a non-root user id to switch to
  • Optionally, you can use the -i argument to additionally chroot ntpd (in fact, -i without -u should also work: ntpd will then run chrooted, with user id 0 but without root privileges, but this is not recommended)
You can verify your setup by looking at /proc/<PID>/status: For ntpd running without privileges, it should contain the lines
   CapInh: 0000000002000000
   CapPrm: 0000000002000000
   CapEff: 0000000002000000      
while for a root shell, you should see
   CapInh: 0000000000000000
   CapPrm: 00000000fffffeff
   CapEff: 00000000fffffeff
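
The masks above are hexadecimal bit fields, and 0000000002000000 is bit 25, which is CAP_SYS_TIME in <linux/capability.h>. The check can be sketched in Python as follows; the function names are hypothetical, and the input is assumed to be the text of /proc/<PID>/status.

```python
CAP_SYS_TIME = 25  # bit number of CAP_SYS_TIME in <linux/capability.h>


def effective_caps(status_text: str) -> int:
    """Return the CapEff bitmask from the text of /proc/<PID>/status."""
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "CapEff":
            return int(value.strip(), 16)
    raise ValueError("no CapEff line found")


def only_sys_time(status_text: str) -> bool:
    """True if CAP_SYS_TIME is the only effective capability."""
    return effective_caps(status_text) == 1 << CAP_SYS_TIME


ntpd_status = "CapInh: 0000000002000000\nCapPrm: 0000000002000000\nCapEff: 0000000002000000"
root_status = "CapInh: 0000000000000000\nCapPrm: 00000000fffffeff\nCapEff: 00000000fffffeff"
print(only_sys_time(ntpd_status), only_sys_time(root_status))
```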

9.2.4.2.5.1. A problem with IPv6 interfaces after chroot

The ifiter_ioctl interface iterator reads IPv6 interface names from /proc/net/if_inet6. If no proc filesystem is mounted in the chroot jail, ntpd drops all IPv6 interfaces after startup.

The easy choices are

  • don't use chroot
  • mount proc in the chroot directory
  • disable interface updates with -U 0. ntpd will not notice any new or dropped interfaces anymore.
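
Whether a given chroot directory exposes the file that the interface iterator reads can be checked before starting ntpd. A minimal Python sketch, assuming the chroot directory is known; the function name and the example path /var/lib/ntp are hypothetical.

```python
import os


def chroot_has_if_inet6(chroot_dir: str) -> bool:
    """True if proc/net/if_inet6 is visible inside the given chroot directory."""
    return os.path.isfile(os.path.join(chroot_dir, "proc", "net", "if_inet6"))


# Hypothetical chroot location; substitute the directory passed to ntpd with -i.
print(chroot_has_if_inet6("/var/lib/ntp"))
```

If this returns False for your chroot directory, one of the choices listed above applies.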

It might also work to

  • change ifiter_ioctl to enumerate IPv6 interfaces by another method. IPv4 interfaces are enumerated through ioctl on a socket.
  • install libinet6 to enable getifaddrs()

9.2.4.2.6. SELinux

Using SELinux with ntpd is known to cause problems. You will need to figure out how to configure SELinux to allow ntpd to access the system calls that it needs in order to set the system time, or you will need to figure out how to turn off all SELinux features with regard to ntpd.

As we get more information on how to do these kinds of things, we will add detail to this section.

9.2.4.2.7. Using udev

Most Linux distributions use udev to manage /dev. To set up a symlink to a refclock device you need a udev rule like this one:

KERNEL=="ttyS0", SYMLINK+="refclock-0"

Very old versions of udev will need this instead:

NAME=="ttyS0", SYMLINK+="refclock-0"

The rule has to be defined after the rule for the device it links to. If your distribution supports udev rules in multiple files, you should put the refclock rules in a file by themselves to ease maintenance.

9.2.4.2.8. Kernel 2.6 Mis-Detecting CPU TSC Frequency

Starting with Linux kernel 2.6.18, the CPU's Time Stamp Counter (TSC) is used to keep time, and the kernel sometimes mis-detects the frequency of this counter at boot. This may result in severe clock drift which is impossible for ntpd to correct.

One solution to this problem is to change back to the old "acpi_pm" clock, which is what was used in earlier kernels. For example, in your grub.conf file, you can add the following to the kernel line:

        clocksource=acpi_pm

Then reboot. A similar procedure is apparently possible with earlier versions of kernel 2.6, which use a "clock=" designation instead of "clocksource=".
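
After rebooting, you can confirm which clocksource the kernel actually selected. The Python sketch below reads the sysfs files that kernels using "clocksource=" normally expose; the function name is hypothetical, and the sysfs directory is an assumption about your system, not something taken from the report above.

```python
from pathlib import Path
from typing import List, Tuple

# sysfs directory where the kernel exposes its clocksource selection (assumed present).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base: Path = CLOCKSOURCE_DIR) -> Tuple[str, List[str]]:
    """Return (current clocksource, list of available clocksources)."""
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


# Only query the live system if the sysfs file is actually there.
if (CLOCKSOURCE_DIR / "current_clocksource").is_file():
    print(read_clocksources())
```

If the first element is not "acpi_pm" after the change, the boot parameter was probably not picked up.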

Our thanks to Jordan Russell for locating and resolving this issue.

9.2.4.3. AppArmor causing "permission denied" errors

AppArmor is a security tool which has been developed by Novell and has made its way into the SuSE Linux/openSUSE distribution, and maybe also other distributions.
See: http://en.opensuse.org/AppArmor_Detail

AppArmor uses profiles to control which system devices and resources may be accessed by an application, allowing finer control than the standard Unix rights management. If an application tries to access a resource for which it has not been granted sufficient rights, access is prevented and a "permission denied" error occurs.

AppArmor is shipped with default profiles which work with the standard installation, but if an application's configuration is modified to use some non-standard configuration then the AppArmor profile has to be modified accordingly. This affects any application, not only ntpd.

The AppArmor profile for ntpd may require modification if refclock devices have been configured manually, or even if ntpd is to generate log files or statistics files.

To check whether a "permission denied" problem is related to AppArmor, you can temporarily stop AppArmor and see whether the problem persists.

If AppArmor is to be used, it must be configured to allow access to the refclock device used by ntpd. Under SuSE Linux/openSUSE this may be done using the configuration tool, yast2. To add an entry for a refclock /dev/refclock-0 which points to /dev/ttyS0:

  Yast2 -> Novell AppArmor -> Edit Profile
  Select profile /usr/sbin/ntpd
  Add entry: /dev/ttyS0
  Mark allow for: Read, Write, Link

This generates a new entry in the AppArmor profile file:

  /dev/ttyS0   rwl

Similar changes may be required to allow log or statistics files to be generated by ntpd under AppArmor control.

Please note that the symbolic links (e.g. /dev/refclock-0) are recreated after every reboot. If this doesn't appear to happen, you must create a udev rule for this. See also 9.2.4.2.7. Using udev.

9.2.5. Mac OS X

The Mac OS X method of enabling ntpd is to choose System Preferences... from the Apple menu, go to the Date & Time sheet, select the Date & Time sub-panel, and click on the radio button labeled Set Date & Time automatically. You can then select a time server from a drop-down, or fill in the name of your own preferred time server.

Note that every time you exit this preference sheet, the system will re-write your /etc/ntp.conf based on the information you have provided.

  • Even if you have provided your own /etc/ntp.conf, there is no way to prevent the system from re-writing it based on the content of this field.
  • Even if you don't make any changes to this preference, just by going into this sub-panel and exiting back out, Mac OS X will re-write your /etc/ntp.conf.

Unfortunately, when Mac OS X creates the /etc/ntp.conf file, it will do nasty things like appending "minpoll 12 maxpoll 17" to every single line, including those lines which do not have a "server" directive.

  • Worse, any line that had more than two arguments will get the rest truncated and replaced by "minpoll 12 maxpoll 17".
  • All lines will get the directive "server" prepended to them, even if they weren't originally server directives.

Here's a sample input /etc/ntp.conf file:

         tos minclock 4 minsane 4
         server time.euro.apple.com iburst
         server de.pool.ntp.org iburst
         server fr.pool.ntp.org iburst
         server nl.pool.ntp.org iburst
         server uk.pool.ntp.org iburst
         server 0.europe.pool.ntp.org iburst
         server 1.europe.pool.ntp.org iburst
         server 2.europe.pool.ntp.org iburst
         server 127.127.1.0              # Local clock
         fudge 127.127.1.0 stratum 14    # Undisciplined
         statsdir /var/ntp/ntpstats
         filegen peerstats file peerstats type day enable
         filegen loopstats file loopstats type day enable
         filegen clockstats file clockstats type day enable

Here's what Mac OS X will munge this into:

         server tos minpoll 12 maxpoll 17
         server time.euro.apple.com minpoll 12 maxpoll 17
         server de.pool.ntp.org minpoll 12 maxpoll 17
         server fr.pool.ntp.org minpoll 12 maxpoll 17
         server nl.pool.ntp.org minpoll 12 maxpoll 17
         server uk.pool.ntp.org minpoll 12 maxpoll 17
         server 0.europe.pool.ntp.org minpoll 12 maxpoll 17
         server 1.europe.pool.ntp.org minpoll 12 maxpoll 17
         server 2.europe.pool.ntp.org minpoll 12 maxpoll 17
         server 127.127.1.0              # minpoll 12 maxpoll 17
         server fudge minpoll 12 maxpoll 17
         server statsdir minpoll 12 maxpoll 17
         server filegen minpoll 12 maxpoll 17
         server filegen minpoll 12 maxpoll 17
         server filegen minpoll 12 maxpoll 17

The result is totally bogus, won't parse, and will prevent ntpd from starting up.

If you're going to maintain your own /etc/ntp.conf file, you need to make sure you save a copy to something like /etc/ntp.conf.save.europe (or whatever you prefer), so that you can restore a good working copy after Mac OS X munges it beyond recognition.
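
Restoring the saved copy can be automated. The following Python sketch compares the live file with the saved copy and copies the saved one back only if they differ. The function name is hypothetical, both files are assumed to exist, and writing to /etc requires suitable privileges.

```python
import filecmp
import shutil


def restore_if_changed(live_path: str, saved_path: str) -> bool:
    """Copy saved_path over live_path if their contents differ.

    Returns True if a restore was performed, False if the files already matched.
    """
    # shallow=False forces a byte-by-byte comparison instead of trusting file metadata.
    if filecmp.cmp(live_path, saved_path, shallow=False):
        return False
    shutil.copyfile(saved_path, live_path)
    return True
```

With the file names used above, the call would be restore_if_changed("/etc/ntp.conf", "/etc/ntp.conf.save.europe"). ntpd still has to be restarted afterwards to pick up the restored configuration.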

You will probably also want to change the code in /System/Library/StartupItems/NetworkTime/NetworkTime so as to remove the call to ntpdate and change the invocation of ntpd to include a "-g" option on the command line.

Otherwise, you will probably want to start and stop ntpd manually, outside of the control of Mac OS X.

9.2.6. Sun

9.2.6.1. Sun Device Drivers

9.2.6.1.1. su Driver

An issue with the Sun su driver has surfaced with respect to PPS support. Currently (200508) the su driver does not support PPS correctly in some configurations. Sun is working on a patch for that issue. For more information please refer to Bug #361.
Copyright © 1999-2008 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.