[20130705] The gettimeofday() system call.txt

Original post · Linux operating systems · Author: lfree · Date: 2013-07-05 16:45:35

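Link: http://space.itpub.net/267265/viewspace-755535
Today I wanted to take a look at gettimeofday and came across this link:
http://www.scaleabilities.co.uk/2012/12/18/who-stole-gettimeofday-from-oracle-straces/

I ran the tests myself on a machine with CentOS 6.2 installed.

According to the linked article, gettimeofday calls may not show up when tracing with strace (on RHEL 5.5 and later).

Excerpt: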

System Call Interface

    Historically, the method for making a system call was via a software interrupt. This interrupt signalled the kernel to
switch into the kernel context and to process the system call on behalf of the user process. There is lots of detail
surrounding this such as how parameters are passed, security checks (because the kernel operates with full access to
hardware, including all system memory), invalidating TLB entries, and so forth, but we won't go into that here. The
important part is that this interface became increasingly unscalable, particularly across multiple SMP CPUs.  Intel
addressed the scalability issue in the Pentium II generation through the creation of a Fast System Call interface using
the SYSENTER and SYSEXIT instructions instead of interrupts. As this method did not exist in previous processor
generations, the Linux kernel needed a way to determine which method of invoking system calls was optimal for all
processors that were/are supported by the Operating System. This implementation needed to insulate the user code against
any kind of porting between different processor types, and the result was the implementation of the Virtual Dynamic
Shared Object library. This is a special library that the kernel maps into the address space of any user process and
implements the code for the Fast System Call interface.  Using ldd we can see that it is not actually represented by any
underlying file, just a memory address:
    
$ ldd /u01/app/oracle/product/10.2.0/db_1/bin/oracle
        linux-vdso.so.1 =>  (0x00007fff86dfc000)
        libskgxp10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libskgxp10.so (0x00007fb32b12e000)
        libhasgen10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libhasgen10.so (0x00007fb32af37000)
        libskgxn2.so => /u01/app/oracle/product/10.2.0/db_1/lib/libskgxn2.so (0x00007fb32ae35000)
        libocr10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libocr10.so (0x00007fb32acc8000)
        libocrb10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libocrb10.so (0x00007fb32ab86000)
        libocrutl10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libocrutl10.so (0x00007fb32aa11000)
        libjox10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libjox10.so (0x00007fb329f3f000)
        libclsra10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libclsra10.so (0x00007fb329e35000)
        libdbcfg10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libdbcfg10.so (0x00007fb329d17000)
        libnnz10.so => /u01/app/oracle/product/10.2.0/db_1/lib/libnnz10.so (0x00007fb329877000)
        libaio.so.1 => /lib64/libaio.so.1 (0x0000003f5d200000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003f5da00000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003f5e200000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003f5de00000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003f6e600000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003f5d600000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003f5ce00000)
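
The fact that the vDSO is a kernel-provided mapping rather than a file can also be checked from inside a process. The following is a minimal sketch, assuming a glibc recent enough to provide getauxval() (2.16 or later, so it may not build on older distributions); the helper names vdso_base, vdso_looks_like_elf and print_vdso_info are made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/auxv.h>   /* getauxval(), AT_SYSINFO_EHDR */

/* Address where the kernel mapped the vDSO, or 0 if it is not reported. */
unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Return 1 if the mapping at vdso_base() begins with the ELF magic bytes. */
int vdso_looks_like_elf(void)
{
    unsigned long base = vdso_base();

    if (base == 0)
        return 0;
    return memcmp((const void *)base, "\177ELF", 4) == 0;
}

/* Hypothetical helper: print what was found. */
void print_vdso_info(void)
{
    printf("vDSO base: 0x%lx, ELF header present: %s\n",
           vdso_base(), vdso_looks_like_elf() ? "yes" : "no");
}
```

Calling print_vdso_info() from main() should report an address comparable to the one ldd prints next to linux-vdso.so.1, although the exact value normally differs from run to run.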

    The linux-vdso.so.1 entry at the top there is the VDSO library, which used to be referred to as linux-gate in previous
versions. This article has some further information on how this is all structured, including disassembly of the actual
library showing the call to SYSENTER.

    Anyway, that's one of the purposes of the VDSO library, but it is more than that in concept; it is an abstraction of
what a system call actually is. With this abstraction in place it becomes possible to perform other magic tricks under
the hood without any change to the user code. The particular trick that we are interested in for this blog entry is the
speedup of the gettimeofday(2) call.

Vsyscall64

    As documented here, the VDSO also contains some mapping directly to kernel code. Specifically, it allows the user
process some shortcuts to the system call interface by directly executing some/all of the code in user mode rather than
kernel mode. This is controlled via a kernel flag which can be viewed and modified via the /proc/sys/kernel/vsyscall64
special file. The possible values are as follows:

0 - Provides the most accurate time intervals at μs (microsecond) resolution, but also produces the highest call
overhead, as it uses a regular system call

1 - Slightly less accurate, although still at μs resolution, with a lower call overhead

2 - The least accurate, with time intervals at the ms (millisecond) level, but offers the lowest call overhead
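
As a small illustration of how a program could inspect this flag, here is a sketch that reads an integer from a proc-style file and maps it to the descriptions listed above. The function names read_int_file and vsyscall64_mode are hypothetical, and the file only exists on kernels that expose /proc/sys/kernel/vsyscall64, so a missing file is reported as -1.

```c
#include <stdio.h>

/* Read a single integer from a proc-style file.
 * Returns the value, or -1 if the file is missing or cannot be parsed. */
int read_int_file(const char *path)
{
    FILE *f = fopen(path, "r");
    int value;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Map a vsyscall64 value to a short form of the description above. */
const char *vsyscall64_mode(int value)
{
    switch (value) {
    case 0:  return "regular system call, highest overhead";
    case 1:  return "microsecond resolution, lower overhead";
    case 2:  return "millisecond resolution, lowest overhead";
    default: return "unknown or not available";
    }
}
```

A caller would typically pass "/proc/sys/kernel/vsyscall64" to read_int_file() and print vsyscall64_mode() of the result.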

    As from RHEL 5.5, the default value for this has become "1", which means by implication that it does not "use a
regular system call" as it did when the default was "0". I have not fully investigated exactly what is executed in the
various modes from a kernel code perspective, but I could make a fair guess from the descriptions:

0 – Regular system call, via Fast System Call interface if available

1 – Mostly user mode, probably just reading from a page of kernel-updated memory which is mapped into the user address
space. Some kernel code must be necessary to ensure the value in memory reflects the current timestamp

2 – Just user mode. Read the mapped memory page and rely on the kernel to update the value every millisecond.

    Like I said, that's just an educated guess. If anyone's looked into this, please do comment and I'll update
accordingly.
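
One way to probe the resolution difference described for the three modes is to look at the smallest step the clock takes between consecutive readings. The sketch below is only an approximation under that assumption: smallest_step_us is a made-up name, and the result also depends on how long each call takes, so it gives an upper bound on granularity rather than an exact figure.

```c
#include <stddef.h>
#include <sys/time.h>

/* Call gettimeofday() repeatedly and return the smallest nonzero step
 * (in microseconds) seen between consecutive readings, or -1 if the
 * clock never advanced or a call failed. */
long smallest_step_us(int samples)
{
    struct timeval prev, cur;
    long best = -1;
    int i;

    if (gettimeofday(&prev, NULL) != 0)
        return -1;
    for (i = 0; i < samples; i++) {
        long delta;

        if (gettimeofday(&cur, NULL) != 0)
            return -1;
        delta = (long)(cur.tv_sec - prev.tv_sec) * 1000000L
              + (long)(cur.tv_usec - prev.tv_usec);
        if (delta > 0 && (best < 0 || delta < best))
            best = delta;
        prev = cur;
    }
    return best;
}
```

If the description of mode 2 is right, a run with vsyscall64 set to 2 would be expected to report steps on the order of a millisecond, while modes 0 and 1 would report steps of a few microseconds or less.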

    Let's run a few tests. To check this out I wrote a small C program to essentially just call gettimeofday(2) over and
over again:

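#include <stdio.h>      /* perror */
#include <sys/time.h>   /* gettimeofday, struct timeval */
 
int
main(void) {
 
    int i;
    struct timeval tv;
 
    /* Call gettimeofday() one million times and report any failure. */
    for (i=1000000;i>0;i--)
        if (gettimeofday(&tv, 0))
            perror("gtod");
    return 0;
}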
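
To see the difference between a call served in user space and one that really enters the kernel, the same loop can be run two ways: through the C library wrapper, and through syscall(2) with SYS_gettimeofday, which always issues a real system call and therefore shows up in strace regardless of the vsyscall64 setting. This is a sketch, not part of the original test; the function names are invented for illustration.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_gettimeofday */
#include <sys/time.h>     /* gettimeofday, struct timeval */
#include <time.h>         /* clock_gettime, CLOCK_MONOTONIC */
#include <unistd.h>       /* syscall() */

/* Nanoseconds from a monotonic clock, used only to time the loops. */
static long long now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Time n calls through the C library wrapper (may be served by the vDSO). */
long long time_libc_calls(int n)
{
    struct timeval tv;
    long long start = now_ns();
    int i;

    for (i = 0; i < n; i++)
        gettimeofday(&tv, NULL);
    return now_ns() - start;
}

/* Time n calls that always enter the kernel via syscall(2). */
long long time_raw_syscalls(int n)
{
    struct timeval tv;
    long long start = now_ns();
    int i;

    for (i = 0; i < n; i++)
        syscall(SYS_gettimeofday, &tv, NULL);
    return now_ns() - start;
}
```

Printing both results for the same n gives a rough per-call comparison. Older glibc versions may need -lrt at link time for clock_gettime().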

-- I ran the test in my own environment: OS = CentOS 6.2, kernel version:
# uname -a
Linux xxxx 2.6.32-220.17.1.el6.x86_64 #1 SMP Wed May 16 00:01:37 BST 2012 x86_64 x86_64 x86_64 GNU/Linux

# echo 0 >| /proc/sys/kernel/vsyscall64
# time strace -c ./gtod
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.84    0.095268           0   1000000           gettimeofday
  0.16    0.000152          19         8           mmap
  0.00    0.000000           0         1           read
  0.00    0.000000           0         2           open
  0.00    0.000000           0         2           close
  0.00    0.000000           0         2           fstat
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         1           brk
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.095420               1000023         1 total

real    0m33.072s
user    0m4.665s
sys     0m32.098s

# echo 1 >| /proc/sys/kernel/vsyscall64
# time strace -c ./gtod
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  -nan    0.000000           0         1           read
  -nan    0.000000           0         2           open
  -nan    0.000000           0         2           close
  -nan    0.000000           0         2           fstat
  -nan    0.000000           0         8           mmap
  -nan    0.000000           0         3           mprotect
  -nan    0.000000           0         1           munmap
  -nan    0.000000           0         1           brk
  -nan    0.000000           0         1         1 access
  -nan    0.000000           0         1           execve
  -nan    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                    23         1 total

real    0m0.065s
user    0m0.062s
sys     0m0.003s

# echo 2 >| /proc/sys/kernel/vsyscall64
# cat /proc/sys/kernel/vsyscall64
2
# time strace -c ./gtod
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  -nan    0.000000           0         1           read
  -nan    0.000000           0         2           open
  -nan    0.000000           0         2           close
  -nan    0.000000           0         2           fstat
  -nan    0.000000           0         8           mmap
  -nan    0.000000           0         3           mprotect
  -nan    0.000000           0         1           munmap
  -nan    0.000000           0         1           brk
  -nan    0.000000           0         1         1 access
  -nan    0.000000           0         1           execve
  -nan    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                    23         1 total

real    0m0.066s
user    0m0.064s
sys     0m0.001s
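
The three "real" times above can be turned into an average cost per call. The sketch below does only that arithmetic; per_call_us and print_gtod_summary are hypothetical names, and the figures include program startup and strace overhead, so they are upper bounds rather than the true cost of gettimeofday().

```c
#include <stdio.h>

/* Average cost per call in microseconds for a given wall-clock duration. */
double per_call_us(double elapsed_seconds, long calls)
{
    if (calls <= 0)
        return 0.0;
    return elapsed_seconds * 1e6 / (double)calls;
}

/* Print per-call averages for the three "real" times reported above
 * (33.072 s, 0.065 s and 0.066 s for one million calls each). */
void print_gtod_summary(void)
{
    const double real_seconds[3] = { 33.072, 0.065, 0.066 };
    int mode;

    for (mode = 0; mode < 3; mode++)
        printf("vsyscall64=%d: %.3f us per call\n",
               mode, per_call_us(real_seconds[mode], 1000000L));
}
```

With vsyscall64=0 that works out to roughly 33 microseconds per traced call, against well under a tenth of a microsecond for the other two settings.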


-- Repeating the test from the linked article:

SQL> @ver

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi

create table t1 ( c1 number ,c2 clob);
insert into t1 select rownum, lpad('a',32,'a') from dual connect by level<=10000;
commit ;
exec dbms_stats.gather_table_stats(user, 't1',method_opt=>'for all columns size 1');

SQL> select spid from v$process where addr in (select paddr from v$session where sid in (select sid from v$mystat where rownum=1));
SPID
------------
15664

$ cat  /tmp/a.sql
set timing on
set autot traceonly;
select * from t1;
quit


-- Test results for the three settings:
# echo 0 >| /proc/sys/kernel/vsyscall64
$ time strace -f -c sqlplus system/xxxx @/tmp/a.sql

...
Elapsed: 00:00:22.18

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 65.63    0.106459           2     60292           read
 16.42    0.026644           0    251228           gettimeofday
  7.22    0.011710           0    111292           getrusage
  7.12    0.011548           0     60117           write
  3.54    0.005747           0     60220           times
  0.02    0.000031           1        34           munmap
  0.01    0.000021           0       223       113 open
  0.01    0.000020           0       159           mmap
  0.01    0.000019           0        53           rt_sigaction
  0.01    0.000014           0       125           close
  0.01    0.000010           0        96           lseek
  0.00    0.000000           0        49        36 stat
  0.00    0.000000           0        55           fstat
  0.00    0.000000           0        65           mprotect
  0.00    0.000000           0        16           brk
  0.00    0.000000           0        18           rt_sigprocmask
  0.00    0.000000           0         3           ioctl
  0.00    0.000000           0        21        14 access
  0.00    0.000000           0         2           pipe
  0.00    0.000000           0        16        14 shmget
  0.00    0.000000           0         2           shmat
  0.00    0.000000           0         1           shmctl
  0.00    0.000000           0         2           dup
  0.00    0.000000           0         9           socket
  0.00    0.000000           0         8         8 connect
  0.00    0.000000           0         1           bind
  0.00    0.000000           0         1           getsockname
  0.00    0.000000           0         1         1 getpeername
  0.00    0.000000           0         2           getsockopt
  0.00    0.000000           0         1           clone
  0.00    0.000000           0         2           execve
  0.00    0.000000           0         7           uname
  0.00    0.000000           0         2           semctl
  0.00    0.000000           0         2           shmdt
  0.00    0.000000           0        29           fcntl
  0.00    0.000000           0         6           getdents
  0.00    0.000000           0         5           getcwd
  0.00    0.000000           0         1           chdir
  0.00    0.000000           0         3           readlink
  0.00    0.000000           0         2           umask
  0.00    0.000000           0         9           getrlimit
  0.00    0.000000           0         7           getuid
  0.00    0.000000           0         2           geteuid
  0.00    0.000000           0         1           getegid
  0.00    0.000000           0         3           getppid
  0.00    0.000000           0         2         1 setsid
  0.00    0.000000           0         1           sigaltstack
  0.00    0.000000           0         3           statfs
  0.00    0.000000           0         2           arch_prctl
  0.00    0.000000           0         3           setrlimit
  0.00    0.000000           0         8         2 futex
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         2           io_setup
  0.00    0.000000           0         1           io_destroy
  0.00    0.000000           0         2           set_tid_address
  0.00    0.000000           0         1           semtimedop
  0.00    0.000000           0         2           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.162223                544221       189 total

real    0m22.412s
user    0m3.541s
sys     0m12.842s

# echo 1 >| /proc/sys/kernel/vsyscall64
$ time strace -f -c sqlplus system/xxxx @/tmp/a.sql
....
Elapsed: 00:00:10.66
.....
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 66.44    0.044495           1     60292           read
 14.52    0.009727           0    116045           getrusage
 12.88    0.008626           0     60117           write
  5.59    0.003745           0     60655           times
  0.21    0.000138           4        34           munmap
  0.10    0.000066          33         2           pipe
  0.10    0.000065          33         2           set_tid_address
  0.06    0.000040           0       225       113 open
  0.04    0.000025           1        21        14 access
  0.03    0.000018           0        96           lseek
  0.02    0.000015           0        65           mprotect
  0.02    0.000011           0       170           mmap
  0.00    0.000000           0       127           close
  0.00    0.000000           0        49        36 stat
  0.00    0.000000           0        55           fstat
  0.00    0.000000           0        16           brk
  0.00    0.000000           0        53           rt_sigaction
  0.00    0.000000           0        18           rt_sigprocmask
  0.00    0.000000           0         3           ioctl
  0.00    0.000000           0         6           pread
  0.00    0.000000           0        16        14 shmget
  0.00    0.000000           0         2           shmat
  0.00    0.000000           0         1           shmctl
  0.00    0.000000           0         2           dup
  0.00    0.000000           0         9           socket
  0.00    0.000000           0         8         8 connect
  0.00    0.000000           0         1           bind
  0.00    0.000000           0         1           getsockname
  0.00    0.000000           0         1         1 getpeername
  0.00    0.000000           0         2           getsockopt
  0.00    0.000000           0         1           clone
  0.00    0.000000           0         2           execve
  0.00    0.000000           0         7           uname
  0.00    0.000000           0         5           semctl
  0.00    0.000000           0         2           shmdt
  0.00    0.000000           0        31           fcntl
  0.00    0.000000           0         6           getdents
  0.00    0.000000           0         5           getcwd
  0.00    0.000000           0         1           chdir
  0.00    0.000000           0         3           readlink
  0.00    0.000000           0         2           umask
  0.00    0.000000           0         9           getrlimit
  0.00    0.000000           0         7           getuid
  0.00    0.000000           0         2           geteuid
  0.00    0.000000           0         1           getegid
  0.00    0.000000           0         3           getppid
  0.00    0.000000           0         2         1 setsid
  0.00    0.000000           0         1           sigaltstack
  0.00    0.000000           0         5           statfs
  0.00    0.000000           0         2           arch_prctl
  0.00    0.000000           0         3           setrlimit
  0.00    0.000000           0         8         2 futex
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         2           io_setup
  0.00    0.000000           0         1           io_destroy
  0.00    0.000000           0         2           semtimedop
  0.00    0.000000           0         2           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.066971                298210       189 total

real    0m11.241s
user    0m2.249s
sys     0m6.253s

# echo 2 >| /proc/sys/kernel/vsyscall64
$ time strace -f -c sqlplus system/xxxx @/tmp/a.sql
....
Elapsed: 00:00:12.85

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 69.65    0.069590           1     60292           read
 13.24    0.013234           0    111684           getrusage
 11.39    0.011378           0     60117           write
  5.51    0.005502           0     60262           times
  0.14    0.000142           8        18           rt_sigprocmask
  0.03    0.000028           0        65           mprotect
  0.03    0.000027           0       223       113 open
  0.02    0.000019           1        34           munmap
  0.00    0.000000           0       125           close
  0.00    0.000000           0        49        36 stat
  0.00    0.000000           0        55           fstat
  0.00    0.000000           0        96           lseek
  0.00    0.000000           0       165           mmap
  0.00    0.000000           0        16           brk
  0.00    0.000000           0        53           rt_sigaction
  0.00    0.000000           0         3           ioctl
  0.00    0.000000           0        21        14 access
  0.00    0.000000           0         2           pipe
  0.00    0.000000           0        16        14 shmget
  0.00    0.000000           0         2           shmat
  0.00    0.000000           0         1           shmctl
  0.00    0.000000           0         2           dup
  0.00    0.000000           0         9           socket
  0.00    0.000000           0         8         8 connect
  0.00    0.000000           0         1           bind
  0.00    0.000000           0         1           getsockname
  0.00    0.000000           0         1         1 getpeername
  0.00    0.000000           0         2           getsockopt
  0.00    0.000000           0         1           clone
  0.00    0.000000           0         2           execve
  0.00    0.000000           0         7           uname
  0.00    0.000000           0         2           semctl
  0.00    0.000000           0         2           shmdt
  0.00    0.000000           0        29           fcntl
  0.00    0.000000           0         6           getdents
  0.00    0.000000           0         5           getcwd
  0.00    0.000000           0         1           chdir
  0.00    0.000000           0         3           readlink
  0.00    0.000000           0         2           umask
  0.00    0.000000           0         9           getrlimit
  0.00    0.000000           0         7           getuid
  0.00    0.000000           0         2           geteuid
  0.00    0.000000           0         1           getegid
  0.00    0.000000           0         3           getppid
  0.00    0.000000           0         2         1 setsid
  0.00    0.000000           0         1           sigaltstack
  0.00    0.000000           0         3           statfs
  0.00    0.000000           0         2           arch_prctl
  0.00    0.000000           0         3           setrlimit
  0.00    0.000000           0         8         2 futex
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         2           io_setup
  0.00    0.000000           0         1           io_destroy
  0.00    0.000000           0         2           set_tid_address
  0.00    0.000000           0         2           semtimedop
  0.00    0.000000           0         2           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.099920                293434       189 total

real    0m13.330s
user    0m2.442s
sys     0m7.564s

-- gettimeofday shows up only when /proc/sys/kernel/vsyscall64=0, and that run took 22 seconds.
-- Clearly, setting /proc/sys/kernel/vsyscall64=1 (which is also the default) gives the best result.
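
To put the vsyscall64=0 run in proportion: strace counted 251228 gettimeofday calls out of 544221 calls in total. A minimal sketch of that calculation follows; share_percent is a made-up helper name.

```c
/* Percentage that `part` represents of `total`; 0 if total is not positive. */
double share_percent(long part, long total)
{
    if (total <= 0)
        return 0.0;
    return 100.0 * (double)part / (double)total;
}
```

share_percent(251228, 544221) comes out at a little over 46, so nearly half of the traced calls in that run were gettimeofday. That fits with the drop in both call count and elapsed time once the call stops going through the kernel.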

From the ITPUB blog, link: http://blog.itpub.net/267265/viewspace-765611/. If you repost this, please credit the source; otherwise legal action will be pursued.
