\input texinfo @c -*- texinfo -*-
+@c %**start of header
+@setfilename qemu-tech.info
+@settitle QEMU Internals
+@exampleindent 0
+@paragraphindent 0
+@c %**end of header
@iftex
-@settitle QEMU Internals
@titlepage
@sp 7
@center @titlefont{QEMU Internals}
@end titlepage
@end iftex
+@ifnottex
+@node Top
+@top
+
+@menu
+* Introduction::
+* QEMU Internals::
+* Regression Tests::
+* Index::
+@end menu
+@end ifnottex
+
+@contents
+
+@node Introduction
@chapter Introduction
+@menu
+* intro_features:: Features
+* intro_x86_emulation:: x86 emulation
+* intro_arm_emulation:: ARM emulation
+* intro_mips_emulation:: MIPS emulation
+* intro_ppc_emulation:: PowerPC emulation
+* intro_sparc_emulation:: SPARC emulation
+@end menu
+
+@node intro_features
@section Features
QEMU is a FAST! processor emulator using a portable dynamic
@itemize @minus
-@item
+@item
Full system emulation. In this mode, QEMU emulates a full system
-(usually a PC), including a processor and various peripherials. It can
+(usually a PC), including a processor and various peripherals. It can
be used to launch an different Operating System without rebooting the
PC or to debug system code.
-@item
+@item
User mode emulation (Linux host only). In this mode, QEMU can launch
Linux processes compiled for one CPU on another CPU. It can be used to
launch the Wine Windows API emulator (@url{http://www.winehq.org}) or
QEMU generic features:
-@itemize
+@itemize
@item User space only or full system emulation.
-@item Using dynamic translation to native code for reasonnable speed.
+@item Using dynamic translation to native code for reasonable speed.
@item Working on x86 and PowerPC hosts. Being tested on ARM, Sparc32, Alpha and S390.
@item Precise exceptions support.
-@item The virtual CPU is a library (@code{libqemu}) which can be used
+@item The virtual CPU is a library (@code{libqemu}) which can be used
in other projects (look at @file{qemu/tests/qruncom.c} to have an
example of user mode @code{libqemu} usage).
@end itemize
QEMU user mode emulation features:
-@itemize
+@itemize
@item Generic Linux system call converter, including most ioctls.
@item clone() emulation using native CPU clone() to use Linux scheduler for threads.
-@item Accurate signal handling by remapping host signals to target signals.
-@end itemize
+@item Accurate signal handling by remapping host signals to target signals.
@end itemize
QEMU full system emulation features:
-@itemize
+@itemize
@item QEMU can either use a full software MMU for maximum portability or use the host system call mmap() to simulate the target MMU.
@end itemize
+@node intro_x86_emulation
@section x86 emulation
QEMU x86 target features:
-@itemize
+@itemize
-@item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation.
+@item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation.
LDT/GDT and IDT are emulated. VM86 mode is also supported to run DOSEMU.
@item Support of host page sizes bigger than 4KB in user mode emulation.
@item QEMU can emulate itself on x86.
-@item An extensive Linux x86 CPU test program is included @file{tests/test-i386}.
+@item An extensive Linux x86 CPU test program is included @file{tests/test-i386}.
It can be used to test other x86 virtual CPUs.
@end itemize
Current QEMU limitations:
-@itemize
+@itemize
@item No SSE/MMX support (yet).
@item IPC syscalls are missing.
-@item The x86 segment limits and access rights are not tested at every
+@item The x86 segment limits and access rights are not tested at every
memory access (yet). Hopefully, very few OSes seem to rely on that for
normal use.
-@item On non x86 host CPUs, @code{double}s are used instead of the non standard
+@item On non x86 host CPUs, @code{double}s are used instead of the non standard
10 byte @code{long double}s of x86 for floating point emulation to get
maximum performances.
@end itemize
+@node intro_arm_emulation
@section ARM emulation
@itemize
@end itemize
+@node intro_mips_emulation
+@section MIPS emulation
+
+@itemize
+
+@item The system emulation allows full MIPS32/MIPS64 Release 2 emulation,
+including privileged instructions, FPU and MMU, in both little and big
+endian modes.
+
+@item The Linux userland emulation can run many 32 bit MIPS Linux binaries.
+
+@end itemize
+
+Current QEMU limitations:
+
+@itemize
+
+@item Self-modifying code is not always handled correctly.
+
+@item 64 bit userland emulation is not implemented.
+
+@item The system emulation is not complete enough to run real firmware.
+
+@item The watchpoint debug facility is not implemented.
+
+@end itemize
+
+@node intro_ppc_emulation
@section PowerPC emulation
@itemize
-@item Full PowerPC 32 bit emulation, including priviledged instructions,
+@item Full PowerPC 32 bit emulation, including privileged instructions,
FPU and MMU.
@item Can run most PowerPC Linux binaries.
@end itemize
+@node intro_sparc_emulation
@section SPARC emulation
@itemize
-@item SPARC V8 user support, except FPU instructions.
+@item Full SPARC V8 emulation, including privileged
+instructions, FPU and MMU. SPARC V9 emulation includes most privileged
+instructions, FPU and I/D MMU, but misses most VIS instructions.
+
+@item Can run most 32-bit SPARC Linux binaries and some handcrafted 64-bit SPARC Linux binaries.
+
+@end itemize
+
+Current QEMU limitations:
+
+@itemize
+
+@item IPC syscalls are missing.
+
+@item 128-bit floating point operations are not supported, though none of the
+real CPUs implement them either. FCMPE[SD] are not correctly
+implemented. Floating point exception support is untested.
-@item Can run some SPARC Linux binaries.
+@item Alignment is not enforced at all.
+
+@item Atomic instructions are not correctly implemented.
+
+@item Sparc64 emulators are not usable for anything yet.
@end itemize
+@node QEMU Internals
@chapter QEMU Internals
+@menu
+* QEMU compared to other emulators::
+* Portable dynamic translation::
+* Register allocation::
+* Condition code optimisations::
+* CPU state optimisations::
+* Translation cache::
+* Direct block chaining::
+* Self-modifying code and translated code invalidation::
+* Exception support::
+* MMU emulation::
+* Hardware interrupts::
+* User emulation specific details::
+* Bibliography::
+@end menu
+
+@node QEMU compared to other emulators
@section QEMU compared to other emulators
Like bochs [3], QEMU emulates an x86 CPU. But QEMU is much faster than
TWIN [6] is a Windows API emulator like Wine. It is less accurate than
Wine but includes a protected mode x86 interpreter to launch x86 Windows
-executables. Such an approach as greater potential because most of the
+executables. Such an approach has greater potential because most of the
Windows API is executed natively but it is far more difficult to develop
because all the data structures and function parameters exchanged
between the API and the x86 code must be converted.
and potentially unsafe host drivers. Moreover, they are unable to
provide cycle exact simulation as an emulator can.
+@node Portable dynamic translation
@section Portable dynamic translation
QEMU is a dynamic translator. When it first encounters a piece of code,
instructions to build a function (see @file{op.h:dyngen_code()}).
In essence, the process is similar to [1], but more work is done at
-compile time.
+compile time.
A key idea to get optimal performances is that constant parameters can
be passed to the simple operations. For that purpose, dummy ELF
To go even faster, GCC static register variables are used to keep the
state of the virtual CPU.
+@node Register allocation
@section Register allocation
Since QEMU uses fixed simple instructions, no efficient register
register, most of the virtual CPU state can be put in registers without
doing complicated register allocation.
+@node Condition code optimisations
@section Condition code optimisations
Good CPU condition codes emulation (@code{EFLAGS} register on x86) is a
the condition codes are not needed by the next instructions, no
condition codes are computed at all.
+@node CPU state optimisations
@section CPU state optimisations
The x86 CPU has many internal states which change the way it evaluates
[The FPU stack pointer register is not handled that way yet].
+@node Translation cache
@section Translation cache
A 16 MByte cache holds the most recently used translations. For
terminated by a jump or by a virtual CPU state change which the
translator cannot deduce statically).
+@node Direct block chaining
@section Direct block chaining
After each translated basic block is executed, QEMU uses the simulated
architectures (such as x86 or PowerPC), the @code{JUMP} opcode is
directly patched so that the block chaining has no overhead.
+@node Self-modifying code and translated code invalidation
@section Self-modifying code and translated code invalidation
Self-modifying code is a special challenge in x86 emulation because no
Correct translated code invalidation is done efficiently by maintaining
a linked list of every translated block contained in a given page. Other
-linked lists are also maintained to undo direct block chaining.
+linked lists are also maintained to undo direct block chaining.
Although the overhead of doing @code{mprotect()} calls is important,
most MSDOS programs can be emulated at reasonnable speed with QEMU and
really needs to be invalidated. It avoids invalidating the code when
only data is modified in the page.
+@node Exception support
@section Exception support
longjmp() is used when an exception such as division by zero is
-encountered.
+encountered.
The host SIGSEGV and SIGBUS signal handlers are used to get invalid
memory accesses. The exact CPU state can be retrieved because all the
optimisations. It is not a big concern because the emulated code can
still be restarted in any cases.
+@node MMU emulation
@section MMU emulation
For system emulation, QEMU uses the mmap() system call to emulate the
In order to avoid flushing the translated code each time the MMU
mappings change, QEMU uses a physically indexed translation cache. It
-means that each basic block is indexed with its physical address.
+means that each basic block is indexed with its physical address.
When MMU mappings change, only the chaining of the basic blocks is
reset (i.e. a basic block can no longer jump directly to another one).
+@node Hardware interrupts
@section Hardware interrupts
In order to be faster, QEMU does not check at every basic block if an
of the CPU emulator. Then the main loop can test if the interrupt is
pending and handle it.
+@node User emulation specific details
@section User emulation specific details
@subsection Linux system call translation
shared object as the ld-linux.so ELF interpreter. That way, it can be
relocated at load time.
+@node Bibliography
@section Bibliography
@table @asis
-@item [1]
+@item [1]
@url{http://citeseer.nj.nec.com/piumarta98optimizing.html}, Optimizing
direct threaded code by selective inlining (1998) by Ian Piumarta, Fabio
Riccardi.
x86 emulator on Alpha-Linux.
@item [5]
-@url{http://www.usenix.org/publications/library/proceedings/usenix-nt97/full_papers/chernoff/chernoff.pdf},
+@url{http://www.usenix.org/publications/library/proceedings/usenix-nt97/@/full_papers/chernoff/chernoff.pdf},
DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT, by Anton
Chernoff and Ray Hookway.
Willows Software.
@item [7]
-@url{http://user-mode-linux.sourceforge.net/},
+@url{http://user-mode-linux.sourceforge.net/},
The User-mode Linux Kernel.
@item [8]
-@url{http://www.plex86.org/},
+@url{http://www.plex86.org/},
The new Plex86 project.
@item [9]
-@url{http://www.vmware.com/},
+@url{http://www.vmware.com/},
The VMWare PC virtualizer.
@item [10]
-@url{http://www.microsoft.com/windowsxp/virtualpc/},
+@url{http://www.microsoft.com/windowsxp/virtualpc/},
The VirtualPC PC virtualizer.
@item [11]
-@url{http://www.twoostwo.org/},
+@url{http://www.twoostwo.org/},
The TwoOStwo PC virtualizer.
@end table
+@node Regression Tests
@chapter Regression Tests
In the directory @file{tests/}, various interesting testing programs
-are available. There are used for regression testing.
+are available. They are used for regression testing.
+
+@menu
+* test-i386::
+* linux-test::
+* qruncom.c::
+@end menu
+@node test-i386
@section @file{test-i386}
This program executes most of the 16 bit and 32 bit x86 instructions and
Various exceptions are raised to test most of the x86 user space
exception reporting.
+@node linux-test
@section @file{linux-test}
This program tests various Linux system calls. It is used to verify
that the system call parameters are correctly converted between target
and host CPUs.
+@node qruncom.c
@section @file{qruncom.c}
Example of usage of @code{libqemu} to emulate a user mode i386 CPU.
+
+@node Index
+@chapter Index
+@printindex cp
+
+@bye