summaryrefslogtreecommitdiff
path: root/docs/dexopt.html
diff options
context:
space:
mode:
Diffstat (limited to 'docs/dexopt.html')
-rw-r--r--docs/dexopt.html326
1 files changed, 326 insertions, 0 deletions
diff --git a/docs/dexopt.html b/docs/dexopt.html
new file mode 100644
index 0000000..7f0b4bc
--- /dev/null
+++ b/docs/dexopt.html
@@ -0,0 +1,326 @@
+<html>
+<head>
+ <title>Dalvik Optimization and Verification</title>
+</head>
+
+<body>
+<h1>Dalvik Optimization and Verification With <i>dexopt</i></h1>
+
+<p>
+The Dalvik virtual machine was designed specifically for the Android
+mobile platform. The target systems have little RAM, store data on slow
+internal flash memory, and generally have the performance characteristics
+of decade-old desktop systems. They also run Linux, which provides
+virtual memory, processes and threads, and UID-based security mechanisms.
+<p>
+The features and limitations caused us to focus on certain goals:
+
+<ul>
+ <li>Class data, notably bytecode, must be shared between multiple
+ processes to minimize total system memory usage.
+ <li>The overhead in launching a new app must be minimized to keep
+ the device responsive.
+ <li>Storing class data in individual files results in a lot of
+ redundancy, especially with respect to strings. To conserve disk
+ space we need to factor this out.
+ <li>Parsing class data fields adds unnecessary overhead during
+ class loading. Accessing data values (e.g. integers and strings)
+ directly as C types is better.
+ <li>Bytecode verification is necessary, but slow, so we want to verify
+ as much as possible outside app execution.
+ <li>Bytecode optimization (quickened instructions, method pruning) is
+ important for speed and battery life.
+ <li>For security reasons, processes may not edit shared code.
+</ul>
+
+<p>
+The typical VM implementation uncompresses individual classes from a
+compressed archive and stores them on the heap. This implies a separate
+copy of each class in every process, and slows application startup because
+the code must be uncompressed (or at least read off disk in many small
+pieces). On the other hand, having the bytecode on the local heap makes
+it easy to rewrite instructions on first use, facilitating a number of
+different optimizations.
+<p>
+The goals led us to make some fundamental decisions:
+
+<ul>
+ <li>Multiple classes are aggregated into a single "DEX" file.
+ <li>DEX files are mapped read-only and shared between processes.
+ <li>Byte ordering and word alignment are adjusted to suit the local
+ system.
+ <li>Bytecode verification is mandatory for all classes, but we want
+ to "pre-verify" whatever we can.
+ <li>Optimizations that require rewriting bytecode must be done ahead
+ of time.
+</ul>
+
+<p>
+The consequences of these decisions are explained in the following sections.
+
+
+<h2>VM Operation</h2>
+
+<p>
+Application code is delivered to the system in a <code>.jar</code>
+or <code>.apk</code> file. These are really just <code>.zip</code>
+archives with some meta-data files added. The Dalvik DEX data file
+is always called <code>classes.dex</code>.
+<p>
+The bytecode cannot be memory-mapped and executed directly from the zip
+file, because the data is compressed and the start of the file is not
+guaranteed to be word-aligned. These problems could be addressed by
+storing <code>classes.dex</code> without compression and padding out the zip
+file, but that would increase the size of the package sent across the
+data network.
+<p>
+We need to extract <code>classes.dex</code> from the zip archive before
+we can use it. While we have the file available, we might as well perform
+some of the other actions (realignment, optimization, verification) described
+earlier. This raises a new question however: who is responsible for doing
+this, and where do we keep the output?
+
+<h3>Preparation</h3>
+
+<p>
+There are at least three different ways to create a "prepared" DEX file,
+sometimes known as "ODEX" (for Optimized DEX):
+<ol>
+ <li>The VM does it "just in time". The output goes into a special
+ <code>dalvik-cache</code> directory. This works on the desktop and
+ engineering-only device builds where the permissions on the
+ <code>dalvik-cache</code> directory are not restricted. On production
+ devices, this is not allowed.
+ <li>The system installer does it when an application is first added.
+ It has the privileges required to write to <code>dalvik-cache</code>.
+ <li>The build system does it ahead of time. The relevant <code>jar</code>
+ / <code>apk</code> files are present, but the <code>classes.dex</code>
+ is stripped out. The optimized DEX is stored next to the original
+ zip archive, not in <code>dalvik-cache</code>, and is part of the
+ system image.
+</ol>
+<p>
+The <code>dalvik-cache</code> directory is more accurately
+<code>$ANDROID_DATA/data/dalvik-cache</code>. The files inside it have
+names derived from the full path of the source DEX. On the device the
+directory is owned by <code>system</code> / <code>system</code>
+and has 0771 permissions, and the optimized DEX files stored there are
+owned by <code>system</code> and the
+application's group, with 0644 permissions. DRM-locked applications will
+use 640 permissions to prevent other user applications from examining them.
+The bottom line is that you can read your own DEX file and those of most
+other applications, but you cannot create, modify, or remove them.
+<p>
+Preparation of the DEX file for the "just in time" and "system installer"
+approaches proceeds in three steps:
+<p>
+First, the dalvik-cache file is created. This must be done in a process
+with appropriate privileges, so for the "system installer" case this is
+done within <code>installd</code>, which runs as root.
+<p>
+Second, the <code>classes.dex</code> entry is extracted from the the zip
+archive. A small amount of space is left at the start of the file for
+the ODEX header.
+<p>
+Third, the file is memory-mapped for easy access and tweaked for use on
+the current system. This includes byte-swapping and structure realigning,
+but no meaningful changes to the DEX file. We also do some basic
+structure checks, such as ensuring that file offsets and data indices
+fall within valid ranges.
+<p>
+The build system uses a hairy process that involves starting the
+emulator, forcing just-in-time optimization of all relevant DEX files,
+and then extracting the results from <code>dalvik-cache</code>. The
+reasons for doing this, rather than using a tool that runs on the desktop,
+will become more apparent when the optimizations are explained.
+<p>
+Once the code is byte-swapped and aligned, we're ready to go. We append
+some pre-computed data, fill in the ODEX header at the start of the file,
+and start executing. (The header is filled in last, so that we don't
+try to use a partial file.) If we're interested in verification and
+optimization, however, we need to insert a step after the initial prep.
+
+<h3>dexopt</h3>
+
+<p>
+We want to verify and optimize all of the classes in the DEX file. The
+easiest and safest way to do this is to load all of the classes into
+the VM and run through them. Anything that fails to load is simply not
+verified or optimized. Unfortunately, this can cause allocation of some
+resources that are difficult to release (e.g. loading of native shared
+libraries), so we don't want to do it in the same virtual machine that
+we're running applications in.
+<p>
+The solution is to invoke a program called <code>dexopt</code>, which
+is really just a back door into the VM. It performs an abbreviated VM
+initialization, loads zero or more DEX files from the bootstrap class
+path, and then sets about verifying and optimizing whatever it can from
+the target DEX. On completion, the process exits, freeing all resources.
+<p>
+It is possible for multiple VMs to want the same DEX file at the same
+time. File locking is used to ensure that dexopt is only run once.
+
+
+<h2>Verification</h2>
+
+<p>
+The bytecode verification process involves scanning through the instructions
+in every method in every class in a DEX file. The goal is to identify
+illegal instruction sequences so that we don't have to check for them at
+run time. Many of the computations involved are also necessary for "exact"
+garbage collection. See
+<a href="verifier.html">Dalvik Bytecode Verifier Notes</a> for more
+information.
+<p>
+For performance reasons, the optimizer (described in the next section)
+assumes that the verifier has run successfully, and makes some potentially
+unsafe assumptions. By default, Dalvik insists upon verifying all classes,
+and only optimizes classes that have been verified. If you want to
+disable the verifier, you can use command-line flags to do so. See also
+<a href="embedded-vm-control.html"> Controlling the Embedded VM</a>
+for instructions on controlling these
+features within the Android application framework.
+<p>
+Reporting of verification failures is a tricky issue. For example,
+calling a package-scope method on a class in a different package is
+illegal and will be caught by the verifier. We don't necessarily want
+to report it during verification though -- we actually want to throw
+an exception when the method call is attempted. Checking the access
+flags on every method call is expensive though. The
+<a href="verifier.html">Dalvik Bytecode Verifier Notes</a> document
+addresses this issue.
+<p>
+Classes that have been verified successfully have a flag set in the ODEX.
+They will not be re-verified when loaded. The Linux access permissions
+are expected to prevent tampering; if you can get around those, installing
+faulty bytecode is far from the easiest line of attack. The ODEX file has
+a 32-bit checksum, but that's chiefly present as a quick check for
+corrupted data.
+
+
+<h2>Optimization</h2>
+
+<p>
+Virtual machine interpreters typically perform certain optimizations the
+first time a piece of code is used. Constant pool references are replaced
+with pointers to internal data structures, operations that always succeed
+or always work a certain way are replaced with simpler forms. Some of
+these require information only available at runtime, others can be inferred
+statically when certain assumptions are made.
+<p>
+The Dalvik optimizer does the following:
+<ul>
+ <li>For virtual method calls, replace the method index with a
+ vtable index.
+ <li>For instance field get/put, replace the field index with
+ a byte offset. Also, merge the boolean / byte / char / short
+ variants into a single 32-bit form (less code in the interpreter
+ means more room in the CPU I-cache).
+ <li>Replace a handful of high-volume calls, like String.length(),
+ with "inline" replacements. This skips the usual method call
+ overhead, directly switching from the interpreter to a native
+ implementation.
+ <li>Prune empty methods. The simplest example is
+ <code>Object.&lt;init&gt;</code>, which does nothing, but must be
+ called whenever any object is allocated. The instruction is
+ replaced with a new version that acts as a no-op unless a debugger
+ is attached.
+ <li>Append pre-computed data. For example, the VM wants to have a
+ hash table for lookups on class name. Instead of computing this
+ when the DEX file is loaded, we can compute it now, saving heap
+ space and computation time in every VM where the DEX is loaded.
+</ul>
+
+<p>
+All of the instruction modifications involve replacing the opcode with
+one not defined by the Dalvik specification. This allows us to freely
+mix optimized and unoptimized instructions. The set of optimized
+instructions, and their exact representation, is tied closely to the VM
+version.
+<p>
+Most of the optimizations are obvious "wins". The use of raw indices
+and offsets not only allows us to execute more quickly, we can also
+skip the initial symbolic resolution. Pre-computation eats up
+disk space, and so must be done in moderation.
+<p>
+There are a couple of potential sources of trouble with these
+optimizations. First, vtable indices and byte offsets are subject to
+change if the VM is updated. Second, if a superclass is in a different
+DEX, and that other DEX is updated, we need to ensure that our optimized
+indices and offsets are updated as well. A similar but more subtle
+problem emerges when user-defined class loaders are employed: the class
+we actually call may not be the one we expected to call.
+<p>These problems are addressed with dependency lists and some limitations
+on what can be optimized.
+
+
+<h2>Dependencies and Limitations</h2>
+
+<p>
+The optimized DEX file includes a list of dependencies on other DEX files,
+plus the CRC-32 and modification date from the originating
+<code>classes.dex</code> zip file entry. The dependency list includes the
+full path to the <code>dalvik-cache</code> file, and the file's SHA-1
+signature. The timestamps of files on the device are unreliable and
+not used. The dependency area also includes the VM version number.
+<p>
+An optimized DEX is dependent upon all of the DEX files in the bootstrap
+class path. DEX files that are part of the bootstrap class path depend
+upon the DEX files that appeared earlier. To ensure that nothing outside
+the dependent DEX files is available, <code>dexopt</code> only loads the
+bootstrap classes. References to classes in other DEX files fail, which
+causes class loading and/or verification to fail, and classes with
+external dependencies are simply not optimized.
+<p>
+This means that splitting code out into many separate DEX files has a
+disadvantage: virtual method calls and instance field lookups between
+non-boot DEX files can't be optimized. Because verification is pass/fail
+with class granularity, no method in a class that has any reliance on
+classes in external DEX files can be optimized. This may be a bit
+heavy-handed, but it's the only way to guarantee that nothing breaks
+when individual pieces are updated.
+<p>
+Another negative consequence: any change to a bootstrap DEX will result
+in rejection of all optimized DEX files. This makes it hard to keep
+system updates small.
+<p>
+Despite our caution, there is still a possibility that a class in a DEX
+file loaded by a user-defined class loader could ask for a bootstrap class
+(say, String) and be given a different class with the same name. If a
+class in the DEX file being processed has the same name as a class in the
+bootstrap DEX files, the class will be flagged as ambiguous and references
+to it will not be resolved during verification / optimization. The class
+linking code in the VM does additional checks to plug another hole;
+see the verbose description in the VM sources for details (vm/oo/Class.c).
+<p>
+If one of the dependencies is updated, we need to re-verify and
+re-optimize the DEX file. If we can do a just-in-time <code>dexopt</code>
+invocation, this is easy. If we have to rely on the installer daemon, or
+the DEX was shipped only in ODEX, then the VM has to reject the DEX.
+<p>
+The output of <code>dexopt</code> is byte-swapped and struct-aligned
+for the host, and contains indices and offsets that are highly VM-specific
+(both version-wise and platform-wise). For this reason it's tricky to
+write a version of <code>dexopt</code> that runs on the desktop but
+generates output suitable for a particular device. The safest way to
+invoke it is on the target device, or on an emulator for that device.
+
+
+<h2>Generated DEX</h2>
+
+<p>
+Some languages and frameworks rely on the ability to generate bytecode
+and execute it. The rather heavy <code>dexopt</code> verification and
+optimization model doesn't work well with that.
+<p>
+We intend to support this in a future release, but the exact method is
+to be determined. We may allow individual classes to be added or whole
+DEX files; may allow Java bytecode or Dalvik bytecode in instructions;
+may perform the usual set of optimizations, or use a separate interpreter
+that performs on-first-use optimizations directly on the bytecode (which
+won't be mapped read-only, since it's locally defined).
+
+<address>Copyright &copy; 2008 The Android Open Source Project</address>
+
+</body>
+</html>