159 lines
5.2 KiB
Plaintext
159 lines
5.2 KiB
Plaintext
|
|
FreeType Optimization HOWTO
|
|
|
|
|
|
|
|
Introduction
|
|
============
|
|
|
|
This file describes several ways to improve the performance of the
|
|
FreeType engine on specific builds. Each `trick' has some
|
|
drawbacks, be it on code size or portability.
|
|
|
|
The performance improvement cannot be quantified here simply
|
|
because id depends significantly on platforms _and_ compilers.
|
|
|
|
|
|
|
|
I. Tweaking the configuration file
|
|
==================================
|
|
|
|
The FreeType configuration file is named `ft_conf.h' and contains
|
|
the definition of various macros that are used to configure the
|
|
engine at build time.
|
|
|
|
Apart from the Unix configuration file, which is generated when
|
|
calling the `configure' script from the template called
|
|
|
|
freetype/ft_conf.h.in ,
|
|
|
|
all configuration files are located in
|
|
|
|
freetype/lib/arch/<system>/ft_conf.h ,
|
|
|
|
where <system> stands for your platform. This release also
|
|
provides an `ansi' build, i.e., the directory `lib/arch/ansi' used
|
|
to compile with any ANSI-compliant compiler.
|
|
|
|
The configuration macros that relate to performance are described
|
|
next.
|
|
|
|
|
|
1. TT_CONFIG_OPTION_INTERPRETER_SWITCH
|
|
--------------------------------------
|
|
|
|
If set, this macro builds a bytecode interpreter which uses a
|
|
huge `switch' statement to parse the bytecode stream during
|
|
glyph hinting.
|
|
|
|
If unset, the interpreter uses a big jump table to call each
|
|
bytecode's routine.
|
|
|
|
This macro is *set* by default. However, it may be worthwile on
|
|
some platforms to unset it.
|
|
|
|
Note that this macro is ignored if
|
|
TT_CONFIG_OPTION_NO_INTERPRETER is set.
|
|
|
|
|
|
2. TT_CONFIG_OPTION_STATIC_INTERPRETER
|
|
--------------------------------------
|
|
|
|
If set, this macro builds a bytecode interpreter which uses a
|
|
static variable to store its state. On some processors, this
|
|
will produce code which is bigger but slightly faster.
|
|
|
|
Note that you should NOT DEFINE this macro when building a
|
|
thread-safe version of the engine.
|
|
|
|
This macro is *unset* by default.
|
|
|
|
|
|
3. TT_CONFIG_OPTION_STATIC_RASTER
|
|
---------------------------------
|
|
|
|
If set, this macro builds a scan-line converter which uses a
|
|
static variable to store its state. On some processors, though
|
|
depending on the compiler used, this will produce code which is
|
|
bigger but moderately faster.
|
|
|
|
Note that you should NOT DEFINE this macro when building a
|
|
thread-safe version of the engine.
|
|
|
|
This macro is *unset* by default. We do not recommend using it
|
|
except for extreme cases where a performance `edge' is needed.
|
|
|
|
|
|
|
|
II. Replacing some components with optimized versions
|
|
=====================================================
|
|
|
|
You can also, in order to improve performance, replace one or more
|
|
components from the original source files. Here are our
|
|
suggestions.
|
|
|
|
|
|
1. Use memory-mapped files whenever available
|
|
---------------------------------------------
|
|
|
|
Loading a glyph from a TrueType file needs many random seeks,
|
|
which take a lot of time when using disk-based files.
|
|
|
|
Whenever possible, use memory-mappings to improve load
|
|
performance dramatically. For an example, see the source file
|
|
|
|
freetype/lib/arch/unix/ttmmap.c
|
|
|
|
which uses Unix memory-mapped files.
|
|
|
|
|
|
2. Replace the computation routines in `ttcalc.c'
|
|
---------------------------------------------------
|
|
|
|
This file contains many computation routines that can easily be
|
|
replaced by inline-assembly, tailored for a specific processor
|
|
and/or compiler.
|
|
|
|
After heavy testing, we have found that these functions,
|
|
especially TT_MulDiv(), are the ones that are most extensively
|
|
used and called when loading glyphs from a font file.
|
|
|
|
We do not provide inline-assembly with this release, as we want
|
|
to emphasize the portability of our library. However, when
|
|
working on a specific project where the hardware is known to be
|
|
fixed (like on an embedded system), great performance gains
|
|
could be achieved by replacing these routines.
|
|
|
|
(By the way, the square root function is not optimal, but it is
|
|
very seldom called. However, its accuracy is _critical_.
|
|
Replacing it with a fast but inaccurate algorithm will ruin the
|
|
rendering of glyphs at small sizes.)
|
|
|
|
|
|
|
|
III. Measuring performance improvements
|
|
=======================================
|
|
|
|
Once you have chosen some improvements and rebuilt the library,
|
|
some quick ways to measure the `new' speed are:
|
|
|
|
- Run the test program `ftlint' on a directory containing many
|
|
TrueType fonts, and measure the time it takes. On Unix, you can
|
|
use the shell command `time' to do it like in
|
|
|
|
% time test/ftlint 10 /ttfonts/*.ttf
|
|
|
|
This will measure the performance improvement of the TrueType
|
|
interpreter.
|
|
|
|
- Run the test program `fttimer' on a font containing many complex
|
|
glyphs (the latest available versions of Times or Arial should
|
|
do it), probaby using anti-aliasing, as in:
|
|
|
|
% time test/fttimer -g /ttfonts/arial.ttf
|
|
|
|
Compare the results of several of these runs for each build.
|
|
|
|
|
|
--- end of OPTIMIZE ---
|