1.1     Introduction


The Crystallographic Concept Library (CCL) and the Crystallographic Protocol Library (CPL) are being designed and implemented to achieve industrial strength in crystallographic computing.  One of the goals of the current structural genomics efforts is to find a path to the industrialization of macromolecular crystallography.  CCL and CPL are designed to meet the demands of this industrialization process from two distinct but complementary perspectives.


CCL contains a collection of computer code that reflects the fundamental concepts in crystallography, such as space group, unit cell, and structure factor.  Its sub-library Math Applications in Crystallography (MAC) collects the commonly used mathematical procedures in crystallographic computing, e.g., matrix and vector operations, FFT, and least-squares model fitting.  Most of this code is written in C++, with some well-tested FORTRAN code inherited from older programs.  I try to isolate the recurring code from general crystallographic application programs.  The parts that handle crystallographic concepts are summarized and engineered in an object-oriented fashion.  The other parts, which often involve pure mathematical procedures or algorithms, are isolated from the crystallographic topics.  Since CCL and MAC are not end-user applications, existing application programs in the field need to be redesigned, or at least modified, to take advantage of these libraries.  Thus CCL and MAC most frequently face the question 'Why reinvent the wheel?'  Reinventing the wheel is not the goal, but to achieve industrial strength in crystallographic computing, with unprecedented throughput and robustness, 'reengineering the wheel' is envisaged to be inevitable.


Nevertheless, over the decades this field has accumulated an abundance of computational tools that effectively implement the working methods.  In contrast to CCL, CPL tries to utilize these existing resources and integrate them behind a set of uniform application programming interfaces.  CPL offers a set of Python modules and classes for writing applications, graphical user interfaces, and databases.  CPL accommodates existing software packages, such as CNS, SOLVE, and CCP4, and makes them truly complementary by providing automatic trial runs, independent error analysis, data rejection/weighting, result comparison, and other intelligence in order to find the optimal protocol for each individual data set and its specific error content.  CPL hides the specific data formats, command scripts, and output logs of the underlying software.  From the application programmer's point of view, CPL is a set of high-level, automated, and intelligent protocols that perform complex crystallographic processes, such as data scaling, MAD phasing, and structure refinement.  The actual working engine and the corresponding logistic details are hidden from users unless requested.  CPL generates reports in several formats, including XML.  Finally, CCL and CPL are designed to bring strength at different levels.  They complement each other when solving a specific problem.  State-of-the-art programming techniques make it feasible to integrate CCL and CPL.  Both libraries emphasize extensibility and portability for the constantly evolving fields of structural biology and structural genomics.


From an even broader perspective, a variety of new approaches are being proposed and actively practiced in crystallographic computing with the demands of structural genomics in mind.  More sophisticated, robust, and sometimes inevitably complex algorithms are being introduced to users.  On the other hand, straightforward front ends and a high level of automation are expected.  CCL and CPL attempt to bring fresh thinking to this new wave of advancement in crystallographic computing.


1.2     A Tour of This Book


1.3     Conventions Used


1.3.1 Notations


s, S        scalars
f, F        scalar functions
c, C        complexes; c = a + ib
c = |c|     amplitude of a complex
cc, Cc      complex conjugates; if c = a + ib, cc = a - ib
v, V        vectors; v = (a, b, c)
            vectors;  = (-a, -b, -c) if v = (a, b, c)
m, M        matrices
p, P        geometric points
(C, r)      circle with center C and radius r
            distance from point O to point P




Table Notations used.
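The notations above can be illustrated with a short Python sketch.  It is not part of CCL or CPL; it uses only Python's built-in complex type and the standard math module (math.dist assumes Python 3.8 or later), and the helper on_circle is a hypothetical name introduced here for illustration.

```python
import math

# A complex c = a + ib, using Python's built-in complex type.
c = complex(3.0, 4.0)
amplitude = abs(c)             # amplitude of a complex, |c|
conjugate = c.conjugate()      # complex conjugate, a - ib

# A vector v = (a, b, c) and the vector (-a, -b, -c).
v = (1.0, -2.0, 0.5)
neg_v = tuple(-x for x in v)

# Distance from point O to point P.
O = (0.0, 0.0, 0.0)
P = (1.0, 2.0, 2.0)
distance_OP = math.dist(O, P)


def on_circle(point, center, radius, tol=1e-9):
    """Return True if a point lies on the circle (C, r), within a tolerance.

    Hypothetical helper for illustration only.
    """
    return abs(math.dist(point, center) - radius) <= tol
```

The tolerance in on_circle is a design choice: floating-point distances rarely compare exactly equal to a radius, so an exact equality test would be fragile.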


1.3.2 Type abbreviation in function or class identifiers


Type T

std::complex<long double>
long double
long int
short int
long unsigned int
unsigned int
short unsigned int

Table Type T in C++ and TYP in Python function or class identifiers.


1.4     Related Software


CCL and CPL rely on several other software packages to be built and executed.  This section describes where to obtain these packages and how to install them.


1.4.1 GCC


CCL and CPL are compiled with GCC, the GNU Compiler Collection.  The current release has not yet been tested with other compilers.


First, download gcc-3.2.tar.gz to a local hard drive.  Second, uncompress and untar the file with the command


tar xzvf [path/]gcc-3.2.tar.gz


in the directory where GCC will be installed (/usr/local is usually recommended); the bracketed [path/] is optional.  On some systems, two separate commands, gunzip and tar, may be needed.  The GCC top directory, e.g., /usr/local/gcc-3.2, will be created.  In the top directory, type the command


./configure [--prefix=`pwd`]



If the option is used, GCC will be installed in its own top directory; otherwise, it will be installed in /usr/local.  In most cases the latter is recommended, but if multiple versions of GCC are needed, installing each in its own top directory keeps them separate.  If configure succeeds, type make in the top directory to build the package.  This step will take a while.  Then type make install to install the files.
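The unpack, configure, make, and make install sequence described above recurs for several packages in this section.  As a minimal sketch, the following Python function only assembles the command lines in order; it does not execute them.  The function name source_build_commands is hypothetical and is not part of CCL or CPL, and the sketch assumes the archive unpacks into a directory named after the archive.

```python
import os


def source_build_commands(tarball, prefix=None):
    """Return, in order, the shell commands for a source installation.

    Hypothetical helper: it builds the command strings only and runs nothing.
    """
    name = os.path.basename(tarball)
    for suffix in (".tar.gz", ".tgz"):
        if name.endswith(suffix):
            # Assumption: the top directory is the archive name without suffix.
            top_dir = name[: -len(suffix)]
            break
    else:
        raise ValueError("expected a .tar.gz or .tgz archive: " + tarball)

    configure = "./configure"
    if prefix:
        # Optional: install somewhere other than the default /usr/local.
        configure += " --prefix=" + prefix

    return ["tar xzvf " + tarball, "cd " + top_dir, configure, "make", "make install"]


for command in source_build_commands("gcc-3.2.tar.gz"):
    print(command)
```

Passing prefix="/usr/local/gcc-3.2" corresponds to the case where several GCC versions are kept side by side, each in its own top directory.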


1.4.2 Python


Only CPL depends on Python.  Download the Python distribution Python-2.2.2.tgz; the installation is exactly the same as that of GCC.  To replace the existing Python version on your system, use /usr in the prefix option.


1.4.3 PIL


The Python Imaging Library (PIL) adds image processing capability to Python.  Download the latest distribution of PIL, Imaging-1.1.4a2.tar.gz.  PIL must be installed after GCC and Python.  Change your working directory to where Python is installed, e.g., /usr/local/Python-2.2.2.  Make a directory Extensions, if there is not one yet.  In the directory Extensions, unpack the PIL distribution using the command:


tar xzvf [path/]Imaging-1.1.4a2.tar.gz


A new directory Imaging-1.1.4a2 will be created.  Move into Imaging-1.1.4a2/libImaging, and run the following configuration and make commands:





After these are done, move back to Imaging-1.1.4a2, and run:


python setup.py build

python setup.py install


Before the last command of installation, make sure the python command is indeed what you intend to use.  PIL installed by one python command will not be available for other Python releases on the same machine.


On some incomplete systems, setup may fail with an error because FreeType is missing.


1.4.4 Numeric/Numarray


The Numeric package provides multidimensional arrays.  This package is in transition to a new-generation package, Numarray.  Download the latest distribution, Numeric-23.0.tar.gz, and unpack it in a directory, say, /usr/local/Python-2.2.2/Extensions or simply /usr/local.  A new directory Numeric-23.0 will be created.  In this top-level directory, execute:


python setup.py install


Make sure that the python command belongs to the Python version into which Numeric is intended to be installed.  A package installed into one Python version will not be available to other versions on the same system.
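Because each extension is installed into one specific interpreter, it can help to ask that interpreter directly what it can see.  The following is a minimal sketch, assuming a modern Python 3 interpreter (importlib.util does not exist in the Python 2.2.2 release described here).  The function name report_modules is hypothetical, and the module names passed to it are assumptions about how the packages are imported.

```python
import importlib.util
import sys


def report_modules(names):
    """Map each top-level module name to True if this interpreter can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# sys.executable is the full path of the interpreter actually running.
print("interpreter:", sys.executable)

# Assumed import names for Numeric, PIL, and Pmw; adjust them as needed.
for name, found in report_modules(["Numeric", "Image", "Pmw"]).items():
    print(name, "found" if found else "missing")
```

Running the same script with each python command on the machine shows which interpreter a given package was installed into.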


1.4.5 Pmw


Pmw (Python Megawidgets) is used to create GUI components.  Pmw.1.1.tar.gz can be obtained from SourceForge.  Unpacking this file in a directory, say, /usr/local/Python-2.2.2/Extensions or simply /usr/local, creates a new directory Pmw.  The parent directory of Pmw should be added to the environment variable PYTHONPATH before Pmw is used.  An alternative is to make a symbolic link to Pmw in /usr/local/rri/pub.
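Adding a directory to PYTHONPATH has the same effect as placing it on sys.path when the interpreter starts.  As a minimal sketch of the idea, the following function does this at run time for the parent of a package directory.  The function name add_parent_to_path is hypothetical, and the path /usr/local/Pmw is only the example location mentioned above.

```python
import os
import sys


def add_parent_to_path(package_dir):
    """Put the parent directory of package_dir on sys.path and return it.

    Hypothetical helper; equivalent in effect to listing the parent
    directory in the PYTHONPATH environment variable.
    """
    parent = os.path.dirname(os.path.abspath(package_dir))
    if parent not in sys.path:
        # Insert at the front so this copy is found before any other.
        sys.path.insert(0, parent)
    return parent


parent = add_parent_to_path("/usr/local/Pmw")
print("searching for packages in:", parent)
```

Setting PYTHONPATH in the shell remains the more permanent choice, since it applies to every script without code changes.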


1.4.6 SWIG


SWIG (Simplified Wrapper and Interface Generator) is used to wrap CCL into several modules of CPL.  Several SWIG shared libraries are required at runtime, even if the user does not rebuild CPL.  Download swig-1.3.19.tar.gz; the installation of SWIG is identical to that of GCC.


1.4.7 FFTW


The Fastest Fourier Transform in the West (FFTW) package is used by CCL and, in turn, by CPL.  Its shared libraries are required at runtime.  Download fftw-2.1.3.tar.gz; the installation procedure is as described in 1.4.1 GCC.


1.4.8 TNT


The Template Numerical Toolkit (TNT) is a library that contains only header files.  It is required only when CCL and CPL are built.  First, download the TNT distribution, then type the command


unzip [path/]


in the directory where TNT will be installed.  /usr/local/include is recommended.  A subdirectory tnt will be created, which contains a set of header files.


1.4.9 LAPACK and BLAS


Linear Algebra PACKage (LAPACK) and Basic Linear Algebra Subprograms (BLAS) are low-level libraries used by CCL and CPL.  RPM packages are available.  First, download the following rpm files:








then use the command rpm -i [--prefix=path] package.rpm to install each package in the directory path.  /usr/local is recommended.  By default, packages are installed to /usr.


1.4.10 Gnuplot


Gnuplot is currently optional.  Future releases of our GUI programs may use it as a graphics module.  Download and unpack gnuplot-3.7.3.tar.gz and gnuplot-py-1.6.tar.gz in the directory /usr/local.  To install gnuplot, follow the procedure described in 1.4.1.  To install the Python interface, move into the new directory gnuplot-py-1.6 and run the command


python setup.py install


1.4.11 SOLVE




1.4.12 CNS




1.4.13 CCP4




1.4.14 SHELX




1.5 Installation and Execution