CHAPTER 6

Optimization

 

The previous chapter dealt with evaluating a function at any given location; often, however, one is interested only in the value of a function at its maximum or minimum, and not elsewhere.  This problem is called optimization.  This chapter discusses several optimization techniques frequently used in crystallographic computing.  When choosing an appropriate optimization method, the primary concerns are the objective function and its complexity over the search domain, also known as the energy landscape.  The topic of optimization is largely outside the scope of this book; CCL and CPL offer only a few basic optimization tools.

 

6.1 Locate an Element in a Monotonic Sequence

 

Given a monotonically increasing or decreasing sequence s of objects and another object x, one may need to find where x fits in the sequence.  For example, during a statistical calculation it is often necessary to find the appropriate resolution bin for a given reflection.

 

6.1.1 Template functions mac::locate and mac::hunt

 

template <class S, class T>

long locate(const S&, const T&, const long = -1);

template <class S, class T>

long   hunt(const S&, const T&, const long = -1);

These two functions share the same prototype.  S is a sequence type that must support operator [] and the size() function, and its elements must be comparable with type T.  Both functions return an index j such that s[j] <= x < s[j + 1] for an ascending sequence, or s[j] >= x > s[j + 1] for a descending one.  If j = -1 or j = s.size() - 1, the given object x is out of range toward the beginning or the end of the sequence, respectively.  Both functions accept an optional initial guess for where the given object may be located.  In the worst case, hunt is a factor of 2 slower than locate, but it can be a factor of log2(n) faster than locate when the desired point is close to the input guess.  If a valid initial guess is given to locate, it calls hunt; if no valid initial guess is given, hunt behaves the same as locate.
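The bisection strategy behind these functions can be sketched in a few lines of Python.  This standalone function is an illustration only, not the CCL implementation; it follows the return convention described above for both ascending and descending sequences.

```python
def locate(s, x):
    """Return j such that s[j] <= x < s[j+1] (ascending) or
    s[j] >= x > s[j+1] (descending).  Returns -1 or len(s) - 1
    when x is out of range at the beginning or end."""
    lo, hi = -1, len(s)              # virtual bracket around x
    ascending = s[-1] >= s[0]        # direction of the sequence
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if (x >= s[mid]) == ascending:
            lo = mid                 # x lies above s[mid]
        else:
            hi = mid                 # x lies below s[mid]
    return lo
```

hunt would differ only in its first phase: starting from the guess, it doubles an expanding bracket until x is contained, then bisects as above.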

 

template <class S, class T>

long locate(const S&, const long,

            const T&, const long = -1);

template <class S, class T>

long   hunt(const S&, const long,

            const T&, const long = -1);

These two functions behave the same as the previous ones, except that the sequence type S need not support the size() function; instead, the size of the sequence must be given by the second argument.

 

#include <cassert>

#include <cmath>

#include <vector>

#include <mac/locate.h>

 

int main()

{

   const size_t n = 20;

   std::vector<double> s(n);

   for (int i = 0; i < n; i++) s[i] = log(i + 1);

      // s is populated with an ascending sequence.

 

   for (double t = -1; t < 4; t += 0.1)

   {

      const long k = mac::locate(s, t);

      assert (0 <= k && k <  n - 1 && s[k] <= t && t < s[k + 1] ||

              0 >  k               && s[k + 1] > t              ||

                        k == n - 1 && s[k] <= t);

      assert (k == mac::hunt(s, t, long(n / 2)));

   }

 

   // Repeat the previous section, but specify the size of the vector.

   const long m = n / 2;

   for (double t = -1; t < 4; t += 0.1)

   {

      const long k = mac::locate(s, m, t);

      assert (0 <= k && k <  m - 1 && s[k] <= t && t < s[k + 1] ||

              0 >  k               && s[k + 1] > t              ||

                        k == m - 1 && s[k] <= t);

      assert (k == mac::hunt(s, m, t, m / 2));

   }

 

   return 0;

}

 

Listing 6.1.1.0.1 Testing mac::locate and mac::hunt.

 

6.1.2 locate and hunt functions wrapped in cpl.mac.op

 

These functions are available in cpl.mac.op, instantiated with several types.

 

locateVTYP(s, x, i = -1)

huntVTYP(s, x, i = -1)

The monotonic sequence s must be cpl.std.vectorTYP, where available TYPs are DBL, FLT, INT, SIN, LUI, UIN, and SUI (Table 1.3.2.0.1).

 

locateVsDBL(s, l, x, i = -1)

huntVsDBL(s, l, x, i = -1)

The monotonic sequence s must be a cpl.std.vectorDBL.  l is the effective size of the sequence, which must be no greater than the real size of s.  If l is smaller than the real size, only the first l elements are used.

 

locateADBL(s, l, x, i = -1)

huntADBL(s, l, x, i = -1)

The monotonic sequence s must be a C array of doubles, which is the return type of cpl.ptr.new_ppDBL() or cpl.ptr.copy_ppDBL().  l must be no greater than the real size of the array.

 

#!/usr/bin/python

 

import cpl

import math

 

n = 20

s = cpl.std.vectorDBL()

for i in range(n): s.append(math.log(i + 1))

   # s is populated with an ascending sequence.

 

for i in range(-10, 40):

   t = 0.1 * i

   k = cpl.mac.op.locateVDBL(s, t)

   assert 0 <= k and k <  n - 1 and s[k] <= t and t < s[k + 1] or\

          0 >  k                and s[k + 1] > t               or\

                     k == n - 1 and s[k] <= t

   assert k == cpl.mac.op.huntVDBL(s, t, n / 2)

 

# Repeat the previous section, but specify the size of the vector.

m = n / 2

for i in range(-10, 40):

   t = 0.1 * i

   k = cpl.mac.op.locateVsDBL(s, m, t)

   assert 0 <= k and k <  m - 1 and s[k] <= t and t < s[k + 1] or\

          0 >  k                and s[k + 1] > t               or\

                     k == m - 1 and s[k] <= t

   assert k == cpl.mac.op.huntVsDBL(s, m, t, m / 2)

 

# Make a C array that contains the reverse order of s.

r = []

for i in range(n): r.append(s[i])      # Copy vectorDBL to a list

r.reverse()                            # Reverse the order

a = cpl.mathematics.sequenceToArray(r) # Make a C array from the list

app = cpl.ptr.copy_ppDBL(a.cast())     # Need a pointer to pointer to double

for i in range(-10, 40):

   t = 0.1 * i

   k = cpl.mac.op.locateADBL(app, n, t)

   assert 0 <= k and k <  n - 1 and a[k] >= t and t > a[k + 1] or\

          0 >  k                and a[k + 1] < t               or\

                     k == n - 1 and a[k] >= t

   assert k == cpl.mac.op.huntADBL(app, n, t, n / 2)

cpl.ptr.delete_ppDBL(app)              # Clean up

 

Listing 6.1.2.0.1 Testing functions locate and hunt in Python.

 

6.2 Golden Section Search

 

If one has to find a minimum of a smooth, one-dimensional function within a known range, golden section search is a robust algorithm of choice.

 

6.2.1 Template function mac::golden

 

template <class X, class Y>

Y golden(const X& xa, const X& xb, const X& xc,

         Y (*f)(const X&), X *xmin,

         const X& t = 0.0001);

Given a one-dimensional function f with domain type X and range type Y, this function returns the minimum of the function and stores the abscissa of the minimum in xmin.  The golden section search is performed between abscissas xa and xc.  xb is an initial guess for where the minimum lies; that is, f(xb) must be less than both f(xa) and f(xc).  The search stops when a fractional precision of about t is reached.
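The algorithm itself can be sketched in Python as follows.  This standalone function mirrors the interface of mac::golden but is only an illustration, not the library code; it returns the pair (minimum value, abscissa of the minimum).

```python
import math

def golden(xa, xb, xc, f, t=1e-4):
    """Golden section search for a minimum of f bracketed by (xa, xb, xc)."""
    R = (math.sqrt(5) - 1) / 2       # golden ratio, ~0.618
    C = 1 - R
    x0, x3 = xa, xc
    # Place the first two interior points inside the larger sub-interval.
    if abs(xc - xb) > abs(xb - xa):
        x1, x2 = xb, xb + C * (xc - xb)
    else:
        x2, x1 = xb, xb - C * (xb - xa)
    f1, f2 = f(x1), f(x2)
    while abs(x3 - x0) > t * (abs(x1) + abs(x2)):
        if f2 < f1:                  # minimum lies toward x3
            x0, x1, f1 = x1, x2, f2
            x2 = R * x1 + C * x3
            f2 = f(x2)
        else:                        # minimum lies toward x0
            x3, x2, f2 = x2, x1, f1
            x1 = R * x2 + C * x0
            f1 = f(x1)
    return (f1, x1) if f1 < f2 else (f2, x2)
```

Each iteration shrinks the bracket by the constant factor 0.618, which is what makes the method robust: it needs no derivatives and its convergence is guaranteed for any unimodal function.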

 

#include <iostream>

#include <cmath>

#include <mac/golden.h>

 

double func(const double& x)

{

   return -cos(x - 0.2);

}

 

int main()

{

   double xmin;

   const double ymin = mac::golden(-0.5, 0.1, 0.5, func, &xmin);

   std::cout << "xmin, ymin shall be close to 0.2, -1, respectively." << std::endl;

   std::cout << "xmin = " << xmin << std::endl;

   std::cout << "ymin = " << ymin << std::endl;

 

   return 0;

}

 

Listing 6.2.1.0.1 Testing mac::golden().

 

6.2.2 cpl.mac.op.goldenDBL

 

goldenTYP(xa, xb, xc, f, t = 0.0001)

Given a pointer to a C++ function of type T (*f)(const T&), this function returns a pair containing the minimum value of f and the abscissa where the minimum is found between xa and xc.  xb is a hint for where the minimum can be found.  t is a fractional tolerance.  The available types are DBL and FLT.

 

#include <cmath>

 

double y(const double& x)

{

   return -cos(x - 0.2);

}

 

Listing 6.2.2.0.1 Call-back function in C++ for cpl.mac.op.goldenDBL.

 

%module cb

 

%{

#include "cb.h"

%}

 

%include "cb.h"

 

%constant double (*goldenDBL_cb)(const double&) = y;

 

Listing 6.2.2.0.2 SWIG instruction that wraps the call-back function into Python.

 

#!/usr/bin/python

 

import cpl, cb

 

ymin, xmin = cpl.mac.op.goldenDBL(-0.5, 0.1, 0.5, cb.goldenDBL_cb, 1e-6)

print "xmin, ymin shall be close to 0.2, -1, respectively."

print "xmin =", xmin

print "ymin =", ymin

 

Listing 6.2.2.0.3 Testing goldenDBL in Python.

 

xmin, ymin shall be close to 0.2, -1, respectively.

xmin = 0.199689437822

ymin = -0.999999951776

 

Listing 6.2.2.0.4 Output result from the testing program above.

 

This is a trivial example, but it illustrates the use of a call-back function.  When the template function mac::golden() is wrapped into cpl.mac.op, the call-back function it requires must also be wrapped from C++.  The SWIG instruction defines the wrapped call-back function as goldenDBL_cb.

 

6.3 Linear Least-squares Minimization

 

Consider a set of experimental measurements and a theoretical model intended to describe the experiment.  It is desired to adjust several parameters in the model so that it fits the noise-containing experimental data as well as possible.  This process is also known as parameterization.  Here 'fit' may be understood in the least-squares sense when the noise is normally distributed; however, least-squares may not be the best criterion for other noise distributions, as we will see later.  When the theoretical model can be written as a linear equation Y = a1 f1(x) + a2 f2(x) + ... + an fn(x), where the ai are the coefficients to be adjusted and the fi(x) are a set of basis functions, the least-squares minimization is very straightforward.
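The linear case can be illustrated with a short Python sketch that fits the coefficients ai by forming and solving the normal equations for a user-supplied set of basis functions.  This is a hypothetical helper, not part of CPL; in practice a QR or SVD solution is numerically safer than forming the normal equations directly.

```python
def linear_lsq(basis, xs, ys):
    """Fit Y = sum_i a_i * f_i(x) to data (xs, ys); returns the a_i."""
    n = len(basis)
    # Normal equations (A^T A) a = A^T y, with A[j][i] = f_i(x_j).
    M = [[sum(fi(x) * fk(x) for x in xs) for fk in basis] for fi in basis]
    b = [sum(fi(x) * y for x, y in zip(xs, ys)) for fi in basis]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    a = [0.0] * n
    for i in range(n - 1, -1, -1):
        a[i] = (b[i] - sum(M[i][k] * a[k] for k in range(i + 1, n))) / M[i][i]
    return a
```

The minimization is straightforward precisely because the model is linear in the ai: the normal equations are a single linear system, and no iteration is needed.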

 

6.4 Nonlinear Least-squares Minimization

 

When the theoretical model cannot be described by a linear equation, as often happens in crystallographic computing, nonlinear minimization becomes necessary.  This section presents template classes that perform large-scale nonlinear least-squares fitting against various types of experimental measurements, using a trust-region implementation of the Levenberg-Marquardt method.

 

The theoretical model can be written as a nonlinear function Y = f(x, A), where A is a set of adjustable parameters A = (a1, a2, ..., an).  A set of experimental measurements (xj, yj), j = 1, 2, ..., m, is available; each is also known as a data point.  Theoretical calculation using the model at the experimental abscissas xj yields (xj, Yj), j = 1, 2, ..., m.  The best set of parameters ai in the least-squares sense is found when the chi-square χ² = Σj wj (yj − Yj)² is minimized, where wj is the weight of data point j.
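The idea can be illustrated with a compact, self-contained Levenberg-Marquardt loop in Python for a two-parameter model Y = a·exp(b·x).  This is only a sketch of the method, not the trust-region implementation used by mac::nonlinear; the function lm_fit and its simple damping schedule are hypothetical.

```python
import math

def lm_fit(xs, ys, ws, a, b, iters=50):
    """Minimize chi^2 = sum_j w_j (y_j - a*exp(b*x_j))^2 by Levenberg-Marquardt."""
    lam = 1e-3                       # damping factor
    def chi2(a, b):
        return sum(w * (y - a * math.exp(b * x)) ** 2
                   for x, y, w in zip(xs, ys, ws))
    for _ in range(iters):
        # Accumulate J^T W J and J^T W r, with dY/da = exp(bx), dY/db = a*x*exp(bx).
        g00 = g01 = g11 = c0 = c1 = 0.0
        for x, y, w in zip(xs, ys, ws):
            e = math.exp(b * x)
            ja, jb = e, a * x * e
            r = y - a * e            # residual
            g00 += w * ja * ja; g01 += w * ja * jb; g11 += w * jb * jb
            c0 += w * ja * r;   c1 += w * jb * r
        # Damped normal equations: (J^T W J + lam * diag) delta = J^T W r.
        m00, m11 = g00 * (1 + lam), g11 * (1 + lam)
        det = m00 * m11 - g01 * g01
        da = (m11 * c0 - g01 * c1) / det
        db = (m00 * c1 - g01 * c0) / det
        if chi2(a + da, b + db) < chi2(a, b):
            a, b, lam = a + da, b + db, lam * 0.5   # accept step, relax damping
        else:
            lam *= 10                                # reject step, damp harder
    return a, b
```

Large damping makes the step a small gradient descent; small damping makes it a Gauss-Newton step.  A trust-region implementation replaces this heuristic damping schedule with an explicit bound on the step length.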

 

6.4.1 Base class mac::nonlinear<Y, X>

 

This base class is an abstract template.  Use the derived classes described below for the various measurement types.  The parameters ai are always of type double; if they are of some other type, e.g., complex<double>, they must be split into doubles.  Y is the type of both the measurements yj and the theoretical model Y.  X is the type of the variable x.

 

6.4.1.1 Constructor and destructor

nonlinear(Y (*)(const X&,

                const std::vector<X>&,

                const std::vector<Y>&,

                const std::vector<double>&,

                const std::vector<bool>&,

                      std::vector<Y>*),

          const std::vector<X>&      x,

          const std::vector<Y>&      y,

          const std::vector<double>& para,

          const std::vector<bool>&   mask,

          const std::vector<double>& weight,

          const int                  iter,

          const double               tol,

          const double               ratio);

The first argument to the constructor is a function pointer.  The user-supplied function shall evaluate the theoretical model Y = f(x, A) at a given x and parameter set A.  See 6.4.1.2 for details.

 

The second and third arguments x and y are a vector<X> and a vector<Y> of the same length, respectively.  They contain the entire set of measurement data.  weight has the same length and specifies the weights of all data points.

 

The fourth and fifth arguments para and mask are vectors of a common length.  para brings in the initial parameters.  mask is a set of Booleans indicating whether the corresponding parameters are free for fitting.

 

iter is the maximum number of iterations allowed in the least-squares minimization.  tol is the fractional precision that triggers the end of the minimization.  ratio specifies the maximum data-to-parameter ratio.

 

virtual ~nonlinear();

This is a virtual destructor.

 

6.4.1.2 Evaluation functions of theoretical model

The user-supplied evaluation function is required to calculate the function value and the partial derivatives of the function with respect to all adjustable parameters.  This function can be specified at construction time and changed later by a set function.

 

void setTheoretical(

   Y (*)(const X&                   u,

         const std::vector<X>&      x,

         const std::vector<Y>&      y,

         const std::vector<double>& para,

         const std::vector<bool>&   mask,

               std::vector<Y>*      deri));

This function takes a function pointer and registers it as the evaluation function for the theoretical model Y = f(x, A) at a given x = u and a set of specified parameters A = para.  This evaluation function calculates one model value at a time.  The previous function pointer is forgotten once a new pointer is set.  The entire set of experimental measurements is also passed as x and y, since it may be needed during the evaluation.  If the last pointer deri is valid, the vector it points to will be filled with the partial derivatives of the function with respect to all parameters.  mask indicates which parameters are free and which are fixed; if a parameter is fixed, the partial derivative with respect to it need not and will not be evaluated.

 

void testCalculation(const int    = 0,

                     const double = 0.001) const;

This function tests the user-supplied evaluation function.  Given a 0-offset index of a data point and a small value used as the interval step for all parameters, it calls the user-supplied evaluation function and prints the function values and partial derivatives in a table.  The interval is used to estimate the partial derivatives from function evaluations, so one can check whether the analytic derivatives are consistent with the function values.  The default data point is the first one and the default interval is 0.001.
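The consistency check that testCalculation performs can be imitated with a centered finite difference, sketched here as an illustrative Python helper (not the class method itself):

```python
def check_derivative(f, dfda, a, h=0.001):
    """Compare an analytic derivative dfda with a centered finite
    difference of f at parameter value a, using interval step h."""
    numeric = (f(a + h) - f(a - h)) / (2 * h)
    return dfda(a), numeric
```

If the two numbers in the returned pair disagree by much more than O(h²), the analytic derivative coded in the evaluation function is probably wrong.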

 

void testCalculation(

   const int,

   const std::vector<double>&) const;

Given a 0-offset index of a data point and a vector of small numbers specifying an interval for each parameter, this function tests the user-supplied evaluation function.

 

void setTheoreticalAll(std::vector<Y> (*)(

   const std::vector<X>&               x,

   const std::vector<Y>&               y,

   const std::vector<double>&          para,

   const std::vector<bool>&            mask,

         std::vector<std::vector<Y> >* deri));

There is another type of evaluation function that calculates the whole set of model values at once; this set function registers a function pointer of that type.  Such an evaluation function does not take the variable u; instead, it evaluates at every value in the vector x and returns a vector<Y> of the same size as x.  The last pointer deri points to a vector of vectors.  If it is valid, it will be filled with partial derivatives upon return; the outer vector then has the length of x and the inner ones the length of para, although these dimensions may not be correct when entering the evaluation function.  This evaluation function has higher priority than the previous one; that is, if it is set, the other function will not be used.

 

No testing function is available for this type of evaluation function.  One may be added in a future release.

 

6.4.1.3 Experimental measurements

Experimental measurements are the noise-containing data that the parameters are fitted against.  They form a sequence of pairs {(x1, y1), (x2, y2), ..., (xm, ym)}.  Each pair is associated with a weight that indicates the reliability of the data point.  This class does not take standard deviations, since the way a weight is derived from a standard deviation may vary; how the weights are evaluated is under the user's control.  Measurement data are set at construction time, but more data can be added later, the data can be screened, and the entire data set can be exchanged after construction.

 

void appendMeasurement(const X&,

                       const Y&,

                       const double = 1);

void appendMeasurement(

   const std::vector<X>&,

   const std::vector<Y>&,

   const std::vector<double>& =

         std::vector<double>());

These functions append one or more data points to the measurement data, respectively.  The last argument, a double or a vector<double>, supplies the weights; if omitted, the weights are set to 1.  The additional data points will be used in the next fitting if no screening is done (see below).

 

void exchangeMeasurement(

   const std::vector<X>&,

   const std::vector<Y>&,

   const std::vector<double>& =

         std::vector<double>());

This function replaces the entire measurement data set with the newly specified one.  The default weights are all 1.  All data points will be used in the next fitting if no data screening is done.  The old data set is lost after calling this function.

 

void screenMeasurement(const double errmax = 0);

Given a maximum allowed error errmax, this function screens all data points: if the error of data point j exceeds errmax, the point is temporarily rejected.  Rejected data points will not be used in the next fitting, but they are not lost and can be reaccepted by a later screening or by the function useAllMeasurement().  This function may change the order of the measurement data.
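The text does not fully specify the rejection criterion; assuming a point is rejected when its weighted squared residual wj (yj − Yj)² exceeds errmax, the screening logic looks roughly like this Python sketch (a hypothetical helper, not the class method):

```python
def screen(points, model, errmax):
    """points: list of (x, y, w) tuples.  Partition into (accepted, rejected)
    by comparing the weighted squared residual against errmax."""
    accepted, rejected = [], []
    for x, y, w in points:
        if w * (y - model(x)) ** 2 <= errmax:
            accepted.append((x, y, w))   # kept for the next fitting
        else:
            rejected.append((x, y, w))   # set aside, but not lost
    return accepted, rejected
```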

 

void useAllMeasurement();

Reaccepts all data points.  The original sequence of the data points is not restored.

 

std::vector<X> getDatapoint() const;

std::vector<Y> getMeasurement() const;

Return the currently accepted xj's and yj's, respectively.

 

int getMeasurementSize() const;

int getTotalMeasurement() const;

Return the number of currently accepted data points and the total number of data points, respectively.

 

void printMeasurement(const bool = 0) const;

Prints a message on the status of the current measurement data, e.g., the numbers of accepted and rejected data points.  If the Boolean is set, the accepted xj's and yj's are also printed; however, it is a compile error if operator << is not defined for type X or Y.

 

6.4.1.4 Model parameters

The goal of this class is to find the best set of parameters for the theoretical model.  The initial parameter values can be set at construction time, but they may also be reset later.  A vector<bool> is used as a mask: if an element is true, the corresponding parameter will be refined; if false, it will be held fixed.

 

void setParameter(const std::vector<double>&);

Sets the new parameter set.  The old parameters will be lost after calling this function.

 

std::vector<double> getParameter() const;

int getParameterSize() const;

Return the current parameter set or its size, respectively.

 

void printParameter() const;

Prints a message on the current parameter set.

 

void setParameterName(

   const std::vector<std::string>&);

Sets the names of the parameters.  They will be used in printout.

 

void setMask(const std::vector<bool>&);

std::vector<bool> getMask() const;

Sets and gets the mask, respectively.

 

6.4.1.5 Other control

void setDataToParameterRatio(const double = 0);

double getDataToParameterRatio() const;

Sets and gets the maximum data-to-parameter ratio used in fitting, respectively.  If 0 is given, the maximum data-to-parameter ratio is not enforced; likewise, 0 returned from the get function indicates no such control.

 

int numberOfDataUsed();

Returns the number of data points that will actually be used in fitting, taking into account the maximum data-to-parameter ratio and the number of free parameters.  (This function should be const, but currently is not: it modifies the private data member dataUsed.  This may be fixed in a future release.)

 

void setIterationMax(const int);

int getIterationMax() const;

void printIterationMax() const;

Sets, gets, and prints the maximum number of iterations in fitting, respectively.

 

void setTolerance(const double);

double getTolerance() const;

void printTolerance() const;

Sets, gets, and prints the fractional tolerance that controls termination of the least-squares fitting, respectively.

 

6.4.1.6 Least-squares fitting

double fit(const std::vector<bool>& =

                 std::vector<bool>());

This function is the most important one in the class.  It performs the least-squares fitting, altering the parameter set and the statistics of the result.  It may optionally take a new mask.  This function can be called multiple times with different masks.

 

double fit(const std::vector<double>&);

Performs the least-squares fitting starting from the specified new parameter set instead of the one from before this fitting.  The parameters before the fitting are lost.

 

void unfit();

Undoes the last fitting: the parameter set and statistics before the last fitting are restored, and those after the last fitting are saved.  However, if a new mask was used in the last fitting, that mask remains in effect after calling unfit().  If a new parameter set was given to the last fitting, that set, not the parameters before the last fitting, is what unfit() restores.  Do not change the data-to-parameter ratio between the last fitting and a call to this function.

 

6.4.1.7 Statistics of result

double calculateChiSquare();

double getChiSquare() const;

Calculate and return χ² = Σj wj (yj − Yj)², respectively, where wj is the weight of data point j.  The first function also returns the updated χ².

 

double getRMSD() const;

double RMSDDrop() const;

Return the root-mean-square deviation, defined as RMSD = sqrt(χ² / m) where m is the number of data points used, and the fractional difference between the RMSD before the last fitting and the current one, respectively.
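In Python the two statistics amount to the following sketch.  The division by the number of data points in rmsd is an assumption, but it reproduces the χ² and RMSD pairs shown in the sample output of 6.4.5.1 (e.g., sqrt(8.557 / 100) ≈ 0.2925).

```python
import math

def chi_square(ys, Ys, ws):
    """chi^2 = sum_j w_j (y_j - Y_j)^2 over measured ys and model values Ys."""
    return sum(w * (y - Y) ** 2 for y, Y, w in zip(ys, Ys, ws))

def rmsd(ys, Ys, ws):
    """Root-mean-square deviation, assumed here to be sqrt(chi^2 / m)."""
    return math.sqrt(chi_square(ys, Ys, ws) / len(ys))
```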

 

6.4.2 Derived class mac::nonlinearOperator<Y = double, X = int> for operator *

 

The base class nonlinear<Y, X> is a pure virtual class, since a required function dotProduct() is left undefined.  If type Y is a single value such as double, float, or int, or a vector such as mac::vector3D<T>, use this derived class.  The operator * of type Y must be defined as follows: for two vectors u = (u1, u2, ..., un) and v = (v1, v2, ..., vn), u * v = u1v1 + u2v2 + ... + unvn.  This derived class can be used for any type whose operator * is defined in this way.

 

Constructor

nonlinearOperator(Y (*)(const X&,

                        const std::vector<X>&,

                        const std::vector<Y>&,

                        const std::vector<double>&,

                        const std::vector<bool>&,

                              std::vector<Y>*) = NULL,

   const std::vector<X>&      x      =

         std::vector<X>(),

   const std::vector<Y>&      y      =

         std::vector<Y>(),

   const std::vector<double>& para   =

         std::vector<double>(),

   const std::vector<bool>&   mask   =

         std::vector<bool>(),

   const std::vector<double>& weight =

         std::vector<double>(),

   const int                  iter   = 50,

   const double               tol    = 1e-6,

   const double               ratio  = 0);

Since this constructor supplies defaults for all arguments, it also serves as the default constructor.

 

6.4.3 Derived class mac::nonlinearComplex<Y = complex<double>, X = int> for complex

 

If type Y is complex or a class derived from complex, use this class.  The functions real(y) and imag(y) must be defined, where y is of type Y.  The constructor is the same as that of nonlinearOperator<Y, X>.

 

6.4.4 Derived class mac::nonlinearDot_prod<Y = TNT::Vector<double>, X = int> for types with dot_prod()

 

If type Y has a function dot_prod() defined for the dot product, such as TNT::Vector<T>, use this class.  The constructor is the same as that of nonlinearOperator<Y, X>.

 

6.4.5 Testing non-linear least-squares fitting

 

6.4.5.1 A simple non-linear function

This example uses the generalized Gaussian function described in 4.1, which has six refinable parameters.  Since both the variable and the function are of type double, and this type supports operator *, we construct an object of mac::nonlinearOperator<double, double> for the refinement.  See Listing 6.4.5.1.1 for details.

 

#include <cassert>

#include <cmath>

#include <iostream>

#include <vector>

#include <mac/utility.h>

#include <mac/gGaussian.h>

#include <mac/nonlinear.h>

 

std::vector<double> gall(const std::vector<double>&,

                         const std::vector<double>&,

                         const std::vector<double>&,

                         const std::vector<bool>&,

                               std::vector<std::vector<double> >*);

 

int main()

{

   const double noise = 0.1;

   const int n = 100;

 

   // Parameter set

   std::vector<double> para(6);

   para[0] = 2;   // a

   para[1] = 3;   // b

   para[2] = 1.5; // c

   para[3] = 5;   // d

   para[4] = -1;  // e

   para[5] = 3;   // f

 

   // Generate a noise-containing data set, and weights.

   std::vector<double> x(n);

   std::vector<double> y(n);

   std::vector<double> w(n, 1);

   for (int i = 0; i < n; i++)

   {

      x[i] = 0.5 * (i - 50);

      y[i] = mac::gGaussian(x[i], para) +

             noise * para[5] * (mac::rand_fractional() - 0.5);

   // cout << i << " " << x[i] << " " << y[i] << endl;

   }

 

   // Masks and initial parameters

   std::vector<bool>   mask(6, 0);

   std::vector<double> init(6);

   init[0] = 0;

   mask[0] = 1;

   init[1] = log(1.7);

   mask[1] = 0;

   init[2] = log(1);

   mask[2] = 0;

   init[3] = log(para[3]);

   mask[3] = 1;

   init[4] = 0;

   mask[4] = 0;

   init[5] = 1;

   mask[5] = 1;

 

   // Constructs a non-linear least-squares fitting object.

   mac::nonlinearOperator<double, double> nlls(mac::gGaussian, x, y, init,

                                               mask, w);

   nlls.testCalculation(n / 3); // Testing the evaluation function.

 

   nlls.fit();

 

   nlls.setTheoreticalAll(gall); // Switch to a new evaluation function.

 

   mask = std::vector<bool>(6, 1);

   nlls.fit(mask); // Fitting more parameters.

 

   nlls.appendMeasurement(-25.5, mac::gGaussian(-25.5, para) +

                          noise * para[5] * (mac::rand_fractional() - 0.5));

   nlls.fit(); // Add an individual data point and fit again.

 

   for (int i = 0; i < n; i++)

   {

      x[i] += 0.25;

      y[i] = mac::gGaussian(x[i], para) +

             noise * para[5] * (mac::rand_fractional() - 0.5);

   }

   nlls.appendMeasurement(x, y, w);

   nlls.fit(); // Add a lot more data and fit again.

 

   for (int i = 0; i < n; i++)

   {

      x[i] = (i - 50) / 3.;

      y[i] = mac::gGaussian(x[i], para) +

             noise * para[5] * (mac::rand_fractional() - 0.5);

   }

   nlls.exchangeMeasurement(x, y, w);

   nlls.fit(); // Exchange data set and fit again.

 

   nlls.screenMeasurement(0.1);

   nlls.fit(); // Select the best fitted data points and fit again.

 

   init = nlls.getParameter();

   std::cout << "    target refined" << std::endl;

   std::cout << "a = " << para[0] << "      " << init[0]               << std::endl;

   std::cout << "b = " << para[1] << "      " << 1 + std::exp(init[1]) << std::endl;

   std::cout << "c = " << para[2] << "      " << std::exp(init[2])     << std::endl;

   std::cout << "d = " << para[3] << "      " << std::exp(init[3])     << std::endl;

   std::cout << "e = " << para[4] << "      " << init[4]               << std::endl;

   std::cout << "f = " << para[5] << "      " << init[5]               << std::endl;

 

   // Several functions related to measurements.

   assert (nlls.getDatapoint().size() == nlls.getMeasurement().size());

   assert (nlls.getDatapoint().size() == nlls.getMeasurementSize());

   assert (nlls.getTotalMeasurement() >= nlls.getMeasurementSize());

   assert (nlls.getTotalMeasurement() == n);

   nlls.useAllMeasurement();

   assert (nlls.getTotalMeasurement() == nlls.getMeasurementSize());

 

// nlls.printMeasurement();

// nlls.printMeasurement(1);

 

   nlls.setParameter(init);

      // The current result is lost.  All parameters are set back to the

      // initial values.

 

   // Specify the parameter names.

   std::vector<std::string> name(nlls.getParameterSize());

   name[0] = std::string("center");

   name[1] = std::string("base");

   name[2] = std::string("kurtosis");

   name[3] = std::string("deviation");

   name[4] = std::string("skew");

   name[5] = std::string("amplitude");

   nlls.setParameterName(name);

   mask[2] = mask[4] = 0;

   nlls.setMask(mask);

   nlls.fit();

   mask[2] = mask[4] = 1;

   nlls.fit(mask);

 

   nlls.setDataToParameterRatio(10);

   nlls.fit(); // Limit data-to-parameter ratio.

 

   nlls.setIterationMax(100);

   nlls.setTolerance(1e-8);

 

   // Testing undo function.

   init = nlls.getParameter();

   const double chi2 = nlls.getChiSquare();

   const double rmsd = nlls.getRMSD();

   nlls.fit();

   const double drop = nlls.RMSDDrop();

   nlls.unfit();

   for (int i = 0; i < 6; i++) assert (init[i] == nlls.getParameter()[i]);

   assert (chi2 == nlls.getChiSquare());

   assert (rmsd == nlls.getRMSD());

   assert (drop * nlls.RMSDDrop() < 0);

 

   return 0;

}

 

std::vector<double> gall(const std::vector<double>& x,

                         const std::vector<double>& y,

                         const std::vector<double>& p,

                         const std::vector<bool>&   m,

                               std::vector<std::vector<double> >* d)

{

   if (d)

      *d = std::vector<std::vector<double> >(x.size(),

                                             std::vector<double>(p.size()));

   std::vector<double> g(x.size());

   for (int i = 0; i < x.size(); i++)

   {

      if (d)

      {

         std::vector<double> deri(p.size());

         g[i] = mac::gGaussian(x[i], x, y, p, m, &deri);

         (*d)[i] = deri;

      }

      else

      {

         g[i] = mac::gGaussian(x[i], x, y, p);

      }

   }

   return g;

}

 

Listing 6.4.5.1.1 Testing non-linear least-squares fitting.

 

===============================================================

           Non-linear Least-squares Minimization by

a Trust-region Implementation of the Levenberg-Marquardt Method

===============================================================

parameter   derivative from derivative and function evaluation

_________   __________________________________________________

        0           -3.82760385e-02          -3.82760389e-02

        1           -1.03119719e-01          -1.03119724e-01

        2           -1.72637953e-01          -1.72637922e-01

        3            3.25346328e-01           3.25346362e-01

        4            1.76053619e+00           1.76051118e+00

        5            5.66707189e-02           5.66707189e-02

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 1.080e+02 1.039e+00

  1     0.689 1.111e+01 1.000e+00  0.606655  0.461623 5.814e+01 7.625e-01

  2     0.000 1.759e+01 6.326e-01  0.537312  0.377137 3.621e+01 6.018e-01

  3     0.000 1.045e+01 8.600e-01  0.806628  0.457703 1.964e+01 4.432e-01

  4     0.000 9.589e+00 4.037e-01  0.553107  0.473764 1.033e+01 3.215e-01

  5     0.000 2.743e+00 3.626e-01  0.179765  0.153919 8.744e+00 2.957e-01

  6     0.000 1.602e+00 1.017e-01  0.025775  0.019211 8.576e+00 2.929e-01

  7     0.000 3.795e-01 5.856e-02  0.002792  0.001949 8.559e+00 2.926e-01

  8     0.000 1.588e-01 1.389e-02  0.000302  0.000207 8.558e+00 2.925e-01

  9     0.000 4.641e-02 5.810e-03  0.000033  0.000022 8.557e+00 2.925e-01

 10     0.000 1.667e-02 1.698e-03  0.000004  0.000002 8.557e+00 2.925e-01

Termination       : the minimal chi-square is almost reached.

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      5.274444       0.000000      1      5.274444             -

  1      0.530628       0.530628      0      0.000000      0.000000%

  2      0.000000       0.000000      0      0.000000             -

  3      1.691868       1.609438      1      0.082430      5.121686%

  4      0.000000       0.000000      0      0.000000             -

  5      3.122798       1.000000      1      2.122798    212.279827%

 

Total parameters  : 6

  Refined         : 3

Total measurements: 100

  Accepted        : 100

  Rejected        : 0

  Used            : 100

Data-to-parameter : 33.3333

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :      8.55746       107.996      -99.4389      -92.0762%

R.M.S.D.          :     0.292531       1.03921     -0.746682      -71.8507%

(Current and previous values, absolute and relative changes)

 

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 8.557e+00 2.925e-01

  1     0.000 7.730e+00 1.733e+00  0.457289  0.508889 4.203e+00 2.050e-01

  2     0.000 9.916e+00 2.913e-01  0.506375  0.187454 3.415e+00 1.848e-01

…

 40     0.071 4.152e-04 1.813e-05  0.000001 -0.000000 7.106e-01 8.430e-02

Termination       : the minimal chi-square is almost reached.

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      2.113545       5.274444      1     -3.160899    -59.928579%

  1      0.752051       0.530628      1      0.221423     41.728406%

  2      0.406333       0.000000      1      0.406333             -

  3      1.625241       1.691868      1     -0.066627     -3.938099%

  4     -0.892106       0.000000      1     -0.892106             -

  5      2.980295       3.122798      1     -0.142503     -4.563322%

 

Total parameters  : 6

  Refined         : 6

Total measurements: 100

  Accepted        : 100

  Rejected        : 0

  Used            : 100

Data-to-parameter : 16.6667

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :     0.710599       8.55746      -7.84686      -91.6961%

R.M.S.D.          :    0.0842971      0.292531     -0.208234      -71.1836%

(Current and previous values, absolute and relative changes)

 

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 7.106e-01 8.388e-02

  1     0.000 1.009e-01 1.000e+00  0.000095 -0.011962 7.106e-01 8.388e-02

  2     0.002 1.009e-02 3.166e-03  0.000072 -0.003116 7.106e-01 8.388e-02

  3     0.032 1.152e-03 3.166e-04  0.000010 -0.000034 7.106e-01 8.388e-02

  4     0.296 4.073e-04 3.616e-05  0.000001 -0.000001 7.106e-01 8.388e-02

Termination       : the minimal chi-square is almost reached.

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      2.113545       2.113545      1      0.000000      0.000000%

  1      0.752051       0.752051      1      0.000000      0.000000%

  2      0.406333       0.406333      1      0.000000      0.000000%

  3      1.625241       1.625241      1      0.000000      0.000000%

  4     -0.892106      -0.892106      1      0.000000     -0.000000%

  5      2.980295       2.980295      1      0.000000      0.000000%

 

Total parameters  : 6

  Refined         : 6

Total measurements: 101

  Accepted        : 101

  Rejected        : 0

  Used            : 101

Data-to-parameter : 16.8333

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :     0.710627      0.710627             0             0%

R.M.S.D.          :    0.0838803     0.0838803             0             0%

(Current and previous values, absolute and relative changes)

 

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 1.587e+00 8.887e-02

  1     0.000 5.232e-01 1.000e+00  0.012115 -0.173823 1.587e+00 8.887e-02

  2     0.002 1.365e-01 1.150e-02  0.011812 -0.021086 1.587e+00 8.887e-02

…

 26     0.000 4.680e-03 5.519e-04  0.000002  0.000003 1.566e+00 8.828e-02

Termination       : the minimal chi-square is almost reached.

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      2.151115       2.113545      1      0.037570      1.777603%

  1      0.764862       0.752051      1      0.012811      1.703448%

  2      0.406709       0.406333      1      0.000376      0.092438%

  3      1.622690       1.625241      1     -0.002551     -0.156937%

  4     -0.823404      -0.892106      1      0.068702     -7.701115%

  5      2.972653       2.980295      1     -0.007642     -0.256427%

 

Total parameters  : 6

  Refined         : 6

Total measurements: 201

  Accepted        : 201

  Rejected        : 0

  Used            : 201

Data-to-parameter : 33.5

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :      1.56643        1.5873    -0.0208755      -1.31516%

R.M.S.D.          :     0.088279     0.0888653  -0.000586292     -0.659754%

(Current and previous values, absolute and relative changes)

 

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 6.846e-01 8.274e-02

  1     0.000 1.592e+00 1.000e+00  0.077658 -1.290891 6.846e-01 8.274e-02

  2     0.002 1.818e-01 4.134e-02  0.070557 -0.427366 6.846e-01 8.274e-02

…

 49     0.058 2.245e-02 5.690e-04  0.000419  0.000121 6.319e-01 7.949e-02

Termination       : maximum iteration is reached.

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      2.033341       2.151115      1     -0.117774     -5.475041%

  1      0.751445       0.764862      1     -0.013417     -1.754153%

  2      0.343421       0.406709      1     -0.063288    -15.560919%

  3      1.625396       1.622690      1      0.002705      0.166728%

  4     -0.977127      -0.823404      1     -0.153723     18.669194%

  5      3.020441       2.972653      1      0.047788      1.607594%

 

Total parameters  : 6

  Refined         : 6

Total measurements: 100

  Accepted        : 100

  Rejected        : 0

  Used            : 100

Data-to-parameter : 16.6667

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :     0.631907      0.684563    -0.0526567      -7.69201%

R.M.S.D.          :    0.0794925     0.0827383   -0.00324579      -3.92295%

(Current and previous values, absolute and relative changes)

 

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 2.501e-01 5.737e-02

  1     0.000 6.589e-01 1.000e+00  0.039507 -1.000213 2.501e-01 5.737e-02

  2     0.002 6.589e-02 1.805e-02  0.036529 -0.278972 2.501e-01 5.737e-02

…

 49     0.115 7.207e-03 1.959e-04  0.000150  0.000067 2.409e-01 5.629e-02

Termination       : maximum iteration is reached.

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      1.994435       2.033341      1     -0.038906     -1.913386%

  1      0.738485       0.751445      1     -0.012960     -1.724634%

  2      0.314409       0.343421      1     -0.029012     -8.447875%

  3      1.628153       1.625396      1      0.002757      0.169619%

  4     -1.035730      -0.977127      1     -0.058603      5.997514%

  5      3.034111       3.020441      1      0.013670      0.452587%

 

Total parameters  : 6

  Refined         : 6

Total measurements: 100

  Accepted        : 76

  Rejected        : 24

  Used            : 76

Data-to-parameter : 12.6667

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :     0.240851        0.2501   -0.00924895       -3.6981%

R.M.S.D.          :    0.0562947     0.0573654   -0.00107071      -1.86647%

(Current and previous values, absolute and relative changes)

 

    target refined

a = 2      1.99443

b = 3      3.09276

c = 1.5      1.36945

d = 5      5.09445

e = -1      -1.03573

f = 3      3.03411

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 6.386e-01 7.991e-02

  1     0.000 8.103e-02 1.000e+00  0.003373  0.003000 6.367e-01 7.979e-02

  2     0.000 2.483e-02 2.080e-03  0.000276  0.000131 6.366e-01 7.979e-02

  3     0.000 1.588e-02 6.375e-04  0.000110  0.000040 6.366e-01 7.979e-02

  4     0.000 7.940e-03 4.079e-04  0.000045  0.000011 6.366e-01 7.979e-02

  5     0.000 3.970e-03 2.040e-04  0.000019  0.000001 6.366e-01 7.979e-02

  6     0.000 1.054e-03 1.020e-04  0.000008 -0.000002 6.366e-01 7.979e-02

  7     0.960 5.130e-04 2.707e-05  0.000006 -0.000000 6.366e-01 7.979e-02

  8     3.478 2.565e-04 1.318e-05  0.000004  0.000000 6.366e-01 7.979e-02

  9     6.897 1.243e-04 6.589e-06  0.000002 -0.000000 6.366e-01 7.979e-02

Termination       : the minimal chi-square is almost reached.

 

Parameter Name

______________

  0   center

  1   base

  2   kurtosis

  3   deviation

  4   skew

  5   amplitude

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      1.991015       1.994435      1     -0.003420     -0.171456%

  1      0.745930       0.738485      1      0.007445      1.008160%

  2      0.314409       0.314409      0      0.000000      0.000000%

  3      1.626524       1.628153      1     -0.001629     -0.100032%

  4     -1.035730      -1.035730      0      0.000000     -0.000000%

  5      3.034902       3.034111      1      0.000791      0.026077%

 

Total parameters  : 6

  Refined         : 4

Total measurements: 100

  Accepted        : 100

  Rejected        : 0

  Used            : 100

Data-to-parameter : 25

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :     0.636574      0.638606    -0.0020322     -0.318225%

R.M.S.D.          :    0.0797856     0.0799128  -0.000127252     -0.159239%

(Current and previous values, absolute and relative changes)

 

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 6.366e-01 7.979e-02

  1     0.000 3.506e-01 1.000e+00  0.009449  0.003828 6.341e-01 7.963e-02

  2     0.000 1.271e-01 8.919e-03  0.005537  0.002685 6.324e-01 7.953e-02

  3     0.000 1.447e-01 3.231e-03  0.002883  0.002504 6.309e-01 7.943e-02

  4     0.000 2.686e-02 3.670e-03  0.000461 -0.001000 6.309e-01 7.943e-02

  5     0.002 1.312e-02 6.814e-04  0.000460 -0.000022 6.309e-01 7.943e-02

  6     0.009 1.312e-02 3.328e-04  0.000456  0.000326 6.306e-01 7.941e-02

  7     0.002 6.560e-03 3.329e-04  0.000139  0.000015 6.306e-01 7.941e-02

  8     0.002 1.376e-02 1.664e-04  0.000125  0.000094 6.306e-01 7.941e-02

  9     0.000 1.140e-02 3.492e-04  0.000033  0.000033 6.306e-01 7.941e-02

Termination       : the minimal chi-square is almost reached.

 

Parameter Name

______________

  0   center

  1   base

  2   kurtosis

  3   deviation

  4   skew

  5   amplitude

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      1.997234       1.991015      1      0.006219      0.312334%

  1      0.745739       0.745930      1     -0.000191     -0.025601%

  2      0.342565       0.314409      1      0.028156      8.955164%

  3      1.626565       1.626524      1      0.000041      0.002508%

  4     -1.026307      -1.035730      1      0.009423     -0.909811%

  5      3.020397       3.034902      1     -0.014505     -0.477947%

 

Total parameters  : 6

  Refined         : 6

Total measurements: 100

  Accepted        : 100

  Rejected        : 0

  Used            : 100

Data-to-parameter : 16.6667

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :     0.630556      0.636574   -0.00601795     -0.945367%

R.M.S.D.          :    0.0794075     0.0797856  -0.000378029     -0.473806%

(Current and previous values, absolute and relative changes)

 

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 1.793e-01 5.466e-02

  1     0.000 1.879e-01 1.000e+00  0.005918 -0.138128 1.793e-01 5.466e-02

  2     0.001 2.483e-02 5.714e-03  0.005813 -0.031741 1.793e-01 5.466e-02

…

 49     0.322 5.647e-04 1.718e-05  0.000002  0.000001 1.783e-01 5.451e-02

Termination       : maximum iteration is reached.

 

Parameter Name

______________

  0   center

  1   base

  2   kurtosis

  3   deviation

  4   skew

  5   amplitude

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      1.993053       1.997234      1     -0.004181     -0.209356%

  1      0.746763       0.745739      1      0.001023      0.137244%

  2      0.327566       0.342565      1     -0.014999     -4.378399%

  3      1.626356       1.626565      1     -0.000209     -0.012835%

  4     -1.033012      -1.026307      1     -0.006705      0.653359%

  5      3.031559       3.020397      1      0.011162      0.369541%

 

Total parameters  : 6

  Refined         : 6

Total measurements: 100

  Accepted        : 100

  Rejected        : 0

  Used            : 60

Data-to-parameter : 10

Maximum iteration : 50

Tolerance         : 1e-06

Chi-square        :     0.178268      0.179272   -0.00100433     -0.560224%

R.M.S.D.          :    0.0545081     0.0546614  -0.000153328     -0.280505%

(Current and previous values, absolute and relative changes)

 

 

Iteration

  | LM-damping factor

  |     |     Trust-region size

  |     |         |     Relative distance to solution

  |     |         |         |     Predicted & actual relative reduction

  V     V         V         V             V          Chi-square      RMSD

___ _________ _________ _________ ___________________ _________ _________

  0         -         -         -         -         - 1.783e-01 5.451e-02

  1     0.000 1.582e-01 1.000e+00  0.000311 -0.108012 1.783e-01 5.451e-02

  2     0.001 1.582e-02 4.818e-03  0.000233 -0.028601 1.783e-01 5.451e-02

…

 99     0.980 1.587e-04 4.828e-06  0.000000  0.000000 1.783e-01 5.451e-02

Termination       : maximum iteration is reached.

 

Parameter Name

______________

  0   center

  1   base

  2   kurtosis

  3   deviation

  4   skew

  5   amplitude

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      1.990700       1.993053      1     -0.002353     -0.118035%

  1      0.744816       0.746763      1     -0.001947     -0.260664%

  2      0.325980       0.327566      1     -0.001586     -0.484178%

  3      1.626775       1.626356      1      0.000419      0.025779%

  4     -1.037289      -1.033012      1     -0.004276      0.413951%

  5      3.031755       3.031559      1      0.000197      0.006496%

 

Total parameters  : 6

  Refined         : 6

Total measurements: 100

  Accepted        : 100

  Rejected        : 0

  Used            : 60

Data-to-parameter : 10

Maximum iteration : 100

Tolerance         : 1e-08

Chi-square        :     0.178262      0.178268  -5.79146e-06   -0.00324874%

R.M.S.D.          :    0.0545072     0.0545081  -8.85421e-07   -0.00162438%

(Current and previous values, absolute and relative changes)

 

Undo the last fitting.

Parameter Name

______________

  0   center

  1   base

  2   kurtosis

  3   deviation

  4   skew

  5   amplitude

 

Current & previous parameter set   Mask   Absolute & relative change

________________________________   ____   __________________________

  0      1.993053       1.990700      1      0.002353      0.118175%

  1      0.746763       0.744816      1      0.001947      0.261345%

  2      0.327566       0.325980      1      0.001586      0.486533%

  3      1.626356       1.626775      1     -0.000419     -0.025772%

  4     -1.033012      -1.037289      1      0.004276     -0.412245%

  5      3.031559       3.031755      1     -0.000197     -0.006495%

 

Total parameters  : 6

  Refined         : 6

Chi-square        :     0.178268      0.178262   5.79146e-06    0.00324885%

R.M.S.D.          :    0.0545081     0.0545072   8.85421e-07    0.00162441%

(Current and previous values, absolute and relative changes)

 

==========End of Non-linear Least-squares Minimization=========

 

Listing 6.4.5.1.2 Output from the testing program in Listing 6.4.5.1.1.

Figure 6.4.5.1.1 Generalized Gaussian function in red fitted to noise-containing data in black.
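The log above reports, per iteration, the LM damping factor, the trust-region size, and the predicted and actual relative reductions of chi-square.  The ratio of the actual to the predicted reduction (the gain ratio) is what a trust-region Levenberg-Marquardt implementation uses to steer the step: a ratio near 1 lets the trust region grow, while a small or negative ratio shrinks it and raises the damping.  A minimal one-parameter sketch of a single damped step (pure Python, illustrative only; not the CCL implementation):

```python
import math

# Toy model f(x; a) = exp(a * x); data generated with the true value a = 0.5.
xs = [0.1 * i for i in range(10)]
ys = [math.exp(0.5 * x) for x in xs]

def residuals(a):
    return [y - math.exp(a * x) for x, y in zip(xs, ys)]

def chi2(a):
    return sum(r * r for r in residuals(a))

a, lam = 0.3, 1e-3                        # starting guess and damping factor
r = residuals(a)
J = [x * math.exp(a * x) for x in xs]     # df/da at each data point

# One damped Gauss-Newton (Levenberg-Marquardt) step:
# (J^T J + lambda) * delta = J^T r.
delta = sum(j * ri for j, ri in zip(J, r)) / (sum(j * j for j in J) + lam)

# Predicted reduction comes from the linearized model, actual reduction
# from a real function evaluation; their ratio is the gain ratio.
predicted = chi2(a) - sum((ri - j * delta) ** 2 for ri, j in zip(r, J))
actual = chi2(a) - chi2(a + delta)
rho = actual / predicted
```

Here the step lands close to the true value 0.5 and the gain ratio is near 1, so a trust-region scheme would accept the step and expand the region, much like the zero-damping iterations in the log above.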

 

6.4.5.2 A more complex example

This example shows an oversimplified refinement of non-crystallographic symmetry (NCS).  Suppose there are 5 heavy-atom sites per monomer and 5 monomers per asymmetric unit related by a local 5-fold rotation axis, giving 25 sites per asymmetric unit.  This example uses an object of mac::nonlinearOperator<ccl::CartesianCoordinates, int> to refine 5-fold axes from the 25 sites.
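The test data in the listing below are built by rotating one site about a 5-fold axis in multiples of 72 degrees.  That construction can be sketched in pure Python with Rodrigues' rotation formula; the axis and site used here are illustrative stand-ins for mac::polarRotationMatrix and the point in the listing:

```python
import math

def rotate(axis, kappa_deg, v):
    # Rodrigues' formula: rotate v about the unit vector axis by kappa_deg.
    k = math.radians(kappa_deg)
    c, s = math.cos(k), math.sin(k)
    dot = sum(a * b for a, b in zip(axis, v))
    cross = (axis[1] * v[2] - axis[2] * v[1],
             axis[2] * v[0] - axis[0] * v[2],
             axis[0] * v[1] - axis[1] * v[0])
    return tuple(c * v[i] + s * cross[i] + (1 - c) * dot * axis[i]
                 for i in range(3))

axis = (0.0, 0.0, 1.0)     # an illustrative 5-fold axis along z
site = (4.0, 3.0, 5.0)     # one heavy-atom site
# Five symmetry-related copies: rotations by 0, 72, 144, 216, 288 degrees.
sites = [rotate(axis, 72.0 * i, site) for i in range(5)]
```

Two quick sanity checks for any n-fold construction: rotating by a full 360 degrees must reproduce the original site, and the components of the n copies perpendicular to the axis must sum to zero.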

 

#include <assert.h>

#include <mac/utility.h>

#include <mac/vector3D.h>

#include <mac/nonlinear.h>

#include <iostream>

#include <vector>

#include <ccl/coordinates.h>

#include <mac/rotationMatrix.h>

#include "022.h"

 

int main()

{

   const mac::vector3D<double> polar(85, 160, 72); // A 5-fold

   const double errAmp = 2;

   const ccl::CartesianCoordinates point(4, 3, 5);

 

   // Generate the first five data points.

   std::vector<ccl::CartesianCoordinates> meas(25);

   for (int i = 0; i < 5; i++)

   {

      mac::vector3D<double> p = polar.scale(mac::vector3D<double>(1, 1, i));

      p *= degreeToRadian;

      const mac::polarRotationMatrix r(p);

      meas[i] = r * point;

   }

 

   // Weights

   const std::vector<double> wght(meas.size(), 1);

 

   // Set the initial parameters

   std::vector<double> para(15);

   for (int i = 0; i < 5; i++)

   {

      mac::vector3D<double> p = polar.scale(mac::vector3D<double>(1, 1, i));

      p *= degreeToRadian;

      for (int j = 0; j < 3; j++) para[3 * i + j] = p[j];

   }

 

   // Masks

   std::vector<bool> mask(15, 0);

   mask[5] = mask[8] = mask[11] = mask[14] = 1;

 

   std::vector<int> indx(25);

   for (int i = 0; i < 25; i++) indx[i] = i;

 

   // Generate some more data with added error

   for (int i = 0; i < 25; i++)

   {

      mac::vector3D<double> err(mac::rand_fractional(),

                                mac::rand_fractional(),

                                mac::rand_fractional());

      err *= errAmp;

      meas[i] = rot(i, indx, meas, para) + err;

   }

 

   for (int i = 0; i < 25; i++)

      std::cout << meas[i] << " " << rot(i, indx, meas, para) << std::endl;

 

   mac::nonlinearOperator<ccl::CartesianCoordinates, int> nlls(

      rot, indx, meas, para, mask, wght);

// for (int i = 0; i < 25; i++) nlls.testCalculation(i);

 

   nlls.fit();

   for (int i = 3; i < 15; i++) mask[i] = 1;

   nlls.fit(mask);

 

   return 0;

}

 

Listing 6.4.5.2.1 Example of non-crystallographic symmetry refinement.

 

ccl::CartesianCoordinates rot(

   const int& index,

   const std::vector<int>& point,

   const std::vector<ccl::CartesianCoordinates>& data,

   const std::vector<double>& para,

   const std::vector<bool>& mask,

   std::vector<ccl::CartesianCoordinates>* deri)

{

   const int i = index / 5;

   const int j = mac::nearest(fmod(index, 5));

   mac::vector3D<double> p(para[3 * i], para[3 * i + 1], para[3 * i + 2]);

   const mac::polarRotationMatrix r(p);

 

   if (deri) for (int k = 0; k < 15; k++)

   {

      const int m = k / 3;

      const int l = mac::nearest(fmod(k, 3));

      if (i == m)

      {

         const mac::matrix3D<double> d =

               mac::polarRotationMatrix::derivative(p, l + 1);

         (*deri)[k] = ccl::CartesianCoordinates(data[j] * d[0],

                                                data[j] * d[1],

                                                data[j] * d[2]);

      }

      else (*deri)[k] = ccl::CartesianCoordinates();

   }

 

   return r * data[j];

}

 

Listing 6.4.5.2.2 Call-back function used by NCS refinement.

 

6.4.6 The nonlinear class in Python

 

This Python binder allows one to use the various nonlinear least-squares minimization classes with data held in Python.  However, one has to supply the function evaluation routine in C or C++.  This section demonstrates the same examples as those in 6.4.5, but this time in Python.

 

The current release has the following class types instantiated:

 

C++ class                                                Python class

mac::nonlinear<double, double>                           nonlinearBaseDBLDBL

mac::nonlinearOperator<double, double>                   nonlinearDBLDBL

mac::nonlinear<ccl::CartesianCoordinates, int>           nonlinearBaseCTC

mac::nonlinearOperator<ccl::CartesianCoordinates, int>   nonlinearCTC

 

Table 6.4.6.0.1 Instantiated nonlinear classes.

 

6.4.6.1 Constructor

nonlinearTYP(eval = 0,

             x    = 0-length vector of X,

             y    = 0-length vector of Y,

             para = cpl.std.vectorDBL(),

             mask = cpl.std.vectorBOL(),

             wght = cpl.std.vectorDBL(),

             iter = 50,

             tol  = 1e-6,

             rat  = 0)

This constructor returns a new object that can be used to perform nonlinear least-squares minimization.  See 6.4.1.1 for a description of each argument.  The first argument eval is the evaluation function described below.  X and Y are the types used to instantiate the class nonlinearTYP, such as int and ccl::CartesianCoordinates in Table 6.4.6.0.1.  The required types of x and y are the Python binders of std::vector<X> and std::vector<Y>, respectively.

 

6.4.6.2 Evaluation functions of theoretical model

Evaluation functions of a theoretical model must be implemented in C or C++ and then passed to a nonlinear object; this is the most desirable arrangement for performance.  It may also be possible to pass an evaluation function written in Python to a nonlinear object, but this has not been tested yet.

 

There are two types of evaluation function: type 1 calculates one data point at a time, and type 2 calculates all data points in a single call.  The latter is in general more efficient and therefore takes priority.  The constructor may accept a function of the first type; a set function can later switch to a new one.
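The difference between the two types can be sketched with a toy quadratic model in plain Python; the function names and signatures here are illustrative, not the actual cpl call-back prototypes:

```python
# Type 1: evaluate the theoretical model at one data point per call.
def eval_one(x_i, para):
    a, b = para
    return a * x_i * x_i + b

# Type 2: evaluate the model at all data points in a single call.
def eval_all(x, para):
    a, b = para
    return [a * xi * xi + b for xi in x]

x = [0.0, 1.0, 2.0, 3.0]
para = [2.0, -1.0]
one_by_one = [eval_one(xi, para) for xi in x]   # len(x) separate calls
all_at_once = eval_all(x, para)                 # one call; same values
```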

 

setTheoretical(eval)

Sets a new evaluation function of the first type.  The previous function is then lost.  See 6.4.1.2 for the required prototype of type 1 evaluation function.

 

setTheoreticalAll(eval)

Sets a new evaluation function of type 2 that calculates function values and partial derivatives of all data points in one call.  See 6.4.1.2 for detail.

 

testCalculation(i = 0, delta = 0.001)

Tests the type 1 evaluation function at the i-th data point, where i is a 0-offset index.  The small interval delta is used to check the consistency of the function and derivative evaluations.

 

testCalculation(i, d)

Tests the type 1 evaluation function at the i-th data point, where i is a 0-offset index.  d is a cpl.std.vectorDBL of the same length as the parameter set that contains one test interval per parameter.
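What testCalculation verifies can be sketched as a central finite-difference check: perturb one parameter by ±delta, difference the function values, and compare against the analytic derivative, as in the two derivative columns at the top of Listing 6.4.5.1.2.  A toy model (illustrative; not the cpl routine):

```python
def f(x, a):
    return a * a * x        # toy model; analytic derivative df/da = 2 * a * x

def dfda(x, a):
    return 2.0 * a * x

x, a, delta = 3.0, 1.5, 0.001
numeric = (f(x, a + delta) - f(x, a - delta)) / (2.0 * delta)
analytic = dfda(x, a)
# For a smooth model the two estimates agree to roughly delta**2.
```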

 

6.4.6.3 Experimental measurements

All functions in the original C++ code are available in Python.  See 6.4.1.3 for details.  This section only lists the function prototypes.

 

appendMeasurement(x, y, w = 1)

Appends a single data point at x and y with weight w.

 

appendMeasurement(x, y, w = cpl.std.vectorDBL())

Appends a set of data points.  x and y are equal-length vectors of types X and Y.  A vector of weights of the same length is optional.  If the weights are absent, all weights are set to 1.

 

exchangeMeasurement(x, y,

                    w = cpl.std.vectorDBL())

Exchanges the entire data set of measurements.

 

screenMeasurement(errmax = 0)

useAllMeasurement()

Screen the current data set against the error limit errmax, and restore all measurements for use, respectively.

 

getDatapoint()

getMeasurement()

Return the currently accepted xj's and yj's, respectively, as vectors.

 

getMeasurementSize()

getTotalMeasurement()

Return the number of accepted data points and the total number of measurements, respectively.

 

printMeasurement(data = 0)

Prints a summary of the measurements.  If the Boolean flag data is set, the accepted data points are printed as well.

 

6.4.6.4 Model parameters

See 6.4.1.4 for detail.

 

setParameter(p)

getParameter()

printParameter()

Sets, gets, and prints parameter set.

 

getParameterSize()

Returns the size of parameter set.

 

setParameterName(name)

Sets names for all parameters.

 

setMask(mask)

getMask()

Sets and gets mask set.

 

6.4.6.5 Other control

See 6.4.1.5 for detail.

 

setDataToParameterRatio(r = 0)

getDataToParameterRatio()

Sets and gets the maximum data-to-parameter ratio.

 

numberOfDataUsed()

Returns the number of data points used in fitting.

 

setIterationMax(iter)

getIterationMax()

printIterationMax()

Sets, gets, and prints the maximum number of iterations allowed in fitting.

 

setTolerance(tol)

getTolerance()

printTolerance()

Sets, gets, and prints a fractional tolerance that controls the fitting.

 

6.4.6.6 Least-squares fitting

See 6.4.1.6 for detail.

 

fit(mask = cpl.std.vectorBOL())

Performs least-squares fitting with an optional set of masks.  The current masks will be used if no new mask is given.

 

fit(para)

Performs least-squares fitting starting from a new set of parameters.

 

unfit()

Undoes the last fitting.

 

6.4.6.7 Statistics of result

See 6.4.1.7 for detail.

 

calculateChiSquare()

getChiSquare()

Calculate and return chi-square (χ2), respectively.

 

getRMSD()

RMSDDrop()

Return RMSD and RMSD drop, respectively.
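As a sketch of the statistics being reported: with residuals r_j = y_j - f(x_j) and weights w_j, a common convention (assumed here; see 6.4.1.7 for the library's exact definition) is chi2 = sum_j w_j * r_j**2 and RMSD = sqrt(chi2 / sum_j w_j):

```python
import math

y_obs  = [1.1, 2.0, 3.0, 4.2]   # measurements (illustrative numbers)
y_calc = [1.0, 2.0, 3.0, 4.0]   # model values at the same points
w      = [1.0, 1.0, 1.0, 1.0]   # unit weights

# Weighted chi-square and the corresponding root-mean-square deviation.
chi2 = sum(wi * (o - c) ** 2 for wi, o, c in zip(w, y_obs, y_calc))
rmsd = math.sqrt(chi2 / sum(w))
```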

 

6.4.6.8 Example 1: generalized Gaussian function

 

The generalized Gaussian function is defined in module cpl.mac.geo.  This module also defines two call-back functions for the nonlinear class, so no additional preparation of evaluation functions is needed.  Notice the use of cpl.mac.geo.nonlinearDBLDBL_cb1 and cpl.mac.geo.nonlinearDBLDBL_cb2.  The output of this program is very similar to that in Listing 6.4.5.1.2.

 

#!/usr/local/bin/python

 

import cpl

import math

 

noise = 0.1

n     = 100

 

# Parameter set

para    = cpl.std.vectorDBL(6)

para[0] = 2   # a

para[1] = 3   # b

para[2] = 1.5 # c

para[3] = 5   # d

para[4] = -1  # e

para[5] = 3   # f

 

# Generate a noise-containing data set, and weights.

x = cpl.std.vectorDBL(n)

y = cpl.std.vectorDBL(n)

w = cpl.std.vectorDBL(n, 1)

for i in range(n):

   x[i] = 0.5 * (i - 50)

   y[i] = cpl.mac.geo.gGaussian(x[i], para) +\

             noise * para[5] * (cpl.mac.utility.rand_fractional() - 0.5)

 

# Masks and initial parameters

mask    = cpl.std.vectorBOL(6, 0)

init    = cpl.std.vectorDBL(6)

init[0] = 0

mask[0] = 1

init[1] = math.log(1.7)

mask[1] = 0

init[2] = math.log(1)

mask[2] = 0

init[3] = math.log(para[3])

mask[3] = 1

init[4] = 0

mask[4] = 0

init[5] = 1

mask[5] = 1

 

# Constructs a non-linear least-squares fitting object.

nlls = cpl.mac.op.nonlinearDBLDBL(cpl.mac.geo.nonlinearDBLDBL_cb1, x, y, init,\

                                  mask, w)

nlls.testCalculation(n / 3) # Testing the evaluation function.

 

nlls.fit()

 

nlls.setTheoreticalAll(cpl.mac.geo.nonlinearDBLDBL_cb2)

   # Switch to a new evaluation function.

 

mask = cpl.std.vectorBOL(6, 1)

nlls.fit(mask) # Fitting more parameters.

 

nlls.appendMeasurement(-25.5, cpl.mac.geo.gGaussian(-25.5, para) +\

   noise * para[5] * (cpl.mac.utility.rand_fractional() - 0.5))

nlls.fit() # Add an individual data point and fit again.

 

for i in range(n):

   x[i] += 0.25

   y[i] = cpl.mac.geo.gGaussian(x[i], para) +\

             noise * para[5] * (cpl.mac.utility.rand_fractional() - 0.5)

nlls.appendMeasurement(x, y, w)

nlls.fit() # Add a lot more data and fit again.

 

for i in range(n):

   x[i] = (i - 50) / 3.

   y[i] = cpl.mac.geo.gGaussian(x[i], para) +\

             noise * para[5] * (cpl.mac.utility.rand_fractional() - 0.5)

nlls.exchangeMeasurement(x, y, w)

nlls.fit() # Exchange data set and fit again.

 

nlls.screenMeasurement(0.1)

nlls.fit() # Select the best fitted data points and fit again.

 

init = nlls.getParameter()

print "    target refined"

print "a =", para[0], "      ", init[0]

print "b =", para[1], "      ", 1 + math.exp(init[1])

print "c =", para[2], "      ", math.exp(init[2])

print "d =", para[3], "      ", math.exp(init[3])

print "e =", para[4], "      ", init[4]

print "f =", para[5], "      ", init[5]

 

# Several functions related to measurements.

assert len(nlls.getDatapoint()) == len(nlls.getMeasurement())

assert len(nlls.getDatapoint()) == nlls.getMeasurementSize()

assert nlls.getTotalMeasurement() >= nlls.getMeasurementSize()

assert nlls.getTotalMeasurement() == n

nlls.useAllMeasurement()

assert nlls.getTotalMeasurement() == nlls.getMeasurementSize()

 

# nlls.printMeasurement();

# nlls.printMeasurement(1);

 

nlls.setParameter(init)

   # The current result is lost.  All parameters are set back to the

   # initial values.

 

# Specify the parameter names.

name    = cpl.std.vectorSTR(nlls.getParameterSize())

name[0] = "center"

name[1] = "base"

name[2] = "kurtosis"

name[3] = "deviation"

name[4] = "skew"

name[5] = "amplitude"

nlls.setParameterName(name)

mask[2] = mask[4] = 0

nlls.setMask(mask)

nlls.fit()

mask[2] = mask[4] = 1

nlls.fit(mask)

 

nlls.setDataToParameterRatio(10)

nlls.fit() # Limit data-to-parameter ratio.

 

nlls.setIterationMax(100)

nlls.setTolerance(1e-8)

 

# Testing undo function.

init = nlls.getParameter()

chi2 = nlls.getChiSquare()

rmsd = nlls.getRMSD()

nlls.fit()

drop = nlls.RMSDDrop()

nlls.unfit()

for i in range(6): assert init[i] == nlls.getParameter()[i]

assert chi2 == nlls.getChiSquare()

assert rmsd == nlls.getRMSD()

assert drop * nlls.RMSDDrop() < 0

 

Listing 6.4.6.8.1 Python version of Listing 6.4.5.1.1.
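The listing above exercises fit() without showing what one iteration does internally.  The following self-contained sketch, which uses no CPL code at all and whose names (model, gauss_newton) are purely illustrative, implements the Gauss-Newton idea behind such a fit: build residuals and a finite-difference Jacobian, solve the normal equations, and update the parameters.

```python
import math

def model(x, p):
    # Plain Gaussian: p = [amplitude, center, width].
    a, m, s = p
    return a * math.exp(-(x - m) ** 2 / (2.0 * s * s))

def gauss_newton(xs, ys, p, steps=50, h=1e-6):
    # Minimal Gauss-Newton loop with a forward-difference Jacobian.
    p = list(p)
    n = len(p)
    for _ in range(steps):
        r = [ys[i] - model(xs[i], p) for i in range(len(xs))]
        J = []
        for i in range(len(xs)):
            row = []
            for j in range(n):
                q = list(p)
                q[j] += h
                row.append((model(xs[i], q) - model(xs[i], p)) / h)
            J.append(row)
        # Normal equations (J^T J) d = J^T r, solved by Gaussian elimination.
        A = [[sum(J[k][i] * J[k][j] for k in range(len(xs)))
              for j in range(n)] for i in range(n)]
        b = [sum(J[k][i] * r[k] for k in range(len(xs))) for i in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                f = A[j][i] / A[i][i]
                for k in range(i, n):
                    A[j][k] -= f * A[i][k]
                b[j] -= f * b[i]
        d = [0.0] * n
        for i in range(n - 1, -1, -1):
            d[i] = (b[i] - sum(A[i][k] * d[k]
                               for k in range(i + 1, n))) / A[i][i]
        p = [p[i] + d[i] for i in range(n)]
    return p

# Noise-free synthetic data; fit from a nearby starting point.
true_p = [2.0, 1.0, 0.5]
xs = [i * 0.1 - 1.0 for i in range(41)]
ys = [model(x, true_p) for x in xs]
fit_p = gauss_newton(xs, ys, [1.8, 0.9, 0.6])
```

A production class such as nonlinearDBLDBL adds much more on top of this skeleton (masks, weights, data screening, undo), but the core update is the same.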

 

6.4.6.9 Example 2: refinement of non-crystallographic symmetry

This example shows how to supply a user-defined call-back function to the nonlinear class so that it can evaluate function values and partial derivatives.  The call-back function was already written in Listing 6.4.5.2.2; in this example it is saved in the header file cb.h.  See 6.4.5.2 for details.  A new module containing the evaluation function defined in cb.h must be built.  The following SWIG interface file generates such a module, named cb.

 

%module cb

 

%{

#include <ccl/spaceGroup.h>

#include <ccl/coordinates.h>

#include "cb.h"

%}

 

%include "cb.h"

%constant ccl::CartesianCoordinates (*nonlinearCTC_cb)(

   const int&,

   const std::vector<int>&,

   const std::vector<ccl::CartesianCoordinates>&,

   const std::vector<double>&,

   const std::vector<bool>&,

         std::vector<ccl::CartesianCoordinates>*) = rot;

 

Listing 6.4.6.9.1 Making a user-supplied module that contains evaluation functions with SWIG.

 

After importing the module cb, the call-back function nonlinearCTC_cb can be accessed from Python.

 

#!/usr/local/bin/python

 

import cpl, cb

 

polar = cpl.mac.vm.vector3DDBL(85, 160, 72) # A 5-fold

errAmp = 2

point = cpl.ccl.symm.CartesianCoordinates(4, 3, 5)

 

# Generate the first five data points.

meas = cpl.ccl.symm.vectorCTC(25)

for i in range(5):

   p = polar.scale(cpl.mac.vm.vector3DDBL(1, 1, i))

   p *= cpl.mac.utility.degreeToRadian

   r = cpl.mac.vm.polarRotationMatrix(p(1), p(2), p(3))

   meas[i] = cpl.ccl.symm.CartesianCoordinates(r * point)

 

# Weights

wght = cpl.std.vectorDBL(len(meas))

for i in range(len(wght)): wght[i] = 1

 

# Set the initial parameters

para = cpl.std.vectorDBL(15)

for i in range(5):

   p = polar.scale(cpl.mac.vm.vector3DDBL(1, 1, i))

   p *= cpl.mac.utility.degreeToRadian

   for j in range(3): para[3 * i + j] = p(j + 1)

 

# Masks

mask = cpl.std.vectorBOL(15)

for i in range(len(mask)): mask[i] = 0

mask[5] = 1

mask[8] = 1

mask[11] = 1

mask[14] = 1

 

indx = cpl.std.vectorINT(25)

for i in range(25): indx[i] = i

 

# Generate some more data with added error

for i in range(25):

   err = cpl.mac.vm.vector3DDBL(cpl.mac.utility.rand_fractional(),

                                cpl.mac.utility.rand_fractional(),

                                cpl.mac.utility.rand_fractional())

   err *= errAmp

   meas[i] = cpl.ccl.symm.CartesianCoordinates(\

             cb.rot(i, indx, meas, para) + err)

 

for i in range(25):

   print meas[i], cb.rot(i, indx, meas, para)

 

nlls = cpl.mac.op.nonlinearCTC(cb.nonlinearCTC_cb,\

                              indx, meas, para, mask, wght)

for i in range(25): nlls.testCalculation(i)

 

nlls.fit()

for i in range(3, 15): mask[i] = 1

nlls.fit(mask)

 

Listing 6.4.6.9.2 NCS refinement in Python.
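To make the geometry of such a refinement concrete, here is a self-contained sketch, entirely independent of CPL, that recovers a single rotation angle about the z axis from rotated points by Newton iteration on the squared residual.  All names (rotate_z, fit_angle) and the numeric derivatives are assumptions of this sketch, not the method CPL uses internally; a real NCS refinement fits many angles at once, as in the listing above.

```python
import math

def rotate_z(p, t):
    # Rotate a 3-D point about the z axis by angle t (radians).
    c, s = math.cos(t), math.sin(t)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1], p[2])

def residual_norm2(points, data, t):
    # Sum of squared distances between rotated model points and measurements.
    tot = 0.0
    for p, d in zip(points, data):
        q = rotate_z(p, t)
        tot += sum((q[i] - d[i]) ** 2 for i in range(3))
    return tot

def fit_angle(points, data, t, steps=30, h=1e-5):
    # One-parameter Newton iteration using central-difference derivatives.
    for _ in range(steps):
        f0 = residual_norm2(points, data, t)
        fp = residual_norm2(points, data, t + h)
        fm = residual_norm2(points, data, t - h)
        g = (fp - fm) / (2 * h)            # first derivative
        H = (fp - 2 * f0 + fm) / (h * h)   # second derivative
        if H <= 0:
            break                          # outside the convex basin
        t -= g / H
    return t

points = [(4.0, 3.0, 5.0), (1.0, -2.0, 0.5), (-3.0, 0.5, 2.0)]
true_t = math.radians(72.0)  # a 5-fold rotation, as in the listing
data = [rotate_z(p, true_t) for p in points]
theta = fit_angle(points, data, math.radians(60.0))
```

With noise-free data the residual vanishes at the true angle, so the iteration recovers 72 degrees from the 60-degree starting guess in a few steps.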