CHAPTER 12

Tutorials

 

This chapter presents several real data sets.  Through these examples, I hope to show how the Precognition system works in practice.  Good data processing strategies are discussed, and common mistakes are analyzed.  All diffraction images, command scripts, and log files can be found at http://renzresearch.com/Precognition.

 

12.1 An Undulator Laue Dataset from a Protein Crystal

 

This dataset was collected at the BioCARS 14-ID-B station of the APS, using an undulator with a gap of 25 mm, from a crystal of the M37V mutant of CO-bound dimeric clam hemoglobin.  The crystal belongs to the monoclinic space group C2.  Its cell constants are 93.22, 44.00, 83.56 Å and 90.00, 121.95, 90.00°.  The diffraction images were recorded on a MAR345 image plate detector.  The entire dataset was collected in two passes of 31 images each.  Both passes have 6° angular spacing between consecutive images.  Pass a and pass b are offset from each other by 3°.  All image files are stored in a subdirectory images of the working directory.  This dataset is courtesy of Vukica Srajer and Reinhard Pahl of BioCARS, The University of Chicago, and James Knapp and William Royer of The University of Massachusetts.

 

12.1.1 Visual evaluation of images and estimates of soft limits

 

Visual evaluation of the diffraction images is the first step of data processing.  Visual inspection gives a general idea of the data quality and helps in developing a processing strategy.  Many graphics programs are available for this purpose.  Precognition.py can display images as well, but this capability is currently under development.

 

Figure 12.1.1.0.1 shows the first image in this dataset.  It is a high-quality image that should give good data, but the crystal orientation in this image may not be ideal for indexing, since two major ellipses are nearly tangential to each other.  Figure 12.1.1.0.2 shows a lower central portion of the first image.  Spot shapes are nicely round.  Some spatial overlaps are clearly visible.  From this zoomed-in view, one can estimate the typical spot size, which is approximately 8 × 8 pixels for this dataset.  An image several frames into the dataset shows a very typical Laue pattern at a random crystal orientation, which is in general better for indexing (Figure 12.1.1.0.3).  Images at the end of pass a and pass b show some disruption to the crystal, but they largely maintain the original diffraction quality.

 

 

Figure 12.1.1.0.1 The first image in the dataset m37v_1a_001.mar3450.

 

 

Figure 12.1.1.0.2 A portion of the first image in the dataset m37v_1a_001.mar3450.

 

 

Figure 12.1.1.0.3 An image in the dataset m37v_1a_015.mar3450.

 

Unlike a monochromatic diffraction image, a Laue image does not make it immediately obvious to what resolution the crystal diffracts.  Soft limit estimation can be used at this point, even before indexing.  Listing 12.1.1.0.1 is the command script.  Among the inputs, crystal-to-detector distance and detector pixel size should be precisely known.  Direct-beam center, wavelength range, and the peak wavelength should be roughly known.  Spot size can be estimated from visual inspection of the images.  One may start with the default σ-cut of 3.  Running this command script suggests a lower σ-cut.  Part of the log file is shown in Listing 12.1.1.0.2.  As the results of this job suggest, the subsequent indexing and geometric refinement will use a resolution of 2.0 Å and a σ-cut of 6.  The diffraction limit of this crystal is estimated at 1.56 Å.  We will see how good these estimates are.

 

diagnostic    off

Input

   Distance   180

   Center     1705 1711

   Pixel      0.1 0.1

   Format     Mar345

   Image      images/m37v_1a_015.mar3450

   Wavelength 1 1.5 1.1

   Quit

Spot          8 8 2.4

Limits

Quit

 

Listing 12.1.1.0.1 limit1.inp, estimation of soft limits before indexing.

 

 ______

|      )_

| Report |

| ------ |

| ------ |

| ------ |

| ----   |

|________|

 

Best sigma cut estimated at 2.56.

3321 real spots on this image.

Sigma cut results in 10% noise is 2.4.

Suggested sigma cut for indexing and geometry refinement is between 2.56 and 6.6.

 

Maximum spot density

14.8/mrad at Bragg angle of

16.0 degree.

 

Diffraction limit estimated at Bragg angle of

24.6 degree or

1.56 A resolution.

 

Suggested resolution for indexing and geometry refinement is 1.96 A.

 

Listing 12.1.1.0.2 Part of limit1.log, results from soft limit estimation.
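
The log reports the diffraction limit both as a Bragg angle (24.6 degree) and as a resolution (1.56 A).  The two are tied together by Bragg's law, λ = 2 d sin θ.  The log does not say which wavelength the program uses for the conversion, so the sketch below simply back-solves for the wavelength that the reported pair implies.  The function names are illustrative and are not part of Precognition.

```python
import math

def implied_wavelength(d_angstrom, bragg_angle_deg):
    """Bragg's law solved for wavelength: lambda = 2 d sin(theta)."""
    return 2.0 * d_angstrom * math.sin(math.radians(bragg_angle_deg))

def resolution(wavelength_angstrom, bragg_angle_deg):
    """Bragg's law solved for d-spacing: d = lambda / (2 sin(theta))."""
    return wavelength_angstrom / (2.0 * math.sin(math.radians(bragg_angle_deg)))

# Values taken from Listing 12.1.1.0.2.
lam = implied_wavelength(1.56, 24.6)
print(f"implied wavelength: {lam:.2f} A")  # about 1.30 A
```

The implied wavelength of about 1.30 Å lies inside the 1 to 1.5 Å range given to the command Wavelength, which is a quick sanity check on the reported limit.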

 

12.1.2 Indexing

 

Guided by these soft limits, we are now ready to index a pattern.  As noted before, it is perhaps easiest to index a random-orientation pattern like m37v_1a_015.mar3450 (Figure 12.1.1.0.3) rather than the first image (Figure 12.1.1.0.1).  As a matter of fact, I had a hard time finding a pattern in the set with which to demonstrate mis-indexing.

 

12.1.2.1 Indexing

The indexing script is listed below.  Once again, distance, pixel size, and goniometer setting are precisely known.  It is a good idea to fix the distance rather than refine it.  Center and wavelength should be approximately known.  See 2.1.3 for a discussion of errors in the direct-beam center.  Resolution and σ-cut are taken from the soft limit estimation.  Here, a one-line script for crystal information is recommended instead of the crystal information file (11.8.2), which is being deprecated.  This script can be stored in a central directory, say ~/xtal_info, and shared by many data processing jobs.  Two spot files, m37v_1a_015.re.spt and m37v_1a_015.pre.spt, will contain the recognized and predicted spots after geometric refinement.

 

Crystal 93.22 44.00 83.56 90.00 121.95 90.00 5

 

Listing 12.1.2.1.1 m37vHbI-CO.inp, a one-line script of crystal information.

 

diagnostic    off

busy          off

Input

   @ m37vHbI-CO.inp

   Distance   180 fix

   Center     1705 1711

   Pixel      0.1 0.1

   Goniometer 0 0 42

   Format     Mar345

   Image      images/m37v_1a_015.mar3450

   Resolution 2.0 100

   Wavelength 1.0 1.5 1.1

   Quit

Spot    8 8 6 m37v_1a_015.re.spt

Profile

Ellipse

Nodal

Pattern       m37v_1a_015.pre.spt

Quit

 

Listing 12.1.2.1.2 index.inp, command script for indexing.

 

The process carried out by index.inp produces several other findings regardless of whether indexing succeeds (Listing 12.1.2.1.3).  First, the command Profile learns an overall spot profile, so a better spot size can be suggested.  For example, this process suggests an overall spot size of 8 × 6 instead of 8 × 8 pixels.  The newly suggested spot size should be used by subsequent jobs.  Second, from the learned spot profile, it estimates the crystal dimension and mosaic spread using a spherical crystal and an isotropic mosaic model.  The accuracy of these values is harder to validate; nevertheless, they are useful in scaling.  It would also be a good idea to insert a command Profile into the script limit1.inp (Listing 12.1.1.0.1) after the command Spot, so that these estimates can be obtained even earlier.  Third, the direct-beam center is refined by the command Ellipse from (1705, 1711) to (1704.65, 1711.50).  This center is subsequently refined to (1704.648, 1711.326) against more than 1000 spots, which demonstrates that the center refined from a few ellipses is already quite accurate.  Finally, some nodal spots are found by the command Nodal.  Even if indexing is unsuccessful, this information may be useful in manual indexing.

 

This pattern was indexed without any struggle, and refined to a very low r.m.s.d. residual of 20 µm (0.197 pixel) from 1238 spots.  Figure 12.1.2.1.1 shows the predicted pattern after refinement.  Accurate prediction lays a solid foundation for spot integration.

 

 ______

|      )_

| Report |

| ------ |

| ------ |

| ------ |

| ----   |

|________|

 

An overall mean profile is recognized.

Semi-major & -minor axes (pixel): 2.7398 1.93204

Non-elliptical correction:        0.00750808 0.0213153 0.0429673 0.00486854

Non-Gaussian correction:          0.918543 0.821306

R.m.s.d. (detector count):        0.000264097

 

Overall spot length is set to 8 pixels.

Overall spot width  is set to 6 pixels.

Estimated crystal dimension is 0.579033 mm.

Estimated mosaic spread in FWHM is 0.00856099 degree.

 

…

 

Direct-beam center is set to 1704.65, 1711.5 in pixel.

 

…

 

72 Nodals Recognized (Ordered by Rank)

 

                    Center (pixel)     Intensity   Sigma(I)

                  _________________   __________ __________

    0    0    0    1839.44  1729.09     127480.8      628.4

    0    0    0    1509.85  2002.07      10219.2      185.5

    0    0    0    1190.44  2027.94      60649.0      434.3

    0    0    0    1607.14  1641.84     353385.2     1035.3

    0    0    0    1578.12  2209.16      34719.0      326.2

    0    0    0    1264.19  1296.46      18101.8      249.0

    0    0    0    1734.85  2321.03        213.5       88.0

    0    0    0    2251.67  2511.87       5748.0      142.9

    0    0    0    2307.05  1533.35       4859.0      147.5

    0    0    0    1867.74  2155.45      90569.8      522.1

    …

 

 ______

|      )_

| Report |

| ------ |

| ------ |

| ------ |

| ----   |

|________|

 

1 possible crystal orientation is recognized;

corresponding cell constants and detector parameters are refined.

 

Indexing 1

R.M.S. deviation (pixel):              0.196704

Number of spots matched:             1238

Cell lengths (Angstrom):              93.2118    44.0000    83.5906

Cell angles (degree):                 90.0000   122.0802    90.0000

Missetting matrix:                -0.06422445   0.02471138  -0.99762947

                                  -0.04430840   0.99863688   0.02758878

                                   0.99695135   0.04597524  -0.06304198

Goniometer omega, chi, phi(degree):    0.0000     0.0000    42.0000

Omega-axis polar orientation (deg):   90.0000     0.0000

Crystal-to-detector distance (mm):   180.0000

Direct-beam center (pixel):         1704.6480  1711.3262

Pixel size (mm):                       0.1000042  0.1000000

Detector swing angles (degree):        0.0000     0.0000

Detector tilt angles (degree):         0.1024     0.0627

Detector bulge corrections:           -1.212e-06  4.493e-10

 

Listing 12.1.2.1.3 Part of index.log.
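
Two numbers in this log are easier to judge after a unit conversion.  The sketch below converts the r.m.s. deviation from pixels to micrometers using the 0.1 mm pixel size, and measures how far the center found by Ellipse lies from the finally refined center.  All values are copied from Listing 12.1.2.1.3; the function name is illustrative only.

```python
import math

def rmsd_in_microns(rmsd_pixels, pixel_size_mm):
    """Convert a positional residual from pixels to micrometers."""
    return rmsd_pixels * pixel_size_mm * 1000.0

# R.M.S. deviation of 0.196704 pixel at 0.1 mm per pixel.
residual_um = rmsd_in_microns(0.196704, 0.1)
print(f"residual: {residual_um:.1f} micrometers")

# Center from the command Ellipse versus the fully refined center, in pixels.
ellipse_center = (1704.65, 1711.5)
refined_center = (1704.6480, 1711.3262)
shift = math.hypot(ellipse_center[0] - refined_center[0],
                   ellipse_center[1] - refined_center[1])
print(f"center shift: {shift:.3f} pixel")
```

The residual comes to just under 20 micrometers, and the two centers differ by less than a fifth of a pixel.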

 

 

Figure 12.1.2.1.1 Predicted pattern displayed over real image.

 

12.1.2.2 Estimation of more soft limits

Once a pattern is indexed and refined, more soft limits, even the λ-curve, can be estimated.  This is done by the same command Limits, but the command script differs in a number of ways.  First, the non-frame-specific parameter file obtained from indexing is loaded in place of the commands Distance, Center, and Pixel.  Second, the goniometer setting must be included, since the non-frame-specific parameter file carries no frame-specific information.  For the same reason, the image must be loaded explicitly.  Third, spot size and σ-cut should be updated with the latest values.  In addition, a filename can be supplied as a string argument to the command Limits for saving the estimated λ-curve.  If a frame-specific parameter file is given, the goniometer setting is no longer needed.

 

diagnostic    off

@ m37v_1a_015.pre.spt.inp

Input

   Goniometer 0 0 42

   Format     Mar345

   Image      images/m37v_1a_015.mar3450

   Wavelength 1 1.5 1.1

   Quit

Spot          8 6 2.16

Limits  1 1.5 estimate.lam

Quit

 

Listing 12.1.2.2.1 limit2.inp, estimation of soft limits after indexing.

 

Figure 12.1.2.2.1 Estimated λ-curves.

 

The results are similar to those in Listing 12.1.1.0.2, except that values may change because of the newly specified spot size.  The estimated λ-curves are shown in Figure 12.1.2.2.1.  Their ranges can be specified by two numerical arguments to the command Limits.  The curve with the narrower range, shown in red, is more accurate.  A better reference wavelength may be read from the λ-curves.

 

12.1.2.3 Indexing problems

We now have one pattern in the set indexed and refined, and a very good idea of many soft limits.  We are ready to refine each pattern in the set.  However, let's deviate from the main flow a bit in this section and look at some potential problems in indexing.

 

The most common source of error is uncertainty in the direct-beam center.  A test shows that the radius of convergence is 5 pixels around the correct center.  See 2.5 for a discussion of the options after a mis-indexing.  Here is an example of user-supplied nodal spots.  m37v_1a_015.mar3450 (Figure 12.1.1.0.3) was mis-indexed when the initial center deviated too far from the correct coordinates; however, 70 nodal spots were found (Figure 12.1.2.3.1).  The nodal spot file can be manually edited to select the most significant nodal spots.  Selected spots must be listed at the beginning of the file (Listing 12.1.2.3.1).  This file can then be loaded instead of running the auto-recognition command Nodal (Listing 12.1.2.3.2).  The rest of the indexing ran smoothly.

 

    0    0    0    1264.19  1296.46      18101.8      249.0

    0    0    0    2040.03  1329.59      21089.0      263.1

    0    0    0    2138.23  1921.38     135509.0      640.3

    0    0    0    1190.44  2027.94      60649.0      434.3

    0    0    0    2251.67  2511.87       5748.0      142.9

    0    0    0    2307.05  1533.35       4859.0      147.5

    0    0    0    1668.85   921.73       6006.0      153.3

    0    0    0    2827.23  1495.98        426.0       60.9

    …

 

Listing 12.1.2.3.1 m37v_1a_015.ndl.spt, manually edited nodal spots.
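
When editing a nodal spot file by hand, it can help to see the candidates ranked first.  The sketch below parses rows laid out as in the listings (three indices, center x and y in pixels, intensity, sigma) and sorts them by intensity over sigma.  This ranking is only one possible criterion and an assumption of this example; the tutorial does not state how the spots in Listing 12.1.2.3.1 were chosen, and the exact column widths of a .spt file are likewise assumed from the listings.

```python
from typing import NamedTuple

class NodalSpot(NamedTuple):
    h: int
    k: int
    l: int
    x: float
    y: float
    intensity: float
    sigma: float

def parse_spots(text):
    """Read whitespace-separated spot rows, skipping lines that do not fit."""
    spots = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue
        try:
            h, k, l = (int(f) for f in fields[:3])
            x, y, intensity, sigma = (float(f) for f in fields[3:])
        except ValueError:
            continue
        spots.append(NodalSpot(h, k, l, x, y, intensity, sigma))
    return spots

def strongest_first(spots, keep):
    """Return the `keep` spots with the largest intensity-to-sigma ratio."""
    ranked = sorted(spots, key=lambda s: s.intensity / s.sigma, reverse=True)
    return ranked[:keep]

def format_spot(s):
    """Write a spot back in the column layout seen in the listings."""
    return (f"{s.h:5d}{s.k:5d}{s.l:5d}{s.x:11.2f}{s.y:9.2f}"
            f"{s.intensity:13.1f}{s.sigma:11.1f}")

# A few rows copied from Listing 12.1.2.1.3.
sample = """
    0    0    0    1839.44  1729.09     127480.8      628.4
    0    0    0    1509.85  2002.07      10219.2      185.5
    0    0    0    1607.14  1641.84     353385.2     1035.3
    0    0    0    1734.85  2321.03        213.5       88.0
"""
top = strongest_first(parse_spots(sample), keep=2)
for spot in top:
    print(format_spot(spot))
```

A ranked list like this is a starting point only; the final choice is best confirmed against the image, as in Figure 12.1.2.3.1.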

 

diagnostic    off

busy          off

Input

   @ m37vHbI-CO.inp

   Distance   180 fix

   Center     1706 1716

      # correct center is 1705 1711

   Pixel      0.1 0.1

   Goniometer 0 0 42

   Format     Mar345

   Image      images/m37v_1a_015.mar3450

   Resolution 2.0 100

   Wavelength 1.0 1.5 1.1

   Nodal      m37v_1a_015.ndl.spt

   Quit

Spot    8 8 6 m37v_1a_015.re.spt

Profile

Ellipse

Pattern       m37v_1a_015.man.spt

Quit

 

Listing 12.1.2.3.2 index_man.inp, script for manual indexing.

 

Figure 12.1.2.3.1 Auto-recognized nodal spots marked in red circles.  Manually selected ones are marked in red squares.
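
The initial center used in Listing 12.1.2.3.2 can be checked against the 5-pixel radius of convergence quoted above.  The sketch below computes the distance between the deliberately wrong center and the correct one; the helper name is illustrative.

```python
import math

def center_offset(initial, correct):
    """Euclidean distance in pixels between two direct-beam centers."""
    return math.hypot(initial[0] - correct[0], initial[1] - correct[1])

RADIUS_OF_CONVERGENCE = 5.0  # pixels, as quoted in the text

offset = center_offset((1706, 1716), (1705, 1711))
print(f"offset: {offset:.2f} pixels")
print("outside radius of convergence:", offset > RADIUS_OF_CONVERGENCE)
```

The offset is the square root of 26, slightly more than 5 pixels, which is consistent with automatic indexing failing from this starting point.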

 

12.1.3 Geometric refinement

 

Passes a and b can be refined together or separately.  In the case of a slipping crystal, refinement of pass a can even be broken into two parts: one covering the frames before the indexed pattern, in reverse order, and another covering the frames after it.  Listing 12.1.3.0.1 is an example of the script file.

 

diagnostic    off

busy          off

@ m37v_1a_015.pre.spt.inp

Input

   Crystal    0.05 0 0.05 0 0.1 0 free

   Distance   fix

   Format     MAR345

   prompt     off

   result     off

   Goniometer 0 0 42 m37v_1a_015.mar3450

   Goniometer 0 0 36 m37v_1a_013.mar3450

   Goniometer 0 0 30 m37v_1a_011.mar3450

   Goniometer 0 0 24 m37v_1a_009.mar3450

   Goniometer 0 0 18 m37v_1a_007.mar3450

   Goniometer 0 0 12 m37v_1a_005.mar3450

   Goniometer 0 0  6 m37v_1a_003.mar3450

   Goniometer 0 0  0 m37v_1a_001.mar3450

   prompt     on

   result     on

   Resolution 2 100

   Wavelength 1 1.5

   Spot       8 6 2.3

   Quit

Dataset       progressive

   In         images

   Quit

Quit

 

Listing 12.1.3.0.1 A command script that refines the patterns before the indexed one in reverse order.  This order and the progressive mode are necessary for refinement of slipping crystals.

 

diagnostic    off

busy          off

@ m37v_1a_015.pre.spt.inp

Input

   Crystal    0.05 0 0.05 0 0.1 0 free

   Distance   fix

   Format     MAR345

   @ m37v_1a.inp

   @ m37v_1b.inp

   Resolution 2 100

   Wavelength 1 1.5

   Spot       8 6 2.3

   Quit

Dataset       progressive

   In         images

   Quit

Quit

 

Listing 12.1.3.0.2 refine.inp, command script that refines pass a and b together.

 

It is recommended to refine the cell constants under constraints, and not to refine the distance at all.  Two script files, m37v_1a.inp and m37v_1b.inp, define all goniometer settings of pass a and pass b, respectively.  These files replace the deprecated .gon file.  The results of the refinement include a set of parameter files for all frames.
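
The goniometer script files are not reproduced in this chapter, but their lines presumably follow the pattern visible in Listing 12.1.3.0.1.  The sketch below generates such lines under several assumptions: odd frame numbers from 001 to 061, 6° steps in phi, pass a starting at phi 0, and pass b starting at phi 3.  The starting angle of pass b in particular is inferred from the stated 3° offset and should be checked against the actual experiment.

```python
def goniometer_lines(pass_label, phi_start, phi_step=6, n_images=31,
                     prefix="m37v_1"):
    """Build one Goniometer line per frame of a pass (assumed numbering)."""
    lines = []
    for i in range(n_images):
        frame = 2 * i + 1                  # frames are numbered 001, 003, ...
        phi = phi_start + i * phi_step     # omega and chi stay at 0
        lines.append(
            f"Goniometer 0 0 {phi:2d} {prefix}{pass_label}_{frame:03d}.mar3450"
        )
    return lines

pass_a = goniometer_lines("a", phi_start=0)
pass_b = goniometer_lines("b", phi_start=3)   # assumed 3 degree offset

# Frames before the indexed pattern, in reverse order, as in Listing 12.1.3.0.1.
reverse_part = pass_a[7::-1]

print(pass_a[7])
print(pass_b[0])
```

The eighth line of pass a reproduces the setting of the indexed frame, phi 42 for m37v_1a_015.mar3450, which agrees with the listings.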

 

Final refinement is optional.  It updates each parameter file with further refined values.  One may choose to fix the cell constants and perhaps use a higher resolution limit for final refinement.

 

diagnostic    off

busy          off

prompt        off

result        off

# pass a

@ m37v_1a_001.mar3450.inp

@ m37v_1a_003.mar3450.inp

…

@ m37v_1a_061.mar3450.inp

 

# pass b

@ m37v_1b_001.mar3450.inp

@ m37v_1b_003.mar3450.inp

…

@ m37v_1b_061.mar3450.inp

prompt        on

result        on

 

Input

   Crystal    fix

   Distance   fix

   Resolution 1.8 100

   Wavelength 1 1.5

   Spot       8 6 2.3

   Quit

Dataset       final

   In         images

   Quit

Quit

 

Listing 12.1.3.0.3 final.inp, command script of final refinement.

 

12.1.4 Integration

 

The command script for integration in Listing 12.1.4.0.1 is very similar to that of final refinement (Listing 12.1.3.0.3).  An initial λ-curve, outline.lam, is loaded by the command Input:Image.  This λ-curve is used to filter out high-resolution reflections at both wings of the spectrum according to the wavelength-dependent bandwidth.  The previously estimated λ-curve estimate.lam can serve the same purpose.  The command Input:Spot specifies a spot size that initializes crystal mosaicity.  These values may be updated automatically under some integration modes.  The σ-cut is used in selecting sample spots for profile fitting.  The appropriate range of σ-cut is the same as that for indexing and refinement (Listing 12.1.1.0.2), but the value is usually chosen toward the upper end and preferably greater than 3.  The saved .re.spt files can be used to check whether a σ-cut is suitable.  This process saves a set of .ii files that contain integrated intensities.

 

diagnostic    off

busy          off

warning       off

prompt        off

result        off

# pass a

@ m37v_1a_001.mar3450.inp

@ m37v_1a_003.mar3450.inp

…

@ m37v_1a_061.mar3450.inp

 

# pass b

@ m37v_1b_001.mar3450.inp

@ m37v_1b_003.mar3450.inp

…

@ m37v_1b_061.mar3450.inp

prompt        on

result        on

 

Input

   Image      outline.lam

   Spot       8 6 3.9

   Quit

Dataset       linearAnalytical

   In         images

   Out        linearAnalytical

   Resolution 1.6 100

   Wavelength 1 1.5

   Quit

Quit

 

Listing 12.1.4.0.1 integrate.inp, script for integration.

 

12.1.5 Wavelength normalization and scaling

 

From this step on, the remaining tasks are carried out by another program, Epinorm (11.2), but the command scripts have the same style.  Using the previously estimated λ-curve as the initial curve is a good idea, if it looks all right.  Even when an initial λ-curve is given, it is usually necessary to specify the wavelength range, reference wavelength, and order of the Chebyshev approximation explicitly, although program defaults are available.  The given spot size initializes crystal mosaicity.  The command Restore specifies a filename for saving the parameter set of scaling.  If further scaling is desired, this file can be loaded back.  Listing 12.1.5.0.2 gives an example of such a script.  This script scales linear anisotropic factors, and overwrites the previous parameter file m37v_ab.inp.  The λ-curves are shown in Figure 12.1.5.0.1.

 

diagnostic    off

busy          off

warning       off

prompt        off

result        off

# pass a

@ m37v_1a_001.mar3450.inp

@ m37v_1a_003.mar3450.inp

…

@ m37v_1a_061.mar3450.inp

 

# pass b

@ m37v_1b_001.mar3450.inp

@ m37v_1b_003.mar3450.inp

…

@ m37v_1b_061.mar3450.inp

prompt        on

result        on

 

Input

   Image      estimate.lam

   Wavelength 1 1.5 1.1

   Chebyshev  32

   Spot       8 6

   Quit

Restore       m37v_ab.inp

Scale         2

Lambda        m37v_ab.lam

Quit

 

Listing 12.1.5.0.1 scale.inp, script for scaling.
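
For readers unfamiliar with Chebyshev approximation, the sketch below shows the general idea of representing a smooth curve over a wavelength range as a Chebyshev series.  It is a generic illustration and not Epinorm's implementation: the linear mapping of wavelength onto [-1, 1], the meaning of the order, and the coefficients used here are all assumptions made for this example.

```python
def to_unit_interval(lam, lam_min, lam_max):
    """Map a wavelength in [lam_min, lam_max] linearly onto [-1, 1]."""
    return (2.0 * lam - lam_min - lam_max) / (lam_max - lam_min)

def chebyshev_series(coeffs, t):
    """Evaluate sum_k coeffs[k] * T_k(t) with the Clenshaw recurrence."""
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * t * b1 - b2, b1
    return coeffs[0] + t * b1 - b2

def curve_value(coeffs, lam, lam_min=1.0, lam_max=1.5):
    """Value of the modelled curve at wavelength lam (in Angstrom)."""
    return chebyshev_series(coeffs, to_unit_interval(lam, lam_min, lam_max))

# Hypothetical low-order coefficients, chosen only to make a smooth bump.
toy_coeffs = [0.5, -0.2, -0.3]
for lam in (1.0, 1.1, 1.25, 1.5):
    print(f"{lam:.2f} A -> {curve_value(toy_coeffs, lam):.3f}")
```

A higher order, such as the 32 given in the script, allows the series to follow a sharper and less symmetric spectrum than this toy curve.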

 

diagnostic    off

busy          off

warning       off

prompt        off

result        off

# pass a

@ m37v_1a_001.mar3450.inp

@ m37v_1a_003.mar3450.inp

…

@ m37v_1a_061.mar3450.inp

 

# pass b

@ m37v_1b_001.mar3450.inp

@ m37v_1b_003.mar3450.inp

…

@ m37v_1b_061.mar3450.inp

 

@ m37v_ab.inp

prompt        on

result        on

 

Restore       m37v_ab.inp

Scale         2 1

Quit

 

Listing 12.1.5.0.2 Command script for loading previous results and scaling again.

Figure 12.1.5.0.1 λ-curves derived from scaling in red and the initially estimated λ-curve in black.

 

12.1.6 Data merging, harmonic and spatial deconvolution

 

Since harmonic and spatial overlap deconvolution take advantage of the intrinsically high redundancy of a Laue dataset, these deconvolutions become special cases of data merging.  This job is carried out by the script in Listing 12.1.6.0.1.  The resolution and wavelength ranges of accepted data can be given here.  Spot size influences spatial overlap deconvolution.  A series of Apply commands accepts data at different σ-cuts.  Merging statistics are shown in Table 12.1.6.0.1.

 

diagnostic    off

busy          off

warning       off

prompt        off

result        off

# pass a

@ m37v_1a_001.mar3450.inp

@ m37v_1a_003.mar3450.inp

…

@ m37v_1a_061.mar3450.inp

 

# pass b

@ m37v_1b_001.mar3450.inp

@ m37v_1b_003.mar3450.inp

…

@ m37v_1b_061.mar3450.inp

 

@ m37v_ab.inp

prompt        on

result        on

 

Input

   Resolution 1.6 100

   Wavelength 1.05 1.4

   Spot       8 6

   Quit

Apply     0.0 m37v_ab.0.0.hkl

Apply     0.5 m37v_ab.0.5.hkl

Apply     1.0 m37v_ab.1.0.hkl

Apply     1.5 m37v_ab.1.5.hkl

Apply     2.0 m37v_ab.2.0.hkl

Quit

 

Listing 12.1.6.0.1 apply.inp, script for data merging, harmonic, and spatial overlap deconvolution.
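
When many cut values are to be compared, the Apply lines can be generated rather than typed.  The sketch below reproduces the five lines of Listing 12.1.6.0.1 from a list of cut values; the helper name and the output filename pattern are taken from the listing or assumed for illustration.

```python
def apply_lines(cuts, stem="m37v_ab"):
    """Build one Apply line per cut value, naming the output file after it."""
    return [f"Apply     {cut:.1f} {stem}.{cut:.1f}.hkl" for cut in cuts]

lines = apply_lines([0.0, 0.5, 1.0, 1.5, 2.0])
print("\n".join(lines))
```

Embedding the cut value in each output filename keeps the merged files easy to tell apart when comparing their statistics.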

 

 

Single

Harmonic

Spatial overlap

 

σ-cut

Rmodel/Rmerge

II

UR

Rmodel

II

UR

Rmodel