CHAPTER 12
This chapter presents several real datasets. Through these examples, I hope to show how the Precognition system works in practice. Good data processing strategies are discussed, and common mistakes are analyzed. All diffraction images, command scripts, and log files can be found at http://renzresearch.com/Precognition.
12.1 An Undulator Laue Dataset from a Protein
This dataset was collected at the BioCARS 14-ID-B station of the APS, using an undulator with a gap of 25 mm, from a crystal of the M37V mutant of CO-bound dimeric clam hemoglobin. The crystal belongs to the monoclinic space group C2. Its cell constants are 93.22, 44.00, 83.56 Å and 90.00, 121.95, 90.00 degrees. The diffraction images were recorded on a MAR345 image plate detector. The entire dataset was collected in two passes of 31 images each. Both passes have a 6° angular spacing between consecutive images, and pass a and pass b are offset from each other by 3°. All image files are stored in a subdirectory images of the working directory. This dataset is courtesy of Vukica Srajer and Reinhard Pahl of BioCARS, The University of Chicago, and James Knapp and William Royer of The University of Massachusetts.
12.1.1 Visual evaluation of images and estimates of soft limits
Visual evaluation of the diffraction images is the first step of data processing. By visual inspection, one can form a general idea of the data quality and develop a processing strategy. Many graphics programs can serve this purpose. Precognition.py can also display images, but this capability is currently under development.
Figure 12.1.1.0.1 shows the first image in this dataset. It is a high-quality image that should yield good data, but the crystal orientation in this image may not be ideal for indexing, since two major ellipses are nearly tangential to each other. Figure 12.1.1.0.2 shows a lower central portion of the first image. Spot shapes are nicely round, and some spatial overlaps are clearly visible. From this zoomed-in image, one can estimate the typical spot size, which is approximately 8 × 8 pixels for this dataset. An image several frames into the dataset shows a very typical Laue pattern at a random crystal orientation, which is in general better for indexing (Figure 12.1.1.0.3). Images at the end of pass a and pass b show some disruption to the crystal, but they largely maintain the original diffraction quality.

Figure 12.1.1.0.1 The first image in the dataset m37v_1a_001.mar3450.

Figure 12.1.1.0.2 A portion of the first image in the dataset m37v_1a_001.mar3450.

Figure 12.1.1.0.3 An image in the dataset m37v_1a_015.mar3450.
Unlike a monochromatic diffraction image, a Laue image does not make it immediately obvious to what resolution the crystal diffracts. Soft limit estimation can be used at this point, even before indexing. Listing 12.1.1.0.1 is the command script. Among the input, the crystal-to-detector distance and detector pixel size should be precisely known. The direct-beam center, wavelength range, and peak wavelength should be roughly known. Spot size can be estimated from visual inspection of the images. One may start with the default σ-cut of 3. Running this command script suggests a lower σ-cut. A part of the log file is shown in Listing 12.1.1.0.2. As the results of this job suggest, the subsequent indexing and geometric refinement should use a resolution of 2.0 Å and a σ-cut of 6. The diffraction limit of this crystal is estimated at 1.56 Å. We will see how good these estimates are.
diagnostic off
Input
Distance 180
Center 1705 1711
Pixel 0.1 0.1
Format Mar345
Image images/m37v_1a_015.mar3450
Wavelength 1 1.5 1.1
Quit
Spot 8 8 2.4
Limits
Quit
Listing 12.1.1.0.1 limit1.inp, estimation of soft limits before indexing.
Report
Best sigma cut estimated at 2.56.
3321 real spots on this image.
Sigma cut results in 10% noise is 2.4.
Suggested sigma cut for indexing and geometry refinement is between 2.56 and 6.6.
Maximum spot density 14.8/mrad at Bragg angle of 16.0 degree.
Diffraction limit estimated at Bragg angle of 24.6 degree or 1.56 A resolution.
Suggested resolution for indexing and geometry refinement is 1.96 A.
Listing 12.1.1.0.2 Part of limit1.log, results from soft limit estimation.
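When many images are screened, it can be convenient to pull the suggested limits out of the log automatically. The following is a minimal Python sketch, not part of Precognition. It assumes the log wording is exactly as shown in Listing 12.1.1.0.2; the function name parse_soft_limits and the dictionary keys are hypothetical.

```python
import re

# A non-negative decimal number such as 2.56 or 6 (no trailing period captured).
NUMBER = r"(\d+(?:\.\d+)?)"

def parse_soft_limits(log_text):
    """Extract suggested limits from text worded like Listing 12.1.1.0.2."""
    # Collapse all whitespace so that wrapped log lines still match.
    text = " ".join(log_text.split())
    patterns = {
        "sigma_cut_range": rf"Suggested sigma cut .*? between {NUMBER} and {NUMBER}",
        "diffraction_limit": rf"Diffraction limit .*? or {NUMBER} A resolution",
        "suggested_resolution": rf"Suggested resolution .*? is {NUMBER} A",
    }
    found = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            values = tuple(float(v) for v in match.groups())
            found[key] = values if len(values) > 1 else values[0]
    return found

# Text copied from the log excerpt above.
SAMPLE_LOG = """
Suggested sigma cut for indexing and geometry refinement is between 2.56 and 6.6.
Maximum spot density 14.8/mrad at Bragg angle of 16.0 degree.
Diffraction limit estimated at Bragg angle of 24.6 degree or 1.56 A resolution.
Suggested resolution for indexing and geometry refinement is 1.96 A.
"""

limits = parse_soft_limits(SAMPLE_LOG)
print(limits)
```

The suggested resolution of 1.96 Å is what motivates the rounded value of 2.0 Å used in the indexing script.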
12.1.2 Indexing
Guided by these soft limits, we are now ready to index a pattern. As noted before, it is perhaps easiest to index a random-orientation pattern such as m37v_1a_015.mar3450 (Figure 12.1.1.0.3) rather than the first image (Figure 12.1.1.0.1). As a matter of fact, I had a hard time finding a pattern in this set with which to demonstrate mis-indexing.
12.1.2.1 Indexing
The indexing script is listed below. Once again, the distance, pixel size, and goniometer setting are precisely known. It is a good idea to fix the distance without refinement. The center and wavelength should be approximately known. See 2.1.3 for a discussion of errors in the direct-beam center. Resolution and σ-cut are taken from the soft limit estimation. Here, a one-line script for crystal information is recommended instead of the crystal information file (11.8.2), which is being deprecated. This script can be stored in a centralized directory, say ~/xtal_info, to be accessed by many data processing jobs. Two spot files, m37v_1a_015.re.spt and m37v_1a_015.pre.spt, will contain the recognized and predicted spots, respectively, after geometric refinement.
Listing 12.1.2.1.1 m37vHbI-CO.inp, a one-line script of crystal information.
diagnostic off
busy off
Input
@ m37vHbI-CO.inp
Distance 180 fix
Center 1705 1711
Pixel 0.1 0.1
Goniometer 0 0 42
Format Mar345
Image images/m37v_1a_015.mar3450
Resolution 2.0 100
Wavelength 1.0 1.5 1.1
Quit
Spot 8 8 6 m37v_1a_015.re.spt
Profile
Ellipse
Nodal
Pattern m37v_1a_015.pre.spt
Quit
Listing 12.1.2.1.2 index.inp, command script for indexing.
The process carried out by index.inp yields several other findings regardless of whether indexing succeeds (Listing 12.1.2.1.3). First, the command Profile learns an overall spot profile, so a better spot size can be suggested. For example, this process suggests an overall spot size of 8 × 6 instead of 8 × 8 pixels. The newly suggested spot size should be used by subsequent jobs. Second, from the learned spot profile, it estimates the crystal dimension and mosaic spread using a spherical crystal and isotropic mosaic model. Their accuracy is harder to validate; nevertheless, these values are useful in scaling. It would also be a good idea to insert the command Profile into the script limit1.inp (Listing 12.1.1.0.1) after the command Spot, so that these estimates can be obtained even earlier. Third, the direct-beam center is refined by the command Ellipse from (1705, 1711) to (1704.65, 1711.50). This center is subsequently refined to (1704.648, 1711.326) against more than 1000 spots, which demonstrates that the center refined from a few ellipses is already quite accurate. Finally, some nodal spots are found by the command Nodal. Even if indexing is unsuccessful, this information may be useful in manual indexing.
This pattern was indexed without any struggle and refined to a very low r.m.s.d. residual of about 20 µm (0.197 pixel) from 1238 spots. Figure 12.1.2.1.1 shows the predicted pattern after refinement. Accurate prediction lays a solid foundation for spot integration.
Report
An overall mean profile is recognized.
Semi-major & -minor axes (pixel): 2.7398 1.93204
Non-elliptical correction: 0.00750808 0.0213153 0.0429673 0.00486854
Non-Gaussian correction: 0.918543 0.821306
R.m.s.d. (detector count): 0.000264097
Overall spot length is set to 8 pixels.
Overall spot width is set to 6 pixels.
Estimated crystal dimension is 0.579033 mm.
Estimated mosaic spread in FWHM is 0.00856099 degree.
…
Direct-beam center is set to 1704.65, 1711.5 in pixel.
…
72 Nodals Recognized (Ordered by Rank)
         Center (pixel)    Intensity   Sigma(I)
        _________________  __________ __________
0 0 0   1839.44  1729.09    127480.8      628.4
0 0 0   1509.85  2002.07     10219.2      185.5
0 0 0   1190.44  2027.94     60649.0      434.3
0 0 0   1607.14  1641.84    353385.2     1035.3
0 0 0   1578.12  2209.16     34719.0      326.2
0 0 0   1264.19  1296.46     18101.8      249.0
0 0 0   1734.85  2321.03       213.5       88.0
0 0 0   2251.67  2511.87      5748.0      142.9
0 0 0   2307.05  1533.35      4859.0      147.5
0 0 0   1867.74  2155.45     90569.8      522.1
…
Report
1 possible crystal orientation is recognized;
corresponding cell constants and detector parameters are refined.
Indexing 1
R.M.S. deviation (pixel):            0.196704
Number of spots matched:             1238
Cell lengths (Angstrom):             93.2118 44.0000 83.5906
Cell angles (degree):                90.0000 122.0802 90.0000
Missetting matrix:                   -0.06422445  0.02471138 -0.99762947
                                     -0.04430840  0.99863688  0.02758878
                                      0.99695135  0.04597524 -0.06304198
Goniometer omega, chi, phi(degree):  0.0000 0.0000 42.0000
Omega-axis polar orientation (deg):  90.0000 0.0000
Direct-beam center (pixel):          1704.6480 1711.3262
Pixel size (mm):                     0.1000042 0.1000000
Detector swing angles (degree):      0.0000 0.0000
Detector tilt angles (degree):       0.1024 0.0627
Detector bulge corrections:          -1.212e-06 4.493e-10
Listing 12.1.2.1.3 Part of index.log.
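The log reports the r.m.s. deviation in pixels, whereas a residual is often easier to judge as a length on the detector. The following short Python sketch shows the conversion. It assumes the nominal pixel size of 0.1 mm from the Pixel command; the function name is hypothetical and not part of Precognition.

```python
def residual_in_micrometres(rmsd_pixel, pixel_size_mm):
    """Convert an r.m.s. deviation in pixels to micrometres on the detector."""
    return rmsd_pixel * pixel_size_mm * 1000.0  # 1 mm = 1000 micrometres

# Values taken from Listing 12.1.2.1.3 and the Pixel command in index.inp.
rmsd_um = residual_in_micrometres(0.196704, 0.1)
print(f"r.m.s.d. = {rmsd_um:.1f} micrometres")
```

The result is a little under 20 micrometres, that is, about one fifth of a pixel.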

Figure 12.1.2.1.1 Predicted pattern displayed over real image.
12.1.2.2 Estimation of more soft limits
Once a pattern is indexed and refined, more soft limits, even the λ-curve, can be estimated. This is done by the same command Limits, but the command script differs in several ways. First, the non-frame-specific parameter file obtained from indexing should be loaded in place of the commands Distance, Center, and Pixel. Second, the goniometer setting must be included, since the non-frame-specific parameter file carries no frame-specific information. For the same reason, the image must be loaded explicitly. Third, spot size and σ-cut should be updated with the latest values. In addition, a filename can be supplied as a string argument to the command Limits for saving the estimated λ-curve. If a frame-specific parameter file is given instead, the goniometer setting is no longer needed.
diagnostic off
@ m37v_1a_015.pre.spt.inp
Input
Goniometer 0 0 42
Format Mar345
Image images/m37v_1a_015.mar3450
Wavelength 1 1.5 1.1
Quit
Spot 8 6 2.16
Limits 1 1.5 estimate.lam
Quit
Listing 12.1.2.2.1 limit2.inp, estimation of soft limits after indexing.

Figure 12.1.2.2.1 Estimated λ-curves.
The results are similar to those in Listing 12.1.1.0.2, except that all values may be updated because of the newly specified spot size. The estimated λ-curves are shown in Figure 12.1.2.2.1. Their ranges can be specified by two numerical arguments to the command Limits. The curve with the narrower range, shown in red, is more accurate. From the λ-curves, a better reference wavelength may be read.
12.1.2.3 Indexing problems
We now have one pattern in the set indexed and refined, and we have a very good idea about many soft limits. We are ready to refine each pattern in the set. However, let's deviate from the main flow a bit in this section and look at some potential problems in indexing.
The most common source of error is uncertainty in the direct-beam center. A test shows that the radius of convergence is 5 pixels around the correct center. See 2.5 for a discussion of the various options after a mis-indexing. Here is an example of user-supplied nodal spots. m37v_1a_015.mar3450 (Figure 12.1.1.0.3) was mis-indexed when the initial center deviated too far from the correct coordinates; however, 70 nodal spots were found (Figure 12.1.2.3.1). The nodal spot file can be manually edited to select the most significant nodal spots. Selected spots should be listed at the beginning of the file (Listing 12.1.2.3.1). This file can be loaded instead of running the auto-recognition command Nodal. The rest of the indexing ran smoothly.
0 0 0 1264.19 1296.46 18101.8 249.0
0 0 0 2040.03 1329.59 21089.0 263.1
0 0 0 2138.23 1921.38 135509.0 640.3
0 0 0 1190.44 2027.94 60649.0 434.3
0 0 0 2251.67 2511.87 5748.0 142.9
0 0 0 2307.05 1533.35 4859.0 147.5
0 0 0 1668.85 921.73 6006.0 153.3
0 0 0 2827.23 1495.98 426.0 60.9
…
Listing 12.1.2.3.1 m37v_1a_015.ndl.spt, manually edited nodal spots.
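The manual edit described above, moving the selected nodal spots to the beginning of the file, can also be scripted. The following Python sketch is an illustration only. It assumes that each spot occupies one whitespace-separated line of seven fields (h, k, l, x, y, intensity, sigma), as the listing suggests; the helper names parse_spots, promote, and format_spot are hypothetical, and the real file may carry additional content that this sketch ignores.

```python
def parse_spots(text):
    """Parse lines of the form 'h k l x y intensity sigma' into tuples."""
    spots = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip blank lines, elision marks, or wrapped fragments
        try:
            hkl = tuple(int(v) for v in fields[:3])
            x, y, intensity, sigma = (float(v) for v in fields[3:])
        except ValueError:
            continue  # skip lines that are not numeric spot records
        spots.append(hkl + (x, y, intensity, sigma))
    return spots

def promote(spots, chosen_xy, tol=0.5):
    """Return spots with those near the chosen (x, y) positions listed first."""
    picked = []
    for cx, cy in chosen_xy:
        for spot in spots:
            near = abs(spot[3] - cx) <= tol and abs(spot[4] - cy) <= tol
            if near and spot not in picked:
                picked.append(spot)
    rest = [spot for spot in spots if spot not in picked]
    return picked + rest

def format_spot(spot):
    """Write a spot back in the same column order as the listing."""
    h, k, l, x, y, intensity, sigma = spot
    return f"{h} {k} {l} {x:.2f} {y:.2f} {intensity:.1f} {sigma:.1f}"

# A few auto-recognized spots copied from Listing 12.1.2.1.3.
AUTO = """\
0 0 0 1839.44 1729.09 127480.8 628.4
0 0 0 1509.85 2002.07 10219.2 185.5
0 0 0 1190.44 2027.94 60649.0 434.3
0 0 0 1264.19 1296.46 18101.8 249.0
"""
chosen = [(1264.19, 1296.46), (1190.44, 2027.94)]  # picked by eye on the image
reordered = [format_spot(s) for s in promote(parse_spots(AUTO), chosen)]
print("\n".join(reordered))
```

Which spots count as the most significant remains a visual judgment; the script only saves the reordering work.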
diagnostic off
busy off
Input
@ m37vHbI-CO.inp
Distance 180 fix
Center 1706 1716 # correct center is 1705 1711
Pixel 0.1 0.1
Goniometer 0 0 42
Format Mar345
Image images/m37v_1a_015.mar3450
Resolution 2.0 100
Wavelength 1.0 1.5 1.1
Nodal m37v_1a_015.ndl.spt
Quit
Spot 8 8 6 m37v_1a_015.re.spt
Profile
Ellipse
Pattern m37v_1a_015.man.spt
Quit
Listing 12.1.2.3.2 index_man.inp, script for manual indexing.
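To see why the trial center in this script leads to trouble, its offset from the correct center can be compared with the 5-pixel radius of convergence quoted above. The following Python sketch does this arithmetic. The function name is hypothetical, and the 5-pixel figure is the test result reported for this dataset, not a general constant.

```python
import math

RADIUS_OF_CONVERGENCE = 5.0  # pixels, as reported for this dataset

def center_offset(trial, reference):
    """Euclidean distance in pixels between two direct-beam centers."""
    return math.hypot(trial[0] - reference[0], trial[1] - reference[1])

# Trial center in index_man.inp versus the approximately correct center.
bad = center_offset((1706, 1716), (1705, 1711))
# Starting center in index.inp versus the finally refined center.
good = center_offset((1705, 1711), (1704.648, 1711.326))

for label, offset in (("index_man.inp", bad), ("index.inp", good)):
    inside = offset <= RADIUS_OF_CONVERGENCE
    print(f"{label}: offset {offset:.2f} pixels, within radius: {inside}")
```

The trial center of index_man.inp lies just outside the radius, which is consistent with the mis-indexing, while the starting center of index.inp is well inside it.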

Figure 12.1.2.3.1 Auto-recognized nodal spots marked in red circles. Manually selected ones are marked in red squares.
12.1.3 Geometric refinement
Pass a and pass b can be refined together or separately. If the crystal slips, the refinement of pass a can even be broken into two parts: one covering the images before the indexed pattern, in reverse order, and another covering the images after it. Listing 12.1.3.0.1 is an example of the script file.
diagnostic off
busy off
@ m37v_1a_015.pre.spt.inp
Input
Crystal 0.05 0 0.05 0 0.1 0 free
Distance fix
Format MAR345
prompt off
result off
Goniometer 0 0 42 m37v_1a_015.mar3450
Goniometer 0 0 36 m37v_1a_013.mar3450
Goniometer 0 0 30 m37v_1a_011.mar3450
Goniometer 0 0 24 m37v_1a_009.mar3450
Goniometer 0 0 18 m37v_1a_007.mar3450
Goniometer 0 0 12 m37v_1a_005.mar3450
Goniometer 0 0 6 m37v_1a_003.mar3450
Goniometer 0 0 0 m37v_1a_001.mar3450
prompt on
result on
Resolution 2 100
Wavelength 1 1.5
Spot 8 6 2.3
Quit
Dataset progressive
In images
Quit
Quit
Listing 12.1.3.0.1 A command script that refines patterns before the indexed one in reverse order. This order and the progressive mode are necessary for the refinement of slipping crystals.
diagnostic off
busy off
@ m37v_1a_015.pre.spt.inp
Input
Crystal 0.05 0 0.05 0 0.1 0 free
Distance fix
Format MAR345
@ m37v_1a.inp
@ m37v_1b.inp
Resolution 2 100
Wavelength 1 1.5
Spot 8 6 2.3
Quit
Dataset progressive
In images
Quit
Quit
Listing 12.1.3.0.2 refine.inp, command script that refines pass a and b together.
It is recommended to refine the cell constants under some constraint, and not to refine the distance at all. Two script files, m37v_1a.inp and m37v_1b.inp, define all goniometer settings of pass a and pass b, respectively. These files replace the deprecated .gon file. The results of the refinement include a set of parameter files for all frames.
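The contents of m37v_1a.inp and m37v_1b.inp are not reproduced in this chapter. The following Python sketch generates goniometer lines under stated assumptions: the line format follows the Goniometer lines of Listing 12.1.3.0.1, frames are numbered 001, 003, ..., 061 as the frame file names in the listings suggest, pass a starts at phi = 0, and pass b starts at phi = 3 to give the 3° offset. The function name and the starting angle of pass b are assumptions to be checked against the actual data collection log.

```python
def goniometer_lines(pass_label, phi_start, n_images=31, step=6.0,
                     prefix="m37v_1"):
    """Build one 'Goniometer omega chi phi image' line per frame of a pass."""
    lines = []
    for i in range(n_images):
        frame = 2 * i + 1             # assumed numbering: 001, 003, ..., 061
        phi = phi_start + i * step    # 6 degree spacing within a pass
        name = f"{prefix}{pass_label}_{frame:03d}.mar3450"
        lines.append(f"Goniometer 0 0 {phi:g} {name}")
    return lines

pass_a = goniometer_lines("a", 0.0)
pass_b = goniometer_lines("b", 3.0)   # assumed start of pass b

print("\n".join(pass_a[:3]))
print("\n".join(pass_b[:3]))
```

As a consistency check, the line generated for m37v_1a_015.mar3450 carries phi = 42, which agrees with the setting used for indexing.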
Final refinement is optional. It updates each parameter file with the further refined values. One may choose to fix the cell constants and perhaps use a higher resolution limit for final refinement.
diagnostic off
busy off
prompt off
result off
# pass a
@ m37v_1a_001.mar3450.inp
@ m37v_1a_003.mar3450.inp
…
@ m37v_1a_061.mar3450.inp
# pass b
@ m37v_1b_001.mar3450.inp
@ m37v_1b_003.mar3450.inp
…
@ m37v_1b_061.mar3450.inp
prompt on
result on
Input
Distance fix
Resolution 1.8 100
Wavelength 1 1.5
Spot 8 6 2.3
Quit
Dataset final
In images
Quit
Quit
Listing 12.1.3.0.3 final.inp, command script of final refinement.
12.1.4 Integration
The command script for integration in Listing 12.1.4.0.1 is very similar to that of final refinement (Listing 12.1.3.0.3). An initial λ-curve, outline.lam, is loaded by the command Input:Image. This λ-curve is used to filter out high-resolution reflections at both wings of the spectrum according to the wavelength-dependent bandwidth. The previously estimated λ-curve estimate.lam can serve the same purpose. The command Input:Spot specifies a spot size that initializes the crystal mosaicity. These values may be updated automatically under some integration modes. The σ-cut is used when selecting sample spots for profile fitting. The appropriate range of σ-cut is the same as that for indexing and refinement (Listing 12.1.1.0.2), but the value is usually chosen toward the greater end and is preferably greater than 3. The saved .re.spt files can be used to check for a suitable σ-cut. This process saves a set of .ii files that contain the integrated intensities.
diagnostic off
busy off
warning off
prompt off
result off
# pass a
@ m37v_1a_001.mar3450.inp
@ m37v_1a_003.mar3450.inp
…
@ m37v_1a_061.mar3450.inp
# pass b
@ m37v_1b_001.mar3450.inp
@ m37v_1b_003.mar3450.inp
…
@ m37v_1b_061.mar3450.inp
prompt on
result on
Input
Image outline.lam
Spot 8 6 3.9
Quit
Dataset linearAnalytical
In images
Out linearAnalytical
Resolution 1.6 100
Wavelength 1 1.5
Quit
Quit
Listing 12.1.4.0.1 integrate.inp, script for integration.
12.1.5 Wavelength normalization and scaling
From this step on, the remaining tasks are carried out by another program, Epinorm (11.2), but the command scripts have the same style. Using the previously estimated λ-curve as the initial curve is a good idea if it looks reasonable. Even if an initial λ-curve is given, it is usually necessary to specify the wavelength range, reference wavelength, and order of the Chebyshev approximation explicitly, although program defaults are available. The given spot size initializes the crystal mosaicity. The command Restore specifies a filename for saving the parameter set of scaling. If further scaling is desired, this file can be loaded back. Listing 12.1.5.0.2 gives an example of such a script. It scales linear anisotropic factors and overwrites the previous parameter file m37v_ab.inp. The λ-curves are shown in Figure 12.1.5.0.1.
diagnostic off
busy off
warning off
prompt off
result off
# pass a
@ m37v_1a_001.mar3450.inp
@ m37v_1a_003.mar3450.inp
…
@ m37v_1a_061.mar3450.inp
# pass b
@ m37v_1b_001.mar3450.inp
@ m37v_1b_003.mar3450.inp
…
@ m37v_1b_061.mar3450.inp
prompt on
result on
Input
Image estimate.lam
Wavelength 1 1.5 1.1
Chebyshev 32
Spot 8 6
Quit
Restore m37v_ab.inp
Scale 2
Lambda m37v_ab.lam
Quit
Listing 12.1.5.0.1 scale.inp, script for scaling.
diagnostic off
busy off
warning off
prompt off
result off
# pass a
@ m37v_1a_001.mar3450.inp
@ m37v_1a_003.mar3450.inp
…
@ m37v_1a_061.mar3450.inp
# pass b
@ m37v_1b_001.mar3450.inp
@ m37v_1b_003.mar3450.inp
…
@ m37v_1b_061.mar3450.inp
@ m37v_ab.inp
prompt on
result on
Restore m37v_ab.inp
Scale 2 1
Quit
Listing 12.1.5.0.2 Command script for loading previous results and scaling again.

Figure 12.1.5.0.1 λ-curves derived from scaling in red and the initially estimated λ-curve in black.
12.1.6 Data merging, harmonic and spatial deconvolution
Since harmonic and spatial overlap deconvolution take advantage of the intrinsically high redundancy of a Laue dataset, these deconvolutions become special cases of data merging. This job is carried out by the script in Listing 12.1.6.0.1. The resolution and wavelength ranges of accepted data can be given here. Spot size influences spatial overlap deconvolution. A series of Apply commands accepts data at different σ-cuts. Merging statistics are shown in Table 12.1.6.0.1.
diagnostic off
busy off
warning off
prompt off
result off
# pass a
@ m37v_1a_001.mar3450.inp
@ m37v_1a_003.mar3450.inp
…
@ m37v_1a_061.mar3450.inp
# pass b
@ m37v_1b_001.mar3450.inp
@ m37v_1b_003.mar3450.inp
…
@ m37v_1b_061.mar3450.inp
@ m37v_ab.inp
prompt on
result on
Input
Resolution 1.6 100
Wavelength 1.05 1.4
Spot 8 6
Quit
Apply 0.0 m37v_ab.0.0.hkl
Apply 0.5 m37v_ab.0.5.hkl
Apply 1.0 m37v_ab.1.0.hkl
Apply 1.5 m37v_ab.1.5.hkl
Apply 2.0 m37v_ab.2.0.hkl
Quit
Listing 12.1.6.0.1 apply.inp, script for data merging, harmonic, and spatial overlap deconvolution.
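The series of Apply lines follows a regular pattern, so it can be generated when a different set of σ-cuts is wanted. The following Python sketch reproduces the five lines of the listing. The function name is hypothetical, and the output file naming simply mirrors the listing.

```python
def apply_commands(prefix, sigma_cuts):
    """Build one 'Apply sigma-cut output.hkl' line per requested sigma-cut."""
    return [f"Apply {cut:.1f} {prefix}.{cut:.1f}.hkl" for cut in sigma_cuts]

# Sigma-cuts 0.0, 0.5, ..., 2.0 as used in apply.inp.
cuts = [0.5 * i for i in range(5)]
commands = apply_commands("m37v_ab", cuts)
print("\n".join(commands))
```

Writing one output file per σ-cut makes it straightforward to compare the merging statistics across cuts afterwards.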
|
|
Single |
Harmonic |
Spatial overlap |
|
||||||
|
s-cut |
Rmodel/Rmerge |
II |
|
Rmodel |
II |
|
Rmodel |
|||