POINTLESS (CCP4: Supported Program)

NAME

pointless

SYNOPSIS

pointless [HKLIN] foo_in.mtz
[Keyworded Input]

DESCRIPTION

General disclaimer: this program is very much under development and is likely to change.

Pointless has two possible functions:

Mode 1. (MODE LAUEGROUP). Given a test dataset of unmerged observations (file HKLIN), the program looks for possible symmetry based on the unit cell in order to determine the Laue group, ie the symmetry of the diffraction pattern. This mode is selected if no HKLREF dataset is specified.

Mode 2. (MODE ALTERNATIVE). Given a test dataset, merged or unmerged (file HKLIN), and a merged reference dataset in a known space group (file HKLREF), the program tests all possible alternative indexing schemes for the test dataset to find the one that best matches the reference set. Alternative indexing schemes arise in high symmetry space groups when the lattice symmetry is higher than the point group symmetry (eg for trigonal space groups), but can also arise in any space group from special relationships between cell parameters (eg an orthorhombic cell with a=b).
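
As a rough illustration of Mode 2, the sketch below applies each candidate reindexing operator to a test set and keeps the one whose intensities correlate best with the reference. The Pearson correlation coefficient as the match score, the dictionary layout and all function names are assumptions made for this illustration; how the program itself scores the match is not described here. Symmetry equivalents are ignored for brevity.

```python
from math import sqrt

def pearson_cc(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def best_reindexing(test, reference, operators):
    """Return (operator name, CC) for the best-matching operator.

    test, reference: dicts mapping (h, k, l) -> intensity (hypothetical layout).
    operators: dict mapping a name to a function (h, k, l) -> (h', k', l').
    """
    best = None
    for name, op in operators.items():
        pairs = []
        for hkl, intensity in test.items():
            new_hkl = op(*hkl)
            if new_hkl in reference:
                pairs.append((intensity, reference[new_hkl]))
        if len(pairs) < 3:
            continue  # too few common reflections to score
        cc = pearson_cc([p[0] for p in pairs], [p[1] for p in pairs])
        if best is None or cc > best[1]:
            best = (name, cc)
    return best

# Two candidate schemes, written as plain functions for the example
operators = {
    "h,k,l": lambda h, k, l: (h, k, l),
    "k,h,-l": lambda h, k, l: (k, h, -l),
}
```

Calling best_reindexing(test, reference, operators) returns the operator name together with its correlation coefficient, or None if no operator gives enough common reflections.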

In either mode, if a file is assigned to HKLOUT, the data from the HKLIN file are written to it using the best reindexing. In LAUEGROUP mode, the HKLOUT file is written in the "best" point group, and in ALTERNATIVE mode, in the space group of the reference file.

Details of LAUEGROUP mode

  1. The maximum lattice symmetry consistent with the unit cell dimensions from the HKLIN file is determined, within an angular tolerance of 2 degrees (or that given on the TOLERANCE command). Alternatively, if the command ORIGINALLATTICE is given, the lattice symmetry corresponding to the space group in the HKLIN file is used.
  2. The intensity data are read from the HKLIN file, reindexed in the asymmetric unit of the lattice symmetry (if necessary), and sorted to bring potentially equivalent observations together.
  3. The intensities are normalised to E^2, making <E^2> = 1, using an overall B-factor and a further correction smoothed on resolution bins. Unless resolution limits are explicitly set (RESOLUTION command), an automatic high resolution limit is applied, at the approximate point where <I>/<sigmaI> < IsigLimit (default value 2.0, set with ISIGLIMIT command). It is best to exclude weak high resolution data from the scoring functions, as they contain no useful information for this purpose.
  4. All rotational symmetry elements of the lattice symmetry are first scored separately. For example, in a tetragonal lattice, the symmetry elements are: 4-fold axis along c; 2-fold axes along a, b, c, (110) and (1-10).
  5. The most useful scoring function seems to be a correlation coefficient (CC) on E^2, calculated for pairs of observations related by a particular symmetry element. Other scores calculated are an rms difference (normalised by variances), and Rmeas, the multiplicity-weighted R-factor. To allow for small samples, the CC score is converted to a "significance" score or Z-score by dividing by an estimated standard deviation. This is estimated by taking many pairs of observations at the same resolution which cannot be related by symmetry and dividing them into groups of the same size as the test sample (with a maximum of 200); the score is calculated for each group, and the mean and standard deviation of these scores are then calculated. Then

    Z(score) = [Score - Mean(UnrelatedScore)] / Sigma(UnrelatedScore)

  6. All Laue groups which are sub-groups of the lattice group are generated by combining pairs of symmetry elements (including the identity) and completing the groups. The sub-groups are then scored by combining the scores for the individual elements, counting scores for elements present in the sub-group as positive and those for elements not in the sub-group as negative. It is not clear what the best way of combining the element scores is; at present two methods are used:
    1. Combined Score ("Zc" in logfile)
      The correlation coefficients are recalculated (summed) over all "for" and over all "against" elements, Z(for) and Z(against) are calculated, then NetZ = Z(for) - Z(against)
    2. RMS average score ("Za" in logfile)
      Z(for or against) = +/-Sqrt(Sum(+/-Z(element)^2)), where "+/-" follows the sign of Z(element)
      NetZ = Z(for) - Z(against)
  7. The potential Laue groups are ranked according to scoring method (1), and tested for acceptance for further output and testing. A group is accepted if:
    1. its score is greater than (AcceptanceLimit * Maximum score), where AcceptanceLimit is set by the ACCEPT command [default 0.9], or
    2. its score is greater than (Maximum score - AcceptanceDifference), where AcceptanceDifference is set by the ACCEPT command [default 1], or
    3. its score from method (2) is greater than the method (2) score of the first ranked group.
  8. More information is printed for accepted Laue groups.
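
Steps 5 and 6 can be sketched as follows. This is a minimal illustration, not the program's code: the function names and the use of the sample standard deviation are assumptions, and the handling of the outer sign in the RMS combination is one reading of the formula given for method (2).

```python
from math import sqrt
from statistics import mean, stdev

def z_score(score, unrelated_scores):
    """Z = [Score - Mean(UnrelatedScore)] / Sigma(UnrelatedScore)."""
    return (score - mean(unrelated_scores)) / stdev(unrelated_scores)

def signed_rms_combine(z_values):
    """+/-Sqrt(Sum(+/-Z^2)): each squared term keeps the sign of its Z,
    and the result takes the sign of the signed sum (assumed reading)."""
    total = sum(z * abs(z) for z in z_values)
    root = sqrt(abs(total))
    return root if total >= 0 else -root

def net_z_rms(z_by_element, elements_in_group):
    """NetZ = Z(for) - Z(against), using the RMS combination ("Za")."""
    z_for = [z for e, z in z_by_element.items() if e in elements_in_group]
    z_against = [z for e, z in z_by_element.items() if e not in elements_in_group]
    return signed_rms_combine(z_for) - signed_rms_combine(z_against)
```

A sub-group whose own elements score well and whose absent elements score badly therefore gets a large NetZ; for instance, elements scoring 4 and 3 "for" and one element scoring -1 "against" give 5 - (-1) = 6.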

KEYWORDED INPUT - DESCRIPTION

Keywords are:
ORIGINALLATTICE, RESOLUTION, ISIGLIMIT, NONCHIRAL, LABREF, LABIN, LAUEGROUP, REINDEX, TOLERANCE, ACCEPT

All input is optional. Only the first four characters of each keyword are significant.
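
The four-character rule can be pictured with the short sketch below. The function name is hypothetical, case-insensitive matching is inferred from the lowercase keywords used in the Examples section, and the handling of words shorter than four characters is not documented here, so this sketch simply rejects them.

```python
KEYWORDS = [
    "ORIGINALLATTICE", "RESOLUTION", "ISIGLIMIT", "NONCHIRAL", "LABREF",
    "LABIN", "LAUEGROUP", "REINDEX", "TOLERANCE", "ACCEPT",
]

def match_keyword(word):
    """Return the full keyword whose first four characters match, else None."""
    key = word.upper()[:4]
    for keyword in KEYWORDS:
        if keyword[:4] == key:
            return keyword
    return None
```

With this rule "reso", "RESOLUTION" and "resol" all select RESOLUTION, while LABREF and LABIN remain distinct because they differ in the fourth character.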

ORIGINALLATTICE

Use the original lattice symmetry from the file instead of determining the maximum lattice symmetry from the cell dimensions.

RESOLUTION [[LOW] <ResMin>] [[HIGH] <ResMax>]

Resolution limits in Å, given in either order or with the keys HIGH or LOW. If this command is absent, the program imposes an automatic high resolution limit based on a minimum value for <I>/<sigmaI> within resolution shells (see ISIGLIMIT). Limits given here override the I/sigma limits.

ISIGLIMIT <minimum<I>/<sigmaI>>

Minimum value for <I>/<sigmaI> within resolution shells. This is used to set the maximum resolution for inclusion of data in the scoring. This is overridden by explicit RESOLUTION limits. Default value 4.0.
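
One way to picture the automatic cutoff is sketched below. The shell layout, the function name and the rule of stopping at the first shell that falls below the limit are assumptions for illustration; the program only states that the limit is applied at the "approximate point" where <I>/<sigmaI> drops below the threshold. The threshold is passed explicitly rather than given a default.

```python
def auto_high_resolution(shells, isig_limit):
    """Return the high resolution limit (in A) implied by an I/sigma threshold.

    shells: list of (d_min, mean_i, mean_sigma) tuples, ordered from low to
            high resolution, ie decreasing d_min (hypothetical layout).
    Returns None if even the lowest resolution shell is below the limit.
    """
    cutoff = None
    for d_min, mean_i, mean_sigma in shells:
        if mean_sigma <= 0 or mean_i / mean_sigma < isig_limit:
            break  # first shell that is too weak: stop here
        cutoff = d_min
    return cutoff
```

For shells with <I>/<sigmaI> of 20, 10, 2.5 and 1.0 at 4.0, 3.0, 2.5 and 2.0 A, a threshold of 4.0 would cut the data at 3.0 A, while a threshold of 2.0 would keep data to 2.5 A.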

NONCHIRAL [CENTROSYMMETRIC]

If this is present, the lists of possible space groups include non-chiral (or just centrosymmetric) ones as well as the [default] chiral ones.

LABREF  [F|I =]<columnlabel>

Only for MODE ALTERNATIVE (ie if HKLREF is assigned). For the reference dataset, this defines the column label for intensity or amplitude (which will be squared to an intensity). If this command is omitted, the first intensity or amplitude will be used. The next column is assumed to contain the corresponding sigma.

LABIN  [F|I =]<columnlabel>

Only for MODE ALTERNATIVE (ie if HKLREF is assigned) and if the test dataset is merged. For the test dataset, this defines the column label for intensity or amplitude (which will be squared to an intensity). If this command is omitted, the first intensity or amplitude will be used. The next column is assumed to contain the corresponding sigma.

LAUEGROUP <Laue group name>

Select a Laue group instead of testing all possible ones, ie select one solution for further processing. A REINDEX command may be given to specify a particular reindexing operator.

REINDEX  <reindex operator>

Specify a reindex operator (in the form eg "k,h,-l") to go with a specified Laue group.
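
To make the operator notation concrete, the sketch below parses a string such as "k,h,-l" and applies it to a reflection index. It is a hypothetical helper, not the program's parser: it assumes that each comma-separated expression gives the new h, k and l in terms of the old ones, and it accepts only sums of signed single letters (no numeric or fractional coefficients).

```python
import re

# One expression: an optionally signed letter, then further signed letters
_EXPR = re.compile(r"[+-]?[hkl](?:[+-][hkl])*")

def parse_reindex(operator):
    """Parse eg "k,h,-l" into three rows of integer coefficients (h, k, l)."""
    parts = [p.replace(" ", "").lower() for p in operator.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated expressions")
    rows = []
    for part in parts:
        if not _EXPR.fullmatch(part):
            raise ValueError(f"cannot parse expression {part!r}")
        coeff = {"h": 0, "k": 0, "l": 0}
        for sign, axis in re.findall(r"([+-]?)([hkl])", part):
            coeff[axis] += -1 if sign == "-" else 1
        rows.append((coeff["h"], coeff["k"], coeff["l"]))
    return rows

def apply_reindex(rows, hkl):
    """Apply the parsed operator to one (h, k, l) index."""
    h, k, l = hkl
    return tuple(a * h + b * k + c * l for a, b, c in rows)
```

Under these assumptions, applying "k,h,-l" to the index (1, 2, 3) gives (2, 1, -3).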

TOLERANCE <LatticeTolerance>

Tolerance in degrees for determination of lattice symmetry [default 2 degrees].

ACCEPT <AcceptanceLimit>  <AcceptanceDifference>

Parameters for acceptance criterion. A group is accepted if its score is:
  1. greater than (AcceptanceLimit * Maximum score)  where AcceptanceLimit is set by the ACCEPT command [default 0.9]   or
  2. greater than (Maximum score - AcceptanceDifference) where AcceptanceDifference is set by the ACCEPT command [default 1]
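
The two criteria above amount to the following test. The function name and the example scores and group labels are invented for illustration; the default values are those quoted for the ACCEPT command.

```python
def accepted_groups(scores, acceptance_limit=0.9, acceptance_difference=1.0):
    """Return the names of accepted groups, best score first.

    scores: dict mapping a Laue group name to its score.
    A group is accepted if it passes either criterion.
    """
    best = max(scores.values())
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        name
        for name, score in ranked
        if score > acceptance_limit * best          # criterion 1
        or score > best - acceptance_difference     # criterion 2
    ]

# Invented scores: the maximum is 8.0, so the thresholds are 7.2 and 7.0
example_scores = {"P 4/mmm": 3.0, "P 4/m": 8.0, "P mmm": 7.5, "P 2/m": 7.1, "P -1": 5.0}
```

Here "P 2/m" (7.1) fails the first criterion but passes the second, so it is still accepted alongside "P 4/m" and "P mmm".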

Input and output files

HKLIN

The file containing the test dataset.

Mode 1. This must be an unmerged file of intensities, eg from Mosflm.

Compulsory columns are H, K, L, M/ISYM, BATCH, I, SIGI
Optional columns are IPR, SIGIPR, TIME, XDET, YDET, ROT, WIDTH, MPART, FRACTIONCALC, LP, FLAG, BGPKRATIOS, SCALE, SIGSCALE

If a SCALE column is present it will be applied on input.

Mode 2. This may be unmerged (as above) or merged. Unless a column is specified in the control input, the first column of type J (intensity) or F (amplitude) will be used for comparison with the reference dataset. Amplitudes are squared to intensities on input.

HKLREF

The file containing the reference dataset for Mode 2 (alternative). This must be merged. Unless a column is specified in the control input, the first column of type J (intensity) or F (amplitude) will be used for comparison with the test dataset. Amplitudes are squared to intensities on input.

HKLOUT

In LAUEGROUP mode, this is the test dataset reindexed in the "best" point group (the Laue group without a centre of symmetry). In ALTERNATIVE mode, it is the test dataset with the best reindexing, in the space group of the reference dataset. Note that for a merged test dataset in ALTERNATIVE mode, reindexed reflections are not reduced to the asymmetric unit, because reindexing may generate a Bijvoet-related index, and if there are anomalous differences these need to be inverted.

Examples

Simple usage, all defaults (mode Lauegroup):

pointless [hklin] <filename.mtz>

With reference dataset (mode alternative):

pointless hklref amph_I.mtz hklin amph_scaled.mtz

With reference dataset and control input (mode alternative):

pointless hklref n20n6c1c2n6x2x14e10e7.mtz \
      hklin cd3_1_F.mtz << eof
resolution 4.0
labref  F_nat20
labin F_cd3_1
eof

Release notes

0.5.0,1,2

Alternative ways of combining Z+ & Z- (Za, Zc), remove combined RMSD printing, more input controls (ORIGINALLATTICE, LAUEGROUP, REINDEX, TOLERANCE, ACCEPT)

0.4.0

HKLOUT output added

0.3.0

Mode Alternative, labin, labref

0.2.0

User input, resolution, Isiglimit, Nonchiral