POINTLESS (CCP4: Supported Program)
NAME
pointless
SYNOPSIS
pointless [HKLIN] foo_in.mtz
[Keyworded Input]
DESCRIPTION
General disclaimer: this program is very much under development and is
likely to change.
Pointless has two possible functions:
Mode 1. (MODE LAUEGROUP). Given a test dataset of unmerged observations
(file HKLIN), the program looks for possible symmetry based on the unit
cell in order to determine the Laue group, ie the symmetry of the
diffraction pattern. This mode is selected if no HKLREF dataset is
specified.
Warning Notes:
- The optimum scoring method remains to be discovered. The true
solution may not have the top score, but should be close to the top.
- The program will find symmetry: there is no guarantee that
this symmetry is crystallographic or correct! Caveat emptor.
Mode 2. (MODE ALTERNATIVE). Given a test dataset, merged or unmerged
(file HKLIN), and a merged reference dataset in a known space group
(file HKLREF), the program tests any possible alternative indexing
schemes of the test dataset to find which one best matches the
reference set. Alternative indexing schemes arise in high
symmetry space groups when the lattice symmetry is higher than the
point group symmetry (eg for trigonal space groups), but arise in any
space group from special relationships between cell parameters (eg an
orthorhombic cell with a=b).
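As a rough illustration of Mode 2, the sketch below applies each candidate reindexing operator to a small test dataset and keeps the operator whose intensities correlate best with the reference. The function names, the dictionary data layout, the toy intensities and the use of a plain Pearson correlation on raw intensities are all assumptions made for illustration; they are not taken from the Pointless source.

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def best_reindexing(test, reference, operators):
    """Return (label, cc) for the operator giving the highest correlation.

    test, reference: dicts mapping (h, k, l) -> intensity (hypothetical layout).
    operators: dict mapping a label to a function (h, k, l) -> (h', k', l').
    """
    results = {}
    for label, op in operators.items():
        # Pair each test intensity with the reference intensity at the
        # reindexed position, skipping reflections absent from the reference.
        pairs = [(i, reference[op(*hkl)]) for hkl, i in test.items()
                 if op(*hkl) in reference]
        if len(pairs) > 2:
            xs, ys = zip(*pairs)
            results[label] = pearson(xs, ys)
    best = max(results, key=results.get)
    return best, results[best]

# Toy reference data (invented values).
reference = {(1, 2, 3): 100.0, (2, 1, -3): 10.0, (1, 3, 2): 55.0,
             (3, 1, -2): 5.0, (2, 3, 1): 80.0, (3, 2, -1): 20.0}
# The test data are the same intensities on a different scale, indexed
# in the alternative setting related by k,h,-l.
test = {(k, h, -l): 1.1 * i for (h, k, l), i in reference.items()}

operators = {"h,k,l": lambda h, k, l: (h, k, l),
             "k,h,-l": lambda h, k, l: (k, h, -l)}
best_label, best_cc = best_reindexing(test, reference, operators)
```

Here `best_label` is "k,h,-l", since only that operator brings the toy test data into agreement with the reference.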
In either mode, if a file is assigned to HKLOUT, the data from the
HKLIN file are written to it using the best reindexing. In LAUEGROUP
mode, the HKLOUT file is assigned the "best" pointgroup, and in
ALTERNATIVE mode, the spacegroup of the reference file.
Details of LAUEGROUP mode
- The maximum lattice symmetry consistent with the unit cell
dimensions from the HKLIN file is determined, within an angular
tolerance of 2 degrees (or that given on the TOLERANCE command).
Alternatively, if the command ORIGINALLATTICE is given, the lattice
symmetry corresponding to the space group in the HKLIN file is used.
- The intensity data are read from the HKLIN file, reindexed in the
asymmetric unit of the lattice symmetry (if necessary), and sorted to
bring potentially equivalent observations together.
- The intensities are normalised to E^2, making <E^2>
= 1, using an overall B-factor and a further correction smoothed on
resolution bins. Unless resolution limits are explicitly set
(RESOLUTION command), an automatic high resolution limit is applied, at
the approximate point where <I>/<sigmaI> < IsigLimit
(default value 2.0, set with ISIGLIMIT command). It is best to exclude
weak high resolution data from the scoring functions, as they contain
no useful information for this purpose.
- All rotational symmetry elements of the lattice symmetry are first
scored separately. For example, in a tetragonal lattice, the symmetry
elements are: 4-fold axis along c; 2-fold axes along a, b, c, (110) and
(1-10).
- The most useful scoring function seems to be a correlation
coefficient (CC) on E^2, calculated for pairs of observations related by
a particular symmetry element. Other scores calculated are an rms
difference (normalised by variances), and Rmeas, the
multiplicity-weighted R-factor. In order to allow for small samples,
the CC score is converted to a "significance" score or Z-score by
dividing by an estimated standard deviation. This is estimated by
taking many pairs of observations at the same resolution which cannot
be related by symmetry and dividing them into groups of the same size
as the test sample (with a maximum of 200). The score is calculated for
each group, and the mean and standard deviation of these scores are
taken. Then
Z(score) = [Score - Mean(UnrelatedScore)] / Sigma(UnrelatedScore)
- All Laue groups which are sub-groups of the lattice group are
generated by combining pairs of symmetry elements (including the
identity) and completing the groups. The sub-groups are then scored by
combining the scores for the individual elements, counting scores for
elements present in the sub-group as positive & those elements not
in the sub-group as negative. It is not clear what is the best way of
combining the element scores: at present two methods are used
- Combined Score ("Zc" in logfile)
The correlation coefficients are recalculated (summed) over all "for"
and over all "against" elements, Z(for) and Z(against) are calculated,
then NetZ = Z(for) - Z(against)
- RMS average score ("Za" in logfile)
Z(for or against) = +/-Sqrt(Sum(+/-Z(element)^2))
where "+/-" follows the sign of Z(element)
NetZ = Z(for) - Z(against)
- The potential Laue groups are ranked according to the first scoring
method (the combined score, Zc), and tested for acceptance for further
output and testing. A group is accepted if any of the following holds:
- its score is greater than (AcceptanceLimit * Maximum score) where
AcceptanceLimit is set by the ACCEPT command [default 0.9]
or
- its score is greater than (Maximum score - AcceptanceDifference)
where AcceptanceDifference is set by the ACCEPT command [default
1] or
- its score from the second method (the RMS average score, Za) is
greater than the value of that score for the first ranked group.
- More information is printed for accepted Laue groups.
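The Z-score conversion and the RMS average combination described in the list above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is an arbitrary choice, and the reading of the "+/-" rule (sum signed squares, then take a signed square root) is one interpretation of the formula, not a transcription of the program's code. The element labels and Z values are invented.

```python
from math import copysign, sqrt
from statistics import mean, stdev

def z_score(score, unrelated_scores):
    """Z = (Score - Mean(UnrelatedScore)) / Sigma(UnrelatedScore)."""
    return (score - mean(unrelated_scores)) / stdev(unrelated_scores)

def signed_rms_sum(z_values):
    """Sum Z^2 with the sign of each Z, then take a signed square root."""
    total = sum(copysign(z * z, z) for z in z_values)
    return copysign(sqrt(abs(total)), total)

def net_z_rms(element_z, subgroup_elements):
    """NetZ = Z(for) - Z(against) for one candidate sub-group.

    element_z: dict mapping a symmetry element label to its Z-score.
    subgroup_elements: set of labels present in the candidate sub-group.
    """
    z_for = signed_rms_sum(
        [z for name, z in element_z.items() if name in subgroup_elements])
    z_against = signed_rms_sum(
        [z for name, z in element_z.items() if name not in subgroup_elements])
    return z_for - z_against

# CC for one symmetry element, compared with CCs from groups of
# unrelated pairs (all values invented for this example).
z = z_score(0.9, [0.1, -0.1, 0.0, 0.2, -0.2])

# Per-element Z-scores for a toy lattice group.
element_z = {"2-fold c": 6.0, "2-fold a": 8.0, "4-fold c": 0.5}
net = net_z_rms(element_z, {"2-fold c", "2-fold a"})
```

With these toy numbers Z(for) is 10 and Z(against) is 0.5, so the sub-group containing the two 2-fold axes gets a net score of 9.5.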
KEYWORDED INPUT - DESCRIPTION
Keywords are:
ORIGINALLATTICE, RESOLUTION,
ISIGLIMIT, NONCHIRAL,
LABREF,
LABIN, LAUEGROUP, REINDEX, TOLERANCE, ACCEPT
All input is optional. Only the first four characters of each keyword
are significant.
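The four-character rule can be pictured with the short sketch below. The function name is hypothetical, and the case-insensitive comparison and the rejection of tokens shorter than four characters are assumptions of this sketch, not documented behaviour.

```python
KEYWORDS = ["ORIGINALLATTICE", "RESOLUTION", "ISIGLIMIT", "NONCHIRAL",
            "LABREF", "LABIN", "LAUEGROUP", "REINDEX", "TOLERANCE", "ACCEPT"]

def match_keyword(token):
    """Return the keyword whose first four characters match those of token.

    Returns None when nothing matches.
    """
    key = token.upper()[:4]
    for keyword in KEYWORDS:
        if keyword[:4] == key:
            return keyword
    return None

full = match_keyword("reso")         # "RESOLUTION"
also_full = match_keyword("labin")   # "LABIN"
```

All ten keywords differ within their first four characters, so the lookup is unambiguous.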
ORIGINALLATTICE
Use the original lattice symmetry from the file instead of determining
the maximum lattice symmetry from the cell dimensions.
RESOLUTION [[LOW] <ResMin>] [[HIGH] <ResMax>]
Resolution limits in Angstroms, given in either order or with the keys
HIGH or LOW. If this command is absent, the program imposes an automatic
high resolution limit based on a minimum value for <I>/<sigmaI> within
resolution shells (see ISIGLIMIT). Limits given here override the
I/sigma limits.
ISIGLIMIT <minimum <I>/<sigmaI>>
Minimum value for <I>/<sigmaI> within resolution shells.
This is used to set the maximum resolution for inclusion of data in the
scoring. This is overridden by explicit RESOLUTION
limits. Default value 4.0.
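A minimal sketch of how such an automatic cutoff might be chosen is given below. It assumes the data have already been divided into resolution shells ordered from low to high resolution, each summarised by its high resolution edge and its <I>/<sigmaI>; the function name, the shell layout and the toy values are hypothetical, and the limit is passed explicitly.

```python
def auto_high_resolution(shells, isig_limit):
    """Pick a high resolution limit from per-shell <I>/<sigmaI> values.

    shells: list of (d_min_in_angstroms, mean_i_over_sigma) tuples, ordered
            from low resolution (large d) to high resolution (small d).
    Returns d_min of the last shell before <I>/<sigmaI> first drops below
    isig_limit, or None if even the first shell is below the limit.
    """
    cutoff = None
    for d_min, i_over_sigma in shells:
        if i_over_sigma < isig_limit:
            break
        cutoff = d_min
    return cutoff

# Invented shell statistics for illustration.
shells = [(6.0, 25.0), (4.0, 12.0), (3.0, 5.0), (2.5, 1.8), (2.2, 0.9)]
limit = auto_high_resolution(shells, 2.0)   # 3.0
```

The program itself describes the cutoff only as "approximate", so it may well smooth or interpolate between shells rather than stop at a shell edge as this sketch does.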
NONCHIRAL [CENTROSYMMETRIC]
If this is present, the lists of possible space groups include
non-chiral (or just centrosymmetric) ones as well as the [default]
chiral ones.
LABREF [F|I =]<columnlabel>
Only for MODE ALTERNATIVE (ie if HKLREF is assigned). For the reference
dataset, this defines the column label for intensity or amplitude
(which will be squared to an intensity). If this command is omitted,
the first intensity or amplitude will be used. The next column is
assumed to contain the corresponding sigma.
LABIN [F|I =]<columnlabel>
Only for MODE ALTERNATIVE (ie if HKLREF is assigned) and if the test
dataset is merged. For the test dataset, this defines the column label
for intensity or amplitude (which will be squared to an intensity). If
this command is omitted, the first intensity or amplitude will be used.
The next column is assumed to contain the corresponding sigma.
LAUEGROUP <Laue group name>
Select a Laue group instead of testing all possible ones, ie select one
solution for further processing. A REINDEX command may be given to
specify a particular reindexing operator.
REINDEX <reindex operator>
Specify a reindex operator (in the form eg "k,h,-l") to go with a
specified Laue group.
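To make the operator notation concrete, the sketch below parses a string such as "k,h,-l" and applies it to a reflection index. It assumes that each comma-separated term gives a new index in terms of the old ones (new h = old k, new k = old h, new l = minus old l) and handles only single, optionally negated indices; the function name is hypothetical.

```python
def parse_reindex(operator):
    """Turn an operator string such as "k,h,-l" into a function on (h, k, l).

    Only single, optionally negated indices are handled in this sketch;
    combined terms such as "h+k" are rejected.
    """
    terms = []
    for part in operator.lower().split(","):
        part = part.strip()
        sign = -1 if part.startswith("-") else 1
        axis = part.lstrip("+-")
        if axis not in ("h", "k", "l"):
            raise ValueError(f"unsupported term: {part!r}")
        terms.append((sign, "hkl".index(axis)))
    if len(terms) != 3:
        raise ValueError("operator needs three comma-separated terms")

    def apply(h, k, l):
        old = (h, k, l)
        return tuple(sign * old[i] for sign, i in terms)

    return apply

reindex = parse_reindex("k,h,-l")
new_index = reindex(1, 2, 3)   # (2, 1, -3)
```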
TOLERANCE <LatticeTolerance>
Tolerance in degrees for determination of lattice symmetry [default 2
degrees].
ACCEPT <AcceptanceLimit> <AcceptanceDifference>
Parameters for acceptance criterion. A group is accepted if its score
is:
- greater than (AcceptanceLimit * Maximum score) where
AcceptanceLimit is set by the ACCEPT command [default 0.9]
or
- greater than (Maximum score - AcceptanceDifference) where
AcceptanceDifference is set by the ACCEPT command [default 1]
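The two criteria above amount to a simple test, sketched here with the documented defaults of 0.9 and 1. The function name, the Laue group labels and the scores are hypothetical.

```python
def is_accepted(score, max_score,
                acceptance_limit=0.9, acceptance_difference=1.0):
    """Apply the two ACCEPT criteria; a group passes if either one holds."""
    return (score > acceptance_limit * max_score
            or score > max_score - acceptance_difference)

# Invented scores for three candidate Laue groups.
scores = {"P4/mmm": 10.0, "P4/m": 9.2, "Pmmm": 5.0}
top = max(scores.values())
accepted = [name for name, s in scores.items() if is_accepted(s, top)]
```

With these values `accepted` holds "P4/mmm" and "P4/m". When all scores are small the difference criterion is the more permissive of the two: with a maximum score of 3.0, a group scoring 2.5 fails the 0.9 ratio test but passes because 2.5 exceeds 3.0 - 1.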
Input and output files
HKLIN
The file containing the test dataset.
Mode 1. This must be an unmerged file of intensities eg from Mosflm.
Compulsory columns are H, K, L, M/ISYM, BATCH, I, SIGI
Optional columns are IPR, SIGIPR, TIME, XDET, YDET, ROT, WIDTH, MPART,
FRACTIONCALC, LP, FLAG, BGPKRATIOS, SCALE, SIGSCALE
If a SCALE column is present it will be applied on input.
Mode 2.
This may be unmerged (as above) or merged. Unless a column is specified
in the control input, the first column of type J (intensity) or F
(amplitude) will be used for comparison with the reference dataset.
Amplitudes are squared to intensities on input.
HKLREF
The file containing the reference dataset for Mode 2 (alternative).
This must be merged. Unless a column is specified in the control input,
the first column of type J (intensity) or F (amplitude) will be used for
comparison with the test dataset. Amplitudes are squared to intensities
on input.
HKLOUT
In LAUEGROUP mode, the test dataset reindexed in the "best" pointgroup
(the Laue group without a centre of symmetry). In ALTERNATIVE mode, the
test dataset with the best reindexing, in the spacegroup of the
reference dataset. Note that for a merged test dataset, in ALTERNATIVE
mode, reindexed reflections are not reduced to the
asymmetric unit, because reindexing may generate a Bijvoet-related
index and if there are anomalous differences these need to be
inverted.
Examples
Simple usage, all defaults (mode Lauegroup):
pointless [hklin] <filename.mtz>
With reference dataset (mode alternative):
pointless hklref amph_I.mtz hklin amph_scaled.mtz
With reference dataset and control input (mode alternative):
pointless hklref n20n6c1c2n6x2x14e10e7.mtz \
hklin cd3_1_F.mtz << eof
resolution 4.0
labref F_nat20
labin F_cd3_1
eof
Release notes
0.5.0,1,2
Alternative ways of combining Z+ & Z- (Za, Zc), remove combined
RMSD printing, more input controls (ORIGINALLATTICE, LAUEGROUP,
REINDEX, TOLERANCE, ACCEPT)
0.4.0
HKLOUT output added
0.3.0
Mode Alternative, labin, labref
0.2.0
User input, resolution, Isiglimit, Nonchiral