.. |AA| unicode:: U+212B ======== Tutorial ======== Contact Analysis ================ The basicrta workflow starts with collecting contacts between two atom groups ``sel1`` and ``sel2`` based on a single ``cutoff`` using contacts.py. :: python -m basicrta.contacts --top top.pdb --traj traj.xtc --sel1 "protein" --sel2 "resname CHOL" --cutoff 7.0 --nproc 4 This will create two contact maps: ``contacts_max10.0.pkl`` and ``contacts_7.0.pkl``. The first is used to map all contacts defined with a 10 |AA| cutoff, upon which the cutoff is imposed and a new contact map is created for each desired cutoff value (for storage reasons this may change in the future). The protein residues in the topology should be correctly numbered, as these will be the names of directories where results are stored. Gibbs Sampler ============= Next the contact map is used to collect the contacts between a specified residue of ``sel1`` and each residue of the ``sel2`` group. The contact durations (residence times) are then used as input data for the Gibbs sampler. A specific residue of ``sel1`` can be used to run a Gibbs sampler for only that residue :: python -m basicrta.gibbs --contacts contacts_7.0.pkl --nproc 5 --resid 313 or if ``resid`` is left out, a Gibbs sampler will be executed for all ``sel1`` residues in the contact map. :: python -m basicrta.gibbs --contacts contacts_7.0.pkl --nproc 5 Clustering ========== Next the samples obtained from the Gibbs sampler are processed and clustered. :: python -m basicrta.cluster --niter 110000 --nproc 3 --cutoff 7.0 --prot b2ar The ``prot`` argument is used to create rectangles in the :math:`\tau` vs resid plot that correspond to the TM segments of the protein (figures 7-10). Your protein can be added to ``basicrta/basicrta/data/tm_dict.txt`` in the same format as the existing proteins. Optional parameters ``--gskip`` and ``--burnin`` can be used to control the Gibbs sampler processing. The ``--gskip`` parameter specifies the skip interval for decorrelated samples (default: 100), while ``--burnin`` sets the number of initial samples to discard during equilibration (default: 10000). The default values are taken from the `research paper `_. :: python -m basicrta.cluster --niter 110000 --nproc 3 --cutoff 7.0 --prot b2ar \ --gskip 100 --burnin 10000 ``basicrta.cluster`` will process the Gibbs samplers, compute :math:`\tau` for each residue, plot :math:`\tau` vs resid, and write the data to ``tausout.npy`` (see next section). If a structure is passed to the script, the b-factors of the residues will be populated with the appropriate :math:`\tau`. Rate/Tau Estimates ================== Estimates for :math:`\tau` are obtained by using the ``write_data()`` method of the ``ProcessProtein`` class contained in ``cluster.py``. Data is saved as a ``.npy`` file and contains [protein residue, tau, CI lower bound, CI upper bound] for each residue of ``sel1`` analyzed (given dataset size is sufficiently large (~50 points)). Writing out all model parameters to a ``.npy`` file is currently not implemented, but will be possible in later versions. Kinetic Mapping =============== The kinetically mapped trajectory and weighted densities can be created using ``kinetics.py``. :: python -m basicrta.kinetics --gibbs basicrta_7.0/W313/gibbs_110000.pkl --contacts contacts_7.0.pkl --wdensity To create only the mapped trajectory, leave out the ``wdensity`` flag. :: python -m basicrta.kinetics --gibbs basicrta_7.0/W313/gibbs_110000.pkl --contacts contacts_7.0.pkl This can also be done using the ``top_n`` most likely frames belonging to each component of the exponential mixture model. :: python -m basicrta.kinetics --gibbs basicrta_7.0/W313/gibbs_110000.pkl --contacts contacts_7.0.pkl --top_n 500 Weighted densities can be computed over the ``top_n`` frames, over the whole trajectory, or by using the ``step`` argument in combination with ``top_n`` or the whole trajectory. :: python -m basicrta.kinetics --gibbs basicrta_7.0/W313/gibbs_110000.pkl --contacts contacts_7.0.pkl --step 100 Supplemental Scripts ==================== Slurm scripts for submitting Gibbs sampler jobs to distributed systems are located in the ``basicrta/scripts`` directory. Note that some of these were rewritten and may contain slight errors, testing still needs to be done (but can be used with some slight modifications).