API Documentation
- class domhmm.PropertyCalculation(universe_or_atomgroup: Universe | AtomGroup, membrane_select: str = 'all', gmm_kwargs: None | dict = None, hmm_kwargs: None | dict = None, leaflet_kwargs: Dict[str, Any] = {}, leaflet_select: None | AtomGroup | str | list = None, heads: Dict[str, Any] = {}, tails: Dict[str, Any] = {}, sterol_heads: Dict[str, Any] = {}, sterol_tails: Dict[str, Any] = {}, tmd_protein_list: None | AtomGroup | str | list = None, frac: float = 0.5, p_value: float = 0.05, leaflet_frame_rate: None | int = None, sterol_frame_rate: int = 1, asymmetric_membrane: bool = False, verbose: bool = False, result_plots: bool = False, trained_hmms: Dict[str, Any] = {}, n_init_hmm: int = 2, save_plots: bool = False, **kwargs)[source]
Bases:
LeafletAnalysisBaseThe DirectorOrder class calculates an order parameter for each selected lipid according to the formula:
P2 = 0.5 * (3 * cos(a)^2 - 1), (1)
where an is the angle between a pre-defined director in the lipid (e.g. C-H or CC) and a reference axis.
- GMM(gmm_kwargs)[source]
Fit Gaussian Mixture
- Parameters:
gmm_kwargs (dict) – Additional parameters for mixture.GaussianMixture
- HMM(hmm_kwargs)[source]
Create Gaussian based Hidden Markov Models for each residue type
- Parameters:
hmm_kwargs (dict) – Additional parameters for hmmlearn.hmm.GaussianHMM
- static area_per_lipid_vor(coor_xy, boxdim, frac)[source]
Calculation of the area per lipid employing Voronoi tessellation on coordinates mapped to the xy plane. The function takes also the periodic boundary conditions of the box into account.
- Parameters:
coor_xy (numpy.ndarray) – Coordinates of upper/lower leaflet
boxdim (array) – Length of box vectors in all directions
frac (float) – Fraction of box length in x and y outside the unit cell considered for Voronoi calculation
- Returns:
vor (Voronoi Tesselation) – Scipy’s Voronoi Diagram object
apl (Area per lipid) – Area per lipid based on Scipy’s Voronoi Diagram
pbc_idx (Indices) – Unit cell indices for periodic image coordinates
- static assign_core_lipids(weight_matrix_f, g_star_i_f, order_states_f, w_ii_f, z_score)[source]
Assign lipids as core members (aka lipids with a high positive autocorrelation) depending on the Getis-Ord spatial local autocorrelation statistic.
- Parameters:
weight_matrix_f (numpy.ndarray) – Matrix containing the weight factors between all lipid pairs at one time step
g_star_i_f (numpy.ndarray) – Getis-Ord spatial local autocorrelation statistic for every lipid at one time step
order_states_f (numpy.ndarray) – Order states for every lipid at one time step
w_ii_f (numpy.ndarray) – Self-influence weight factor for every lipid at one time step
z_score (dict) – Contains boundary of the reaction region
- Returns:
core_lipids – Contains a TRUE value if the lipid is a core member, otherwise it FALSE
- Return type:
numpy.ndarray (bool)
- static calc_order_parameter(chain)[source]
Calculate average Scc order parameters per acyl chain according to the equation:
S_cc = (3 * cos( theta )^2 - 1) / 2,
where theta describes the angle between the z-axis of the system and the vector between two subsequent tail beads.
- Parameters:
chain (Selection) – MDAnalysis Selection object
- Returns:
s_cc – Mean S_cc parameter for the selected chain of each residue
- Return type:
numpy.ndarray
- clustering_plot()[source]
Runs hierarchical clustering and plots clustering results in different frames.
- fit_hmm(data, gmm, hmm_kwargs, n_repeats=10)[source]
Fit several HMM models to the data and return the best one.
- Parameters:
data (numpy.ndarray) – Data of the single lipid properties for one lipid type at each time step
gmm (GaussianMixture Model) – Scikit-learn object
hmm_kwargs (dict) – Additional parameters for Hidden Markov Model
n_repeats (int) – Number of independent fits
- Returns:
best_ghmm – hmmlearn object
- Return type:
GaussianHMM
- get_leaflet_step_order(leaflet, step)[source]
Receive residue’s order state with respect to the leaflet
- Parameters:
leaflet (numpy.ndarray) – leaflet index
step (numpy.ndarray) – step index
- Returns:
order_states – Numpy array contains order state results of the leaflet at step in order of system’s residues
- Return type:
numpy.ndarray
- get_leaflet_step_order_index(leaflet, step)[source]
Receive residue’s indexes and positions for a specific leaflet at any frame of the trajecytory.
- Parameters:
leaflet (numpy.ndarray) – leaflet index
step (numpy.ndarray) – step index
- Returns:
indexes (dict) – dictionary contains numpy.arrays containing residue’s indexes for each unique residue type
positions (dict) – dictionary contains numpy.arrays containing residue’s positions for each unique residue type
- static get_residue_idx(resids_index_map, resids)[source]
Checks resids index map (lookup table) to return corresponding indexes to use in DomHMM result section.
- Parameters:
resids_index_map (dict) – Lookup table where each residue id’s index in self.membrane_unique_resids is kepts
resids (numpy.ndarray) – Residue id list for indexing
- Returns:
idx – Indexes in the same order with resids
- Return type:
numpy.ndarray
- getis_ord_stat(weight_matrix_all, leaflet)[source]
Getis-Ord Local Spatial Autocorrelation Statistics calculation based on the predicted order states of each lipid and the weighting factors between neighbored lipids.
- Parameters:
weight_matrix_all (sparse.csr_array) – Weight matrix for all lipid in a leaflet at current time step
leaflet (int) – 0 = upper leaflet, 1 = lower leafet
- Returns:
g_star_i (list of numpy.ndarrays) – G*i values for each lipid at each time step
w_ii_all (list of numpy.ndarrays) – Self-influence of the lipids
- static hierarchical_clustering(weight_matrix_f, w_ii_f, core_lipids)[source]
Hierarchical clustering approach to identify spatial related Lo domains.
- Parameters:
weight_matrix_f (numpy.ndarray) – Matrix containing the weight factors between all lipid pairs at one time step
w_ii_f (numpy.ndarray) – Self-influence weight factor for every lipid at one time step
core_lipids (numpy.ndarray) – Array contains information which lipid is assigned as core member
- Returns:
clusters – The dictionary contains each found cluster
- Return type:
- static hmm_diff_checker(means, prediction_results)[source]
- Checks if prediction assignments correct with respect to means. Since HMM is unsupervised, model can assign 0 to
disordered domains and 1 to ordered domains. In these cases needs to be changed to vice versa.
- Parameters:
means (np.ndarray) – Table of model which contains means of area per lipid and order parameters
prediction_results (np.ndarray) – Initial prediction results which is 1s and 0s
- Returns:
prediction_results – Same or flipped prediction results with respect to means table
- Return type:
np.ndarray
- permut_getis_ord_stat(weight_matrix_all, leaflet)[source]
Getis-Ord Local Spatial Autocorrelation Statistics calculation based on the predicted order states of each lipid and the weighting factors between neighbored lipids.
- Parameters:
weight_matrix_all (sparse.csr_array) – Weight matrices for all lipid in a leaflet at each time step
leaflet (int) – 0 = upper leaflet, 1 = lower leafet
- Returns:
g_star_i – G*i values for each lipid at each time step
- Return type:
list of numpy.ndarrays
- prepare_train_data()[source]
Prepare train data for GMM and HMM while separating self.results.train with respect to each residue
- state_validate()[source]
Validate state assignments of HMM model by checking means of the model of each residue.
- static weight_matrix(vor, pbc_idx, coor_xy)[source]
Calculate the weight factors between neighbored lipid pairs based on a Voronoi tessellation.
- Parameters:
vor (Voronoi Tesselation) – Scipy’s Voronoi Diagram object
pbc_idx (Indices) – Unit cell indices of periodic image coordinates
coor_xy (numpy.ndarray) – Coordinates of upper/lower leaflet
- Returns:
weight_matrix – Weight factors wij between neighbored lipid pairs. Is 0 if lipids are not directly neighbored.
- Return type:
numpy.ndarray
- class domhmm.analysis.base.LeafletAnalysisBase(universe_or_atomgroup: Universe | AtomGroup, membrane_select: str = 'all', gmm_kwargs: None | dict = None, hmm_kwargs: None | dict = None, leaflet_kwargs: Dict[str, Any] = {}, leaflet_select: None | AtomGroup | str | list = None, heads: Dict[str, Any] = {}, tails: Dict[str, Any] = {}, sterol_heads: Dict[str, Any] = {}, sterol_tails: Dict[str, Any] = {}, tmd_protein_list: None | AtomGroup | str | list = None, frac: float = 0.5, p_value: float = 0.05, leaflet_frame_rate: None | int = None, sterol_frame_rate: int = 1, asymmetric_membrane: bool = False, verbose: bool = False, result_plots: bool = False, trained_hmms: Dict[str, Any] = {}, n_init_hmm: int = 2, save_plots: bool = False, **kwargs)[source]
Bases:
AnalysisBaseLeafletAnalysisBase class.
This class marks the start point for each analysis method.
It connects to the MDAnalysis core library, setups the MDAnalysis universe, and provides coordinates for each membrane leaflet.
- Parameters:
universe_or_atomgroup (
UniverseorAtomGroup) – Universe or group of atoms to apply this analysis to. If a trajectory is associated with the atoms, then the computation iterates over the trajectory.membrane_select (str) – Membrane selection query for analysis of simulation trajectories.
gmm_kwargs (Optional[Dict]) – Optional parameter for Gaussian mixture model function.
hmm_kwargs (Optional[Dict]) – Optional parameter for Hidden Markov model function.
leaflet_kwargs (Optional[dict]) – dictionary containing additional arguments for the MDAnalysis LeafletFinder
leaflet_select (Union[“auto”, List[AtomGroup], List[str]]) –
- Leaflet selection options for lipids which can be automatic by finding Leafletfinder, atomgroup, string query or
list
heads (Dict[str, Any]) – dictionary containing residue name and atom selection for lipid head groups
tails (Dict[str, Any]) – dictionary containing residue name and atom selection for lipid tail groups
sterol_heads (Dict[str, Any]) – dictionary containing residue name and atom selection for sterol head groups
sterol_tails (Dict[str, Any]) – dictionary containing residue name and atom selection for sterol tail groups (head as first, tail beginning as second)
tmd_protein_list (Union[“auto”, List[AtomGroup], List[str]]) – Transmembrane domain protein list to include area per lipid calculation
frac (float) – fraction of box length in x and y outside the unit cell considered for Voronoi calculation
p_value (float) – p_value for z_score calculation
leaflet_frame_rate (Union[None, int]) – Frame rate for checking lipids leaflet assignments via LeafletFinder
sterol_frame_rate (int) – Frame rate for checking sterols leaflet assignments via LeafletFinder
asymmetric_membrane (bool) – Asymmetric membrane option to train models by separated data w.r.t. leaflets
verbose (bool) – Debug option to print step progress, warnings and errors
result_plots (bool) – Plotting intermediate result option
save_plots (bool) – Option for saving intermediate plots in pdf format
trained_hmms (dict) – User-specific HMM (e.g., pre-trained on another simulation)
- Variables:
universe_or_atomgroup (
Universe) – The universe to which this analysis is appliedatomgroup (
AtomGroup) – The atoms to which this analysis is appliedresults (
Results) – results of calculation are stored here, after callingrun()start (Optional[int]) – The first frame of the trajectory used to compute the analysis
stop (Optional[int]) – The frame to stop at for the analysis
step (Optional[int]) – Number of frames to skip between each analyzed frame
n_frames (int) – Number of frames analysed in the trajectory
times (numpy.ndarray) – array of Timestep times. Only exists after calling
run()frames (numpy.ndarray) – array of Timestep frame indices. Only exists after calling
run()
- get_heads()[source]
Make an atomgroup containing all head groups (lipids + sterols).
- Returns:
all_heads – AtomGroup containing headgroups of all lipids and sterols
- Return type:
MDAnalysis.AtomGroup
- get_leaflets()[source]
Automatically assign non-sterol compounds to the upper and lower leaflet of a bilayer using the MDAnalysis.LeafletFinder
- get_leaflets_sterol()[source]
Assign sterol compounds to the upper and lower leaflet of a bilayer using a distance cut-off.
- get_lipid_tails()[source]
Make atomgroups for tailgroups for each chain of residues
- Variables:
self.resid_tails_selection (dict) – dictionary containing the tail atomgroups for all residues each tail