Protein Threading Sequence

Template-based:
An already known protein structure is used. The difference from homology modeling is that a single structure is not used; instead, a family of folds is used, e.g. the globular family. Threading is actually a generalization of homology modeling: when a single known protein structure fails to establish homology in the sequence alignment, i.e. cannot act as a template, threading is selected. Threading is also called:

1. ➡️ Fold Recognition
An algorithm is designed for threading that identifies which fold in the library is the best fit for our protein.

2. ➡️ Inverse Folding
The fold is recognized first, and the protein sequence is folded onto it afterwards.

(Both names describe the same approach, with a slightly different emphasis.)

In homology modeling, the sequence is aligned with another sequence and then folded into the structure. In threading, we instead pick from existing folds/shapes: a library of folds is constructed, and the fold is recognized first and then used.

The basic premise of threading is that the proteins of all organisms share only thousands of structural folds, whereas the number of proteins runs into the millions. Newly discovered structural folds are rare. Estimates suggest that all proteins fit into at most about 10,000 folds. A set of folds of this size can therefore be used as templates.

To build a model, known experimental data is used. The algorithm used here performs sequence-structure alignment. It is called threading because, given a library of folds, we can thread our sequence onto each fold, i.e. align our sequence with its structure.

The algorithm determines how far the sequence can be extended over the structure. Sequence-structure alignments contain gaps as well. Sometimes gaps are not modeled; this need not affect the structure globally, because sharing about 30% of the amino acids is enough to form a specific fold. Even in the twilight zone, the specific fold may still form. Homology is determined on this basis.

The alignment matrix is 18 x 20.

(Dynamic programming is also used here.)

18 = Profiles, 20 = Amino acids
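A minimal sketch of how dynamic programming could align a sequence to a template described as a string of profile classes. The function name, the class labels, the toy score table and the flat gap penalty are all hypothetical choices made for illustration; they are not values from any published profile.

```python
def thread_score(sequence, environments, score, gap=-1.0):
    """Global (Needleman-Wunsch style) alignment of an amino acid sequence
    to a list of structural profile classes. Returns the best total score.

    score maps (profile_class, amino_acid) -> float; missing pairs count as 0.
    gap is a flat per-position gap penalty (an assumption of this sketch).
    """
    n, m = len(sequence), len(environments)
    # dp[i][j] = best score aligning the first i residues to the first j positions
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = (environments[j - 1], sequence[i - 1])
            match = dp[i - 1][j - 1] + score.get(pair, 0.0)
            skip_position = dp[i][j - 1] + gap   # gap in the sequence
            skip_residue = dp[i - 1][j] + gap    # gap in the structure
            dp[i][j] = max(match, skip_position, skip_residue)
    return dp[n][m]


# Hypothetical two-class, two-residue excerpt of an 18 x 20 table
toy_score = {
    ("B1_helix", "L"): 1.0, ("B1_helix", "K"): -1.0,
    ("E_coil", "L"): -1.0, ("E_coil", "K"): 1.0,
}
template = ["B1_helix", "E_coil", "B1_helix"]

full_match = thread_score("LKL", template, toy_score)
with_gap = thread_score("LL", template, toy_score)
print(full_match, with_gap)
```

In the second call the shorter sequence can only fit the template by leaving the exposed position unmatched, which is how a gap enters the alignment.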

Profiles:
This means that an amino acid will be either exposed, buried, or partially exposed/partially buried. Different programs measure this surface accessibility. It is based on biochemical properties (hydrophobic or hydrophilic character), and the polar character varies within a patch, e.g.

Hydrophilic with increased polar character

Hydrophilic with decreased polar character

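A hydropathy plot shows how hydrophobic or hydrophilic character varies along the sequence, usually as an average of per-residue hydropathy values over a sliding window.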
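A minimal sketch of computing the values behind a hydropathy plot. It assumes the Kyte-Doolittle scale, which the text above does not name, and a window size chosen only for illustration (real plots commonly use wider windows).

```python
# Kyte-Doolittle hydropathy values: positive = hydrophobic, negative = hydrophilic
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=3):
    """Average hydropathy over each sliding window along the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical peptide: a hydrophobic stretch followed by a charged one
profile = hydropathy_profile("ILVKDE", window=3)
for position, value in enumerate(profile, start=1):
    print(position, round(value, 2))
```

Windows with positive averages are candidates for buried positions, and windows with negative averages for exposed ones.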

The most prominent aspect of hydrophobicity and hydrophilicity is polar character, i.e. it decides which amino acids will be on the surface and which will not.

Surface accessibility depends on:

Hydrophobic/hydrophilic character
Type of secondary structure

On this basis, 18 profiles are formed:

E: Exposed
P1: Partially buried, partially polar
P2: Partially buried, fully polar
B1, B2, B3: Fully buried, with an increasing degree of polar environment (least, moderate, high)

(When a residue sits between strong helix formers, it has to take the helix shape. Hence a buried residue can still be surrounded by a polar environment.)

When these 6 conditions are combined with the 3 secondary structures (alpha helix, beta sheet, turns/coils), 18 profiles are formed (6 x 3 = 18).
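The combination above can be enumerated directly. This is only an illustrative sketch; the label strings are hypothetical names, not an established naming scheme.

```python
from itertools import product

# The 6 burial/polarity conditions and the 3 secondary structure types
ENVIRONMENTS = ["E", "P1", "P2", "B1", "B2", "B3"]
SECONDARY_STRUCTURES = ["helix", "sheet", "coil"]

# Every condition paired with every secondary structure type
PROFILE_CLASSES = [
    f"{env}_{ss}" for env, ss in product(ENVIRONMENTS, SECONDARY_STRUCTURES)
]

print(len(PROFILE_CLASSES))
print(PROFILE_CLASSES)
```

Each of these classes corresponds to one row of the 18 x 20 alignment matrix, with one column per amino acid.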

Multiple alignments are produced, one for each candidate, and the best alignment is chosen for building the model. The corresponding template is then used: a backbone trace is constructed on it, followed by the addition of side chains.
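Choosing the best template can be pictured as scoring the sequence against every fold in the library and keeping the highest score. The sketch below is deliberately simplified: it assumes ungapped alignments of equal length, and the fold names, class labels and score values are hypothetical.

```python
def ungapped_score(sequence, environments, score):
    """Sum of per-position scores; assumes equal lengths, so no gaps."""
    return sum(
        score.get((env, aa), 0.0) for env, aa in zip(environments, sequence)
    )


def best_fold(sequence, fold_library, score):
    """Return (fold_name, score) for the highest-scoring template.

    Only templates with the same length as the sequence are considered,
    which is a simplification of this sketch. Raises ValueError if none match.
    """
    candidates = {
        name: ungapped_score(sequence, envs, score)
        for name, envs in fold_library.items()
        if len(envs) == len(sequence)
    }
    return max(candidates.items(), key=lambda item: item[1])


# Hypothetical library of two tiny "folds" and a toy score table
fold_library = {
    "fold_A": ["B1", "E", "B1"],
    "fold_B": ["E", "B1", "E"],
}
toy_score = {
    ("B1", "L"): 1.0, ("B1", "K"): -1.0,
    ("E", "L"): -1.0, ("E", "K"): 1.0,
}

winner = best_fold("LKL", fold_library, toy_score)
print(winner)
```

A real fold recognition program would score gapped alignments and usually normalizes the scores before comparing templates of different sizes.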

A scoring function is designed for it:

Score(i, j) = ln [ (Probability of residue j in class i) / (Probability of residue j in any class) ]

If a residue has a high probability of falling in a specific class, that does not mean it cannot fall in any other class, only that its probability there will be lower. The smaller the denominator is relative to the numerator, the larger the ratio and therefore the score.
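A minimal sketch of this scoring function, assuming the probabilities are estimated from observed counts of each residue in each class. The counts are invented for illustration, and reading the numerator as the frequency of residue j among all residues seen in class i is an interpretation of the formula, not something the text states.

```python
import math


def log_odds_scores(counts):
    """counts[env][aa] = times amino acid aa was observed in class env.

    Returns score[(env, aa)] = ln( P(aa in env) / P(aa in any class) ).
    """
    total = sum(sum(row.values()) for row in counts.values())
    aa_totals = {}
    for row in counts.values():
        for aa, c in row.items():
            aa_totals[aa] = aa_totals.get(aa, 0) + c

    scores = {}
    for env, row in counts.items():
        env_total = sum(row.values())
        for aa, c in row.items():
            if c == 0:
                continue  # ln(0) is undefined; a real table would need pseudocounts
            p_aa_in_env = c / env_total
            p_aa_overall = aa_totals[aa] / total
            scores[(env, aa)] = math.log(p_aa_in_env / p_aa_overall)
    return scores


# Hypothetical counts: leucine prefers the buried class, lysine the exposed one
toy_counts = {
    "B1": {"L": 8, "K": 2},
    "E": {"L": 2, "K": 8},
}
scores = log_odds_scores(toy_counts)
for key in sorted(scores):
    print(key, round(scores[key], 3))
```

A positive score means the residue is seen in that class more often than its overall frequency would predict, and a negative score means less often.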

Because this method is a generalization, the structure it produces will not always be very reliable.
