Sunday, May 25, 2008

A novel approach to predicting amino acids and proteins by the Even-Odd relation (Second part)


Lutvo Kurić

Independent Researcher

Bosnia and Herzegovina

72290 Novi Travnik

Kalinska 7

Tel. 061 763 917

lutvokuric@yahoo.com

Distance 7 in digital genetic code matrix

An even genetic code matrix                     An odd genetic code matrix

(each row: atoms in 7 triplets | row sum; rows marked x are not listed; the last row gives the column totals)

   36   38   38   38   40   40   40 |  270         37   37   37   39   39   39   39 |  267
   38   38   38   40   40   40   40 |  274         37   37   39   39   39   39   41 |  271
   38   38   40   40   40   40   40 |  276         37   39   39   39   39   41   41 |  275
   38   40   40   40   40   40   40 |  278         39   39   39   39   41   41   41 |  279
   40   40   40   40   40   40   40 |  280         39   39   39   41   41   41   41 |  281
   40   40   40   40   40   40   40 |  280         39   39   41   41   41   41   41 |  283
    x    x    x    x    x    x    x |    0          x    x    x    x    x    x    x |    0
    x    x    x    x    x    x    x |    0          x    x    x    x    x    x    x |    0
    x    x    x    x    x    x    x |    0          x    x    x    x    x    x    x |    0
   44   44   44   44   44   44   44 |  308         43   43   43   43   43   45   45 |  305
   44   44   44   44   44   44   44 |  308         43   43   43   43   45   45   45 |  307
   44   44   44   44   44   44   46 |  310         43   43   43   45   45   45   45 |  309
   44   44   44   44   44   46   46 |  312         43   43   45   45   45   45   47 |  313
   44   44   44   44   46   46   46 |  314         43   45   45   45   45   47   47 |  317
   44   44   44   46   46   46   48 |  318         45   45   45   45   47   47   47 |  321

 1070 1078 1084 1092 1100 1106 1114 | 7644       1068 1076 1084 1092 1100 1108 1116 | 7644

(270+274+276+278+280+280+280+282+284+286+288+290+292+296+298+300+302+304+306+308+308+308+310+312+314+318) = 7644;

(267+271+275+279+281+283+285+287+287+287+289+291+293+295+297+299+301+301+301+303+305+307+309+313+317+321) = 7644;
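The two sums above can be checked mechanically. The following is a minimal sketch in Python, assuming only the 26 row sums quoted in the two lines above; the names EVEN_ROW_SUMS, ODD_ROW_SUMS and mirror_pair_sums are illustrative and do not come from the original paper. It confirms the totals and shows that each row sum paired with its mirror row gives the same value.

```python
# Row sums copied from the two sums quoted in the text (26 rows per matrix).
EVEN_ROW_SUMS = [270, 274, 276, 278, 280, 280, 280, 282, 284, 286, 288, 290, 292,
                 296, 298, 300, 302, 304, 306, 308, 308, 308, 310, 312, 314, 318]
ODD_ROW_SUMS = [267, 271, 275, 279, 281, 283, 285, 287, 287, 287, 289, 291, 293,
                295, 297, 299, 301, 301, 301, 303, 305, 307, 309, 313, 317, 321]


def mirror_pair_sums(values):
    """Add the k-th value from the start to the k-th value from the end."""
    half = len(values) // 2
    return [values[k] + values[-1 - k] for k in range(half)]


for name, sums in (("even", EVEN_ROW_SUMS), ("odd", ODD_ROW_SUMS)):
    # Prints the matrix name, its total, and the set of distinct mirror-pair sums.
    print(name, sum(sums), set(mirror_pair_sums(sums)))
```

In both matrices the total is 7644 and every mirror pair adds up to 588, so the total can also be read as 13 pairs of 588.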

Therefore, groups with 7 triplets in the digital tables of the even and odd genetic code have 7644 atoms. Triplets at distance 7 on opposite sides are mutually attracted; that is, they mathematically gravitate towards each other. Here are some examples:

(1070+1114) = (1078+1106) = (1084+1100) = (1092 x 2) = 2184;

2184 = [(19 + 7) x Y];

(270+318) = (274+314) = (276+312) = … = (292+296) = 588;
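The column examples can be verified in the same way. This is a small sketch in Python, assuming the column totals of the distance-7 tables; the helper name opposite_sums is hypothetical. The value of Y is not stated in the text at this point; the sketch simply computes it by dividing 2184 by (19 + 7).

```python
# Column totals of the distance-7 tables (even and odd matrix).
EVEN_COLUMN_TOTALS = [1070, 1078, 1084, 1092, 1100, 1106, 1114]
ODD_COLUMN_TOTALS = [1068, 1076, 1084, 1092, 1100, 1108, 1116]


def opposite_sums(totals):
    """Pair each column with its mirror column; the middle column pairs with itself."""
    n = len(totals)
    return [totals[k] + totals[n - 1 - k] for k in range((n + 1) // 2)]


even_pairs = opposite_sums(EVEN_COLUMN_TOTALS)
odd_pairs = opposite_sums(ODD_COLUMN_TOTALS)

# Solve 2184 = (19 + 7) x Y for Y; a zero remainder means Y is a whole number.
y, remainder = divmod(2184, 19 + 7)

print(even_pairs, odd_pairs, y, remainder)
```

Every pair in both matrices gives 2184, and under this reading Y works out to 84 with no remainder.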

Codes 19 and 7 in even code matrix

(D1+D2+D3+…+D19) + (D29+D28+D27+…+D11) = 11172 = [(7 x 19 x 7) x Y]

(D2+D3+D4+…+D20) + (D28+D27+D26+…+D10) = 11172 = [(7 x 19 x 7) x Y]

(D3+D4+D5+…+D21) + (D27+D26+D25+…+D9) = 11172 = [(7 x 19 x 7) x Y]

etc.

Codes 19 and 7 in odd code matrix

(D1+D2+D3+…+D19) + (D29+D28+D27+…+D11) = 11172 = [(7 x 19 x 7) x Y]

(D2+D3+D4+…+D20) + (D28+D27+D26+…+D10) = 11172 = [(7 x 19 x 7) x Y]

(D3+D4+D5+…+D21) + (D27+D26+D25+…+D9) = 11172 = [(7 x 19 x 7) x Y]

etc.

Distance 19 in digital genetic code matrix

An even genetic code matrix

(each row: atoms in 19 triplets | row sum; the last row gives the column totals)

  36  38  38  38  40  40  40  40  40  40  40  40  40  42  42  42  42  42  42 |   762
  38  38  38  40  40  40  40  40  40  40  40  40  42  42  42  42  42  42  44 |   770
  38  38  40  40  40  40  40  40  40  40  40  42  42  42  42  42  42  44  44 |   776
  38  40  40  40  40  40  40  40  40  40  42  42  42  42  42  42  44  44  44 |   782
  40  40  40  40  40  40  40  40  40  42  42  42  42  42  42  44  44  44  44 |   788
  40  40  40  40  40  40  40  40  42  42  42  42  42  42  44  44  44  44  44 |   792
  40  40  40  40  40  40  40  42  42  42  42  42  42  44  44  44  44  44  44 |   796
  40  40  40  40  40  40  42  42  42  42  42  42  44  44  44  44  44  44  44 |   800
  40  40  40  40  40  42  42  42  42  42  42  44  44  44  44  44  44  44  44 |   804
  40  40  40  40  42  42  42  42  42  42  44  44  44  44  44  44  44  44  44 |   808
  40  40  40  42  42  42  42  42  42  44  44  44  44  44  44  44  44  44  46 |   814
  40  40  42  42  42  42  42  42  44  44  44  44  44  44  44  44  44  46  46 |   820
  40  42  42  42  42  42  42  44  44  44  44  44  44  44  44  44  46  46  46 |   826
  42  42  42  42  42  42  44  44  44  44  44  44  44  44  44  46  46  46  48 |   834

 552 558 562 566 570 572 576 580 584 588 592 596 600 604 606 610 614 618 624 | 11172

Codes 19 and 7 in even code matrix

(D1+D2+D3+D4+D5+D6+D7) + (D14+D13+D12+D11+D10+D9+D8) = 11172 = [(7x19 x 7) x Y]

(D2+D3+D4+D5+D6+D7+D8) + (D13+D12+D11+D10+D9+D8+D7) = 11172 = [(7x19 x 7) x Y]

(D3+D4+D5+D6+D7+D8+D9) + (D12+D11+D10+D9+D8+D7+D6) = 11172 = [(7x19 x 7) x Y]

etc.

Codes 19 and 7 in odd code matrix

(D1+D2+D3+D4+D5+D6+D7) + (D14+D13+D12+D11+D10+D9+D8) = 11172 = [(7x19 x 7) x Y]

(D2+D3+D4+D5+D6+D7+D8) + (D13+D12+D11+D10+D9+D8+D7) = 11172 = [(7x19 x 7) x Y]

(D3+D4+D5+D6+D7+D8+D9) + (D12+D11+D10+D9+D8+D7+D6) = 11172 = [(7x19 x 7) x Y]

etc.

An odd genetic code matrix

(each row: atoms in 19 triplets | row sum; the last row gives the column totals)

  37  37  37  39  39  39  39  41  41  41  41  41  41  41  41  41  43  43  43 |   765
  37  37  39  39  39  39  41  41  41  41  41  41  41  41  41  43  43  43  43 |   771
  37  39  39  39  39  41  41  41  41  41  41  41  41  41  43  43  43  43  43 |   777
  39  39  39  39  41  41  41  41  41  41  41  41  41  43  43  43  43  43  43 |   783
  39  39  39  41  41  41  41  41  41  41  41  41  43  43  43  43  43  43  43 |   787
  39  39  41  41  41  41  41  41  41  41  41  43  43  43  43  43  43  43  43 |   791
  39  41  41  41  41  41  41  41  41  41  43  43  43  43  43  43  43  43  43 |   795
  41  41  41  41  41  41  41  41  41  43  43  43  43  43  43  43  43  43  45 |   801
  41  41  41  41  41  41  41  41  43  43  43  43  43  43  43  43  43  45  45 |   805
  41  41  41  41  41  41  41  43  43  43  43  43  43  43  43  43  45  45  45 |   809
  41  41  41  41  41  41  43  43  43  43  43  43  43  43  43  45  45  45  45 |   813
  41  41  41  41  41  43  43  43  43  43  43  43  43  43  45  45  45  45  47 |   819
  41  41  41  41  43  43  43  43  43  43  43  43  43  45  45  45  45  47  47 |   825
  41  41  41  43  43  43  43  43  43  43  43  43  45  45  45  45  47  47  47 |   831

 554 558 562 568 572 576 580 584 586 588 590 592 596 600 604 608 614 618 622 | 11172

Therefore, groups with 19 triplets in the digital tables of the genetic code have 11172 atoms. In this example too, groups of triplets on opposite sides attract each other through the forces of genetic gravitation.

Here are some examples:

(552+624) = (558+618) = (562+614) = (566+610)…, etc.

(762+834) = (770+826) = (776+820)…, etc.

11172 = [(7 x 19 x 7) x Y]; Y = 12;
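The distance-19 examples can be checked with a short script. This is a sketch in Python, assuming the row sums and column totals of the two distance-19 tables; all names are illustrative. It confirms the total of 11172, the factorisation with Y = 12, and the constant sums of opposite rows and columns.

```python
# Row sums and column totals of the distance-19 tables (even and odd matrix).
EVEN_ROW_SUMS_19 = [762, 770, 776, 782, 788, 792, 796, 800, 804, 808, 814, 820, 826, 834]
ODD_ROW_SUMS_19 = [765, 771, 777, 783, 787, 791, 795, 801, 805, 809, 813, 819, 825, 831]
EVEN_COLUMN_TOTALS_19 = [552, 558, 562, 566, 570, 572, 576, 580, 584, 588,
                         592, 596, 600, 604, 606, 610, 614, 618, 624]
ODD_COLUMN_TOTALS_19 = [554, 558, 562, 568, 572, 576, 580, 584, 586, 588,
                        590, 592, 596, 600, 604, 608, 614, 618, 622]


def opposite_sums(values):
    """Pair each entry with its mirror entry; a middle entry pairs with itself."""
    n = len(values)
    return [values[k] + values[n - 1 - k] for k in range((n + 1) // 2)]


# Distinct pair sums for rows and for columns, combining both matrices.
row_pairs = set(opposite_sums(EVEN_ROW_SUMS_19)) | set(opposite_sums(ODD_ROW_SUMS_19))
column_pairs = set(opposite_sums(EVEN_COLUMN_TOTALS_19)) | set(opposite_sums(ODD_COLUMN_TOTALS_19))

print(sum(EVEN_ROW_SUMS_19), sum(ODD_ROW_SUMS_19))  # total atoms per matrix
print(row_pairs, column_pairs)                      # one value each if all pairs agree
print(7 * 19 * 7 * 12)                              # the factorisation with Y = 12
```

Opposite rows add up to 1596 and opposite columns to 1176 in both matrices, and 7 x 19 x 7 x 12 reproduces 11172.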


Distances in digital genetic code matrix

Number of atoms in triplets

An even code matrix                    An odd code matrix

D1     1344    D17   11424             D1     1344    D17   11424
D2     2604    D18   11340             D2     2604    D18   11340
D3     3780    D19   11172             D3     3780    D19   11172
D4     4872    D20   10920             D4     4872    D20   10920
D5     5880    D21   10584             D5     5880    D21   10584
D6     6804    D22   10164             D6     6804    D22   10164
D7     7644    D23    9660             D7     7644    D23    9660
D8     8400    D24    9072             D8     8400    D24    9072
D9     9072    D25    8400             D9     9072    D25    8400
D10    9660    D26    7644             D10    9660    D26    7644
D11   10164    D27    6804             D11   10164    D27    6804
D12   10584    D28    5880             D12   10584    D28    5880
D13   10920    D29    4872             D13   10920    D29    4872
D14   11172    D30    3780             D14   11172    D30    3780
D15   11340    D31    2604             D15   11340    D31    2604
D16   11424    D32    1344             D16   11424    D32    1344

D1=D32; D2=D31; D3=D30; etc.
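The symmetry D1=D32, D2=D31, and so on can be confirmed directly from the tabulated values. This is a minimal sketch in Python, assuming the 32 values listed in the table; the names are illustrative. The function closed_form is an editorial observation that happens to fit all 32 tabulated values; it is not a formula given in the paper.

```python
# Group totals D1..D32 as listed in the table (identical for the even and odd matrix).
D_VALUES = [1344, 2604, 3780, 4872, 5880, 6804, 7644, 8400,
            9072, 9660, 10164, 10584, 10920, 11172, 11340, 11424,
            11424, 11340, 11172, 10920, 10584, 10164, 9660, 9072,
            8400, 7644, 6804, 5880, 4872, 3780, 2604, 1344]
D = {n: value for n, value in enumerate(D_VALUES, start=1)}


def is_mirror_symmetric(table):
    """Check D(n) == D(33 - n) for every tabulated distance n."""
    return all(table[n] == table[33 - n] for n in table)


def closed_form(n):
    """Editorial observation, not from the paper: 42 * n * (33 - n)."""
    return 42 * n * (33 - n)


print(is_mirror_symmetric(D))                    # symmetry D1=D32, D2=D31, ...
print(all(D[n] == closed_form(n) for n in D))    # fit of the observed closed form
print(D[7], D[19])                               # the two distances discussed in the text
```

Both checks hold for the tabulated values, and D7 and D19 return the totals 7644 and 11172 discussed for distances 7 and 19.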

In the examples above, the forces of mathematical gravitation interconnect groups with different numbers of triplets.

C O N C L U S I O N

Translating the biochemical language of amino acids into a digital language is rewarding work, because it may be very useful for developing new methods for predicting protein subcellular localization, membrane protein type, protein secondary structure, or any other protein attributes.

This is because ever since the concept of Chou's pseudo amino acid composition was proposed [1,2], many efforts have been made trying to use various digital numbers to represent the 20 native amino acids in order to better reflect the sequence-order effects through the vehicle of pseudo amino acid composition. Some investigators used complexity measure factor [3], some used the values derived from the cellular automata [4-7], some used hydrophobic and/or hydrophilic values [8-16], some were through Fourier transform [17, 18], and some used the physicochemical distance [19]. 
 

In view of this, our finding might have a series of impacts on the aforementioned work [33]. We have devoted ourselves to providing a digital code for each of the 20 native amino acids. These digital codes should be more complete and better reflect the essence of each of the 20 amino acids. Therefore, it might stimulate a series of future work using the author's digital codes to formulate the pseudo amino acid composition for predicting protein structure class [20-22], subcellular location [23, 24], membrane protein type [9, 25], enzyme family class [26, 27], GPCR type [28, 29], protease type [30], protein-protein interaction [31], metabolic pathways [32], protein quaternary structure [33], and other protein attributes.

It now becomes possible to use a completely new research strategy in genetics. However, all these relations, which are the outcome of the periodic law (actually, of the law of binary coding), must be observed, because they can be of great importance for decoding the conformational forms and the stereochemical and digital structure of proteins.

R E F E R E N C E S

[1] K. C. Chou (2002) in Gene Cloning & Expression Technologies, Chapter 4 (Weinrer, P. W., and Lu, Q., Eds.), pp. 57-70 Eaton Publishing, Westborough, MA.

[2] K. C. Chou, Prediction of protein cellular attributes using pseudo amino acid composition, PROTEINS: Structure, Function, and Genetics (Erratum: ibid., 2001, Vol.44, 60) 43 (2001) 246-255.

[3] X. Xiao, S. Shao, Y. Ding, Z. Huang, Y. Huang, K. C. Chou, Using complexity measure factor to predict protein subcellular location, Amino Acids 28 (2005) 57-61.

[4] X. Xiao, S. Shao, Y. Ding, Z. Huang, X. Chen, K. C. Chou, Using cellular automata to generate Image representation for biological sequences, Amino Acids 28 (2005) 29-35.

[5] X. Xiao, S. Shao, Y. Ding, Z. Huang, X. Chen, K. C. Chou, An Application of Gene Comparative Image for Predicting the Effect on Replication Ratio by HBV Virus Gene Missense Mutation, Journal of Theoretical Biology 235 (2005) 555-565.

[6] X. Xiao, S. H. Shao, Z. D. Huang, K. C. Chou, Using pseudo amino acid composition to predict protein structural classes: approached with complexity measure factor, Journal of Computational Chemistry 27 (2006) 478-482.

[7] X. Xiao, S. H. Shao, Y. S. Ding, Z. D. Huang, K. C. Chou, Using cellular automata images and pseudo amino acid composition to predict protein sub-cellular location, Amino Acids 30 (2006) 49-54.

[8] K. C. Chou, Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes, Bioinformatics 21 (2005) 10-19.

[9] K. C. Chou, Y. D. Cai, Prediction of membrane protein types by incorporating amphipathic effects, Journal of Chemical Information and Modeling 45 (2005) 407-413.

[10] Z. P. Feng, Prediction of the subcellular location of prokaryotic proteins based on a new representation of the amino acid composition, Biopolymers 58 (2001) 491-499.

[11] Z. P. Feng, An overview on predicting the subcellular location of a protein, In Silico Biol 2 (2002) 291-303.

[12] M. Wang, J. Yang, Z. J. Xu, K. C. Chou, SLLE for predicting membrane protein types, Journal of Theoretical Biology 232 (2005) 7-15.

[13] S. Q. Wang, J. Yang, K. C. Chou, Using stacked generalization to predict membrane protein types based on pseudo amino acid composition, Journal of Theoretical Biology, in press (2006) doi:10.1016/j.jtbi.2006.1005.1006.

[14] M. Wang, J. Yang, G. P. Liu, Z. J. Xu, K. C. Chou, Weighted-support vector machines for predicting membrane protein types based on pseudo amino acid composition, Protein Engineering, Design, and Selection 17 (2004) 509-516.

[15] S. W. Zhang, Q. Pan, H. C. Zhang, Z. C. Shao, J. Y. Shi, Prediction protein homo-oligomer types by pseudo amino acid composition: Approached with an improved feature extraction and naive Bayes feature fusion, Amino Acids 30 (2006) 461-468.

[16] Y. Gao, S. H. Shao, X. Xiao, Y. S. Ding, Y. S. Huang, Z. D. Huang, K. C. Chou, Using pseudo amino acid composition to predict protein subcellular location: approached with Lyapunov index, Bessel function, and Chebyshev filter, Amino Acids 28 (2005) 373-376.

[17] Y. Z. Guo, M. Li, M. Lu, Z. Wen, K. Wang, G. Li, J. Wu, Classifying G protein-coupled receptors and nuclear receptors based on protein power spectrum from fast Fourier transform, Amino Acids 30 (2006) 397-402.

[18] H. Liu, M. Wang, K. C. Chou, Low-frequency Fourier spectrum for predicting membrane protein types, Biochem Biophys Res Commun 336 (2005) 737-739.

[19] K. C. Chou, Prediction of protein subcellular locations by incorporating quasi-sequence-order effect, Biochemical & Biophysical Research Communications 278 (2000) 477-483.

[20] Richard Dawkins: Science and Sensibility - Queen Elizabeth Hall Lecture, London, 24th March 1998. Series title: Sounding the Century ('What will the Twentieth Century leave to its heirs?')

[21] Knight, R.D; Freeland S.J. and Landweber, L.F. (1999) The 3 Faces of the Genetic Code. Trends in the Biochemical Sciences 24(6), 241-247

[22] K. C. Chou, A novel approach to predicting protein structural classes in a (20-1)-D amino acid composition space, Proteins: Structure, Function & Genetics 21 (1995) 319-344.

[23] K. C. Chou, C. T. Zhang, Predicting protein folding types by distance functions that make allowances for amino acid interactions, Journal of Biological Chemistry 269 (1994) 22014-22020.

[24] K. C. Chou, C. T. Zhang, Review: Prediction of protein structural classes, Critical Reviews in Biochemistry and Molecular Biology 30 (1995) 275-349.

[25] K. C. Chou, D. W. Elrod, Protein subcellular location prediction, Protein Engineering 12 (1999) 107-118.

[26] K. C. Chou, Review: Prediction of protein structural classes and subcellular locations, Current Protein and Peptide Science 1 (2000) 171-208.

[27] K. C. Chou, D. W. Elrod, Prediction of membrane protein types and subcellular locations, PROTEINS: Structure, Function, and Genetics 34 (1999) 137-153.

[28] K. C. Chou, D. W. Elrod, Prediction of enzyme family classes, Journal of Proteome Research 2 (2003) 183-190.

[29] K. C. Chou, Y. D. Cai, Predicting enzyme family class in a hybridization space, Protein Science 13 (2004) 2857-2863.

[30] K. C. Chou, D. W. Elrod, Bioinformatical analysis of G-protein-coupled receptors, Journal of Proteome Research 1 (2002) 429-433.

[31] K. C. Chou, Prediction of G-protein-coupled receptor classes, Journal of Proteome Research 4 (2005) 1413-1418.

[32] K. C. Chou, Y. D. Cai, Prediction of protease types in a hybridization space, Biochem. Biophys. Res. Comm. 339 (2006) 1015-1020.

[33] L. Kurić (2007) The digital language of amino acids, Amino Acids, January 25, 2007.

[34] K. C. Chou, Y. D. Cai, Predicting protein-protein interactions from sequences in a hybridization space, Journal of Proteome Research 5 (2006) 316-322.

[35] K. C. Chou, Y. D. Cai, W. Z. Zhong, Predicting networking couples for metabolic pathways of Arabidopsis, EXCLI Journal 5 (2006) 55-65.

[36] K. C. Chou, Y. D. Cai, Predicting protein quaternary structure by pseudo amino acid composition, PROTEINS: Structure, Function, and Genetics 53 (2003) 282-289.

[37] Brooks, Dawn J.; Fresco, Jacques R.; Lesk, Arthur M.; and Singh, Mona. (2002). Evolution of Amino Acid Frequencies in Proteins Over Deep Time: Inferred Order of Introduction of Amino Acids into the Genetic Code. Molecular Biology and Evolution 19, 1645-1655.

[38] Mesure complexe des caractéristiques dynamiques de séries temporelles par l'utilisation d'indices en chaîne et de taux moyens de croissance / Lutvo Kuric in Journal de la société de statistique de Paris (127° Annee, N° 2 (1986, 2° Trim.), [03/11/2000])
