Comparison of Methods of Detection of Exceptional Sequences in Prokaryotic Genomes
- Authors: Rusinov I.S.1, Ershova A.S.1,2,3, Karyagina A.S.1,2,3, Spirin S.A.1,4,5, Alexeevski A.V.1,4,5
- Affiliations:
  - Belozersky Institute of Physico-Chemical Biology
  - Gamaleya National Research Center of Epidemiology and Microbiology
  - All-Russia Research Institute of Agricultural Biotechnology
  - Institute of System Studies
  - Faculty of Bioengineering and Bioinformatics
- Issue: Vol 83, No 2 (2018)
- Pages: 129-139
- Section: Article
- URL: https://journals.rcsi.science/0006-2979/article/view/151596
- DOI: https://doi.org/10.1134/S0006297918020050
- ID: 151596
Abstract
Many proteins must recognize specific DNA sequences in order to function. The number of recognition sites and their distribution along the DNA might be of biological importance. For example, the number of restriction sites is often reduced in prokaryotic and phage genomes to decrease the probability of DNA cleavage by restriction endonucleases. We call a sequence exceptional if its frequency in a genome differs significantly from the one predicted by some mathematical model. An exceptional sequence can be either under- or over-represented, depending on how its frequency compares with the predicted one. Exceptional sequences can be considered biologically meaningful, for example, as targets of DNA-binding proteins or as parts of abundant repetitive elements. Several methods are used to predict the frequency of a short sequence in a genome from the actual frequencies of certain of its subsequences. The most popular are based on Markov chain models. However, no rigorous comparison of these methods has previously been performed. We compared three methods for the prediction of short sequence frequencies: the maximum-order Markov chain model-based method, the method that uses the geometric mean of extended Markovian estimates, and the method that utilizes frequencies of all subsequences, including discontiguous ones. We applied them to restriction sites in complete genomes of 2500 prokaryotic species and demonstrated that the results depend greatly on the method used: the lists of the 5% most under-represented sites differed by up to 50%. The method designed by Burge and coauthors in 1992, which utilizes all subsequences of the sequence, showed a higher precision than the other two methods both on prokaryotic genomes and on randomly generated sequences subjected to computationally simulated selective pressure. We propose this method as the first choice for detection of exceptional sequences in prokaryotic genomes.
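As an illustration of the first of the three methods, the following is a minimal Python sketch of the commonly used maximum-order Markov chain estimate, in which the expected count of a word w1..wk is N(w1..w[k-1]) * N(w2..wk) / N(w2..w[k-1]). The abstract does not state the formula, so this form is an assumption, as are the function names (count_kmers, markov_expected, observed_to_expected) and the choice to count overlapping occurrences on a single strand. The other two methods are not sketched here.

```python
from collections import Counter


def count_kmers(seq: str, k: int) -> Counter:
    """Count overlapping occurrences of every k-mer in seq (single strand only)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov_expected(seq: str, word: str) -> float:
    """Expected count of `word` under the maximum-order Markov estimate.

    Assumed formula (not given in the abstract):
        E(w1..wk) = N(w1..w[k-1]) * N(w2..wk) / N(w2..w[k-1])
    """
    k = len(word)
    if k < 3:
        raise ValueError("word must have at least 3 letters")
    shorter = count_kmers(seq, k - 1)
    prefix = shorter[word[:-1]]                    # N(w1..w[k-1])
    suffix = shorter[word[1:]]                     # N(w2..wk)
    core = count_kmers(seq, k - 2)[word[1:-1]]     # N(w2..w[k-1])
    if core == 0:
        return 0.0  # the inner subsequence never occurs, so neither can the word
    return prefix * suffix / core


def observed_to_expected(seq: str, word: str) -> float:
    """Ratio of observed to expected counts; NaN when the expectation is zero."""
    observed = count_kmers(seq, len(word))[word]
    expected = markov_expected(seq, word)
    return observed / expected if expected > 0 else float("nan")
```

A ratio well below 1 would flag a candidate under-represented word and a ratio well above 1 an over-represented one. Any significance threshold, and any handling of the reverse strand, is left out of this sketch.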
About the authors
I. Rusinov
Belozersky Institute of Physico-Chemical Biology
Email: aba@belozersky.msu.ru
Russia, Moscow, 119992
A. Ershova
Belozersky Institute of Physico-Chemical Biology; Gamaleya National Research Center of Epidemiology and Microbiology; All-Russia Research Institute of Agricultural Biotechnology
Email: aba@belozersky.msu.ru
Russia, Moscow, 119992; Moscow, 123098; Moscow, 127550
A. Karyagina
Belozersky Institute of Physico-Chemical Biology; Gamaleya National Research Center of Epidemiology and Microbiology; All-Russia Research Institute of Agricultural Biotechnology
Email: aba@belozersky.msu.ru
Russia, Moscow, 119992; Moscow, 123098; Moscow, 127550
S. Spirin
Belozersky Institute of Physico-Chemical Biology; Institute of System Studies; Faculty of Bioengineering and Bioinformatics
Email: aba@belozersky.msu.ru
Russia, Moscow, 119992; Moscow, 117281; Moscow, 119991
A. Alexeevski
Belozersky Institute of Physico-Chemical Biology; Institute of System Studies; Faculty of Bioengineering and Bioinformatics
Corresponding author.
Email: aba@belozersky.msu.ru
Russia, Moscow, 119992; Moscow, 117281; Moscow, 119991