Title: Digital Signal Processing Approaches in the field of Genomics: A Recent Trend

Authors: Shivani Saxena, Ahsan Z. Rizvi

 DOI: https://dx.doi.org/10.18535/jmscr/v12i02.10

Abstract

Digital signal processing (DSP) techniques have emerged as powerful tools in the field of genomics, enabling researchers to extract valuable insights from complex genetic data. This research paper presents a comprehensive analysis of the recent trends and advancements in applying DSP approaches to genomics. The objective is to provide an overview of the transformative role of DSP in genomic data analysis, variant calling, and interpretation. By leveraging DSP methods such as filtering, feature extraction, time-frequency analysis, and machine learning algorithms, researchers can enhance the quality of genetic signals, identify genetic variants, and gain a deeper understanding of genomic processes. The paper highlights key applications of DSP in genomics, including DNA sequence analysis, RNA expression profiling, epigenetics, and genome-wide association studies. Additionally, the challenges associated with applying DSP techniques in genomics, such as signal noise, data integration, and computational complexity, are discussed. This research paper serves as a valuable resource for researchers, bioinformaticians, and geneticists seeking to harness the power of DSP in genomics, advancing our knowledge of genetic diseases and paving the way for personalized medicine and precision healthcare.

Keywords: Digital signal processing, Genome analysis, Feature extraction, DNA sequence analysis, RNA expression profiling.

Corresponding Author

Shivani Saxena

Dept. of Computer Sciences and Engineering, Institute of Advanced Research, Gandhinagar, India