Research Article

Compositional correlation analysis of gene expression time series

Year 2022, Volume: 10 Issue: 1, 30 - 41, 01.01.2022
https://doi.org/10.21541/apjess.1060765

Abstract

Accurate determination of temporal dependencies among gene expression patterns is crucial for assessing gene function. Gene expression series generally show periodic behavior with nonlinear, curved patterns. This paper determines temporally associated budding yeast gene expression series using the compositional correlation method. The results show that the method is capable of determining real direct or inverse linear, nonlinear and monotonic relationships between all gene pairs. For some gene pairs, Pearson’s correlation indicated negative or very weak relationships (r ≈ 0) even though the pairs were found to be strongly associated. Conversely, a high positive r value was obtained for genes that are inversely related according to the compositional correlation approach. Comparisons with Pearson’s correlation, Spearman’s correlation, distance correlation and the simulated annealing genetic algorithm maximal information coefficient (SGMIC) show that the presented compositional correlation method detects important associations that were not found by the compared methods. Supplementary materials containing the code of the software used, together with some extended figures and tables, are available online.
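
The point about Pearson’s r being close to zero for strongly associated periodic series can be illustrated with a minimal sketch. This is not the compositional correlation method and does not use the yeast data; the series below are synthetic sine and cosine curves, and the names `pearson`, `series_a`, `series_b` and `series_c` are assumptions introduced only for this illustration.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation coefficient r of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic periodic series sampled over one full cycle (assumed data,
# standing in for periodic expression profiles).
n = 24
t = [2 * math.pi * i / n for i in range(n)]
series_a = [math.sin(v) for v in t]
series_b = [math.cos(v) for v in t]  # same cycle, shifted by a quarter period
series_c = [-v for v in series_a]    # exact inverse of series_a

r_ab = pearson(series_a, series_b)   # close to 0 despite the deterministic link
r_ac = pearson(series_a, series_c)   # close to -1 (inverse linear relationship)
print(f"r(a, b) = {r_ab:.3f}, r(a, c) = {r_ac:.3f}")
```

Here `series_b` is fully determined by the same cycle as `series_a`, yet a single linear coefficient reports almost no relationship, which is the kind of case where a method sensitive to nonlinear or shifted patterns may be more informative.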

References

  • H. P. Lovecraft. (1928, February) The Call of Cthulhu. Weird Tales. 159-178.
  • K. Pearson, "Note on Regression and Inheritance in the Case of Two Parents," Proceedings of the Royal Society of London, vol. 58, no. 347-352, pp. 240-242, 1895, doi: 10.1098/rspl.1895.0041.
  • J.-L. Magnard et al., "Biosynthesis of monoterpene scent compounds in roses," Science, vol. 349, no. 6243, pp. 81-83, 2015, doi: 10.1126/science.aab0696.
  • Y. X. R. Wang, K. Jiang, L. J. Feldman, P. J. Bickel, and H. Huang, "Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis," Ann. Appl. Stat., vol. 9, no. 1, pp. 300-323, 2015, doi: 10.1214/14-AOAS792.
  • J. M. Bland and D. G. Altman, "Statistical methods for assessing agreement between two methods of clinical measurement," Lancet, vol. 1, no. 8476, pp. 307-310, 1986. [Online]. Available: http://www.scopus.com/inward/record.url?eid=2-s2.0-0022624332&partnerID=40&md5=7814d6e99afa1a58edebf08387536f8c.
  • M. B. I. Lobbes and P. J. Nelemans, "Good correlation does not automatically imply good agreement: The trouble with comparing tumour size by breast MRI versus histopathology," European Journal of Radiology, vol. 82, no. 12, pp. e906-e907, 2013, doi: 10.1016/j.ejrad.2013.08.025.
  • M. T. Brett, "When is a correlation between non-independent variables "spurious"?," Oikos, vol. 105, no. 3, pp. 647-656, 2004, doi: 10.1111/j.0030-1299.2004.12777.x.
  • L. Duan, W. N. Street, Y. Liu, S. Xu, and B. Wu, "Selecting the Right Correlation Measure for Binary Data," ACM Trans. Knowl. Discov. Data, vol. 9, no. 2, p. Article 13, 2014, doi: 10.1145/2637484.
  • N. Coffey and J. Hinde, "Analyzing time-course microarray data using functional data analysis - A review," Statistical Applications in Genetics and Molecular Biology, vol. 10, no. 1, 2011, Art no. 23, doi: 10.2202/1544-6115.1671.
  • J. Zhang et al., "Genome-wide association study for flowering time, maturity dates and plant height in early maturing soybean (Glycine max) germplasm," BMC Genomics, vol. 16, no. 1, p. 217, 2015, doi: 10.1186/s12864-015-1441-4.
  • X. Zhang, F. Zou, and W. Wang, "Efficient algorithms for genome-wide association study," ACM Trans. Knowl. Discov. Data, vol. 3, no. 4, p. Article 19, 2009, doi: 10.1145/1631162.1631167.
  • S. Kumari et al., "Evaluation of Gene Association Methods for Coexpression Network Construction and Biological Knowledge Discovery," PLoS One, vol. 7, no. 11, p. e50411, 2012, doi: 10.1371/journal.pone.0050411.
  • F. Dikbaş, "A novel two-dimensional correlation coefficient for assessing associations in time series data," International Journal of Climatology, vol. 37, no. 11, pp. 4065-4076, 2017, doi: 10.1002/joc.4998.
  • F. Dikbaş, "A New Two-Dimensional Rank Correlation Coefficient," Water Resources Management, vol. 32, no. 5, pp. 1539-1553, 2018, doi: 10.1007/s11269-017-1886-0.
  • S.-J. Chou et al., "Analysis of spatial-temporal gene expression patterns reveals dynamics and regionalization in developing mouse brain," Sci. Rep., vol. 6, no. 1, p. 19274, 2016, doi: 10.1038/srep19274.
  • E. Martinez, K. Yoshihara, H. Kim, G. M. Mills, V. Trevino, and R. G. W. Verhaak, "Comparison of gene expression patterns across 12 tumor types identifies a cancer supercluster characterized by TP53 mutations and cell cycle defects," Oncogene, vol. 34, no. 21, pp. 2732-2740, 2015, doi: 10.1038/onc.2014.216.
  • J. A. Bubier et al., "Integration of heterogeneous functional genomics data in gerontology research to find genes and pathway underlying aging across species," PLoS One, vol. 14, no. 4, p. e0214523, 2019, doi: 10.1371/journal.pone.0214523.
  • D. I. Scheffer, J. Shen, D. P. Corey, and Z. Y. Chen, "Gene expression by mouse inner ear hair cells during development," Journal of Neuroscience, vol. 35, no. 16, pp. 6366-6380, 2015, doi: 10.1523/JNEUROSCI.5126-14.2015.
  • J. Delfini et al., "Population structure, genetic diversity and genomic selection signatures among a Brazilian common bean germplasm," Sci. Rep., vol. 11, no. 1, p. 2964, 2021, doi: 10.1038/s41598-021-82437-4.
  • A. R. Marderstein, E. R. Davenport, S. Kulm, C. V. Van Hout, O. Elemento, and A. G. Clark, "Leveraging phenotypic variability to identify genetic interactions in human phenotypes," The American Journal of Human Genetics, vol. 108, no. 1, pp. 49-67, 2021, doi: 10.1016/j.ajhg.2020.11.016.
  • M. Perros, "A sustainable model for antibiotics," Science, vol. 347, no. 6226, pp. 1062-1064, 2015, doi: 10.1126/science.aaa3048.
  • F. Dikbaş, "Compositional Correlation for Detecting Real Associations Among Time Series," in Academic Researches in Mathematic and Sciences, Z. Yildirim Ed., 1 ed. Ankara: Gece Kitaplığı, 2018, pp. 27-46.
  • S. Heubach and T. Mansour, "Compositions of n with parts in a set," Congressus Numerantium, vol. 168, p. 127, 2004.
  • G. E. Andrews, The Theory of Partitions (Encyclopedia of Mathematics and its Applications). Cambridge: Cambridge University Press, 1984.
  • G. E. Andrews and K. Eriksson, Integer Partitions. Cambridge: Cambridge University Press, 2004.
  • G. H. Hardy and E. M. Wright, An introduction to the theory of numbers. Oxford university press, 1979.
  • J. J. Watkins, Number theory: a historical approach. Princeton University Press, 2013.
  • A. P. Stakhov, "The golden section in the measurement theory," Computers and Mathematics with Applications, vol. 17, no. 4-6, pp. 613-638, 1989, doi: 10.1016/0898-1221(89)90252-6.
  • L. Lindroos, "Integer Compositions, Gray Code, and the Fibonacci Sequence," 2012.
  • P. T. Spellman et al., "Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization," Molecular Biology of the Cell, vol. 9, no. 12, pp. 3273-3297, 1998. [Online]. Available: http://www.scopus.com/inward/record.url?eid=2-s2.0-0031742022&partnerID=40&md5=212944b877cb8836ca1f33a585f0b8c9.
  • D. N. Reshef et al., "Detecting novel associations in large data sets," Science, vol. 334, no. 6062, pp. 1518-1524, 2011, doi: 10.1126/science.1205438.
  • V. Subbarayan et al., "Inverse relationship between 15-lipoxygenase-2 and PPAR-γ gene expression in normal epithelia compared with tumor epithelia," Neoplasia, vol. 7, no. 3, pp. 280-293, 2005, doi: 10.1593/neo.04457.
  • Y. Zhang, S. Jia, H. Huang, J. Qiu, and C. Zhou, "A novel algorithm for the precise calculation of the maximal information coefficient," Sci. Rep., vol. 4, 2014, Art no. 6662, doi: 10.1038/srep06662.
  • M. Sardi et al., "Genome-wide association across Saccharomyces cerevisiae strains reveals substantial variation in underlying gene requirements for toxin tolerance," PLoS Genet., vol. 14, no. 2, p. e1007217, 2018, doi: 10.1371/journal.pgen.1007217.
  • C. G. Liu, Y. H. Lin, and F. W. Bai, "Global gene expression analysis of Saccharomyces cerevisiae grown under redox potential-controlled very-high-gravity conditions," Biotechnol J, vol. 8, no. 11, pp. 1332-1340, 2013, doi: 10.1002/biot.201300127.
  • C. F. Connelly and J. M. Akey, "On the prospects of whole-genome association mapping in Saccharomyces cerevisiae," Genetics, vol. 191, no. 4, pp. 1345-1353, 2012, doi: 10.1534/genetics.112.141168.
  • S. Bergmann, J. Ihmels, and N. Barkai, "Similarities and Differences in Genome-Wide Expression Data of Six Organisms," PLoS Biol., vol. 2, no. 1, p. e9, 2003, doi: 10.1371/journal.pbio.0020009.
  • D. Wang, A. Arapostathis, C. O. Wilke, and M. K. Markey, "Principal-Oscillation-Pattern Analysis of Gene Expression," PLoS One, vol. 7, no. 1, p. e28805, 2012, doi: 10.1371/journal.pone.0028805.
  • U. de Lichtenberg, L. J. Jensen, A. Fausbøll, T. S. Jensen, P. Bork, and S. Brunak, "Comparison of computational methods for the identification of cell cycle-regulated genes," Bioinformatics, vol. 21, no. 7, pp. 1164-1171, 2005, doi: 10.1093/bioinformatics/bti093.
  • J. Kelleher, Encoding Partitions as Ascending Compositions. NUI, 2005 at Department of Computer Science, UCC., 2005.
There are 40 citations in total.

Details

Primary Language English
Subjects Artificial Intelligence
Journal Section Research Articles
Authors

Fatih Dikbaş 0000-0001-5779-2801

Early Pub Date January 20, 2022
Publication Date January 1, 2022
Submission Date May 18, 2021
Published in Issue Year 2022 Volume: 10 Issue: 1

Cite

IEEE F. Dikbaş, “Compositional correlation analysis of gene expression time series”, APJESS, vol. 10, no. 1, pp. 30–41, 2022, doi: 10.21541/apjess.1060765.

Academic Platform Journal of Engineering and Smart Systems