Journal of Biomedical Science and Engineering

Volume 2, Issue 6 (October 2009)

ISSN Print: 1937-6871   ISSN Online: 1937-688X

Classification with binary gene expressions

PP. 390-399
DOI: 10.4236/jbise.2009.26056

ABSTRACT

Microarray gene expression measurements are usually reported, used and archived to high numerical precision. However, properties of mRNA molecules, such as their low stability and availability in small copy numbers, and the fact that measurements correspond to a population of cells rather than a single cell, make high precision meaningless. Recent work shows that reducing measurement precision leads to very little loss of information, right down to binary levels. In this paper we show how properties of binary spaces can be useful in making inferences from microarray data. In particular, we use the Tanimoto similarity metric for binary vectors, which has been used effectively in the chemoinformatics literature for retrieving chemical compounds with certain functional properties. This measure, when incorporated in a kernel framework, helps recover any information lost by quantization. By implementing a spectral clustering framework, we further show that a second reason for the high performance of the Tanimoto metric can be traced back to a hitherto unnoticed systematic variability in array data: probe-level uncertainties are systematically lower for arrays with large numbers of expressed genes. While we offer no molecular-level explanation for this systematic variability, the fact that it can be exploited in a suitable similarity metric is a useful observation in itself. We also present preliminary results showing that working with binary data considerably reduces the variability of results across the choice of algorithms in the preprocessing stages of microarray analysis.
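
As an illustration of the Tanimoto similarity named in the abstract, the following is a minimal sketch rather than the authors' implementation. For two binary vectors, the Tanimoto similarity is the number of positions where both are 1 divided by the number of positions where at least one is 1. The function names, the single fixed `threshold` used for binarization, and the convention that two all-zero vectors have similarity 1 are assumptions made here for the example; the abstract does not specify how expression values are quantized.

```python
def binarize(values, threshold):
    """Map expression values to 0/1. The fixed threshold is a hypothetical choice."""
    return [1 if v > threshold else 0 for v in values]


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length binary vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)     # genes 'on' in both arrays
    either = sum(1 for x, y in zip(a, b) if x or y)    # genes 'on' in at least one
    # Assumed convention: two all-zero vectors are treated as identical.
    return both / either if either else 1.0


def tanimoto_matrix(rows):
    """Pairwise similarity matrix, usable as a precomputed kernel matrix."""
    return [[tanimoto(a, b) for b in rows] for a in rows]


# Toy data: three arrays, four genes each (made-up values, not from the paper).
arrays = [
    binarize([7.1, 6.4, 1.2, 0.8], threshold=3.0),
    binarize([5.9, 0.3, 4.4, 1.1], threshold=3.0),
    binarize([0.5, 1.0, 6.2, 8.3], threshold=3.0),
]
K = tanimoto_matrix(arrays)
```

A matrix such as `K` could then be passed to a kernel classifier or a spectral clustering routine that accepts precomputed similarities. Note that the denominator grows with the number of expressed genes in each array, which is how a per-array property can enter the similarity.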

Share and Cite:

Tuna, S. and Niranjan, M. (2009) Classification with binary gene expressions. Journal of Biomedical Science and Engineering, 2, 390-399. doi: 10.4236/jbise.2009.26056.

Cited by

[1] Latent representation of the human pan-celltype epigenome through a deep recurrent neural network
2021
[2] The Distribution of Standard Deviations Applied to High Throughput Screening
2019
[3] Quantitative Assessment of Flow Regime Alteration Using a Revised Range of Variability Methods
Water, 2018
[4] Knowledge about the presence or absence of miRNA isoforms (isomiRs) can successfully discriminate amongst 32 TCGA cancer types
Computational Medicine Center Faculty Papers, 2017
[5] New inhibitor targeting human transcription factor HSF1: effects on the heat shock response and tumor cell survival
2017
[6] Knowledge about the presence or absence of miRNA isoforms (isomiRs) can successfully discriminate amongst 32 TCGA cancer types
Nucleic Acids Research, 2017
[7] The presence or absence alone of miRNA isoforms (isomiRs) successfully discriminate amongst the 32 TCGA cancer types
2016
[8] Modelling at the transcriptome-proteome interface
2015
[9] A machine learning framework of functional biomarker discovery for different microbial communities based on metagenomic data
Systems Biology (ISB), 2012 IEEE 6th International Conference on. IEEE, 2012
[10] Utilizing Universal Probability of Expression Code (UPC) to Identify Disrupted Pathways in Cancer Samples
2011
[11] Comparison and performance enhancement of modern pattern classifiers
2010
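[12] Comparing the classification performance of Bayesian binary regression (BBR) and prediction analysis of microarrays (PAM) on gene expression data
Degree thesis, Institute of Statistics, Tsing Hua University, 2010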
[13] Reducing the algorithmic variability in transcriptome-based inference
Bioinformatics, 2010
[14] Cross-platform analysis with binarized gene expression data
Pattern Recognition in Bioinformatics, 2009
[15] Cross-Platform Analysis with Binarized Gene Expression Data.
Lecture Notes in Computer Science book series, 2009

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.