Learning similarity functions for multi-platform gene expression data

Abstract

The existence of several technologies for measuring gene expression and the growing number of available large-scale gene expression microarrays motivate the need for cross-platform analysis tools. Cross-platform analysis of microarray data is an important problem that relies heavily on the choice of a similarity function. For a classification task, a good similarity function should improve prediction performance. It should also be easy to compute and provide new biological insights into the data. In practice, however, choosing a good similarity function for multi-platform microarray data is difficult. In this work, our goal is to improve the performance of microarray search engines such as CellMontage. We therefore focus on the ranking task rather than the classification task. Our ranking-based approach compares favourably to several similarity functions, including the Pearson and Spearman correlation coefficients, the Euclidean distance, Linear Discriminant Analysis, and Neighbourhood Component Analysis. Experiments show that our method can distinguish different types of cells with high accuracy, including induced pluripotent stem cells, embryonic stem cells, and cancer cells.
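
To make the ranking setting concrete, the following is a minimal sketch of how three of the baseline similarity functions named in the abstract (Pearson correlation, Spearman correlation, and Euclidean distance) could be used to rank database profiles against a query. It is not the method proposed in the paper. The function names, the `rank_database` helper, and the toy expression values are all hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def neg_euclidean(x, y):
    """Negated Euclidean distance, so that larger means more similar."""
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def rank_database(query, database, similarity):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, similarity(query, profile))
              for name, profile in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four genes per profile.
query = [1.0, 2.0, 3.0, 4.0]
database = {
    "scaled": [2.0, 4.0, 6.0, 8.0],    # same pattern on a different scale
    "reversed": [4.0, 3.0, 2.0, 1.0],  # opposite pattern
    "near": [1.0, 2.0, 3.0, 5.0],      # almost identical values
}

by_pearson = rank_database(query, database, pearson)
by_spearman = rank_database(query, database, spearman)
by_euclidean = rank_database(query, database, neg_euclidean)
```

In this toy example the correlation-based scores treat the rescaled profile as a perfect match, while the Euclidean distance places it last. This is one reason the choice of similarity function matters when profiles come from platforms with different measurement scales.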

Details

  • CRID
    1570291227079466496
  • NII Article ID
    110008584214
  • NII Bibliographic ID
    AA12055912
  • Text language code
    en
  • Data source type
    • CiNii Articles
