Abstract

The explosion of multiomics data poses new challenges to existing data mining methods. Joint analysis of multiomics data can make the best of the complementary information provided by different types of data and can therefore explore the biological mechanisms of diseases more accurately. In this article, two forms of joint nonnegative matrix factorization based on sparse and graph Laplacian regularization (SG-jNMF) are proposed. In this method, the graph regularization constraint preserves the local geometric structure of the data, and $L_{2,1}$-norm regularization enhances the sparsity among the rows and removes redundant features in the data. First, SG-jNMF1 projects multiomics data into a common subspace and applies the multiomics fusion characteristic matrix to mine the important information closely related to diseases. Second, multiomics data of the same disease are mapped into a common sample space by SG-jNMF2, and the cluster structures are detected clearly. Experimental results show that SG-jNMF achieves significant improvement in sample clustering compared with existing joint analysis frameworks. SG-jNMF also effectively integrates multiomics data to identify co-differentially expressed genes (Co-DEGs). SG-jNMF provides an efficient integrative analysis method for mining the biological information hidden in heterogeneous multiomics data.

1. Introduction

With the development of state-of-the-art sequencing technologies, a large quantity of effective experimental data has been collected. These data may imply unknown molecular mechanisms, and bioinformatics is faced with the task of analyzing such massive omics data. The Cancer Genome Atlas (TCGA, https://tcgadata.nci.nih.gov/tcga/) includes gene expression profile data (GE), DNA methylation data (DM), copy number variation data (CNV), protein expression data, and drug sensitivity data. These data come from approximately 15,000 clinical samples of more than 30 kinds of cancers [1]. These massive data enable researchers to study the mechanisms of cancer development, diagnosis, and treatment at different biological levels.

The joint analysis of multiomics data can make up for missing or unreliable information in single-omics data. In recent years, scientists have performed considerable research on cancer mechanisms based on the joint analysis of cancer multiomics data. For example, Christina et al. integrated the gene expression data and copy number variations of breast cancer, identified possible pathogenic genes, and discovered new subtypes of breast cancer [2]. Wang and Wang used similarity network fusion to jointly analyze mRNA, DM, and microRNA (miRNA) data and further identify cancer subtypes [3]. Among the existing joint analysis methods, those based on matrix decomposition are remarkable. Liu et al. integrated mRNA, somatic mutation, DNA methylation, and copy number variation data and established a block constraint-based RPCA model to identify differentially expressed genes (DEGs) [4]. Integration and analysis of these heterogeneous multiomics data provide an in-depth understanding of the pathogenesis of cancer and promote the development of precision medicine. Recently, unsupervised integrative methods based on matrix decomposition have attracted considerable attention. Zhang et al. constructed a joint matrix factorization framework (jNMF) to discover multidimensional modules of genomic data [5]. Yang and Michailidis introduced a new method named integrative NMF (iNMF) for heterogeneous multiomics data [6]. Strazar et al. incorporated orthogonality regularization into iNMF (iONMF) to integrate and analyze multiple data sources [7]. Joint nonnegative matrix factorization meta-analysis (jNMFMA) [8], multiomics factor analysis (MOFA) [9], and Bayesian joint analysis [10] have also been successfully applied to the integration and analysis of cancer omics data. To avoid the influence of redundant information, many sparse modeling methods have been proposed. Typical applications are as follows: the weighted sparse representation classifier (WSRC) model combined with global encoding (GE) [11] was used to predict interactions between proteins based on protein sequence information; the network-regularized sparse logistic regression model (NSLR) [12] was used to predict survival risk and discover biomarkers; and sparse coregularized matrix decomposition was used to find mutated driver genes [13].

In recent years, graph/network-based analysis, as a powerful data representation tool, has been applied to the modeling and analysis of complex systems [14–17]. In general, entities can be regarded as nodes, and the interactions between entities can be regarded as edges in the graph. Graph-based approaches can explore the local subspace structure and obtain low-dimensional representations of high-dimensional data. Zhang and Ma proposed a graph-based subspace clustering algorithm to detect common modules highly correlated with cancer by jointly analyzing gene expression and protein interaction networks [18]. Mixed-norm Laplacian-regularized low-rank representation (MLLRR) was used to cluster samples [19]. Cui proposed an improved graph-based method to predict drug-target interactions [20]. Liu et al. introduced the contributions of deep neural networks, deep graph embedding, and graph neural networks, along with the opportunities and challenges they face [21]. Wu et al. proposed a multigraph learning algorithm called gMGFL that searches for and chooses a group of decision subgraphs as features to transfer bags and bag labels to the instance level [22].

Recently, sparse regularization has played a very important role in data analysis. The $L_0$-norm, $L_1$-norm, and $L_{2,1}$-norm are all typical sparse regularization methods. Among these sparse constraints, $L_{2,1}$-norm regularization stands out in terms of computational time and performance. The $L_{2,1}$-norm can obtain a projection matrix that is sparse in rows so as to learn discriminative features in the subspace. Zhang used the $L_{2,1}$-norm constraint on the coefficients to ensure that they are sparse in rows [23]. The $L_{2,1}$-norm was also applied to the predictor to ensure that it is robust to noise and outliers [24].

Considering the roles of graph regularization and $L_{2,1}$-norm constraints in matrix factorization, we propose joint nonnegative matrix factorization based on sparse and graph Laplacian regularization (SG-jNMF). SG-jNMF can make the best of the potential associations and complementary information among multiomics data. The main highlights of this approach are as follows.

(1) Graph regularization is incorporated into the joint nonnegative matrix factorization model, and undirected graphs are constructed for the input data. Local graph regularization preserves the local geometrical structure of the data space. Therefore, SG-jNMF can use the low-dimensional characteristics of the observed data to find intrinsic laws and improve the performance of the integrated analysis.

(2) $L_{2,1}$-norm regularization treats each row of the matrix as a whole and enhances the sparsity among the rows. Therefore, involving the $L_{2,1}$-norm can remove redundant features and noise in the data and further reveal a clear cluster structure.

(3) Two forms of SG-jNMF are proposed. SG-jNMF1 projects multiomics data into a fusion feature space; the fusion matrix contains the complementary and differential information provided by multiomics data, so that more accurate results can be obtained when identifying Co-DEGs. SG-jNMF2 projects multiomics data into a common sample space, which results in more accurate clustering.

The rest of this paper is organized as follows. In Section 2, we start with a brief review of jNMF; we then introduce the SG-jNMF method, its optimization process, and a computational complexity analysis. Section 3 presents the experimental results of clustering and feature selection. Finally, we summarize the paper and give some suggestions for future work in Section 4.

2. Materials and Methods

2.1. Joint Nonnegative Matrix Factorization

The jNMF method was first proposed by Zhang et al. [5]. It projects multiple input data matrices into a common subspace so that the information in each input dataset can be integrated for analysis. Each type of genomic data, as original data, is denoted as $X_i$ ($i = 1, 2, \ldots, I$), where $I$ is the number of input data matrices. $W$ is the common basis matrix, and $H_i$ is the corresponding coefficient matrix. The objective function of jNMF can be written as

$$\min_{W, H_i} \sum_{i=1}^{I} \left\| X_i - W H_i \right\|_F^2, \quad \text{s.t. } W \ge 0,\ H_i \ge 0,$$

where $\|\cdot\|_F$ denotes the Frobenius norm.

Obviously, jNMF is the same as NMF when $I = 1$. Therefore, jNMF is a generalization of NMF to multiple input datasets. Similar to NMF, multiplicative update rules are used to minimize the objective function. $W$ and $H_i$ are iteratively updated according to the following rules:

$$W_{jk} \leftarrow W_{jk} \frac{\left(\sum_{i=1}^{I} X_i H_i^{T}\right)_{jk}}{\left(W \sum_{i=1}^{I} H_i H_i^{T}\right)_{jk}}, \qquad (H_i)_{jk} \leftarrow (H_i)_{jk} \frac{\left(W^{T} X_i\right)_{jk}}{\left(W^{T} W H_i\right)_{jk}}.$$

The jNMF method can thus be used to integrate and analyze multiomics data. It simultaneously decomposes the multiomics data matrices into multiple independent coefficient matrices and a common fusion matrix, projecting high-dimensional omics data into a low-dimensional space. Therefore, the abundant differential and complementary information in cancer multiomics data can be used efficiently, and multiple omics datasets are analyzed simultaneously to reveal hidden information with biological significance.
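To make the factorization above concrete, the following is a minimal NumPy sketch of the standard jNMF multiplicative updates reviewed in this section; the function name `jnmf`, its argument names, and the fixed iteration count are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def jnmf(X_list, rank, n_iter=200, eps=1e-10, seed=0):
    """Minimal jNMF sketch: factor each X_i ~ W @ H_i with a shared basis W.

    X_list : list of nonnegative arrays, all with the same number of rows.
    rank   : size of the common low-dimensional space (K).
    """
    rng = np.random.default_rng(seed)
    m = X_list[0].shape[0]
    W = rng.random((m, rank)) + eps
    Hs = [rng.random((rank, X.shape[1])) + eps for X in X_list]

    for _ in range(n_iter):
        # Update the shared basis W using all data matrices jointly.
        num = sum(X @ H.T for X, H in zip(X_list, Hs))
        den = W @ sum(H @ H.T for H in Hs) + eps
        W *= num / den
        # Update each coefficient matrix H_i with W fixed.
        for i, X in enumerate(X_list):
            Hs[i] *= (W.T @ X) / (W.T @ W @ Hs[i] + eps)
    return W, Hs
```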

2.2. Joint Nonnegative Matrix Factorization Based on Sparse and Graph Laplacian Regularization

Manifold learning has become a popular research topic in information science since it was first proposed in Science in 2000 [25, 26]. Assuming that the data are uniformly sampled from a low-dimensional manifold embedded in a high-dimensional space, manifold learning can find the low-dimensional structure in the high-dimensional space and obtain the corresponding embedding mapping. Manifold learning looks for the essence of things from observed phenomena and finds the internal laws of the data. The manifold assumption states that data points that are geometrically adjacent usually have similar characteristics. Therefore, an undirected weighted network/graph $G = (V, E, S)$ is constructed, where $V$ is the vertex set, $E$ is the edge set, and $S$ is the weight set. An edge weight $S_{jk}$ is associated with each edge $(v_j, v_k)$ in $E$. The graph regularization with respect to a low-dimensional representation $H = [h_1, h_2, \ldots, h_n]$ is as follows:

$$R = \frac{1}{2} \sum_{j,k=1}^{n} \left\| h_j - h_k \right\|_2^2 S_{jk} = \operatorname{tr}\!\left(H L H^{T}\right),$$

where $\operatorname{tr}(\cdot)$ is the trace of a matrix, $L$ is the graph Laplacian matrix, and $L = D - S$. $D$ is a diagonal matrix and $D_{jj} = \sum_{k} S_{jk}$. Intuitively, the smaller the value of $\|h_j - h_k\|_2^2$ is, the closer the two data points are. By minimizing $R$, we can obtain a sufficiently smooth mapping function on the data manifold.
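As an illustration of this construction, the sketch below builds a $p$-nearest-neighbor graph and the corresponding Laplacian $L = D - S$ with scikit-learn; a simple 0/1 connectivity weighting is assumed here because the text does not specify the edge-weighting scheme, and the helper name `knn_laplacian` is our own.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def knn_laplacian(X, p=3):
    """Build a symmetric p-nearest-neighbor graph on the rows of X and
    return (S, D, L) with L = D - S, as used in the graph regularization term."""
    S = kneighbors_graph(X, n_neighbors=p, mode='connectivity',
                         include_self=False).toarray()
    S = np.maximum(S, S.T)        # symmetrize: keep an edge if either point selects it
    D = np.diag(S.sum(axis=1))    # degree matrix, D_jj = sum_k S_jk
    L = D - S                     # unnormalized graph Laplacian
    return S, D, L
```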

To decrease the influence of noise and outliers in real data, sparse regularization is usually used to penalize the coefficient matrix. The $L_0$-norm, $L_1$-norm, and $L_{2,1}$-norm are all typical sparse regularization methods. The solution of the $L_0$-norm is an NP-hard problem. The $L_1$-norm is widely used because it has better optimization characteristics than the $L_0$-norm. The $L_1$-norm tends to retain a small number of features, while the coefficients of the other features are all 0; therefore, it can be used for feature selection. However, $L_1$-norm regularization is usually time-consuming. $L_{2,1}$-norm regularization on the coefficient matrix can generate a row-sparse result, and the calculation of the $L_{2,1}$-norm is simple and convenient [23]. In this article, the $L_{2,1}$ penalty is incorporated into SG-jNMF [27]. The $L_{2,1}$-norm of a matrix $W \in \mathbb{R}^{m \times k}$ is defined as

$$\left\| W \right\|_{2,1} = \sum_{j=1}^{m} \sqrt{\sum_{l=1}^{k} W_{jl}^2} = \sum_{j=1}^{m} \left\| w_{j\cdot} \right\|_2,$$

where $w_{j\cdot}$ denotes the $j$th row of $W$.
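For reference, the row-wise sparsity penalty can be evaluated in a couple of lines; this small helper simply computes the definition above and is not tied to any particular implementation of SG-jNMF.

```python
import numpy as np

def l21_norm(W):
    """L2,1 norm of W: the sum of the Euclidean norms of its rows."""
    return np.sqrt((np.asarray(W) ** 2).sum(axis=1)).sum()
```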

2.2.1. SG-jNMF1

There are two forms of the SG-jNMF method in this article. As shown in Figure 1, the SG-jNMF1 method projects multiomics data into a common feature space. Graph regularization and a sparse penalty are applied to the fusion feature matrix. The feature matrix is constrained by graph regularization so that as much of the intrinsic geometric information of the original multiomics data is preserved as possible. The $L_{2,1}$-norm is used to constrain the feature matrix to reduce the influence of outliers and noise, and the objective function of the joint nonnegative matrix factorization is constructed accordingly. The optimization problem can be expressed as

$$\min_{W, H_i} \sum_{i=1}^{I} \left\| X_i - W H_i \right\|_F^2 + \lambda_1 \operatorname{tr}\!\left(W^{T} L W\right) + \lambda_2 \left\| W \right\|_{2,1}, \quad \text{s.t. } W \ge 0,\ H_i \ge 0,$$

where $\lambda_1$ and $\lambda_2$ are regularization parameters and $L$ is the Laplacian matrix, $L = D - S$, in which $S$ is the symmetric weight matrix constructed in the graph regularization. $D$ is a diagonal matrix, and its diagonal elements are equal to the sums of the corresponding rows (or columns) of $S$; i.e., $D_{jj} = \sum_{k} S_{jk}$.

With randomly initialized positive matrices $W$ and $H_i$, the following update rules are executed until the algorithm converges:

$$W_{jk} \leftarrow W_{jk} \frac{\left(\sum_{i=1}^{I} X_i H_i^{T} + \lambda_1 S W\right)_{jk}}{\left(W \sum_{i=1}^{I} H_i H_i^{T} + \lambda_1 D W + \lambda_2 Q W\right)_{jk}}, \qquad (H_i)_{jk} \leftarrow (H_i)_{jk} \frac{\left(W^{T} X_i\right)_{jk}}{\left(W^{T} W H_i\right)_{jk}},$$

where $Q$ is a diagonal matrix, its diagonal elements are $Q_{jj} = 1 / \left(2 \left\| w_{j\cdot} \right\|_2 + \epsilon\right)$ with $w_{j\cdot}$ the $j$th row of $W$, and $\epsilon$ is an infinitesimal positive number.
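The updates can be organized into a single alternating loop. The sketch below follows the update rules as reconstructed above (graph and $L_{2,1}$ penalties on the shared matrix $W$) and reuses the `knn_laplacian` helper from Section 2.2; parameter names such as `lam1` and `lam2` and the fixed iteration count are assumptions for illustration, not the authors' reference code. The SG-jNMF2 form is symmetric, with the penalties moved to the shared sample matrix.

```python
import numpy as np

def sg_jnmf1(X_list, graph, rank, lam1=1.0, lam2=0.1, n_iter=200,
             eps=1e-10, seed=0):
    """Sketch of SG-jNMF1: shared feature matrix W with graph + L2,1 penalties.

    X_list : nonnegative arrays with the same number of rows (features).
    graph  : (S, D, L) matrices over the features, e.g. from knn_laplacian.
    """
    S, D, _ = graph
    rng = np.random.default_rng(seed)
    m = X_list[0].shape[0]
    W = rng.random((m, rank)) + eps
    Hs = [rng.random((rank, X.shape[1])) + eps for X in X_list]

    for _ in range(n_iter):
        # Row-wise reweighting that realizes the L2,1 penalty on W.
        q = 1.0 / (2.0 * np.sqrt((W ** 2).sum(axis=1)) + eps)
        num = sum(X @ H.T for X, H in zip(X_list, Hs)) + lam1 * (S @ W)
        den = (W @ sum(H @ H.T for H in Hs)
               + lam1 * (D @ W) + lam2 * (q[:, None] * W) + eps)
        W *= num / den
        for i, X in enumerate(X_list):
            Hs[i] *= (W.T @ X) / (W.T @ W @ Hs[i] + eps)
    return W, Hs
```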

2.2.2. SG-jNMF2

As seen from Figure 1, the SG-jNMF2 method projects multiomics data into a common sample space, and the constraints are enforced on the common sample matrix. This form can be used to cluster multiomics samples. The model can be expressed as

$$\min_{W_i, H} \sum_{i=1}^{I} \left\| X_i - W_i H \right\|_F^2 + \lambda_1 \operatorname{tr}\!\left(H L H^{T}\right) + \lambda_2 \left\| H \right\|_{2,1}, \quad \text{s.t. } W_i \ge 0,\ H \ge 0,$$

where $H$ is the common sample matrix and $W_i$ is the basis matrix corresponding to the $i$th omics dataset.

Similarly, the algorithm iterates until it converges according to the following rules:

$$(W_i)_{jk} \leftarrow (W_i)_{jk} \frac{\left(X_i H^{T}\right)_{jk}}{\left(W_i H H^{T}\right)_{jk}}, \qquad H_{jk} \leftarrow H_{jk} \frac{\left(\sum_{i=1}^{I} W_i^{T} X_i + \lambda_1 H S\right)_{jk}}{\left(\sum_{i=1}^{I} W_i^{T} W_i H + \lambda_1 H D + \lambda_2 Q H\right)_{jk}},$$

where $L$ is the Laplacian matrix, $L = D - S$, $S$ is the symmetric weight matrix constructed in the graph regularization, and $D$ is a diagonal matrix whose diagonal elements are equal to the sums of the corresponding rows (or columns) of $S$; i.e., $D_{jj} = \sum_{k} S_{jk}$. $Q$ is a diagonal matrix, and its diagonal elements are $Q_{jj} = 1 / \left(2 \left\| h_{j\cdot} \right\|_2 + \epsilon\right)$. Obviously, the objective functions of the two forms of SG-jNMF are both nonconvex, so we obtain locally optimal solutions by minimizing the objective functions with alternating updates. The optimization process is shown as follows.

2.3. Optimization of SG-jNMF

Since the optimization processes of the two forms of SG-jNMF are very similar, we only present that of the first form (SG-jNMF1). We use alternating multiplicative update rules to solve the optimization problem. Specifically, the following update steps are repeated until the algorithm converges.

2.3.1. Optimization of $W$

When $H_i$ is fixed, the optimization of $W$ is performed by minimizing the following objective function:

$$\mathcal{O}(W) = \sum_{i=1}^{I} \left\| X_i - W H_i \right\|_F^2 + \lambda_1 \operatorname{tr}\!\left(W^{T} L W\right) + \lambda_2 \left\| W \right\|_{2,1}, \quad \text{s.t. } W \ge 0.$$

The corresponding Lagrangian function is as follows:

$$\mathcal{L} = \sum_{i=1}^{I} \left\| X_i - W H_i \right\|_F^2 + \lambda_1 \operatorname{tr}\!\left(W^{T} L W\right) + \lambda_2 \operatorname{tr}\!\left(W^{T} Q W\right) + \operatorname{tr}\!\left(\Psi W^{T}\right) + \sum_{i=1}^{I} \operatorname{tr}\!\left(\Phi_i H_i^{T}\right),$$

where $\Psi$ and $\Phi_i$ are the Lagrangian multipliers of the constraints $W \ge 0$ and $H_i \ge 0$, respectively, and the $L_{2,1}$ term is rewritten as $\operatorname{tr}(W^{T} Q W)$ with the diagonal matrix $Q$ defined above. Next, we take the first partial derivative of this Lagrangian function with respect to $W$:

$$\frac{\partial \mathcal{L}}{\partial W} = -2 \sum_{i=1}^{I} X_i H_i^{T} + 2 W \sum_{i=1}^{I} H_i H_i^{T} + 2 \lambda_1 L W + 2 \lambda_2 Q W + \Psi.$$

According to the KKT conditions [28], $\Psi_{jk} W_{jk} = 0$, and the following updating rule can be obtained:

$$W_{jk} \leftarrow W_{jk} \frac{\left(\sum_{i=1}^{I} X_i H_i^{T} + \lambda_1 S W\right)_{jk}}{\left(W \sum_{i=1}^{I} H_i H_i^{T} + \lambda_1 D W + \lambda_2 Q W\right)_{jk}}.$$

2.3.2. Optimization of $H_i$

When $W$ is fixed, the optimization of $H_i$ is performed by minimizing the following objective function:

$$\mathcal{O}(H_i) = \left\| X_i - W H_i \right\|_F^2, \quad \text{s.t. } H_i \ge 0.$$

The corresponding Lagrangian function is as follows:

$$\mathcal{L}(H_i) = \left\| X_i - W H_i \right\|_F^2 + \operatorname{tr}\!\left(\Phi_i H_i^{T}\right),$$

and $H_i$ runs to convergence according to the following formula:

$$(H_i)_{jk} \leftarrow (H_i)_{jk} \frac{\left(W^{T} X_i\right)_{jk}}{\left(W^{T} W H_i\right)_{jk}}.$$

2.4. Convergence and Running Time

In this paper, we also demonstrate the convergence of the method through experiments. Taking the pancreatic adenocarcinoma (PAAD) dataset as an example, the convergence curves of the five methods are shown in Figure 2. The error function used in this article is defined as follows:

$$\mathrm{error} = \sum_{i=1}^{I} \left\| X_i - W H_i \right\|_F^2.$$

Compared with the other four methods, SG-jNMF converges to the smallest error value at the fastest rate.

In addition, we tested the running time of the above methods on the PAAD dataset. The mean running times of these five methods over 10 runs on a PC are shown in Table 1. As seen in Table 1, iGMFNA has the shortest running time, followed by SG-jNMF. This is due to the introduction of sparse constraints in SG-jNMF. The running times of the iNMF, iGMFNA, jNMF, and SG-jNMF methods are all satisfactory.

2.5. Computational Complexity Analysis

In this part, we discuss the extra computational complexity of SG-jNMF compared with jNMF. We use the big $O$ notation to represent the computational complexity of an algorithm. On the basis of the multiplicative update rules of jNMF, we can easily count the arithmetic operations of each iteration. Obviously, the cost of each iteration of jNMF is $O(IMNK)$, where $I$ is the number of input data matrices, $M$ and $N$ are the numbers of rows and columns of each data matrix, and $K$ is the factorization rank. It should be noted that $S$ is a sparse matrix for SG-jNMF. In addition to the multiplicative updates, constructing a $p$-nearest-neighbor graph requires $O(N^2 M)$ operations [28]. Assume that the updates stop after $T$ iterations; then the overall cost of jNMF is $O(TIMNK)$, and the overall cost of SG-jNMF is $O(TIMNK + N^2 M)$.

3. Results and Discussion

3.1. Data Processing

The TCGA project includes gene expression profile data, DNA methylation data, copy number variation data, protein expression data, drug sensitivity data, and so on. In-depth study of these data can help us understand the mechanisms of cancer occurrence and development and provide technical support for the prevention, diagnosis, and treatment of cancer. In this article, four cancer datasets downloaded from TCGA (https://tcgadata.nci.nih.gov/tcga/), namely, PAAD, esophageal carcinoma (ESCA), cholangiocarcinoma (CHOL), and colon adenocarcinoma (COAD), are used in the experiments. Details are listed in Table 2. To avoid dimension mismatches during algorithm execution, the number of genes in the four datasets is aligned to 19,876. First, RPCA is used to reduce the effects of noise and redundant information [29]. Second, the same numbers of samples and features are retained for the multiomics data of the same kind of cancer. Then, the matrices are normalized according to the standard deviation of the data such that each element of the matrix lies between 0 and 1.
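The normalization step can be sketched as follows. Because the text only loosely describes it ("according to the standard deviation ... between 0 and 1"), this helper standardizes each gene by its standard deviation and then applies a min-max rescaling so every element lies in [0, 1]; both the function name and this two-step reading are assumptions.

```python
import numpy as np

def normalize_omics(X):
    """Scale an omics matrix (genes x samples) so that all entries lie in [0, 1].

    Each gene (row) is first divided by its standard deviation, and the whole
    matrix is then min-max rescaled to the unit interval (an assumed reading
    of the normalization described in the text).
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=1, keepdims=True)
    X = X / np.where(std > 0, std, 1.0)          # avoid division by zero
    return (X - X.min()) / (X.max() - X.min() + 1e-12)
```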

3.2. Clustering

Because the SG-jNMF2 method projects multiomics data into a common sample space, the resulting sample matrix contains all the sample information provided by the input multiomics data. To assess the clustering performance of this method, SG-jNMF2 is used to cluster the tumor samples in the CHOL, PAAD, COAD, and ESCA datasets. Four comparison methods (iNMF, iONMF, iGMFNA, and jNMF) perform the same experiments on the same datasets.

3.2.1. Selection of Parameters

For SG-jNMF2, the clustering performance is affected by the regularization parameters. In this experiment, we empirically set the same parameter values for the different omics data from the same cancer [30]. Therefore, there are three parameters, $\lambda_1$, $\lambda_2$, and $p$, that need to be adjusted: $\lambda_1$ is the graph regularization parameter, $\lambda_2$ controls the sparsity of the factorization, and $p$ is the number of nearest neighbors used to construct the undirected graph on the data manifold. From Figure 3, when $p$ is set to 3, the accuracy on the four datasets reaches a maximum. As seen from Figure 4, $\lambda_1$ should be set to 1,000 on PAAD. When $\lambda_1$ is equal to 0.1, the accuracy on COAD achieves its maximum. The accuracy on CHOL achieves its maximum at several values of $\lambda_1$, including 1, and the accuracy on ESCA also achieves its maximum at a particular value of $\lambda_1$. From Figure 5, the accuracy on PAAD reaches its maximum over a range of $\lambda_2$ values, and suitable ranges also exist for ESCA and COAD; for CHOL, the value of $\lambda_2$ does not matter much.

3.2.2. Evaluation Indicators

Several indicators are used to evaluate the clustering performance of SG-jNMF2: accuracy, recall, precision, and F1-score. Accuracy is defined as

$$\mathrm{ACC} = \frac{\sum_{i=1}^{N} \delta\!\left(y_i, \operatorname{map}(c_i)\right)}{N},$$

where $N$ is the total number of samples in the dataset and $\delta(x, y)$ is an indicator function: when $x$ is equal to $y$, the value of the function is equal to 1; otherwise, it is equal to 0. $\operatorname{map}(\cdot)$ maps the clustering label $c_i$ to the real label $y_i$. The other three indicators used to evaluate the clustering performance are defined as follows:

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}},$$

where $TP$ is the number of true positives, $FP$ is the number of false positives, and $FN$ denotes the number of false negatives.
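The label mapping $\operatorname{map}(\cdot)$ can be computed with the Hungarian algorithm, a common choice for clustering accuracy even though the paper does not name the assignment procedure; the sketch below makes that assumption explicit and uses SciPy's `linear_sum_assignment`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """Clustering accuracy: map predicted cluster labels to true labels with the
    Hungarian algorithm, then count the matched samples."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(y_true)
    clusters = np.unique(y_pred)
    # Contingency table: rows = predicted clusters, columns = true classes.
    table = np.zeros((len(clusters), len(classes)), dtype=int)
    for i, c in enumerate(clusters):
        for j, k in enumerate(classes):
            table[i, j] = np.sum((y_pred == c) & (y_true == k))
    row, col = linear_sum_assignment(-table)   # maximize the total agreement
    return table[row, col].sum() / len(y_true)
```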

3.2.3. Results

In this experiment, each algorithm was run fifty times to reduce the impact of random initialization on the clustering results. We compared the accuracy, recall, precision, and F1-score of the four comparison methods with those of SG-jNMF2. The means and variances of the results are shown in Table 3. As seen in Table 3, SG-jNMF2 achieves the highest values on the four indicators mentioned above, except for the recall value on the ESCA dataset. The contributions of the sparse and graph regularization constraints are listed in Table 4. Performance improvements are measured by the relative change $(v_{\mathrm{SG}} - v_{\mathrm{cmp}}) / v_{\mathrm{cmp}}$, where $v_{\mathrm{SG}}$ is the indicator value of SG-jNMF and $v_{\mathrm{cmp}}$ is that of the comparison method. In particular, the sparse constraints improve the accuracy by 49.70%, and the sparse and graph regularization constraints together improve the accuracy by 78.87% on the PAAD dataset. Recall and F1-score achieve improvements of more than 50% on the CHOL dataset. When the sparse constraints are introduced, only the recall on ESCA is reduced, by 0.53%; the results on the other datasets improve to varying degrees. In summary, the performance of integrative NMF in analyzing multiomics data is greatly improved by introducing sparse and graph regularization constraints.

3.3. Identifying Co-DEGs

First, three matrices (the DM, GE, and CNV data of PAAD) are input into the SG-jNMF1 model and projected into a common feature space. Second, we sum the common feature matrix across its rows. Finally, we sort the elements of the resulting vector in descending order and select the top 100 genes as Co-DEGs. These 100 genes are compared with the pancreatic cancer genes exported from GeneCards (http://www.genecards.org). Co-DEGs with relevance scores above 4 are listed in Table 5. CDKN2A is frequently mutated or deleted in many tumors and plays an important role as a tumor suppressor gene. Studies have shown that mutation of CDKN2A is closely related to the development of familial pancreatic cancer [31]. It is frequently seen in many tumors that mutation and overexpression of CCND1 can alter the progression of the cell cycle. Wang et al. identified pancreatitis-associated genes and found that CCND1 is involved in the pancreatic cancer pathway [32]. Research on transcriptome sequencing shows that PTF1A maintains the expression of genes in all cellular processes, and deletion of PTF1A leads to an imbalance, cell damage, and acinar metaplasia, which is directly related to the development of pancreatic cancer [33]. Scientists have also explored the effects of GRP on human intestinal and pancreatic peptides. Therefore, SG-jNMF1 can effectively integrate the information of multiomics data to identify Co-DEGs closely related to the disease.
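The three-step ranking described above amounts to a row-wise sum of the fusion feature matrix followed by a sort; the helper below sketches it, assuming `gene_names` is a list aligned with the rows of the matrix $W$ returned by the SG-jNMF1 factorization.

```python
import numpy as np

def top_co_degs(W, gene_names, top_n=100):
    """Rank genes by the row sums of the fusion feature matrix W and
    return the top_n candidates as Co-DEGs."""
    scores = np.asarray(W).sum(axis=1)   # aggregate each gene's loadings over all factors
    order = np.argsort(scores)[::-1]     # descending order of the summed loadings
    return [gene_names[i] for i in order[:top_n]]
```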

We also use SG-jNMF1 to integrate three gene expression datasets from ESCA, CHOL, and COAD to identify Co-DEGs associated with all three diseases. Some of the identified Co-DEGs and their relevance scores with ESCA, CHOL, and COAD are shown in Table 6. The relevance score of CHEK2 with ESCA is as high as 77.66, and allelic variation in CHEK2 has a strong relationship with the risk of esophageal cancer [34]. The relevance score of CHEK2 with COAD is 29.65, and germline variation in CHEK2 is also closely related to the risk of colorectal cancer [35]. Frequent mutations in BRPA have been widely reported in human malignancies, including esophageal cancer, cholangiocarcinoma, and colon cancer [36–38]. This provides a computational method for the study of Co-DEGs shared by multiple diseases.

4. Conclusions

In this paper, we propose an integrative matrix factorization method (SG-jNMF) for analyzing heterogeneous multiomics data. The method jointly projects multiomics data matrices into a common low-dimensional space. Two forms of SG-jNMF enable multiomics data to be analyzed from both the sample and feature perspectives. This integrative analysis method takes the local associations of the data into account and decreases the interference of noise and redundant information in heterogeneous multiomics data. Experimental results show that the new method is superior to existing methods in analyzing heterogeneous multiomics data. Another significant advantage of SG-jNMF is that it can flexibly handle multiple input datasets of various types: the input data can be different types of data (GE, DM, CNV, etc.) for the same disease or the same type of data for different diseases. We can use this method to identify Co-DEGs associated with a particular disease and to detect common Co-DEGs associated with several diseases. This provides an efficient computational method for biological and medical research. Next, we will use the correlations among Co-DEGs to build a gene coexpression network and further study the functions of gene modules and related pathways.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the grants from the National Natural Science Foundation of China, nos. 61902215 and 61702299.