publications
- Abdullah Karaaslanli, and Selin Aviyente, Dynamic Signed Graph Learning, In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
An important problem in graph signal processing (GSP) is to infer the topology of an unknown graph from a set of observations on the nodes of the graph, i.e. graph signals. Recently, graph learning (GL) approaches have been extended to learn dynamic graphs from temporal graph signals. However, existing work primarily focuses on unsigned graphs and cannot learn signed graphs, which are important data structures that can represent the similarity and dissimilarity of the nodes. In this paper, we propose a dynamic signed GL (dynSGL) method based on the assumptions that (i) at each time point signals are smooth with respect to the signed graph, i.e. signal values at two nodes connected with a positive (negative) edge are similar (dissimilar) and (ii) evolution of the graph structures is smooth across time. The performance of dynSGL is evaluated on simulated data and shown to have higher accuracy compared to static signed and dynamic unsigned GL techniques. Application of the proposed method to a financial dataset gives important insights into the time-varying changes in the interactions between stocks.
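The smoothness assumption (i) above can be illustrated with a minimal sketch. It assumes the usual Laplacian quadratic form tr(X^T L X) as the variation measure, evaluated separately on the positive and negative parts of a signed adjacency matrix; the function names and the toy graph are hypothetical and are not taken from the paper, whose exact objective may differ.

```python
import numpy as np

def laplacian(W):
    """Combinatorial Laplacian L = D - W of a nonnegative symmetric weight matrix."""
    return np.diag(W.sum(axis=1)) - W

def signed_variation(X, W_signed):
    """Variation of signals X over positive and negative edges (illustrative only).

    X        : (n_nodes, n_signals) array, one graph signal per column.
    W_signed : (n_nodes, n_nodes) symmetric signed adjacency with zero diagonal.
    Returns (tv_pos, tv_neg), where each value equals the sum over edges of
    |w_ij| * ||x_i - x_j||^2 restricted to positive or negative edges.
    """
    W_pos = np.maximum(W_signed, 0.0)   # weights of positive edges
    W_neg = np.maximum(-W_signed, 0.0)  # magnitudes of negative edges
    tv_pos = np.trace(X.T @ laplacian(W_pos) @ X)
    tv_neg = np.trace(X.T @ laplacian(W_neg) @ X)
    return tv_pos, tv_neg

# Toy graph: node 0 is positively linked to node 1 and negatively linked to node 2.
W = np.array([[0.0, 1.0, -1.0],
              [1.0, 0.0, 0.0],
              [-1.0, 0.0, 0.0]])
X = np.array([[1.0], [1.1], [-1.0]])  # a single signal on three nodes

tv_pos, tv_neg = signed_variation(X, W)
```

Under this assumed measure, a signal that fits the signed graph has small variation over positive edges (similar values) and large variation over negative edges (dissimilar values), which is the case for the toy signal here.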
- Abdullah Karaaslanli, Satabdi Saha, Tapabrata Maiti, and Selin Aviyente, Multiple Signed Graph Learning for Gene Regulatory Network Inference, In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
Many real-world data are represented through the relations between data samples, i.e., a graph structure. Although many datasets come with a pre-existing graph, there is still a large number of applications where the graph structure is not readily available. An essential task for such cases is graph learning (GL), which infers the graph structure from a set of graph signals. Existing GL techniques mostly focus on learning a single graph structure; however, samples are usually connected in multiple different ways. Furthermore, existing works can only handle unsigned graphs, while contemporary tasks require inference of signed graphs, which are better at representing similarity and dissimilarity of samples. In this paper, we propose a framework (mvSGL) for joint estimation of multiple related signed graphs. mvSGL optimizes the total variation of graph signals with respect to graphs while ensuring that the graphs are similar to each other through a consensus graph. mvSGL is employed in the inference of multiple gene regulatory networks (GRN) from single cell datasets that include multiple cell types. Performance evaluation using simulated and real datasets demonstrates the effectiveness of mvSGL in the inference of multiple related GRNs.
- Abdullah Karaaslanli, Satabdi Saha, Tapabrata Maiti, and Selin Aviyente, Kernelized multiview signed graph learning for single-cell RNA sequencing data, BMC bioinformatics, 2023
Background: Characterizing the topology of gene regulatory networks (GRNs) is a fundamental problem in systems biology. The advent of single cell technologies has made it possible to construct GRNs at finer resolutions than bulk and microarray datasets. However, cellular heterogeneity and sparsity of the single cell datasets render void the application of regular Gaussian assumptions for constructing GRNs. Additionally, most GRN reconstruction approaches estimate a single network for the entire data. This could cause potential loss of information when single cell datasets are generated from multiple treatment conditions/disease states. Results: To better characterize single cell GRNs under different but related conditions, we propose the joint estimation of multiple networks using multiple signed graph learning (scMSGL). The proposed method is based on recently developed graph signal processing (GSP) based graph learning, where GRNs and gene expressions are modeled as signed graphs and graph signals, respectively. scMSGL learns multiple GRNs by optimizing the total variation of gene expressions with respect to GRNs while ensuring that the learned GRNs are similar to each other through regularization with respect to a learned signed consensus graph. We further kernelize scMSGL with the kernel selected to suit the structure of single cell data. Conclusions: scMSGL is shown to have superior performance over existing state of the art methods in GRN recovery on simulated datasets. Furthermore, scMSGL successfully identifies well-established regulators in a mouse embryonic stem cell differentiation study and a cancer clinical study of medulloblastoma.
- Abdullah Karaaslanli, and Selin Aviyente, Simultaneous Graph Signal Clustering and Graph Learning, In International Conference on Machine Learning (ICML), 2022
Graph learning (GL) aims to infer the topology of an unknown graph from a set of observations on its nodes, i.e., graph signals. While most of the existing GL approaches focus on homogeneous datasets, in many real world applications, data is heterogeneous, where graph signals are clustered and each cluster is associated with a different graph. In this paper, we address the problem of learning multiple graphs from heterogeneous data by formulating an optimization problem for joint graph signal clustering and graph topology inference. In particular, our approach extends spectral clustering by partitioning the graph signals not only based on their pairwise similarities but also their smoothness with respect to the graphs associated with the clusters. The proposed method also learns the representative graph for each cluster using the smoothness of the graph signals with respect to the graph topology. The resulting optimization problem is solved with an efficient block-coordinate descent algorithm and results on simulated and real data indicate the effectiveness of the proposed method.
- Selin Aviyente, and Abdullah Karaaslanli, Explainability in Graph Data Science: Interpretability, replicability, and reproducibility of community detection, IEEE Signal Processing Magazine, 2022
In many modern data science problems, data are represented by a graph (network), e.g., social, biological, and communication networks. Over the past decade, numerous signal processing and machine learning (ML) algorithms have been introduced for analyzing graph structured data. With the growth of interest in graphs and graph-based learning tasks in a variety of applications, there is a need to explore explainability in graph data science. In this article, we aim to approach the issue of explainable graph data science, focusing on one of the most fundamental learning tasks, community detection, as it is usually the first step in extracting information from graphs. A community is a dense subnetwork within a larger network that corresponds to a specific function. Despite the success of different community detection methods on synthetic networks with strong modular structure, much remains unknown about the quality and significance of the outputs of these algorithms when applied to real-world networks with unknown modular structure. Inspired by recent advances in explainable artificial intelligence (AI) and ML, in this article, we present methods and metrics from network science to quantify three different aspects of explainability, i.e., interpretability, replicability, and reproducibility, in the context of community detection.
- Abdullah Karaaslanli, Satabdi Saha, Selin Aviyente, and Tapabrata Maiti, scSGL: kernelized signed graph learning for single-cell gene regulatory network inference, Bioinformatics, 2022
Motivation: Elucidating the topology of gene regulatory networks (GRNs) from large single-cell RNA sequencing datasets, while effectively capturing their inherent cell-cycle heterogeneity and dropouts, is currently one of the most pressing problems in computational systems biology. Recently, graph learning (GL) approaches based on graph signal processing have been developed to infer graph topology from signals defined on graphs. However, existing GL methods are not suitable for learning signed graphs, a characteristic feature of GRNs, which are capable of accounting for both activating and inhibitory relationships in the gene network. They are also incapable of handling the high proportion of zero values present in the single cell datasets.
Results: To this end, we propose a novel signed GL approach, scSGL, that learns GRNs based on the assumption of smoothness and non-smoothness of gene expressions over activating and inhibitory edges, respectively. scSGL is then extended with kernels to account for non-linearity of co-expression and for effective handling of highly occurring zero values. The proposed approach is formulated as a non-convex optimization problem and solved using an efficient ADMM framework. Performance assessment using simulated datasets demonstrates the superior performance of kernelized scSGL over existing state of the art methods in GRN recovery. The performance of scSGL is further investigated using human and mouse embryonic datasets.
- Abdullah Karaaslanli, Meiby Ortiz-Bouza, Tamanna TK Munia, and Selin Aviyente, Community Detection in Multi-frequency EEG Networks, arXiv, 2022
Objective: In recent years, the functional connectivity of the human brain has been studied with graph theoretical tools. One such approach is community detection, which is fundamental for uncovering the localized networks. Existing methods focus on networks constructed from a single frequency band while ignoring the multi-frequency nature of functional connectivity. Therefore, there is a need to study multi-frequency functional connectivity to be able to capture the full view of neuronal connectivity.
Methods: In this paper, we use multilayer networks to model multi-frequency functional connectivity. In the proposed model, each layer corresponds to a different frequency band. We then extend the definition of modularity to multilayer networks to develop a new community detection algorithm.
Results: The proposed approach is applied to electroencephalogram data collected during a study of error monitoring in the human brain. The differences between the community structures within and across different frequency bands for two response types, i.e. error and correct, are studied.
Conclusion: The results indicate that following an error response, the brain organizes itself to form communities across frequencies, in particular between theta and gamma bands while a similar cross-frequency community formation is not observed for the correct response. Moreover, the community structures detected for the error response were more consistent across subjects compared to the community structures for correct response.
Significance: The multi-frequency functional connectivity network models combined with multilayer community detection algorithms can reveal changes in cross-frequency functional connectivity network formation across different tasks and response types.
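For readers unfamiliar with the quantity that the Methods above extend, the following is a minimal sketch of standard single-layer (Newman-Girvan) modularity for an undirected graph. It is not the multilayer definition developed in the paper; the function name and the toy graph are hypothetical.

```python
import numpy as np

def modularity(A, labels):
    """Single-layer Newman-Girvan modularity of a partition (illustrative only).

    A      : (n, n) symmetric adjacency matrix of an undirected graph.
    labels : length-n integer array, labels[i] is the community of node i.
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                          # node degrees (strengths)
    two_m = k.sum()                            # twice the total edge weight
    B = A - np.outer(k, k) / two_m             # modularity matrix
    same = labels[:, None] == labels[None, :]  # True where nodes share a community
    return (B * same).sum() / two_m

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

q_split = modularity(A, np.array([0, 0, 0, 1, 1, 1]))  # one community per triangle
q_single = modularity(A, np.zeros(6, dtype=int))       # all nodes in one community
```

Splitting the toy graph into its two triangles gives a modularity of 5/14, while placing all nodes in one community gives 0, so the partition that follows the dense subnetworks scores higher.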
- Abdullah Karaaslanli, and Selin Aviyente, Graph Learning From Noisy and Incomplete Signals on Graphs, In 2021 IEEE Statistical Signal Processing Workshop (SSP), 2021
Learning the graph structure underlying observed graph signals is important in many graph signal processing (GSP) applications. This problem has been extensively addressed as graph Laplacian learning with the constraint that the graph signals have smooth variations on the resulting topology. The current approaches focus primarily on the case that the signals are observed across all nodes and possibly corrupted by additive Gaussian noise. In this paper, we propose a general framework for graph learning where the graph signal is partially observed and corrupted by sparse outliers in addition to Gaussian noise. We present a general optimization framework that addresses this problem and show how this formulation encapsulates a variety of problems in GSP including Laplacian learning and graph regularized low-rank matrix completion. The proposed optimization is solved with ADMM and the resulting algorithms are evaluated on both simulated graphs with different topology and real world graph-based data clustering.
- Abdullah Karaaslanli, and Selin Aviyente, Community detection in dynamic networks: Equivalence between stochastic blockmodels and evolutionary spectral clustering, IEEE Transactions on Signal and Information Processing over Networks, 2021
Community detection aims to identify densely connected groups of nodes in complex networks. Although a variety of methods have been proposed for community detection, the relationship between them is not well understood. Recently, researchers have shown the equivalence between modularity optimization and likelihood maximization in stochastic block models (SBMs) for static networks. Showing this equivalence is important for both understanding the different community detection methods and selecting the hyperparameters in the different algorithms in a more principled way. In this paper, we extend this equivalence to dynamic community detection algorithms. In particular, we show the equivalence of evolutionary spectral clustering to a variant of the dynamic stochastic blockmodel. For this purpose, we first introduce a novel dynamic SBM where the evolution of communities over time is modeled with pairwise Markov random fields. We then show that the log-posterior of the proposed model is equivalent to the quality function of evolutionary spectral clustering. This equivalence is used to determine the forgetting factor in evolutionary spectral clustering and to develop two new algorithms for dynamic community detection. Compared to original evolutionary spectral clustering, the forgetting factor is time-dependent and derived directly from the parameters of the proposed dynamic SBM. The proposed algorithms are shown to be superior to state-of-the-art dynamic community detection methods for both simulated and real-world dynamic networks.
- Abdullah Karaaslanli, Satabdi Saha, Selin Aviyente, and Tapabrata Maiti, Multiview Graph Learning for single-cell RNA sequencing data, bioRxiv, 2021
Characterizing the underlying topology of gene regulatory networks is one of the fundamental problems of systems biology. Ongoing developments in high throughput sequencing technologies have made it possible to capture the expression of thousands of genes at the single cell resolution. However, inherent cellular heterogeneity and high sparsity of the single cell datasets render void the application of regular Gaussian assumptions for constructing gene regulatory networks. Additionally, most algorithms aimed at single cell gene regulatory network reconstruction estimate a single network, ignoring group-level (cell-type) information present within the datasets. To better characterize single cell gene regulatory networks under different but related conditions, we propose the joint estimation of multiple networks using multiview graph learning (mvGL). The proposed method is developed based on recent works in graph signal processing (GSP) for graph learning, where graph signals are assumed to be smooth over the unknown graph structure. Graphs corresponding to the different datasets are regularized to be similar to each other through a learned consensus graph. We further kernelize mvGL with the kernel selected to suit the structure of single cell data. An efficient algorithm based on prox-linear block coordinate descent is used to optimize mvGL. We study the performance of mvGL using synthetic data generated with a diverse set of parameters. We further show that mvGL successfully identifies well-established regulators in a mouse embryonic stem cell differentiation study and a cancer clinical study of medulloblastoma.
- Abdullah Karaaslanli, and Selin Aviyente, Constrained spectral clustering for dynamic community detection, In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
Networks are useful representations of many systems with interacting entities, such as social, biological and physical systems. Characterizing the meso-scale organization, i.e. the community structure, is an important problem in network science. Community detection aims to partition the network into sets of nodes that are densely connected internally but sparsely connected to other dense sets of nodes. Current work on community detection mostly focuses on static networks. However, many real world networks are dynamic, i.e. their structure and properties change with time, requiring methods for dynamic community detection. In this paper, we propose a new stochastic block model (SBM) for modeling the evolution of community membership. Unlike existing SBMs, the proposed model allows each community to evolve at a different rate. This new model is used to derive a maximum a posteriori estimator for community detection, which can be written as a constrained spectral clustering problem. In particular, the transition probabilities for each community modify the graph adjacency matrix at each time point. This formulation provides a relationship between statistical network inference and spectral clustering for dynamic networks. The proposed method is evaluated on both simulated and real dynamic networks.
- Abdullah Karaaslanli, and Selin Aviyente, Strength Adjusted Multilayer Spectral Clustering, In 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), 2019
Network representations are useful for describing the structure of a variety of complex systems. One of the important problems in the analysis of such complex systems is inferring the network structure from the available data. Community detection has been a common way to infer network topology. In recent years, network representation has been extended to describe multi-relational or multiview data sets resulting in multilayer networks. These types of networks arise in different applications including social, ecological, biological and brain networks. Even though different graph theoretic metrics have been extended to study multilayer networks, approaches for network topology inference, e.g. community detection, have been limited. In this paper, we introduce a multilayer community detection method based on a modified graph cut definition and propose a corresponding algorithm. The proposed multilayer community detection method is evaluated on both simulated network models and a multilayer brain network, where the layers correspond to different frequency bands.
- Göktekin Durusoy, Abdullah Karaaslanli, Demet Yüksel Dal, Zerrin Yıldırım, and Burak Acar, Multi-modal brain tensor factorization: Preliminary results with AD patients, In International Workshop on Connectomics in Neuroimaging, 2018
Global brain network parameters suffer from low classification performance and fail to provide insight into neurodegenerative diseases. Besides, the variability in connectivity definitions poses a challenge. We propose to represent multi-modal brain networks over a population with a single 4D brain tensor (B) and factorize B to get a lower dimensional representation per case and per modality. We used 7 known functional networks as the canonical network space to get a 7D representation. In a preliminary study over a group of 20 cases, we assessed this representation for classification. We used 6 different connectivity definitions (modalities). Linear discriminant analysis results in 90-95% accuracy in binary classification. The assessment of the canonical coordinates reveals the Salience subnetwork to be the most powerful in classification consistently over all connectivity definitions. The method can be extended to include functional networks and further be used to search for discriminating subnetworks.