Elucidation of Developmental Dynamics in Nematodes

Research Summary (CREST Project)

In a joint project with RIKEN and the Kyoto University Graduate School of Medicine, we are developing data-driven research methods for elucidating the mechanisms by which multicellular organisms develop, using nematodes as a model system.

Owing to rapid technological progress in recent years, the life sciences are quickly accumulating data that measure, in space and time, the dynamics of cells and individuals under conditions in which various genes are inactivated. Analyzing such data makes system-level analyses of individual development possible, such as deriving the causal relationships between the various morphological and biochemical characteristics expressed in the course of development, or identifying the groups of genes involved in the mechanisms behind these causal relationships. Data-driven research methods are expected to become a major driving force in the life sciences of the 21st century. In this project, we aim to develop such methods ahead of their expected adoption as a major strategy in future research on the generation and regeneration of mammalian and human stem cells, and to create an environment that uses visualization techniques for big data to encourage scientific discoveries that contribute to paradigm shifts.

To realize the above goals, our laboratory is conducting research aimed at “building a visual analysis environment for supporting the discovery of important causal relationships in the scientific method using big data from basic life sciences.”

We plan to research and develop coarse-graining techniques that promote the discovery of latent variables in causal graphs created from extremely large datasets acquired in both dry and wet environments. Coarse graining means taking the average of a scalar quantity over a particular unit and using this average value to control the level of detail of that unit. The scalar quantities are often the results of statistical analyses, and we use statistical analysis techniques as one way of obtaining them. Biological phenomena are by nature often hierarchically structured, which makes coarse graining an appropriate method for analyzing living things. To help biologists understand the whole picture of nematode embryogenesis, we will construct an environment for interactively manipulating the multiple hierarchical levels (layers) obtained through coarse graining.
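The idea of coarse graining by averaging can be illustrated with a minimal sketch. It assumes a one-dimensional list of scalar values and a fixed grouping factor; the function names coarse_grain and build_levels are hypothetical and are not taken from the project's software.

```python
def coarse_grain(values, factor):
    """Average consecutive groups of `factor` values (block averaging)."""
    # Ignore an incomplete trailing group so every average uses `factor` values.
    usable = (len(values) // factor) * factor
    return [sum(values[i:i + factor]) / factor for i in range(0, usable, factor)]


def build_levels(values, factor=2):
    """Return the hierarchy of levels, from the finest to the coarsest."""
    levels = [list(values)]
    # Keep averaging until the coarsest level is too short to group again.
    while len(levels[-1]) >= factor:
        levels.append(coarse_grain(levels[-1], factor))
    return levels


# Hypothetical scalar values measured on four fine-grained units.
levels = build_levels([1, 3, 5, 7], factor=2)
for depth, level in enumerate(levels):
    print(depth, level)
```

Each entry of levels is one layer of detail, so an interactive tool could switch between layers by choosing an index into this list.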

To support data-driven modeling using big data, as well as the integrated analysis of biochemical and trait expression networks, we will construct an environment for switching between display spaces, such as array spaces and entity spaces, and for interactively changing the visualized target section or similar elements. To borrow the words of the French scientist Louis Pasteur (from his inaugural lecture at the University of Lille in 1854), scientific discoveries come only to les esprits préparés (the prepared minds). We aim to use the power of information science to realize the rich research environment that these words call for.

Furthermore, we are designing visualization techniques to search for causal relationships from the correlation between the spatial density of microRNAs in cells and the spatial density of the related molecules.

Research Achievements

To improve the readability of dense graphs with a large number of edges, such as causal relationship networks with phenotype characteristics, we developed a new edge concentration algorithm that reduces the number of edge crossings when such a graph is visualized. Applied to a causal relationship network with phenotype characteristics, it reduced the number of edge crossings by approximately 53%.

Achievement Details

Causal relationship networks with phenotype characteristics must be visualized in an easy-to-understand manner if new knowledge is to be discovered from them. A hierarchical graph layout is a natural choice for visualizing the causal relationships between time-varying feature quantities; however, because a causal relationship network is a dense graph with a large number of edges, such a layout produces a large number of edge crossings. Edge crossings not only impair the aesthetics of the visualization result but are also known to make the result harder to read. The challenge was therefore to eliminate the edge crossings that appear in dense hierarchical graphs and to produce highly readable visualization results.

Fig. 1 Visualization results of a causal relationship network with phenotype characteristics using a hierarchical graph layout (left, before application of edge concentration; right, after application of edge concentration)

We focused on edge concentration as a method for removing such edge crossings. Edge concentration extracts the bicliques (complete bipartite subgraphs) contained in the hierarchical graph and replaces each of them with a centralized node, which eliminates the crossings among the replaced edges. However, existing edge concentration algorithms have not adequately eliminated edge crossings in large-scale graphs such as causal relationship networks with phenotype characteristics. We therefore developed a new edge concentration algorithm based on minimizing the number of edges that remain after edge concentration, and applied it to such networks. Compared with existing methods, the proposed method eliminates edge crossings effectively even for large-scale graphs and also requires less computation time. Figure 1 shows the visualization of a causal relationship network with phenotype characteristics; the image on the left shows the result before edge concentration, and the one on the right shows the result after edge concentration. Edge concentration reduced the number of edge crossings from 8,663 to 4,035, a removal rate of approximately 53%. This method has been acknowledged as a valuable contribution to the field of graph drawing, and was described in a paper published in IEEE Transactions on Visualization and Computer Graphics.
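The replacement step of edge concentration can be illustrated on a toy two-layer graph. The sketch below is not the proposed algorithm: it assumes the biclique is already known, whereas choosing which bicliques to extract so that the number of remaining edges is minimized is the hard part addressed by the published method. The names count_crossings and concentrate and the node labels are hypothetical. The last line recomputes the removal rate from the crossing counts reported above.

```python
from itertools import combinations, product


def count_crossings(edges, upper_order, lower_order):
    """Count crossings between two adjacent layers drawn in the given orders."""
    up = {node: i for i, node in enumerate(upper_order)}
    low = {node: i for i, node in enumerate(lower_order)}
    # Keep only the edges that run from the upper layer to the lower layer.
    placed = [(up[u], low[v]) for u, v in edges if u in up and v in low]
    # Two edges cross when their endpoints appear in opposite orders.
    return sum(1 for (u1, v1), (u2, v2) in combinations(placed, 2)
               if (u1 - u2) * (v1 - v2) < 0)


def concentrate(edges, tops, bottoms, name="c0"):
    """Replace the biclique tops x bottoms with one concentrator node."""
    biclique = set(product(tops, bottoms))
    if not biclique <= set(edges):
        raise ValueError("tops x bottoms is not a biclique of the graph")
    kept = [e for e in edges if e not in biclique]
    # |tops| * |bottoms| edges become |tops| + |bottoms| edges through `name`.
    return kept + [(t, name) for t in tops] + [(name, b) for b in bottoms]


tops = ["a1", "a2", "a3"]
bottoms = ["b1", "b2", "b3"]
edges = list(product(tops, bottoms))  # complete bipartite graph with 9 edges

before = count_crossings(edges, tops, bottoms)
merged = concentrate(edges, tops, bottoms)
# The concentrator sits in its own layer between the two original layers.
after = (count_crossings(merged, tops, ["c0"])
         + count_crossings(merged, ["c0"], bottoms))
print(len(edges), before, len(merged), after)

# Removal rate for the crossing counts reported for Figure 1.
removal_rate = (8663 - 4035) / 8663
print(f"{removal_rate:.1%}")
```

In this toy case the nine edges of the biclique collapse to six edges through the concentrator node, and every crossing among them disappears. Real networks contain many overlapping bicliques, which is why the selection criterion matters.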

To search for causal relationships from correlations between the spatial densities of mRNA and related molecules in cells, it is important to analyze the data along the intracellular flow field. For this purpose, we developed a system that extracts trajectory lines from the flow field, samples the scalar data along them, and supports model creation using the samples. In addition, we developed a method for accelerating a particle-rendering technique that offers excellent expressiveness and scalability, so that multiple density fields can be combined and visualized for analysis in an easily understandable manner. Specifically, we developed an algorithm that appropriately changes the particle radius according to the opacity set by the user.
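Extracting a trajectory line and sampling scalar data along it can be sketched as follows. This is a minimal illustration, not the developed system: it assumes a two-dimensional flow field and a density field given as analytic Python functions, and it uses a midpoint integration scheme chosen only for brevity. The names trace, sample, rotation, and density are hypothetical.

```python
def trace(velocity, start, step=0.1, n_steps=50):
    """Integrate one trajectory line through a 2-D flow field (midpoint rule)."""
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        # Evaluate the velocity at the midpoint of a trial half step.
        vx, vy = velocity(x, y)
        mx, my = x + 0.5 * step * vx, y + 0.5 * step * vy
        vx, vy = velocity(mx, my)
        x, y = x + step * vx, y + step * vy
        path.append((x, y))
    return path


def sample(scalar, path):
    """Sample a scalar field at every point of a trajectory line."""
    return [scalar(x, y) for x, y in path]


def rotation(x, y):
    """Hypothetical flow field: rigid rotation around the origin."""
    return (-y, x)


def density(x, y):
    """Hypothetical scalar field: squared distance from the origin."""
    return x * x + y * y


path = trace(rotation, start=(1.0, 0.0))
samples = sample(density, path)
print(len(path), min(samples), max(samples))
```

Because this test density depends only on the distance from the rotation center, the samples along the trajectory stay close to their initial value. For measured fields, the sampled series from two density fields along the same trajectory could be compared to look for candidate relationships.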


Comprehensive Visualization

Causal Relationship Visualization