Decipher key facts from large-scale biological data

The future of bioinformatics will rest on big data analysis and machine learning (data science) as the means to integrate and analyze data. Today's biology draws on a wide range of measurement technologies, such as next-generation sequencers, mass spectrometers, and imaging, each of which produces large amounts of omics data, and much of that data only makes sense when compared with existing big data. Cloud computing and its related technologies are becoming the only practical way to make data analysis at this scale feasible. In addition, machine learning, as represented by deep learning, is indispensable for extracting signal from noisy, multidimensional experimental data. We are developing cloud-native bioinformatics software that uses cloud technologies such as distributed computing (Apache Spark), container virtualization (Kubernetes/Docker), and cloud object storage to analyze large-scale cancer genome and single-cell data. We also aim to advance epitranscriptome analysis, which has often been overlooked until now, from the informatics side by applying deep learning to base modification analysis of nanopore sequencer data.