The objective of this short report is to reconsider the view of bioinformatics as merely a tool of experimental biological science. To do so, we present three examples showing how bioinformatics can itself be considered an experimental science. These examples illustrate how the development of theoretical biological models generates experimentally verifiable computational hypotheses, which must then be validated by experiments in vitro or in vivo.
What do we mean by "Bioinformatics"? The Collins dictionary defines it as "the branch of information science concerned with large databases of biochemical or pharmaceutical information." Despite the great advances that have recently occurred in information technology, the term Bioinformatics is still often reduced to the process of entering and storing biological information in a database. This parochial definition constrains the term and risks condemning it to oblivion, rather than reflecting the evolution of the definition proposed by Paulien Hogeweg in 1978 [1], modified by Luscombe et al. [2], and later revised by Huerta [3].
Bioinformatics is a science whose holistic way of understanding biology intermingles the health sciences with information technology, without rigid boundaries or a fixed definition. In practice, omic analyses can no longer be separated from the information technology that enables them, as evidenced by the growing body of journals devoted to the subject.
More formally, we suggest that Bioinformatics conceptualizes biology in terms of interactions too numerically taxing or complex to be organized or analyzed without applying information techniques (themselves derived from applied mathematics, computer science, and statistics). It thus allows the elucidation of relevant information associated with these interactions, furthering our understanding of their relationships and our general scientific knowledge. Numerical analyses, mathematical transformations, and extrapolations, applied in a scientifically systematic way to phenomena observed in or between macromolecules, other interacting agents, or populations, can therefore reveal biologically relevant information. The deduced or inferred information can then be submitted to experimental verification. Bioinformatics analyses are thus in themselves scientific experiments, since they generate hypotheses that can be experimentally validated or falsified.
Let us introduce three examples to show why Bioinformatics can be considered an experimental science, while remembering that experimental evidence is necessary to support the hypotheses it generates.
Firstly, consider the evolution from a traditional tool into a gateway to new experimental knowledge. Recall that one of the first challenges in Bioinformatics was PCR primer design. Programs were developed that encoded a set of observed rules of oligonucleotide behavior (for example [4]-[6]). These now commonplace programs went on to predict previously unobserved primer sets that allow amplification under isothermal conditions. Known as loop-mediated isothermal amplification (LAMP) [7], this is a rapid and highly sensitive technique with the advantage of requiring neither DNA extraction nor sophisticated equipment such as a thermocycler. It has been used, for example, to confirm diagnoses of visceral leishmaniasis [8] and malaria [9]. This example shows how new knowledge is obtained building on a priori Bioinformatics knowledge.
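To illustrate the kind of reasoning such programs encode, the sketch below screens candidate primers against a few thermodynamic and compositional rules. The Wallace-rule formula, the acceptance thresholds, and the candidate sequences are illustrative assumptions, not the actual criteria implemented in [4]-[6].

```python
# Minimal sketch of rule-based primer screening, in the spirit of early
# primer-design programs. Formula and thresholds are illustrative assumptions.

def melting_temp_wallace(primer: str) -> float:
    """Estimate Tm (deg C) with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

def gc_fraction(primer: str) -> float:
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)

def is_acceptable(primer: str,
                  tm_range=(52.0, 65.0),
                  gc_range=(0.40, 0.60),
                  max_homopolymer=4) -> bool:
    """Encode a few observed rules of oligonucleotide behaviour as filters."""
    tm_ok = tm_range[0] <= melting_temp_wallace(primer) <= tm_range[1]
    gc_ok = gc_range[0] <= gc_fraction(primer) <= gc_range[1]
    # Reject long single-base runs, which favour mispriming and hairpins.
    runs_ok = not any(b * (max_homopolymer + 1) in primer.upper() for b in "ACGT")
    return tm_ok and gc_ok and runs_ok

if __name__ == "__main__":
    candidates = ["ATGCGTACGTTAGCCGTA", "AAAAAAGTGTGTACGT", "GGCCGGATCCGTTAACGG"]
    for c in candidates:
        print(c, melting_temp_wallace(c), is_acceptable(c))
```

The point of such rule encoding is that, once the observed behavior is formalized, the program can propose primer sets that no one has yet tested, and those predictions are then confirmed or refuted at the bench.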
Secondly, let us discuss in silico inferences that are accepted as the most probable explanation of the diffusion of genetic material through a plant virus capsid, that of Cowpea Chlorotic Mottle Virus (CCMV). This is a complicated dynamic process studied through snapshots obtained from X-ray diffraction data; using nuclear magnetic resonance (NMR) to record the dynamics of the atoms is nearly impossible given the tremendous amount of data processing required. Isea et al. [10] hypothesized that genetic material is released spontaneously through the CCMV capsid without any energy input, and that the infectious mechanism arising from the genetic material is thus spontaneous. These results should be validated with experiments in vivo, much as in the recent work on predicting the human receptor specificity of H5N1 Influenza A viruses [11], where the authors developed a phylogenetic algorithm to identify candidate pandemic influenza viruses.
Finally, we consider data-driven modeling. Data-driven modeling departs from traditional physical or mathematical modeling: sets of point observations and time-series data are analyzed through one or more approaches drawn from the field of computational intelligence (a sub-branch of Artificial Intelligence that studies adaptive mechanisms to enable or facilitate intelligent behavior in complex and changing environments [12]). The empirical models obtained are then refined by inferring their parameters from the data through iteration cycles that can incorporate newly found data, derived from laboratory experiments that test the model against its own output.
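As a minimal, hypothetical sketch of such an iteration cycle, the example below fits an empirical logistic model to simulated point observations and re-estimates its parameters as each batch of "new" measurements arrives. The model choice, the simulated data, and the update loop are illustrative assumptions, not a procedure taken from [12] or [13].

```python
# Minimal sketch of a data-driven refinement cycle: fit an empirical model to
# observed time-series data, then re-fit as new measurements arrive.
# Model, data, and loop are illustrative assumptions.

import numpy as np
from scipy.optimize import curve_fit

def logistic(t, k, r, t0):
    """Empirical logistic model: carrying capacity k, rate r, midpoint t0."""
    return k / (1.0 + np.exp(-r * (t - t0)))

rng = np.random.default_rng(0)
true_params = (100.0, 0.8, 6.0)   # "ground truth" used only to simulate data

# Initial set of point observations (e.g. an early laboratory experiment).
t_obs = np.linspace(0.0, 6.0, 13)
y_obs = logistic(t_obs, *true_params) + rng.normal(0.0, 2.0, t_obs.size)

params = (50.0, 0.5, 5.0)  # initial guess for the empirical model
for cycle in range(3):
    # Infer the parameters from all data gathered so far.
    params, _ = curve_fit(logistic, t_obs, y_obs, p0=params, maxfev=10000)
    print(f"cycle {cycle}: k={params[0]:.1f}, r={params[1]:.2f}, t0={params[2]:.2f}")

    # "Newly found data": further measurements that test the model's own output.
    t_new = np.linspace(6.0 + 3.0 * cycle, 9.0 + 3.0 * cycle, 7)
    y_new = logistic(t_new, *true_params) + rng.normal(0.0, 2.0, t_new.size)
    t_obs = np.concatenate([t_obs, t_new])
    y_obs = np.concatenate([y_obs, y_new])
```

Each pass through the loop plays the role of one hypothesis-test cycle: the fitted model makes predictions, a new experiment produces data that confronts them, and the parameters are re-inferred from the enlarged data set.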
This way of inferring and validating multidimensional relationships and interactions is expected to have a great impact on knowledge generation in multiple areas, for example in the integration of metabolic pathways in mammalian cells such as hepatocytes [13].