Title: Managing workflows on top of a cloud computing orchestrator for using heterogeneous environments on e-Science
Authors: Abel Carrión; Miguel Caballer; Ignacio Blanquer; Nelson Kotowski; Rodrigo Jardim; Alberto Martin Rivera Dávila
Addresses: GRyCAP (Grupo de Grid y Computación de Altas Prestaciones), Instituto de Instrumentación para Imagen Molecular (I3M), Centro mixto CSIC - Universitat Politècnica de València - CIEMAT, Camino de Vera s/n, 46022 Valencia, Spain ' GRyCAP (Grupo de Grid y Computación de Altas Prestaciones), Instituto de Instrumentación para Imagen Molecular (I3M), Centro mixto CSIC - Universitat Politècnica de València - CIEMAT, Camino de Vera s/n, 46022 Valencia, Spain ' GRyCAP (Grupo de Grid y Computación de Altas Prestaciones), Instituto de Instrumentación para Imagen Molecular (I3M), Centro mixto CSIC - Universitat Politècnica de València - CIEMAT, Camino de Vera s/n, 46022 Valencia, Spain ' Computational and Systems Biology Laboratory, Oswaldo Cruz Institute, Rio de Janeiro, RJ 21040-360, Brazil ' Computational and Systems Biology Laboratory, Oswaldo Cruz Institute, Rio de Janeiro, RJ 21040-360, Brazil ' Computational and Systems Biology Laboratory, Oswaldo Cruz Institute, Rio de Janeiro, RJ 21040-360, Brazil
Abstract: Scientific workflows (SWFs) are widely used to model processes in e-Science. SWFs are executed by means of workflow management systems (WMSs), which orchestrate the workload on top of computing infrastructures. The advent of cloud computing has opened the door to using on-demand infrastructures to complement or even replace local infrastructures. However, new issues have arisen, such as the integration of hybrid resources or the compromise between infrastructure reutilisation and elasticity. In this article, we present an ad hoc solution for managing workflows that exploits the capabilities of cloud orchestrators to deploy resources on demand according to the workload and to combine heterogeneous cloud providers (such as on-premise clouds and public clouds) with traditional infrastructures (clusters) to minimise costs and response time. The work does not propose yet another WMS but demonstrates the benefits of integrating cloud orchestration when running complex workflows. The article presents several configuration experiments with a realistic comparative genomics workflow called Orthosearch, in which the memory-intensive workload is migrated to public infrastructures while other blocks of the experiment keep running locally. The article computes running time and cost for these configurations and suggests best practices.
Keywords: cloud computing; cloud orchestrator; comparative genomics; e-Science; multi-platform; workflow; workflow management systems.
DOI: 10.1504/IJWGS.2017.087326
International Journal of Web and Grid Services, 2017 Vol.13 No.4, pp.375 - 402
Received: 21 Sep 2016
Accepted: 04 Oct 2016
Published online: 13 Oct 2017