DSServe - Data Science using Serverless
Abstract
AI Applications uses various data science tools such as Jupyter notebook to prescribe a series of steps, commonly referred as workflow, for building AI Solutions. The steps in workflow can be as simple as loading the data from remote storage, visualize the data for better understanding or conducting data quality study, or it can be as complex as generating features for modeling, best model discovery processes, etc. Clearly, different steps of the data science workflow has varying requirement of compute resources. Moreover, the execution of steps in workflow are Adhoc and Subjective. With wider availability of various Serverless technology, in this paper, we demonstrate a generalized framework that can be used to provide on demand scale out capability for the Data Science Workflow. In particular, we selected the most common AI operation, namely Automatic Model Selection, as an example to demonstrate benefits of serverless computing. We conducted a detailed experimental results using IBM Code Engine technology to validate the benefits of our proposed approach.