Learning from Metadata: A Fuzzy Token Matching Based Configuration File Discovery Approach
Abstract
Discovering configuration files is a prerequisite for successfully migrating a workload to the cloud. Large and complicated file systems, the considerable variance among configuration files, and the multiple presence of configuration items make this discovery very difficult. Traditional approaches rely heavily on experts to compose software-specific scripts or rules for discovering configuration files, which is expensive and labor-intensive. In this paper, we propose a novel learning-based approach named MetaConf that converts configuration file discovery into a supervised file classification task, using file metadata as learning features, so that discovery can be conducted automatically, efficiently, and independently of domain expertise. We evaluate the approach with extensive real-world case studies, and the experimental results show that it is effective and outperforms our baseline method.
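To make the idea of classifying files from metadata concrete, the following is a minimal illustrative sketch, not the MetaConf implementation. The abstract does not specify the features or the classifier, so everything here is an assumption: the helper names (`tokenize`, `train`, `predict`), the use of path tokens and file size as metadata, the 1 MB size cutoff, the 0.8 similarity threshold, and the use of Python's `difflib.SequenceMatcher` as the fuzzy matcher.

```python
import re
from difflib import SequenceMatcher


def tokenize(path):
    """Split a file path into lowercase alphanumeric tokens (hypothetical helper)."""
    return [t for t in re.split(r"[^a-z0-9]+", path.lower()) if t]


def fuzzy_score(token, vocabulary):
    """Return the best similarity ratio between a token and any vocabulary token."""
    return max(
        (SequenceMatcher(None, token, v).ratio() for v in vocabulary),
        default=0.0,
    )


def train(labeled):
    """Learn a vocabulary from (path, size_bytes, is_config) examples.

    Assumption: keep tokens seen in configuration-file paths but never in
    other paths. A real system would likely use a proper classifier.
    """
    positive, negative = set(), set()
    for path, _size, is_config in labeled:
        (positive if is_config else negative).update(tokenize(path))
    return positive - negative


def predict(path, size_bytes, vocabulary, threshold=0.8, max_size=1_000_000):
    """Flag a file as a likely configuration file using metadata only."""
    best = max((fuzzy_score(t, vocabulary) for t in tokenize(path)), default=0.0)
    # Assumed rule: a close token match plus a small file size.
    return best >= threshold and size_bytes < max_size


# Tiny made-up training set, for illustration only.
labeled_examples = [
    ("/etc/httpd/conf/httpd.conf", 12_000, True),
    ("/opt/app/config/server.xml", 4_000, True),
    ("/usr/bin/python3", 5_000_000, False),
    ("/var/log/app/server.log", 900_000, False),
]
vocabulary = train(labeled_examples)
```

With this toy vocabulary, a path such as `/srv/tomcat/configs/web.properties` would be flagged because the token `configs` closely matches the learned token `config`, even though no token matches exactly. The file contents are never read, which is the point of a metadata-only approach.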