A hybrid data-driven and knowledge-driven methodology for estimating the effect of completion parameters on the cumulative production of horizontal wells
Abstract
We present a new methodology for modeling the cumulative oil and gas production of horizontal wells in a shale play as a function of two well completion parameters, lateral length and proppant intensity, while also accounting for the location of the wells and their shut-in days. We show experimental results evaluating the predictive accuracy of this methodology on hold-out data and compare it to standard data-driven estimation procedures. Our approach is based on Generalized Additive Models (GAMs). The main advantage of GAMs is that they readily yield effect plots, which allow us to quantify the effect of each completion parameter and of the well location on production. Furthermore, GAMs make it possible to explicitly model the interaction between variables and to impose shape constraints based on physical knowledge. We have implemented and tested this methodology in R and compared its predictive accuracy against a variety of standard data-driven modeling procedures available in the caret package, using leave-one-out cross-validation. We present experimental results using data from a shale play with 152 horizontal wells that each have more than 12 months of cumulative production. Our experiments show that GAMs in most cases achieve better predictive accuracy than the standard data-driven estimation procedures available in caret, possibly because they allow us to explicitly model the interaction between input variables that are known to be physically related and to impose shape constraints based on known physical relationships between input variables and target variables. We conclude that it is possible to accurately estimate the effect of the completion parameters on the cumulative production when we ensure that the models obey physical constraints, i.e., that the cumulative production is a monotonically increasing function of lateral length and of proppant intensity.
We also obtain effect plots showing the "sweet-spot" effect, i.e., the relationship between the location (x, y) in the field and the cumulative production. The novelty of this hybrid data-driven and knowledge-driven methodology is that it quantifies the effect of completion parameters and location on well production by combining, in a principled way, data available from existing wells with prior expert knowledge of physical constraints.
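
As an illustration only, the kind of shape-constrained additive model described above can be sketched as follows. The notation is assumed for this sketch and is not taken from the paper: Q_i denotes the cumulative production of well i, \ell_i its lateral length, p_i its proppant intensity, (x_i, y_i) its location, and d_i its shut-in days. An identity link and a purely additive structure are assumed, and the interaction terms mentioned above are omitted because the abstract does not specify them.

```latex
% Illustrative sketch under assumed notation; not the paper's exact specification.
\begin{align}
  \mathbb{E}[Q_i] &= \beta_0 + f_1(\ell_i) + f_2(p_i) + f_3(x_i, y_i) + f_4(d_i), \\
  f_1'(\ell) &\ge 0 \quad \text{(production non-decreasing in lateral length)}, \\
  f_2'(p) &\ge 0 \quad \text{(production non-decreasing in proppant intensity)}.
\end{align}
```

In this form, a plot of each fitted smooth f_1 or f_2 against its argument corresponds to the effect plot of one completion parameter, and the fitted bivariate smooth f_3 over the field corresponds to the "sweet-spot" effect.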