
PROJECT 5
UNLEASHING BIG DATA
A Deep Learning Framework for High-Fidelity Signal Processing in DAS
Distributed Acoustic Sensing (DAS) is used in numerous scientific and engineering applications, including reservoir engineering, earthquake detection, subsurface imaging, and industrial and traffic monitoring. Despite its broad utility, DAS remains underutilized in many of these industries, particularly for analyzing and interpreting microseismic events from sources such as hydraulic fracturing in oil/gas or geothermal operations and blasting in mining. DAS reaches meter-scale spatial resolution with sampling rates up to 10 kHz over tens of kilometers of fiber. Traditional sensing systems (nodal arrays, geophones, seismometers) cannot always be deployed extensively because of space constraints, land access issues, extreme temperatures, or power limitations. In contrast, DAS allows for long-term time-lapse monitoring, deployment at formerly inaccessible sites (e.g., urban areas, high-temperature settings, and locations with limited power or land access), and measurements that are relatively insensitive to electromagnetic noise. Within the last five years, applications have grown in near-surface geophysics for engineering, infrastructure, and environmental studies, particularly those requiring long-term monitoring.

In the past, DAS was primarily used for short-duration active-source experiments; it is now increasingly used in long-term monitoring experiments. However, collecting data at relatively high sample rates for long durations produces far larger data volumes per experiment than traditional seismic experiments, which makes DAS data storage and analysis challenging. For example, a sample of just nine experiments (a mix of lower-rate and higher-rate experiments) accumulated roughly 800 TB of data. More efficient computational methods are therefore becoming increasingly crucial to the analysis, and modifications to existing algorithms are necessary. With data volumes of this size, traditional human review of entire datasets is no longer feasible.
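To give a sense of scale, the raw data rate of a single interrogator can be estimated from the channel count and sampling rate. The sketch below is a back-of-the-envelope calculation only: the 10,000-channel array (for example, 10 km of fiber at 1 m channel spacing), the 4-byte sample size, and the function name are illustrative assumptions, not figures taken from the experiments mentioned above.

```python
def das_volume_tb(n_channels, sample_rate_hz, duration_days, bytes_per_sample=4):
    """Estimate raw (uncompressed) DAS data volume in terabytes (1 TB = 1e12 bytes).

    All arguments are hypothetical acquisition parameters chosen by the caller.
    """
    seconds = duration_days * 86_400
    total_bytes = n_channels * sample_rate_hz * bytes_per_sample * seconds
    return total_bytes / 1e12


# Assumed example: 10,000 channels sampled at 10 kHz for one day.
tb_per_day = das_volume_tb(n_channels=10_000, sample_rate_hz=10_000, duration_days=1)
print(f"{tb_per_day:.2f} TB per day")  # prints "34.56 TB per day"
```

Under these assumptions a single high-rate deployment would pass 800 TB in under a month, which is why lower sampling rates, decimation, or compression are typically needed for long-term monitoring.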
Currently, a major challenge in DAS analysis of microseismic events is that the events are unlabeled, so advanced machine-learning techniques are needed to generate useful data annotations.
Building on a semi-supervised learning framework for earthquake data, we aim to adapt and refine this approach for microseismic event detection in oil/gas and geothermal environments. The method involves generating pseudo picks using a pre-trained model, filtering out false picks, and training a new model on the refined data. While promising, this methodology raises several key issues when applied to large recordings of microseismic events. To address these limitations, we propose a modified semi-supervised learning pipeline. In the first step, rather than using PhaseNet, we employ PhaseNet-DAS as the pre-trained model for phase picking in DAS data. We integrate an Earthquake Transformer model as a global detector for identifying microseismic events and use Akaike Information Criterion (AIC) results to refine the arrival times. We then retain the Gaussian Mixture Model Associator (GaMMA) to filter false picks. Finally, we diverge from the U-Net used in previous efforts and explore alternative models such as Transformers and Convolutional Networks (ConvNets) to train a new phase-picking model.
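The AIC refinement step can be illustrated with a minimal sketch. The text above does not specify which AIC formulation is intended; the code assumes the variance-based form commonly used for onset picking, in which a window around a preliminary pick is split at every candidate sample k and the split that minimizes AIC(k) = k log(var(x[:k])) + (N - k - 1) log(var(x[k:])) is taken as the arrival. The function name `aic_pick` and the synthetic trace are hypothetical.

```python
import numpy as np


def aic_pick(window):
    """Return the sample index that minimizes a variance-based AIC.

    Assumed formulation: the window is modeled as two stationary segments
    (noise, then signal), and the best split point is the onset estimate.
    """
    x = np.asarray(window, dtype=float)
    n = x.size
    if n < 5:
        raise ValueError("window too short for an AIC pick")
    tiny = np.finfo(float).tiny  # guards against log(0) on flat segments
    aic = np.full(n, np.inf)
    for k in range(2, n - 2):
        var_before = np.var(x[:k])
        var_after = np.var(x[k:])
        aic[k] = k * np.log(var_before + tiny) + (n - k - 1) * np.log(var_after + tiny)
    return int(np.argmin(aic))


# Synthetic example: low-amplitude noise, then a higher-amplitude arrival
# starting at sample 300.
rng = np.random.default_rng(0)
noise = 0.1 * rng.standard_normal(300)
signal = 1.0 * rng.standard_normal(200)
trace = np.concatenate([noise, signal])

pick = aic_pick(trace)
print("AIC pick at sample", pick)  # expected to fall close to sample 300
```

In a pipeline like the one described, such a picker would be applied to a short window centered on each pseudo pick, so the minimum reflects the local onset rather than a later, stronger phase.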
The proposed work is iterative and scalable. We anticipate that this recursive methodology will yield a model capable of detecting reservoir-related seismic events more accurately than existing approaches. This work will contribute to expanding the use of DAS in energy activity monitoring, advancing the field by adapting state-of-the-art machine learning models to handle unlabeled data.
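The iterative structure of the pipeline can be sketched as a loop over pseudo-labeling rounds. This is a structural outline only: every callable below is a hypothetical placeholder for a component named above (the pre-trained picker, the global detector, AIC refinement, the associator, and model training), and none of them corresponds to the actual API of PhaseNet-DAS, Earthquake Transformer, or GaMMA. The toy stand-ins exist solely to make the sketch runnable.

```python
def self_training(recordings, picker, detector, refine, associate, train, n_rounds=2):
    """Run n_rounds of pseudo-labeling and retraining; return the final picker."""
    for _ in range(n_rounds):
        labelled = []
        for rec in recordings:
            if not detector(rec):            # global event detection
                continue
            picks = [refine(rec, t) for t in picker(rec)]  # pseudo picks, refined
            kept = associate(picks)          # drop picks judged to be false
            if kept:
                labelled.append((rec, kept))
        picker = train(labelled)             # new model replaces the old picker
    return picker


# Toy stand-ins (assumptions for illustration only).
recordings = [(0.0, 0.1, 0.9, 0.8), (0.0, 0.0, 0.1, 0.0)]


def toy_picker(rec):
    return [rec.index(max(rec)), -1]         # one plausible pick, one false pick


def toy_detector(rec):
    return max(rec) > 0.5


def toy_refine(rec, t):
    return t                                 # no-op refinement


def toy_associate(picks):
    return [t for t in picks if t >= 0]


def toy_train(labelled):
    table = dict(labelled)                   # "model" = lookup of retained picks
    return lambda rec: table.get(rec, [])


new_picker = self_training(
    recordings, toy_picker, toy_detector, toy_refine, toy_associate, toy_train
)
print(new_picker(recordings[0]))  # prints [2]
```

Passing the components in as callables keeps the loop independent of any particular model, which matches the stated intent of swapping in alternative architectures for the final phase picker.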

