EQTransformer.core.predictor module

Created on Wed Apr 25 17:44:14 2018

@author: mostafamousavi
last update: 05/27/2021

EQTransformer.core.predictor.predictor(input_dir=None, input_model=None, output_dir=None, output_probabilities=False, detection_threshold=0.3, P_threshold=0.1, S_threshold=0.1, number_of_plots=10, plot_mode='time', estimate_uncertainty=False, number_of_sampling=5, loss_weights=[0.03, 0.4, 0.58], loss_types=['binary_crossentropy', 'binary_crossentropy', 'binary_crossentropy'], input_dimention=(6000, 3), normalization_mode='std', batch_size=500, gpuid=None, gpu_limit=None, number_of_cpus=5, use_multiprocessing=True, keepPS=True, allowonlyS=True, spLimit=60)

Applies a trained model to a windowed waveform to perform both detection and picking at the same time.

Parameters:
  • input_dir (str, default=None) – Name of the directory containing the preprocessed data as hdf5 and csv files.
  • input_model (str, default=None) – Path to a trained model.
  • output_dir (str, default=None) – Output directory that will be generated.
  • output_probabilities (bool, default=False) – If True, it will output probabilities and estimated uncertainties for each trace into an HDF file.
  • detection_threshold (float, default=0.3) – Detection probabilities above this value are considered an event.
  • P_threshold (float, default=0.1) – P probabilities above this value are considered a P arrival.
  • S_threshold (float, default=0.1) – S probabilities above this value are considered an S arrival.
  • number_of_plots (int, default=10) – The number of plots of detected events output for each station's data.
  • plot_mode (str, default='time') – The type of plots: ‘time’ for time series only, or ‘time_frequency’ for time series and spectrograms.
  • estimate_uncertainty (bool, default=False) – If True, uncertainties in the output probabilities will be estimated.
  • number_of_sampling (int, default=5) – Number of samples (repeated predictions) used for the uncertainty estimation.
  • loss_weights (list, default=[0.03, 0.40, 0.58]) – Loss weights for detection, P picking, and S picking respectively.
  • loss_types (list, default=['binary_crossentropy', 'binary_crossentropy', 'binary_crossentropy']) – Loss types for detection, P picking, and S picking respectively.
  • input_dimention (tuple, default=(6000, 3)) – Shape of the input data, as (number of samples, number of components).
  • normalization_mode (str, default='std') – Mode of normalization for data preprocessing: ‘max’ for the maximum amplitude among the three components, or ‘std’ for the standard deviation.
  • batch_size (int, default=500) – Batch size. This won't affect the speed much but can affect the performance. A value between 200 and 1000 is recommended.
  • gpuid (int, default=None) – ID of the GPU used for the prediction. Set to None if using the CPU.
  • gpu_limit (int, default=None) – Set the maximum percentage of memory usage for the GPU.
  • number_of_cpus (int, default=5) – Number of CPUs used for the parallel preprocessing and feeding of data for prediction.
  • use_multiprocessing (bool, default=True) – If True, multiple CPUs will be used for the preprocessing of data even when GPU is used for the prediction.
  • keepPS (bool, default=True) – If True, detected events require both P and S picks to be written. If False, individual P or S (see allowonlyS) picks may be written.
  • allowonlyS (bool, default=True) – If True, detected events with “only S” picks will be allowed. If False, an associated P pick is required.
  • spLimit (int, default=60) – S-P time limit in seconds. Results are restricted to detected events whose S-P time satisfies this limit.
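
As a rough illustration of how the parameters above fit together, the following minimal sketch collects a few of them as keyword arguments and passes them to predictor. The directory and model paths are hypothetical placeholders, and the helper name run_prediction is not part of the package. The call is wrapped in a function so that nothing runs until the package, a trained model, and preprocessed data are actually available.

```python
# Keyword arguments named as in the signature above.
# The three paths are hypothetical placeholders; replace them with your own.
predictor_kwargs = {
    "input_dir": "preprocessed_hdfs",    # hypothetical: folder with hdf5 and csv files
    "input_model": "trained_model.h5",   # hypothetical: path to a trained model
    "output_dir": "detections",          # hypothetical: will be generated
    "detection_threshold": 0.3,
    "P_threshold": 0.1,
    "S_threshold": 0.1,
    "number_of_plots": 10,
    "plot_mode": "time",
    "batch_size": 500,
    "gpuid": None,                        # None means the CPU is used
}


def run_prediction(kwargs):
    """Call predictor with the given keyword arguments (hypothetical helper)."""
    # Imported here so the sketch can be loaded without the package installed.
    from EQTransformer.core.predictor import predictor

    predictor(**kwargs)


# Uncomment once the paths above point to real data and a real model:
# run_prediction(predictor_kwargs)
```

Parameters left out of the dictionary keep the defaults shown in the signature.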
Returns:

  • ./output_dir/STATION_OUTPUT/X_prediction_results.csv (A table containing all the detection and picking results. Duplicated events are already removed.)
  • ./output_dir/STATION_OUTPUT/X_report.txt (A summary of the parameters used for prediction and performance.)
  • ./output_dir/STATION_OUTPUT/figures (A folder containing plots of detected events and picked arrival times.)
  • ./time_tracks.pkl (A file containing the time track of the continuous data and its type.)
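
To make the layout above concrete, here is a minimal sketch that walks an output directory and collects each station's results table and report. It relies only on the file names listed above. The helper names find_station_outputs and read_results are hypothetical, and STATION_OUTPUT is treated as any subdirectory name, since this page does not spell out how it is formed. Column names are read from the CSV header instead of being assumed.

```python
import csv
from pathlib import Path


def find_station_outputs(output_dir):
    """Map each station subdirectory name to its results CSV and report paths."""
    outputs = {}
    # Each station folder is expected to hold X_prediction_results.csv.
    for csv_path in sorted(Path(output_dir).glob("*/X_prediction_results.csv")):
        station_dir = csv_path.parent
        report = station_dir / "X_report.txt"
        outputs[station_dir.name] = {
            "results": csv_path,
            "report": report if report.exists() else None,
        }
    return outputs


def read_results(csv_path):
    """Return the rows of a results table as dictionaries keyed by the header."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling find_station_outputs("detections") on a finished run would return one entry per station folder, and read_results can then load each table row by row.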

Notes

Estimating the uncertainties requires multiple predictions and will increase the computational time.
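
The extra cost follows from how sampling-based uncertainty estimates work in general: the model is evaluated several times and the spread of its outputs is summarized. The sketch below illustrates that general idea only and is not taken from the EQTransformer source. The noisy_prediction function is a hypothetical stand-in for one stochastic prediction, and the sample count plays the role of number_of_sampling.

```python
import random
import statistics


def noisy_prediction(rng):
    """Hypothetical stand-in for one stochastic model prediction (a probability)."""
    return min(1.0, max(0.0, rng.gauss(0.8, 0.05)))


def estimate_with_uncertainty(predict, number_of_sampling, rng):
    """Run the prediction several times; return the mean and standard deviation."""
    samples = [predict(rng) for _ in range(number_of_sampling)]
    return statistics.mean(samples), statistics.stdev(samples)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
mean_prob, uncertainty = estimate_with_uncertainty(noisy_prediction, 5, rng)
print(f"probability ~ {mean_prob:.2f}, uncertainty ~ {uncertainty:.2f}")
```

Each extra sample is one more prediction, so in this sketch the run time grows roughly in proportion to the number of samples.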