EQTransformer.core.mseed_predictor module

Created on Sun Jun 21 21:55:54 2020

@author: mostafamousavi

last update: 05/27/2021

class EQTransformer.core.mseed_predictor.PreLoadGeneratorTest(list_IDs, inp_data, batch_size=32, norm_mode='std')

Bases: tensorflow.python.keras.utils.data_utils.Sequence

Keras generator with preprocessing. For testing. Pre-load version.

Parameters:
  • list_IDs (str) – List of trace names.
  • file_name (str) – Path to the input hdf5 file.
  • dim (tuple) – Dimension of input traces.
  • batch_size (int, default=32) – Batch size.
  • n_channels (int, default=3) – Number of channels.
  • norm_mode (str, default=std) – The mode of normalization, ‘max’ or ‘std’.
Returns:

Batches of two dictionaries.

Return type:

{‘input’: X}: the pre-processed waveform used as input.
{‘detector’: y1, ‘picker_P’: y2, ‘picker_S’: y3}: the outputs, three separate numpy arrays holding the labels for detection, P, and S, respectively.

on_epoch_end()

Updates the indexes after each epoch.
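The two normalization modes named above can be illustrated with a short NumPy sketch. This is not the library's implementation: the helper name `normalize`, the mean removal, and the choice of a per-channel standard deviation are assumptions made for illustration, and the array shape is assumed to be (n_samples, n_channels).

```python
import numpy as np


def normalize(data, mode="std"):
    """Normalize a (n_samples, n_channels) waveform array.

    Hypothetical helper for illustration only; the real generator may
    differ in detail.
    """
    data = np.asarray(data, dtype=float)
    # Assumption: remove the per-channel mean before scaling.
    data = data - data.mean(axis=0, keepdims=True)
    if mode == "max":
        # One scale factor: the largest absolute amplitude over all channels.
        scale = np.abs(data).max()
        if scale == 0:
            scale = 1.0
    elif mode == "std":
        # Assumption: one standard deviation per channel.
        scale = data.std(axis=0, keepdims=True)
        scale[scale == 0] = 1.0  # leave flat channels unchanged
    else:
        raise ValueError("mode must be 'max' or 'std'")
    return data / scale


# Tiny three-channel example; the middle channel is flat.
example = np.array([[1.0, 2.0, 3.0],
                    [3.0, 2.0, -1.0],
                    [5.0, 2.0, 1.0]])
scaled_max = normalize(example, "max")
scaled_std = normalize(example, "std")
```

With ‘max’, the largest absolute value in the result is 1; with ‘std’, each non-flat channel ends up with unit standard deviation.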

EQTransformer.core.mseed_predictor.mseed_predictor(input_dir='downloads_mseeds', input_model='sampleData&Model/EqT1D8pre_048.h5', stations_json='station_list.json', output_dir='detections', detection_threshold=0.3, P_threshold=0.1, S_threshold=0.1, number_of_plots=10, plot_mode='time', loss_weights=[0.03, 0.4, 0.58], loss_types=['binary_crossentropy', 'binary_crossentropy', 'binary_crossentropy'], normalization_mode='std', batch_size=500, overlap=0.3, gpuid=None, gpu_limit=None, overwrite=False, output_probabilities=False)

Performs fast detection directly on mseed data.

Parameters:
  • input_dir (str) – Name of the directory containing the input mseed data.
  • input_model (str) – Path to a trained model.
  • stations_json (str) – Path to a JSON file containing station information.
  • output_dir (str) – Output directory that will be generated.
  • detection_threshold (float, default=0.3) – Detection probabilities above this value are considered an event.
  • P_threshold (float, default=0.1) – P probabilities above this value are considered a P arrival.
  • S_threshold (float, default=0.1) – S probabilities above this value are considered an S arrival.
  • number_of_plots (float, default=10) – The number of plots of detected events output for each station's data.
  • plot_mode (str, default=time) – The type of plots: time for time series only, or time_frequency for time series and spectrograms.
  • loss_weights (list, default=[0.03, 0.40, 0.58]) – Loss weights for detection, P picking, and S picking, respectively.
  • loss_types (list, default=['binary_crossentropy', 'binary_crossentropy', 'binary_crossentropy']) – Loss types for detection, P picking, and S picking, respectively.
  • normalization_mode (str, default=std) – Mode of normalization for data preprocessing: max for the maximum amplitude among the three components, or std for the standard deviation.
  • batch_size (int, default=500) – Batch size. This won't affect the speed much but can affect the performance. A value between 200 and 1000 is recommended.
  • overlap (float, default=0.3) – If set, detection and picking are performed in overlapping windows.
  • gpuid (int) – ID of the GPU used for the prediction. Set to None to use the CPU.
  • gpu_limit (int) – Set the maximum percentage of memory usage for the GPU.
  • overwrite (Boolean, default=False) – Overwrite your results automatically.
  • output_probabilities (Boolean, default=False) – Write the probabilities to output_dir/prob.h5 for future plotting. Structure: prediction_probabilities.hdf5{begintime: {Earthquake: probability, P_arrival: probability, S_arrival: probability}}. Notice: turning this parameter on generates large files (in one test, a file of about 150 MB for a three-component station over 3 months).
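As a rough illustration of two of the parameters above, the sketch below shows how an overlap fraction could translate into window start indices, and how a threshold could select samples from a probability trace. Both helpers (`window_starts`, `above_threshold`) are hypothetical, and the 6000-sample window length is an assumption, not a value stated on this page. The actual detection and picking logic in the library may be more involved.

```python
def window_starts(n_samples, window_len=6000, overlap=0.3):
    """Start indices of fixed-length windows that overlap by `overlap`.

    Hypothetical helper; window_len=6000 is an assumed value.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Consecutive windows are shifted by the non-overlapping part.
    step = max(1, round(window_len * (1 - overlap)))
    # Only windows that fit entirely in the trace are returned; handling of
    # the leftover tail (for example by padding) is left out of this sketch.
    # A trace shorter than one window still yields a single start at 0.
    last_start = max(n_samples - window_len, 0)
    return list(range(0, last_start + 1, step))


def above_threshold(probabilities, threshold):
    """Indices of samples whose probability is strictly above `threshold`."""
    return [i for i, p in enumerate(probabilities) if p > threshold]


starts = window_starts(20000)            # shift of 4200 samples per window
picks = above_threshold([0.05, 0.2, 0.5, 0.1], 0.1)
```

With overlap=0.3, each window shares 30% of its samples with the previous one, so a larger overlap produces more windows for the same trace.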
Returns:

  • output_dir/STATION_OUTPUT/X_prediction_results.csv (A table containing all the detection and picking results. Duplicated events are already removed.)
  • output_dir/STATION_OUTPUT/X_report.txt (A summary of the parameters used for prediction and performance.)
  • output_dir/STATION_OUTPUT/figures (A folder containing plots of detected events and picked arrival times.)
  • time_tracks.pkl (A file containing the time track of the continuous data and its type.)

Note

This function does not allow uncertainty estimation. Probabilities are written out only when output_probabilities is set to True.
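A minimal usage sketch, built only from the signature and defaults documented above. The wrapper `run_prediction` and the dictionary name `predictor_kwargs` are hypothetical conveniences; the paths are the documented defaults and are assumed to exist on disk relative to the working directory.

```python
# Keyword arguments copied from the documented signature and defaults.
predictor_kwargs = dict(
    input_dir="downloads_mseeds",
    input_model="sampleData&Model/EqT1D8pre_048.h5",
    stations_json="station_list.json",
    output_dir="detections",
    detection_threshold=0.3,
    P_threshold=0.1,
    S_threshold=0.1,
    number_of_plots=10,
    plot_mode="time",
    normalization_mode="std",
    batch_size=500,
    overlap=0.3,
    overwrite=False,
    output_probabilities=False,
)


def run_prediction(kwargs=predictor_kwargs):
    """Hypothetical wrapper that forwards the arguments to mseed_predictor."""
    # Imported here so the snippet can be read without EQTransformer installed.
    from EQTransformer.core.mseed_predictor import mseed_predictor

    mseed_predictor(**kwargs)
```

Calling `run_prediction()` would then run detection with the settings above and write the results under the detections directory, as listed in the Returns section.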