EQTransformer.core.mseed_predictor module

Created on Sun Jun 21 21:55:54 2020

@author: mostafamousavi

last update: 05/27/2021

class EQTransformer.core.mseed_predictor.PreLoadGeneratorTest(list_IDs, inp_data, batch_size=32, norm_mode='std')

Bases: tensorflow.python.keras.utils.data_utils.Sequence

Keras generator with preprocessing. For testing. Pre-load version.

Parameters:

list_IDs (str) – List of trace names.
file_name (str) – Path to the input hdf5 file.
dim (tuple) – Dimension of input traces.
batch_size (int, default=32) – Batch size.
n_channels (int, default=3) – Number of channels.
norm_mode (str, default=std) – The mode of normalization, ‘max’ or ‘std’.

Returns: Batches of two dictionaries.

Return type: {‘input’: X}: pre-processed waveform as input. {‘detector’: y1, ‘picker_P’: y2, ‘picker_S’: y3}: outputs including three separate numpy arrays as labels for detection, P, and S respectively.

on_epoch_end(): Updates indexes after each epoch.
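
The two normalization modes and the batching described above can be illustrated with a small, library-free sketch. This is not the EQTransformer implementation: the helper names `normalize` and `batches` are hypothetical, and the exact preprocessing (mean removal, and per-channel versus global scaling) is an assumption made for illustration.

```python
import statistics


def normalize(trace, mode="std"):
    """Normalize a trace given as a list of equal-length channels.

    Assumed behaviour (illustrative only): each channel is demeaned, then
    'max' divides by the largest absolute amplitude among all channels and
    'std' divides each channel by its own standard deviation.
    """
    demeaned = []
    for channel in trace:
        mean = statistics.fmean(channel)
        demeaned.append([x - mean for x in channel])

    if mode == "max":
        peak = max(abs(x) for channel in demeaned for x in channel)
        scales = [peak or 1.0] * len(demeaned)  # avoid division by zero
    elif mode == "std":
        scales = [statistics.pstdev(channel) or 1.0 for channel in demeaned]
    else:
        raise ValueError(f"unknown norm_mode: {mode!r}")

    return [[x / s for x in channel] for channel, s in zip(demeaned, scales)]


def batches(list_ids, batch_size=32):
    """Yield consecutive groups of at most batch_size trace names."""
    for start in range(0, len(list_ids), batch_size):
        yield list_ids[start:start + batch_size]
```

A generator built along these lines would call `normalize` on every trace in a batch before handing the batch to the model.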

EQTransformer.core.mseed_predictor.mseed_predictor(input_dir='downloads_mseeds', input_model='sampleData&Model/EqT1D8pre_048.h5', stations_json='station_list.json', output_dir='detections', detection_threshold=0.3, P_threshold=0.1, S_threshold=0.1, number_of_plots=10, plot_mode='time', loss_weights=[0.03, 0.4, 0.58], loss_types=['binary_crossentropy', 'binary_crossentropy', 'binary_crossentropy'], normalization_mode='std', batch_size=500, overlap=0.3, gpuid=None, gpu_limit=None, overwrite=False, output_probabilities=False)

Performs fast detection directly on mseed data.

Parameters:

input_dir (str) – Directory containing the input mseed files.
input_model (str) – Path to a trained model.
stations_json (str) – Path to a JSON file containing station information.
output_dir (str) – Output directory that will be generated.
detection_threshold (float, default=0.3) – Detection probabilities above this value are considered an event.
P_threshold (float, default=0.1) – P probabilities above this value are considered a P arrival.
S_threshold (float, default=0.1) – S probabilities above this value are considered an S arrival.
number_of_plots (float, default=10) – The number of plots of detected events output for each station's data.
plot_mode (str, default=time) – The type of plots: time for time series only, or time_frequency for time series and spectrograms.
loss_weights (list, default=[0.03, 0.40, 0.58]) – Loss weights for detection, P picking, and S picking respectively.
loss_types (list, default=['binary_crossentropy', 'binary_crossentropy', 'binary_crossentropy']) – Loss types for detection, P picking, and S picking respectively.
normalization_mode (str, default=std) – Mode of normalization for data preprocessing: max for the maximum amplitude among the three components, std for the standard deviation.
batch_size (int, default=500) – Batch size. This won't affect the speed much but can affect the performance. A value between 200 and 1000 is recommended.
overlap (float, default=0.3) – If set, detection and picking are performed in overlapping windows.
gpuid (int) – ID of the GPU used for the prediction. If using the CPU, set to None.
gpu_limit (int) – Set the maximum percentage of memory usage for the GPU.
overwrite (Boolean, default=False) – Overwrite your results automatically.
output_probabilities (Boolean, default=False) – Write probabilities to output_dir/prob.h5 for future plotting. Structure: prediction_probabilities.hdf5{begintime: {Earthquake: probability, P_arrival: probability, S_arrival: probability}}. Notice: if you turn this parameter on, it will generate large files (a test shows a ~150 Mb file generated for a three-component station for 3 months).
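
To make the overlap parameter more concrete, the sketch below shows one way window start positions could be derived from an overlap fraction. It is an illustration, not the EQTransformer code: the helper name `window_starts`, the default window length of 6000 samples, and the choice to add a final window aligned with the end of the record are all assumptions.

```python
def window_starts(n_samples, window=6000, overlap=0.3):
    """Return start indices of windows covering n_samples samples.

    Consecutive windows share roughly `overlap` of their length.
    The window length and the tail handling are assumptions of this sketch.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")

    # Hop between consecutive window starts; at least one sample.
    step = max(1, round(window * (1 - overlap)))
    last_start = max(n_samples - window, 0)

    starts = list(range(0, last_start + 1, step))
    # Add a final window aligned with the end so the tail is not skipped.
    if starts[-1] != last_start:
        starts.append(last_start)
    return starts
```

With overlap=0.3 each window advances by 70% of its length, so an arrival that falls near the edge of one window sits closer to the middle of a neighbouring one.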

Returns:

output_dir/STATION_OUTPUT/X_prediction_results.csv (A table containing all the detection and picking results. Duplicated events are already removed.)
output_dir/STATION_OUTPUT/X_report.txt (A summary of the parameters used for prediction and performance.)
output_dir/STATION_OUTPUT/figures (A folder containing plots of detected events and picked arrival times.)
time_tracks.pkl (A file containing the time track of the continuous data and its type.)
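
A call using the documented keyword arguments might look like the following sketch. The argument values are the defaults from the signature above and the paths are placeholders; `predictor_kwargs` and `run_prediction` are hypothetical names introduced here, and the import is deferred so the snippet can be loaded without EQTransformer installed.

```python
# Keyword arguments copied from the documented signature defaults.
# Paths are placeholders and must point at real data and a trained model.
predictor_kwargs = dict(
    input_dir="downloads_mseeds",
    input_model="sampleData&Model/EqT1D8pre_048.h5",
    stations_json="station_list.json",
    output_dir="detections",
    detection_threshold=0.3,
    P_threshold=0.1,
    S_threshold=0.1,
    number_of_plots=10,
    plot_mode="time",
    normalization_mode="std",
    batch_size=500,
    overlap=0.3,
    overwrite=False,
    output_probabilities=False,
)


def run_prediction(kwargs=None):
    """Run mseed_predictor with the given (or the default) arguments."""
    # Imported lazily: requires the EQTransformer package to be installed.
    from EQTransformer.core.mseed_predictor import mseed_predictor

    return mseed_predictor(**(kwargs or predictor_kwargs))
```

Calling `run_prediction()` writes the per-station results listed above under the directory given as output_dir.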

Note

This does not allow uncertainty estimation or writing the probabilities out.