EQTransformer.core.mseed_predictor module
Created on Sun Jun 21 21:55:54 2020
@author: mostafamousavi
last update: 05/27/2021
class EQTransformer.core.mseed_predictor.PreLoadGeneratorTest(list_IDs, inp_data, batch_size=32, norm_mode='std')
Bases: tensorflow.python.keras.utils.data_utils.Sequence
Keras generator with preprocessing. For testing. Pre-load version.
Parameters: - list_IDs (str) – List of trace names.
- file_name (str) – Path to the input hdf5 file.
- dim (tuple) – Dimension of input traces.
- batch_size (int, default=32) – Batch size.
- n_channels (int, default=3) – Number of channels.
- norm_mode (str, default='std') – The mode of normalization, 'max' or 'std'.
Returns: Batches of two dictionaries:
- {'input': X}: pre-processed waveform as input.
- {'detector': y1, 'picker_P': y2, 'picker_S': y3}: outputs including three separate numpy arrays as labels for detection, P, and S, respectively.
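As a rough illustration of the two norm_mode options, the sketch below demeans a single channel and scales it by either its peak absolute amplitude ('max') or its standard deviation ('std'). It is a pure-Python simplification, not the generator's actual code: the function name normalize_trace, the demeaning step, the per-channel scaling, and the zero-scale guard are all assumptions made for this example.

```python
from statistics import fmean, pstdev


def normalize_trace(trace, mode="std"):
    """Demean one channel and scale it (hypothetical helper, not part of EQTransformer)."""
    mean = fmean(trace)
    centered = [x - mean for x in trace]

    if mode == "max":
        # Scale by the largest absolute amplitude of the demeaned channel.
        scale = max(abs(x) for x in centered)
    elif mode == "std":
        # Scale by the population standard deviation of the demeaned channel.
        scale = pstdev(centered)
    else:
        raise ValueError("mode must be 'max' or 'std'")

    # Assumed guard: a flat trace has zero scale, so leave it as zeros.
    if scale == 0:
        scale = 1.0
    return [x / scale for x in centered]


scaled_max = normalize_trace([1.0, 2.0, 3.0], mode="max")
scaled_std = normalize_trace([1.0, 2.0, 3.0], mode="std")
```

With 'max', the largest absolute value of the result is 1; with 'std', the result has unit standard deviation.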
EQTransformer.core.mseed_predictor.mseed_predictor(input_dir='downloads_mseeds', input_model='sampleData&Model/EqT1D8pre_048.h5', stations_json='station_list.json', output_dir='detections', detection_threshold=0.3, P_threshold=0.1, S_threshold=0.1, number_of_plots=10, plot_mode='time', loss_weights=[0.03, 0.4, 0.58], loss_types=['binary_crossentropy', 'binary_crossentropy', 'binary_crossentropy'], normalization_mode='std', batch_size=500, overlap=0.3, gpuid=None, gpu_limit=None, overwrite=False, output_probabilities=False)
To perform fast detection directly on mseed data.
Parameters: - input_dir (str) – Directory name containing hdf5 and csv files (preprocessed data).
- input_model (str) – Path to a trained model.
- stations_json (str) – Path to a JSON file containing station information.
- output_dir (str) – Output directory that will be generated.
- detection_threshold (float, default=0.3) – Detection probabilities above this value are considered an event.
- P_threshold (float, default=0.1) – P probabilities above this value are considered a P arrival.
- S_threshold (float, default=0.1) – S probabilities above this value are considered an S arrival.
- number_of_plots (float, default=10) – The number of plots of detected events output for each station's data.
- plot_mode (str, default=time) – The type of plots: 'time' (time series only) or 'time_frequency' (time series and spectrograms).
- loss_weights (list, default=[0.03, 0.40, 0.58]) – Loss weights for detection, P picking, and S picking, respectively.
- loss_types (list, default=['binary_crossentropy', 'binary_crossentropy', 'binary_crossentropy']) – Loss types for detection, P picking, and S picking, respectively.
- normalization_mode (str, default=std) – Mode of normalization for data preprocessing: 'max' (maximum amplitude among three components) or 'std' (standard deviation).
- batch_size (int, default=500) – Batch size. This won't affect the speed much but can affect the performance. A value between 200 and 1000 is recommended.
- overlap (float, default=0.3) – If set, detection and picking are performed in overlapping windows.
- gpuid (int) – ID of the GPU used for prediction. Set to None to use the CPU.
- gpu_limit (int) – Set the maximum percentage of memory usage for the GPU.
- overwrite (Boolean, default=False) – Overwrite your results automatically.
- output_probabilities (Boolean, default=False) – Write probabilities to output_dir/prob.h5 for future plotting. Structure: prediction_probabilities.hdf5{begintime: {Earthquake: probability, P_arrival: probability, S_arrival: probability}}. Notice: if you turn this parameter on, it will generate large files (a test shows a ~150 Mb file generated for a three-component station for 3 months).
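To make the overlap parameter above more concrete, here is a minimal sketch of how a continuous record could be cut into overlapping windows. It is not taken from the EQTransformer source: the helper name window_starts and the 6000-sample window length are assumptions chosen for illustration, and the real implementation may handle the trailing samples differently.

```python
def window_starts(n_samples, window_len=6000, overlap=0.3):
    """Return start indices of fixed-length windows that overlap by a given fraction.

    Hypothetical helper; window_len=6000 is an assumed value, not a documented one.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    if n_samples < window_len:
        return []  # record too short for even one full window

    # Each new window starts (1 - overlap) of a window length after the previous one.
    step = max(1, round(window_len * (1 - overlap)))
    return list(range(0, n_samples - window_len + 1, step))


starts = window_starts(30, window_len=10, overlap=0.5)
# starts == [0, 5, 10, 15, 20]
```

In this sketch a larger overlap yields more windows for the same record, and samples after the last full window are simply left uncovered.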
Returns: - output_dir/STATION_OUTPUT/X_prediction_results.csv (A table containing all the detection and picking results. Duplicated events are already removed.)
- output_dir/STATION_OUTPUT/X_report.txt (A summary of the parameters used for prediction and performance.)
- output_dir/STATION_OUTPUT/figures (A folder containing plots of detected events and picked arrival times.)
- time_tracks.pkl (A file containing the time track of the continuous data and its type.)
Note
This does not allow uncertainty estimation or writing the probabilities out.
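A call using the documented keyword arguments might look like the sketch below. The argument values are the defaults from the signature above, and the paths are placeholders that must point to real data and a trained model on your system. The wrapper run_prediction is a hypothetical convenience for this example; it imports the package lazily so the snippet can be loaded without EQTransformer installed.

```python
# Keyword arguments copied from the documented signature (default values).
predictor_kwargs = dict(
    input_dir="downloads_mseeds",
    input_model="sampleData&Model/EqT1D8pre_048.h5",
    stations_json="station_list.json",
    output_dir="detections",
    detection_threshold=0.3,
    P_threshold=0.1,
    S_threshold=0.1,
    number_of_plots=10,
    plot_mode="time",
    normalization_mode="std",
    batch_size=500,
    overlap=0.3,
    gpuid=None,          # None means run on the CPU
    gpu_limit=None,
    overwrite=False,
    output_probabilities=False,
)


def run_prediction(kwargs):
    """Hypothetical wrapper: import lazily, then run the predictor."""
    from EQTransformer.core.mseed_predictor import mseed_predictor

    mseed_predictor(**kwargs)


# Uncomment once EQTransformer, the model file, and the input data are available:
# run_prediction(predictor_kwargs)
```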