I am trying to make a measurement with the CAEN DT5742 16-channel digitizer using the CAENPy library, which is basically just a wrapper around the actual CAENDigitizer library.
My program scans an area with a laser using stepper motors and reads out the data coming from an analog readout board via the digitizer.
It has been working more or less well, but I noticed that my program randomly becomes unresponsive, and two processes (my program uses multiprocessing) draw 100% CPU (single core).
The code I use for acquiring data with the digitizer:
def read_and_save_events(self, max_num_events: int = 1):
    """Reads a specified number of events from the digitizer.

    Arguments
    ---------
    max_num_events: int, default 1
        Number of events to read.

    Returns
    -------
    nevts: int
        Number of events read.
    """
    nevts: int = 0
    data = []
    retries = 0
    while retries < MAX_RETRIES:
        retries += 1
        try:
            with self.device:
                self.log.info("Reading %d events...", max_num_events)
                while nevts < max_num_events:
                    time.sleep(0.05)
                    waveforms = self.get_waveforms()
                    current_nevts = len(waveforms)
                    nevts += current_nevts
                    data += waveforms
                    self.log.info(
                        "Read %d out of %d events...", nevts, max_num_events
                    )
            break
        except RuntimeError:
            self.log.error("Encountered error during read. Retrying...")
            self.hard_reset(self._device_id)
            self.close()
            self.device = CAEN_DT5742_Digitizer(self._device_id)
            self.init()
            time.sleep(RETRY_TIMEOUT)
    else:
        self.log.error("Too many retries, aborting read...")
    if self._save_path is None:
        self.log.warning("No save path specified, waveforms not saved!")
        return 0
    # Disentangle data and save to file
    df = pd.DataFrame(data)
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    data_file = os.path.join(self._save_path, f"waveforms_{timestamp}.h5")
    self.curr_savefile = data_file
    with pd.HDFStore(data_file, "w") as store:
        for channel in df.columns:
            channel_df = []
            for eventid, event in enumerate(df[channel]):
                event_df = pd.DataFrame(event)
                for column in event_df.columns:
                    col = pd.Series(
                        event_df[column].values,
                        name=f"{eventid}_{column.split()[0]}",
                    )
                    channel_df.append(col)
            channel_df = pd.concat(channel_df, axis=1)
            store.put(channel, channel_df)
I profiled the program using py-spy and got the attached call stack for one of the CPU-heavy processes. So, apparently the problem is with the _GetNumEvents method from the library.
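For reference, similar stack information can probably also be captured from inside the program with the standard-library `faulthandler` module, which can dump the Python traceback of all threads if a section of code takes longer than expected. The following is only a minimal sketch: the helper name `hang_dump` and its parameters are hypothetical and not part of CAENPy. Only Python frames are shown, so a hang inside the C library would appear as the last Python call that entered it.

```python
import contextlib
import faulthandler
import sys


@contextlib.contextmanager
def hang_dump(timeout_s, file=None, exit_on_hang=False):
    """Dump all thread tracebacks if the wrapped block runs longer than timeout_s.

    Hypothetical helper, not part of CAENPy. `file` must be a real file
    (it needs a file descriptor) and must stay open while the block runs.
    """
    target = sys.stderr if file is None else file
    # The timer is handled by a watchdog thread implemented in C, so it still
    # fires when the main thread is stuck inside a C extension call.
    faulthandler.dump_traceback_later(
        timeout_s, repeat=False, file=target, exit=exit_on_hang
    )
    try:
        yield
    finally:
        # The block finished in time (or raised): disarm the timer.
        faulthandler.cancel_dump_traceback_later()
```

A possible use would be wrapping the read, for example `with hang_dump(30.0): self.get_waveforms()`. With `exit_on_hang=True` the process exits right after the dump, so a supervising process could restart it.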
My question: Can I even solve this bug? If not, how would I monitor my program to get out of this error state?
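To make the second part of the question concrete, here is a minimal sketch of the kind of monitoring I have in mind: run the blocking call in a child process and kill that process if it does not return in time, since a call stuck inside a C library cannot be interrupted from another Python thread. The names `run_with_timeout` and `_worker` are hypothetical, and the sketch assumes a Unix system where the "fork" start method is available; with "spawn", the function would have to be importable at module level and the script would need an `if __name__ == "__main__":` guard.

```python
import multiprocessing as mp


def _worker(conn, func, args):
    """Run func(*args) in the child and send the outcome through the pipe."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:  # ordinary errors are reported to the parent
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_with_timeout(func, args=(), timeout=5.0, start_method="fork"):
    """Run func in a separate process and kill it if it hangs.

    Hypothetical helper. Raises TimeoutError if no result arrives within
    `timeout` seconds and RuntimeError if the child failed.
    The result must be picklable to travel through the pipe.
    """
    ctx = mp.get_context(start_method)  # assumption: "fork" (Unix only)
    recv_end, send_end = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_worker, args=(send_end, func, args), daemon=True)
    proc.start()
    send_end.close()  # the parent only reads
    try:
        if not recv_end.poll(timeout):
            raise TimeoutError(f"call did not return within {timeout} s")
        try:
            status, payload = recv_end.recv()
        except EOFError:
            raise RuntimeError("worker exited without sending a result") from None
    finally:
        # Make sure no child is left spinning at 100% CPU.
        if proc.is_alive():
            proc.terminate()
        proc.join(timeout=1.0)
        if proc.is_alive():
            proc.kill()
            proc.join()
        recv_end.close()
    if status == "error":
        raise RuntimeError(payload)
    return payload
```

In my case the wrapped function would have to open the digitizer inside the child process and return the waveforms, on the assumption that the device handle cannot be shared across processes. After a `TimeoutError`, the parent could do the same hard reset and re-initialization that the `except RuntimeError` branch above already performs.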
_GetNumEvents - it's Python after all.
_GetNumEvents but it is basically just calling the function from the C library... I don't know how to look at that, let alone modify and recompile it...