I'm using a Python library that logs messages using the standard logging module. The library code includes the following at the top of its modules:
import logging
logger = logging.getLogger(__name__)
In my own script, I also set up logging in the same way:
import logging
logger = logging.getLogger(__name__)
I'm calling a library function in a loop. The function is part of a class and looks something like this:
class MathOperations:
@classmethod
def square_number(cls, n):
logger.info("You want to square a number.")
logger.warning("Squaring Number")
result = n * n
logger.info(f"Result: {result}")
return result
When I run this method in a loop in my script:
for i in range(5):
MathOperations.square_number(i)
I get a lot of log messages that are essentially duplicates in terms of their content. This makes the logs verbose and hard to read.
[29/Nov/2024 12:32:27] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:27] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:27] INFO [library.module_name:12] Result: 0
[29/Nov/2024 12:32:28] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:28] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:28] INFO [library.module_name:12] Result: 1
[29/Nov/2024 12:32:29] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:29] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:29] INFO [library.module_name:12] Result: 4
[29/Nov/2024 12:32:30] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:30] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:30] INFO [library.module_name:12] Result: 9
[29/Nov/2024 12:32:31] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:31] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:31] INFO [library.module_name:12] Result: 16
The messages "You want to square a number." and "Squaring Number" are duplicates every time the function is called in the loop.
Desired Log Output:
I would like the logs to be summarized to reduce the number of duplicate messages. Ideally, the output would look something like:
[29/Nov/2024 12:32:27] INFO [library.module_name:9] You want to square a number.
-- The above message repeated 4 more times --
[29/Nov/2024 12:32:27] WARNING [library.module_name:10] Squaring Number
-- The above message repeated 4 more times --
[29/Nov/2024 12:32:27] INFO [library.module_name:12] Result: 0
[29/Nov/2024 12:32:28] INFO [library.module_name:12] Result: 1
[29/Nov/2024 12:32:29] INFO [library.module_name:12] Result: 4
[29/Nov/2024 12:32:30] INFO [library.module_name:12] Result: 9
[29/Nov/2024 12:32:31] INFO [library.module_name:12] Result: 16
This way, the logs are less cluttered, and I can still see the important information without the repetitive messages.
Constraints:
I cannot modify the library code. I want to suppress or summarize duplicate log messages only from the library when called in a loop. I prefer to keep any additional code in my script to a minimum.
What I've tried:
I attempted to create a custom logging.Filter to suppress duplicate messages:
import logging
class SuppressDuplicatesFilter(logging.Filter):
def __init__(self):
super().__init__()
self.logged_messages = set()
def filter(self, record):
message = record.getMessage()
if message in self.logged_messages:
return False # Suppress
else:
self.logged_messages.add(message)
return True # Allow
def enable_duplicate_suppression():
library_logger = logging.getLogger('library.module_name')
suppress_filter = SuppressDuplicatesFilter()
library_logger.addFilter(suppress_filter)
# Store for later removal
enable_duplicate_suppression.filter = suppress_filter
def disable_duplicate_suppression():
library_logger = logging.getLogger('library.module_name')
suppress_filter = enable_duplicate_suppression.filter
library_logger.removeFilter(suppress_filter)
del enable_duplicate_suppression.filter
enable_duplicate_suppression()
for i in range(5):
MathOperations.square_number(i)
disable_duplicate_suppression()
However, this doesn't seem to work—the duplicate log messages are still appearing in the logs.
I also tried enhancing the filter to count duplicates and log a summary:
class SummarizingFilter(logging.Filter):
def __init__(self):
super().__init__()
self.last_message = None
self.repeat_count = 0
def filter(self, record):
message = record.getMessage()
if message == self.last_message:
self.repeat_count += 1
return False # Suppress
else:
if self.repeat_count > 0:
# Log a summary of the repeated message
summary = f'"{self.last_message}" repeated {self.repeat_count} times'
#? How to log this appropriately?
self.last_message = message
self.repeat_count = 0
return True # Allow
But I'm not sure what approach I've missed to get this working, and how to properly log the summary message so that it appears in the logs with the correct formatting and logger name.
How can I effectively suppress or summarize duplicate log messages, even if they are non consecutive, from the library function when called in a loop without modifying the library code? Perhaps by customizing the logging module?
Additional Info:
The library's logger names are based on their module paths (e.g., 'library.module_name'). I'm open to applying the filter to the root logger if necessary, as long as it doesn't interfere with other logging. I need a solution that works with the standard logging module since I cannot change the library's logging setup. After the duplicate suppression is activated, it may be that every new unique log gets let through, but then if it's been seen before it is not allowed, and just stored. Then when the duplicate suppression is disabled, the duplicates are released (in a summarised single line each).
Similar SO Question: Suppress multiple messages with same content in Python logging module AKA log compression
I've taken a look a the above, but I don't think it covers what I'm looking for: it would only suppress logs that are immediately, consecutively duplicate. While I'd like to suppress and summarise even if the same warning is logged non-consecutively.
Thank you for your help!