2

I'm using a Python library that logs messages using the standard logging module. The library code includes the following at the top of its modules:

import logging
logger = logging.getLogger(__name__)

In my own script, I also set up logging in the same way:

import logging
logger = logging.getLogger(__name__)

I'm calling a library function in a loop. The function is part of a class and looks something like this:

class MathOperations:
    @classmethod
    def square_number(cls, n):
        logger.info("You want to square a number.")
        logger.warning("Squaring Number")
        result = n * n
        logger.info(f"Result: {result}")
        return result

When I run this method in a loop in my script:

for i in range(5):
    MathOperations.square_number(i)

I get a lot of log messages that are essentially duplicates in terms of their content. This makes the logs verbose and hard to read.

[29/Nov/2024 12:32:27] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:27] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:27] INFO [library.module_name:12] Result: 0
[29/Nov/2024 12:32:28] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:28] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:28] INFO [library.module_name:12] Result: 1
[29/Nov/2024 12:32:29] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:29] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:29] INFO [library.module_name:12] Result: 4
[29/Nov/2024 12:32:30] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:30] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:30] INFO [library.module_name:12] Result: 9
[29/Nov/2024 12:32:31] INFO [library.module_name:9] You want to square a number.
[29/Nov/2024 12:32:31] WARNING [library.module_name:10] Squaring Number
[29/Nov/2024 12:32:31] INFO [library.module_name:12] Result: 16

The messages "You want to square a number." and "Squaring Number" are duplicates every time the function is called in the loop.

Desired Log Output:

I would like the logs to be summarized to reduce the number of duplicate messages. Ideally, the output would look something like:

[29/Nov/2024 12:32:27] INFO [library.module_name:9] You want to square a number.
-- The above message repeated 4 more times --
[29/Nov/2024 12:32:27] WARNING [library.module_name:10] Squaring Number
-- The above message repeated 4 more times --
[29/Nov/2024 12:32:27] INFO [library.module_name:12] Result: 0
[29/Nov/2024 12:32:28] INFO [library.module_name:12] Result: 1
[29/Nov/2024 12:32:29] INFO [library.module_name:12] Result: 4
[29/Nov/2024 12:32:30] INFO [library.module_name:12] Result: 9
[29/Nov/2024 12:32:31] INFO [library.module_name:12] Result: 16

This way, the logs are less cluttered, and I can still see the important information without the repetitive messages.

Constraints:

I cannot modify the library code. I want to suppress or summarize duplicate log messages only from the library when called in a loop. I prefer to keep any additional code in my script to a minimum.

What I've tried:

I attempted to create a custom logging.Filter to suppress duplicate messages:

import logging

class SuppressDuplicatesFilter(logging.Filter):
    def __init__(self):
        super().__init__()
        self.logged_messages = set()

    def filter(self, record):
        message = record.getMessage()
        if message in self.logged_messages:
            return False  # Suppress
        else:
            self.logged_messages.add(message)
            return True  # Allow


def enable_duplicate_suppression():
    library_logger = logging.getLogger('library.module_name')
    suppress_filter = SuppressDuplicatesFilter()
    library_logger.addFilter(suppress_filter)
    # Store for later removal
    enable_duplicate_suppression.filter = suppress_filter


def disable_duplicate_suppression():
    library_logger = logging.getLogger('library.module_name')
    suppress_filter = enable_duplicate_suppression.filter
    library_logger.removeFilter(suppress_filter)
    del enable_duplicate_suppression.filter



enable_duplicate_suppression()


for i in range(5):
    MathOperations.square_number(i)


disable_duplicate_suppression()

However, this doesn't seem to work—the duplicate log messages are still appearing in the logs.

I also tried enhancing the filter to count duplicates and log a summary:

class SummarizingFilter(logging.Filter):
    def __init__(self):
        super().__init__()
        self.last_message = None
        self.repeat_count = 0

    def filter(self, record):
        message = record.getMessage()
        if message == self.last_message:
            self.repeat_count += 1
            return False  # Suppress
        else:
            if self.repeat_count > 0:
                # Log a summary of the repeated message
                summary = f'"{self.last_message}" repeated {self.repeat_count} times'
                #? How to log this appropriately?
            self.last_message = message
            self.repeat_count = 0
            return True  # Allow

But I'm not sure what approach I've missed to get this working, and how to properly log the summary message so that it appears in the logs with the correct formatting and logger name.

How can I effectively suppress or summarize duplicate log messages, even if they are non consecutive, from the library function when called in a loop without modifying the library code? Perhaps by customizing the logging module?

Additional Info:

The library's logger names are based on their module paths (e.g., 'library.module_name'). I'm open to applying the filter to the root logger if necessary, as long as it doesn't interfere with other logging. I need a solution that works with the standard logging module since I cannot change the library's logging setup. After the duplicate suppression is activated, it may be that every new unique log gets let through, but then if it's been seen before it is not allowed, and just stored. Then when the duplicate suppression is disabled, the duplicates are released (in a summarised single line each).

Similar SO Question: Suppress multiple messages with same content in Python logging module AKA log compression

I've taken a look a the above, but I don't think it covers what I'm looking for: it would only suppress logs that are immediately, consecutively duplicate. While I'd like to suppress and summarise even if the same warning is logged non-consecutively.

Thank you for your help!

18
  • What are the rules for when the duplicated line should be shown again? Or for when the "repeated N times" message should be shown? Commented Nov 29, 2024 at 13:16
  • When the duplicate suppression is disabled again. It may be that no logs are printed while this is active, and then all at once when disabled. Alternatively print the first instance of every log but only when it is a duplicate don't print it again. Then when disabled the duplicates are released. I've extended the question to clarify this point, thank you mkrieger1 Commented Nov 29, 2024 at 13:43
  • 4
    To get that result, you'd have to buffer the output for some time, which introduces its own set of problems. Adding some simple filter which remembers the last couple of messages and suppresses repeated messages isn't all that difficult; adding the "repeated x times" is. Commented Nov 29, 2024 at 13:50
  • 1
    First of all, not sure where exactly you'd hook into the logging system to produce that. Filtering is easy, manipulating the writing system… dunno. But let's say you work that out, you'll need to decide when to start outputting the logs. You can only output a conclusive "repeats x times" when you know there won't be any more coming. When is that? And until then, you can't output anything. You'll need to decide on some thresholds, like 1 second buffer at most. But then slower processes can still produce duplicate lines. Also the order of messages is changed, maybe complicating debugging. Commented Nov 29, 2024 at 14:16
  • 2
    When the complexity of the desired adapter outweighs the complexity of the library, its very hard to justify the effort for that solution. e.g. you could just as easily suppress logging while the function is looped, and then have the invoker post a summary of the invocations. Each use-case is likely to be different, and trying to solve this for the general case over all methods is an effort in madness. The logger library is a hammer, but not everything is a nail. Commented Dec 10, 2024 at 7:55

1 Answer 1

-1

I am not sure why the approach you tried doesn't work for you. I tried exactly what you wrote in your question, and it works as intended.

However, if the problem is with the filter, you can just not use it and write the functionality yourself:

def enable_duplicate_suppression():
library_logger = logging.getLogger('library.module_name')
suppress_filter = SuppressDuplicatesFilter()
old_log = library_logger._log

def new_log(level, msg, args, exc_info=None, extra=None, stack_info=False, stacklevel=1):
    if suppress_filter.filter(msg):
        old_log(level, msg, args, exc_info, extra, stack_info, stacklevel)

library_logger._log = new_log
# Store for later removal
enable_duplicate_suppression.old_log = old_log

I trust you can figure out the rest if this even solves anything.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.