Avoiding expensive calculations in Python IRC bot

Question

I'm using this calculator in a public IRC bot. Given that Python uses arbitrary precision by default, this would allow any user to execute something like calc 10000**10000**10000 or calc factorial(1000000) and effectively "kill" the bot.

What I'd like to know is whether there is some way of avoiding this. I've tried casting all the terms in the expression to float, but float(factorial(1000000)) still takes a long time to finish in the Python interpreter, and I'm not sure whether a multithreading approach is the right way of doing this.

I know it uses eval, but I've tried to execute dangerous commands for a long time and I couldn't do anything. I know this doesn't guarantee that it's 100% safe, but the users who will be using it are "semi-trusted", meaning that they're not going to do weird stuff with os, but I'm sure they will just kill the bot for fun.
– user1002327, Commented Jul 14, 2012 at 20:01
Here's the actual code of the calculator; I used the other one only to keep the question brief. pastebin.com/auF1krdh
– user1002327, Commented Jul 14, 2012 at 20:02
I don't know Python, but I suppose you need an approach like Loïc's: start the calculator as an independent process and kill it after a specified time. With functions such as factorial available, I'm afraid it is nearly impossible to restrict users to input that computes quickly; there are simply too many ways to shoot your calculator in the foot.
– Thorsten S., Commented Jul 14, 2012 at 20:13
@user1002327: See this link: eval is really dangerous. You should use ast.literal_eval instead. Also, run your user input in a separate thread with a timeout. This will avoid problems from doing something like factorial(1000000).
– Joel Cornett, Commented Jul 14, 2012 at 20:30
Heh, it feels like I've seen your comment somewhere else before. I've read that link and that's the reason I didn't let the user include any characters in the set []_;"'. I'm hoping that makes it a lot harder to abuse eval. Oh, and the eval in my function is explicitly using {"__builtins__": None} instead of {'__builtins__':{}}, but I'm not sure if it makes a difference. Either way, I'll use ast.literal_eval and see what happens.
– user1002327, Commented Jul 14, 2012 at 20:38

Loïc Faure-Lacroix · Accepted Answer · 2012-07-14 19:52:02Z

2

Not really an answer, but this is how I'd do it. Everything that you are going to run should be run inside a different process. As far as I know, it's not possible to limit the CPU or memory usage of a single thread in a process.

That said, you have to create a new process whose task is to execute what the user entered and write the result to a file, for example. You could do that with fork: create a new file named after the child's PID, and have the main process check until the child process dies. Once the process dies, open the file "cool_calculator_[pid].out" and send its contents back to IRC.

Quite simple to do I guess.

Then, using ulimit or other tools, you can limit the child processes, or even kill them from the master process. If the file for a PID is empty, just answer that there was an error or something. I guess you could even write out a specific error such as memory exceeded or CPU exceeded.

It all really depends on how you want to kill the bad processes.

In the end, your master process's job is to spawn children, kill them if needed, and send back the answers.
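A minimal sketch of the idea above, assuming Python 3 and using a pipe instead of a PID-named file to collect the result. The names calc and CHILD are made up for illustration, and the child still relies on eval with a stripped-down namespace, which limits but does not eliminate what an expression can do. The point here is only the time limit: subprocess.run kills the child when the timeout expires.

```python
import subprocess
import sys

# Program run in the child process (hypothetical helper, not from the thread).
# It evaluates argv[1] with only the math module's public names visible.
CHILD = r"""
import math, sys
names = {k: getattr(math, k) for k in dir(math) if not k.startswith("_")}
print(eval(sys.argv[1], {"__builtins__": {}}, names))
"""


def calc(expr, timeout=2.0):
    """Evaluate expr in a separate interpreter and give up after `timeout` seconds."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD, expr],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed if it runs longer than this
        )
    except subprocess.TimeoutExpired:
        return "Error: calculation took too long"
    if proc.returncode != 0:
        # The child raised (domain error, syntax error, unknown name, ...).
        return "Error: could not evaluate expression"
    return proc.stdout.strip()
```

Calling calc("factorial(5)") should return "120", while something like calc("factorial(10**8)") should come back with the timeout message instead of blocking the bot. On Unix, a memory cap could be added in the child as well (for example through the resource module), which is the programmatic counterpart of ulimit.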

answered Jul 14, 2012 at 19:52

Loïc Faure-Lacroix

3 Comments

user1002327 Over a year ago

Oh well, since I get all kinds of errors in C for overflows, I'd like to know if detecting those overflows in Python is possible. As I said before, casting everything to float isn't enough, but it's fixing a lot of the problems. float(10000) ** float(9999999999) throws an OverflowError and so does float(factorial(12000)) but for some reason float(factorial(120000)) tries to complete the whole calculation.

DSM Over a year ago

float(factorial(120000)) does throw an OverflowError, it just takes a while. The OverflowError is occurring after the factorial is computed.

user1002327 Over a year ago

Oh, I didn't let it finish because I thought it would take much longer. Thanks for pointing that out.

user1002327 · Accepted Answer · 2012-07-14 21:26:41Z

It looks like the float() cast was the solution after all.

First of all, the inverse trigonometric functions don't take values outside of their domain, so they're completely safe and the exception can be caught.

>>> acos(5e100)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: math domain error

The same thing happens with the fmod() function.

The "normal" trigonometric functions don't seem to have any problem with big values unless they're really big, in which case the function raises a ValueError again.

The rounding functions (ceil(), floor() and round()) work fine and return inf if the value is too big. The same goes for the degrees(), log(), log10(), pow(), sqrt(), fabs(), hypot() and radians() functions.

The hyperbolic trigonometric functions and the exp() function throw OverflowErrors or return inf.

The atan2() function works perfectly fine with big values.

For simple arithmetic operations, the float cast makes the function throw an OverflowError (or an inf) instead of doing the calculation.

>>> float(10) ** float(100) ** float(100) ** float(1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: (34, 'Numerical result out of range')
>>> float(5e500) * float(4e1000)
inf
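The behaviour observed so far (ValueError for domain problems, OverflowError or inf for results that are too big) could be turned into bot replies with one small wrapper. This is only a sketch assuming Python 3; describe is a hypothetical helper name and the messages are placeholders.

```python
import math


def describe(func, *args):
    """Call func(*args) and turn the failure modes listed above into short messages."""
    try:
        result = func(*args)
    except ValueError:
        return "Error: value outside the function's domain"
    except OverflowError:
        return "Error: result too large"
    # Some operations return inf instead of raising, so check for that too.
    if isinstance(result, float) and math.isinf(result):
        return "Error: result too large"
    return str(result)
```

For example, describe(math.acos, 5e100) reports a domain error and describe(math.exp, 1e6) reports an overflow, while describe(math.sqrt, 4.0) returns "2.0".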

Lastly, the problematic factorial() function. All I had to do was redefine the function in an iterative way and add it to safe_dict.

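import sys

def factorial(n):
    # Iterative factorial in floating point: the product overflows to inf
    # quickly instead of growing into a huge arbitrary-precision integer.
    fact = 1.0
    n = float(n)
    while n > 0:
        fact *= n
        n -= 1.0
        if fact > sys.float_info.max:  # only true once fact has become inf
            return "Too big"
    return str(fact)

print(factorial(50e500))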

While this is a very ugly and grossly inefficient way of calculating a factorial, it's enough for my needs. In fact, I think I added a lot of unnecessary float()s.
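An alternative to looping until the product overflows is to reject large arguments before computing anything. The sketch below assumes Python 3 and IEEE 754 doubles, where 170! (about 7.26e306) is the largest factorial that still fits in a float; safe_factorial and MAX_FACTORIAL_ARG are illustrative names, not part of the original calculator.

```python
import math

# Assumption: floats are IEEE 754 doubles, so 170! fits and 171! does not.
MAX_FACTORIAL_ARG = 170


def safe_factorial(n):
    """Return n! as a float, refusing arguments whose result cannot fit in a float."""
    if n > MAX_FACTORIAL_ARG:
        # Fail immediately instead of computing a huge integer first.
        raise OverflowError("factorial result too large for a float")
    if n < 0 or n != int(n):
        raise ValueError("factorial is only defined for non-negative integers")
    return float(math.factorial(int(n)))
```

Because the check happens first, a call such as safe_factorial(1000000) fails right away, whereas float(factorial(1000000)) only fails after the whole integer has been computed.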

Now I need to figure out how to put float()s around all the terms in an expression so this happens automatically.
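One possible way to do that, sketched here under the assumption of Python 3.8 or later, is to parse the expression with the ast module and replace every integer literal with its float equivalent before evaluating. FloatLiterals and floatify are made-up names for this illustration.

```python
import ast


class FloatLiterals(ast.NodeTransformer):
    """Rewrite integer literals in an expression as float literals."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            # float() itself raises OverflowError for absurdly large literals.
            return ast.copy_location(ast.Constant(value=float(node.value)), node)
        return node


def floatify(expr):
    """Compile expr so that all of its integer literals are treated as floats."""
    tree = ast.parse(expr, mode="eval")
    tree = ast.fix_missing_locations(FloatLiterals().visit(tree))
    return compile(tree, "<calc>", "eval")
```

With this, eval(floatify("10 ** 100 ** 100")) should raise an OverflowError right away instead of building an enormous integer. Note that only literals are rewritten: values returned by functions are untouched, so something like factorial() still needs its own guard, and functions that insist on integer arguments may reject the converted literals.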
