Thursday, May 22, 2008

Comment on article about 'vm attacks' at www.eusecwest.com

I was reading the following story:

http://www.eusecwest.com/justin-ferguson-interpreter-vm-attacks.html

I'll keep my subjective opinion about the article to myself and will focus on the following:

I think that presenting the function 'sys._getframe()' as a way to 'obtain a heap address', as the article does, is misleading.

Python gives away memory addresses all the time; there's no need to call a 'weird' function (and sys._getframe() is not weird anyway):

(from http://shell.appspot.com/, but applicable to any Python deployment):

>>> a = 'mythbusters'
>>> id(a)
6912173043421908880
>>> hex(id(a))
'0xe81da54d11f45f88L'
>>> sys._getframe()
<frame object at 0xe81da54d1ff6afc8>

Both addresses are clearly in the same range, so I can infer that they refer to the same region of memory. If that region is the heap, then both methods leak a heap address or, more importantly, they leak the same kind of information :)
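As a minimal sketch of the comparison above, assuming CPython (where id() returns the object's memory address as an implementation detail) and a current Python 3 interpreter rather than the Python 2 shell used in the transcript. The helper name leaked_addresses is made up for illustration:

```python
import sys


def leaked_addresses():
    """Return (string address, frame address) as exposed by CPython."""
    text = 'mythbusters'
    frame = sys._getframe()  # frame object of this very function call
    # Assumption: on CPython, id() is the address of the object in memory.
    return id(text), id(frame)


text_addr, frame_addr = leaked_addresses()
# Print both side by side to compare their ranges by eye.
print(hex(text_addr))
print(hex(frame_addr))
```

The exact values change from run to run, so no output is shown here. The point is only that id() hands out an address just as readily as sys._getframe() does.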


or

(on a windows machine)

>>> class a:
...     def test(self):
...         print 'hola'
...
>>> j = a()
>>> j
<__main__.a instance at 0x004AF0F8>
>>> sys._getframe()
<frame object at 0x00475960>
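The transcript above suggests that the default repr of an instance already contains its address. A small sketch of that idea, assuming CPython and Python 3 (the class A and the helper address_from_repr are hypothetical names, not from the article):

```python
import re


class A(object):
    def test(self):
        print('hola')


def address_from_repr(obj):
    """Parse the hexadecimal address out of a default '<... at 0x...>' repr."""
    match = re.search(r' at 0x([0-9A-Fa-f]+)>', repr(obj))
    if match is None:
        return None  # the object defines its own repr without an address
    return int(match.group(1), 16)


j = A()
# Assumption: on CPython the address printed by repr() equals id().
print(address_from_repr(j) == id(j))
```

So simply echoing an object at the prompt discloses the same kind of value, with no call to sys._getframe() or even id().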

and finally (done at http://shell.appspot.com/)

>>> import os
>>> os.uname()
('Linux', '', '', '', '')

If you think I'm wrong, please comment!

4 comments:

jf said...

The point you missed was that Python breaks if you disable sys._getframe(), whereas (AFAIK) disabling id() does not break the language. I specifically mentioned id() et al. in my talk; that was just an example. Furthermore, pretty much all heap overflows corrupt frame objects, so it's not just *any* address, but an address that quite likely points to your data.

Also, I don't think I called it weird; I believe I referred to it as being internal, which is exactly what it's supposed to be.

jf said...

Also, figure out how to do this: id(a).func_code.co_code (this is actually incredibly important; I'll let you figure out why).

hernan said...

Hi jf!

Interesting points! I tried downloading the Python VM source and removing sys._getframe() altogether, and everything still works.

Here's what I did, please let me know if I did something wrong:

-Downloaded python from http://www.python.org/ftp/python/2.5.2/Python-2.5.2.tgz

-Changed function 'static PyObject *sys_getframe(PyObject *self, PyObject *args)'
located at './Python/sysmodule.c' to always return NULL.

-Recompiled, ran python and tried the following:

Python 2.5.2 (r252:60911, May 23 2008, 13:14:40)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys._getframe()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: error return without exception set
>>>

Now, I always get an error when accessing sys._getframe().

-Then I tried out several python scripts and everything worked fine.
For example, http://docs.python.org/lib/socket-example.html.

This is a quick and dirty hack, of course, but it doesn't seem to break the language: scripts still run without problems. You might run into trouble with certain scripts (specifically those that call sys._getframe() directly), but you could do a better job by changing what sys._getframe() returns (using a facade or something similar, I don't know) and perhaps eliminate those potential issues as well.
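A rough Python-level analogue of that experiment, for anyone who does not want to recompile the interpreter. This is only a sketch under stated assumptions: it swaps the attribute on the sys module, so it only affects code that looks up sys._getframe at call time, unlike the C patch, and the name run_without_getframe is invented for illustration:

```python
import sys


def run_without_getframe(func):
    """Call func() while sys._getframe is replaced by a stub that raises."""
    original = sys._getframe

    def disabled(*args):
        raise RuntimeError('sys._getframe is disabled')

    sys._getframe = disabled
    try:
        return func()
    finally:
        # Always restore the real function, even if func() fails.
        sys._getframe = original


# Ordinary code that never touches frames runs as usual.
print(run_without_getframe(lambda: sum(range(10))))
```

Code that grabbed a reference to the real function before the swap is not affected, so this is weaker than returning NULL from sys_getframe() in sysmodule.c.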

Did you experience different results?

Thanks so much for taking the time to comment!

jf said...

I guess I should rephrase: the _getframe() method is supposed to be internal, but there is no real scoping. It's used throughout a few modules in Lib/, which will of course break if you remove that functionality. After actually going through the source and seeing what would break, I stand mostly corrected, in that it appears that nothing overly serious would break. I would be mindful of disabling it, however, just because future revisions might depend on it more.

It looks like it gets used mostly to obtain variables and such local to the frame.
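One way to repeat that check on your own installation is to search the standard library sources for the name. A minimal sketch, assuming Python 3 and that the .py sources are installed on disk (files_mentioning_getframe is a hypothetical helper, and a plain substring match will also catch comments and docstrings):

```python
import os
import sysconfig


def files_mentioning_getframe(lib_dir):
    """List .py files under lib_dir whose source contains '_getframe'."""
    hits = []
    for root, dirs, files in os.walk(lib_dir):
        # Skip third-party packages that may live under the stdlib directory.
        dirs[:] = [d for d in dirs if d != 'site-packages']
        for name in files:
            if not name.endswith('.py'):
                continue
            path = os.path.join(root, name)
            try:
                with open(path, 'rb') as handle:
                    if b'_getframe' in handle.read():
                        hits.append(os.path.relpath(path, lib_dir))
            except (IOError, OSError):
                continue  # unreadable file, ignore it
    return sorted(hits)


stdlib_dir = sysconfig.get_paths()['stdlib']
for relative_path in files_mentioning_getframe(stdlib_dir):
    print(relative_path)
```

The list differs between Python versions, which fits the caution above about future revisions depending on it more.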