- cpython/Objects/frameobject.c
- cpython/Include/frameobject.h
The PyFrameObject is the stack frame in the Python virtual machine. It contains space for the currently executing code object, parameters, variables in different scopes, try block info, and more.
For more information, please refer to stack frame strategy.
Every time you make a function call, a new PyFrameObject is created and attached to that call.
It's not intuitive to trace a frame object in the middle of a function, so I will use a generator object for the explanation.
You can get the frame of the current environment by calling sys._getframe(). sys._current_frames() returns a dict that maps each thread's identifier to that thread's topmost frame.
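The two sys helpers can be compared directly from Python. This is a minimal sketch: the function names current_frame_name and topmost_frame_of_this_thread are made up for illustration, and both sys functions are CPython implementation details that other interpreters may not provide.

```python
import sys
import threading


def current_frame_name():
    # sys._getframe() returns the frame of the function that calls it.
    frame = sys._getframe()
    return frame.f_code.co_name


def topmost_frame_of_this_thread():
    # sys._current_frames() maps thread id -> that thread's topmost frame.
    frames = sys._current_frames()
    return frames[threading.get_ident()]


print(current_frame_name())
print(topmost_frame_of_this_thread().f_code.co_name)
```

Each call reports the name of the function it was made from, because that function's frame is the topmost one at the moment of the call.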
If you need the meaning of each field, please refer to Junnplus' blog or read the source code directly.
PyFrameObject is a variable-sized object, so it can be cast to PyVarObject. The real ob_size is decided by the code object.
Py_ssize_t extras, ncells, nfrees;
ncells = PyTuple_GET_SIZE(code->co_cellvars);
nfrees = PyTuple_GET_SIZE(code->co_freevars);
extras = code->co_stacksize + code->co_nlocals + ncells + nfrees;
/* omit */
if (free_list == NULL) { /* omit */
    f = PyObject_GC_NewVar(PyFrameObject, &PyFrame_Type, extras);
}
else { /* omit */
    PyFrameObject *new_f = PyObject_GC_Resize(PyFrameObject, f, extras);
}
extras = code->co_nlocals + ncells + nfrees;
f->f_valuestack = f->f_localsplus + extras;
for (i=0; i<extras; i++)
    f->f_localsplus[i] = NULL;
The ob_size is the sum of code->co_stacksize, code->co_nlocals, the length of code->co_cellvars, and the length of code->co_freevars:
code->co_stacksize: an integer that represents the maximum amount of stack space the function will use. It's computed when the code object is generated.
code->co_nlocals: the number of local variables
code->co_cellvars: a tuple containing the names of all variables in the function that are also used in a nested function
code->co_freevars: a tuple containing the names of all variables used in the function that are defined in an enclosing function scope
For more information about PyCodeObject, please refer to What is a code object in Python? and code object.
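The four code object fields listed above are visible from Python through a function's __code__ attribute. The sketch below uses a made-up closure (outer and inner are illustrative names). The exact co_stacksize value depends on the interpreter version, so it's printed rather than assumed.

```python
def outer(x):
    y = x + 1

    def inner(z):
        # x and y come from the enclosing scope, z is a plain local
        return x + y + z

    return inner


inner = outer(1)
outer_code = outer.__code__
inner_code = inner.__code__

print("outer cellvars: ", outer_code.co_cellvars)
print("inner freevars: ", inner_code.co_freevars)
print("inner nlocals:  ", inner_code.co_nlocals)
print("inner stacksize:", inner_code.co_stacksize)
```

The variables x and y appear as cell variables of outer and as free variables of inner, while z is the only local of inner.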
Let's see an example:
def g2(a, b=1, c=2):
    yield a
    c = str(b + c)
    yield c
    new_g = range(3)
    yield from new_g
The dis result:
# ./python.exe -m dis frame_dis.py
  1           0 LOAD_CONST               5 ((1, 2))
              2 LOAD_CONST               2 (<code object g2 at 0x10c495030, file "frame_dis.py", line 1>)
              4 LOAD_CONST               3 ('g2')
              6 MAKE_FUNCTION            1 (defaults)
              8 STORE_NAME               0 (g2)
             10 LOAD_CONST               4 (None)
             12 RETURN_VALUE

Disassembly of <code object g2 at 0x10c495030, file "frame_dis.py", line 1>:
  2           0 LOAD_FAST                0 (a)
              2 YIELD_VALUE
              4 POP_TOP

  3           6 LOAD_GLOBAL              0 (str)
              8 LOAD_FAST                1 (b)
             10 LOAD_FAST                2 (c)
             12 BINARY_ADD
             14 CALL_FUNCTION            1
             16 STORE_FAST               2 (c)

  4          18 LOAD_FAST                2 (c)
             20 YIELD_VALUE
             22 POP_TOP

  5          24 LOAD_GLOBAL              1 (range)
             26 LOAD_CONST               1 (3)
             28 CALL_FUNCTION            1
             30 STORE_FAST               3 (new_g)

  6          32 LOAD_FAST                3 (new_g)
             34 GET_YIELD_FROM_ITER
             36 LOAD_CONST               0 (None)
             38 YIELD_FROM
             40 POP_TOP
             42 LOAD_CONST               0 (None)
             44 RETURN_VALUE
Let's iterate through the generator
>>> gg = g2("param a")
After the first next returns, the first opcode 0 LOAD_FAST 0 (a) has been executed, and execution is suspended in the middle of the second opcode 2 YIELD_VALUE.
The field f_lasti is 2, indicating that the virtual program counter is at 2 YIELD_VALUE.
The opcode LOAD_FAST pushes the parameter onto f_valuestack, and the opcode YIELD_VALUE pops the top element off f_valuestack. Pop is defined as #define BASIC_POP() (*--stack_pointer).
The value of f_valuestack (address 0x100a5b538) is the same as in the previous step (previous picture), but the first element at that address is different. It's now a pointer to a PyUnicodeObject ('param a'), or an invalid address if the PyUnicodeObject has been deallocated.
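The f_lasti field can also be watched from Python through the generator's gi_frame attribute. The sketch below uses a shortened copy of g2 and records f_lasti at each suspension. The concrete offsets depend on the interpreter version (the build used in this article reports 2 and 20 for these two yields), so the code only relies on the offset growing as execution advances.

```python
def g2(a, b=1, c=2):
    yield a
    c = str(b + c)
    yield c


gen = g2("param a")
offsets = []
for value in gen:
    # gi_frame is still alive while the generator is suspended at a yield
    offsets.append((value, gen.gi_frame.f_lasti))

print(offsets)
print(gen.gi_frame)
```

Once the loop has exhausted the generator, gi_frame is None because the frame is no longer attached to it.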
>>> next(gg)
'param a'
>>> next(gg)
'3'
The opcodes 6 LOAD_GLOBAL 0 (str), 8 LOAD_FAST 1 (b), and 10 LOAD_FAST 2 (c) in line 3 push str (the name str is stored in the frame->f_code->co_names field), b (int 1), and c (int 2) onto f_valuestack. Opcode 12 BINARY_ADD pops the top 2 elements off f_valuestack (b and c), adds them, and pushes the result back onto f_valuestack. After 12 BINARY_ADD, f_valuestack holds str and the int 3.
The opcode 14 CALL_FUNCTION 1 pops the argument and the function off the stack and delegates the actual function call.
After the function call, the result '3' is pushed onto the stack.
Opcode 16 STORE_FAST 2 (c) pops the top element off f_valuestack and stores it at index 2 of f_localsplus.
Opcode 18 LOAD_FAST 2 (c) pushes the element at index 2 of f_localsplus onto f_valuestack, and 20 YIELD_VALUE pops it and sends it to the caller.
Field f_lasti is 20, indicating that it's currently executing the opcode 20 YIELD_VALUE.
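The push and pop sequence for line 3 of g2 can be replayed with an ordinary Python list standing in for f_valuestack. This is a toy model written for this explanation, not CPython's evaluation loop. The function name run_line_3 and the global_names dict are made up for illustration.

```python
def run_line_3(localsplus, global_names):
    """Toy replay of opcodes 6-16 of g2 on a list-based value stack."""
    stack = []
    stack.append(global_names["str"])   # 6  LOAD_GLOBAL 0 (str)
    stack.append(localsplus[1])         # 8  LOAD_FAST 1 (b)
    stack.append(localsplus[2])         # 10 LOAD_FAST 2 (c)
    right = stack.pop()                 # 12 BINARY_ADD pops two operands
    left = stack.pop()
    stack.append(left + right)          #    and pushes their sum
    arg = stack.pop()                   # 14 CALL_FUNCTION 1 pops the argument
    func = stack.pop()                  #    then the callable
    stack.append(func(arg))             #    and pushes the call result
    localsplus[2] = stack.pop()         # 16 STORE_FAST 2 (c)
    return stack


localsplus = ["param a", 1, 2, None]    # slots for a, b, c, new_g
leftover = run_line_3(localsplus, {"str": str})
print(localsplus, leftover)
```

After the replay, the slot for c holds the string '3' and the value stack is empty again, matching the walkthrough.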
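After 24 LOAD_GLOBAL 1 (range) and 26 LOAD_CONST 1 (3), f_valuestack holds range and the int 3.
After 28 CALL_FUNCTION 1, f_valuestack holds the resulting range object.
After 30 STORE_FAST 3 (new_g), f_valuestack is empty and the range object is stored at index 3 of f_localsplus.
After 32 LOAD_FAST 3 (new_g), the range object is back on top of f_valuestack.
The opcode 34 GET_YIELD_FROM_ITER makes sure the top of the stack is an iterator. The range object is not one, so it's replaced with the result of calling iter() on it.
36 LOAD_CONST 0 (None) pushes None onto the stack.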
>>> next(gg)
0
Field f_lasti is 36, the offset of the instruction just before 38 YIELD_FROM.
At the end of YIELD_FROM, the code f->f_lasti -= sizeof(_Py_CODEUNIT); moves f_lasti back by one instruction, so that YIELD_FROM is executed again the next time the generator is resumed. Thanks to @RyanHe123.
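While a generator is suspended inside yield from, the iterator it delegates to is exposed as gi_yieldfrom. The sketch below reuses g2 to show that the range object has been turned into an iterator. The variable names first and delegate are illustrative.

```python
def g2(a, b=1, c=2):
    yield a
    c = str(b + c)
    yield c
    new_g = range(3)
    yield from new_g


gen = g2("param a")
next(gen)                      # 'param a'
next(gen)                      # '3'
first = next(gen)              # first value produced by the delegated iterator
delegate = gen.gi_yieldfrom    # the iterator that yield from is driving
print(first, type(delegate).__name__)
```

The delegate is a range iterator rather than the range object itself, which is the conversion GET_YIELD_FROM_ITER performs.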
The frame object is deallocated after StopIteration is raised (the opcode 44 RETURN_VALUE is also executed)
>>> next(gg)
1
>>> next(gg)
2
>>> next(gg)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> repr(gg.gi_frame)
'None'
f_blockstack is an array. Its element type is PyTryBlock and its size is CO_MAXBLOCKS (20).
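The fixed size of the block stack shows up at compile time: in the CPython versions I'm aware of, the compiler rejects source with more than 20 statically nested blocks. The sketch below generates nested for loops to probe that limit. The helper names nested_loops_source and compiles are made up, and the exact limit is an assumption about the interpreter being used.

```python
def nested_loops_source(depth):
    """Build source text with `depth` for-loops nested inside each other."""
    lines = []
    for level in range(depth):
        lines.append(" " * level + f"for x{level} in ():")
    lines.append(" " * depth + "pass")
    return "\n".join(lines) + "\n"


def compiles(depth):
    # True if the generated source compiles, False on SyntaxError.
    try:
        compile(nested_loops_source(depth), "<nested>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(3))
print(compiles(21))
```

A shallow nesting compiles, while 21 levels should fail with "too many statically nested blocks".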
The definition of PyTryBlock:
typedef struct {
    int b_type;                 /* what kind of block this is */
    int b_handler;              /* where to jump to find handler */
    int b_level;                /* value stack level to pop to */
} PyTryBlock;
Let's define a generator with some blocks:
def g3():
    try:
        yield 1
        1 / 0
    except ZeroDivisionError:
        yield 2
        try:
            yield 3
            import no
        except ModuleNotFoundError:
            for i in range(3):
                yield i + 4
            yield 4
        finally:
            yield 100
>>> gg = g3()
In the first yield statement, the first try block is set up.
f_iblock is 1, indicating that there's currently one block.
b_type 122 is the opcode SETUP_FINALLY, b_handler 20 is the opcode offset of the except ZeroDivisionError handler, and b_level 0 is the value stack level to pop back to.
>>> next(gg)
1
b_type 257 is EXCEPT_HANDLER, which has a special meaning:
/* EXCEPT_HANDLER is a special, implicit block type which is created when
   entering an except handler. It is not an opcode but we define it here
   as we want it to be available to both frameobject.c and ceval.c, while
   remaining private.*/
#define EXCEPT_HANDLER 257
b_handler is set to -1, since the except handler is already being processed.
b_level doesn't change.
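Because the block information lives in the frame, a suspended generator still knows which handlers are active when it's resumed. The sketch below shows only the behaviour visible from Python, not the f_blockstack fields themselves. The generator guarded and the events list are made up for illustration, and newer CPython versions may track handlers differently internally.

```python
events = []


def guarded():
    try:
        yield "in try"
    except ValueError:
        yield "in except"
    finally:
        events.append("finally ran")


gen = guarded()
in_try = next(gen)                 # suspended inside the try block
in_except = gen.throw(ValueError)  # the pending except handler catches it
gen.close()                        # GeneratorExit unwinds through finally
print(in_try, in_except, events)
```

The exception thrown in at the first yield is routed to the except clause, and closing the generator still runs the finally clause.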
>>> next(gg)
2
f_iblock is 3. The second try block comes from finally: (opcode position 116), and the third try block comes from except ModuleNotFoundError: (opcode position 62).
>>> next(gg)
3
>>> next(gg)
4
The b_type of the third try block becomes 257 and its b_handler becomes -1, which means this block is currently being handled.
The other two try blocks are handled properly.
>>> next(gg)
5
>>> next(gg)
6
>>> next(gg)
4
>>> next(gg)
100
The frame object is deallocated once StopIteration is raised:
>>> next(gg)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
f_back is a pointer to the previous frame. It links the related frames into a singly linked list.
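Since f_back links each frame to its caller, following it from the current frame reproduces the call stack. A minimal sketch, assuming CPython's sys._getframe() is available. The helper names call_chain and depth are made up for illustration.

```python
import sys


def call_chain():
    # Walk the singly linked list of frames through f_back.
    names = []
    frame = sys._getframe()
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names


def depth(n):
    # Recurse n times, then capture the chain from the innermost call.
    return call_chain() if n == 0 else depth(n - 1)


names = depth(2)
print(names)
```

The list starts with call_chain, followed by one entry per depth call, and continues with whatever frames called the script.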
import inspect
def g4(depth):
    print("depth", depth)
    print(repr(inspect.currentframe()), inspect.currentframe().f_back)
    if depth > 0:
        g4(depth-1)
g4(3)
Output:
depth 3
<frame at 0x7fedc2f2e9a8, file '<input>', line 3, code g4> <frame at 0x7fedc2cab468, file '<input>', line 1, code <module>>
depth 2
<frame at 0x7fedc2de54a8, file '<input>', line 3, code g4> <frame at 0x7fedc2f2e9a8, file '<input>', line 5, code g4>
depth 1
<frame at 0x7fedc2ca6348, file '<input>', line 3, code g4> <frame at 0x7fedc2de54a8, file '<input>', line 5, code g4>
depth 0
<frame at 0x10c2c9930, file '<input>', line 3, code g4> <frame at 0x7fedc2ca6348, file '<input>', line 5, code g4>
The first time a code object is attached to a frame object, the frame object is not freed after the code block finishes executing. It becomes a "zombie" frame, and the next time the same code block executes, it reuses that frame object.
This strategy saves malloc/realloc overhead and some field initialization.
def g5():
    yield 1
>>> gg = g5()
>>> gg.gi_frame
<frame at 0x10224c970, file '<stdin>', line 1, code g5>
>>> next(gg)
1
>>> next(gg)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> gg3 = g5()
>>> gg3.gi_frame # id same as previous one, the same frame object in the same code block is reused
<frame at 0x10224c970, file '<stdin>', line 1, code g5>
There's a singly linked list that stores deallocated frame objects. It saves malloc/free overhead.
static PyFrameObject *free_list = NULL;
static int numfree = 0; /* number of frames currently in free_list */
/* max value for numfree */
#define PyFrame_MAXFREELIST 200
When a PyFrameObject is on the free list, only the following members have meaning:
ob_type             == &Frametype
f_back              next item on free list, or NULL
f_stacksize         size of value stack
ob_size             size of localsplus
The creation process checks whether the frame taken from the free list is large enough:
if (Py_SIZE(f) < extras) {
    PyFrameObject *new_f = PyObject_GC_Resize(PyFrameObject, f, extras);
    /* omit */
}
Let's see an example:
import inspect
def g6():
    yield repr(inspect.currentframe()), inspect.currentframe().f_back
>>> gg = g6()
>>> gg1 = g6()
>>> gg2 = g6()
The frame attached to gg goes through deallocation when the generator finishes. Because it's the first frame to execute this code block, it becomes the "zombie" frame of the code object.
The code object still holds a reference to the "zombie" frame, so the frame object neither goes to the free_list nor triggers gc.
>>> next(gg)
("<frame at 0x1052d83a0, file '<stdin>', line 2, code g6>", <frame at 0x105225e50, file '<stdin>', line 1, code <module>>)
>>> next(gg)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> next(gg1)
("<frame at 0x105620040, file '<stdin>', line 2, code g6>", <frame at 0x105474cc0, file '<stdin>', line 1, code <module>>)
>>> next(gg1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> next(gg2)
("<frame at 0x105482d00, file '<stdin>', line 2, code g6>", <frame at 0x105225e50, file '<stdin>', line 1, code <module>>)
>>> next(gg2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
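The free list described above can be modelled in a few lines of Python. This is a toy sketch of the idea only, not the C implementation: ToyFrame, ToyFramePool, and MAX_FREE are made-up names, next_free plays the role f_back has while a frame sits on the free list, and the slots list stands in for f_localsplus.

```python
MAX_FREE = 200  # same limit as PyFrame_MAXFREELIST in the excerpt


class ToyFrame:
    def __init__(self, size):
        self.slots = [None] * size   # stands in for f_localsplus
        self.next_free = None        # role played by f_back on the free list


class ToyFramePool:
    def __init__(self):
        self.free_list = None
        self.numfree = 0

    def allocate(self, extras):
        if self.free_list is None:
            return ToyFrame(extras)              # nothing to reuse
        frame = self.free_list                   # pop the list head
        self.free_list = frame.next_free
        frame.next_free = None
        self.numfree -= 1
        if len(frame.slots) < extras:            # too small: grow it
            frame.slots.extend([None] * (extras - len(frame.slots)))
        for i in range(len(frame.slots)):        # reset stale contents
            frame.slots[i] = None
        return frame

    def release(self, frame):
        if self.numfree < MAX_FREE:              # keep at most MAX_FREE frames
            frame.next_free = self.free_list
            self.free_list = frame
            self.numfree += 1


pool = ToyFramePool()
first = pool.allocate(3)
pool.release(first)
second = pool.allocate(5)
print(second is first, len(second.slots), pool.numfree)
```

The second allocation gets back the released object, grown to the requested size, which is the saving the free list is meant to provide.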





















