Have you ever wondered how memory is managed in Python?
Or what a reference count is in Python?
This article covers what a reference count is, how it is calculated, and how Python uses it for memory management.
Without wasting any further time, let’s start point by point…
Put simply, the reference count of a Python object is the number of places that currently refer to that object.
How is the reference count calculated?
getrefcount() is a function in Python's built-in sys module. It takes a Python object as input and returns the number of references to that object.
The input to getrefcount() can be a variable, a literal value, a function, a class, or anything else that is a Python object.
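As a minimal sketch of this, the snippet below passes a function, a class and a list to getrefcount(). The names greet, Point and values are made up for illustration, and the printed numbers will differ between Python versions and systems.

```python
import sys

# Hypothetical objects used only for this illustration.
def greet():
    return "hello"

class Point:
    pass

values = [1, 2, 3]

# getrefcount() accepts any Python object, not just numbers.
counts = {
    "function": sys.getrefcount(greet),
    "class": sys.getrefcount(Point),
    "list": sys.getrefcount(values),
}

for kind, count in counts.items():
    print(kind, count)
```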
Let’s take an example with an integer value…
    import sys
    print(sys.getrefcount(1556778))
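Here the output is 3, which means the integer object 1556778 is referenced 3 times.
You might be curious: how does the count reach 3 when you have used the value only once?
The reference count depends on two factors: how the object is used in the compiled bytecode, and whether it is used anywhere else in the running program or by the interpreter itself.
Let’s dig into some technical detail…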
When you run a Python program, it is first compiled into bytecode. The reference count of an object depends on how the object is used in that bytecode, not on how many times it appears in your high-level source code.
You can inspect the bytecode of your program with the dis module, which disassembles Python bytecode.
Below is code that prints the bytecode for the example above.
    import dis
    import sys
    print(compile("sys.getrefcount(1556778)", '', 'single').co_consts)
    print(dis.dis(compile("sys.getrefcount(1556778)", '', 'single')))
    print(sys.getrefcount(1556778))
    (1556778, None)
      1           0 LOAD_NAME                0 (sys)
                  3 LOAD_ATTR                1 (getrefcount)
                  6 LOAD_CONST               0 (1556778)
                  9 CALL_FUNCTION            1
                 12 PRINT_EXPR
                 13 LOAD_CONST               1 (None)
                 16 RETURN_VALUE
    None
    3
'single' is one of the modes accepted by compile(); it compiles a single interactive statement.
There are 3 references here: one from the co_consts tuple on the code object, one on the stack (from the LOAD_CONST instruction), and one held by the sys.getrefcount() call itself for its argument.
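The same inspection can be sketched on Python 3 without printing the full disassembly. This assumes Python 3.4 or later, where dis.get_instructions() is available. The filename "<example>" is an arbitrary placeholder, and the opcode names you see depend on the Python version.

```python
import dis

# Compile the statement without running it; "<example>" is a placeholder filename.
code = compile("sys.getrefcount(1556778)", "<example>", "single")

# The constant is stored on the code object, which holds one reference to it.
constants = code.co_consts

# Collect the opcode names instead of printing the whole listing.
opnames = [instr.opname for instr in dis.get_instructions(code)]

print(constants)
print(opnames)
```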
If the same object is used in other parts of the code, those uses also count toward its reference count.
In addition, the Python interpreter does a lot of work behind the scenes. If it uses the same object internally while your program runs, those references are counted as well.
The output (reference count) may vary from system to system.
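Because absolute counts vary, it is often easier to look at how the count changes. The sketch below uses a fresh object() (an assumption made so that nothing else in the interpreter refers to it) and shows the count going up when a second name is bound to it and back down when that name is deleted.

```python
import sys

# A brand-new object that only this script refers to.
data = object()

before = sys.getrefcount(data)

# Binding a second name to the same object adds one reference.
alias = data
after_alias = sys.getrefcount(data)

# Deleting the name removes that reference again.
del alias
after_del = sys.getrefcount(data)

print(before, after_alias, after_del)
```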
When you pass a variable as an argument to a function, the reference count of the object it refers to is incremented. When control leaves the function, the reference count is decremented again.
    import sys
    a = 10
    print(sys.getrefcount(a))  #17
    def func(b):
        print(sys.getrefcount(a))  #19
    func(a)
    print(sys.getrefcount(a))  #17
Note: The count is shown as 19 instead of 18 because the object is referenced twice for the duration of the call: once by the argument passed to func() and once by the parameter b inside the function. The exact increase depends on the Python version.
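Here is a minimal sketch of the same effect with a fresh object instead of the integer 10, so that the interpreter's own references to small integers do not affect the numbers. The names item and count_inside are hypothetical, and the size of the increase inside the function depends on the CPython version.

```python
import sys

item = object()

def count_inside(param):
    # While the call is running, param is an extra reference to item.
    return sys.getrefcount(item)

outside_before = sys.getrefcount(item)
inside = count_inside(item)
outside_after = sys.getrefcount(item)

print(outside_before, inside, outside_after)
```

Once the function returns, its parameter goes away and the count returns to its earlier value.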
Besides the list object itself, every element in a list has its own reference count.
When you delete a list, or when its lifetime ends, the reference count of each element in the list goes down by one.
    import sys
    liAbc = ['a', 'b', 'c']
    print(sys.getrefcount('a'))  #14
    print(sys.getrefcount('b'))  #12
    print(sys.getrefcount('c'))  #23
    del liAbc
    print(sys.getrefcount('a'))  #13
    print(sys.getrefcount('b'))  #11
    print(sys.getrefcount('c'))  #22
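The effect is easier to verify with objects that nothing else refers to. In this sketch (the names first, second and items are made up for illustration), an object stored in a list twice gains two references, and loses both when the list is deleted.

```python
import sys

first = object()
second = object()

base = sys.getrefcount(first)

# Each slot in the list holds its own reference to the element.
items = [first, second, first]
with_list = sys.getrefcount(first)

# Deleting the list releases every reference it held.
del items
after_del = sys.getrefcount(first)

print(base, with_list, after_del)
```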
For more detail about lists, read Python list vs tuple.
Python uses dynamic memory allocation. When you create a variable, you don’t need to allocate memory for it explicitly. When an object is no longer used in the program, it is deleted.
This raises questions: how does Python know an object is no longer in use, and when is its memory released?
This is where the reference count comes in.
How does Python use reference counts for memory management?
Python keeps a reference count for each object. Each time you create another reference to the object, the count is incremented.
When a reference goes out of scope, the count is decremented.
When the reference count reaches zero, the object is no longer in use, and the memory allocated to it is released.
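One way to observe the moment an object is released is weakref.finalize() from the standard library, which registers a callback that runs when the object is destroyed. This is only a sketch: the Resource class and the events list are hypothetical, and the immediate cleanup shown here is CPython behaviour that other implementations may not share.

```python
import weakref

class Resource:
    """Hypothetical class standing in for any object."""

events = []

res = Resource()
# Record a message at the moment the object is destroyed.
weakref.finalize(res, events.append, "freed")

second_ref = res

# One reference is still alive, so nothing is freed yet.
del res
events_after_first_del = list(events)

# The last reference is gone, so the count reaches zero.
del second_ref
events_after_second_del = list(events)

print(events_after_first_del, events_after_second_del)
```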
Integer is one of the numeric data types in Python.
When you create an integer object, its value is stored in memory for use in the program, and its reference count is set.
When you assign the same integer value to another variable, the reference count increases.
This also saves resources, because the value is stored in a single place that is shared by every variable holding that value. CPython does this for small integers (-5 to 256); larger values may or may not be shared.
Integers are immutable in Python, so the value of an integer object cannot be changed. A new value is stored as a different object in memory with its own reference count.
    import sys
    print(sys.getrefcount(55))  #4
    var = 55
    print(sys.getrefcount(55))  #5
    var = var + 1
    print(sys.getrefcount(55))  #4
In the program above, the value of the variable var is incremented (you could also assign any other value or delete the variable). Because integers are immutable, the value 55 is not updated in place. Instead, var is bound to a different object holding 56, and the reference count of the previous value drops by one.
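The rebinding can also be checked with the is operator, which tests whether two names refer to the same object. This is a small sketch with made-up names a and b. The sharing of the object for 55 relies on CPython's cache of small integers, which is an implementation detail.

```python
a = 55
b = 55

# In CPython both names refer to the same cached integer object.
shared_before = a is b

# Incrementing does not modify the object; a is bound to a different one.
a = a + 1
shared_after = a is b

print(shared_before, shared_after, a, b)
```

The name b still refers to 55, which shows that the original object was never changed.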
Now, what if you use a smaller integer value?
    import sys
    print(sys.getrefcount(1))  #97
    print(sys.getrefcount(2))  #76
    print(sys.getrefcount(3))  #30
This means the integer value 1 is referenced 97 times, 2 is referenced 76 times, and 3 is referenced 30 times.
The Python interpreter does a lot of work behind the scenes and uses these small values itself, which is why the counts are so high. The output may vary from system to system.
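To make the contrast concrete, this sketch compares the count for 1 with the count for a large integer built at runtime. The value passed to int() is arbitrary. Note that from Python 3.12 onward small integers are "immortal", so getrefcount() reports a very large fixed number for them rather than a real count.

```python
import sys

# A small integer that the interpreter itself uses heavily.
small_count = sys.getrefcount(1)

# Built from a string at runtime so that nothing else refers to it.
large_value = int("1556778000001")
large_count = sys.getrefcount(large_value)

print(small_count, large_count)
```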
To see how often different objects are referenced, we can plot the reference count for a range of input values.
    import sys
    import matplotlib.pyplot as plt
    #calculate the values for x and y axis
    x = range(500)
    y = [sys.getrefcount(i) for i in x]
    fig, ax = plt.subplots()
    plt.plot(x, y, '.')
    #set label for x axis
    ax.set_xlabel("number")
    #set label for y axis
    ax.set_ylabel("sys.getrefcount(number)")
    #plot the graph
    plt.show()
From the graph, it is clear that smaller numbers have higher reference counts. The first few values have a reference count of more than 3000, which means small numbers are widely used internally by Python.
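If matplotlib is not installed, the same pattern can be checked in plain text. This is a rough sketch that lists the five integers below 500 with the highest counts. On Python 3.12 and later, the cached small integers all report the same very large value, so the ranking is only informative on older versions.

```python
import sys

# Reference count for each integer from 0 to 499.
counts = {number: sys.getrefcount(number) for number in range(500)}

# The five integers with the highest reference counts.
top_five = sorted(counts, key=counts.get, reverse=True)[:5]

for number in top_five:
    print(number, counts[number])
```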
    import sys
    import matplotlib.pyplot as plt
    #string with all lowercase letters
    strLet = "abcdefghijklmnopqrstuvwxyz"
    refs = [sys.getrefcount(l) for l in strLet]
    y_pos = range(len(strLet))
    plt.bar(y_pos, refs, align='center')
    #label each bar with its letter
    plt.xticks(y_pos, list(strLet))
    #set label for x axis
    plt.xlabel("letter")
    #set label for y axis
    plt.ylabel('sys.getrefcount(letter)')
    #plot the graph
    plt.show()
Programmers are fond of the letter ‘x’ and use it for many variable names, and the graph confirms this: ‘x’ as an object is referenced more than 1100 times in Python.
Because Python is a case-sensitive language, you will get different reference counts (and so a different graph) for lowercase and uppercase letters.
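A quick way to compare the two cases without plotting is sketched below, using the letter constants from the standard string module. The variable names are made up for illustration. On Python 3.12 and later, single-character strings may be immortal, in which case every letter reports the same large value.

```python
import string
import sys

# Reference counts for each lowercase and uppercase letter.
lower_counts = {ch: sys.getrefcount(ch) for ch in string.ascii_lowercase}
upper_counts = {ch: sys.getrefcount(ch) for ch in string.ascii_uppercase}

# Print the counts for each letter pair side by side.
for low, up in zip(string.ascii_lowercase, string.ascii_uppercase):
    print(low, lower_counts[low], up, upper_counts[up])
```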
How can we exclude “Python” itself?
    import sys
    for w in ["python", "version", "error", "var", "reference"]:
        print(w, sys.getrefcount(w))
Output:
    python 6
    version 12
    error 47
    var 9
    reference 6
You can try some other keywords as well.
Understanding reference counts is very important for memory management. If you found this article useful, please share it with your friends.
I have tried to answer several daunting questions about Python's getrefcount() function, reference counts and memory management. If you have any doubts, feel free to write in the comment section.