File Handling in Python
Python File Handling: A Beginner's Guide to Manipulating Files with Ease
Before diving deep into this ocean of file-handling concepts, let's first understand some basic theory about the data used for I/O:
Text - '12345' stored as a sequence of Unicode characters (the representation of human-readable text in a digital format)
Binary - 12345 stored as a sequence of bytes of its binary equivalent (binary is a combination of 0s and 1s; each digit is called a bit, and a group of 8 bits is 1 byte)
How File I/O is done in most programming languages
1st Step: Open a file
2nd Step: Read/Write data
3rd Step: Close the file
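The three steps above can be sketched in Python like this (a minimal example; 'notes.txt' is just an illustrative filename):

```python
# Step 1: open the file ('w' creates it if it doesn't exist)
f = open('notes.txt', 'w')

# Step 2: write data to it
f.write('first line of data')

# Step 3: close the file to flush the buffer and free the resource
f.close()

# Reading follows the same three steps, just with mode 'r'
f = open('notes.txt', 'r')
content = f.read()
f.close()
print(content)
```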
Now let's see how to write to a file.
Case 1: the file does not exist yet
```
#creating a file handler object and opening the file
#by creating it ('w' creates the file if it doesn't exist)
f = open('sample.txt', 'w')
#'w' means we are in write mode
f.write('helloworld')
#we have written text to the file in write mode
f.close()
#calling the close function on the file handler object
#now the file is closed and we cannot write to it anymore
f.write('Hello')
#this raises ValueError: I/O operation on closed file

#writing multiline strings
#creating a file and writing to it in write mode
f = open('sample1.txt', 'w')
f.write('hello')
f.write('\nheyy')  # '\n' moves the cursor to the next line
f.close()
```
Now let's see the internal working of the open() function:

When you open a file with open(), the operating system and the file system work together to perform the various operations on the file. To improve performance, the OS uses a buffer, a region of RAM that temporarily holds the data. Buffering reduces the number of direct read and write operations on the disk, which is far more efficient, since the buffer resides in RAM and allows much faster access to the file's data. Data you write goes into the buffer first; when you close the file (or call flush()), the buffered data is written out to the file on disk.
[Visual diagram of buffering: program ↔ buffer in RAM ↔ file on disk]
However, the general idea is that file handling involves coordinating between RAM and secondary storage (such as a hard drive or SSD) to efficiently read from and write to files.
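You can see the buffer at work with flush(): data you write sits in the buffer first, and flush() (or close()) pushes it out so other readers can see it. A small sketch ('buffered.txt' is an example filename):

```python
f = open('buffered.txt', 'w')
f.write('hello')   # goes into the buffer in RAM first

f.flush()          # force the buffered data out to the file now

# reading through a second handle shows the flushed data
with open('buffered.txt', 'r') as g:
    print(g.read())

f.close()          # close() also flushes any remaining data
```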
The problem with 'w' mode: when we write text to an existing file in 'w' mode, the older content is erased. That's why we use append mode ('a') instead:
```
f = open('sample.txt', 'a')
f.write('i am fine')
f.close()
```
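A quick self-contained check that 'a' really preserves the old content (using the same sample.txt from the earlier examples):

```python
# 'w' first: start the file fresh
with open('sample.txt', 'w') as f:
    f.write('helloworld')

# 'a' appends instead of overwriting
with open('sample.txt', 'a') as f:
    f.write('i am fine')

with open('sample.txt', 'r') as f:
    print(f.read())   # both writes are still there
```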
Using a Context Manager (with)
It's a good idea to close a file after use, as this frees up its resources.
If we don't close it, the garbage collector will usually close it eventually, but relying on that is bad practice.
The with keyword closes the file automatically as soon as its block ends.
```
with open('sample.txt', 'w') as f:
    f.write('hello world')
```
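You can verify that with really closed the file by checking the file object's closed attribute after the block:

```python
with open('sample.txt', 'w') as f:
    f.write('hello world')

print(f.closed)   # True: the with block closed the file for us

# writing after the block raises ValueError, just like after close()
try:
    f.write('more')
except ValueError:
    print('cannot write to a closed file')
```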
Moving within a file -> reading 10 characters, then the next 10
```
with open('sample.txt', 'r') as f:
    print(f.read(10))
    print(f.read(10))
#the second call prints the *next* 10 characters because the file
#object keeps track of the current position in the file
#the benefit of this is that we can load a big file in chunks
```
```
#creating a file with large data
big_L = ['hello world' for i in range(1000)]
with open('big.txt', 'w') as f:
    f.writelines(big_L)

#reading the file in chunks
with open('big.txt', 'r') as f:
    chunk_size = 100
    while True:
        chunk = f.read(chunk_size)
        if len(chunk) == 0:
            break
        print(chunk, end=' ** ')
```
As you can see, the file is loaded in chunks.
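The position the file object keeps track of can also be inspected and moved directly, using tell() and seek() (not covered above, but closely related to moving within a file):

```python
with open('sample.txt', 'w') as f:
    f.write('hello world')

with open('sample.txt', 'r') as f:
    print(f.read(5))   # 'hello'
    print(f.tell())    # current position: 5 characters in
    f.seek(0)          # jump back to the beginning of the file
    print(f.read(5))   # 'hello' again
```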
Serialization and Deserialization
Serialization: the process of converting Python data types into JSON format (so they can be stored or transmitted)
Deserialization: the process of converting JSON back into Python data types
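The same round trip also works in memory with json.dumps() and json.loads(), which use strings instead of files, a quick sketch before we move to the file-based versions:

```python
import json

d = {'name': 'name', 'age': 39}

s = json.dumps(d)      # serialize: dict -> JSON string
print(s)               # {"name": "name", "age": 39}
print(type(s))         # <class 'str'>

back = json.loads(s)   # deserialize: JSON string -> dict
print(back == d)       # the round trip preserves the data
```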
```
#serialization using the json module
import json

#list
L = [1, 2, 3, 4]
with open('demo.json', 'w') as f:
    json.dump(L, f)

#dict
d = {
    'name': 'name',
    'age': 39,
    'gender': 'male'
}
with open('demo.json', 'w') as f:
    json.dump(d, f, indent=4)
```
```
#deserialization
import json

with open('demo.json', 'r') as f:
    d = json.load(f)
print(d)
print(type(d))
```
```
#JSON has no tuple type: when you dump a tuple it is stored as a
#JSON array, so it comes back as a list, not a tuple
t = (1, 2, 3)
with open('demo.json', 'w') as f:
    json.dump(t, f)
with open('demo.json', 'r') as f:
    d = json.load(f)
print(d)        # a list, not a tuple
print(type(d))
```
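If you do need a tuple back after deserialization, convert the loaded list yourself (shown here in memory with dumps()/loads() to keep the sketch self-contained):

```python
import json

t = (1, 2, 3)
s = json.dumps(t)                 # the tuple is serialized as a JSON array
restored = tuple(json.loads(s))   # loads() gives a list; convert it back
print(restored)
print(type(restored))
```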