File Handling in Python

Python File Handling: A Beginner's Guide to Manipulating Files with Ease

Before deep-diving into this ocean of file-handling concepts, let's understand some basic theory about how data is represented for I/O:

  • Text - '12345' as a sequence of Unicode characters (the representation of human-readable text in a digital format)

  • Binary - 12345 as a sequence of bytes of its binary equivalent (binary is a combination of 0s and 1s; each digit is called a bit, and a group of 8 bits is 1 byte) - see the sketch after this list
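
To make the distinction concrete, here is a minimal sketch (the UTF-8 encoding and the two-byte width are illustrative choices, not requirements):

```
# '12345' as text: five Unicode characters, encoded to bytes
text = '12345'
print(text.encode('utf-8'))   # b'12345' -> 5 bytes, one per character

# 12345 as binary: the integer's own two-byte representation
n = 12345
print(n.to_bytes(2, 'big'))   # b'09' -> 2 bytes (0x30, 0x39)
```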

How File I/O is done in most programming languages

1st Step: Open a file

2nd Step: Read/Write data

3rd Step: Close the file
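
A minimal sketch of these three steps (assuming a file named sample.txt already exists):

```
f = open('sample.txt')   # Step 1: open the file (default mode 'r' = read)
data = f.read()          # Step 2: read its contents into a string
f.close()                # Step 3: close it to release the file handle
```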

Now let's see how to write to a file.

CASE 1: the file is not present

```
# create a file handler object; opening in 'w' mode
# creates the file if it does not exist
f = open('sample.txt', 'w')
f.write('helloworld')   # write text to the file in write mode
f.close()               # close the file via the file handler object

# the file is closed now, so we can no longer write to it
f.write('Hello')        # raises ValueError: I/O operation on closed file

# writing multiline strings: open a new file in write mode
f = open('sample1.txt', 'w')
f.write('hello')
f.write('\nheyy')       # '\n' moves the write to the next line
f.close()
```

Now let's see the internal working of the open() function:

When you open a file with Python's open() function, the operating system and the file system work together to carry out the reads and writes. To improve performance, the OS uses a buffer to temporarily hold data: buffering reduces the number of direct read and write operations on the disk, which is much more efficient. The buffer resides in RAM, allowing faster access to the file's data, and reads are served from it rather than hitting the disk character by character. When you close the file, the buffer is flushed and any pending data is written out to the file on disk.
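
This surfaces in Python through the buffering parameter of open() and the flush() method; a small sketch (buffering=1 selects line buffering, which is only valid in text mode):

```
f = open('sample.txt', 'w', buffering=1)  # line-buffered text file
f.write('buffered line\n')   # held in the in-RAM buffer until flushed
f.flush()                    # force the buffer's contents out now
f.close()                    # close() also flushes before releasing
```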

[Diagram: the program exchanging data with an in-RAM buffer, which in turn reads from and writes to the file on disk]

However, the general idea is that file handling involves coordinating between RAM and secondary storage (such as a hard drive or SSD) to efficiently read from and write to files.

Problem with w mode: when we write text to an existing file in w mode, the older content is erased. That's why we use append mode ('a') instead.

```
f = open('sample.txt', 'a')  # 'a' = append mode: keep existing content
f.write('i am fine')         # new text is added at the end of the file
f.close()
```
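
A quick read-back to confirm the append (assuming sample.txt still held 'helloworld' from the earlier write):

```
with open('sample.txt', 'r') as f:
    print(f.read())   # helloworldi am fine -> the old content survived
```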

Using a Context Manager (with)

  • It's a good idea to close a file after use, as it frees up resources

  • If we don't close it, the garbage collector will eventually close it for us

  • The with keyword closes the file automatically as soon as its block ends, even if an error occurs inside

```
with open('sample.txt', 'w') as f:
    f.write('hello world')
```
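
Under the hood, with is roughly equivalent to this try/finally pattern (a sketch of the idea, not the exact implementation):

```
f = open('sample.txt', 'w')
try:
    f.write('hello world')
finally:
    f.close()   # runs even if write() raises, so the file always closes
```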

Moving within a file -> reading 10 characters, then the next 10

```
with open('sample.txt', 'r') as f:
    print(f.read(10))   # first 10 characters
    print(f.read(10))   # the NEXT 10 characters
# the second read continues where the first stopped, because the
# file object keeps track of the current position in the file
# the benefit: we can load a big file in chunks instead of all at once

# creating a file with large data
big_L = ['hello world' for i in range(1000)]
with open('big.txt', 'w') as f:
    f.writelines(big_L)

# reading the file in chunks
with open('big.txt', 'r') as f:
    chunk_size = 100
    while True:
        chunk = f.read(chunk_size)
        if not chunk:    # an empty string means end of file
            break
        print(chunk, end=' ** ')
```
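
You can also move the position explicitly with tell() and seek(); a brief sketch (for plain ASCII text the position matches the character count):

```
with open('sample.txt', 'r') as f:
    print(f.read(10))   # first 10 characters
    print(f.tell())     # current position in the file: 10
    f.seek(0)           # jump back to the start of the file
    print(f.read(10))   # the same 10 characters again
```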

As you can see, the file is loaded in chunks.

Serialization and Deserialization

  • Serialization: the process of converting Python data types to JSON format

  • Deserialization: the process of converting JSON back to Python data types

```
# serialization using the json module
import json

# list
L = [1, 2, 3, 4]
with open('demo.json', 'w') as f:
    json.dump(L, f)

# dict (note: opening in 'w' mode overwrites demo.json)
d = {
    'name': 'name',
    'age': 39,
    'gender': 'male'
}
with open('demo.json', 'w') as f:
    json.dump(d, f, indent=4)   # indent=4 pretty-prints the output

# deserialization
with open('demo.json', 'r') as f:
    d = json.load(f)
    print(d)         # {'name': 'name', 'age': 39, 'gender': 'male'}
    print(type(d))   # <class 'dict'>
```

A tuple has no JSON equivalent: when you dump one, it is stored as a JSON array, so loading it back gives a list, not a tuple.

```
t = (1, 2, 3)
with open('demo.json', 'w') as f:
    json.dump(t, f)      # the tuple is written as [1, 2, 3]

with open('demo.json', 'r') as f:
    d = json.load(f)
    print(d)             # [1, 2, 3]
    print(type(d))       # <class 'list'> -- not a tuple
```
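
The json module can also serialize to and from strings directly with dumps()/loads(), which is handy when no file is involved; a quick sketch:

```
import json

s = json.dumps({'a': 1, 'b': 2})   # serialize a dict to a JSON string
print(s)                           # {"a": 1, "b": 2}
d = json.loads(s)                  # deserialize the string back to a dict
print(d['a'])                      # 1
```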