Overview

Teaching: 10 min
Exercises: 5 min
Questions
  • How can I read data from a file?

  • How can I write data to a file?

Objectives
  • Use open, read, and readline to read data from a file.

  • Use a file in a for loop.

  • Use write to save data to a file.

  • Use basic string operations to process text data.

Use open to open files for reading or writing.

reader = open('myfile.txt', 'r')
data = reader.read()
reader.close()
print('file contains', len(data), 'characters')
file contains 47189 characters
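
The same open call handles writing when the mode is 'w' instead of 'r'. The following is a minimal sketch rather than part of the original lesson: the file name example.txt and the temporary directory are hypothetical, chosen so that no real file is overwritten. It also shows readline, which reads a single line at a time.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (an assumption for this sketch).
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, 'example.txt')

# Mode 'w' opens the file for writing, creating it or erasing what was there.
writer = open(path, 'w')
writer.write('first line\n')   # write does not add a newline for us
writer.write('second line\n')
writer.close()

# Mode 'r' opens the same file for reading.
reader = open(path, 'r')
first = reader.readline()      # reads up to and including the first newline
rest = reader.read()           # reads everything that is left
reader.close()

print('first:', repr(first))
print('rest:', repr(rest))
```

Printing with repr makes the newline characters visible, so it is easy to see where readline stopped and where read picked up.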

Text files are usually read one line at a time with a for loop.

reader = open('myfile.txt', 'r')
count = 0
for line in reader:
    count = count + 1
reader.close()
print('file contains', count, 'lines')
file contains 261 lines

Python preserves end-of-line newlines.
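
A small sketch can make the preserved newlines visible. The file name two_lines.txt and its contents are made up for this illustration, and the file is written to a temporary directory first so the example is self-contained.

```python
import os
import tempfile

# Hypothetical scratch file (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), 'two_lines.txt')

# Create a file with two lines of text separated by a blank line.
writer = open(path, 'w')
writer.write('alpha\n\nbeta\n')
writer.close()

# Collect each line exactly as the for loop delivers it.
reader = open(path, 'r')
lines = []
for line in reader:
    lines.append(line)
reader.close()

# Printing the list shows the trailing '\n' on every line,
# including the blank line, which is just '\n'.
print(lines)
```

Because each line keeps its newline, a blank line has length 1 rather than 0, which is why whitespace needs to be stripped before testing whether a line is empty.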

Strip whitespace using string methods.

reader = open('myfile.txt', 'r')
count = 0
for line in reader:
    line = line.strip()
    if len(line) > 0:
        count = count + 1
reader.close()
print('file contains', count, 'non-blank lines')
file contains 225 non-blank lines

Using with to Guarantee a File is Closed

It is good practice to close a file after you have opened it. You can use the with keyword in Python to ensure this:

with open('myfile.txt', 'r') as reader:
    data = reader.read()
print('file contains', len(data), 'characters')

The with statement has two parts: the expression to execute (such as opening a file) and the variable that stores its result (in this case, reader). At the end of the with block, the file that was assigned to reader is closed automatically, even if an error occurs inside the block. The with statement can be used with other kinds of objects to achieve similar effects.
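
The sketch below, which is an illustration rather than part of the original lesson, checks that files really are closed once their with blocks end. It also uses with on a different kind of object: tempfile.TemporaryDirectory from the standard library, which removes the directory when its block ends. The file name notes.txt and its contents are hypothetical.

```python
import os
import tempfile

# A temporary directory is another object that works with 'with':
# it is deleted, along with its contents, when the block ends.
with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, 'notes.txt')  # hypothetical file name

    # Write two short lines; the file is closed when this block ends.
    with open(path, 'w') as writer:
        writer.write('one\ntwo\n')

    # Read the text back; again the file is closed automatically.
    with open(path, 'r') as reader:
        data = reader.read()

    # Every file object has a 'closed' attribute we can inspect.
    print('writer closed?', writer.closed)
    print('reader closed?', reader.closed)
    print('file contains', len(data), 'characters')

# The directory no longer exists once its with block has finished.
print('directory still exists?', os.path.exists(scratch_dir))
```

Nesting the file blocks inside the directory block means cleanup happens from the inside out: first each file is closed, then the directory is removed.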

Squeezing a File

  1. Write a small program that reads lines of text from a file called input.dat and writes those lines to a file called output.dat.

  2. Modify your program so that it only copies non-blank lines.

  3. Modify your program again so that a line is not copied if it is a duplicate of the line above it.

  4. Compare your implementation to your neighbor’s. Did you interpret the third requirement (copying non-duplicated lines) the same way?

Key Points