Ruby Notes - Working With Files

My next “Ruby Notes” post was going to be on arrays, but in my last couple of mini projects (the Text Munger problem on RubyQuiz and a command-line image editor) I had to work a lot with files and directories and realized I didn’t have a great handle on them. To remedy this I did what I always do: a bunch of reading and practice problems, then put it all down in my notebook. There is still a lot for me to learn, but I think this lays a good foundation for understanding and working with files in Ruby.

IO Class

The IO class is the parent class of the File class, and thus is where File gets a ton of its methods, such as readlines and readline. IO stands for input/output, specifically input/output streams: sequences of data that let you do things like play sound on your speakers or print output to a screen. The IO class allows you to initialize streams and do things with them.

Standard Output, Input, and Error

STDOUT, STDIN, and STDERR are Ruby constants that hold IO objects for your program’s output, input, and error streams. You can access these streams through the terminal without opening any files.

When you call puts, output is sent to the IO object that STDOUT points to. Conversely, when you call gets, input is read from the IO object that STDIN points to.
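
To make the relationship between these constants and the IO class concrete, here is a minimal sketch. The variable names (file_parent, std_streams_are_io) are made up for illustration, and STDIN is deliberately left unread so the script runs without waiting for input.

```ruby
# File inherits from IO, which is where methods like readline come from.
file_parent = File.superclass

# All three standard stream constants are IO objects.
std_streams_are_io = [STDIN, STDOUT, STDERR].all? { |stream| stream.is_a?(IO) }

# A bare puts writes to standard output (by default the same stream as STDOUT).
puts "regular output"
STDOUT.puts "also regular output"

# Error messages conventionally go to standard error instead.
STDERR.puts "something went wrong"
```

Running this in a terminal shows all three lines, but redirecting standard output to a file would leave only the STDERR line on screen.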

Further Reading: https://rubymonk.com/learning/books/1-ruby-primer/chapters/42-introduction-to-i-o/lessons/89-streams

File Class

According to the Ruby docs, a File is an abstraction of any file object accessible by the program and is closely associated with the class IO (it’s a subclass of IO).

You use the File class to create files, read them, and write to them. When you open a file you can give it a mode that tells Ruby how the file may be used: read only, write only, both, and so on. These modes are inherited from the IO class and are listed below.

Modes

Mode Meaning
“r” Read-only, starts at beginning of file (default mode).
“r+” Read-write, starts at beginning of file.
“w” Write-only, truncates existing file to zero length or creates a new file for writing.
“w+” Read-write, truncates existing file to zero length or creates a new file for reading and writing.
“a” Write-only, starts at end of file if file exists, otherwise creates a new file for writing.
“a+” Read-write, starts at end of file if file exists, otherwise creates a new file for reading and writing.
“b” Binary file mode (may appear with any of the key letters listed above). Suppresses EOL <-> CRLF conversion on Windows and sets the external encoding to ASCII-8BIT unless explicitly specified.
“t” Text file mode (may appear with any of the key letters listed above except “b”).
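
The difference between the modes is easiest to see side by side. This is a small sketch that works in a temporary directory so it does not touch real files; the file name modes_demo.txt and the mode_results hash are hypothetical names chosen for the example.

```ruby
require "tmpdir"

mode_results = {}

Dir.mktmpdir do |dir|
  path = File.join(dir, "modes_demo.txt") # hypothetical file name

  # "w" creates the file because it does not exist yet.
  File.open(path, "w") { |f| f.puts "first" }

  # "a" keeps the existing content and writes at the end.
  File.open(path, "a") { |f| f.puts "second" }
  mode_results[:after_append] = File.read(path)

  # "w" on an existing file truncates it before writing.
  File.open(path, "w") { |f| f.puts "third" }
  mode_results[:after_truncate] = File.read(path)

  # "r+" starts at the beginning and overwrites in place without truncating.
  File.open(path, "r+") { |f| f.write "T" }
  mode_results[:after_r_plus] = File.read(path)
end

mode_results.each { |step, content| puts "#{step}: #{content.inspect}" }
```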


Writing to a File

Writing to a File
file = File.open("text.txt", "w")
file.puts "Hello from #{$0}"
file.close

#=> writes "Hello from io.rb" to the file text.txt

On the first line I’m calling the .open method on the File class and passing it the file name text.txt and the mode I want the file opened in, “w”. Next I’m using the .puts method to write to the file, passing it the text I want written. Note that if we didn’t have a file text.txt in our directory, this script would have created it.

Using Block Notation

Writing to a File with Block Notation
File.open("text.txt", "w") { |file| file.puts "Hola from #{$0}" }

Note that when you pass a block to File.open you don’t have to close the file yourself: when the block exits, the file is closed for you.
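
One way to confirm the automatic close is to keep a reference to the file object and ask it afterwards. The sketch below also shows a roughly equivalent manual version using ensure; block_handle and manual_handle are illustrative names, and everything happens in a temporary directory.

```ruby
require "tmpdir"

block_handle = nil
manual_handle = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "text.txt")

  # Block form: File.open closes the file when the block exits.
  File.open(path, "w") do |file|
    block_handle = file
    file.puts "Hola"
  end

  # Manual form: closing in an ensure clause runs even if the write raises.
  manual_handle = File.open(path, "a")
  begin
    manual_handle.puts "Adios"
  ensure
    manual_handle.close
  end
end

puts block_handle.closed?
puts manual_handle.closed?
```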

Reading a File

Reading from a File
file = File.open("lib/I_have_a_dream.txt")
contents = file.read
puts contents
file.close

This is pretty simple. We open the file we want to read with the .open method and store it in the file variable. Then we call the .read method on file, store the result in contents, and puts the contents. .read starts reading from the place where the last read operation stopped. Here we’ve read the entire file, so if we called file.read again below puts contents there would be nothing left to read (we’d get back an empty string) because we’re already at the end of the file.
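
The read position can be seen directly with a throwaway file. This sketch uses Tempfile.create from the standard library rather than the speech file above, and the variable names are made up for the example; rewind is the IO method that moves the position back to the start.

```ruby
require "tempfile"

first_read = second_read = third_read = nil

Tempfile.create("read_demo") do |file|
  file.write("line one\nline two\n")
  file.rewind                 # go back to the start before reading

  first_read = file.read      # reads everything up to the end of the file
  second_read = file.read     # already at the end, so nothing is left

  file.rewind                 # move the position back to the beginning
  third_read = file.read      # the whole content is readable again
end

puts first_read.inspect
puts second_read.inspect
puts third_read.inspect
```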

Reading a File Block Notation

Reading from a File Block Notation
contents = File.open("lib/I_have_a_dream.txt", "r") { |file| file.read }
puts contents

Closing Files

If you open a file, make sure you close it. The exception is when you pass File.open a block, in which case the file is closed for you when the block ends.

The reason you need to close files is that closing forces a “flush”, which pushes any buffered data-to-be-written out to where you want it to be. This frees up resources for the rest of your program and ensures the file is available for other processes to access.
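
Here is a rough sketch of flushing in action. It checks the size of the file on disk after an explicit flush and again after close. The file name buffered.txt and the variable names are hypothetical, and exactly when Ruby writes buffered data on its own is an implementation detail this example does not rely on.

```ruby
require "tmpdir"

size_after_flush = nil
size_after_close = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "buffered.txt") # hypothetical file name

  file = File.open(path, "w")
  file.write("hello")
  # At this point the text may still be sitting in a write buffer.

  file.flush                            # push buffered data out to the file
  size_after_flush = File.size(path)

  file.write(" world")
  file.close                            # close flushes whatever is left
  size_after_close = File.size(path)
end

puts size_after_flush
puts size_after_close
```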

Further Reading: http://ruby.bastardsbook.com/chapters/io/

More File Methods

We’ve already seen some file methods like .open and .close, but here are some more useful ones. Check out the Ruby docs for File and IO for the rest of them.

.readlines & .readline

These two methods are handy when you want to work with a file one line at a time, for instance when you are reading a comma-delimited file.

.readlines

- reads all the content of the file and returns an array with each line as an element. From here you can iterate over the lines using each.

Using Readlines
# read_file holds the path of the file to read
File.readlines(read_file).each do |line|
  puts line
end

.readline

- is a bit different in that it only reads one line at a time, so you need to keep advancing through the file yourself, which can be done with a while or until loop.

Using Readline
file = File.open("lib/blood_sweat_tears.txt")
until file.eof?
   line = file.readline
   puts line
end
file.close

The reason you would want .readline over .readlines is that .readlines loads the entire contents of the file into memory. For a small script working with small files this isn’t a problem, but if you are working with large files and/or have multiple users it can use a lot of memory.
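
If you want the line-at-a-time behaviour without writing the until loop yourself, File.foreach yields one line per iteration and closes the file when it finishes. This is only a sketch: the sample text, line_count, and longest_line are invented for the example, and a temporary file stands in for a real one.

```ruby
require "tempfile"

line_count = 0
longest_line = ""

Tempfile.create("speech") do |file|
  file.write("Four score\nand seven years ago\nour fathers\n")
  file.flush # make sure the text is on disk before reading it by path

  # Only the current line is held in memory on each pass.
  File.foreach(file.path) do |line|
    line_count += 1
    text = line.chomp
    longest_line = text if text.length > longest_line.length
  end
end

puts line_count
puts longest_line
```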

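.exist?

- checks for the existence of the file. Older versions of Ruby also accepted .exists?, but that alias was deprecated and then removed in Ruby 3.2.

if File.exist?(file_name)
  # do something
end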

.absolute_path

- returns the absolute path for the file name passed to it.

puts File.absolute_path("lib/blood_sweat_tears.txt")
#=> "/Users/kyledoherty/Dropbox/Ruby/learn_to_program/working_w_files/lib/blood_sweat_tears.txt"

.basename

- gives you just the filename.

puts File.basename("/Users/kyledoherty/Dropbox/Ruby/learn_to_program/working_w_files/lib/blood_sweat_tears.txt")
#=> "blood_sweat_tears.txt"
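
File has a few sibling methods for pulling a path apart and putting it back together. The sketch below works purely on strings, so the path does not need to exist; the path itself is a hypothetical one made up for the example.

```ruby
# Hypothetical path used only for illustration; nothing is read from disk.
path = "/tmp/working_w_files/lib/blood_sweat_tears.txt"

name      = File.basename(path)          # file name with extension
bare_name = File.basename(path, ".txt")  # file name with the suffix stripped
extension = File.extname(path)           # just the extension, dot included
directory = File.dirname(path)           # everything before the file name
rebuilt   = File.join(directory, name)   # joins the parts with "/"

puts name, bare_name, extension, directory, rebuilt
```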

.directory?

- returns true if the path passed to it is a directory.

Dir.open(Dir.pwd).each do |filename|
  next if File.directory? filename
end
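
One thing to watch for: File.directory? resolves a bare name against the current working directory, so when listing some other directory you need to join the name onto that directory first. A minimal sketch, assuming Ruby 2.5 or later for Dir.children (which leaves out "." and ".."); the directory layout and the files_only variable are made up for the example.

```ruby
require "tmpdir"
require "fileutils"

files_only = nil

Dir.mktmpdir do |dir|
  # Build a small hypothetical layout: one subdirectory and two files.
  FileUtils.mkdir(File.join(dir, "lib"))
  FileUtils.touch(File.join(dir, "README"))
  FileUtils.touch(File.join(dir, "notes.txt"))

  # Join each name onto dir so the check does not depend on Dir.pwd.
  files_only = Dir.children(dir).reject do |name|
    File.directory?(File.join(dir, name))
  end.sort
end

puts files_only.inspect
```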

Dir Class

The Dir class allows you to work with directories, as you’d expect. Many of its methods share names with the commands you use in the console.

Some Dir Methods

.pwd

- tells you what directory you’re in.

puts Dir.pwd
#=> "/Users/kyledoherty/Dropbox/Ruby/image_edit"  

.chdir

- allows you to change to a new directory.

Dir.chdir("/Users/kyledoherty/Dropbox/Ruby/rubyquiz")
#=> 0
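
Dir.chdir also accepts a block, in which case the working directory is changed only for the duration of the block and then restored. A small sketch using a temporary directory; the variable names are illustrative, and File.realpath is used so that symlinked temp locations compare equal.

```ruby
require "tmpdir"

before = Dir.pwd
inside = nil
expected_inside = nil

Dir.mktmpdir do |dir|
  expected_inside = File.realpath(dir)

  Dir.chdir(dir) do
    # Inside the block the working directory is the temporary directory.
    inside = File.realpath(Dir.pwd)
  end
end

# After the block the original working directory is back.
after_block = Dir.pwd

puts before == after_block
```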

.mkdir

- makes a new directory with the name it is passed.

Dir.mkdir("stuff")
#=> 0

.rmdir

- removes an empty directory but raises an error if the directory contains files. To remove a directory with files you can use the FileUtils module.

Dir.rmdir("stuff")
#=> 0
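
To see the failure case, the sketch below puts a file inside stuff before calling .rmdir. The exact error class depends on the operating system, so the example rescues SystemCallError (the parent of the Errno classes) rather than naming one; rmdir_failed and removed are hypothetical variable names.

```ruby
require "tmpdir"
require "fileutils"

rmdir_failed = false
removed = nil

Dir.mktmpdir do |dir|
  stuff = File.join(dir, "stuff")
  Dir.mkdir(stuff)
  FileUtils.touch(File.join(stuff, "note.txt")) # make the directory non-empty

  begin
    Dir.rmdir(stuff)
  rescue SystemCallError => e
    # Dir.rmdir refuses to delete a directory that still has entries.
    rmdir_failed = true
    puts "rmdir refused: #{e.class}"
  end

  # FileUtils.rm_rf removes the directory together with its contents.
  FileUtils.rm_rf(stuff)
  removed = !Dir.exist?(stuff)
end
```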

Accessing Directory Content

There are two ways to grab content from directories, using .entries and .glob.

.entries

- returns an array with every single entry inside the directory, including “.”, “..”, and hidden files.

Dir.entries("../rubyquiz")
#=> [".", "..", ".DS_Store", ".git", "README", "text_munger_76"]

.glob

- is passed a pattern such as *.txt and returns an array of the matching entries. By default it skips hidden files (names starting with a dot).

Dir.glob("*")
#=> ["README", "text_munger_76"]

Gives us the visible files and directories in the current directory.

Dir.glob("**/*.txt")
#=> ["text_munger_76/lib/blood_sweat_tears.txt", "text_munger_76/lib/gettysburg_address.txt", "text_munger_76/lib/I_have_a_dream.txt", "text_munger_76/lib/pearl_harbor_address.txt", "text_munger_76/lib/strength_and_decency.txt"]

Here we use **/*.txt to recursively search the current directory and all its subdirectories for any .txt files: ** matches directories recursively and *.txt is the file name pattern.
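
The two patterns can be tried safely against a scratch directory. This sketch builds a small, made-up layout (README, a hidden file, notes.txt, and lib/speech.txt) and globs it; the results are sorted because glob does not promise a particular order on every Ruby version.

```ruby
require "tmpdir"
require "fileutils"

top_level = nil
all_txt = nil

Dir.mktmpdir do |dir|
  # Hypothetical layout created just for this example.
  FileUtils.mkdir_p(File.join(dir, "lib"))
  FileUtils.touch(File.join(dir, "README"))
  FileUtils.touch(File.join(dir, ".hidden"))
  FileUtils.touch(File.join(dir, "notes.txt"))
  FileUtils.touch(File.join(dir, "lib", "speech.txt"))

  Dir.chdir(dir) do
    top_level = Dir.glob("*").sort        # visible entries, one level deep
    all_txt   = Dir.glob("**/*.txt").sort # .txt files at any depth
  end
end

puts top_level.inspect
puts all_txt.inspect
```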

FileUtils Module

I’m not going to go into FileUtils too much but it allows more control over files and mimics a lot of the command line commands and flags you can use such as rm -rf for removing directories that contain files.

Some Methods

.mkdir

- makes a directory

.touch

- makes an empty file (or updates the timestamps of a file that already exists)

.rm_rf

- removes a directory whether it contains other files and directories or not

if File.exist?(file_name)
  FileUtils.rm_rf(file_name)
end

Note: you need to load FileUtils in your files with require 'fileutils'
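
Putting a few of these together, here is a minimal sketch that builds and then removes a nested directory tree inside a temporary directory. It uses FileUtils.mkdir_p (the mkdir -p equivalent, which creates any missing parent directories) in place of plain .mkdir; the project layout and the steps hash are invented for the example.

```ruby
require "fileutils"
require "tmpdir"

steps = {}

Dir.mktmpdir do |dir|
  project = File.join(dir, "project")          # hypothetical project folder
  nested  = File.join(project, "lib", "data")

  FileUtils.mkdir_p(nested)                    # creates project/lib/data in one call
  steps[:made_dirs] = File.directory?(nested)

  file = File.join(nested, "empty.txt")
  FileUtils.touch(file)                        # creates an empty file
  steps[:made_file] = File.exist?(file) && File.size(file).zero?

  FileUtils.rm_rf(project)                     # removes the whole tree
  steps[:removed] = !File.exist?(project)
end

puts steps.inspect
```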

Copyright © 2015 - Kyle Doherty. Powered by Octopress