10. Input and Output Streams

10.1 Input and output

We need to get data into our program to do interesting computations and we need to get the results out again.

In C++, I/O is done through streams: sequences of bytes handled by the input/output library.

An input operation moves bytes from a device (such as a keyboard, a disk drive, or a network connection) to main memory. An output operation moves bytes from main memory to a device (such as a display screen, a printer, a disk drive, or a network connection).

In this chapter, we’ll learn how to handle stream I/O using the C++ standard library.

10.2 The I/O stream model

istream:

  • The type istream deals with streams of input.
  • It turns character sequences into values of various types.
  • It gets those characters from somewhere (such as a console, a file, the main memory, or another computer).
  • Input is buffered, and the buffering is quite visible when you read from the keyboard: what you type is left in the buffer until you hit Enter (return/newline), and you can use the erase (Backspace) key “to change your mind” (until you hit Enter).

Graphical representation of istream:



ostream:

  • The type ostream deals with streams of output.
  • It turns values of various types into character sequences.
  • It sends those characters “somewhere” (such as to a console, a file, the main memory, or another computer).

Graphical representation of ostream:



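Both kinds of stream use a buffer. Buffering matters most when dealing with large amounts of data, because transferring characters in larger chunks is usually much cheaper than transferring them one at a time.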
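The two descriptions above are the same conversion seen from opposite directions. As a minimal sketch, string streams let us try both directions without any device; the helper names to_text and from_text are made up for this illustration and are not part of the library:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// ostream direction: turn values of various types into a character sequence
std::string to_text(int hour, double temperature)
{
    assert(0 <= hour && hour <= 23);    // assumed precondition for this sketch
    std::ostringstream os;
    os << hour << ' ' << temperature;
    return os.str();
}

// istream direction: turn a character sequence back into typed values;
// returns false if the characters don't form an int followed by a double
bool from_text(const std::string& text, int& hour, double& temperature)
{
    std::istringstream is{text};
    return static_cast<bool>(is >> hour >> temperature);
}
```

Because istringstream is an istream and ostringstream is an ostream, the same << and >> operations work unchanged on a console or a file.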

10.3 Files

At the most basic level, a file is simply a sequence of bytes numbered from 0 upward. A file has a format; that is, it has a set of rules that determine what the bytes mean (e.g., character vs. binary representation).


To read a file, we must:

  • Know its name
  • Open it (for reading)
  • Read in the characters
  • Close it


To write a file, we must:

  • Name it
  • Open it (for writing) or create a new file of that name
  • Write out our objects
  • Close it (though that is typically done implicitly)
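The two checklists above can be sketched as a pair of functions. This is only an illustration under stated assumptions: write_numbers and read_numbers are hypothetical names, and the file holds whitespace-separated integers.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// write: name the file, open (or create) it for writing, write the values;
// the ofstream closes the file implicitly when it goes out of scope
bool write_numbers(const std::string& name, const std::vector<int>& values)
{
    assert(!name.empty());
    std::ofstream ost{name};
    if (!ost) return false;             // could not open or create the file
    for (int v : values) ost << v << '\n';
    return static_cast<bool>(ost);
}

// read: know the name, open the file for reading, read the characters as ints;
// the ifstream closes the file implicitly when it goes out of scope
bool read_numbers(const std::string& name, std::vector<int>& values)
{
    assert(!name.empty());
    std::ifstream ist{name};
    if (!ist) return false;             // could not open the file
    for (int v; ist >> v; ) values.push_back(v);
    return ist.eof();                   // true only if we stopped at end of file
}
```

Neither function calls close(): leaving the scope does it, which is why the last step in each list is usually implicit.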

10.4 Opening a file

  • An ifstream is an istream for reading from a file.
  • An ofstream is an ostream for writing to a file.
  • An fstream is an iostream that can be used for both reading and writing.


Before a file stream can be used it must be attached to a file:
Example:

                         
      cout << "Please enter input file name: ";
      string iname;
      cin >> iname;
      ifstream ist {iname}; // ist is an input stream for the file named iname
      if (!ist) error("can't open input file ",iname);
                        
                      

Defining an ifstream with a name string opens the file of that name for reading. The test of !ist checks if the file was properly opened. After that, we can read from the file exactly as we would from any other istream. Output to files can be handled in a similar way by ofstreams.


Explicit open() and close() operations can also be performed. However, relying on scope minimizes the chances of someone trying to use a file stream before it has been attached to a file or after it was closed:
Example:

                        
                          ifstream ifs;
                          // . . .
                          ifs >> foo; // won’t succeed: no file opened for ifs
                          // . . .
                          ifs.open(name,ios_base::in); // open file named name for reading
                          // . . .
                          ifs.close(); // close file
                          // . . .
                          ifs >> bar; // won’t succeed: ifs’ file was closed
                          // . . .
                        
                      


Also, you can’t open a file stream a second time without first closing it:
Example:

                        
                        fstream fs;
                        fs.open("foo", ios_base::in) ; // open for input
                        // close() missing
                        fs.open("foo", ios_base::out); // won’t succeed: fs is already open
                        if (!fs) error("impossible");
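If a stream really has to be attached to a different file or mode, the missing close() must come first. A minimal sketch, assuming a hypothetical helper named reopen (not a library function):

```cpp
#include <cassert>
#include <fstream>
#include <string>

// hypothetical helper: attach fs to the file called name in the given mode,
// closing whatever file fs was attached to before
bool reopen(std::fstream& fs, const std::string& name, std::ios_base::openmode mode)
{
    assert(!name.empty());
    if (fs.is_open()) fs.close();   // an open stream must be closed first
    fs.clear();                     // forget any failure left by an earlier open()
    fs.open(name, mode);
    return fs.is_open();
}
```

In most code it is simpler to let one stream go out of scope and define a new one, as recommended above.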
                        
                      

10.5 Reading and writing a file

Consider these temperature readings from a weather station:
0 60.7
1 60.6
2 60.3
3 59.22
. . .

The hours are numbered 0 to 23 and the temperatures are in Fahrenheit. No further formatting is assumed.


We can represent a temperature reading by a Reading type:

                        
  struct Reading { // a temperature reading
      int hour; // hour after midnight [0:23]
      double temperature; // in Fahrenheit
  };
                        
                      


Given that, we could read like this:

                        
  vector<Reading> temps; // store the readings here
  int hour;
  double temperature;
  while (ist >> hour >> temperature) {
        if (hour < 0 || 23 < hour) error("hour out of range");
        temps.push_back(Reading{hour,temperature});
  }
                        
                      


For writing, we can use an output file stream (ofstream). For example, we might want to output the readings with each pair of values in parentheses:

                        
for (int i=0; i<temps.size(); ++i)
      ost << '(' << temps[i].hour << ',' << temps[i].temperature << ")\n";
                        
                      


Because the file streams automatically close their files when they go out of scope, the complete program becomes:

                        
#include "std_lib_facilities.h"

struct Reading { // a temperature reading
    int hour; // hour after midnight [0:23]
    double temperature; // in Fahrenheit
};

int main()
{
    cout << "Please enter input file name: ";
    string iname;
    cin >> iname;
    ifstream ist {iname}; // ist reads from the file named iname
    if (!ist) error("can't open input file ",iname);

    string oname;
    cout << "Please enter name of output file: ";
    cin >> oname;
    ofstream ost {oname}; // ost writes to a file named oname
    if (!ost) error("can't open output file ",oname);

    vector<Reading> temps; // store the readings here
    int hour;
    double temperature;
    while (ist >> hour >> temperature) {
        if (hour < 0 || 23 < hour) error("hour out of range");
        temps.push_back(Reading{hour,temperature});
    }
    for (int i=0; i<temps.size(); ++i)
        ost << '(' << temps[i].hour << ','
              << temps[i].temperature << ")\n";
}
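The reading and writing loops depend only on istream and ostream, not on files. As a sketch of that idea, the version below takes the streams as parameters so it can be run on strings. It uses only the standard library: std::runtime_error stands in for error() from std_lib_facilities.h, and the function names are invented for this illustration.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Reading {            // a temperature reading
    int hour;               // hour after midnight [0:23]
    double temperature;     // in Fahrenheit
};

// read (hour, temperature) pairs from any istream until a read fails
std::vector<Reading> read_readings(std::istream& ist)
{
    std::vector<Reading> temps;
    int hour;
    double temperature;
    while (ist >> hour >> temperature) {
        if (hour < 0 || 23 < hour) throw std::runtime_error("hour out of range");
        temps.push_back(Reading{hour, temperature});
    }
    return temps;
}

// write each reading as (hour,temperature) on its own line to any ostream
void write_readings(std::ostream& ost, const std::vector<Reading>& temps)
{
    for (const Reading& r : temps)
        ost << '(' << r.hour << ',' << r.temperature << ")\n";
}

// usage: run the whole conversion on a string instead of on files
std::string convert(const std::string& text)
{
    std::istringstream ist{text};
    std::ostringstream ost;
    write_readings(ost, read_readings(ist));
    return ost.str();
}
```

Passing an ifstream and an ofstream to the same two functions gives the file-based program shown above.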
                        
                      

Drill

We want to convert a delimited file. Let's assume the delimiter in the existing file is a comma, and we want to convert it to a tab. Write a small program that accepts "file.comma" as input, and writes to "file.tab" as output.

10.6 I/O error handling

The possibilities for input errors are limitless! However, an istream reduces them all to four possible cases, called the stream state:

Stream states
good()  The operation succeeded
eof()   We hit end of input ("end of file")
fail()  Something unexpected happened (e.g., we looked for a digit and found 'x')
bad()   Something unexpected and serious happened (e.g., a disk read error)
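To see the states change, here is a small sketch that reads from a string. The function names are hypothetical. Note that eof() and fail() can both be true once a read runs out of input, so the order of the tests in state_name is a choice made for this sketch.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// name the stream state using the four names from the table;
// checks bad() first, then eof(), then fail()
std::string state_name(const std::istream& is)
{
    if (is.bad())  return "bad";
    if (is.eof())  return "eof";
    if (is.fail()) return "fail";
    return "good";
}

// usage: the states seen while reading from the characters "7 x"
std::string trace_states()
{
    std::istringstream is{"7 x"};
    std::string trace;
    int n = 0;
    is >> n;                                    // reads 7
    assert(n == 7);
    trace += state_name(is);
    is >> n;                                    // 'x' is not a digit
    trace += ' ' + state_name(is);
    is.clear();                                 // back to good() so we can go on
    char c = 0;
    is >> c;                                    // reads 'x'
    trace += ' ' + state_name(is);
    is >> c;                                    // nothing left to read
    trace += ' ' + state_name(is);
    return trace;
}
```

Calling trace_states() yields "good fail good eof".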


Example: Consider how to read a sequence of integers that may be terminated by the character * or an “end of file” into a vector.
1 2 3 4 5 *
This could be done using a function like this:

                        
  void fill_vector(istream& ist, vector<int>& v, char terminator)
  // read integers from ist into v until we reach eof() or terminator
  {
      for (int i; ist >> i; )
          v.push_back(i);
      if (ist.eof())
          return; // fine: we found the end of file

      // not good() and not bad() and not eof(), ist must be fail()
      ist.clear(); // clear stream state

      char c;
      ist>>c; // read a character, hopefully terminator

      if (c != terminator) { // ouch: not the terminator, so we must fail
          ist.unget(); // maybe my caller can use that character
          ist.clear(ios_base::failbit); // set the state to fail()
      }
  }
                        
                      

Since we cleared the state to be able to examine the character, we have to set the stream state back to fail(). We do that with ist.clear(ios_base::failbit).
The ios_base that appears here and there is the part of an iostream that holds constants such as badbit and exceptions such as failure.

10.7 Reading a single value

Let’s say we want to read a single value from the user; that is, we want to solve the simple problem of “how to get an acceptable value from the user.”
Here we deal with three different conditions/errors:

  • The user typing an out-of-range value
  • Getting no value (end of file)
  • The user typing something of the wrong type (here, not an integer)


For each of these errors, we have three alternatives:

  • Handle the problem in the code doing the read.
  • Throw an exception to let someone else handle the problem (potentially terminating the program).
  • Ignore the problem.

10.7.1 Breaking the problem into manageable parts

    
      cout << "Please enter an integer in the range 1 to 10 (inclusive):\n";
      int n = 0;
      while (true)
      {
          cin >> n;
          if (cin)
          { // we got an integer; now check it
              if (1<=n && n<=10)
                  break;
              cout << "Sorry "
                    << n << " is not in the [1:10] range; please try again\n";
          }
          else if (cin.fail())
               { // we found something that wasn’t an integer
                  cin.clear(); // set the state back to good();
                              // we want to look at the characters
                  cout << "Sorry, that was not a number; please try again\n";
                  for (char ch; cin>>ch && !isdigit(ch); ) // throw away non-digits
                      /* nothing */ ;
                  if (!cin)
                      error("no input"); // we didn’t find a digit: give up
                  cin.unget(); // put the digit back, so that we can read the number
               }
               else
               {
                      error("no input"); // eof or bad: give up
               }
      }
      // if we get here n is in [1:10]
                            
                          


The code is messy because it mixes several concerns:

  • Reading values
  • Prompting the user for input
  • Writing error messages
  • Skipping past “bad” input characters
  • Testing the input against a range



We can make the code a little clearer by moving the skipping of “bad” characters into a separate function:

                            
      void skip_to_int()
      {
          if (cin.fail())
          { // we found something that wasn’t an integer
              cin.clear(); // we’d like to look at the characters
              for (char ch; cin>>ch; )
              { // throw away non-digits
                  if (isdigit(ch) || ch=='-')
                  {
                      cin.unget(); // put the digit back,
                                    // so that we can read the number
                      return;
                  }
              }
          }
          error("no input"); // eof or bad: give up
      }


      cout << "Please enter an integer in the range 1 to 10 (inclusive):\n";
      int n = 0;
      while (true)
      {
          if (cin>>n)
          { // we got an integer; now check it
              if (1<=n && n<=10)
                  break;
              cout << "Sorry " << n
                    << " is not in the [1:10] range; please try again\n";
          }
          else
          {
              cout << "Sorry, that was not a number; please try again\n";
              skip_to_int();
          }
      }
      // if we get here n is in [1:10]
                            
                          


The code is still long and messy. We can go further and put the reading itself into functions:

                            
      int get_int(); // read an int from cin
      int get_int(int low, int high); // read an int in [low:high] from cin

      int get_int()
      {
          int n = 0;
          while (true)
          {
              if (cin >> n)
                  return n;
              cout << "Sorry, that was not a number; please try again\n";
              skip_to_int();
          }
      }

      int get_int(int low, int high)
      {
          cout << "Please enter an integer in the range "
                << low << " to " << high << " (inclusive):\n";
          while (true)
          {
              int n = get_int();
              if (low<=n && n<=high)
                  return n;
              cout << "Sorry "
                    << n << " is not in the [" << low << ':' << high
                    << "] range; please try again\n";
          }
      }
                            
                          


Now we can read integers reliably as follows:

                            
      int n = get_int(1,10);
      cout << "n: " << n << '\n';
      int m = get_int(2,300);
      cout << "m: " << m << '\n';
                            
                          

10.7.2 Separating dialog from function

In the above example, the get_int() functions still mix up reading with writing messages to the user. Instead, we would like to call get_int() with the messages passed as arguments:

                            
    int strength = get_int(1,10, "enter strength", "Not in range, try again");
    cout << "strength: " << strength << '\n';
    int altitude = get_int(0,50000,
                        "Please enter altitude in feet",
                        "Not in range, please try again");
    cout << "altitude: " << altitude << "f above sea level\n";


    int get_int(int low, int high, const string& greeting, const string& sorry)
    {
        cout << greeting << ": [" << low << ':' << high << "]\n";
        while (true)
        {
            int n = get_int();
            if (low<=n && n<=high)
                return n;
            cout << sorry << ": [" << low << ':' << high << "]\n";
        }
    }
                            
                          


The point here is that “utility functions” used in many parts of a program shouldn’t have messages “hardwired” into them.
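Taking the same idea one step further, the streams themselves can be passed in, which also makes the functions easy to try out on strings. The following is a sketch, not the chapter's code: the extra stream and message parameters are additions, and std::runtime_error stands in for error().

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// skip characters until the next digit or minus sign; throw if input runs out
void skip_to_int(std::istream& in)
{
    if (in.fail()) {
        in.clear();                     // we'd like to look at the characters
        for (char ch; in >> ch; ) {
            if (std::isdigit(static_cast<unsigned char>(ch)) || ch == '-') {
                in.unget();             // put it back so the number can be read
                return;
            }
        }
    }
    throw std::runtime_error("no input");   // eof or bad: give up
}

// read an int from in, writing not_a_number to out for each failed attempt
int get_int(std::istream& in, std::ostream& out, const std::string& not_a_number)
{
    int n = 0;
    while (true) {
        if (in >> n) return n;
        out << not_a_number << '\n';
        skip_to_int(in);
    }
}

// read an int in [low:high] from in; all messages are supplied by the caller
int get_int(std::istream& in, std::ostream& out, int low, int high,
            const std::string& greeting, const std::string& sorry,
            const std::string& not_a_number)
{
    assert(low <= high);
    out << greeting << ": [" << low << ':' << high << "]\n";
    while (true) {
        int n = get_int(in, out, not_a_number);
        if (low <= n && n <= high) return n;
        out << sorry << ": [" << low << ':' << high << "]\n";
    }
}
```

Called with cin and cout, these behave like the versions above; called with string streams, the whole dialog can be checked without a keyboard.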

10.8 User-defined output operators

The output operator << can be defined for our own types. Here is a simple output operator for Date:

                        
  ostream& operator<<(ostream& os, const Date& d)
  {
      return os << '(' << d.year()
                  << ',' << d.month()
                  << ',' << d.day() << ')';
  }
                        
                      

This prints August 30, 2004, as (2004,8,30).

Given the definition of << for Date, the meaning of
cout << d1;
where d1 is a Date, is the call

operator<<(cout,d1);
Because operator<<() takes an ostream& as its first argument and returns it again as its return value, we can “chain” output operations:

                        
cout << d1 << d2; // means operator<<(cout,d1) << d2;
                  // means operator<<(operator<<(cout,d1),d2);
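A self-contained way to try this out is sketched below. Simple_date is a hypothetical stand-in for the chapter's Date class (plain members instead of year(), month(), and day()), and an ostringstream takes the place of cout so the result can be inspected.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// hypothetical stand-in for the chapter's Date class
struct Simple_date {
    int year;
    int month;
    int day;
};

std::ostream& operator<<(std::ostream& os, const Simple_date& d)
{
    return os << '(' << d.year << ',' << d.month << ',' << d.day << ')';
}

// usage: chain two dates and a separator into one string
std::string show(const Simple_date& d1, const Simple_date& d2)
{
    std::ostringstream os;
    os << d1 << " to " << d2;   // each << returns os, so the next << can use it
    return os.str();
}
```

For example, show(Simple_date{2004,8,30}, Simple_date{2004,9,1}) yields "(2004,8,30) to (2004,9,1)".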
                        
                      

10.9 User-defined input operators

The input operator >> can also be defined for our own types. Here is a simple input operator for Date:

                        
  istream& operator>>(istream& is, Date& dd)
  {
      int y, m, d;
      char ch1, ch2, ch3, ch4;
      is >> ch1 >> y >> ch2 >> m >> ch3 >> d >> ch4;
      if (!is) return is;
      if (ch1!='(' || ch2!=',' || ch3!=',' || ch4!=')')
      {   // oops: format error
          is.clear(ios_base::failbit);
          return is;
      }
      dd = Date{y,Date::Month(m),d}; // update dd
      return is;
  }
                        
                      

This reads items like (2004,8,20) and tries to make a Date out of those three integers.
If operator>>() reads an invalid Date, say (2004,8,35), Date’s constructor will throw an exception, which takes us out of this operator>>().
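The format check can be exercised on its own with a string stream. In this sketch Ymd is a hypothetical plain struct, so unlike Date it performs no validity check and never throws; only the punctuation is tested.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

struct Ymd {        // hypothetical: three integers, no validity check
    int year;
    int month;
    int day;
};

std::istream& operator>>(std::istream& is, Ymd& dd)
{
    int y = 0, m = 0, d = 0;
    char ch1 = 0, ch2 = 0, ch3 = 0, ch4 = 0;
    is >> ch1 >> y >> ch2 >> m >> ch3 >> d >> ch4;
    if (!is) return is;                         // ran out of input or found a non-number
    if (ch1 != '(' || ch2 != ',' || ch3 != ',' || ch4 != ')') {
        is.clear(std::ios_base::failbit);       // format error: leave dd unchanged
        return is;
    }
    dd.year = y;
    dd.month = m;
    dd.day = d;
    return is;
}

// usage: true if text starts with something in (year,month,day) format
bool parse(const std::string& text, Ymd& out)
{
    std::istringstream is{text};
    return static_cast<bool>(is >> out);
}
```

Because >> skips whitespace before each item, "( 2004 , 8 , 20 )" is accepted as well as "(2004,8,20)", while "(2004;8,20)" leaves the stream in the fail() state.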

10.10 A standard input loop

Here is a general pattern for reading a stream to its end while checking for errors as we go along.

Assume that ist is an istream:

                        
for (My_type var; ist>>var; )
{   // read until end of file
    // maybe check that var is valid
    // do something with var
}
// we can rarely recover from bad; don’t try unless you really have to:
if (ist.bad())
    error("bad input stream");
if (ist.fail())
{
    // was it an acceptable terminator?
}
// carry on: we found end of file
                        
                      


We can simplify this by letting the istream throw an exception of type failure if it goes bad:


// somewhere: make ist throw an exception if it goes bad:
ist.exceptions(ist.exceptions()|ios_base::badbit);
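A bad() stream is hard to provoke in a small example, so the sketch below enables failbit as well, purely to make the exception visible. That extra bit is an assumption of this illustration, not something the text above recommends, and read_throws is a made-up name.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// returns true if reading an int from text made the stream throw
// illustration only: failbit is enabled in addition to badbit
bool read_throws(const std::string& text)
{
    std::istringstream ist{text};
    ist.exceptions(ist.exceptions() | std::ios_base::badbit | std::ios_base::failbit);
    try {
        int n = 0;
        ist >> n;           // throws ios_base::failure if the read fails
        return false;
    }
    catch (const std::ios_base::failure&) {
        return true;
    }
}
```

With only badbit enabled, as in the line above, ordinary format errors and end of file still have to be tested for with fail() and eof().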
                        
                      


We can also decide to designate a character as a terminator:

                        
for (My_type var; ist>>var; )
{   // read until end of file
    // maybe check that var is valid
    // do something with var
}
if (ist.fail())
{   // use '|' as terminator and/or separator
    ist.clear();
    char ch;
    if (!(ist>>ch && ch=='|'))
        error("bad termination of input");
}
// carry on: we found end of file or a terminator
                        
                      


We can make the code simpler by putting the test into a function:

                        
  // somewhere: make ist throw if it goes bad:
  ist.exceptions(ist.exceptions()|ios_base::badbit);
  void end_of_loop(istream& ist, char term, const string& message)
  {
      if (ist.fail())
      {   // use term as terminator and/or separator
          ist.clear();
          char ch;
          if (ist>>ch && ch==term)
              return; // all is fine
          error(message);
       }
  }

  for (My_type var; ist>>var; )
  {   // read until end of file
      // maybe check that var is valid
      // . . . do something with var . . .
  }
  end_of_loop(ist,'|',"bad termination of file"); // test if we can continue
  // carry on: we found end of file or a terminator
                        
                      

Unless the stream is in the fail() state, end_of_loop() does nothing.
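The pattern can be tried on a string as sketched below. Two things differ from the function above and are assumptions of this sketch: std::runtime_error stands in for error(), and the function returns quietly at end of input so that a sequence without a terminator is also accepted. read_group is a hypothetical name.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

void end_of_loop(std::istream& ist, char term, const std::string& message)
{
    if (ist.eof()) return;              // addition in this sketch: end of input is fine
    if (ist.fail()) {                   // use term as terminator and/or separator
        ist.clear();
        char ch = 0;
        if (ist >> ch && ch == term) return;    // all is fine
        throw std::runtime_error(message);
    }
}

// read ints up to end of input or a '|' terminator
std::vector<int> read_group(std::istream& ist)
{
    std::vector<int> v;
    for (int x; ist >> x; )
        v.push_back(x);
    end_of_loop(ist, '|', "bad termination of input");
    return v;
}
```

After read_group returns because of a '|', the stream is good() again, so the caller can go on reading whatever follows the terminator.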

10.11 Reading a structured file

Assume that you have a file of temperature readings that has been structured like this:

  • A file holds years (of months of readings).
    A year starts with { year followed by an integer giving the year, such as 1900, and ends with }.
  • A year holds months (of days of readings).
    A month starts with { month followed by a three-letter month name, such as jan, and ends with }.
  • A reading holds a time and a temperature.
    A reading starts with a ( followed by day of the month, hour of the day, and temperature, and ends with a ).


Example:

                          
      { year 1990 }
      {year 1991 { month jun }}
      { year 1992 { month jan ( 1 0 61.5) } {month feb (1 1 64) (2 2 65.2) } }
      {year 2000
            { month feb (1 1 68 ) (2 3 66.66 ) ( 1 0 67.2)}
            {month dec (15 15 -9.2 ) (15 14 -8.8) (14 0 -2) }
      }
                          
                        


The above format is a little unusual. There is a move toward hierarchically structured files (such as HTML and XML files), but we can rarely control the format of the files we have to read.

What we can do is choose an in-memory representation of the data that suits our needs, and we can often pick the output format.

10.11.1 In-memory representation

The obvious first choice for representing this data in memory is three classes: Year, Month, and Reading.

But Reading (day of month, hour of day, temperature) is “odd” and makes sense only within a Month. It is also unstructured.

A better representation is:
Year as a vector of 12 Months,
Month as a vector of about 30 Days,
Day as 24 temperatures (one per hour)

Day, Month, and Year are simple data structures, each with a constructor.
We need a notion of “not a reading” for an hour of a day for which we haven’t (yet) read data:
const int not_a_reading = -7777; // less than absolute zero


Similarly, for a month without data:
const int not_a_month = -1;

The three classes are:

                            
                              struct Day
                              {
                                  vector<double> hour {vector<double>(24,not_a_reading)};
                              };

                              struct Month
                              {   // a month of temperature readings
                                  int month {not_a_month}; // [0:11] January is 0
                                  vector<Day> day {32}; // [1:31] one vector of readings per day
                              };

                              struct Year
                              {    // a year of temperature readings, organized by month
                                  int year;  // positive == A.D.
                                  vector<Month> month {12}; // [0:11] January is 0
                              };
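To see what this representation gives us, the sketch below restates the three structs using only the standard library and adds a hypothetical helper, count_readings, that counts the hours that actually hold data. The explicit vector sizes are written with parentheses here to make the element counts unambiguous.

```cpp
#include <cassert>
#include <vector>

const int not_a_reading = -7777;    // less than absolute zero
const int not_a_month = -1;

struct Day {
    std::vector<double> hour = std::vector<double>(24, not_a_reading);
};

struct Month {                                      // a month of temperature readings
    int month = not_a_month;                        // [0:11] January is 0
    std::vector<Day> day = std::vector<Day>(32);    // [1:31]; day[0] is not used
};

struct Year {                                       // a year of readings, organized by month
    int year = 0;
    std::vector<Month> month = std::vector<Month>(12);  // [0:11] January is 0
};

// hypothetical helper: count the hours in y that hold a reading
int count_readings(const Year& y)
{
    assert(y.month.size() == 12);
    int n = 0;
    for (const Month& m : y.month)
        for (const Day& d : m.day)
            for (double t : d.hour)
                if (t != not_a_reading) ++n;
    return n;
}
```

A default-constructed Year therefore holds 12 empty Months, and storing a reading is a plain indexing operation such as y.month[1].day[2].hour[3] = 65.2.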
                            
                          

10.11.2 Reading structured values

The Reading type and its input operator:

                            
                              struct Reading
                              {
                                  int day;
                                  int hour;
                                  double temperature;
                              };

                              istream& operator>>(istream& is, Reading& r)
                              // read a temperature reading from is into r
                              // format: ( 3 4 9.7 )
                              // check format, but don’t bother with data validity
                              {
                                  char ch1;
                                  if (is>>ch1 && ch1!='(')
                                  {   // could it be a Reading?
                                      is.unget();
                                      is.clear(ios_base::failbit);
                                      return is;
                                  }
                                  char ch2;
                                  int d;
                                  int h;
                                  double t;
                                  is >> d >> h >> t >> ch2;
                                  if (!is || ch2!=')')
                                      error("bad reading"); // messed-up reading
                                  r.day = d;
                                  r.hour = h;
                                  r.temperature = t;
                                  return is;
                              }
                            
                          


The Month input operator:

                            
                    istream& operator>>(istream& is, Month& m)
                    // read a month from is into m
                    // format: { month feb . . . }
                    {
                        char ch = 0;
                        if (is >> ch && ch!='{')
                        {
                            is.unget();
                            is.clear(ios_base::failbit); // we failed to read a Month
                            return is;
                        }

                        string month_marker;
                        string mm;
                        is >> month_marker >> mm;
                        if (!is || month_marker!="month")
                            error("bad start of month");
                        m.month = month_to_int(mm);
                        int duplicates = 0;
                        int invalids = 0;
                        for (Reading r; is >> r; )
                        {
                            if (is_valid(r))
                            {
                                if (m.day[r.day].hour[r.hour] != not_a_reading)
                                    ++duplicates;
                                m.day[r.day].hour[r.hour] = r.temperature;
                            }
                            else
                                ++invalids;
                        }
                        if (invalids)
                            error("invalid readings in month",invalids);
                        if (duplicates)
                            error("duplicate readings in month", duplicates);
                        end_of_loop(is,'}',"bad end of month");
                        return is;
                    }
                            
                          


Month’s >> does a quick check that a Reading is plausible before storing it:

                            
                              constexpr int implausible_min = -200;
                              constexpr int implausible_max = 200;
                              bool is_valid(const Reading& r)
                              // a rough test
                              {
                                  if (r.day<1 || 31<r.day)
                                      return false;
                                  if (r.hour<0 || 23<r.hour)
                                      return false;
                                  if (r.temperature<implausible_min || implausible_max<r.temperature)
                                      return false;
                                  return true;
                              }
                            
                          


Year’s >> is similar to Month’s >>:

                            
                              istream& operator>>(istream& is, Year& y)
                              // read a year from is into y
                              // format: { year 1972 . . . }
                              {
                                  char ch = 0; // initialized, so the test below is well defined even if the read fails
                                  is >> ch;
                                  if (ch!='{')
                                  {
                                      is.unget();
                                      is.clear(ios::failbit);
                                      return is;
                                  }

                                  string year_marker;
                                  int yy;
                                  is >> year_marker >> yy;
                                  if (!is || year_marker!="year")
                                      error("bad start of year");
                                  y.year = yy;
                                  while(true)
                                  {
                                      Month m; // get a clean m each time around
                                      if(!(is >> m))
                                          break;
                                      y.month[m.month] = m;
                                  }
                                  end_of_loop(is,'}',"bad end of year");
                                  return is;
                              }
                            
                          


Note that operator>>(istream& is, Month& m) doesn’t assign a brand-new value to m; it simply adds data from Readings to m. That is why the loop above makes a new Month to read into each time we do is>>m. Alternatively, we could reuse a single Month and reinitialize it after each read:

                            
                              for (Month m; is >> m; )
                              {
                                y.month[m.month] = m;
                                m = Month{}; // “reinitialize” m
                              }
                            
                          


We can try all of this out as follows:

                            
                            // open an input file:
                            cout << "Please enter input file name\n";
                            string iname;
                            cin >> iname;
                            ifstream ifs {iname};
                            if (!ifs)
                                error("can't open input file",iname);
                            ifs.exceptions(ifs.exceptions()|ios_base::badbit); // throw for bad()

                            // open an output file:
                            cout << "Please enter output file name\n";
                            string oname;
                            cin >> oname;
                            ofstream ofs {oname};
                            if (!ofs)
                                error("can't open output file",oname);

                            // read an arbitrary number of years:
                            vector<Year> ys;
                            while(true)
                            {
                                Year y; // get a freshly initialized Year each time around
                                if (!(ifs>>y))
                                    break;
                                ys.push_back(y);
                            }
                            cout << "read " << ys.size() << " years of readings\n";
                            for (Year& y : ys)
                                print_year(ofs,y);
                            
                          

10.11.3 Changing representations

The tedious way to get Month’s >> to convert month names would be:

                            
                            if (s=="jan")
                                m = 1;
                            else if (s=="feb")
                                     m = 2;
                            . . .
                            
                          


Instead, we keep the input representations in a vector<string> and add a lookup function:

                            
                              vector<string> month_input_tbl = {
                              "jan", "feb", "mar", "apr", "may", "jun", "jul",
                              "aug", "sep", "oct", "nov", "dec"
                              };

                              int month_to_int(string s)
                              // is s the name of a month? If so return its index [0:11] otherwise -1
                              {
                                  for (int i=0; i<12; ++i)
                                      if (month_input_tbl[i]==s)
                                          return i;
                                  return -1;
                              }
                            
                          


When producing output, we would like to print the full name of the month:

                            
                vector<string> month_print_tbl = {
                "January", "February", "March", "April", "May", "June", "July",
                "August", "September", "October", "November", "December"
                };

                string int_to_month(int i)
                // months [0:11]
                {
                    if (i<0 || 12<=i)
                        error("bad month index");
                    return month_print_tbl[i];
                }
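Put together, the two tables convert from the input representation to the output representation by way of the integer index. The sketch below restates both tables with the standard library only; expand_month is a hypothetical helper, and std::runtime_error stands in for error().

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

const std::vector<std::string> month_input_tbl = {
    "jan", "feb", "mar", "apr", "may", "jun", "jul",
    "aug", "sep", "oct", "nov", "dec"
};

const std::vector<std::string> month_print_tbl = {
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December"
};

// is s the name of a month? If so return its index [0:11], otherwise -1
int month_to_int(const std::string& s)
{
    assert(month_input_tbl.size() == 12);
    for (int i = 0; i < 12; ++i)
        if (month_input_tbl[i] == s) return i;
    return -1;
}

// months [0:11]
std::string int_to_month(int i)
{
    if (i < 0 || 12 <= i) throw std::runtime_error("bad month index");
    return month_print_tbl[i];
}

// usage: input representation -> index -> output representation
std::string expand_month(const std::string& abbrev)
{
    return int_to_month(month_to_int(abbrev));
}
```

Keeping the index as the common representation means a new input or output spelling only needs a new table, not new conversion code.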
                            
                          

Test Yourself!
  1. An istream fundamentally...
    1. hides all user input from the programmer
    2. hides the details of input devices from the programmer
    3. hides what data types are being input from the programmer
  2. A file is fundamentally...
    1. an icon inside a folder
    2. a list of records
    3. a sequence of bytes stored on some media
  3. Why is input usually harder than output?
    1. istream is a trickier class than ostream
    2. input devices require more assembly language code than output devices
    3. The user might input almost anything!
  4. What are the four steps for reading a file?
    1. delete the file, read from stdin, write the results there, and close stdin
    2. know its name, open it, read in the contents, and close it
    3. generate the file, write out its contents, close it, and verify a good result
  5. The ______ objects have values that can be tested for various conditions.
    1. ofstream
    2. ifstream
    3. stream
    4. osstream
  6. Where does cin stop the extraction of data?
    1. by seeing >>
    2. none of the above
    3. both a and b
    4. by seeing a blankspace
  7. Which of the following is not a stream state (istream)?
    1. bad()
    2. eof()
    3. success()
    4. good()
    5. fail()
  8. Unformatted input functions are handled by
    1. instream
    2. bufstream
    3. ostream
    4. istream
  9. A class that defines cout, cerr and clog objects and stream insertion operator is
    1. istream
    2. fstream
    3. ostream
    4. kstream
Answers

1. b; 2. c; 3. c; 4. b; 5. b; 6. d; 7. c; 8. d; 9. c;

Drill