Large Numbers - File Input Lesson
(Page 3 of 5 )
When you start coding a project like this, the first question that pops into your head is how it ought to communicate with the user. I'm not talking about how we'll use it in the C++ language per se (although we can't totally ignore that), but more importantly, about how we get these kinds of numbers into the program.
One thing is sure: you don't want to type the number into the console when the program starts. It would take too much time, and what if you mistype a digit? You get to start all over from the beginning. Using files is a more elegant and practical solution.
It shouldn't be too hard to implement, and reading from a file can be quite easy. I said "can be" and not "is" for one reason. When you start to read a file that's literally LARGE you will notice that it takes a while. And let's not forget that time is something that we can't afford to waste in the hectic life of the twenty-first century.
If you are really serious about this and take a deeper look at the problem, you will find that there is a solution that can drastically improve the reading speed of our class and application. The key lies deep in the mechanism of C++ and is often buried within the pages of dusty big books.
It's all about file input/output. You see, with the standard way of reading files, the ">>" operator, C++ inspects every byte it takes from the buffer, and for large files this per-byte checking can noticeably increase the reading time.
This is true for all methods that are delimiter sensitive (they read until they hit white space such as a space, a newline, a tab, etcetera). So it's quite obvious that functions like scanf, cin, gets, getline and so forth aren't an option for us. Some of them also copy the data: each string is stored in a temporary place before it is handed over, which wastes time on an extra memory allocation and an extra copy.
The solution to our problem is to read the files in chunks and do the checking ourselves. Furthermore, there are a few platform-dependent improvement possibilities. Under Windows it is advisable to read the files in raw, untranslated mode, which is binary mode (text mode translates the line endings on the fly). Doing the same under UNIX-like systems doesn't provide any performance increase at all; the good part is that no performance hit happens either.
So reading the data from the file should look something like this. Note that we open the file in binary mode, and that we read from it in BUFFER_SIZE chunks. The size I used is 512K, pretty large for fast reading and appropriate to today's systems.
You will also see that I implemented an input correctness check after the reading, so any inappropriate data that was introduced gets dropped; not_good is a predicate that returns true if a character is not acceptable and false otherwise. Deleting these chars from an STL container must be done with the erase-remove idiom. This is because remove_if only shifts the chars that are kept to the front of the string and returns an iterator to the new logical end; whatever lies behind that iterator is leftover. Thus, with erase we cut off that tail of the string.
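#define BUFFER_SIZE 524288
// Size of the chunk in which the data is read: 512K
...
ifstream inputFile( file.c_str(), std::ios_base::in | std::ios_base::binary );
char block[BUFFER_SIZE];
while ( inputFile.read( block, BUFFER_SIZE ) )
{
    // block is not null-terminated, so append exactly BUFFER_SIZE bytes
    number.append( block, BUFFER_SIZE );
}
// the last read() fails on a partial chunk; gcount() tells how many bytes it still delivered
number.append( block, inputFile.gcount() );
inputFile.close(); // close the file
// erase-remove idiom: drop every char for which not_good returns true
number.erase( remove_if( number.begin(), number.end(), not_good ), number.end() );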
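For readers who want to try the idea in isolation, here is a minimal, self-contained sketch of the same chunked reading plus filtering. The function name read_number, its chunk_size parameter and the body of not_good are my assumptions for illustration only; the class in this article may well define them differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Assumed predicate: true for every character that is NOT part of a plain
// decimal number (digits, sign characters, decimal point).
bool not_good(char c)
{
    return !((c >= '0' && c <= '9') || c == '.' || c == '-' || c == '+');
}

// Reads the whole file in fixed-size binary chunks and keeps only the
// characters accepted by the filter. chunk_size must be greater than zero.
// Returns an empty string if the file cannot be opened.
std::string read_number(const std::string& path, std::size_t chunk_size = 512 * 1024)
{
    std::ifstream input(path.c_str(), std::ios_base::in | std::ios_base::binary);
    std::string number;
    if (!input)
        return number;

    std::vector<char> block(chunk_size); // heap buffer, not null-terminated
    while (input.read(&block[0], static_cast<std::streamsize>(block.size())))
        number.append(&block[0], block.size()); // a full chunk was read
    // The failing read() still delivered gcount() bytes of the last, partial chunk.
    number.append(&block[0], static_cast<std::size_t>(input.gcount()));

    // Erase-remove idiom: remove_if compacts the kept characters to the front
    // and returns the new logical end; erase cuts off the leftover tail.
    number.erase(std::remove_if(number.begin(), number.end(), not_good), number.end());
    return number;
}
```

The buffer lives on the heap here because half a megabyte is a sizeable share of the default stack on some platforms; that is a design choice of this sketch, not something the original listing requires.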
Additional constructors are also provided within the class, among them a default one that initializes the number with a zero value. We are also going to use a little function that makes sure we aren't wasting precious memory. It's called opt; it cuts the superfluous zeroes from the start and/or the end. So a number like 000325.355650000 will be stored as just 325.35565.
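To make the zero trimming concrete, here is a rough sketch of what such a function could do. It is not the article's opt: the name trim_zeros is hypothetical, and I assume the number is held as an unsigned decimal string with '.' as the decimal point.

```cpp
#include <string>

// Hypothetical stand-in for opt, working on an unsigned decimal string.
std::string trim_zeros(const std::string& s)
{
    std::string::size_type dot = s.find('.');
    std::string int_part  = s.substr(0, dot); // whole string if there is no '.'
    std::string frac_part = (dot == std::string::npos) ? "" : s.substr(dot + 1);

    // Drop leading zeros of the integer part, but keep at least one digit.
    std::string::size_type first = int_part.find_first_not_of('0');
    int_part = (first == std::string::npos) ? "0" : int_part.substr(first);

    // Drop trailing zeros of the fractional part.
    std::string::size_type last = frac_part.find_last_not_of('0');
    frac_part = (last == std::string::npos) ? "" : frac_part.substr(0, last + 1);

    // Leave out the decimal point when nothing follows it.
    return frac_part.empty() ? int_part : int_part + "." + frac_part;
}
```

With this sketch, trim_zeros("000325.355650000") yields "325.35565", and a value made only of zeros collapses to "0".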
A compare function was also included so we can easily compare two numbers and determine which one of them is larger. The function returns 1 if the first is larger, -1 if the second is larger, and 0 if they are equal. With the help of the Set function we can assign a new value to our number at any time. There is also an abs function that changes the sign of the number to positive. Shifting the decimal point in both directions with the ShiftIt method is also important. All of these functions will be needed to write the arithmetic methods. But enough talk, let's get started with it.
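As an illustration of the 1 / -1 / 0 convention, here is one possible way to compare two such numbers. The name compare_numbers is hypothetical, and the sketch assumes both inputs are non-negative decimal strings without superfluous leading or trailing zeroes; the class's own compare may be built differently.

```cpp
#include <string>

// Hypothetical compare for non-negative, already normalized decimal strings.
// Returns 1 if a > b, -1 if a < b and 0 if they are equal.
int compare_numbers(const std::string& a, const std::string& b)
{
    std::string::size_type da = a.find('.');
    std::string::size_type db = b.find('.');
    std::string ia = a.substr(0, da); // integer parts
    std::string ib = b.substr(0, db);

    // More integer digits means a larger number.
    if (ia.size() != ib.size())
        return ia.size() > ib.size() ? 1 : -1;

    // Same length: the lexicographic order of the digits is the numeric order.
    int c = ia.compare(ib);
    if (c == 0) {
        // Fractional parts compare lexicographically too, because a shorter
        // fraction that is a prefix of a longer one is the smaller value.
        std::string fa = (da == std::string::npos) ? "" : a.substr(da + 1);
        std::string fb = (db == std::string::npos) ? "" : b.substr(db + 1);
        c = fa.compare(fb);
    }
    return (c > 0) - (c < 0); // normalize to 1, 0 or -1
}
```

Comparing digit strings this way never converts them to a built-in numeric type, so the length of the numbers is limited only by memory.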
More By Gabor Bernat