
FB n00b: Tutorial 9

Note: This tutorial is for absolute beginners. If you aren't an absolute beginner and you dislike the tutorial, that's too bad. If you are an absolute beginner and there's something you don't understand, read through the tutorial a second time. If you STILL have difficulties, e-mail me (TheMysteriousStrangerFromMars@yahoo.com) and I'll try to help you out. I do assume you at least have a decent knowledge of how to use your computer - if not, there's not much I can do for you. By the way, you'll need FreeBASIC to use this tutorial - you can download it at freebasic.net, so be sure to download and install it before doing this tutorial. Now start reading!

File I/O

You've come far in FB, but there is still a ways to go before you know everything you need to write a really useful program. One thing that is essential is file I/O.

You know what files are. Everyone does. Files let you store information permanently. Generally when you start a program it starts in some "default" state - as a blank slate, so to speak. If you do something in the program and fill the slate, all your work will be lost when you close the program - unless you save it in a file, so you can load it all back the next time you need the same information. How the programmer does this in his program is what this chapter is all about.

First thing you need to understand is the concept of a handle. A handle is a number used to refer to something, like a pointer refers to a location in memory - handles may be pointers, but not necessarily. Sometimes they may just be indexes into a pointer array (the number that goes between []) or even a regular array.

When dealing with File I/O, we will generally be sending data to the file or pulling it out (output or input, of course). One way to simplify things for the programmer is to use a handle to refer to the file, rather than invoking the name of the file each time. (This also makes things more efficient for the operating system, which would otherwise have to access the filesystem each time it searches for the file on the disk to read from or write to it.)

So we use Open to set up a file handle for a file, then we can access the file using the handle until we decide to Close the file.

  Open "myfile.txt" For Input As #1
#1 is the file handle here; "myfile.txt" is the file being opened (of course, you can also use a string variable instead of a literal string). Right now we're studying text files; a little later we'll look at binary files.

You can actually use regular numbers like 1 as the handle; however, it's better to use a variable as in large programs you may have a lot of files open at once and you might get the wrong one (and furthermore, what if you need to open a variable number of files and the number of files being opened will depend on some factor unknown until you run the program?) You can use a variable and set it with the function FreeFile() which will give you the next handle that isn't being used yet:

  Dim As Integer myHandle

  '...

  'Set the variable to FreeFile() and open using the variable as the handle
  myHandle = FreeFile ()
  Open "myfile.txt" For Input As #myHandle

Once there's a handle to an open file, there are several functions you may use to get various information about the file. One thing you can find is the length in bytes of the file:

  lengthOfFile = Lof(myHandle)
Lof() gives the file length; if the Lof() is 0, you know the file is empty.

Since you normally go through a file from beginning to end, you can also check at any time whether you've reached the end of the file or not:

  If Eof(myHandle) Then
    'We've reached the end of the file
  Else
    'We haven't reached the end of the file
  End If
If Eof() returns zero, you haven't yet reached the end of the file. Otherwise, you're at the end of the file.

One thing you notice in the code

  Open "myfile.txt" For Input As #myHandle
is the fact that we're opening the file "For Input." This means you can only read information out of the file; you can't change anything in the file. We'll talk about Output in a moment, but first we need to cover how to actually get input from the file once you've opened it. You can, in fact, use the Input command much the same as you use it to get input from the keyboard:
  Input #myHandle, someString
But this method can be somewhat awkward to use for various reasons I won't go into here. Myself, I generally prefer to use Line Input for text file input. It gets a whole line from the file; the next time you call it, it gets the next line from the file for you, and so on. You use it much like Input in the above example:
  Line Input #myHandle, someString
A simple example of using all this is simply to print the contents of a text file on the screen:
  'Reads a file and prints it on the screen

  Dim As Integer myHandle
  Dim As String fileName, thisLine

  'Ask for the name of the file to open
  Print "What file would you like to view?"
  Input fileName

  'Get the next free file handle and open it
  myHandle = FreeFile()
  Open fileName For Input As #myHandle

  'And until we reach the end of the file...
  Do Until Eof(myHandle)
    'Read a line
    Line Input #myHandle, thisLine
    'Then print it on the screen
    Print thisLine
  Loop

  'And when we're done with the whole thing, close it.
  Close #myHandle

  'And as usual, Sleep so we can read what it says
  Sleep

Bear in mind that the simplest way to read a line again after you've already read it is to close the file and open it all over again. If you read the whole thing into a temporary buffer, you can go back and forth over it as much as you want, but of course this is more complicated. Generally, though, you can go through a text file just once or twice, handling each line as it comes to you and not worrying about the next line or the previous one.
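Here is a minimal sketch of the "temporary buffer" idea, assuming a text file named myfile.txt already exists next to the program. The names allLines and lineCount are made up for this example, and Open is called here as a function (it returns 0 on success) so the program can stop cleanly if the file is missing:

```freebasic
'Reads a whole text file into an array so the lines can be revisited freely
'("myfile.txt", allLines and lineCount are placeholders for this sketch)

Dim As Integer myHandle, lineCount
Dim As String thisLine
Dim allLines() As String   'a dynamic array; it starts out empty

myHandle = FreeFile()
If Open("myfile.txt" For Input As #myHandle) <> 0 Then
  Print "Could not open myfile.txt"
  Sleep
  End
End If

Do Until Eof(myHandle)
  Line Input #myHandle, thisLine
  'Grow the array by one element, keeping what's already in it
  ReDim Preserve allLines(0 To lineCount)
  allLines(lineCount) = thisLine
  lineCount += 1
Loop
Close #myHandle

'The file is closed, but we can still go over the lines in any order.
'Here we print them backwards, last line first.
For i As Integer = lineCount - 1 To 0 Step -1
  Print allLines(i)
Next i

Sleep
```

Growing the array one element at a time is not the fastest approach, but it keeps the sketch short and is fine for small files.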

And now it's time to look at output to text files. Output is rather different. You would of course open a file "For Output"; and you can use Print to output a line at a time to it in much the same way as you use Input or Line Input to get input from a file:

Print #myHandle, someString
This works as expected; and if you want to write all to the same line with several print statements, you can use the semicolon (;) just as you do with the screen Print.

However, there is something to note carefully about this method: when you open a file For Output, anything already in the file is cleared. That is, even if the file you open already exists and contains data, once you open it For Output that data is lost forever. This is of course fine if you're creating a totally new document; it's not so great if a document by that name already exists and you don't want to lose the data inside it. One way around this is to open the file using the Append mode instead of Output; this makes it so that everything you write to the file is written after the information that is already in the file. This is great if you want to add information to the end of the file, but it won't let you modify the original contents of the file. In any case, the best solution might be to open the file for Input first and check if there's anything in it. If not, you can close it, open it again for Output, and write your information. Otherwise, you'll either want to append or else, if you want to modify the contents of the file, you'll need to input to a memory buffer, modify the memory buffer, then close the file and open it for Output. Somewhat complicated either way, but the low-level way would be much more complicated, believe me!
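To see the difference between Output and Append in practice, here is a small sketch. The file name "notes.txt" is just a placeholder I picked for the example; the program wipes that file, so don't point it at anything you care about.

```freebasic
'Output vs. Append ("notes.txt" is a placeholder name for this sketch)

Dim As Integer myHandle
Dim As String thisLine

'Output: creates the file, or clears it if it already exists
myHandle = FreeFile()
Open "notes.txt" For Output As #myHandle
Print #myHandle, "First ";     'the semicolon keeps us on the same line
Print #myHandle, "line"
Close #myHandle

'Append: keeps what is already there and writes after it
myHandle = FreeFile()
Open "notes.txt" For Append As #myHandle
Print #myHandle, "Second line"
Close #myHandle

'Read the file back to check that both lines are in it
myHandle = FreeFile()
Open "notes.txt" For Input As #myHandle
Do Until Eof(myHandle)
  Line Input #myHandle, thisLine
  Print thisLine
Loop
Close #myHandle

Sleep
```

Try changing Append to Output in the middle section and running it again: only "Second line" should survive.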

Now this method of handling files might seem great, but it's kind of limited. This is for handling text files, and most files aren't text files. For example, a graphics image is not textual data - the data is all binary. An executable program, the machine code the compiler generates based on the programs you write, is also binary data. You can't create or modify binary files using text mode file handling, because text mode does things that binary mode doesn't. For example, text mode files have lines, and each line is separated by Chr(13), Chr(10), or both - depending on which operating system you're using. Whereas binary files don't have lines, because they just go on and on. Because of the differences, FreeBASIC provides an entirely different mode for accessing binary files. Unlike text files, which have several modes (Output, Input, and Append), Binary files just have one mode: Binary.

When you open a file in binary mode, you can access it byte-by-byte, reading or writing using Get and Put. There are other ways of accessing binary data, but we'll start with the byte-by-byte stuff. I'm not going to go into detail on this because at this point binary file I/O isn't nearly so useful to you. Once we cover some topics involving binary data, like graphics, we'll revisit this topic. I will give a couple of commented and carefully explained examples, though.

First, a simple program to write 256 bytes out to a file, followed by a program to read the bytes back in from the same file. The bytes written are simply the numbers from 0 to 255, in order. Very simple program really, it's just an example to show how it's done:

Dim As Integer myHandle

myHandle = FreeFile()
Open "testfile.bin" For Binary As #myHandle
  Dim b As uByte
  For i As Integer = 0 To 255
    b = i
    Put #myHandle, i+1, b
  Next i
Close #myHandle

And here is the second program, which reads the bytes back in:

Dim As Integer myHandle
Dim fileData As uByte Ptr
Dim fileSize As Integer

myHandle = FreeFile()
Open "testfile.bin" For Binary As #myHandle

  fileSize = Lof(myHandle)

  fileData = Allocate(fileSize)
  
  'Read the whole file in one go, starting at position 1 (the first byte)
  Get #myHandle, 1, *fileData, fileSize
  'Can also do this instead:
  'For i As Integer = 1 To fileSize
  '  Get #myHandle, i, fileData[i-1]
  'Next i

Close #myHandle

For i As Integer = 0 To fileSize-1
  Print Chr(fileData[i])
Next i

DeAllocate(fileData)

Sleep

Get and Put both work the same way. First, you have the file handle. Then you have the position in the file to read from or write to. In Binary mode this is always given in bytes (if you're using another type of data, you may wish to convert using the SizeOf() operator). Next, you have the variable holding the actual data. Finally, you may optionally specify how many consecutive elements to transfer; for a byte buffer, as here, that's simply the number of bytes. This is for when you have a pointer to a buffer or a regular array. The first example simply shows the normal syntax; the second shows how you can read in a variable-sized file this way. Read through both examples, but especially the second one. It's a bit complicated, since it uses pointers, but I think you can figure it out on your own. Notice that positions in the file start at 1, not 0: the first byte of the file is at position 1. (A position of 0, like leaving the position out, means "carry on from the current position in the file".) Most of the time in programming you start at 0, but FreeBASIC's binary file I/O functions start at 1 in order to maintain compatibility with QBASIC's equivalent functions (which also started at 1). Just get used to it, but prepare to start at 0 when you switch to other languages or libraries.
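As a sketch of the SizeOf() conversion: when a file holds values bigger than one byte, the value with index n (counting from 0) sits at byte position 1 + n * SizeOf(type). The file name "numbers.bin" and the choice of Long (a 32-bit integer) are my own assumptions for this example.

```freebasic
'Writing and reading 32-bit values at computed byte positions
'("numbers.bin" is a placeholder name for this sketch)

Dim As Integer myHandle
Dim As Long number

myHandle = FreeFile()
Open "numbers.bin" For Binary As #myHandle

  'Write the squares of 0 to 9, one Long after another
  For i As Integer = 0 To 9
    number = i * i
    Put #myHandle, 1 + i * SizeOf(Long), number
  Next i

  'Jump straight to the value with index 7 and read it back
  Get #myHandle, 1 + 7 * SizeOf(Long), number

Close #myHandle

Print "Value at index 7: " + Str(number)   'prints 49

Sleep
```

Since each Long takes 4 bytes, index 7 lands at byte position 29; forgetting the "1 +" would read from the wrong place.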

In case Get and Put seem simple enough, let me point out that they aren't limited to bytes and buffers. You can load any standard data type using Get/Put, or even UDTs. Things can become more complicated this way, but it's doable. If you need to use UDTs with Get/Put, I recommend you read the manual thoroughly. There are a number of things to consider, for example the fact that when UDTs are stored in memory usually the cells are padded to multiples of four bytes because the bus can access them more quickly that way. When you write to a binary file, you may wish to remove padding to save space, or if you're using a file format that doesn't use padding. For this sort of thing, you may wish to look up the keyword Field. I personally tend to read in the UDT one field at a time (that is, reading in each variable in the UDT separately, rather than reading the entire UDT in) - this seems to be easiest for OOP purposes. However, if your UDTs have lots of variables you may prefer to read the whole UDT in at once.
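To get a feel for what padding does, here is a small sketch that compares the size of the same UDT with and without Field = 1. The type and field names are made up for the example. The packed size is 1 + 4 = 5 bytes; the padded size depends on the compiler's alignment rules and is typically 8.

```freebasic
'Comparing the size of a padded and an unpadded UDT
'(PaddedRecord, PackedRecord and their fields are made up for this sketch)

Type PaddedRecord
  flag   As uByte   '1 byte, normally followed by padding
  amount As ULong   '4 bytes
End Type

Type PackedRecord Field = 1
  flag   As uByte   '1 byte, no padding after it
  amount As ULong   '4 bytes
End Type

Print "Padded size: " + Str(SizeOf(PaddedRecord))
Print "Packed size: " + Str(SizeOf(PackedRecord))

Sleep
```

If you Put a whole PaddedRecord into a file, the padding bytes get written too, which is why file formats with tightly packed headers call for Field = 1.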

As a final example of binary file I/O, I'll show you how to read the header information from a BMP file. This may be helpful later on when we study graphics (we'll re-use this code to create a function that loads a BMP into memory intelligently - wait until we get to the graphics section to see why).

The BMP file format is very simple. Like most image formats, it consists of a file header, which contains information about the file (for example, the width and height of the picture in pixels), followed by the actual file data (in this case, the image data). We're mostly concerned about the file header; our function will get the information about the image itself, and we'll deal with the actual image data in a later chapter. Here is a description of the BMP file header:

Name             Offset in file   Size   Description
Magic Number     0                2      This is a signal to prove that the file is, in fact, a BMP file and not some other kind of file. These two bytes should always be the ASCII characters "BM", that is, the bytes &h42 and &h4D. (Read into a uShort on a little-endian machine such as a PC, they come out as &h4D42.) If not, the file isn't a BMP.
File Size        2                4      Why bother since we can always check using Lof()? Well, this is also to make sure the file is a valid BMP file. Anyone can make a file start with the two letters "BM" - even a text file. But if it starts with the two letters "BM" and the next four bytes are the file size, chances are it really is a valid BMP. This is all to make sure we don't try to load a file that isn't a valid BMP as a BMP, because if we do we could end up with some problems, such as segmentation faults - that's where your pointers mess something up. If the file is not a valid BMP, we can detect it and avoid loading it, thus preventing a fatal error which causes our program to go bye-bye.
<RESERVED>       6                4      These are reserved areas. I'm not sure what they're for, but apparently they aren't used for anything and are always 0. They are actually two reserved areas, each two bytes, res1 and res2, but I've grouped them together since they aren't important.
Pixel Offset     10               4      This tells where the header ends and the pixel data actually starts. My tests show this as usually being &h36, or 54, when you're using a normal 24-bit BMP, but in some cases this is different. For 8-bit BMPs, the color table that maps 8-bit values into 24-bit colors takes up some extra space, so in the example file this value was 1078 (&h436). In any case, this is simply a guide for the loading function to know where the pixel data starts.
BMI Size         14               4      The fields up to this point were part of the "Bitmap File Header". Starting with this field we have the "Bitmap Information Header" or BMI. This field simply gives the size of the BMI. In both my tests it was 40 (&h28) but it could differ of course.
Width            18               4      This is the width of the image in pixels.
Height           22               4      This is the height of the image in pixels.
Planes           26               2      Some image formats store R, G, and B separately in what are known as planes; in that case BPP would be 8 and this would be 3. In BMP files, however, this field is always 1. In general, multiply planes by BPP, then divide by 8 to get the number of bytes per pixel.
Bits per Pixel   28               2      The BPP, or bits per pixel. In a 24-bit image this is 24 (with planes 1). Multiply this value by planes, then divide by 8, to get the actual number of bytes per pixel.
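The program further down compares the magic number against &h4D42, which looks backwards next to "BM" (&h42, then &h4D). The following sketch shows why, assuming a little-endian machine such as an ordinary PC; the variable names are my own. It stores &h4D42 in a uShort and then looks at the two bytes in the order they sit in memory, which is also the order they sit in the file.

```freebasic
'Why "BM" shows up as &h4D42 when read into a uShort
'(assumes a little-endian machine, such as an x86 PC)

Dim As uShort magicNum = &h4D42
Dim As uByte Ptr rawBytes = CPtr(uByte Ptr, @magicNum)

'The low byte (&h42) is stored first, the high byte (&h4D) second
Print "First byte:  &h" + Hex(rawBytes[0]) + " = " + Chr(rawBytes[0])
Print "Second byte: &h" + Hex(rawBytes[1]) + " = " + Chr(rawBytes[1])

Sleep
```

On such a machine this prints B for the first byte and M for the second.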

There are other fields in the BMP header, but these are the important ones. No need to memorize these or anything, just to understand them so you understand the following piece of code. All this does is load these fields from a BMP using binary file input (the Get command) so we can examine them. Then it gives a simple print-out both on the screen and also into a log file (just to demonstrate, once more, how text file I/O works). All-in-all a fairly simple program, but as usual it might take you a few read-throughs to get it.

''
''
'' Chapter 9 - File I/O example:  Loading and analyzing the header of a BMP file
''
''



''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'The BMP Header format
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Type BMP_Header Field = 1
  'The fields listed earlier
  'ULong is always 32 bits wide.  uInteger would be 64 bits in a 64-bit build
  'of FreeBASIC and would no longer match the 4-byte fields in the file.
  magicNum        As uShort
  fileSize        As ULong
  res1            As uShort
  res2            As uShort
  pixelOffset     As ULong
  
  bmiSize         As ULong
  imgWidth        As ULong
  imgHeight       As ULong
  planes          As uShort
  bpp             As uShort
  
  'Other fields that don't matter
  biCompression   As ULong
  biSizeImage     As ULong
  horizontal_res  As ULong
  vertical_res    As ULong
  biClrUsed       As ULong
  biClrImportant  As ULong
End Type
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''




''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Load the header of a BMP, check its validity, and print out diagnostic messages
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function checkBMP (fileName As String) As Integer

'File handle we'll use to load the BMP
Dim As Integer bmpfhndl
'File handle for the log file we'll write messages to
Dim As Integer outfhndl

'We'll use this to keep track of what byte offset we're at in the BMP
'Remember, binary file offsets in FreeBASIC start with 1, not 0 like everyone else.
Dim As Integer fpos = 1

'Where we'll load our BMP header to
Dim As BMP_Header bmphdr

  'Open the log file first
  outfhndl = FreeFile()
  Open "checkbmp.log" For Output As #outfhndl

  'Open the BMP
  bmpfhndl = FreeFile()
  Open fileName For Binary As #bmpfhndl
  
  'If an error occurs or the file is empty, return 0
  If Err Or Eof(bmpfhndl) Then
    'Print the error message, close the files, return 0 for failure
    Print "Error loading BMP - file inaccessible or empty."
    Print #outfhndl, "Error loading BMP - file inaccessible or empty."
    Close #outfhndl
    Close #bmpfhndl
    Return 0
  End If
  
  'This is how you load a binary file field-by-field.
  'Keep track of the file position and as you go through each field, increment
  'the counter by the size of the field.
  Get #bmpfhndl, fpos, bmphdr.magicNum
  fpos += SizeOf(bmphdr.magicNum)
  
  Get #bmpfhndl, fpos, bmphdr.fileSize
  fpos += SizeOf(bmphdr.fileSize)
  
  'Now we've loaded the magic number and the file size.  Before we do anything else,
  'let's make sure they match properly.
  If bmphdr.magicNum <> &h4D42 Then
    'Print the error message, close the files, return 0 for failure
    Print "Error loading BMP - file signature does not match ""BM""."
    Print #outfhndl, "Error loading BMP - file signature does not match ""BM""."
    Close #outfhndl
    Close #bmpfhndl
    Return 0
  End If
  If bmphdr.fileSize <> Lof(bmpfhndl) Then
    'Print the error message, close the files, return 0 for failure
    Print "Error loading BMP - header-registered file size does not match actual file size."
    Print #outfhndl, "Error loading BMP - header-registered file size does not match actual file size."
    Close #outfhndl
    Close #bmpfhndl
    Return 0
  End If
  
  'At this point, we're pretty sure it's a valid BMP, so we'll continue loading the other fields.
  
  'Just skip the fields we don't care about
  fpos += SizeOf(bmphdr.res1)
  fpos += SizeOf(bmphdr.res2)
  
  Get #bmpfhndl, fpos, bmphdr.pixelOffset
  fpos += SizeOf(bmphdr.pixelOffset)
  
  Get #bmpfhndl, fpos, bmphdr.bmiSize
  fpos += SizeOf(bmphdr.bmiSize)
  
  Get #bmpfhndl, fpos, bmphdr.imgWidth
  fpos += SizeOf(bmphdr.imgWidth)
  Get #bmpfhndl, fpos, bmphdr.imgHeight
  fpos += SizeOf(bmphdr.imgHeight)
  
  Get #bmpfhndl, fpos, bmphdr.planes
  fpos += SizeOf(bmphdr.planes)
  Get #bmpfhndl, fpos, bmphdr.bpp
  fpos += SizeOf(bmphdr.bpp)
  
  
  'Print information on the screen
  Print "Pixel Offset:            &h" + Hex(bmphdr.pixelOffset)
  Print "BMI Size:                &h" + Hex(bmphdr.bmisize)
  Print ""
  Print "Image Width:             " + Str(bmphdr.imgWidth)
  Print "Image Height:            " + Str(bmphdr.imgHeight)
  Print "Planes:                  " + Str(bmphdr.planes)
  Print "BPP:                     " + Str(bmphdr.bpp)
  Print "Actual bytes per pixel:  " + Str((bmphdr.planes)*(bmphdr.bpp)/8)
  
  'Now print it all over again into our log file
  Print #outfhndl, "Pixel Offset:            &h" + Hex(bmphdr.pixelOffset)
  Print #outfhndl, "BMI Size:                &h" + Hex(bmphdr.bmisize)
  Print #outfhndl, ""
  Print #outfhndl, "Image Width:             " + Str(bmphdr.imgWidth)
  Print #outfhndl, "Image Height:            " + Str(bmphdr.imgHeight)
  Print #outfhndl, "Planes:                  " + Str(bmphdr.planes)
  Print #outfhndl, "BPP:                     " + Str(bmphdr.bpp)
  Print #outfhndl, "Actual bytes per pixel:  " + Str((bmphdr.planes)*(bmphdr.bpp)/8)
  
  'Close the log file and the BMP itself
  Close #outfhndl
  Close #bmpfhndl
  
  'Return -1 for success
  Return -1
  
End Function
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''




''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'The main program
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Most of the work is done in the function checkBMP(), thus this part is very short

Print "Checking BMP..."
Print ""

checkBMP("test.bmp")

Print ""
Print "Done."

Sleep
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Whew! I won't discuss this much more, as the comments should be quite sufficient. One of our longest and most complicated programs yet, but I think you can understand it. Go through it and you'll see how it works. If you don't understand it at first, go through it several times. The methods used here are important for you to understand if you're doing binary file I/O. As a side note here, notice how many times the same basic code is repeated:

  'If the file is invalid
  If SOMECONDITION Then
    'Print the error message, close the files, return 0 for failure
    Print "Error loading BMP - SOME ERROR MESSAGE HERE."
    Print #outfhndl, "Error loading BMP - SOME ERROR MESSAGE HERE."
    Close #outfhndl
    Close #bmpfhndl
    Return 0
  End If

The same basic code, with only a few modifications, is repeated three times. If we were actually loading the BMP, there might be even more errors to check for and we might have to repeat the code even more times. Is there a better way to do this? We could put the common code at the bottom of the function, below the "Return -1", and have a label and Goto. This isn't preferable (most people hate Goto in general, and for good reasons) but it's one of the few times Goto use might be justified (even more so when you're in a bunch of nested loops). However, when dealing with things like file I/O and memory allocation (with pointers) and other resource stuff, where you have to let go of a resource before you can quit, there's a very nice little solution which we'll hit in a few chapters. Don't worry about it now, but keep that thought in mind.
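For the curious, here is a rough sketch of the label-and-Goto idea, cut down to a single file and two checks. This is only an illustration of the pattern described above, not the nicer solution promised for a later chapter; checkSignature, the label name and the demo file names are all made up for the example.

```freebasic
'Sharing one clean-up block via a label and Goto
'(checkSignature, "failed" and the file names are made up for this sketch)

Function checkSignature (fileName As String) As Integer
  'Declare everything up front so no Goto jumps over a declaration
  Dim As Integer fhndl
  Dim As uShort magicNum
  Dim As String errMsg

  fhndl = FreeFile()
  If Open(fileName For Binary As #fhndl) <> 0 Then
    errMsg = "file inaccessible"
    Goto failed
  End If

  If Lof(fhndl) < 2 Then
    errMsg = "file too short"
    Goto failed
  End If

  Get #fhndl, 1, magicNum
  If magicNum <> &h4D42 Then
    errMsg = "file signature does not match ""BM"""
    Goto failed
  End If

  'Success: close the file and return -1
  Close #fhndl
  Return -1

failed:
  'Every error path ends up here, so the clean-up is written only once
  Print "Error loading BMP - " + errMsg + "."
  Close #fhndl
  Return 0
End Function

'Make two small test files: one starting with "BM", one not
Dim As Integer h = FreeFile()
Open "goto_good.bin" For Output As #h
Print #h, "BM and some more bytes"
Close #h

h = FreeFile()
Open "goto_bad.bin" For Output As #h
Print #h, "XX and some more bytes"
Close #h

Print checkSignature("goto_good.bin")   'prints -1
Print checkSignature("goto_bad.bin")    'prints the error message, then 0

Sleep
```

Each check now only has to set its message and jump; the printing and closing live in one place at the bottom of the function.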

Well, that about covers it for binary file I/O. There is another file mode, called Random, but I don't use that nor have I seen anyone else use it much if at all. If you need it, it's there. Other than that, just play around with the various binary file functions, consulting the manual or this tutorial if you need to - remember, practice makes perfect! Once you find places you actually need file I/O, you'll get better at it as you use it. But a little practice writing random "useless" programs never hurts.

Tutorial 10 is probably going to be about graphics, but it isn't concrete yet so I'm not making any promises. I may decide to jump right into the OOP features of FreeBASIC, too, and put graphics off for another day. In any case, you can be assured that both of those topics will come up in future tutorials, along with any other topics I decide need to be covered. Given that these are all very complicated, you can expect great delays between each tutorial (particularly since school has started and I'll have schoolwork to do), but stay tuned! Great things are coming up.

If you would like to download the source code for the BMP checker program (so you don't have to mess around with selecting all that code) here it is: bmpchecker.bas

