User Defined Types (UDTs) and Dynamic Arrays
by notthecheatr
To the pros: please don't complain if the explanation given here is a little bit simplistic. This is intended to explain things to beginners in a way they can understand, not so much to show them precisely how everything works - after all, even I don't understand that.
A frequently asked question on the FB forum involves dynamic or variable-size arrays and UDTs. Basically, it's always a variation of the form "why can't I ReDim arrays inside of UDTs?" This is an important question and I want to settle it once and for all. This should help you to better understand how your compiler handles things in general, and it will explain why you can't simply use dynamic arrays within UDTs.
First, we have to understand memory layout. Every variable takes up memory space. A byte takes up a byte, a short takes up 2 bytes, an integer takes up 4 (on a 32-bit target; 8 on a 64-bit one), and so on. Each variable takes up a certain amount of space in memory, no more and no less. As you create variables, the compiler has to allocate, or set aside, space for them in the memory you have. The more variables you create, the more memory your program takes up. You can also have dynamically sized variables (arrays, strings, etc.). These don't necessarily have a fixed size; they may change in size. For example, the string "Hi!" takes up only 3 bytes (or 6 if it's a wide string with 2-byte characters), but that figure will change if, during run-time, you decide to change it to "Hello!" This is all taken care of by the run-time library, and you don't have to worry about it. You can also create your own dynamically sized buffers using pointers; more on that later.
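As a quick check of the sizes mentioned above, here is a minimal sketch that asks the compiler for them with SizeOf. The variable name s is just for illustration, and the Integer figure depends on whether you compile for a 32-bit or a 64-bit target.

```freebasic
' Print the size in bytes of a few built-in types.
Print "Byte:    "; SizeOf(Byte)
Print "Short:   "; SizeOf(Short)
Print "Long:    "; SizeOf(Long)
Print "Integer: "; SizeOf(Integer)   ' 4 on 32-bit targets, 8 on 64-bit targets

' A string's length can change at run-time; the run-time library
' takes care of finding room for the new text.
Dim As String s = "Hi!"
Print "Length of s: "; Len(s)        ' 3 characters
s = "Hello!"
Print "Length of s: "; Len(s)        ' 6 characters
```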
Likewise, a UDT is a structure in memory that takes up a fixed number of bytes. For example,
Code:
    Type myType
        As uByte a, b
        As uShort c
        As uInteger d
    End Type
should take up exactly 8 bytes (2 for the two uBytes, 2 for the uShort and 4 for the uInteger on a 32-bit target), though padding might be added to make things more efficient. The point is, it never needs to take up more or fewer bytes. To add dynamic elements to UDTs, we would normally use pointers. Pointers always take up the same amount of space, normally 4 bytes (8 on a 64-bit target), but they can point to another area in memory called a "buffer" that can take up more or less space - in fact, its size can change at any time without changing the size of the UDT, because the UDT only has the pointer to the buffer in it, not the buffer itself. Pointers and buffers will be described later, but for now let's look at what would happen if we put a dynamic array inside of a UDT.
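A fixed-size array, on the other hand, involves no pointer - its elements are stored directly inside the UDT. So for example the UDT
Code:
    Type myType
        As uInteger myuIntArray(0 To 4)
    End Type
should take 5*SizeOf(uInteger), which is normally 20 bytes (once again, give or take padding). Note that the bounds are written out as 0 To 4 here: in FreeBASIC the single number in myuIntArray(5) is the upper bound, so that would declare six elements, 0 through 5.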
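To see these fixed sizes for yourself, here is a small sketch. The type names scalarType, arrayType and pointerType are made up for this example, and ULong is used in place of uInteger so that the figures are the same on 32-bit and 64-bit targets.

```freebasic
Type scalarType
    As UByte a, b
    As UShort c
    As ULong d                 ' ULong is always 4 bytes
End Type

Type arrayType
    As ULong items(0 To 4)     ' five elements stored directly in the UDT
End Type

Type pointerType
    As ULong Ptr buffer        ' only the pointer itself is stored in the UDT
    As UInteger numItems
End Type

Print "SizeOf(scalarType)  = "; SizeOf(scalarType)   ' 8
Print "SizeOf(arrayType)   = "; SizeOf(arrayType)    ' 20
Print "SizeOf(pointerType) = "; SizeOf(pointerType)  ' depends on the pointer size

' Allocating a large buffer does not change the size of the UDT.
Dim As pointerType p
p.buffer = Allocate(100 * SizeOf(ULong))
p.numItems = 100
Print "SizeOf(p) with a 100-item buffer = "; SizeOf(p)
DeAllocate(p.buffer)
```

The last figure printed matches SizeOf(pointerType): the 400 bytes of buffer live somewhere else, and the UDT only remembers where.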
But what about dynamic arrays? Those can change size! Now internally these are handled in a special way, using array descriptors and such, but basically it's the same principle. So now if you have a dynamic array inside your UDT, it may have to change size, which means the UDT has to change size.
What's wrong with changing size? As long as we have enough memory, isn't that OK? Well, there is a slight problem: if items are allocated in a specific place in memory by the compiler, then if you try to make one of the items larger, it may have to move to a different place if there isn't enough room between it and the next item allocated after it in memory. Say we create two dynamic arrays of bytes, the first with three items and the second with four. Now the first array will take up the memory space between 0 and 2, while the second will take up the space between 3 and 6. If you try to resize the first one to five items, there isn't enough room for it, because it would have to go from 0 to 4 and part of that is taken up by the second array. So what do we do with it? We move it over to the end of the allocated space. So now the second array takes up from 3 to 6 and the first array takes up from 7 to 11. Now it's at a different place in memory! How will we know where it is?! Fortunately the run-time library takes care of that, but it's important to recognize that things will usually need to move if they are variable-sized and they get larger.
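The following sketch mirrors the example above with two real dynamic arrays. The names arrA, arrB, oldAddr and newAddr are just for illustration. Whether the data actually moves depends on the memory allocator, so the addresses printed will differ from run to run and from system to system; real addresses will also be much larger than the 0 to 11 used in the explanation.

```freebasic
Dim As Byte arrA()
Dim As Byte arrB()

ReDim arrA(0 To 2)      ' three items
ReDim arrB(0 To 3)      ' four items
arrA(0) = 10

' Remember where the first array's data lives right now.
Dim As Byte Ptr oldAddr = @arrA(0)

' Grow the first array to five items, keeping its contents.
ReDim Preserve arrA(0 To 4)

Dim As Byte Ptr newAddr = @arrA(0)

Print "before: "; Hex(Cast(UInteger, oldAddr))
Print "after:  "; Hex(Cast(UInteger, newAddr))

If oldAddr = newAddr Then
    Print "The data stayed where it was this time."
Else
    Print "The data moved; oldAddr no longer points at the array."
End If

' The array itself still works either way, because the run-time
' library keeps track of where the data went.
Print "arrA(0) is still "; arrA(0)
```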
Now where the UDT is stored in memory - that is, where the space is allocated - is decided by the compiler when you compile the program. So if you try to change the size of the UDT at run-time, you're going to have problems, because the UDT is placed in a specific area. If your UDT changed size, it might have to move to a different place in memory. What's wrong with that? Only that the place the UDT is stored in is known by the compiler when the program is first compiled, and the parts of the program that access members of the UDT access those places directly. So if the UDT moves, the program is accessing the wrong areas in memory! The run-time library could take care of that, but there's another problem. What if you make a pointer to the UDT? When the UDT moves, the pointer is wrong! With dynamic arrays, this is OK because the array is reached through an array descriptor, which is updated when the data moves. But with UDTs, this is all wrong. We want the array to be stored as part of the UDT, and it can't be!
In theory it could be made to work, but it would require a lot of changes to the compiler and the run-time library, and then UDTs would act weird and it would be *really* strange to try to read/write the UDT to a file. In short, it doesn't make any sense.
So how DO we use dynamically-sized memory areas in UDTs? Well, as I hinted at earlier, we use pointers! So where you wanted to write
Code:
    Type myType
        As uInteger myArray()
    End Type

    Dim As myType someObject

    ReDim someObject.myArray(1)
    someObject.myArray(1) = 1

    ReDim someObject.myArray(3)
    someObject.myArray(1) = 1
    someObject.myArray(2) = 2
    someObject.myArray(3) = 3

    For i As uInteger = 1 To UBound(someObject.myArray)
        Print someObject.myArray(i)
    Next i
you can instead do something like this:
Code:
    Type myType
        As uInteger Ptr myBuffer
        As uInteger numItems
    End Type

    Dim As myType someObject

    ' Allocate room for one item; the only valid index is 0
    someObject.myBuffer = Allocate(1*SizeOf(uInteger))
    someObject.numItems = 1
    someObject.myBuffer[0] = 1

    ' Grow the buffer to three items; valid indices are 0 to 2
    someObject.myBuffer = ReAllocate(someObject.myBuffer, 3*SizeOf(uInteger))
    someObject.numItems = 3
    someObject.myBuffer[0] = 1
    someObject.myBuffer[1] = 2
    someObject.myBuffer[2] = 3

    For i As uInteger = 0 To someObject.numItems-1
        Print someObject.myBuffer[i]
    Next i

    ' Release the buffer when we're done with it
    DeAllocate(someObject.myBuffer)
Study this code carefully, and note the similarities and differences. Remember, the pointer always takes up the same amount of space (4 bytes on a 32-bit target), so the size of the UDT never changes - only the size of the buffer pointed to by the pointer changes. So we have a dynamically sized memory area without any of the problems described above! One thing to note is that the first item is 0 instead of 1, so a buffer holding numItems items has valid indices 0 to numItems-1. Also, we have to store the number of items in the buffer separately - if you go beyond the number of items in the buffer (such as someObject.myBuffer[3] when there are only 3 items in the buffer), nothing will stop you, so you'll have a nasty bug as a result of accessing something outside of the area actually allocated. Also, if you forget to DeAllocate memory when you're done with it, that's called a memory leak, and your program wastes memory needlessly. These are things to watch for, but so long as you keep them in mind, pointers are very powerful and useful and solve a number of problems in Computer Science.
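One way to keep those pitfalls under control is to wrap the buffer handling in a few small routines. This is only a sketch: resizeBuffer, setItem and freeBuffer are hypothetical helper names invented for this example, not part of FreeBASIC or its run-time library.

```freebasic
Type myType
    As UInteger Ptr myBuffer
    As UInteger numItems
End Type

' Hypothetical helper: resize the buffer and record the new item count.
' Returns 0 and leaves the object unchanged if the request fails.
Function resizeBuffer(ByRef obj As myType, ByVal newCount As UInteger) As Integer
    If newCount = 0 Then Return 0
    Dim As UInteger Ptr p = ReAllocate(obj.myBuffer, newCount * SizeOf(UInteger))
    If p = 0 Then Return 0
    obj.myBuffer = p
    obj.numItems = newCount
    Return -1
End Function

' Hypothetical helper: store a value only if the index is inside the buffer.
Function setItem(ByRef obj As myType, ByVal idx As UInteger, ByVal newValue As UInteger) As Integer
    If idx >= obj.numItems Then Return 0   ' valid indices are 0 to numItems-1
    obj.myBuffer[idx] = newValue
    Return -1
End Function

' Hypothetical helper: release the buffer and reset the bookkeeping.
Sub freeBuffer(ByRef obj As myType)
    DeAllocate(obj.myBuffer)
    obj.myBuffer = 0
    obj.numItems = 0
End Sub

Dim As myType someObject   ' members start out as 0

If resizeBuffer(someObject, 3) Then
    For i As UInteger = 0 To someObject.numItems - 1
        If setItem(someObject, i, i + 1) = 0 Then Print "unexpected: index rejected"
    Next i

    ' Index 3 is outside a 3-item buffer, so the helper refuses it.
    If setItem(someObject, 3, 99) = 0 Then Print "index 3 rejected"

    For i As UInteger = 0 To someObject.numItems - 1
        Print someObject.myBuffer[i]
    Next i

    freeBuffer(someObject)
End If
```

The bounds check costs a comparison on every access, which is why raw pointer indexing doesn't do it for you; whether that trade-off is worth it depends on your program.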
Hopefully now you understand why these things are the way they are. There is a reason for everything! I've also introduced you to a new and better way of doing things that solves some of the problems involved with doing things the old way. For a more thorough explanation of pointers and buffers, look at my page here. If that answers all your questions and more, you're welcome. If not, ask me via e-mail or the forums and I'll try to explain better. Until my next article, I am, as always, notthecheatr.