A quick guide to C pointers for people having trouble with the concepts involved.
Introduction To Pointers
Pointers can seem scary when you are very new to programming. Sometimes they are not explained very well. I think I can explain them quite well. Here goes.
Think of computer memory as a series of letter boxes all lined up. There are millions of them! Each letter box can hold one byte of information. The letter boxes have a little door on the front with a number. This number is different on each letter box.
The numbers can get very big because there are so many letterboxes to identify. This means that we will need ‘four bytes’ to represent the letter box numbers! If this doesn’t make sense, think of it this way. If you could only use the digits 1 and 0 to represent amounts of things, how would you do it? Well, with a single digit you could represent 0 with 0 and 1 with 1. But that’s all. What if you used two columns?
00 This is still 0
01 This is still 1
10 This is 2
11 This is 3
With that, you can see that with a “two bit” number, you can make four combinations. Every time you add one more bit to the number, you double the previous number of combinations. So adding one more bit, to give us a three bit number, will allow us to have 8 combinations!
Continue that all the way up to one byte (which is eight bits) and you will have 256 possible combinations.
So… Getting back to those letterboxes. I said that they had a four byte number on the front. This means that number will consist of 32 bits, or over 4 billion possible combinations! And if your computer is 32-bit, it can keep track of over 4 billion letterboxes because it uses a 32-bit number to identify them (kind of…). Most newer machines are 64-bit and use an eight byte number instead, but the idea is exactly the same, so I will stick with 32 bits here.
Now… Don’t get confused between the ‘size’ of the number used on the front of the letter box and the ‘size’ of the letter box. The letter box can hold 8 bits or one byte. That means that there can be 4 billion letter boxes, each containing a value from 0 to 255 (256 combinations, right?).
(For this discussion, we are going to assume that variables you use in your program are declared as global, or outside of any function ok?)
When you write a program and you declare a variable, you are asking the compiler to find a letter box, get its number and make a table with the letter box number and your variable name next to each other. This way, whenever your variable name appears in your code, the compiler can check the table and it will know which letterbox has your value in it. Here is where pointers confuse some people. The 32-bit number on the front of the letter box is known as its address. A value can be stored inside that letter box and you can refer to that value by the variable name, or by the address on the front of the letter box!
Let’s go through an example:
You write a program and you say
int var1 = 123;
The compiler sees that and opens up a letter box, and places the value 123 inside. It then writes down the address next to the variable name in a table.
Later in your program, you wish to use var1 in some kind of operation.
var1 = var2 + var3;
cout << var1 << endl;
The compiler has to lookup the addresses for var2 and var3. When it has the addresses, it can go to the correct letter box and retrieve the values stored within.
It adds the values it has grabbed from the letter boxes (for var2 and var3) and then it looks up in its table which address corresponds to the variable named var1.
It goes to that letterbox and places the result of adding var2 and var3 inside. Whatever was in the letterbox before is replaced.
Now some of you will be thinking, “an int variable is 32 bits!!” How can you fit 32 bits into a letter box that’s only big enough for 8 bits? Simple. You don’t. To store a 32-bit number, the compiler uses four letterboxes in a row and only remembers the address of the first letterbox. The first letterbox is the one with the lowest memory address.
In C++ (and C) you can find the number that’s on the front of the letterbox. You use the & (address of) operator. Remember that if your variable is of a type larger than one byte, your variable will be placed in more than one letterbox. You don’t have to worry about that, the compiler remembers whether it has to look in more than one letterbox or not.
There is one important thing to mention in a tutorial about pointers. POINTERS!
You declare a pointer like this:
int* pointer1;
or
MyClassIMade* mcim = new MyClassIMade(23, 45);
or
const char* a = "This string is being pointed to by a";
The pointer is a variable. It’s usually 32 bits (four bytes) on a 32-bit PC and 64 bits (eight bytes) on a 64-bit one. You know why it’s that size, don’t you? No? Well, the pointer has to be able to hold any letterbox address. On a 32-bit machine that means any address from 0 to over 4 billion, and we now know that it takes a 32-bit number to do that.
When you declare the pointer, you tell the compiler what ‘type’ of value it points to. That is how the compiler knows how many letter boxes to use for the value. If that’s confusing, think of it this way.
If you declare a pointer to an int, you obviously have an int value somewhere in four letterboxes. An int needs four letter boxes because a letter box can only hold 8 bits and your int is 32 bits. You have stored your int value, and to get a pointer to that value you do this:
int myIntValue = 123;
int *pointer = 0;
pointer = &myIntValue;
What happened there? Well, we ask the compiler to set aside four letter boxes to hold the value 123. The compiler does this and makes a note of the address of the letter boxes and puts the address of the first letter box with the name of your variable in a table. This way, it knows that the two are the same thing.
Then, we ask the compiler to make a special variable that can hold a memory address. We tell it that we will need to point to an int that is in memory (a letterbox) somewhere. The compiler does what we ask and makes another table entry. This time, it also makes a note of how many letter boxes come after the address pointed to by the pointer. In the case of an int, we have the first address and then three more. Only the first address needs to be stored. As long as the compiler knows to get the next three letter boxes as well, everything will work.
The next thing we do is, I feel, the most interesting part of pointers. We tell the compiler to go to its table, look up myIntValue, and store the letter box number for the first letterbox in the pointer!
This means that the pointer now knows where the contents of the variable named myIntValue are actually stored in memory.
A very important thing to note here is that there really is no such thing as myIntValue. It’s really just a tag that we made up to put on the front of the letterbox.
The variable name makes it easier than remembering memory addresses!…
The last thing, before I run out of space is, you can use the pointer in two different modes. One mode is for assigning addresses to it, the other is using it to go to the address it holds, open up the letterbox and play with the values in the letterbox!
That last mode is known as ‘dereferencing’ the pointer. You do that by placing the asterisk (*) in front of the pointer name. Like this:
int newValue = 123;      //declare an int variable and assign value 123
int * newPointer;        //declare a pointer to an int value (any int value)
cout << newValue;        //output is 123
newPointer = &newValue;  //place the address of newValue into newPointer
*newPointer = 456;       //open the letterbox at the address newPointer holds
                         //and fiddle with the contents!!!
cout << newValue;        //output is 456
That’s about all I have time for at this stage. I hope that cleared a few things up! I’ll finish with a statement often used to define what pointers are:
A pointer is a variable that holds a memory address