DISCLAIMER
Keep in mind that this is only an analogy to help newcomers understand pointers to arrays in the C programming language. It does not mean there’s any sort of compression going on when we use a pointer to an array.
&arr vs &arr[0]
In case you didn’t know, we can get the address of an array using the
address-of operator like &array. Here’s an example:
#include <stdio.h>
int main(void)
{
int arr[2];
printf("&arr: %p\n", (void *)&arr); // Address of the whole array.
printf("&arr[0]: %p\n", (void *)&arr[0]); // Address of the first element.
return 0;
}
From the example above, you might be thinking, “don’t &arr and &arr[0]
print the same value? What is the difference?”. The difference is not in
the value itself, but in the type, which tells us what the address refers to.
&arr and &arr[0] (or just arr, because in most expressions an array name
converts to the address of its first element) have the same value: the
memory address of index 0, because the array begins where its first
element begins. The difference is at the end.
When we use &arr, the object it points to ends at the end of the last
element of the array. Let’s say an integer takes 4 bytes and each memory
address corresponds to 1 byte. The whole array can be
represented like this (any blank lines are only there for clarity; the actual
memory is contiguous):
001 <- the beginning of index 0 __and__ the beginning of array
002
003
004 <- the end of index 0
005 <- the beginning of index 1
006
007
008 <- the end of index 1 __and__ the end of array
With the representation above, the expression &arr refers to everything from
the beginning of the array to the end of the array. Meanwhile, the expression
&arr[0] refers only to the span from the beginning of index 0 to the end of
index 0.
Pointer to Array
With that out of the way, let’s get into pointers to arrays. We can declare a pointer to an array like this:
char (*arr)[4];
The declaration above reads as “arr is a pointer to an array of 4 char”.
You might then be thinking, “how do we give a value to that arr
variable?”. One way is like this:
char anu[4] = {'a', 'b', 'c', '\0'};
char (*arr)[4] = &anu; // Assigning memory address of anu array.
Another one is like this:
char (*arr)[4] = &"abc"; // Assigning memory address of string literal.
Or we can split the initialization into a declaration and an assignment like this:
char (*arr)[4]; // Declaration of arr variable.
arr = &"abc"; // Assignment of arr variable.
Let’s talk about the first example. There, we use the memory
address of the anu array of 4 characters as the value of arr, which is
“a pointer to an array of 4 characters”.
Now the question is, how do we get the character b from the arr variable?
You might be thinking, “can’t we just use indexing like arr[1]?”. Remember
that arr points to the whole array, and the square bracket (array
subscript) is another form of dereference. So when we
dereference the arr variable with *arr or arr[0], we get the whole array
instead of an element of the array. And when we write arr[1] for char (*arr)[4];,
we are actually trying to access the memory right after the whole array, which
does not belong to anu, so the result is undefined behavior. We can’t be sure
what is in there.
Keep in mind that arr[1] is equivalent to *(arr + 1), which means that we are dereferencing the address 1 offset away from arr. The size of 1 offset depends on the size of the type that arr points to; here that type is char [4], so 1 offset is 4 bytes.
As for the second and third examples, we need to know that a string literal has static storage duration, which means that it exists for the entire execution of the program.
With that in mind, the expression &"abc" gives us the memory
address of the entire string literal, which is an array of
characters with a different storage duration than a local char [] variable. If I understand
correctly, a char [] declared inside a function (without static) has automatic
storage duration, so it is deallocated at some point during the execution, usually when the function
returns or the enclosing block ends.
The expression char (*arr)[4] = &"abc"; means that we are storing the
memory address of the entire string literal in the arr variable.
So, how do we access a character through char (*arr)[4]? Well, we need to
dereference arr twice. The first dereference, like *arr or arr[0],
gets the entire array, and the second dereference, like (*arr)[1] or
*((*arr) + 1) or arr[0][1], gets an element of the array.
Why do we put parentheses around arr, as in (*arr)[1]? Because the square bracket (array subscript) has higher precedence than the dereference operator. If we write the expression as *arr[1], it is equal to *(arr[1]): the array subscript is grouped with arr first, and the dereference operator is then applied to the result. By adding parentheses, we can override the default grouping. The rule that decides this default grouping is called precedence.
Precedence only controls how an expression is parsed and which operators are grouped with which operands. It does not control the order of evaluation; that is governed by sequence points.
Compression File Analogy
If you are new to the C programming language, like I am at the time of writing this blog post, you might get confused by the concept of “pointer to array”. If so, this analogy might help you understand what is going on.
Imagine we have a ZIP file, which is one of the compressed file formats.
Inside that ZIP file, we have a directory called 001. Inside that directory,
we have files called 001 and 005 (see the representation of the whole array
at the beginning of the post).
Here’s the simple representation:
- The files 001 and 005 represent the array’s elements.
- The directory 001 represents the array itself.
- The ZIP file represents the pointer to the array.
Here’s the simple illustration:
pointer.zip
|_ 001/
   |_ 001
   |_ 005
So, if we want to get an element inside the array 001/, we need to extract
pointer.zip first (dereference the pointer) and then access the element
inside the array 001/ (index into the array).
Bonus: accessing character of string literal
Here’s another interesting thing about string literals that you might want to try:
#include <stdio.h>
int main(void)
{
printf("%c\n", "abc"[0]);
return 0;
}
What is the output? Now, try to analyze what’s going on :)
Alright, that’s it. See you next time!