C: Writing 4 bytes into a region of size 3 overflows the destination?

My simple C program is as follows. Initially, I've defined the variable buf1 as an array of 3 chars.

I don't have any problem with 2-character strings such as AB or XY:

user@linux:~/c# cat buff.c; gcc buff.c -o buff; echo -e '\n'; ./buff
#include <stdio.h>
#include <string.h>

int main() {
        char buf1[3] = "AB";
        printf("buf1 val:  %s\n", buf1);
        printf("buf1 addr: %p\n", &buf1);
        strcpy(buf1,"XY");
        printf("buf1 val:  %s\n", buf1);
}

buf1 val:  AB
buf1 addr: 0xbfe0168d
buf1 val:  XY
user@linux:~/c# 

Unfortunately, when I copy 3 chars such as XYZ, I get the following warning when compiling the program.

buff.c:8:2: warning: ‘__builtin_memcpy’ writing 4 bytes into a region of size 3 overflows the destination [-Wstringop-overflow=]
  strcpy(buf1,"XYZ");

Isn't XYZ 3 bytes? Why does the warning say 4 bytes instead of 3?

user@linux:~/c# cat buff.c; gcc buff.c -o buff; echo -e '\n'; ./buff
#include <stdio.h>
#include <string.h>

int main() {
        char buf1[3] = "AB";
        printf("buf1 val:  %s\n", buf1);
        printf("buf1 addr: %p\n", &buf1);
        strcpy(buf1,"XYZ");
        printf("buf1 val:  %s\n", buf1);
}buff.c: In function ‘main’:
buff.c:8:2: warning: ‘__builtin_memcpy’ writing 4 bytes into a region of size 3 overflows the destination [-Wstringop-overflow=]
  strcpy(buf1,"XYZ");
  ^~~~~~~~~~~~~~~~~~


buf1 val:  AB
buf1 addr: 0xbfdb34fd
buf1 val:  XYZ
Segmentation fault
user@linux:~/c# 

You're forgetting that C strings are null-terminated. The sizeof "AB" is 3 and sizeof "XYZ" is 4, due to the implicit terminating byte. (The type of the string literal "AB" is char[3] and the type of "XYZ" is char[4].)

Had you not specified any length for buf1, it would also have been 3 bytes long:

char buf1[] = "AB";  // here exactly the same as char buf1[3] = "AB";

The memory layout would be

buf1
  v
  +-------+-------+-------+
  |  [0]  |  [1]  |  [2]  |
  +-------+-------+-------+
  |  'A'  |  'B'  |  '\0' |
  +-------+-------+-------+

Now, strcpy copies the terminating null character (C11 7.24.2.3p2):

  The strcpy function copies the string pointed to by s2 (including the terminating null character) into the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

which means that 4 bytes are copied in total, but there is space for only 3 characters. The code therefore has undefined behaviour, and the compiler produces the diagnostic message. C11 7.1.4 Use of library functions p2:

[...] If a function argument is described as being an array, the pointer actually passed to the function shall have a value such that all address computations and accesses to objects (that would be valid if the pointer did point to the first element of such an array) are in fact valid.[...]

In the actual code, the implicit access to buf1[3] is not valid.

Memory layout after strcpy:

buf1
  v
  +-------+-------+-------+-------+
  |  [0]  |  [1]  |  [2]  |  ???  |
  +-------+-------+-------+-------+
  |  'X'  |  'Y'  |  'Z'  |  '\0' |
  +-------+-------+-------+-------+

The warning names __builtin_memcpy because the compiler optimized this code: it replaced the strcpy of a string of known length with a memcpy of known length, since memcpy generates more efficient code.


Finally, you can fit 3 characters into char buf1[3] by using strncpy, but then the buffer has no room for the terminating null character, so it cannot be printed with a plain printf("%s"). You can still print it by specifying an explicit precision that is less than or equal to the length of the array. Here a field width of 3 is given as well, so shorter values are padded:

#include <stdio.h>
#include <string.h>

int main(void) {
    char buf1[3] = "AB";
    /* %-3.3s: left-justify in a field of width 3, read at most 3 bytes */
    printf("buf1 val:  >%-3.3s<\n", buf1);
    printf("buf1 addr: %p\n", (void *)&buf1);
    strncpy(buf1, "XYZ", 3);  /* fills all 3 bytes, no '\0' is written */
    printf("buf1 val:  >%-3.3s<\n", buf1);
}

Compiling and running it:

% gcc strncpy.c -Wall -Wextra
% ./a.out
buf1 val:  >AB <
buf1 addr: 0x7ffd7f6aecc5
buf1 val:  >XYZ<

Note that one extra space character is printed after AB, because the field width of 3 pads the two-character string.

If you glance at an implementation of strcpy, you see that it depends on the null character.

char *strcpy(char *d, const char *s)
{
   char *saved = d;
   while (*s != '\0')
   {
       *d++ = *s++;
   }
   *d = 0;
   return saved;
}

So, for char arr[3], copying a three-character string writes the terminating '\0' one byte past the end of the array. Moreover, if the source has no null terminator, the loop keeps reading and writing past the end of both buffers until it happens to hit a zero byte. Both cases are undefined behaviour.

7.24.2.3p2 on strcpy:

The strcpy function copies the string pointed to by s2 (including the terminating null character) into the array pointed to by s1.

3 chars + '\0' == 4 chars

You'll also get 4 from:

printf("%zu\n", sizeof "ABC");

as string literals are essentially anonymous char-array literals with static storage, roughly equivalent to:

 static char const __anonymous[]="ABC"; /*the size gets inferred*/

or

 (char const[]){"ABC"};

(with a historical caveat about the const which isn't really there, but for all intents and purposes you should pretend it is)

Comments
  • You're forgetting the null terminator which is the 4th byte!