Why are C character literals ints instead of chars?

In C++, sizeof('a') == sizeof(char) == 1. This makes intuitive sense, since 'a' is a character literal, and sizeof(char) == 1 as defined by the standard.

In C however, sizeof('a') == sizeof(int). That is, it appears that C character literals are actually integers. Does anyone know why? I can find plenty of mentions of this C quirk but no explanation for why it exists.

From a discussion on the same subject:

"More specifically the integral promotions. In K&R C it was virtually (?) impossible to use a character value without it being promoted to int first, so making character constant int in the first place eliminated that step. There were and still are multi character constants such as 'abcd' or however many will fit in an int."

Yet another wrong answer. The issue here is why character literals and char variables have different types. Automatic promotions, which reflect the hardware, aren't relevant -- they're actually anti-relevant, because char variables are automatically promoted, so that's no reason for character literals not to be of type char. The real reason is multibyte literals, which are now obsolete.
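
Those multi-character constants still exist as a curiosity. A minimal sketch, assuming a GCC- or Clang-like compiler; the packed value is implementation-defined, so treat the printed number as illustrative only:

#include <stdio.h>

int main(void) {
    /* A multi-character constant has type int; its value is
       implementation-defined (GCC and Clang typically pack the
       bytes most-significant first). */
    int v = 'ab';

    printf("sizeof 'ab' = %zu\n", sizeof 'ab');   /* same as sizeof(int) */
    printf("value       = %d (0x%x)\n", v, (unsigned)v);
    return 0;
}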

The original question is "why?"

The reason is that the definition of a literal character has evolved and changed, while trying to remain backwards compatible with existing code.

In the dark days of early C there were no types at all. By the time I first learnt to program in C, types had been introduced, but functions didn't have prototypes to tell the caller what the argument types were. Instead it was standardised that everything passed as a parameter would either be the size of an int (this included all pointers) or it would be a double.

This meant that when you were writing the function, all the parameters that weren't double were stored on the stack as ints, no matter how you declared them, and the compiler put code in the function to handle this for you.

This made things somewhat inconsistent, so when K&R wrote their famous book, they put in the rule that a character literal would always be promoted to an int in any expression, not just a function parameter.
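
You can still see that promotion at work today. A small sketch, standard C only; the exact int size is platform dependent:

#include <stdio.h>

int main(void) {
    char c = 'a';

    /* Integer promotions: in an arithmetic expression a char operand
       is promoted to int, so the result has sizeof(int). */
    printf("sizeof c       = %zu\n", sizeof c);        /* 1 */
    printf("sizeof (c + 0) = %zu\n", sizeof (c + 0));  /* sizeof(int), typically 4 */

    /* Default argument promotions: a char passed to a variadic
       function such as printf arrives as an int. */
    printf("%d\n", c);  /* numeric value of 'a', e.g. 97 in ASCII */
    return 0;
}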

When the ANSI committee first standardised C, they changed this rule so that a character literal would simply be an int, since this seemed a simpler way of achieving the same thing.

When C++ was being designed, all functions were required to have full prototypes (this is still not required in C, although it is universally accepted as good practice). Because of this, it was decided that a character literal could be stored in a char. The advantage of this in C++ is that a function with a char parameter and a function with an int parameter have different signatures. This advantage is not the case in C.

This is why they are different. Evolution...

I don't know the specific reasons why a character literal in C is of type int. But in C++, there is a good reason not to go that way. Consider this:

void print(int);
void print(char);

print('a');

You would expect the call to print to select the second version, taking a char. If a character literal were an int, that would be impossible. Note that in C++ a literal with more than one character still has type int, although its value is implementation-defined. So 'ab' has type int, while 'a' has type char.
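
For completeness, here is a self-contained version of that example; the bodies of print are placeholders added for illustration:

#include <iostream>

void print(int x)  { std::cout << "print(int): "  << x << '\n'; }
void print(char x) { std::cout << "print(char): " << x << '\n'; }

int main() {
    print('a');  // in C++ 'a' has type char, so this selects print(char)
    print(65);   // selects print(int)
}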

@Cody: the two decisions may be related, in that the correct datatype for doing "calculations" on characters in C is int (the standard character-handling functions take and return int). But literals having the same type as those functions' parameters isn't as simple as it looks.
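
One place where int really is the working type for characters in C is the standard I/O interface: getchar() returns an int so that it can represent every character value plus EOF. A minimal sketch:

#include <stdio.h>

int main(void) {
    int c;  /* must be int, not char, so the comparison with EOF is reliable */

    /* Copy stdin to stdout one character at a time. */
    while ((c = getchar()) != EOF) {
        putchar(c);
    }
    return 0;
}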

using gcc on my MacBook, I try:

#include <stdio.h>
#define test(A) do { printf(#A ":\t%zu\n", sizeof(A)); } while (0)
int main(void){
  test('a');
  test("a");
  test("");
  test(char);
  test(short);
  test(int);
  test(long);
  test((char)0x0);
  test((short)0x0);
  test((int)0x0);
  test((long)0x0);
  return 0;
}

which when run gives:

'a':    4
"a":    2
"":     1
char:   1
short:  2
int:    4
long:   4
(char)0x0:      1
(short)0x0:     2
(int)0x0:       4
(long)0x0:      4

which suggests that a character is 8 bits, like you suspect, but a character literal is an int.

In C, a character literal is treated as having type int, whereas in C++ it is treated as having type char (sizeof('V') and sizeof(char) are the same in C++ but not in C). The C11 standard says that character constants (e.g., 'x') have type int, not char.

A character literal represents a single character's value within the source code of a program; languages with a dedicated character data type, such as C and C++, generally include them. The following is the famous line from The C Programming Language by Kernighan & Ritchie about a character written between single quotes: "A character written between single quotes represents an integer value equal to the numerical value of the character in the machine's character set."
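
A quick illustration of that sentence (the printed values assume an ASCII execution character set):

#include <stdio.h>

int main(void) {
    /* 'A' is just an integer: the code of A in the machine's
       character set (65 in ASCII). */
    printf("%d\n", 'A');      /* 65 on ASCII systems    */
    printf("%d\n", 'A' + 1);  /* 66, i.e. the code of B */
    return 0;
}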

The C standard puts it this way: "If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int." So yes: in C, a character literal is an int, not a char (char is the smallest integer datatype).
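
That conversion becomes visible when the character's value doesn't fit in a signed char. A small sketch; the results are implementation-defined and assume an 8-bit char, signed on most desktop compilers:

#include <stdio.h>

int main(void) {
    /* '\xFF' has type int; its value is whatever you get by converting
       a char holding that byte to int: -1 where plain char is signed
       and 8 bits wide, 255 where plain char is unsigned. */
    printf("'\\xFF'          = %d\n", '\xFF');
    printf("(int)(char)0xFF = %d\n", (int)(char)0xFF);
    return 0;
}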


Comments
  • sizeof would just return the size of a byte, wouldn't it? Aren't a char and an int equal in size?
  • This is probably compiler (and architecture) dependent. Care to say what you're using? The standard (at least up to '89) was very loose.
  • No. A char is always 1 byte, so sizeof('a') == 1 always (in C++), while an int can theoretically have a sizeof of 1, but that would require a byte of at least 16 bits, which is very unlikely :) so sizeof('a') != sizeof(int) is very likely in C++ on most implementations.
  • ... while it's always wrong in C.
  • 'a' is an int in C - period. C got there first - C made the rules. C++ changed the rules. You can argue that the C++ rules make more sense, but changing the C rules would do more damage than good, so the C standard committee wisely hasn't touched this.
  • Multi-character constants are not portable, even between compilers on a single machine (though GCC seems to be self-consistent across platforms). See: stackoverflow.com/questions/328215
  • I would note that a) This quotation is unattributed; the citation merely says "Would you disagree with this opinion, which was posted in a past thread discussing the issue in question?" ... and b) It is ludicrous, because a char variable is not an int, so making a character constant be one is a special case. And it's easy to use a character value without promoting it: c1 = c2;. OTOH, c1 = 'x' is a downward conversion. Most importantly, sizeof(char) != sizeof('x'), which is a serious language botch. As for multibyte character constants: they're the reason, but they're obsolete.
  • +1 from me for actually answering 'why?'. But I disagree with the last statement -- "The advantage of this in C++ is that a function with a char parameter and a function with an int parameter have different signatures" -- in C++ it is still possible for two functions to have parameters of the same size and different signatures, e.g. void f(unsigned char) vs void f(signed char).
  • @PeterK John could have put it better, but what he says is essentially accurate. The motivation for the change in C++ was, if you write f('a'), you probably want overload resolution to choose f(char) for that call rather than f(int). The relative sizes of int and char are not relevant, as you say.
  • Yes, "Design and Evolution of C++" says overloaded input/output routines were the main reason C++ changed the rules.
  • Max, yeah, I cheated. I looked in the standard, in the compatibility section :)
  • +1 for being interesting. People often think that sizeof("a") and sizeof("") are char*'s and should give 4 (or 8), but in fact they're char[]'s at that point (sizeof(char[11]) gives 11). A trap for newbies -- see the sketch below.
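
A minimal sketch of that last comment; the pointer size of 4 or 8 depends on the platform:

#include <stdio.h>

int main(void) {
    /* A string literal used as a sizeof operand is still an array, so
       sizeof counts the characters plus the terminating '\0'. */
    printf("sizeof \"a\"     = %zu\n", sizeof "a");      /* 2 */
    printf("sizeof \"\"      = %zu\n", sizeof "");       /* 1 */
    printf("sizeof \"hello\" = %zu\n", sizeof "hello");  /* 6 */

    /* Only once the literal decays to a pointer does sizeof measure
       the pointer itself (commonly 4 or 8). */
    const char *p = "hello";
    printf("sizeof p        = %zu\n", sizeof p);
    printf("*p              = %c\n", *p);
    return 0;
}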