The Moosader Community

Visit the IRC at #Moosader! Community dedicated to programming and game development!
It is currently Wed Oct 16, 2019 7:45 pm

All times are UTC - 6 hours [ DST ]

Post new topic Reply to topic  [ 4 posts ] 
Author Message
 Post subject: C Mini-Faq
PostPosted: Tue Apr 26, 2011 7:40 am 

Joined: Mon Sep 13, 2010 12:18 am
Posts: 116
Location: Southern United States
Various things about C have been popping up in the channel lately, so I decided to write a small FAQ on C. Let me know if you find this useful, and if there's something you'd like to see answered here.

What standards define C?

C was originally developed for the UNIX operating system, and is ultimately based on a similar (and older) language called BCPL. One of the first documents that served as a de facto standard for C is "The C Programming Language", a primer that was written by the original authors. This book is also known as K&R. C has changed a great deal since the release of that book, although it is still an interesting book from a historical perspective. The first attempt to make a de jure standard was undertaken by ANSI and published as ANSI X3.159-1989, also known as C89. This version of the language was quickly adopted by ISO, who accepted this standard in the form of ISO 9899:1990. This version (C90) is essentially the same as C89, only ratified by a different standards body.

ISO 9899 was updated in 1994 via Technical Corrigendum 1 (TCOR1), and again in 1996 under TCOR2. The current standardized version of the language is ISO 9899:1999 (C99), and this may soon be superseded by C1x. Currently, almost every compiler conforms to C89; you can check this by looking for the presence of the "__STDC__" macro, which is defined by the compiler. C99 is not nearly as ubiquitous, but again, it is possible to check support from predefined macros:

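/* See if we have a standard C compiler */
#if __STDC__
  #define C89_COMPLIANT
  /* C99 compilers define __STDC_VERSION__ as 199901L */
  #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    #define C99_COMPLIANT
  #endif
#endif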

Is C an outdated language?

That really all boils down to a matter of opinion. C is certainly still a usable language in this day and age, and is even still being updated. The same is actually true of many programming languages, including one of the first compiled languages (FORTRAN). C is a Turing-complete language, meaning that any program you can implement in another Turing-complete language can be implemented in C. Turing machines are capable of solving any problem which is computable, but are physically impossible to implement due to their requirement of an infinitely large memory. However, as a Turing-complete language, C can solve any problem which does not rely on an infinite memory and is also computable.

That's not to say that a language is desirable just because it is Turing-complete. Brainfuck is also Turing-complete, but is not very useful because it is difficult to read. Fortunately, C is well adapted to certain tasks, such as systems programming, and is known to be fairly lightweight. Some tasks are more easily modeled in other languages; for others, C is well suited, or at least good enough.

The argument that C is no longer useful because it is an old language is a bit disingenuous, and so is the notion that procedural programming, in general, is outdated. An older language has the benefit of having been tried in a variety of use cases, and the fact that C is still being actively updated means that it is only getting better. It would be foolish to throw away an old tool simply because it is old.

C is a tool: it is designed with a specific goal in mind to solve a class of problems. Just as you can use a knife to start a fire, you can use C to implement programs the designers never thought of. Ultimately, a knife is going to be best at cutting things; if you have a lighter at your disposal, then you shouldn't use a knife. If you want to cut things, the knife is the perfect tool for the job. Don't let anybody trick you into using a tool that doesn't make the job easier.

How specific are primitive types in C?

A common misconception is that C defines primitive types (such as int) in a specific way. You might be told, for instance, that an int in C is 32 bits wide and in 2's complement notation. You may also be told that the size of char is 8 bits. This is not necessarily so: C89 makes no such guarantees. What is guaranteed is that the following conditions are true in an ISO C environment:

  • sizeof(char) == 1
  • sizeof(short) >= sizeof(char)
  • sizeof(int) >= sizeof(short)
  • sizeof(long) >= sizeof(int)

Note that char is indeed the size of a byte. Why is it wrong to say that char is always 8 bits wide, then? Think about it this way: on most computers, a byte is 8 bits wide. However, 36-bit machines (such as the PDP-10) may often work with 9-bit bytes. With the existence of Unicode, it is conceivable that a byte may be even larger. A much more useful definition for the byte is "the smallest unit of memory that can be addressed via a pointer." You can check the number of bits in a byte for any C89-compliant compiler by including the "limits.h" header and inspecting the value of CHAR_BIT, which is guaranteed to be at least 8. Also, note from the above that it is possible for a long to be the same size as a byte!
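If you want to see what your own compiler chose, a small sketch like the following will report it. The helper names bitWidth and printTypeWidths are made up for this example; only CHAR_BIT and sizeof come from the language itself.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: width in bits of an object that is `bytes` bytes long */
unsigned long bitWidth(size_t bytes) {
  return (unsigned long) bytes * CHAR_BIT;
}

/* Print what this particular compiler chose; the numbers vary by platform */
void printTypeWidths(void) {
  printf("bits per byte: %d\n", CHAR_BIT);
  printf("short: %lu bits\n", bitWidth(sizeof(short)));
  printf("int:   %lu bits\n", bitWidth(sizeof(int)));
  printf("long:  %lu bits\n", bitWidth(sizeof(long)));
}
```

Calling printTypeWidths() from main() is enough to try it out. The output is deliberately not shown here, because it depends on the platform.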

Note that integers can still suffer from the phenomenon of overflow. Overflow occurs because integral types are a set with finite cardinality; that is, you can only store a finite range of numbers within something like an int. If you want to know the limits for these ranges, they are available as preprocessor macros in limits.h. If you want integers of a certain size, you may have stdint.h available to you. The stdint.h header is always available if your compiler is C99 compliant.
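As a rough sketch of how the limits.h macros can be put to work, here is an addition that refuses to overflow. Signed overflow is undefined behavior in C, so the test has to happen before the addition rather than after it. The name checkedAdd is invented for this example.

```c
#include <limits.h>

/* Hypothetical helper: stores a + b in *out and returns 1,
   or returns 0 (leaving *out alone) if the sum would not fit in an int */
int checkedAdd(int a, int b, int* out) {
  if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
    return 0;
  *out = a + b;
  return 1;
}
```

The same idea carries over to the other integral types by swapping in the matching macros (LONG_MAX, SHRT_MIN, and so on).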

It is also a mistake to assume that floating point types are compliant with IEEE 754. Again, this is commonly the case; however, some architectures (such as the Rabbit series microcontrollers) don't use IEEE 754 floating point numbers. Usually, this is only a concern for manipulating the binary files you want to share between two machines. If you want to use the fast InvSqrt() hack from the Quake 3 engine, how your compiler defines floats is also important.

It is an error to assume that you can directly compare two arbitrary floats for equality. Rounding means that two computations which are mathematically equal may produce slightly different values, and if a floating point value is NaN, then it will not even be considered equal to itself. If you want to compare two floating point numbers for equality, you usually need to compare them by checking to see if they are within some range of each other. The following webpage discusses several methods for implementing an equality test with error:
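To give a feel for the range check described above, here is one simple way to write it. This is only a sketch: the name approxEqual and both tolerance parameters are assumptions made for this example, and picking good tolerances depends entirely on your data.

```c
/* Absolute value without needing math.h */
static double absDouble(double x) {
  return x < 0 ? -x : x;
}

/* Hypothetical helper: returns 1 if a and b differ by at most absTol,
   or by at most relTol times the larger magnitude; 0 otherwise.
   Any comparison involving NaN is false, so NaN never compares equal. */
int approxEqual(double a, double b, double relTol, double absTol) {
  double diff = absDouble(a - b);
  double scale = absDouble(a) > absDouble(b) ? absDouble(a) : absDouble(b);
  return diff <= absTol || diff <= relTol * scale;
}
```

The absolute tolerance handles values near zero, where a purely relative test becomes uselessly strict.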

What assumptions can I make about composite types?

If you have a struct, it is guaranteed to be at least the size of the sum of the sizes of its members. A struct must be addressable, so its size is always at least 1. For reasons related to alignment, it may be larger. Structs retain their memory layout in such a way that the following is true:

struct Foo {
  int a;
  int b;
};

struct Bar {
  int c;
  int d;
  int e;
};

int testLayout(const struct Bar* b) {
   /* Foo and Bar start with the same two int members */
   const struct Foo* f = (const struct Foo*) b;
   return (b->d == f->b); /* Always true */
}
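To see the "it may be larger" part for yourself, the offsetof macro from stddef.h tells you where each member actually landed. The struct Padded and the paddingBytes function below are made up for this sketch.

```c
#include <stddef.h>

struct Padded {
  char tag;   /* exactly 1 byte */
  long value; /* usually wants a stricter alignment than char */
};

/* Bytes the compiler added beyond the members themselves (may be 0) */
size_t paddingBytes(void) {
  return sizeof(struct Padded) - (sizeof(char) + sizeof(long));
}
```

On many desktop compilers offsetof(struct Padded, value) is larger than 1, which means there is a hole after tag. The exact numbers are up to the compiler, so don't hard-code them.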

Union types are at least as large as the widest member in the union. Unions must be addressable, so a union always has a size of at least 1. All members of a union share the same base pointer in memory.
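Both properties are easy to check. The union Value and the function membersShareAddress below are invented for this example.

```c
#include <stddef.h>

union Value {
  char c;
  int i;
  double d;
};

/* Returns 1 if every member starts at the address of the union itself */
int membersShareAddress(void) {
  union Value v;
  return (void*) &v == (void*) &v.c
      && (void*) &v == (void*) &v.i
      && (void*) &v == (void*) &v.d;
}
```

Because the members overlap, writing one of them overwrites the others. Keeping track of which member is currently "live" is your job, not the compiler's.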

What assumptions can I make about pointers?

A fairly common misconception about pointers is that they always come from malloc(), and therefore, must always be released with a call to free(). While it's true that you should free any pointers that you know come from malloc(), it is not the case that malloc() is the source of all pointers. Trying to free() a pointer that you don't know came from malloc() is a cardinal sin, and will likely result in crashing your program. A pointer may point to a local variable, which is likely managed by a completely different memory management strategy altogether.

It is also not uncommon for malloc() to prefix its result with some housekeeping information. It may be possible to confuse free() because of this, although it is likely that free() is implemented to check for this. Here's the problem: it is entirely possible that, through a complete fluke, you happen to have valid housekeeping information in the memory where free() expects to find it. The worst case scenario is that free() gets confused and accidentally overwrites memory that it's not supposed to. For those of you using C++, it is also not necessarily true that the new operator uses malloc(); if you pass a pointer from C++ to a C function, do not free() it in the C function.

Pointers have a specific form of arithmetic associated with them. In pointer arithmetic, you can add or subtract an offset. The offset is scaled by the referent type of the pointer. Note that array subscripting is a shorthand notation for performing pointer arithmetic. If we have an int* p, the following things are true:

  • p[ i ] == *(p + i)
  • &p[ i ] == (p + i)
  • p + i == (int*) ((char*) p + i*sizeof(int))
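The three identities above can be checked directly. The function name pointerIdentitiesHold is made up for this sketch, and p must point to an array with more than i elements.

```c
#include <stddef.h>

/* Returns 1 if all three identities hold for element i of the array at p */
int pointerIdentitiesHold(int* p, size_t i) {
  return p[i] == *(p + i)
      && &p[i] == (p + i)
      && (p + i) == (int*) ((char*) p + i * sizeof(int));
}
```

The cast through char* in the last line is what makes the scaling visible: adding i to an int* moves the address by i * sizeof(int) bytes.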

An assumption you might have made is that a pointer will fit in an int. Knowing what you know from above, you should recognize that this assumption is flawed to begin with: sizeof(int) is compiler-dependent. However, let's assume, for the moment, that the size of int were fixed. Now a pointer must be capable of addressing any byte within a global address space in the program. On the 80386, both the word size and the address space are 32 bits, so this sometimes works when sizeof(int) is the size of the machine word. However, on x86_64, int is often 32 bits, while the size of a pointer is 64 bits. Therefore, trying to pack a pointer into an int throws away the upper bits of the address in this case. In general, you should avoid trying to pack a pointer into an integer (if you really must, C99's optional uintptr_t type from stdint.h exists for this purpose; size_t usually works too, but the standard doesn't promise it).

Because an int may not be capable of storing a pointer, it is also true that you should not depend on int when indexing an array or performing pointer arithmetic. Instead, C89 provides the size_t type as a typedef in stddef.h. The size_t type is guaranteed to be an unsigned integer type wide enough to hold the size of any object, and is the result type of the sizeof operator. In general, you should use size_t when you want to return sizes of objects in memory or when indexing these objects.
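In practice this just means declaring your counts and loop indices as size_t. A minimal sketch (sumInts is a name invented for this example):

```c
#include <stddef.h>

/* Adds up `count` ints starting at `values`; count and index are both size_t */
long sumInts(const int* values, size_t count) {
  long total = 0;
  size_t i;
  for (i = 0; i < count; i++)
    total += values[i];
  return total;
}
```

One thing to watch out for: size_t is unsigned, so a loop that counts down with a condition like "i >= 0" never terminates.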

Note that not all pointers are valid. The constant 0 always converts to the null pointer (even if 0 is a valid address on the host), and it is possible to have pointers which violate the alignment requirements of the host. Alignment is the constraint that values of a given type must be read from addresses which are a multiple of some size. Consider the following code:

int causeAlignmentError() {
   char bytes[5];
   return *((int*) (bytes + 1)); /* bytes + 1 is probably not aligned for int */
}

On some architectures, this will succeed (albeit with a penalty to performance in most cases). However, for some architectures, such as the SPARC architecture, this will trigger a trap in the CPU that ultimately leads to the untimely demise of the program. Per C89, any memory buffer created by malloc() or realloc() is properly aligned to prevent access to the base pointer from triggering such a trap. Generally, structs and unions are also sized to prevent alignment problems. Therefore, you can access an array of these types without worrying about alignment issues. Alignment errors usually only occur whenever the above strategy is used. If you must work with a potentially unaligned value, it is much better to use memcpy() to perform a shallow copy of the source into a temporary variable of some kind.
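The memcpy() approach looks something like this. The function name readIntAt is made up for this sketch; it assumes the bytes were written by the same machine, so endianness doesn't come into it.

```c
#include <string.h>

/* Reads an int from an address that may not be aligned for int.
   memcpy() works byte by byte, so it never performs an unaligned int load. */
int readIntAt(const unsigned char* bytes) {
  int value;
  memcpy(&value, bytes, sizeof(value));
  return value;
}
```

Compilers are usually smart enough to turn this into a single load on architectures where unaligned access is harmless, so you rarely pay for being careful.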

What is pointer aliasing, and when can it go wrong?

Pointer aliasing is the phenomenon that occurs when two pointers contain the same address. This is the result of making a shallow copy of a pointer (copying the address itself) into another pointer. There are several places in your code where this may arise, including when you pass a pointer as a parameter to a function. Pointer aliasing can become a problem whenever you attempt to manipulate the memory that a pointer points to after it has been invalidated. An appropriate way to preface this discussion is with a small blurb on object lifetime.

Pointers are a rather low-level abstraction. Recall, firstly, that a pointer can point to any address within the host's memory range, whether it is valid or not. This can include addresses which do not actually correspond to any physical memory, control registers for memory-mapped I/O, or even the operating system kernel. Pointers aren't required to explicitly carry any information about their source, just the addresses of their referents. This can be a problem if you're not paying attention to what you're doing. You can think of data as having a lifetime which specifies the period of time in which that data is what it claims to be. This can be illustrated by analogy.

Let's say you work as a collections agent. You know that a person by the name of John Q. Public owes quite a deal of money to another company, and you have bought this debt, hoping to capitalize on it. The company has supplied you with his phone number, which you will use to contact him (i.e., by "dereferencing" the phone number). When you call the number, however, you reach the Post Office instead, but you don't care, because you've made the assumption that whoever you are calling must be the debtor you are trying to reach. Therefore, you will harass the Post Office regardless of whether or not it is actually the debtor.

The phone number has become what is called a stale pointer. In truth, there could be a number of reasons why you could never reach John Q. Public; perhaps he has moved, or perhaps he has died. All that is important is that the phone number has been recycled to refer to something that is completely different, and this represents perhaps the worst case scenario in dealing with a stale pointer: performing an operation on an entirely unrelated data structure. With the right operation, this can lead to completely trashing the contents of the new data structure. The compiler cannot prevent this, because it believes the pointer refers to a Person, not a Company.

This scenario can happen with or without pointer aliasing; however, the likelihood of it happening with pointer aliasing is much greater in some cases. Pointers are not magically updated whenever an object exceeds its lifetime, so an aliased pointer will become a stale one. Another example, this time in C code, will illustrate how you can alias a pointer which always refers to an object which has exceeded its lifetime:

int* getStalePointer() {
  int iGoAway = 5;
  /* This address will not be valid for the caller because it was allocated on the stack */
  return &iGoAway;
}
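For contrast, here are two common ways to hand a value back without creating a stale pointer. Both function names are invented for this sketch; neither is the only way to do it.

```c
#include <stdlib.h>

/* Option 1: the caller owns the storage and passes its address in */
void fillValue(int* out) {
  *out = 5;
}

/* Option 2: heap storage outlives the function call.
   The caller is now responsible for calling free() on the result. */
int* makeValue(void) {
  int* p = (int*) malloc(sizeof(int));
  if (p)
    *p = 5;
  return p; /* NULL if the allocation failed */
}
```

In the first version the lifetime is whatever the caller's variable has. In the second, the lifetime lasts until free() is called, and any copy of the pointer still lying around after that point is stale.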

[ Todo: add in information on restrict qualifier ]

I was looking through an older code base, and function declarations look weird. Why is this?

If you've been looking at some of the code from older projects, you might have noticed function declarations look a little like this:

long avg(x, n)
  long x;
  long n; {
  return x/n;
}

This is the syntax from K&R, and was originally used to define functions. As you can see, this syntax is a bit bulky, and a replacement syntax became available in C89. As a result, nobody really uses the K&R syntax anymore. If you're working on a new project, you should probably consider using the new ANSI C syntax for function definitions.
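For comparison, this is roughly what the same function looks like in the ANSI style. The separate prototype line is what you would normally put in a header.

```c
/* Prototype: lets the compiler check argument types at every call site */
long avg(long x, long n);

/* Definition: parameter types now live inside the parentheses */
long avg(long x, long n) {
  return x / n;
}
```

Besides being shorter, the ANSI form means a call like avg(10, 2) has its int arguments converted to long automatically. With the K&R form the compiler has no parameter information to go on at the call site.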

Why doesn't the C standard library have X?

There may be several reasons why the C standard library doesn't have a particular function you're looking for. The most common reason is that what you want is difficult to implement portably, and may not even fit the model of the host environment. For instance, it doesn't make sense for the standard library to provide sockets: that's up to the operating system to do. Because C can be used with a variety of platforms, a decision had to be made whether or not to restrict C to a limited number of platforms, or to make C as minimal as possible so that it could be used in a variety of environments. ISO went for the latter decision.

It is also possible that what you are looking for wasn't as big of a problem when the last version of the standard was being written. For instance, multithreading was not a big concern in 1989, because few machines could effectively leverage it. Now we have multicore machines, so threading is being considered for C1x. While it is almost impossible to predict changes like these, the designers of C do the best they can to figure out what is useful enough to be included in the standard library. If you can't wait for the next release, you always have the option of using an external library.

How do I handle endianness issues?

Because C is close to the machine, it is sometimes affected by issues related to endianness. Endianness is the way that the bytes in a number are ordered. The two major forms of endianness, big-endian and little-endian, are equally likely to be chosen by somebody designing a new processor architecture. If you want a rough shot in the dark, this will cover most cases:

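#include <stddef.h>
#include <string.h>

typedef enum Endianness {
  ENDIAN_UNKNOWN,
  ENDIAN_BIG,
  ENDIAN_LITTLE,
  ENDIANNESS_HOST /* "Whatever this machine uses" */
} Endianness;

/* For detecting endian */
Endianness hostEndian() {
  int testValue = 0x11223344;
  /* Look at the byte stored at the lowest address */
  switch (*((char*) &testValue)) {
     case 0x44:
        /* Least significant byte comes first */
        return ENDIAN_LITTLE;
     case 0x11:
        /* Most significant byte comes first */
        return ENDIAN_BIG;
     default:
        return ENDIAN_UNKNOWN;
  }
}

/* Precondition: length of src == length of dest == n, and they don't overlap */
void endianConvert_(const char* src, char* dest, size_t n) {
  /* Copy the bytes of src into dest in reverse order */
  while(n--) {
    *dest++ = src[n];
  }
}

int endianConvert(const void* src, void* dest, size_t n, Endianness srcE, Endianness destE) {
   if (srcE == ENDIANNESS_HOST)
      srcE = hostEndian();
   if (destE == ENDIANNESS_HOST)
      destE = hostEndian();

   /* Endianness unknown; can't continue */
   if (srcE == ENDIAN_UNKNOWN || destE == ENDIAN_UNKNOWN)
      return 0;

   /* Different endianness? Then swap */
   if (srcE != destE)
      endianConvert_((const char*) src, (char*) dest, n);
   else
      memcpy(dest, src, n);

   return 1;
}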

Some architectures have mixed endianness; however, this isn't common, so implementing an endianness conversion routine for this is an exercise left to the reader.
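If all you need is to read and write numbers in a fixed byte order (for a file format or a network protocol, say), you can often sidestep detection entirely by building the value out of shifts. This is a different technique from the routine above, not a replacement for it, and the names writeU32BE and readU32BE are made up for this sketch. It assumes 8-bit bytes.

```c
/* Writes the low 32 bits of value to out[0..3], most significant byte first */
void writeU32BE(unsigned long value, unsigned char* out) {
  out[0] = (unsigned char) ((value >> 24) & 0xFF);
  out[1] = (unsigned char) ((value >> 16) & 0xFF);
  out[2] = (unsigned char) ((value >> 8) & 0xFF);
  out[3] = (unsigned char) (value & 0xFF);
}

/* Reads four bytes, most significant first, back into a number */
unsigned long readU32BE(const unsigned char* in) {
  return ((unsigned long) in[0] << 24)
       | ((unsigned long) in[1] << 16)
       | ((unsigned long) in[2] << 8)
       |  (unsigned long) in[3];
}
```

Shifts operate on values rather than on memory, so this code behaves the same no matter what the host's endianness is, including the mixed-endian oddballs.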

You can't use the object-oriented paradigm in C... right?

C is a procedural language, and so doesn't have the concept of a class. While this would seem to make object-oriented programming impossible in C, it is possible to do object-oriented programming with the clever use of function pointers. Let's start out by defining a class by its traditional definition: a collection of states and associated behaviors. There are three attributes of an object class to consider:

  • Encapsulation - restricting access to the members of a class
  • Inheritance - the ability to say that an instance of a subclass has the attributes and behaviors of its parent class
  • Polymorphism - the ability for an instance of a subclass to be substituted for an instance of the parent class

Let's start by modeling our class in the following way. We'll have a part which represents the class instance that stores the object's data, and a part which stores the virtual methods. Let's try to model a stack in this way:

#include <stddef.h>

struct Stack;

struct StackVTable {
  int (*push)(struct Stack*, const void*);
  int (*pop)(struct Stack*);
  void* (*top)(struct Stack*);
  size_t (*size)(struct Stack*);
  void (*destroy)(struct Stack*);
};

struct Stack {
   const struct StackVTable* vtable;
};

A StackVTable contains the virtual interface for the stack. You can see we have the basic interface for a stack, in addition to a class destructor. Now if we want to create a linked stack, we do the following:

#include <stdlib.h>
#include <string.h>

#define THIS ((struct LinkedStack*) self)

struct LinkedStackNode {
   struct LinkedStackNode* prev;
   /* The element's bytes are stored immediately after the node */
};

struct LinkedStack {
   struct Stack parent; /* Must come first, so a Stack* is also a LinkedStack* */
   size_t elementSize;
   size_t listSize;
   struct LinkedStackNode* top;
};

/* Handlers for linked-stack method */

static int lsPush(struct Stack* self, const void* e) {
  struct LinkedStackNode* target = (struct LinkedStackNode*) malloc(sizeof(struct LinkedStackNode) + THIS->elementSize);
  /* Couldn't create the node; fail */
  if (!target)
    return 0;
  /* Construct the node */
  memcpy((target + 1), e, THIS->elementSize);
  target->prev = THIS->top;

  /* Now set the top */
  THIS->top = target;
  THIS->listSize++;

  return 1;
}

static int lsPop(struct Stack* self) {
  if (THIS->top) {
     struct LinkedStackNode* prev = THIS->top->prev;
     free(THIS->top);
     THIS->top = prev;
     THIS->listSize--;
     return 1;
  }
  return 0;
}

static void* lsTop(struct Stack* self) {
  if (THIS->top)
    return (THIS->top + 1);
  return 0;
}

static size_t lsSize(struct Stack* self) {
  return THIS->listSize;
}

static void lsDestructor(struct Stack* self) {
  if (THIS->top) {
     struct LinkedStackNode* cur, *prev;
     cur = THIS->top;
     /* Free all of the nodes */
     while(cur) {
        prev = cur->prev;
        free(cur);
        cur = prev;
     }
  }
  /* Stack after this is garbage */
}

/* Virtual table for the linked stack */
static const struct StackVTable LINKED_STACK_VTABLE = { lsPush, lsPop, lsTop, lsSize, lsDestructor };

struct Stack* initLinkedStack(size_t elementSize) {
  struct LinkedStack* ret = (struct LinkedStack*) malloc(sizeof(struct LinkedStack));

  /* We couldn't allocate the stack; we can't go on */
  if (!ret)
     return 0;

  /* Now construct the LinkedStack */
  ret->parent.vtable = &LINKED_STACK_VTABLE;
  ret->elementSize = elementSize;
  ret->listSize = 0;
  ret->top = 0;

  return &ret->parent;
}

#undef THIS

At this point, we've sufficiently demonstrated the three properties of object-oriented programming. The LinkedStack inherits all of the fields of Stack, which, at the moment, is just a VTable. We also demonstrate encapsulation, as long as the struct definition for LinkedStack isn't part of the public interface for the library. Finally, we demonstrate polymorphism through the use of virtual tables. Currently, we can send a message to the objects explicitly by calling the appropriate method from their VTable; for instance, by calling stack->vtable->size(stack). However, in many cases, it's nice to have a public interface for this. We will create functions called "thunks" which simplify these calls:

int stackPush(struct Stack* stack, const void* e) {
  return stack->vtable->push(stack, e);
}

int stackPop(struct Stack* stack) {
  return stack->vtable->pop(stack);
}

void* stackTop(struct Stack* stack) {
  return stack->vtable->top(stack);
}

size_t stackSize(struct Stack* stack) {
  return stack->vtable->size(stack);
}

/* Note; assumes Stack was allocated with malloc() */
void stackDestroy(struct Stack* stack) {
  /* Let the subclass clean up its own data, then release the object itself */
  stack->vtable->destroy(stack);
  free(stack);
}

Nobody said it was easy, but it is doable. Note that there are a couple of things we can't really do here. For instance, protected access is out the window: members must be either private or public. For the VTable, we can only have public members, but it doesn't make any sense to have virtual private members, anyway. We also cannot finalize methods unless we finalize them from the get-go: the method must be final in the base class. Still, this pattern can be useful in some situations, and a number of C libraries use a pattern like this to simplify things.
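If the stack is too much to take in at once, the same pattern can be boiled down to something much smaller. The following is a self-contained sketch using made-up Shape, Rect and Square types: one base struct holding a VTable pointer, two "subclasses" that embed it as their first member, and one thunk.

```c
struct Shape;

struct ShapeVTable {
  long (*area)(const struct Shape*);
};

struct Shape {
  const struct ShapeVTable* vtable;
};

struct Rect {
  struct Shape parent; /* first member, so a Rect* can be used as a Shape* */
  long w, h;
};

struct Square {
  struct Shape parent;
  long side;
};

static long rectArea(const struct Shape* self) {
  const struct Rect* r = (const struct Rect*) self;
  return r->w * r->h;
}

static long squareArea(const struct Shape* self) {
  const struct Square* s = (const struct Square*) self;
  return s->side * s->side;
}

static const struct ShapeVTable RECT_VTABLE = { rectArea };
static const struct ShapeVTable SQUARE_VTABLE = { squareArea };

void initRect(struct Rect* r, long w, long h) {
  r->parent.vtable = &RECT_VTABLE;
  r->w = w;
  r->h = h;
}

void initSquare(struct Square* s, long side) {
  s->parent.vtable = &SQUARE_VTABLE;
  s->side = side;
}

/* Thunk: the caller never needs to know which kind of shape it has */
long shapeArea(const struct Shape* shape) {
  return shape->vtable->area(shape);
}
```

Code that only ever sees a struct Shape* can call shapeArea() on either kind of object, which is the polymorphism part. Unlike the stack, these objects are initialized in place rather than allocated with malloc(), so there is no destructor to worry about.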

What does the volatile qualifier do?

The volatile qualifier can be applied to a type to indicate that the compiler must not cache the value of a variable. This is particularly useful if you are waiting for a flag to be modified by something outside the normal flow of the program, like another thread or an interrupt. Consider the following code:

int signalOccurred = 0;

void waitForSignal() {
  while (!signalOccurred); /* Spin until somebody else sets the flag */
  signalOccurred = 0;
}

An optimizing compiler might notice that there is no opportunity for signalOccurred to update within the while loop, so it might instead decide to replace it with the following code, thinking it is being smart:

void waitForSignal() {
  if (!signalOccurred) while (1); /* Checked once, then never again */
  signalOccurred = 0;
}

Now, this is a problem if another process actually does modify signalOccurred. In most cases, the decision the compiler made would've made perfect sense. In the context of a single thread, signalOccurred is provably unable to change. However, because the outside process may eventually change signalOccurred, this is the absolute wrong thing to do. What we need to do is give the compiler a hint, so that it will always check signalOccurred, even if it believes optimizing the loop is a better choice. We can do this by declaring signalOccurred as "volatile int".
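Here is a sketch of the fixed version, using a signal handler as the "outside process". Note one deliberate difference from the text: the flag is declared as volatile sig_atomic_t rather than volatile int, because sig_atomic_t is the type the standard promises a signal handler can safely write. The handler name onSignal is made up for this example.

```c
#include <signal.h>

/* volatile forces a fresh read of the flag on every pass through the loop */
static volatile sig_atomic_t signalOccurred = 0;

/* Hypothetical handler; install it with signal(SIGINT, onSignal) */
void onSignal(int signum) {
  (void) signum; /* unused */
  signalOccurred = 1;
}

void waitForSignal(void) {
  while (!signalOccurred)
    ; /* the compiler may not hoist this read out of the loop */
  signalOccurred = 0;
}
```

Keep in mind that volatile only stops the compiler from caching the value. It does not make anything atomic and it does not order memory accesses between threads, so for real multithreaded code you still want whatever synchronization your threading library provides.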

Note that the volatile qualifier works much like the const qualifier. You can implicitly convert an int* to a volatile int*, for instance, but not the other way around. You can also combine volatile with const. Declaring a variable const volatile means that your C code cannot change the value of the variable, but some outside process is permitted to do so.

How well does C play with other languages?

For most languages, C plays the role of a least-common denominator when accessing system interfaces or other libraries. Most languages have some mechanism which they can use to interface with C code. Some resources for interfacing C to some popular languages can be found below:

A word of warning to people trying to mix C++ and C code: C++ code may not easily link to C code. C++ is also stricter about certain things. For instance, you cannot implicitly convert from void* to another pointer type in C++ code, but C code will let you do this without hassle. Also, any references to C library code must be declared with extern "C". If you don't do this, the C++ compiler will probably try to mangle the name of the symbol you're using, and you'll get linker errors.
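The usual trick is to wrap the declarations in your C header so that the same file works from both languages. The header name mylib.h and the function mylibAdd below are placeholders invented for this sketch.

```c
/* mylib.h (hypothetical): safe to include from both C and C++ */
#ifndef MYLIB_H
#define MYLIB_H

#ifdef __cplusplus
extern "C" { /* only a C++ compiler ever sees this line */
#endif

int mylibAdd(int a, int b);

#ifdef __cplusplus
}
#endif

#endif /* MYLIB_H */

/* mylib.c (hypothetical): compiled as plain C */
int mylibAdd(int a, int b) {
  return a + b;
}
```

The __cplusplus macro is only defined when a C++ compiler is doing the compiling, so a C compiler sees nothing but the plain declaration.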

"Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. -- C. Babbage

 Post subject: Re: C Mini-Faq
PostPosted: Mon May 16, 2011 12:49 pm 
Site Admin

Joined: Wed May 14, 2008 4:43 am
Posts: 2328
Location: Kansas City
Jeebus that's a lot of info. I'm going to sticky it since it's good information. ^_^

Android apps by Moosader! - Open Source projects -

PostPosted: Wed Jan 03, 2018 2:49 am 

Joined: Wed Jan 03, 2018 2:24 am
Posts: 53

 Post subject: Re: C Mini-Faq
PostPosted: Sat May 11, 2019 5:21 pm 

Joined: Wed Mar 13, 2019 4:37 am
Posts: 28345
