Posts Tagged ‘c’

Passing A Variable Number of Arguments to a Function at Run-time

At the request of an Ozzu member last night, I wrote up two new quick tutorials outlining how to write functions that can accept a variable-length argument list at run-time.  I wrote the original tutorial for C programmers and later adapted it for PHP.  I’ll likely adapt it for a few other languages in the near future.

TUTORIAL: Pass Variable Number of Arguments to a C Function
TUTORIAL: Pass Variable Num. of Arguments to a PHP Function
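
For a quick taste of the C side before following the links, here is a minimal sketch built on the standard <stdarg.h> macros. The function name sum_ints and its count-first calling convention are made up for illustration and are not taken from the tutorial itself.

```c
#include <stdarg.h>

/* Hypothetical example: add up `count` int arguments. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);          /* begin after the last named parameter */
    for (i = 0; i < count; ++i) {
        total += va_arg(args, int); /* fetch the next argument as an int */
    }
    va_end(args);                   /* every va_start needs a matching va_end */

    return total;
}
```

The caller has to tell the function how many arguments follow (here through count), because C gives the callee no way to discover that on its own.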

Feedback is always welcome either here or in the tutorial topics themselves.

malloc() Causes a Segfault at _malloc_unlocked

I ran into a strange bug this week in the C project I’ve been working on.  Seemingly at random, malloc() would segfault while allocating memory for a new struct.  The problem had me completely baffled, and web searches turned up nothing useful, since most people hit this problem in multithreaded code, which mine was not.

I fired up gdb and ran a backtrace from the fault:

#1  0x08055e4a in _malloc_unlocked () at src/file.c:263

This didn’t help much, but my guess was that I had corrupted the heap somewhere in the code before the allocation. After some careful code browsing, I found the culprit:

memcpy(ptr, &obj->data.data[19], obj->data.length);
ptr += obj->data.length;

A simple memory bounds error. My intention here was to copy all of the data from position 19 through the end of the data.data buffer into the buffer pointed to by ptr.  But I had left the 19-byte adjustment out of both the size argument to memcpy() and the subsequent pointer increment.  That meant each pass copied 19 bytes more than the source actually held past position 19 and advanced ptr 19 bytes too far.  Correcting the bounds fixed the problem and the program went on its merry way:

memcpy(ptr, &obj->data.data[19], obj->data.length - 19);
ptr += obj->data.length - 19;

The problem with these kinds of errors is that they often don’t reveal themselves until later in the execution of the program, in an area of code that has nothing to do with the actual problem.  The code above is used in a loop and executes successfully for a while, until malloc() tries to deal with an area of memory that was accidentally overwritten by memcpy(). At that point, bad things happen.
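
One way to make this class of mistake harder to write is to compute the remaining length in a single place and check it against the destination before copying. The helper below is only a sketch: copy_tail and its parameters are hypothetical names, not part of the project described here, and it assumes the caller knows how much room is left in the destination.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src[offset .. src_len) into dst.
   Returns the number of bytes copied, or 0 if the offset lies past
   the end of the source or the destination is too small. */
size_t copy_tail(unsigned char *dst, size_t dst_cap,
                 const unsigned char *src, size_t src_len, size_t offset)
{
    size_t n;

    if (offset > src_len) {
        return 0;               /* nothing valid to copy */
    }
    n = src_len - offset;       /* bytes remaining after the offset */
    if (n > dst_cap) {
        return 0;               /* would overrun the destination */
    }
    memcpy(dst, src + offset, n);
    return n;
}
```

Because the function returns the number of bytes it actually copied, the pointer can be advanced by that same value, so the size and the increment can no longer drift apart.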

Be careful with memcpy().

Object-Oriented C

Although C is regarded primarily as a procedural language, it is entirely possible to write C code structured in a way similar to code written in object-oriented languages such as C++.

Now, of course, you could go all out and write truly object-oriented C, complete with inheritance, type checking, and the like. But that’s not what we’re going to be doing here.  Instead of recreating the complete functionality of object-orientation, we’re going to look at how to write pseudo-object-oriented code in C. The key is that the code itself is still procedural, but organized so that it can be used in an OO fashion.  The technique itself is very simple, and when used properly it can make code management much easier.

The first thing to address is data encapsulation.  How do we define a new data type so that the rest of the program is able to use it without knowing about its internal structure?

Doing this is rather easy.  In our header file we tell the compiler that the structure will be defined elsewhere by simply declaring the struct without defining it:

struct String;
typedef struct String String;    /* typedef'd for convenience */

Then, in our source file, we define the actual structure:

struct String {
    unsigned char *str;
    unsigned int len;
};

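Now, whenever we include our header file in our program, we have access to the String type; that is, we can declare pointers to String and pass them around, but the internal data of the struct is hidden from us. Because the type is incomplete outside the source file, other code cannot create a String by value or look inside one. Voila – encapsulation!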
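
To make the limits of an incomplete type concrete, here is a small sketch of what client code can and cannot do when it sees only the forward declaration. The names current and is_current are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* The same forward declaration the header provides. */
struct String;
typedef struct String String;

/* Fine: a pointer to an incomplete type has a known size. */
static String *current = NULL;

/* Fine: pointers can be stored, compared, and passed around. */
int is_current(const String *s)
{
    return s == current;
}

/* Each of the following would be rejected by the compiler here,
   because the size and members of String are unknown:

       String s;                    cannot create a String by value
       size_t n = sizeof(String);   sizeof needs a complete type
       current->len = 0;            members are not visible
*/
```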

The next step is to distinguish the scope of our type’s methods. This is equally simple, and we’ll start by establishing a few naming conventions that will allow us to simulate the scope of functions related to our data type.

For public methods, we’ll prefix the function with the name of the type and an underscore.  For example, if we wanted to create a public method for String called append(), then the corresponding function would be String_append().

For private methods, we’ll prefix the function name with only an underscore.  For example, if we wanted to add a private method to String called resize(), the corresponding function would be _resize().

These conventions help us to visually distinguish which methods should be called by other parts of the program and which ones should be limited for use by only the module containing the data type.

But let’s not rely on these conventions alone.  Where we place our function prototypes is just as important as how we name them. Since we want to make our public methods available to other parts of the program, we place their prototypes in the header file for our module.  This grants access to these functions to any file that includes our header, just like we did with the structure.

Our private methods, however, are declared in the source file as static functions.  This gives them internal linkage, which ensures that only other functions within the same source file will be able to call them.

Let’s create a data type called String to illustrate how the technique works.  We’ll start by defining our header file, mystring.h:

#ifndef STRING_H
#define STRING_H

/* declare the struct (but don't define it!) */
struct String;
typedef struct String String;

/* declare some public methods */
String* String_new( const char *init );
void String_delete( String *str );
void String_append( String *str, const char *other );

#endif

Since the code we’re writing isn’t truly object-oriented (and we’re not messing around with all sorts of function pointers), we need a way for the functions to know which object they are acting upon. For this reason, we pass a pointer to the object as the first argument of each function. In an actual object-oriented language, a method call would look like this:

obj.method(arg1, arg2, ...);

In our pseudo-object-oriented code, the method call looks like this:

method(obj, arg1, arg2, ...);
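
As a concrete illustration of that calling convention, here is a minimal sketch using a made-up Counter type (it is not part of the String module, and its struct is left visible only to keep the example short). Each function takes the object it operates on as its first parameter, named self here by analogy with other languages.

```c
/* Hypothetical type, used only to illustrate the calling convention. */
typedef struct Counter {
    int value;
} Counter;

/* Equivalent of counter.add(amount) in an OO language. */
void Counter_add(Counter *self, int amount)
{
    self->value += amount;
}

/* Equivalent of counter.get(); const because it does not modify the object. */
int Counter_get(const Counter *self)
{
    return self->value;
}
```

A caller would write Counter_add(&c, 5) where an object-oriented language would write c.add(5).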

Now let’s move on to our source file, mystring.c, where we will define the struct, declare our private methods, and define both our public and private methods.

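#include <stdlib.h>
/* string.h is included for strlen() and memcpy(). */
#include <string.h>
#include "mystring.h"

/* define the struct */
struct String {
    unsigned char *str;   /* character data, always NUL-terminated */
    unsigned int len;     /* length in bytes, not counting the NUL */
};

/* declare private methods */
static int _resize( String *str, const unsigned int newSize );

/* define private methods */

/* Resize the buffer to hold newSize bytes plus a terminating NUL.
   Returns 0 on success, -1 if realloc() fails (string is unchanged). */
static int _resize( String *str, const unsigned int newSize ) {
    if( newSize != str->len ) {
        unsigned char *tmp = realloc(str->str, newSize + 1);
        if( tmp == NULL ) {
            return -1;
        }
        str->str = tmp;
        str->len = newSize;
        str->str[newSize] = '\0';
    }
    return 0;
}

/* define public methods */

/* Create a String holding a copy of init; returns NULL on failure. */
String* String_new( const char *init ) {
    String *retval = malloc(sizeof(String));
    if( retval == NULL ) {
        return NULL;
    }
    retval->len = (unsigned int)strlen(init);
    retval->str = malloc(retval->len + 1);
    if( retval->str == NULL ) {
        free(retval);
        return NULL;
    }
    /* copy the characters along with the terminating NUL */
    memcpy(retval->str, init, retval->len + 1);
    return retval;
}

void String_delete( String* str ) {
    if( str != NULL ) {
        free(str->str);
        free(str);
    }
}

/* Append other to str; on allocation failure str is left untouched. */
void String_append( String* str, const char *other ) {
    unsigned int i, oldLen = str->len;
    if( _resize(str, oldLen + (unsigned int)strlen(other)) != 0 ) {
        return;
    }
    for( i = oldLen; i < str->len; ++i ) {
        str->str[i] = other[i - oldLen];
    }
}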

That’s essentially all there is to it. Other C modules in the program will be able to declare String pointers and create String objects through String_new(), but will not have access to their internal variables and will only be allowed to call the public methods declared in the header file.
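
To show that the pattern is not specific to strings, here is the same technique applied to a different, made-up type: a growable stack of ints. Everything in it (IntStack, its functions, and the file names in the comments) is hypothetical, and the header and source parts are squeezed into one listing only so that it can be compiled on its own. The private helper _grow follows the leading-underscore convention described above.

```c
#include <stdlib.h>

/* ---- what would live in a hypothetical intstack.h ---- */
struct IntStack;
typedef struct IntStack IntStack;

IntStack *IntStack_new(void);
void IntStack_delete(IntStack *stack);
int IntStack_push(IntStack *stack, int value);  /* 0 on success, -1 on failure */
int IntStack_pop(IntStack *stack, int *out);    /* 0 on success, -1 if empty */

/* ---- what would live in a hypothetical intstack.c ---- */
struct IntStack {
    int *items;
    size_t count;
    size_t capacity;
};

/* private method: double the capacity (starting at 4 slots) */
static int _grow(IntStack *stack)
{
    size_t newCap = stack->capacity ? stack->capacity * 2 : 4;
    int *tmp = realloc(stack->items, newCap * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    stack->items = tmp;
    stack->capacity = newCap;
    return 0;
}

IntStack *IntStack_new(void)
{
    IntStack *stack = malloc(sizeof *stack);
    if (stack != NULL) {
        stack->items = NULL;
        stack->count = 0;
        stack->capacity = 0;
    }
    return stack;
}

void IntStack_delete(IntStack *stack)
{
    if (stack != NULL) {
        free(stack->items);
        free(stack);
    }
}

int IntStack_push(IntStack *stack, int value)
{
    if (stack->count == stack->capacity && _grow(stack) != 0) {
        return -1;
    }
    stack->items[stack->count++] = value;
    return 0;
}

int IntStack_pop(IntStack *stack, int *out)
{
    if (stack->count == 0) {
        return -1;
    }
    *out = stack->items[--stack->count];
    return 0;
}
```

Code that includes only the header part can push and pop values, but it never learns whether the stack is backed by an array, a linked list, or something else, so the storage can change later without touching any caller.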

On a final note, if you consider yourself a proficient C programmer, I highly recommend checking out the book I linked to at the beginning of this article.  It’s an excellent read and gives a truly insightful look into the inner workings of many of the object-oriented language constructs we’ve come to rely on.