Monday, November 16, 2009

Using C, convert a dynamically-allocated int array to a comma-separated string as cleanly as possible

EDIT: There are no "dynamic arrays", so to speak, in C. What I meant was "dynamically-allocated". I've updated the wording to reflect this.

EDIT 2: Someone on Reddit pointed out that my Python example doesn't actually work, since I have an array of ints rather than strings. I've updated the code example so it works.

I'm much less experienced in C than I am in higher-level languages. At Cisco, we use C, and I sometimes run into something that would be easy to do in Java or Python, but very difficult to do in C. Now is one of those times.

I have a dynamically-allocated array of unsigned integers which I need to convert to a comma-separated string for logging. While the integers are not likely to be very large, they could in principle be anywhere from 0 to 4,294,967,295. In Python, that's one short line.

[code lang="python"]my_str = ','.join([str(num) for num in my_list])[/code]

How elegantly can people do this in C? I came up with a way, but it's gross. If anyone knows a nice way to do it, please enlighten me.

17 comments:

  1. I think that without using a nonstandard string library, the cleanest you'll get is something like:

    char *showlist(unsigned *xs, unsigned length)
    {
        char *result, *p;

        /* 10 digits + comma/nul for each, +1 so an empty list still gets a nul */
        p = result = malloc(11*length + 1);
        if (result == NULL)
            return NULL;
        *p = '\0';
        for (unsigned i = 0; i < length; ++i)
        {
            if (i != 0)
                *p++ = ',';
            p += sprintf(p, "%u", xs[i]);
        }

        /* you'll probably free it soon enough to not care about realloc */
        return realloc(result, p - result + 1);
    }

    But depending on the logging API, you might be able to do this cleanly without the big allocation.
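
    As a rough sketch of that last point, assuming the logging API can hand you a stdio stream (that is an assumption, and log_list is a hypothetical name, not part of any real logging library), each number could be written straight to the stream with no intermediate string:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes xs[0..length-1] to `out` as "1,2,3".
   Returns the number of characters written, or -1 on a write error. */
int log_list(FILE *out, const unsigned *xs, size_t length)
{
    int total = 0;

    assert(out != NULL);
    assert(xs != NULL || length == 0);

    for (size_t i = 0; i < length; ++i) {
        int n;

        /* Every element except the first is preceded by a comma. */
        if (i != 0) {
            if (fputc(',', out) == EOF)
                return -1;
            ++total;
        }
        n = fprintf(out, "%u", xs[i]);
        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

    Something like log_list(stderr, xs, length) would then emit the list directly. Whether this fits depends entirely on what the real logging API accepts.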

  2. Social comments and analytics for this post...

    This post was mentioned on Reddit by Bannakaffalatta: It's a magic function that was defined elsewhere....

  3. http://pastebin.com/f6674853e

    I cheat and use the non-POSIX asprintf, but it's been available on every system I've deployed on (FreeBSD and Linux). Without it, you have to finagle around with allocating your own buffers, which is a pain.

  4. It's a library method in Python. In C, depending on what list library you're using, it either already exists or you'll have to write it yourself. I don't think the implementation of Python's join() is much cleaner than your version.

    I'd implement join() more or less like this:
    1) Iterate through the list while summing sizes.
    2) Allocate memory.
    3) Iterate through the list again, printing the strings to the allocated memory and incrementing the pointer.
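
    The three steps above can be sketched roughly as follows for an array of unsigned ints. This assumes a C99 snprintf, which returns the length it would have written when given a NULL buffer of size 0. The name join_uints and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd "a<sep>b<sep>c" string, or NULL
   if allocation fails. The caller must free() the result. */
char *join_uints(const unsigned *xs, size_t n, const char *sep)
{
    size_t sep_len = strlen(sep);
    size_t total = 1;                 /* room for the terminating nul */
    char *out, *p;

    assert(xs != NULL || n == 0);

    /* 1) Iterate once, summing the printed width of every element
          plus one separator between each pair. */
    for (size_t i = 0; i < n; ++i) {
        total += (size_t)snprintf(NULL, 0, "%u", xs[i]);
        if (i != 0)
            total += sep_len;
    }

    /* 2) Allocate exactly that much. */
    p = out = malloc(total);
    if (out == NULL)
        return NULL;
    *p = '\0';                        /* n == 0 gives an empty string */

    /* 3) Iterate again, printing each element and advancing the pointer. */
    for (size_t i = 0; i < n; ++i) {
        if (i != 0) {
            memcpy(p, sep, sep_len);
            p += sep_len;
        }
        p += sprintf(p, "%u", xs[i]);
    }
    return out;
}
```

    The cost is formatting every number twice, in exchange for a single allocation of exactly the right size.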

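  5. char *join(char *delim, int *xs, int n)
    {
        char *str, *s;
        int i;

        /* up to 11 chars per int ("-2147483648") plus a delimiter each, +1 for the nul */
        s = str = malloc(n*(11 + strlen(delim)) + 1);
        if (str == NULL)
            return NULL;
        *s = '\0';
        for (i = 0; i < n-1; i++)
            s += sprintf(s, "%d%s", xs[i], delim);
        if (n > 0)
            sprintf(s, "%d", xs[i]);
        return str;
    }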

  6. It doesn't have to be super-gross, though it's not a one-liner. You'll want to allocate a buffer for the string (and reallocate it bigger if it threatens to get too long), and keep appending your ints to the end of it. The sprintf function returns the number of characters it writes, so you don't have to keep scanning through the string again and again; you can just keep a pointer to the end of it.

    Here is a rough draft - this function works but doesn't check if the list exceeds the buffer. (The little test-code main() also casually leaks memory - note that the string is dynamically allocated and has to be freed elsewhere!)

    Hope your blogging software displays this okay - hope this helps!
    -----------------------------------

    #include <stdio.h>
    #include <stdlib.h>
    #define SIZE 30

    char * array_string(int * arr, int size)
    {
        int i;
        char * buf = malloc(1000); /* example code, check for malloc failure! */
        char *p = buf;
        if (size == 0) { /* return empty string for zero-length array */
            *buf = '\0';
            return buf;
        }
        for (i = 0; i < size; i++) {
            int nchars;
            nchars = sprintf(p, "%d,", *(arr+i));
            p += nchars;
        }
        *(p-1) = '\0'; /* replace last comma with string terminator */
        return buf;
    }

    int main(void)
    {
        int i, iarray[SIZE];
        char * str;
        for (i = 0; i < SIZE; i++) /* fill the array with numbers 0 to 29 */
            iarray[i] = i;
        for (i = 0; i <= SIZE; i++) { /* try printing a load of lists */
            str = array_string(&iarray[0], i); /* memory leak here! */
            printf("%s\n", str);
        }
        return 0;
    }
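
    The draft above uses a fixed 1000-byte buffer. A minimal sketch of the "reallocate it bigger" idea could look like the following. The name array_string_grow, the starting capacity of 64 and the doubling policy are all arbitrary choices for illustration, and MAX_ITEM assumes int is at most 32 bits wide.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Most text one element can need: comma + sign + 10 digits + nul.
   Assumes int is no wider than 32 bits. */
#define MAX_ITEM 13

/* Hypothetical variant of array_string that grows its buffer on demand.
   Returns a malloc'd string (caller frees), or NULL on allocation failure. */
char *array_string_grow(const int *arr, size_t size)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    assert(arr != NULL || size == 0);
    if (buf == NULL)
        return NULL;
    buf[0] = '\0';

    for (size_t i = 0; i < size; i++) {
        /* Double the buffer before writing if the next item might not fit. */
        if (cap - len < MAX_ITEM) {
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        /* sprintf returns the count written, so len always marks the end. */
        len += (size_t)sprintf(buf + len, i == 0 ? "%d" : ",%d", arr[i]);
    }
    return buf;
}
```

    Writing the comma before each element except the first also avoids having to overwrite a trailing comma at the end.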

  7. my_str = ','.join(my_list)
    is not very clean code, because what it does is not obvious

  8. http://codepad.org/uNj4ltwd

    doesn't really check any errors but allocating your own memory isn't really all that bad.

  9. Finagling around with allocating your own buffers is a pain, especially when you try to free() string literals!

  10. Hey There,

    out of curiosity, I plagiarized your post and asked it as a question on stackoverflow.

    http://stackoverflow.com/questions/1745811/using-c-convert-a-dynamically-allocated-int-array-to-a-comma-separated-string-as

    I wanted to see what would happen :)

  11. #include <stdio.h>
    #include <stdlib.h>

    /* My approach is to count the length of the string required. And do a single alloc.
    Sure you can allocate more, but I don't know for how long this data will be retained.
    */

    #define LEN(a, b) ((sizeof (a)/sizeof (b)))

    int main(void) {

        unsigned a[] = {1, 23, 45, 523, 544};
        int i, str_len=0, t_written=0;
        char tmp[11]; /* enough to fit the biggest unsigned int */

        for(i = 0; i < LEN(a, unsigned); i++)
            str_len += sprintf(tmp, "%u", a[i]);

        /* total: we need LEN(a) - 1 more for the ',' and + 1 for '\0' */
        str_len += LEN(a, unsigned);
        char *str = malloc(str_len);
        if (!str)
            return 1;

        t_written += sprintf(str+t_written, "%u", a[0]);
        for(i = 1; i < LEN(a, unsigned); i++)
            t_written += sprintf(str+t_written, ",%u", a[i]);

        /* print the number of characters written, then the string */
        printf("%d %s\n", t_written, str);

        free(str);
        return 0;
    }

  12. Neater post :)
    http://codepad.org/07vrA26F

  13. You can get a slightly more accurate count of digits by taking log10 + 1, so:

    int maxmagnitude(int *array, int len)
    {
        int i, max = -1;
        for(i = 0; i < len; i++)
            max = array[i] > max ? array[i] : max;

        return (int) log10(max) + 1;
    }

    and then when you allocate memory, you only need enough to hold the number of characters in the largest number, plus 2 (for the ',' and the ' '), times the length of the array.

  14. Wow, that got mangled...

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <math.h>

    /* digits in the largest element (assumes at least one element is > 0) */
    int maxmagnitude(int *array, int len)
    {
        int i, max = -1;
        for(i = 0; i < len; i++)
            max = array[i] > max ? array[i] : max;

        return (int) log10(max) + 1;
    }

    char *join(int *array, int len, char sep)
    {
        int i, max = maxmagnitude(array, len);

        char *string = (char *)(calloc(max + 2, len));
        /* digits + the two %c characters + sprintf's terminating nul */
        char *buf = (char *)(calloc(max + 3, 1));

        for (i = 0; i < len; i++) {
            int last = (i == (len - 1));
            sprintf(buf, "%d%c%c", array[i], last ? '\n' : sep, last ? '\0' : ' ');
            strcat(string, buf);
        }

        free(buf);
        return string;
    }

    int main()
    {
        int nums[] = {100293020,219399,310938,4,5,6,122,3,455};
        char *joined = join(nums, 9, ',');
        printf("JOINED: %s", joined);
        free(joined);
    }

  15. Hey,
    The Stack Overflow question already has 13 answers :)
    But I don't know C, so I don't know which is the best answer.
    Can you take a look?
    http://stackoverflow.com/questions/1745811/using-c-convert-a-dynamically-allocated-int-array-to-a-comma-separated-string-as

  16. http://codepad.org/BRSqL2Yb

    Accept that the cleanest way to do this in C is to have a limit on the length of the output string. Using realloc() to resize the string, returning the string as a char * and then making the caller free() it later on is butt ugly.
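
    For illustration only, here is one way the fixed-limit idea could look. This is a sketch and not the code behind the codepad link. It assumes a C99 snprintf, and format_list and its parameters are hypothetical names. The caller owns the buffer, so there is nothing to free.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats xs into the caller's buffer `out` of `cap`
   bytes. Returns 1 if every element fit, 0 if the list was cut short.
   The result is always nul-terminated and never ends in a partial number. */
int format_list(char *out, size_t cap, const unsigned *xs, size_t n)
{
    size_t len = 0;

    assert(out != NULL && cap > 0);
    assert(xs != NULL || n == 0);
    out[0] = '\0';

    for (size_t i = 0; i < n; ++i) {
        /* snprintf reports how many characters the element needs,
           even when there was not enough room left to write them. */
        int w = snprintf(out + len, cap - len, i == 0 ? "%u" : ",%u", xs[i]);
        if (w < 0 || (size_t)w >= cap - len) {
            out[len] = '\0';   /* drop the partly written element */
            return 0;
        }
        len += (size_t)w;
    }
    return 1;
}
```

    A caller would declare something like char line[256] on the stack, call format_list(line, sizeof line, xs, n), and log whatever fits.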

  17. Taras: Thanks a bunch for submitting it :) I really like the winning answer, but it uses snprintf(), and I have to use the Visual Studio libraries, which don't have it.
