Improving on my C (little?) understanding.

Message

tomazzi · #21 Post by **tomazzi** » 2016-07-11 23:22

The above is obviously a test - the question is: who is testing who...

reinob · #22 Post by **reinob** » 2016-07-12 06:55

tomazzi wrote: Size of pointer of any type is always the same as the size of void*

While I have never seen an example of the contrary, the standard (C99 at least) explicitly allows different sizes for pointers to different types. Casting to (void *) should not lose information, but not the other way around.

tomazzi · #23 Post by **tomazzi** » 2016-07-13 09:48

reinob wrote:
tomazzi wrote: Size of pointer of any type is always the same as the size of void*
While I have never seen an example of the contrary, the standard (C99 at least) explicitly allows different sizes for pointers to different types. Casting to (void *) should not lose information, but not the other way around.

Actually, the standard says about representation and alignment of pointers, not the size (except that sizeof(void*) is platform-dependand)
In fact, in the CPU hardware all pointers are the same, including function pointers.

Both alignment and representation of pointers is related to assembly or rather hardware level, not the C level.
Possible representation problem: some 64-bit CPUs can't in fact use 64bit pointers (f.e. on amd64 real size of void ptr is 48bits).

Alignment: some of architectures can read/write 32/64bit variables only from addresses which are multiple of 4 or 8 respectively, and the following example will cause crash/compiler warning/error, or at best performance loss:

Code: Select all

int32_t  var;
void    *vptr;
int32_t *ptr32 = 32;

vptr = ptr32;
var = *(++ptr32); /* read from address 36: OK */
var = *((int32_t*) (++vptr)); /* read from address 33: unaligned: crash/warning */

Regards.

luvr · #24 Post by **luvr** » 2016-07-22 14:46

edbarx wrote:

Code: Select all

*t++ means first increment the pointer, then dereference it,
otherwise this wouldn't work.

Hmmm... Isn't that the other way around? That is, first dereference the pointer, then increment it (which is known as "post-increment"). Incrementing first, then dereferencing is what wouldn't work, if I understand your code correctly.

srq2625 · #25 Post by **srq2625** » 2016-07-22 16:24

A couple of comments on style/readability/maintainability from days gone WAY past (thus, some may no longer be relevant):

Look into function prototypes - these can/are used to define both the return type of a function but also the parameter list (and parameter types) of the function.
Header files - create one (or more) such for your application and use that in all the files contained within your project. This, in conjunction with #1, will help to avoid parameter type/number mis-match.
Use "{" and "}" (I call these braces) to define code blocks - even blocks of only one line. It often happened to me that code I thought would only be one statement turned out to be many more than this. If one adds statements to a block and fails to define that block with braces, then all but the first statement will be either ignored, always executed, or cause syntax errors. For example (disregarding naming conventions for now - it's been a long time since I last coded in 'C'):
Code: Select all
```
    if(*pString != '\n'){
        do_this();
    }
    else {
        HowAboutThis();
        AndThisOne();
    }
```
versus
Code: Select all
```
    if(*pString != '\n'){
        do_this();
    }
    else 
        HowAboutThis();
        AndThisOne();
```
Imagine what would happen if you came back later and added the call to "AndThisOne()" and forgot to define the code block at that time - the function "AndThisOne()" would always be called. Not what was intended. The bugs introduced by this "error" are particularly hard to find when reading the code because we all read what we think we are seeing.

Anyway, just a couple of thoughts as you progress...

tomazzi · #26 Post by **tomazzi** » 2016-07-22 20:06

luvr wrote:
edbarx wrote:
Code: Select all
*t++ means first increment the pointer, then dereference it,
otherwise this wouldn't work.
Hmmm... Isn't that the other way around? That is, first dereference the pointer, then increment it (which is known as "post-increment"). Incrementing first, then dereferencing is what wouldn't work, if I understand your code correctly.

This is a bit tricky, but of course You are right.
The trick is, that postincrement has higher precedence than indirection (dereferencing), but the effect of postincrementing is visible for the next statement in the code (this is defined by the "sequence points" specification for C language). Otherwise it would be simply not different from pre-incrementing.

srq2625 wrote:3. Use "{" and "}" (I call these braces) to define code blocks - even blocks of only one line.

Good point, but code blocks (scopes) are in fact far more important:

Code: Select all

typedef struct {
   int  err_code;
   char err_desc[64];
}  err_t;

int func1(void) {
   err_t err1, err2;
   
   err1.err_code = some_glibc_fn();
   if (0 != err1.err_code) {
      strerror_r(err1.err_code, err1.err_desc, 64);
      return err1.err_code;
   }

   err2.err_code = some_glibc_fn();
   if (0 != err2.err_code) {
      strerror_r(err2.err_code, err2.err_desc, 64);
      return err2.err_code;
   }
}

int func2(void) {
   {
      err_t err1;
      err1.err_code = some_glibc_fn();
      if (0 != err1.err_code) {
         strerror_r(err1.err_code, err1.err_desc, 64);
         return err1.err_code;
      }
   }
   {
      err_t err2;
      err2.err_code = some_glibc_fn();
      if (0 != err2.err_code) {
         strerror_r(err2.err_code, err2.err_desc, 64);
         return err2.err_code;
      }
   }
}

The above functions are logically the same, but they are producing *completely* diffrent resulting code:
Func1 uses 136 bytes of stack memory and func2 only 68 bytes:
The compiler can't determine if both structures will be neded at runtime, so for both functions it must generate the code for allocating the stack memory for both err1 and err2.
But the key difference is, that in func2, the err2 structure will be allocated in the same stack memory space as the err1 (which is out-of scope when the code execution enters the scope of err2).

In other words: scopes allow to efficiently use/manage the stack memory.

Regards.

edbarx · #27 Post by **edbarx** » 2016-08-03 14:11

I am posting a working expression evaluator that I coded as an exercise to improve my proficiency in C. For those interested to give me feedback about its quality, I tell them there is NO need for them to examine every detail of the code. This is like giving one's opinion about a novel by reading parts of it.

Thanks for anyone helping me in this.

expr_calc.h:

Code: Select all

#ifndef _EXPR_CALC_H_
#define _EXPR_CALC_H_

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

#define NUMBER_TYPE double

#define token_delta 50
#define CHAR_SIZE sizeof(char)

#define _DEBUG

typedef enum {
  id_undef = -1, 
  id_op, /* operator */
  id_sign,
  id_br, /* bracket  */
  id_num
} t_idy;

typedef struct {
  t_idy idy;
  union {
    char cmd;
    NUMBER_TYPE num;
  };
} t_expr_token;

typedef t_expr_token* p_expr_token;

typedef struct {
  t_expr_token* tokens;
  int token_count;
  int malloc_token_count; 
  char* fatal_error_ptr;
  int error_pos;
  int memory_error; /* */
  
  double result;
} t_calc_struct;

t_calc_struct* create_calc(char* expr, int* error);
void init_calc(t_calc_struct* calc);
void free_calc(t_calc_struct* calc);

t_calc_struct* copy_calc(t_calc_struct* source);
int valid_calc_stack(t_calc_struct* calc);
NUMBER_TYPE calc_eval(t_calc_struct* calc);

#endif

expr_calc.c:

Code: Select all

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include "expr_calc.h"

void init_calc(t_calc_struct* calc) {
  calc->tokens = NULL;
  calc->token_count = 0;
  calc->malloc_token_count = 0; 
  calc->fatal_error_ptr = NULL;
  calc->error_pos = -1;
  calc->result = 0.0;
  calc->memory_error = 0;
}

/* returns 1 on success, 0 on failure */
static int malloc_tokens(t_calc_struct* calc) {
  void* realloc_res;
  int success = 0; 

  if (calc->malloc_token_count == 0) {
    calc->malloc_token_count = token_delta;
    calc->tokens = malloc(calc->malloc_token_count*sizeof(t_expr_token));
    
    if (calc->tokens != NULL)
      success = 1;
      else success = -1;
  } else {
    calc->malloc_token_count += token_delta;
    realloc_res = realloc(calc->tokens, calc->malloc_token_count*sizeof(t_expr_token)); 
    
    if (realloc_res != NULL)
      success = 2;
      else success = -2;
  }
  
  if (success == 1 || success == 2) {
    return 1;
  } else if (success < 0) {
    calc->malloc_token_count -= token_delta;
    return 0;
  }
  
  return 0; /* never reached */
}

t_calc_struct* copy_calc(t_calc_struct* source) {
  t_calc_struct* dest = malloc(sizeof(t_calc_struct));
  if (dest == NULL) return NULL;
  *dest = *source;
  
  int j;
  dest->tokens = malloc(source->malloc_token_count*sizeof(t_expr_token));
  if (dest->tokens == NULL) {
    free(dest);
    return NULL;
  }
  
  for (j = 0; j < source->token_count; j++)
    dest->tokens[j] = source->tokens[j];
  
  return dest;  
} 

void free_calc(t_calc_struct* calc) {
  free(calc->tokens);
  free(calc);
}

static void do_syntax_check(t_calc_struct* calc);

static char* save_number_to_tokens(t_calc_struct* calc, char* start) {
  char* end;
    
  if (start != NULL) {
    NUMBER_TYPE x = strtod(start, &end);
        
    calc->tokens[calc->token_count].idy = id_num;
    calc->tokens[calc->token_count].num = x;
    
    calc->token_count++;
    start = NULL;
  }
  
  return end;   
}

static void save_single_char_token(t_calc_struct* calc, char ch, t_idy id) {
  calc->tokens[calc->token_count].idy = id;
  calc->tokens[calc->token_count].cmd = ch;
  
  calc->token_count++;
}

static void tokenise(t_calc_struct* calc, char* expression) {
  char ch;              /* character at iterator's position */
  char* end;
      
  calc->token_count = 0;
  calc->error_pos = -1;
  calc->fatal_error_ptr = NULL;    
  if (calc->malloc_token_count == 0 && malloc_tokens(calc) < 0)
      return;    
      
  t_idy prev_id = id_undef, id; 
  while (expression) {
    if (calc->malloc_token_count == calc->token_count) {
      if (malloc_tokens(calc) < 0)
        calc->memory_error = 1;
        return;
    }
      
    ch = *expression;
    if (ch == '\0' || ch == '\n') break;
     
    switch (ch) {
      case '0':
      case '1':
      case '2':
      case '3':
      case '4':
      case '5':
      case '6':
      case '7':
      case '8':
      case '9':
      case '.':
        end = save_number_to_tokens(calc, expression);
        expression = end;
        prev_id = id_num;
        continue;    
    
      /* these char tokens are fully defined by themselves */
      case '(': 
      case ')':
        save_single_char_token(calc, ch, id_br);
        prev_id = id_br;
        break;
      
      /* these char tokens are fully defined by themselves */
      case '*':
      case '/':
      case '^':
        save_single_char_token(calc, ch, id_op);
        prev_id = id_op;
        break;
      
      case '+':
      case '-':
        if (
          prev_id == '*'           || 
          prev_id == '/'           ||
          *(expression - 1) == '(' ||
          prev_id == id_undef      ||
          prev_id == id_sign       ||
          prev_id == id_op
        ) id = id_sign;
          else id = id_op;
          
        save_single_char_token(calc, ch, id);
        prev_id = id;
        break;
        
      case ' ':
        expression++;
        continue;  
        
      default:
        calc->fatal_error_ptr = expression;  
    }
    
    expression++;
    if (calc->fatal_error_ptr) return; 
  }
  
  do_syntax_check(calc);
}

static void remove_tokens(t_calc_struct* calc, int from, int count) {
  int index = from, insert_pos = from;
  
  for (index = from + count; index < calc->token_count; index++) {
    calc->tokens[insert_pos] = calc->tokens[index];
    insert_pos++; 
  }  
} 

static void simplify_signs(t_calc_struct* calc, int from, int *to) {
  int index = from;
  int positive = 1, sign_found = 0;
  while (index <= *to) {
    while (calc->tokens[index].idy == id_sign) {
      sign_found++;
      if (calc->tokens[index].cmd == '-')
        positive = !positive;
        
      index++;
    }
    
    if (sign_found) {
      remove_tokens(calc, index - sign_found, sign_found);
      *to -= sign_found;
      calc->token_count -= sign_found;
      index -= sign_found;
      if (!positive) {
        calc->tokens[index].num = -(calc->tokens[index].num);
      }
      
      sign_found = 0;
      positive = 1;
      continue;
    }
      
    index++;
  }
}

static void do_syntax_check(t_calc_struct* calc) { 
  int index = -1;
  t_idy token_type;
  t_expr_token pred;
  
  calc->error_pos = -1;
  int brackets = 0;
  
  while (++index < calc->token_count) {
    token_type = calc->tokens[index].idy;
    switch (token_type) {
      case id_op:
        /* operator must be between first and last token excluding both ends */
        /* operator must succeed a number or a ')' */
        if (index > 0 && index < calc->token_count - 1) {
          pred = calc->tokens[index - 1];
          if (pred.idy == id_num ||(pred.idy == id_br && pred.cmd == ')'))
            continue;
        }
        
        calc->error_pos = index;  
        return;            
        
      case id_sign:
        /* sign must lie between token 0 and last token excluding the last */
        /* sign must succeed a '(', a sign and operator*/
        if (index == 0) continue;
        
        if (index >= 0 && index < calc->token_count - 1) {
          pred = calc->tokens[index - 1];
          if ( 
            pred.idy == id_sign || pred.idy == id_op ||
            (pred.idy == id_br && pred.cmd == '(')       
          ) continue;
        }
        
        calc->error_pos = index;  
        return;           
         
      case id_br:
        /* '(' can succeed op, sign, '(' */
        /* '(' must lie between token 0 and last one excluding the last token */
        
        /* ')' can succeed ')' and number */
        /* ')' must lie from token 2 onwards */
        
        if (calc->tokens[index].cmd == '(') {
          brackets++;
          if (index == calc->token_count - 1) {
            calc->error_pos = index;
            return; 
          }
            
          if (index == 0) continue;
          
          pred = calc->tokens[index - 1];
          if (
            pred.idy == id_op || pred.idy == id_sign ||
            (pred.idy == id_br && pred.cmd == '(')
          ) continue;        
        } else {
          brackets--;
          if (index < 2) { 
            calc->error_pos = index;
            return; 
          }
          
          if (brackets < 0) {
            calc->error_pos = index;  
            return;  
          }  
          
          pred = calc->tokens[index - 1];
          if (pred.idy == id_num || (pred.idy == id_br && pred.cmd == ')'))
            continue;
        }
        
        calc->error_pos = index;  
        return;   
        
      case id_num:
        /* number can succeed operator, sign and '(' */
        /* number can be first and last token */
        
        if (index == 0 || index == calc->token_count - 1)
          continue;
          
        pred = calc->tokens[index - 1];
        if (
          pred.idy == id_op || pred.idy == id_sign || 
          (pred.idy == id_br && pred.cmd == '(')
        ) continue;
        
        calc->error_pos = index;  
        return;
        
      case id_undef:
        calc->error_pos = index;
        return;        
    }
  }
  
  if (brackets > 0) {
    calc->error_pos = index + 1;
  }
}

static void do_binary_op(t_calc_struct* calc, char op, int from, int *to) {
  NUMBER_TYPE tmp_res;
  int index = from + 1;
  while (index < *to) {
    if (calc->tokens[index].idy == id_op && calc->tokens[index].cmd == op) {
      switch (op) {
        case '^':
          tmp_res = pow(calc->tokens[index - 1].num, calc->tokens[index + 1].num);
          break;
          
        case '*':
          tmp_res = calc->tokens[index - 1].num * calc->tokens[index + 1].num;
          break;
          
        case '/':
          tmp_res = calc->tokens[index - 1].num / calc->tokens[index + 1].num;
          break;
          
        case '+':
          tmp_res = calc->tokens[index - 1].num + calc->tokens[index + 1].num;
          break;
          
        case '-':
          tmp_res = calc->tokens[index - 1].num - calc->tokens[index + 1].num;
          break;
      }
      
      /* save tokens at */
      remove_tokens(calc, index, 2);
      calc->token_count -= 2;
      
      calc->tokens[index - 1].idy = id_num;
      calc->tokens[index - 1].num = tmp_res;
      
      *to -= 2;
    } else index += 2;
  }
}

static NUMBER_TYPE evaluate_fn(t_calc_struct* calc) {
  if (calc->fatal_error_ptr || calc->error_pos > -1 || calc->token_count == 0) return 0.0;
  
  int open_br = -1, close_br = -1, index = 0;
  /* search tokens for ')' starting from token 0 */
  while (index < calc->token_count) {
    if (calc->tokens[index].idy == id_br && calc->tokens[index].cmd == ')') {
      close_br = index;
      break;
    }
    
    index++;
  }
  
  /* search tokens for '(' searching backwards from close_br */
  index--;
  while (index > -1) {
    if (calc->tokens[index].idy == id_br && calc->tokens[index].cmd == '(') {
      open_br = index;
      break;
    }
    
    index--;
  }
  
  int to = -1;
  /* do the actual computation */
  if (close_br > -1 && open_br > -1) {
    to = close_br - 1;
    simplify_signs(calc, open_br + 1, &to);
    do_binary_op(calc, '^', open_br + 1, &to);
    do_binary_op(calc, '/', open_br + 1, &to);
    do_binary_op(calc, '*', open_br + 1, &to);
    do_binary_op(calc, '+', open_br + 1, &to);
    do_binary_op(calc, '-', open_br + 1, &to);
    remove_tokens(calc, to + 1, 1);
    remove_tokens(calc, open_br, 1);
    calc->token_count -= 2;
  } else {
    to = calc->token_count - 1;
    simplify_signs(calc, 0, &to);
    do_binary_op(calc, '^', 0, &to);
    do_binary_op(calc, '/', 0, &to);
    do_binary_op(calc, '*', 0, &to);
    do_binary_op(calc, '+', 0, &to);
    do_binary_op(calc, '-', 0, &to);
  }
  
  if (calc->token_count > 1)
    evaluate_fn(calc);
    
  if (calc->tokens[0].idy == id_num)
    return calc->tokens[0].num;
    else return 0.0;    
}

int valid_calc_stack(t_calc_struct* calc) {
  return (calc->error_pos == -1 && calc->fatal_error_ptr == NULL);
}

t_calc_struct* create_calc(char* expr, int* error) {
  t_calc_struct* calc = malloc(sizeof(t_calc_struct));
  if (calc == NULL) return NULL;
  
  init_calc(calc);
  tokenise(calc, expr);
  if (error != NULL) *error = calc->error_pos;
  return calc;
}

NUMBER_TYPE calc_eval(t_calc_struct* calc) {
  t_calc_struct* tmp_calc =copy_calc(calc);
  if (tmp_calc == NULL) {
   /* TODO: must set a variable indicating memory failure */ 
    return 0.0;
  }
  
  NUMBER_TYPE x = evaluate_fn(tmp_calc);
  free_calc(tmp_calc);
  return x;
}

The testing program:

Code: Select all

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include "expr_calc.h"


int main() {
  char input[1024];
  fgets(input, 1023, stdin);
  
  t_calc_struct* calc = create_calc(input, NULL);
  
  #ifdef DEBUG  
  int i = 0;
  for (i = 0; i < calc->token_count; i++) {
    switch (calc->tokens[i]->idy) {
      case id_sign:
        printf("sign: %c\n", calc->tokens[i]->cmd);
        break;
      case id_op:
        printf("operat: %c\n", calc->tokens[i]->cmd);
        break;
      case id_br:
        printf("bracket: %c\n", calc->tokens[i]->cmd);
        break;
      
      case id_num:
        printf("number: %f\n", calc->tokens[i]->num);
        break;
        
      case id_undef:
        printf("something strange happened at index %d\n", i);
        break;  
    }
  }
  #endif 
  
  NUMBER_TYPE x; 
  if (valid_calc_stack(calc)) {
    x = calc_eval(calc);
    printf("result = %f\n", x);
  } else printf("error_pos = %d\n", calc->error_pos);
  
  free_calc(calc);

  return 0;
}

tomazzi · #28 Post by **tomazzi** » 2016-08-04 02:42

edbarx wrote:...
For those interested to give me feedback about its quality, I tell them there is NO need for them to examine every detail of the code.

The devil is in the details, Ed.

Anyway, this code is far better than the previous.

Regards.

Debian User Forums

Improving on my C (little?) understanding.

Re: Improving on my C (little?) understanding.

Re: Improving on my C (little?) understanding.

Re: Improving on my C (little?) understanding.

Re: Improving on my C (little?) understanding.

Re: Improving on my C (little?) understanding.

Re: Improving on my C (little?) understanding.

Re: Improving on my C (little?) understanding.

Re: Improving on my C (little?) understanding.