CERT Secure Coding

STR34-C. Cast characters to unsigned char before converting to larger integer sizes

Signed character data must be converted to unsigned char before being assigned or converted to a larger signed type. This rule applies to both signed char and (plain) char characters on implementations where char is defined to have the same range, representation, and behaviors as signed char.

However, this rule is applicable only in cases where the character data may contain values that can be misinterpreted as negative numbers. For example, if the char type is represented by a two's complement 8-bit value, any character value greater than +127 is interpreted as a negative value.
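The effect can be illustrated with a minimal sketch. The helper names widen_naive and widen_safe are hypothetical, and signed char is used explicitly so that the result does not depend on whether plain char is signed on a given implementation. The values in the comments assume an 8-bit char.

```c
/* Hypothetical helper: widen a character WITHOUT the cast.
 * The conversion to int preserves the (possibly negative) value. */
int widen_naive(signed char ch) {
  return ch;                 /* e.g. -23 stays -23 */
}

/* Hypothetical helper: widen a character WITH the cast this rule asks for.
 * The result is always in the range 0..UCHAR_MAX. */
int widen_safe(signed char ch) {
  return (unsigned char)ch;  /* e.g. -23 becomes 233 when CHAR_BIT is 8 */
}
```

With an 8-bit signed char, the byte 0xE9 holds the value -23, so widen_naive returns -23 while widen_safe returns 233.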

This rule is a generalization of STR37-C. Arguments to character-handling functions must be representable as an unsigned char.

Noncompliant Code Example

This noncompliant code example is taken from a vulnerability in bash versions 1.14.6 and earlier that led to the release of CERT Advisory CA-1996-22. This vulnerability resulted from the sign extension of character data referenced by the c_str pointer in the yy_string_get() function in the parse.y module of the bash source code:

Non-compliant code
static int yy_string_get(void) {
  register char *c_str;
  register int c;

  c_str = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist or is empty, EOF found */
  if (c_str && *c_str) {
    c = *c_str++;
    bash_input.location.string = c_str;
  }
  return (c);
}

The c_str variable is used to traverse the character string containing the command line to be parsed. As characters are retrieved from this pointer, they are stored in a variable of type int. For implementations in which the char type is defined to have the same range, representation, and behavior as signed char, this value is sign-extended when assigned to the int variable. For character code 255 decimal (−1 in two's complement form), this sign extension results in the value −1 being assigned to the integer, which is indistinguishable from EOF.

Noncompliant Code Example

This problem can be repaired by explicitly declaring the c_str variable as unsigned char:

Non-compliant code
static int yy_string_get(void) {
  register unsigned char *c_str;
  register int c;

  c_str = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist or is empty, EOF found */
  if (c_str && *c_str) {
    c = *c_str++;
    bash_input.location.string = c_str;
  }
  return (c);
}

This example, however, violates STR04-C. Use plain char for characters in the basic character set.

Compliant Solution

In this compliant solution, the result of the expression *c_str++ is cast to unsigned char before assignment to the int variable c:

Compliant code
static int yy_string_get(void) {
  register char *c_str;
  register int c;

  c_str = bash_input.location.string;
  c = EOF;

  /* If the string doesn't exist or is empty, EOF found */
  if (c_str && *c_str) {
    /* Cast to unsigned type */
    c = (unsigned char)*c_str++;

    bash_input.location.string = c_str;
  }
  return (c);
}
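Because yy_string_get() depends on bash internals, the same pattern may be easier to try out in isolation. The following is a minimal self-contained sketch, not the bash code: the function get_safe and its pointer-to-pointer parameter are hypothetical stand-ins for the bash_input.location.string cursor.

```c
#include <stdio.h>

/* Hypothetical stand-in for yy_string_get(): returns the next character of
 * the string at *pp as a non-negative int and advances *pp, or returns EOF
 * when the string is missing or exhausted. */
int get_safe(const char **pp) {
  int c = EOF;

  if (*pp && **pp) {
    /* Cast to unsigned char so the result can never be negative */
    c = (unsigned char)*(*pp)++;
  }
  return c;
}
```

Assuming an 8-bit char, a byte of 0xFF is returned as 255 regardless of whether plain char is signed, so it cannot be confused with a negative EOF. EOF is returned only once the terminating null character is reached.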

Noncompliant Code Example

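In this noncompliant code example, the cast of *s to unsigned int can result in a value in excess of UCHAR_MAX: when char is signed and *s is negative, the conversion wraps the value modulo UINT_MAX + 1 instead of mapping it into the range 0 to UCHAR_MAX. Using that value as a subscript is a violation of ARR30-C. Do not form or use out-of-bounds pointers or array subscripts: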

Non-compliant code
#include <limits.h>
#include <stddef.h>
 
static const char table[UCHAR_MAX + 1] = { 'a' /* ... */ };

ptrdiff_t first_not_in_table(const char *c_str) {
  for (const char *s = c_str; *s; ++s) {
    if (table[(unsigned int)*s] != *s) {
      return s - c_str;
    }
  }
  return -1;
}

Compliant Solution

This compliant solution casts the value of type char to unsigned char before the implicit promotion to a larger type:

Compliant code
#include <limits.h>
#include <stddef.h>
 
static const char table[UCHAR_MAX + 1] = { 'a' /* ... */ };

ptrdiff_t first_not_in_table(const char *c_str) {
  for (const char *s = c_str; *s; ++s) {
    if (table[(unsigned char)*s] != *s) {
      return s - c_str;
    }
  }
  return -1;
}
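The difference between the two casts can be seen by computing only the subscript. This is a minimal sketch with hypothetical helper names. It uses signed char explicitly to model an implementation where plain char is signed.

```c
#include <limits.h>
#include <stddef.h>

/* Subscript as computed by the noncompliant cast: a negative character
 * wraps modulo UINT_MAX + 1 and lands far outside the table. */
size_t index_via_unsigned_int(signed char ch) {
  return (unsigned int)ch;
}

/* Subscript as computed by the compliant cast: the value is reduced
 * modulo UCHAR_MAX + 1, so it is always a valid index into a table of
 * UCHAR_MAX + 1 elements. */
size_t index_via_unsigned_char(signed char ch) {
  return (unsigned char)ch;
}
```

For a character with the value -1, the first helper yields UINT_MAX and the second yields UCHAR_MAX.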

Exceptions

STR34-C-EX1: This rule only applies to characters that are to be treated as unsigned chars for some purpose, such as being passed to the isdigit() function. Characters that hold small integer values for mathematical purposes need not comply with this rule.
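The distinction drawn by this exception might look as follows in practice. Both helpers are hypothetical. The first treats its input as character data and therefore casts before calling isdigit(). The second uses signed char as a small signed integer, where preserving the sign is the intended behavior.

```c
#include <ctype.h>
#include <stddef.h>

/* Character data: cast to unsigned char before passing to isdigit(). */
size_t count_digits(const char *s) {
  size_t n = 0;

  for (; *s; ++s) {
    if (isdigit((unsigned char)*s)) {
      ++n;
    }
  }
  return n;
}

/* Small integers stored in signed char: no cast is needed, because
 * negative values are meaningful and must keep their sign. */
int sum_offsets(const signed char *offsets, size_t len) {
  int total = 0;

  for (size_t i = 0; i < len; ++i) {
    total += offsets[i];
  }
  return total;
}
```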

Risk Assessment

Conversion of character data resulting in a value in excess of UCHAR_MAX is an often-missed error that can result in a disturbingly broad range of potentially severe vulnerabilities.

| Rule | Severity | Likelihood | Detectable | Repairable | Priority | Level |
| --- | --- | --- | --- | --- | --- | --- |
| STR34-C | Medium | Probable | Yes | No | P8 | L2 |

Automated Detection

| Tool | Version | Checker | Description |
| --- | --- | --- | --- |
| Astrée | 25.10 | char-sign-conversion | Fully checked |
| Axivion Bauhaus Suite | 7.2.0 | CertC-STR34 | Fully implemented |
| CodeSonar | 9.1p0 | MISC.NEGCHAR | Negative Character Value |
| Compass/ROSE | | | Can detect violations of this rule when checking for violations of INT07-C. Use only explicitly signed or unsigned char type for numeric values |
| Coverity | 2017.07 | MISRA C 2012 Rule 10.1, MISRA C 2012 Rule 10.2, MISRA C 2012 Rule 10.3, MISRA C 2012 Rule 10.4 | Implemented. Essential type checkers |
| Cppcheck Premium | 24.11.0 | premium-cert-str34-c | |
| ECLAIR | 1.2 | CC2.STR34 | Fully implemented |
| GCC | 2.95 and later | -Wchar-subscripts | Detects objects of type char used as array indices |
| Helix QAC | 2025.2 | C2140, C2141, C2143, C2144, C2145, C2147, C2148, C2149, C2151, C2152, C2153, C2155, C++3051 | |
| Klocwork | 2025.2 | CXX.CAST.SIGNED_CHAR_TO_INTEGER | |
| LDRA tool suite | 9.7.1 | 434 S | Partially implemented |
| Parasoft C/C++test | 2025.2 | CERT_C-STR34-b | Cast characters to unsigned char before assignment to larger integer sizes |
| | | CERT_C-STR34-c | An expressions of the 'signed char' type should not be used as an array index |
| | | CERT_C-STR34-d | Cast characters to unsigned char before converting to larger integer sizes |
| PC-lint Plus | 1.4 | 571 | Partially supported |
| Polyspace Bug Finder | R2025b | CERT C: Rule STR34-C | Checks for misuse of sign-extended character value (rule fully covered) |
| RuleChecker | 25.10 | char-sign-conversion | Fully checked |
| TrustInSoft Analyzer | 1.38 | out of bounds read | Partially verified (exhaustively detects undefined behavior). |

CVE-2009-0887 results from a violation of this rule. In Linux PAM (up to version 1.0.3), the libpam implementation of strtok() casts a (potentially signed) character to an integer for use as an index to an array. An attacker can exploit this vulnerability by inputting a string with non-ASCII characters, causing the cast to result in a negative index and accessing memory outside of the array [xorl 2009].
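The class of defect described here can be avoided in a delimiter lookup along the following lines. This is an illustrative sketch only and is not the libpam code: the names delim_map, set_delims, and is_delim are hypothetical.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical delimiter table with one slot per unsigned char value. */
static unsigned char delim_map[UCHAR_MAX + 1];

/* Mark every character of delims as a delimiter. */
void set_delims(const char *delims) {
  memset(delim_map, 0, sizeof delim_map);
  for (; *delims; ++delims) {
    /* Cast before indexing so non-ASCII bytes cannot go negative */
    delim_map[(unsigned char)*delims] = 1;
  }
}

/* Return nonzero if ch was registered as a delimiter. */
int is_delim(char ch) {
  return delim_map[(unsigned char)ch];
}
```

Indexing with the uncast char instead would produce a negative subscript for bytes above 127 on implementations where plain char is signed.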

Search for vulnerabilities resulting from the violation of this rule on the CERT website.

CERT C Secure Coding Standard
  STR37-C. Arguments to character-handling functions must be representable as an unsigned char
  STR04-C. Use plain char for characters in the basic character set
  ARR30-C. Do not form or use out-of-bounds pointers or array subscripts
ISO/IEC TS 17961:2013
  Conversion of signed characters to wider integer types before a check for EOF [signconv]
MISRA-C:2012
  Rule 10.1 (required)
  Rule 10.2 (required)
  Rule 10.3 (required)
  Rule 10.4 (required)
MITRE CWE
  CWE-704, Incorrect Type Conversion or Cast

Bibliography

[xorl 2009] CVE-2009-0887: Linux-PAM Signedness Issue