Introduction
Quick summary of who is behind this material:
- Author: Pasi Sarolahti
- Web page created using scripts by Markus Holmström
- Acknowledgments: the TMC developers (Matti Luukkainen, Jarmo Isotalo, Tony Kovanen, Martin Pärtel) have been of great help in setting up and maintaining the TMC system. Some of the course material and examples are based on the earlier version of the course by Raimo Nikkilä. The past and present course assistants (including Riku Lääkkölä, Konsta Hölttä, Essi Jukkala, Tero Marttila) have fixed errors and otherwise improved the content and exercises.
Foreword
This material is not intended as a complete reference on C; it aims to contain sufficient information to get started with C programming. For additional and more complete information, it is recommended that you get a book that covers the various aspects in more detail. A very commonly used reference is "The C Programming Language" by Brian W. Kernighan and Dennis M. Ritchie (currently in its 2nd edition). In this material we typically refer to this book simply as "the K&R book".
An important part of the course (and generally, in learning C) is to do small programming exercises that are embedded within this material. The exercises are designed so that if you have read the text from the beginning without jumping around, you should be able to do the exercise with the information you have read until that point. Therefore, when you encounter an exercise, you could stop reading and try to do the exercise.
Doing the exercises roughly follows this cycle (see Instructions for more details):
1. Write some code in an editor.
2. Compile the code using the Makefiles provided. If there are compile errors or warnings, resolve them by modifying your code until the compiler produces no errors or warnings.
3. Execute and test the code by running src/main in the exercise directory. If something works in an unexpected way, go back to step 1.
4. Run the local TMC tests (may not work in Windows). If there are failures, try to figure out what is wrong and go back to step 1.
5. If the local tests passed, submit your code to the TMC server. If the server tests pass, you have completed the exercise. Note that the server may run different tests than the local checker. If there were failures, go back to step 1.
There will be about 70-80 exercises altogether in the course, so the above process will become familiar to you.
Happy hacking!
Introduction to C
The Wikipedia article on the C language gives a succinct summary of the relevant properties of C, and its relationship to other common programming languages.
A C program consists of (usually several) functions, written in one or more text-based source code files. These files are processed by a compiler and linker that produce a binary executable file understood by the underlying computer. While the text-based C programs are intended to be portable, i.e., the same code should work on different computers, the binary executable is specific to the architecture it was compiled for (for example, Intel-based 64-bit Linux). When moving the program code to a different machine, it therefore needs to be recompiled for that machine. Also, every time you modify the source code, you will need to recompile it. This is a significant difference from higher-level interpreted languages, such as Python.
Building an executable from C source code happens in the following distinct phases, in the following order:
- Preprocessor: processes macros, inclusion of header files, conditional compile instructions, etc., to prepare the source code for actual compiling into binary code.
- Compiler: compiles the preprocessed, text-based code into native binary object code. The object code still contains symbols for external references (for example, functions implemented in other libraries), and cannot be executed as such. If a C program is split into multiple source code files, a separate object file (with .o suffix) is produced from each source code file (marked with .c suffix).
- Linker: links together the different object files and resolves the references, producing the actual executable that can be run in the system.
Typically a C compiler, such as gcc, performs all of the above steps. For example, typing gcc -o exec source1.c source2.c on a command line shell reads source files source1.c and source2.c, executes the above three steps on them, and produces a binary executable called "exec" that can be run on the command line. However, by using different command line options, the gcc compiler can be instructed to perform only a selection of the above phases.
In addition to source code files with .c suffix, a C program usually uses header files with .h suffix. These contain definitions of data types and functions needed by the C program, and they enable using definitions that are external to a particular source code file. A C program takes these definitions into use with a series of #include preprocessor directives that are located at the beginning of a C source code file.
There are two kinds of header files: user-defined headers and the headers needed for using system libraries. The #include directive format differs a bit in these two cases (e.g., #include "source.h" vs. #include <stdio.h>). During the first module you do not need to worry much about these -- the exercise templates contain the necessary include statements.
There are different ways of working with C code. You can edit the source code files using a separate text editor such as kate, emacs or vi (all installed in Aalto Linux systems), and use the command line shell to compile and test the code. Alternatively, you can use an Integrated Development Environment (IDE) that has an integrated graphical user interface for editing code, compiling it, debugging it, and so on. Either of these alternatives can be used on this course. See the Instructions for more details.
The First C Program
Below is a very simple C program. It prints one line of text to the screen and then finishes.
```c
#include <stdio.h>

int main(void)
{
    /* The following line will print out some text */
    printf("Hey! How are you?\n");
}
```
The first line states that the program uses the definitions for "Standard I/O" functions provided by the standard library. This header is needed, for example, for printing text on the screen. We will look into these functions in more detail later.
Every executable C program must have the main function that the system calls when the program execution is started. In the above example, line 3 starts the main function.
The definition of the function is included in a block enclosed in braces (from line 3 to 7), and consists of one or (usually) more statements. Each statement ends with a semicolon. In this example we only have one statement on line 6, calling the printf function that outputs text on the screen. On line 5 there is a descriptive comment enclosed inside /* and */ markers. Such comments can contain any free-form text, and are intended for the programmer to leave clarifying notes for the reader of the code. The compiler ignores anything inside the comment marks.
The 'printf' function shown in the example outputs text on the screen. It is defined in the Standard I/O library, which is why we had to begin the program with the #include directive. In this example the printf function takes one parameter (inside parentheses), which is a string that is written to the screen. The string ends with a newline character (\n), causing the following output to be printed on the next line. The 'printf' function call, like all other C statements, ends in a semicolon (;). Forgetting the semicolon is a common beginner mistake that causes the compilation of the source code to fail.
As you might guess, the program prints "Hey! How are you?" on the screen. You can try this program yourself to practice using the compiler.
C is liberal about the formatting of the source code. The spaces or line feeds in the code do not affect the program logic. In an extreme case the whole program could be written on a (very long) single line. However, even though the C compiler does not care about formatting, it is important to follow a clear and consistent programming style. Otherwise the program code is very difficult to read. Some rules of thumb for clear style:
- Indent code based on program blocks: whenever a new block of statements is started, indent the start of the line by a consistent amount. Always use consistent spacing (for example 4 spaces for every indent, or one tab for every indent).
- Apply a clear and consistent style in naming variables and functions. They should be descriptive enough, but very_long_local_variable_names are usually not a good idea.
- Divide the program into logical, not-very-long functions, instead of writing everything in the main function.
- Use comments to explain logic that might not be easy for someone else to follow by just reading the code.
Text editors, especially in IDEs, often try to assist the programmer in following consistent style, for example by applying indentation automatically (sometimes to the point of irritation, if the programmer disagrees with the style).
Let's try the first exercise, as described below.
Task 02-intro-1: Hi C! (1 pts)
Objective: Test that your development environment works and that you can return exercises to TMC. Get an initial feel for the printf function.
Implement function three_lines (in source.c of the TMC exercise template) that prints three lines in the following way. Also the last line (March) should be followed by a new line:
```
January
February
March
```
Data Types and Variables
In C (and many other programming languages) data is stored in variables. Each variable has a name and a data type that determines what values can be stored in the variable. C applies static type checking, meaning that compliance with the declared data types is checked at compile time. Therefore, before a variable can be used, it must be declared with an indication of its data type.
Variable names in C are case-sensitive. A name can consist of alphabetic characters, numbers and the underscore (_), but must not start with a number. These naming rules apply to all the different types of names in C: functions, data types, and so on.
Integer data types
Integers are perhaps the most common data type in C (although this depends on the application area). There are different types of integers in C, differing in the amount of memory space they require, and consequently the number range they can represent. The integer data types are the following:
- char -- size: 8 bits (1 byte), signed values from -127 to 127, unsigned values from 0 to 255.
- short int -- 16 bits (2 bytes), signed values from -32767 to 32767, unsigned values from 0 to 65535
- int -- at least 16 bits, typically 32 bits (4 bytes), signed values from -(2^31 - 1) to 2^31 - 1, unsigned values from 0 to 2^32 - 1
- long int -- at least 32 bits, can be 64 bits (8 bytes)
- long long int -- 64 bits, signed values from -(2^63 - 1) to 2^63 - 1, unsigned values from 0 to 2^64 - 1
For each basic data type, a declaration can contain the signed or unsigned keyword (before the actual type) to specify whether the data type is intended only for non-negative values, or also for negative values. If this is not specified, the default is a signed type, except in the case of char, where the default behavior is implementation dependent. Unfortunately, the basic integer types do not always have the same range; for example, in old implementations the int type can be shorter than in modern implementations.
For long int and short int, the shorter forms long and short can be (and usually are) used.
Below are examples of a few variable declarations. When a variable is declared, it can be given an initial value, or it can be left uninitialized. If a variable is not initialized, its initial value is unknown, and reading it before a value has been assigned results in unpredictable program behavior. Therefore it is recommended to initialize a variable when declaring it, when possible.
```c
int main(void)
{
    char varA = -50;
    unsigned char varB = 200;
    unsigned char varB2 = 500;  // Exceeds the value range: compiler warns, value is truncated
    int varC;  // ok, but initial value is unknown
    long varD = 100000;
}
```
The above example also shows another way of using comments: a single-line comment can start with two slashes (//). Such a comment ends at the end of the line. Because C is liberal about formatting, we can add a comment on the same line as other code.
One problem with the basic data types presented above is that the exact size of the variable can differ between architectures. The C99 standard also includes new, fixed size integer definitions that improve the portability of programs between architectures. These are defined in the stdint.h header, and are as follows:
- uint8_t, int8_t: unsigned and signed 8-bit integer
- uint16_t, int16_t: unsigned and signed 16-bit integer
- uint32_t, int32_t: unsigned and signed 32-bit integer
- uint64_t, int64_t: unsigned and signed 64-bit integer
Constants are fixed values defined by the programmer when writing the code. By default integer constants assume int data type, i.e., they can represent a 32-bit value range in modern systems. Above we saw constants -50, 200, 500 and 100000. By default constants follow the decimal (base 10) number system, but there is a representation format for giving octal (base 8) constants, and for giving hexadecimal (base 16) constants. Octal constants start with digit 0. Hexadecimal constants are prefixed with 0x. In the early parts of the course operating with decimal numbers is sufficient, but later we take a closer look at hexadecimal notation. Below are examples of each of these notations.
```c
short a = 012;              /* set variable a to octal 012, equal to decimal 10 */
short b = -34;              /* just using decimal number here */
unsigned short c = 0xffff;  /* hexadecimal constant, equal to decimal 65535 (needs an unsigned 16-bit type) */
```
If a constant of the long data type is needed, 'L' needs to be added to the end of the number: long la = 10000000000L;.
Even though C is statically typed, the types are not strictly enforced, and the type of a value is implicitly converted when, for example, assigning an integer constant to a char variable. For example, assigning value 1000 to a char variable is possible, but as it is likely incorrect code, the compiler will warn about this. If the programmer ignores this warning, the actual value of the variable will become equivalent to the 8 lowest bits of decimal 1000, which is a different number.
Floating point numbers
For presenting large numbers, or fractions of integers, floating point data types can be used. Internally, a floating point number is built from three components: the sign bit, significand, and exponent. The number is then composed of these three parts in the following way:
number = (-1)^sign * 1.significand * 2^exponent
Because of the way floating point numbers are stored in binary memory, they cannot cover a continuous number space. Therefore floating point calculations do not always give exactly correct results; sometimes a value "close" to the correct result comes out. In addition, computation with floating point numbers is typically slower than with integers. Therefore integers are often used in C, and floating point numbers are only used when the integer value range is not sufficient. Additional details can be found in a related Wikipedia article.
There are three floating point data types, differing in how many bits are allocated to the above three components:
- float -- 32 bits (1b sign + 23b significand + 8b exponent)
- double -- 64 bits (1b sign + 52b significand + 11b exponent)
- long double -- 80 or 128 bits
The constants for floating point numbers can either use the conventional decimal format (e.g., 1.543), or exponent format (1e-2), or combination of both. The default data type for floating point constant is double, but if the constant is suffixed by 'F', it is assumed to be of type float. For example:
```c
float d = 0.534;
double e = 2e10;
float g = 0.111F;
```
String and character constants
String constants are included in double quotes, as we saw together with the printf call in the first example. Operating with strings in C requires understanding arrays and pointers, and therefore we defer that to Module 2 for now.
The characters shown to the user in C follow the ASCII encoding scheme. There are also various other encoding schemes, but the common property in all of them is that each visible character has a numeric character code. For example, in ASCII, the letter 'A' is stored in the system memory as the decimal number 65, but shown as 'A' when printed to the screen as a character. C supports character constants to make it easier to operate with ASCII-encoded characters. Character constants are enclosed in apostrophes ('):
int char_A = 'A';
It is important to make a distinction between string constants ("text") and character constants ('t'), because they stand for different data types. Character constants are of int type, similar to normal integer constants, and strings are arrays of char variables (as will be discussed in Module 2).
Note that constants '1' and 1 are different in C: '1' is the same as decimal number 49 according to the ASCII system, whereas 1 is just decimal number 1, which does not correspond to any visible character when interpreted as ASCII. Both are integers nonetheless.
Arithmetic operators
Above we have seen cases of the assignment operator (=) when initializing variables together with their declaration. Assignment can also be done separately from the declaration, and a previously used variable can be re-assigned -- after the following three lines, variable var has value 20:
```c
int var;
var = 10;
var = 20;
```
Operators + (plus), - (minus), * (multiply), / (divide) and % (modulus) are used as normal mathematical operators. As customary, + and - have lower precedence than *, / and % (i.e., the latter are evaluated first, regardless of their position in the expression). The modulus operator can only be applied to integers, but the others work for both floating point numbers and integers. Parentheses can be used to control the precedence (order of computation) as taught in school. Here are a few examples:
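```c
float fa = 5.0 / 2;     /* using '5.0' to distinguish floating point constant from integer constant: result is 2.5 */
int ia = 5 / 2;         /* different result than above because this is integer division: result is 2 */
char cb = 3 * (1 + 2);  /* parentheses are evaluated first: result is 9 */
long lc = cb * fa;      /* 9 * 2.5 = 22.5, truncated to 22 when stored in a long */
```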
The above example also shows that multiple operators and expressions can be used to form a single statement -- here together with variable declaration and its initialization.
C provides an alternative unary way for incrementing or decrementing the value of a variable by one, by using increment and decrement operators. These operators take either postfix or prefix form. In postfix form, a++ increments the value of a by one, and a-- decrements the value of a by one. In prefix form, these operators are ++a and --a. The functionality in postfix and prefix forms is not completely equivalent: in postfix form the value of the expression is evaluated before the adjustment to the variable takes place, but in prefix form the value is evaluated after the adjustment. This can have significance when the unary increment/decrement operator is used as part of a longer expression.
Another alternative is to use assignment operators, such as a += 2, which is equivalent to a = a + 2. The assignment operator form works for all of the above arithmetic operators.
The example below shows these operators in use:
```c
int main(void)
{
    int varA;   /* Value is unspecified now */
    varA = 10;  /* value is set to 10 */
    varA++;     /* value is 11 */
    varA *= 2;  /* value is 22 */
}
```
Type conversions
Because C can do implicit type conversions between variables, there can be multiple data types in a single expression. When a larger data type is converted into a smaller one, the excess high-order bits are dropped, and therefore the value may change. When a float is assigned to an integer, the decimals are truncated.
Conversions can be forced explicitly using a type cast, by writing the intended data type in parentheses before an expression. Sometimes this can affect the outcome of the expression, as happens in the following example:
```c
float f = 1.5;
int a = f + f;
int b = (int) f + (int) f;
```
The above program causes the value of 'a' to be 3, while the value of 'b' is 2. The first assignment to variable 'a' calculates 1.5 + 1.5 = 3 (as a floating point number), which is automatically converted to an integer as part of the assignment to 'a'. In the second case (assignment to 'b'), the value of 'f' is first converted to an integer on both sides of the plus operation, which causes its value to change from 1.5 to 1. After this the result becomes 2. Type casts are normally not necessary in simple programs, but are sometimes unavoidable.
Task 02-intro-2: Fix the types (1 pts)
Objective: Get a first touch on the proper use of basic data types in C.
The exercise template contains the function fix_types, which makes three calculations and outputs the results. The purpose is to print the first result at a precision of one decimal, and the latter two results as integers. Unfortunately the function outputs incorrect results.
Fix the function in such way that it prints correct results. The expected output is:
5.3 8000000 66666
Do not touch the printf line, but correct the data types defined for the variables in the function.
Functions
C programs are organized in functions. A function contains a single logical part of the program, and it can be called from other functions of the program (or from within the function itself). Use of functions enables reusing code: in a well designed program any particular part of program logic only needs to be implemented once in a single function, which is then called from other parts of the code.
A function has four main components: name, return value, argument declarations, and the body of the function. Below is an example of a simple definition of function 'square' that multiplies argument 'base' by itself and returns it as the result of the function (lines 1-5). Below the 'square' function definition there is the main function that has three calls to the square function (lines 9-11).
```c
int square(int base)
{
    int res = base * base;
    return res;
}

int main(void)
{
    int val = square(3);
    int val2 = square(val * 2);
    int val3 = square(square(val));
    return 0;
}
```
Line 1 above starts with the data type of the function return value (here 'int'), which can be one of the data types introduced above (there are also other data types that can be used, but more about those later). Then comes the function name, 'square'. The function arguments are listed inside parentheses. There can be multiple arguments separated by commas, but here we only have one parameter, 'base', which has the int data type. Each argument must have a data type and a name. It is also possible that a function does not have any arguments. In such a case void is used to represent an empty argument list. This is the case with the main function on line 7.
After the function return value type, name, and arguments comes the definition of the function body, inside curly brackets. The 'square' function body is on lines 2 to 5, and the 'main' function body is on lines 8 to 13. As discussed earlier, the program execution always starts from the main function, and each function body consists of one or more statements that make up the program logic.
The function arguments work like any other local variables inside the function implementation. This can be seen in the 'square' function body, where the 'base' argument is used in the expression that multiplies the given argument by itself, and stores the result to variable named 'res'.
The result of the function is indicated by the return statement (on line 4 for the 'square' function). When the program encounters the return statement, it exits the function, and returns to the point of the code where the function was called. At the same time the value of the expression following the return statement is passed to the caller of the function. For example, when the main function calls the 'square' function for the first time on line 9, the value of variable 'val' is set based on the result of the function. In this case it will be 9. The return statement can be in any part of the function definition, and there can be multiple return statements in a single function definition.
The main function calls the 'square' function three times with different parameters. As can be seen, the parameter can be any expression, not just a constant value. In such a case, the expression is evaluated first, and when the result is known, it is passed to the function as an argument. Because a function call can be part of any expression, a function call can contain another function call as a parameter, as can be seen on line 11. In such a case the inner function result is evaluated first, and the result is passed as an argument to the outer function. In this case we happen to use the 'square' function itself as the parameter. What will be the value of 'val3' at the end of the program execution?
It is important to note that the local variables declared inside a function definition are only visible inside the function (or more generally, inside the block statement marked with curly brackets). Therefore variable 'res' declared in function 'square' cannot be used in the main function, nor can variable 'val' be used inside the 'square' function. The only way to pass information between the functions is by using the arguments, or the return value. (It should be noted, though, that the C language allows declaring variables outside the functions, in which case they are global, and visible to all functions. Use of global variables is discouraged, however, unless there is a very good and well-justified reason for that).
A function does not need to have a return value. In such a case void is used in place of the return type in the function definition. In such a function, the return statement does not provide any value, and can be omitted from the function. Functions can have several return statements to force an early exit from the function under some given conditions, and in such a case a return statement can be useful even when the function has no return value.
Task 04_func: Vector function (1 pts)
Objective: get familiar with writing a function, and calling another function.
Implement a function named vectorlength that calculates the length of the given three-dimensional Euclidean vector. The function gets three arguments that represent the different components in the three-dimensional space. The function should return the length of the vector. All numbers should be double-precision floating point numbers.
If you have forgotten about vector mathematics, you can find additional information on Wikipedia. You will need to calculate a square root, which is not part of C's basic operators, but there is a sqrt function in the math library that you can use. With the pow function you can calculate powers. See the detailed function specifications in the linked manual pages.
Implement your function in file source.c. The file already has a reference to the math.h header that defines the math functions, but in this exercise you will need to write everything else yourself. Initially, the program will not even compile, until you have implemented at least a placeholder for the function.
Formatted input and output basics
Formatted output
The printf function can be used for outputting information from programs. The printf function takes a string as a parameter, and can optionally have any number of additional parameters for variables that are printed as part of the string. For example, the following code sample prints "The number is 50", followed by a newline character (not visible in the output) that causes the following output to appear at the beginning of the next line. The printf function interface is defined in the header 'stdio.h'. Therefore the #include directive is needed at the beginning of the program whenever printf is used.
```c
#include <stdio.h>

int main(void)
{
    int number = 50;
    printf("The number is %d\n", number);
}
```
The printf parameters are placed in the output string by using formatting conversion specifications. In its basic form, a conversion specification consists of a percent sign (%) and a letter that indicates the type of conversion. The conversion specification is replaced by the parameter given in the printf call. If there are multiple parameters (separated by comma), multiple conversion specifications need to be used. The number of conversion specifications must be the same as the number of parameters in the printf call. The type of conversion specification must be compatible with the data type of the corresponding parameter. Here are some conversion types:
- %d: int -- integer in decimal format
- %u: unsigned int -- unsigned integer in decimal format
- %o: unsigned int -- octal number
- %x, %X: unsigned int -- hexadecimal number, using either lowercase letters (former) or uppercase letters (latter)
- %c: int -- single character based on the used character encoding (e.g., ASCII), as discussed earlier with the character constants.
- %s: char* -- string. We will take a more detailed look into strings in the next module.
- %f: double -- floating point number (format: n.nnnnnn). Default number of decimals included in output is 6.
- %e, %E: double -- floating point number (format: n.nnnnnnE+-xx)
- %g, %G: double -- choose either %f or %e format, depending on the value of exponent.
If a percent sign needs to be printed in the string, '%%' is used to distinguish it from other formatting conversions.
The following adjustments can be made on the formatting specification before one of the above letters, after the percent sign. Different adjustments can be combined, but they need to be in the following order:
- minus (-) (e.g. %-4d): align the output left of the available field, when the field length is specified.
- plus (+) (%+4d): for numeric conversion types, always include sign (+ or -).
- 0 (%04d): for numeric conversion types with specified length, pad the field with leading zeros instead of space.
- number (e.g. %4d): indicates the minimum length of the output field. If the output needs less characters than given here, empty space is added before the output, such that the conversion replacement takes this many characters. By default the output is aligned to the right end of this field.
- period followed by number (%4.1f): for floating point numbers, the precision (number of decimals following the point)
- h or l (%ld): specifies that the argument is interpreted as short (h) or as long (l) form of the basic data type
More details can be found, for example, in the K&R book.
Below are some examples of formatted output. The square brackets are not part of the formatting specification, but we use them to illustrate the width of the output field.
```c
#include <stdio.h>

int main(void)
{
    int numA = 10;
    float numB = 2.54;
    float numC = 0.000001;
    printf("At least five characters long: [%5d]\n", numA);
    printf("Length is six, one decimal shown: [%6.1f]\n", numB);
    printf("Float number, aligned left: [%-10.2e]\n", numC);
    printf("Number with leading zeros: [%05d]\n", ++numA);
}
```
This will output:
```
At least five characters long: [   10]
Length is six, one decimal shown: [   2.5]
Float number, aligned left: [1.00e-06  ]
Number with leading zeros: [00011]
```
Line 11 above shows an example of using the unary increment operator as the printf parameter. Because the increment operator is prefixed, the increment is done before the value of expression is determined, and the value of numA is 11 when the printf function is called. If the parameter had been numA++, the call would have printed 10, because the increment is done after the value of expression is determined.
The printf call does not automatically start a new line. Multiple consecutive printf outputs will be shown on a single line, unless the start of a new line is enforced by the '\n' special character. '\n' is not shown to the user, but has its own ASCII encoding (10) that tells the console to move to the beginning of the next line. '\n' does not have to be at the end of the output string, as above, but can be included in any place. Wherever it is, a new line is started at that point, and the following character appears at the beginning of the next line. Here are a few of the special characters:
- '\n': new line -- the following output appears on the next line.
- '\\': produces a single backslash
- '\"': produces a quote sign (")
- '\'': produces a single quote
There are also some others that you can study from the K&R book, or from other material.
Formatted input
scanf is another function defined in the standard I/O library (stdio.h). It reads formatted user input, and applies similar conversion specifications as printf (there are some differences between the two in how e.g. long data types are handled, but for now you can assume that they are roughly similar). Below is an example of two scanf calls.
    #include <stdio.h>

    int main(void)
    {
        int a;
        float b, c;
        int ret_a, ret_b;

        ret_a = scanf("%d", &a);
        ret_b = scanf("%f,%f", &b, &c);
    }
At the beginning of main there are a few variable declarations. This time they are not initialized, so their initial values are unpredictable. The first scanf call expects a single integer value from the user, and places it in variable 'a'. The function returns when the user has pressed 'enter' to start a new line. The second scanf call expects two float values separated by a comma, and places them in variables 'b' and 'c'.
The scanf function has an integer return value, that in the above example is stored in variables 'ret_a' for the first call and 'ret_b' for the second call. The return value tells how many fields were read successfully. If everything went correctly, the ret_a should contain value 1 after the call, and ret_b should contain value 2 after the call. If user gave misformatted input, the return value will be smaller, for example 0. Therefore checking for the return value is recommended after the scanf call (you will see soon, how). If user had entered invalid input, the contents of variables a, b and c could remain uninitialized. As with printf, the number of scanf parameters following the string needs to match the number of conversion specifications.
The scanf function stops reading if the input does not match the formatting specification, and returns the number of correctly read parameters. The unread input is not discarded: for example, if there have been multiple values on a single line of input, and one of them does not match the format specification, the next call to scanf encounters the same input again. scanf also skips whitespace characters in the user input before most conversions. Whitespace characters are, for example, space, tab and newline characters. An exception to this rule is reading a character using scanf (the '%c' formatting conversion): it also accepts whitespace characters.
For basic data types the scanf parameters need the & operator as prefix. This relates to addresses and pointers that we will look at more in module 2. For now, just include them in your scanf calls, and assume that in the near future you will know why.
Task 05-calc-1: Sum (1 pts)
Objective: practice use of formatted input and output.
Write function void simple_sum(void) that asks two integers from the user, calculates their sum, and prints the result in the following format: 1 + 2 = 3. There should be a newline ('\n') at the end. For example (the first line is user input, the second line is program output):
    4 5
    4 + 5 = 9
Hint: Because scanf function ignores the whitespace characters, you can accept inputs that involve, for example, a newline or multiple spaces between the two numbers.
Conditional statements
Statements and blocks
A function body consists of statements that can contain expressions, and each statement is terminated by a semicolon. A compound statement is a group of statements enclosed by opening and closing braces { and }. In terms of program structure, a compound statement itself is a statement. Declarations that are done inside a compound statement block are only visible inside the block, similarly to declarations done in the function definition (that can be seen as a top-level compound statement). Usually compound statement blocks are used together with control structures of the program, such as if-conditions or loops, but it is possible (although not very common) to use them stand-alone. Here is an example to illustrate this:
    int main(void)
    {
        int a = 1;
        a = a + 1;
        {
            int b = 6;
            b = b + 1;
        }
        a = a + b;  /* Error! b is not visible here anymore */
    }
The above example causes a compiler error on the line marked with the comment, because variable b is declared inside the inner block, and is not available on a level above it.
Relational and logical operators
Relational and logical operators result in either 1 or 0, depending on whether the condition in the operator is true or false. The relational operators are:
- < -- less than
- <= -- less than or equal to
- > -- greater than
- >= -- greater than or equal to
- == -- equal (Important: notice the difference to the assignment operator with one '=')
- != -- not equal
The following example illustrates the outcome of relational operators:
    #include <stdio.h>

    int main(void)
    {
        int a;
        scanf("%d", &a);
        int a_res = a < 5;
        printf("a less than 5: %d\n", a_res);
        printf("a equal to 5: %d\n", a == 5);
    }
In the above example an integer value is first read to variable 'a' (the implementor has been lazy and does not check the return value of the scanf call). Then, variable 'a_res' is set to the result of the relational operator a < 5, and becomes either 0 or 1, depending on what the user gave as input. The last printf call demonstrates that the relational operators can be used in an expression like any other operator, and can therefore be used, for example, as function parameters. It will show 1 if the user typed '5', otherwise it will show 0. The user input (first line) and program output (following lines) from the above code could be, for example, the following:
    5
    a less than 5: 0
    a equal to 5: 1
In addition, there are logical operators for AND, OR and NOT:
- AND operator is &&: for example, expression (a < 5 && b > 6) is true if a is smaller than 5 AND b is greater than 6.
- OR operator is ||: for example, (a < 5 || b > 6) is true if a is smaller than 5 OR b is greater than 6 (or both).
- NOT (unary) operator is ! in front of an expression, and negates the outcome of the expression. For example, !(a < 5) says "a is not smaller than 5", i.e., it is greater than or equal to 5.
A common mistake is to confuse && and || (the logical operators) with & and | (bitwise operations, explained in later modules), causing a different outcome. Similarly, it is common to confuse == (equality) with = (assignment). Because both the assignment and bitwise operators can be used as part of expression, the compiler accepts both forms, but use of wrong operator leads to wrong behavior.
If and else
The if-else structure can be used to implement decisions in the program, as with most other programming languages. The structure is:
if (expression) statement-1 else statement-2
If 'expression' is true, statement-1 is executed. If it is false,
statement-2 is executed. In C language, any non-zero value of
expression is interpreted as true, and zero is interpreted as
false. "expression" can be any C expression, and can contain a
function call or technically even just constant value (which would not
make any sense in practice). Often relational or logical operators
are used here, but they do not have to be used. For example, assuming
integer variable a, statement if (a)
would test whether variable a
contains a non-zero value. 'statement-1' and 'statement-2' must
either be terminated with semicolon, or they can be compound statements
indicated by curly braces. Here is a simple example:
    int days, years;
    scanf("%d %d", &days, &years);  // return value omitted -- trust the user

    if (days > 365) {
        years = years + 1;
        days = days - 365;
    } else
        printf("%d days remaining until the next year\n", 365 - days);

    /* No curly braces in the else branch -- this part of code is
       executed in both cases */
    printf("days: %d years: %d\n", days, years);
The input could be, for example, like this:
    400 2
    days: 35 years: 3
The above example does not use curly braces in the else - branch, which therefore consists of only one statement. Forgetting curly braces unintentionally could also cause buggy behavior. Therefore they are often used for clarity even if they would contain just one statement. The else branch is not mandatory, and can be left out, if there is no viable alternative code to execute.
An if-else construct can have more than two parts: another if statement can follow directly after else, and this can be repeated any number of times. The chain can optionally end with a final else. For example:
    int a;
    scanf("%d", &a);

    if (a == 1)
        printf("one\n");
    else if (a == 2)
        printf("two\n");
    else if (a == 3)
        printf("three\n");
    else
        printf("some other number\n");
Switch
The switch
statement is another way for doing multi-way decisions,
when the options are constant integers. The switch statement compares
an expression to constant labels, and in case of matching label
executes the following code. Here is an example that reads one
character from user, and evaluates whether it is one of a few
alternatives:
    char a;
    scanf("%c", &a);

    switch(a) {
    case '1':
        printf("user typed one\n");
        break;

    case '2':
        printf("user typed two\n");
        break;

    case 'a':
    case 'b':
    case 'c':
        printf("user typed a, b or c\n");
        break;

    default:
        printf("user typed something else\n");
        break;
    }
Pay attention to the differences between the switch construct and the if-else construct. In the case of switch, multiple statements can follow each case label without enclosing them inside a compound statement (i.e., curly brackets). Usually after the last statement of each branch there is a break statement that causes the program to jump out of the switch processing, to the code that follows the ending brace of the whole switch statement (we will see more about break shortly). If break is not included, the execution continues through the next labels: for example, if break was removed from branch '1' above, the program would print two lines of output when the user typed character '1'. While this property is a common reason for bugs in C programs, it allows assigning multiple labels for a piece of code, as is done for 'a', 'b' and 'c' above. Finally, a special label "default" matches any value that none of the case labels match. Usually a good habit is to include a break statement also after the final branch, even though it is logically unnecessary. This helps avoid bugs if the program is extended later.
Remember: '1' is also a constant integer, based on the ASCII character encoding table, and is equivalent to 49 in decimal format. Number 1 and character '1' are different values: printf("%c\n", '1'); outputs 1, but printf("%d\n", '1'); outputs 49, because the former prints the character that the code represents, while the latter prints the numeric value of the character constant.
Note that switch can always be replaced with series of if..else if.. statements that perform the corresponding set of tests.
Task 05-calc-2: Calculator (1 pts)
Objective: practice use of conditional statements and formatted input and output using scanf and printf.
Write function void simple_math(void)
that asks three values from
the user: number, operator, and another number. Operator should be one
of the following characters: '+', '-', '*' or '/'. If some other
character is used as operator, the function should print "ERR"
(without quotes). The numbers should be float type. If user does not
type a valid number-operator-number combination, the function should
print "ERR". When valid input is given, the function performs the
calculation as given, and prints the result on the screen, using the
precision of one decimal:
    8 - 2
    6.0
    8.3 / 5.1
    1.6
    -3.456 - 2.31
    -5.8
Hint: Check how character constants are used, and single character as scanf format conversion.
Loops
While and do-while
The while statement repeats a (compound) statement as long as the specified expression is true (i.e., has non-zero value). Below is an example of a simple loop that repeats until the value of a is 10 (or higher).
    int a = 0;  // in this case it is important to initialize the variable
    while (a < 10)
        a++;
The loop condition is tested before executing the statement. Therefore it is possible that the statement is never executed, if the condition is false from the beginning.
As always, the statement inside while loop can be a compound statement, as in the following:
    int a = 0;

    while (a < 10) {
        printf("value of a is now %d\n", a);
        a++;
    }
If it is desired to test the termination condition after the (compound) statement, a do-while construct can be used as follows:
    int a = 20;
    do {
        printf("value of a is now %d\n", a);
        a++;
    } while (a < 10);
Because in this example the initial value of 'a' is 20, it prints one line before the termination condition is evaluated, and the loop is terminated.
For
for is another statement for constructing loops, and can be used as a convenient alternative to while in many cases. It takes the form
for (expression_1; expression_2; expression_3) statement
The above for construct could be built using while, in which case an equivalent pattern would be:
    expression_1;
    while (expression_2) {
        statement
        expression_3;
    }
expression_1 initializes the for loop, expression_2 is the iteration check, and expression_3 is the adjustment expression at the end of the for loop. Again, any C expressions could be used in the three places (even including function calls, etc.).
For example, the second of the above while
examples could be written as:
    int a;
    for (a = 0; a < 10; a++) {
        printf("value of a is now %d\n", a);
    }
Any of the three expressions in the for statement can be omitted. Therefore, yet another form for the above would be
    int a = 0;
    for ( ; a < 10; ) {
        printf("value of a is now %d\n", a);
        a++;
    }
If the termination condition (expression_2) is omitted, it is always
considered to be true, making an endless loop, unless broken by some
other means such as using the break
statement.
Multiple for loops can be nested, to form multidimensional loops.
C99 allows declaring a variable as part of the for statement, in place of expression_1. Such a variable is visible only inside the loop.
break and continue
The break statement can be used to terminate a loop immediately, without waiting for the loop condition to become false. For example, the loop below never makes it to 10, but stops counting at 5:
    int a;
    for (a = 0; a < 10; a++) {
        printf("value of a is now %d\n", a);
        if (a == 5)
            break;
    }
The continue statement causes the next iteration of a loop to start immediately. For example, the following code only shows the even numbers (a % 2 is the remainder when a is divided by 2):
    int a;
    for (a = 0; a < 10; a++) {
        if (a % 2 == 1)
            continue;
        printf("value of a is now %d\n", a);
    }
Task 07-geometry-1: Multiplication table (1 pts)
Objective: Practice use of nested loops and formatted output
Implement function void multi_table(unsigned int xsize, unsigned int
ysize)
that prints a multiplication table on the screen that has
numbers from 1 to 'xsize' on the x-axis, and numbers from 1 to
'ysize' on the y-axis, and products of these numbers in tabular
format. Each number should take 4 characters when printed on the
screen, aligned to the right. There is a newline character ('\n') at
the end of each line, including the last line. For example, function call multi_table(4,5) should result in the following output:
       1   2   3   4
       2   4   6   8
       3   6   9  12
       4   8  12  16
       5  10  15  20
Task 07-geometry-2: Draw Triangle (1 pts)
Objective: Practice use of nested loops in C code, together with a bit of conditional application logic.
Implement function void draw_triangle(unsigned int size)
that draws
an ASCII box that contains a triangle.
The box should be size characters wide, and size characters tall. The box is split diagonally in two such that the righthand and bottom characters are '#', and the lefthand and top characters are '.'.
The first line contains one '#' character at the right edge, the second line contains two '##' characters, and so on. On the last line all characters are '#'.
All lines (also the last) end with a newline character ('\n').
Here is an example calling draw_triangle(5):
    ....#
    ...##
    ..###
    .####
    #####
Task 07-geometry-3: Draw Ball (1 pts)
Objective: One more exercise on loops in C code, together with a decision function that determines the output.
Implement function void draw_ball(unsigned int radius)
that draws an
ASCII box that contains a circle filled with character ('*').
The box is (2 * radius + 1) characters wide and tall, i.e., just large enough to contain the circle. There is a helper function 'distance' that returns the distance of given (x,y) coordinates from (0,0). If the circle is centered at (0,0), you can use the 'distance' function such that if distance(x,y) <= radius, coordinate (x,y) is within the circle, otherwise it is outside the circle.
If a square is within circle, print character '*'. If a square is outside the circle, print character '.'.
When draw_ball(3) is called, the output should be like this:
    ...*...
    .*****.
    .*****.
    *******
    .*****.
    .*****.
    ...*...
Hint: You can use also negative numbers as part of the for loop, as long as the data type allows negative numbers.
Task 08-characters-1: ASCII Chart (1 pts)
Objective: Getting more familiar with printf format specifications. You will also get an initial look at ASCII coding system and hexadecimal numbers: how different displayed characters are mapped to numeric values, that can be presented either as decimal or hexadecimal numbers.
Implement function void ascii_chart(char min, char max)
that outputs
(a portion of) ASCII character mapping. It should iterate through numbers
starting from 'min' and ending to (and including) 'max'.
For each displayed item, the output should look like following:
-
three-character field that shows the given number (integer) in decimal format. If the number takes less than three characters (it is < 100), it is aligned right.
-
one space, followed by four-character field that shows the same number in hexadecimal format. Each hexadecimal number should take two characters, and one-digit numbers are prefixed with 0. The whole hexadecimal number is prefixed with '0x'. For example, number 1 is shown as '0x01'.
-
one space, followed by the same number when printed in character format (always one-character field). The number is converted into a character according to ASCII coding system.
Some character codes are not "printable", and do not produce sensible output with this formatting. For non-printable characters, just '?' should be shown. You can use function int isprint(int c) (man page) to test if character in variable 'c' is printable. If function returns 0, the character is not printable and should show as '?'. If it is non-zero the character should be printed normally.
-
one tab ('\t'), if the current line has less than four entries printed. On the fourth entry, you should change to the next line, i.e., instead of tab, print newline ('\n').
You should cycle through each number in the parameter range in the above-mentioned format. For example, call ascii_chart(28,38), should show the following:
     28 0x1c ?       29 0x1d ?       30 0x1e ?       31 0x1f ?
     32 0x20         33 0x21 !       34 0x22 "       35 0x23 #
     36 0x24 $       37 0x25 %       38 0x26 &
Task 08-characters-2: Secret Message (1 pts)
Objective: More playing with character manipulation, loops and use of functions. Also, this works as a preliminary introduction to strings (actual content in module 2).
Implement function void secret_msg(int msg)
which decrypts (and encrypts) a given
message using a primitive algorithm (as described below). Each secret message
is identified by integer that is given as the parameter of this call ('msg').
You receive the message one character at a time by calling function char
get_character(int msg, unsigned int cc)
that is given in the exercise
template. Implementation of that function contains some concepts introduced
only in Module 2 (arrays, strings), but you don't need to mind about them yet.
Just assume that the function returns character number 'cc' from message that
is numbered 'msg' (which is just the same value passed with the call to
secret_msg).
Character numbering starts from 0. You will need to call get_character multiple times, by increasing the value of 'cc' by one for each call, until you receive 0 as the return value (i.e., cc is the character count from the beginning of the message).
As you read the characters in message, you'll need to decrypt each of them at a time and print the decrypted character to screen, until you reach the end of the message (do not print the terminating 0).
The decryption algorithm is as follows: subtract the received character code from 158 (158 - code, where code is the value returned by get_character), and then print the result as a character.
You can test this with messages numbered 0 and 1 that are provided in src/main.c. If the function works, these messages should translate to readable short sentences (in English). The TMC tests have also other message numbers.
Task 09-ships: Ship Sinker (4 pts)
Objective: Practice use of function calls as part of a (slightly) bigger program
We will now implement a basic ship sinking game. Because it requires features of C language that have not been discussed yet, part of the functions are already given to you, and you will implement four additional functions to complete the logic.
The game field is 10x10 squares, and each ship is 3 squares long. The coordinate range is between 0 and 9 both vertically and horizontally (i.e., (0,0) is the upper left corner; (9,9) is the lower right). The game ends when all ships are sunk.
The exercise code is divided into two C source files. shiplib.c contains support functions for operating the game field. You will need to call these functions in the exercise tasks, but you should not modify this file. See the comments in the source code for explanations about how these functions are used. ships.c contains the functions you will need to implement.
You will need to implement four functions as follows. You will gain a point for each of them, if implemented by the primary deadline.
a) Place ships
Implement function void set_ships(unsigned int num)
that places
'num' ships to the game field. To place one ship at given
location, you should call function place_ship(), with the
location and direction of the ship (see source code for detailed
explanation). Note that place_ship() function fails if you try to
place ship to a location that overlaps with an existing ship, or goes
out of bounds, so you'll need to investigate its return value.
Hint: You can use C library function rand() to pick a random location and direction for a ship. The function returns a random integer, which you can scale to the appropriate range using the modulo operation (%). For example, rand() % 10 produces random numbers between 0 and 9.
b) Print gamefield
Implement function void print_field(void)
that prints the whole
gamefield on screen. If a location is not yet visible to user, '?'
should be printed. If the location is visible (i.e., it has been
shot at already), there are three options to be printed:
- '.' if the location does not have a ship
- '+' if the location has a part of a ship that has not yet been hit (when is this needed?)
- '#' if the location has a part of a ship that has been hit
Initially all sectors are invisible, and a sector becomes visible when the player shoots at it.
You will need to use two functions: is_visible(x,y) tells whether the given sector is visible to the player. is_ship(x,y) tells whether a location has a ship, and if it is hit. See the source code comments at 'shiplib.c' for more detailed descriptions of these functions.
c) Shoot
Implement function int shoot(void)
that asks two unsigned numbers from the
user, separated by space, that represent coordinates in the game
field. If user input is invalid or coordinates are out of bounds, the
function should return -1. If the given location has ship, the
function should call function hit_ship() to mark the location
hit, and return 1. If the given location does not have ship, the
function should return 0. In either case, the function should call
function checked() that marks the location visible.
d) End of game
Implement function int game_over(unsigned int num) that returns 1 if all ships are sunk, or 0 if there are still ships to be hit. Parameter 'num' tells the number of ships on the game field. You can find out the status of each location using the is_ship() function, and since you know that each ship occupies 3 squares in the game field, you can determine whether all ship positions have been hit.