Introduction to C++

Hello world

There are many lessons in writing a simple “Hello world” program:

  • C++ programs are normally written using a text editor or integrated development environment (IDE) - we use the %%file magic to simulate this

  • The #include statement literally pulls in and prepends the source code from the iostream header file

  • Types must be declared - note the function return type is int

  • There is a single function called main - every program has main as the entry point although you can write libraries without a main function

  • Notice the use of braces to delimit blocks

  • Notice the use of semicolons to terminate statements

  • Unlike Python, whitespace is not used to delimit blocks or statements; it only separates tokens

  • Note the use of the std namespace - this is similar to Python except C++ uses :: rather than . (like R)

  • The I/O shown here uses streaming via the << operator to send output to cout, which is the name for the standard output

  • std::endl provides a line break and flushes the output buffer

[1]:
%%file hello.cpp

#include <iostream>

int main() {
    std::cout << "Hello, world!" << std::endl;
}
Writing hello.cpp

Compilation

  • The source file must be compiled to machine code before it can be executed

  • Compilation is done with a C++ compiler - here we use one called g++

  • By default, the output of compilation is called a.out - we use -o to change the output executable filename to hello.exe

    • Note that the .exe extension is a Windows convention; Unix executables typically have no extension - for example, just the name hello

[2]:
%%bash

g++ hello.cpp -o hello.exe

Execution

[3]:
%%bash

./hello.exe
Hello, world!

C equivalent

Before we move on, we briefly show the similar Hello world program in C. C is a precursor to C++ that is still widely used. While C++ is derived from C, it is a much richer and more complex language. We focus on C++ because the intent is to show how to wrap C++ code using pybind11 and take advantage of C++ numerical libraries that do not exist in C.

[4]:
%%file hello01.c

#include <stdio.h>

int main() {
    printf("Hello, world from C!\n");
}
Writing hello01.c
[5]:
%%bash

gcc hello01.c
[6]:
%%bash

./a.out
Hello, world from C!

Namespaces

Just like Python, C++ has namespaces that allow us to build large libraries without worrying about name collisions. In the Hello world program, we used the explicit name std::cout indicating that cout is a member of the standard namespace std. We can also use the using keyword to import selected functions or classes from a namespace.

using std::cout;

int main()
{
    cout << "Hello, world!\n";
}

For small programs, we sometimes import the entire namespace for convenience, but this may cause name collisions in larger programs.

using namespace std;

int main()
{
    cout << "Hello, world!\n";
}

You can easily create your own namespace.

#include <iostream>
#include <string>

using std::cout;
using std::string;

namespace sta_663 {
    const double pi=3.14159;

    void greet(string name) {
        cout << "\nTraditional first program\n";
        cout << "Hello, " << name << "\n";
    }
}

int main()
{
    cout << "\nUsing namespaces\n";
    string name = "Tom";
    cout << sta_663::pi << "\n";
    sta_663::greet(name);
}
[7]:
%%file hello02.cpp

#include <iostream>

using std::cout;
using std::endl;

int main() {
    cout << "Hello, world!" << endl;
}
Writing hello02.cpp
[8]:
%%bash

g++ hello02.cpp -o hello02
[9]:
%%bash

./hello02
Hello, world!

Wholesale import of a namespace is generally frowned upon, similar to how from X import * is frowned upon in Python.

[10]:
%%file hello03.cpp

#include <iostream>

using namespace std;

int main() {
    cout << "Hello, world!" << endl;
}
Writing hello03.cpp
[11]:
%%bash

g++ hello03.cpp -o hello03
[12]:
%%bash

./hello03
Hello, world!

Types

[13]:
%%file dtypes.cpp

#include <iostream>
#include <complex>
#include <string>

using std::cout;

int main() {
    // Boolean
    bool a = true, b = false;

    cout << "and            " << (a and b) << "\n";
    cout << "&&             " << (a && b) << "\n";
    cout << "or             " << (a or b) << "\n";
    cout << "||             " << (a || b) << "\n";
    cout << "not            " << not (a or b) << "\n";
    cout << "!              " << !(a or b) << "\n";

    // Integral numbers
    cout << "char           " << sizeof(char) << "\n";
    cout << "short int      " << sizeof(short int) << "\n";
    cout << "int            " << sizeof(int) << "\n";
    cout << "long           " << sizeof(long) << "\n";

    // Floating point numbers
    cout << "float          " << sizeof(float) << "\n";
    cout << "double         " << sizeof(double) << "\n";
    cout << "long double    " << sizeof(long double) << "\n";
    cout << "complex double " << sizeof(std::complex<double>) << "\n";

    // Characters and strings
    char c = 'a'; // Note single quotes
    char word[] = "hello"; // C char arrays
    std::string s = "hello"; // C++ string

    cout << c << "\n";
    cout << word << "\n";
    cout << s << "\n";
}
Writing dtypes.cpp
[14]:
%%bash

g++ dtypes.cpp -o dtypes.exe
./dtypes.exe
and            0
&&             0
or             1
||             1
not            0
!              0
char           1
short int      2
int            4
long           8
float          4
double         8
long double    16
complex double 16
a
hello
hello

Type conversions

Converting between types can get pretty complicated in C++. We will show some simple versions.

[15]:
%%file type.cpp
#include <iostream>
#include <string>
using std::cout;
using std::string;
using std::stoi;

int main() {
    char c = '3'; // A char is an integer type
    string s = "3"; // A string is not an integer type
    int i = 3;
    float f = 3.1;
    double d = 3.2;

    cout << c << "\n";
    cout << i << "\n";
    cout << f << "\n";
    cout << d << "\n";

    cout << "c + i is " << c + i << "\n";
    cout << "c + i is " << c - '0' + i << "\n";

    // Casting string to number
    cout << "s + i is " << stoi(s) + i << "\n"; // Use std::stod to convert to double

    // Without a cast, i is converted to float; then two ways to cast float to int
    cout << "f + i is " << f + i << "\n";
    cout << "f + i is " << int(f) + i << "\n";
    cout << "f + i is " << static_cast<int>(f) + i << "\n";

}

Writing type.cpp
[16]:
%%bash

g++ -o type.exe type.cpp -std=c++14
[17]:
%%bash

./type.exe
3
3
3.1
3.2
c + i is 54
c + i is 6
s + i is 6
f + i is 6.1
f + i is 6
f + i is 6

Header, source, and driver files

C++ allows functions to be compiled separately from the programs that use them. Functions are defined in source files, which are compiled to object files. To call these functions, a program includes header files that declare the function signatures. The declarations give the compiler enough information to check and compile each call, and the linker then connects the calls to the compiled function code when the executable is built.

  • Here we show a toy example of typical C++ program organization

  • We build a library of math functions in my_math.cpp

  • We add a header file for the math functions in my_math.hpp

  • We build a library of stats functions in my_stats.cpp

  • We add a header file for the stats functions in my_stats.hpp

  • We write a program that uses math and stats functions called my_driver.cpp

    • We pull in the function signatures with #include for the header files

  • Once you understand the code, move on to see how compilation is done

  • Note that it is customary to include the header file in the source file itself to let the compiler catch any mistakes in the function signatures

[18]:
%%file my_math.hpp
#pragma once

int add(int a, int b);
int multiply(int a, int b);
Writing my_math.hpp
[19]:
%%file my_math.cpp

#include "my_math.hpp"

int add(int a, int b) {
    return a + b;
}

int multiply(int a, int b) {
    return a * b;
}
Writing my_math.cpp
[20]:
%%file my_stats.hpp
#pragma once

int mean(int xs[], int n);
Writing my_stats.hpp
[21]:
%%file my_stats.cpp

#include "my_stats.hpp"

int mean(int xs[], int n) {
    double s = 0;
    for (int i=0; i<n; i++) {
        s += xs[i];
    }
    return s/n;
}
Writing my_stats.cpp
[22]:
%%file my_driver.cpp

#include <iostream>
#include "my_math.hpp"
#include "my_stats.hpp"

int main() {
    int xs[] = {1,2,3,4,5};
    int n = 5;
    int a = 3, b= 4;

    std::cout << "sum = " << add(a, b) << "\n";
    std::cout << "prod = " << multiply(a, b) << "\n";
    std::cout << "mean = " << mean(xs, n) << "\n";
}
Writing my_driver.cpp

Compilation

  • Notice that the first 2 compile statements use the -c flag to compile the source files to object files with the default extension .o

  • The 3rd compile statement builds an executable by compiling the main file and linking it with the recently created object files

  • The function signatures in the included header files tell the compiler how to check the calls to add, multiply and mean; the linker then matches them with the compiled functions

[23]:
%%bash

g++ -c my_math.cpp
g++ -c my_stats.cpp
g++ my_driver.cpp my_math.o my_stats.o
[24]:
%%bash

./a.out
sum = 7
prod = 12
mean = 3

Using make

As building C++ programs can quickly become quite complicated, there are build tools that help simplify this task. One of the most widely used is make, which uses a file normally called Makefile to coordinate the instructions for building a program.

  • Note that make can be used for more than compiling programs; for example, you can use it to automatically rebuild tables and figures for a manuscript whenever the data is changed

  • Another advantage of make is that it keeps track of dependencies, and only re-compiles files that have changed or depend on another changed file since the last compilation

We will build a simple Makefile to build the my_driver executable:

  • Each section consists of a make target denoted by <target>: followed by files the target depends on

  • The next line is the command given to build the target. This must begin with a TAB character (it MUST be a TAB and not spaces)

  • If a target has dependencies that are not met, make will see if each dependency itself is a target and build that first

  • It uses timestamps to decide whether to rebuild a target (not actual changes to file contents)

  • By default, make builds the first target, but can also build named targets

To get a TAB character, copy and paste the blank space between a and b in the output below.

[25]:
! echo "a\tb"
a       b
[26]:
%%file Makefile

driver: my_math.o my_stats.o
    g++ my_driver.cpp my_math.o my_stats.o -o my_driver

my_math.o: my_math.cpp my_math.hpp
    g++ -c my_math.cpp

my_stats.o: my_stats.cpp my_stats.hpp
    g++ -c my_stats.cpp
Writing Makefile
  • We first start with a clean slate

[27]:
%%capture logfile
%%bash

rm *\.o
rm my_driver
[28]:
%%bash

make
g++ -c my_math.cpp
g++ -c my_stats.cpp
g++ my_driver.cpp my_math.o my_stats.o -o my_driver
[29]:
%%bash

./my_driver
sum = 7
prod = 12
mean = 3
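  • Re-building does not re-compile the source files since their timestamps have not changed. The link step still runs because the target is named driver but the command creates my_driver, so make never finds a file called driver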

[30]:
%%bash

make
g++ my_driver.cpp my_math.o my_stats.o -o my_driver
[31]:
%%bash

touch my_stats.hpp
  • As my_stats.hpp was listed as a dependency of the target my_stats.o, touch, which updates the timestamp, forces a recompilation of my_stats.o

[32]:
%%bash

make
g++ -c my_stats.cpp
g++ my_driver.cpp my_math.o my_stats.o -o my_driver

Use of variables in Makefile

[33]:
%%file Makefile2

CC=g++
CFLAGS=-Wall -std=c++14

driver: my_math.o my_stats.o
    $(CC) $(CFLAGS) my_driver.cpp my_math.o my_stats.o -o my_driver2

my_math.o: my_math.cpp my_math.hpp
    $(CC) $(CFLAGS) -c my_math.cpp

my_stats.o: my_stats.cpp my_stats.hpp
    $(CC) $(CFLAGS) -c my_stats.cpp
Writing Makefile2

Compilation

Note that no re-compilation occurs, since the object files are already up to date; only the link step runs.

[34]:
%%bash

make -f Makefile2
g++ -Wall -std=c++14 my_driver.cpp my_math.o my_stats.o -o my_driver2

Execution

[35]:
%%bash

./my_driver2
sum = 7
prod = 12
mean = 3

Input and output

[36]:
%%file main_args.cpp

#include <iostream>
using std::cout;

int main(int argc, char* argv[]) {
    for (int i=0; i<argc; i++) {
        cout << i << ": " << argv[i] << "\n";
    }
}
Writing main_args.cpp
[37]:
%%bash

g++ main_args.cpp -o main_args
[38]:
%%bash

./main_args hello 1 2 3
0: ./main_args
1: hello
2: 1
3: 2
4: 3

Exercise

Write, compile and execute a program called greet that when called on the command line with

greet Santa 3

gives the output

Hello Santa!
Hello Santa!
Hello Santa!
[39]:
%%file data.txt
9 6
Writing data.txt
[40]:
%%file io.cpp

#include <fstream>
#include "my_math.hpp"

int main() {
    std::ifstream fin("data.txt");
    std::ofstream fout("result.txt");

    double a, b;

    fin >> a >> b;
    fin.close();

    fout << add(a, b) << std::endl;
    fout << multiply(a, b) << std::endl;
    fout.close();
}
Writing io.cpp
[41]:
%%bash

g++ io.cpp -o io.exe my_math.cpp
[42]:
%%bash

./io.exe
[43]:
! cat result.txt
15
54

Arrays

[44]:
%%file array.cpp

#include <iostream>
using std::cout;
using std::endl;

int main() {

    const int N = 3; // array sizes must be compile-time constants in standard C++
    double counts[N];

    counts[0] = 1;
    counts[1] = 3;
    counts[2] = 3;

    double avg = (counts[0] + counts[1] + counts[2])/3;

    cout << avg << endl;
}
Writing array.cpp
[45]:
%%bash

g++ -o array.exe array.cpp
[46]:
%%bash

./array.exe
2.33333

Loops

[47]:
%%file loop.cpp

#include <iostream>
using std::cout;
using std::endl;
using std::begin;
using std::end;

int main()
{
    int x[] = {1, 2, 3, 4, 5};

    cout << "\nTraditional for loop\n";
    for (int i=0; i < sizeof(x)/sizeof(x[0]); i++) {
        cout << i << endl;
    }

    cout << "\nUsing iterators\n";
    for (auto it=begin(x); it != end(x); it++) {
        cout << *it << endl;
    }

    cout << "\nRanged for loop\n\n";
    for (auto const &i : x) {
        cout << i << endl;
    }
}
Writing loop.cpp
[48]:
%%bash

g++ -o loop.exe loop.cpp -std=c++14
[49]:
%%bash

./loop.exe

Traditional for loop
0
1
2
3
4

Using iterators
1
2
3
4
5

Ranged for loop

1
2
3
4
5

Function arguments

  • A value argument means that the argument is copied when the function is called, so changes made inside the function do not affect the caller's variable

  • A reference argument means that the function works on the caller's variable itself (in practice, its address is passed). Reference or pointer arguments are used to avoid copying large objects.

[50]:
%%file func_arg.cpp

#include <iostream>
using std::cout;
using std::endl;

// Value parameter
void f1(int x) {
    x *= 2;
    cout << "In f1    : x=" << x << endl;
}

// Reference parameter
void f2(int &x) {
    x *= 2;
    cout << "In f2    : x=" << x << endl;
}

/* Note
If you want to avoid side effects
but still use references to avoid a copy operation
use a const reference like this to indicate that x cannot be changed

void f2(const int &x)
*/

/* Note
Raw pointers are prone to error and
generally avoided in modern C++
See unique_ptr and shared_ptr
*/

// Raw pointer parameter
void f3(int *x) {
    *x *= 2;
    cout << "In f3    : x=" << *x << endl;
}

int main() {
    int x = 1;

    cout << "Before f1: x=" << x << "\n";
    f1(x);
    cout << "After f1 : x=" << x << "\n";

    cout << "Before f2: x=" << x << "\n";
    f2(x);
    cout << "After f2 : x=" << x << "\n";

    cout << "Before f3: x=" << x << "\n";
    f3(&x);
    cout << "After f3 : x=" << x << "\n";
}
Writing func_arg.cpp
[51]:
%%bash

c++ -o func_arg.exe func_arg.cpp --std=c++14
[52]:
%%bash

./func_arg.exe
Before f1: x=1
In f1    : x=2
After f1 : x=1
Before f2: x=1
In f2    : x=2
After f2 : x=2
Before f3: x=2
In f3    : x=4
After f3 : x=4

Arrays, pointers and dynamic memory

A pointer is a number that represents an address in computer memory. What is stored at the address is a sequence of bits. How those bits are interpreted depends on the type of the pointer. To get the value at the pointer address, we dereference the pointer using *ptr. Pointers are often used to indicate the start of a block of values - the name of a plain C-style array is essentially a pointer to the start of the array.

For example, the argument char** argv means that argv has type pointer to pointer to char. The pointer to char can be thought of as an array of char, hence the argument is also sometimes written as char* argv[] to indicate an array of pointers to char. So conceptually, it refers to an array of char arrays - or a collection of strings.

We generally avoid using raw pointers in C++, but this is standard in C and you should at least understand what is going on.

In C++, we typically use smart pointers, STL containers or convenient array constructs provided by libraries such as Eigen and Armadillo.

Pointers and addresses

[53]:
%%file p01.cpp

#include <iostream>

using std::cout;

int main() {
    int x = 23;
    int *xp;
    xp = &x;

    cout << "x                    " << x << "\n";
    cout << "Address of x         " << &x << "\n";
    cout << "Pointer to x         " << xp << "\n";
    cout << "Value at pointer to x " << *xp << "\n";
}
Writing p01.cpp
[54]:
%%bash

g++ -o p01.exe p01.cpp -std=c++14
./p01.exe
x                    23
Address of x         0x7ffef073c6d4
Pointer to x         0x7ffef073c6d4
Value at pointer to x 23

Arrays

[55]:
%%file p02.cpp

#include <iostream>

using std::cout;
using std::begin;
using std::end;

int main() {
    int xs[] = {1,2,3,4,5};

    int ys[3];
    for (int i=0; i<3; i++) { // stay within the 3 elements of ys
        ys[i] = i*i;
    }

    for (auto x=begin(xs); x!=end(xs); x++) {
        cout << *x << " ";
    }
    cout << "\n";

    for (auto x=begin(ys); x!=end(ys); x++) {
        cout << *x << " ";
    }
    cout << "\n";
}
Writing p02.cpp
[56]:
%%bash

g++ -o p02.exe p02.cpp -std=c++14
./p02.exe
1 2 3 4 5
0 1 4

Dynamic memory

  • Use new and delete for dynamic memory allocation in C++.

  • Do not use the C style malloc, calloc and free

  • Absolutely never mix the C++ and C style dynamic memory allocation

[57]:
%%file p03.cpp

#include <iostream>

using std::cout;
using std::begin;
using std::end;

int main() {

    // Allocate a single integer on the heap
    int *z = new int; // single integer
    *z = 23;

    // Allocate an array on the heap
    int *zs = new int[3]; // array of 3 integers
    for (int i=0; i<3; i++) {
        zs[i] = 10*i;
    }

    cout << *z << "\n";

    for (int i=0; i < 3; i++) {
        cout << zs[i] << " ";
    }
    cout << "\n";

    // need for manual management of dynamically assigned memory
    delete z;
    delete[] zs;
}
Writing p03.cpp
[58]:
%%bash

g++ -o p03.exe p03.cpp -std=c++14
./p03.exe
23
0 10 20

Pointer arithmetic

When you increment or decrement a pointer, it moves to the next or preceding location in memory, in steps appropriate for the pointer type (for example, sizeof(int) bytes for an int*). You can also add or subtract a number, since that is equivalent to multiple increments or decrements. This is known as pointer arithmetic.

[59]:
%%file p04.cpp

#include <iostream>

using std::cout;
using std::begin;
using std::end;

int main() {
    int xs[] = {100,200,300,400,500,600,700,800,900,1000};

    cout << xs << ": " << *xs  << "\n";
    cout << &xs << ": " << *xs  << "\n";
    cout << &xs[3] << ": " << xs[3] << "\n";
    cout << xs+3 << ": " << *(xs+3)  << "\n";
}
Writing p04.cpp
[60]:
%%bash

g++ -std=c++11 -o p04.exe p04.cpp
./p04.exe
0x7ffe7687db60: 100
0x7ffe7687db60: 100
0x7ffe7687db6c: 400
0x7ffe7687db6c: 400

C style dynamic memory for jagged array (“matrix”)

[61]:
%%file p05.cpp

#include <iostream>

using std::cout;
using std::begin;
using std::end;

int main() {
    int m = 3;
    int n = 4;
    int **xss = new int*[m]; // assign memory for m pointers to int
    for (int i=0; i<m; i++) {
        xss[i] = new int[n]; // assign memory for array of n ints
        for (int j=0; j<n; j++) {
            xss[i][j] = i*10 + j;
        }
    }

    for (int i=0; i<m; i++) {
        for (int j=0; j<n; j++) {
            cout << xss[i][j] << "\t";
        }
        cout << "\n";
    }

    // Free memory
    for (int i=0; i<m; i++) {
        delete[] xss[i];
    }
    delete[] xss;
}
Writing p05.cpp
[62]:
%%bash

g++ -std=c++11 -o p05.exe p05.cpp
./p05.exe
0       1       2       3
10      11      12      13
20      21      22      23

Functions

[63]:
%%file func01.cpp

#include <iostream>

double add(double x, double y) {
    return x + y;
}

double mult(double x, double y) {
    return x * y;
}

int main() {
    double a = 3;
    double b = 4;

    std::cout << add(a, b) << std::endl;
    std::cout << mult(a, b) << std::endl;

}
Writing func01.cpp
[64]:
%%bash

g++ -o func01.exe func01.cpp  -std=c++14
./func01.exe
7
12

Function parameters

In the example below, the space allocated inside a function is deleted outside the function. In practice, such code is prone to memory leaks, because the caller must remember to release memory that it did not visibly allocate. This is why C++ functions often take the output as an argument to the function, so that all memory allocation can be controlled outside the function.

void add(double *x, double *y, double *res, int n)
[65]:
%%file func02.cpp

#include <iostream>

double* add(double *x, double *y, int n) {
    double *res = new double[n];

    for (int i=0; i<n; i++) {
        res[i] = x[i] + y[i];
    }
    return res;
}

int main() {
    double a[] = {1,2,3};
    double b[] = {4,5,6};

    int n = 3;
    double *c = add(a, b, n);

    for (int i=0; i<n; i++) {
        std::cout << c[i] << " ";
    }
    std::cout << "\n";

    delete[] c; // Note difficulty of book-keeping when using raw pointers!
}
Writing func02.cpp
[66]:
%%bash

g++ -o func02.exe func02.cpp  -std=c++14
./func02.exe
5 7 9
[67]:
%%file func03.cpp

#include <iostream>
using std::cout;

// Using value
void foo1(int x) {
    x = x + 1;
}


// Using pointer
void foo2(int *x) {
    *x = *x + 1;
}

// Using ref
void foo3(int &x) {
    x = x + 1;
}

int main() {
    int x = 0;

    cout << x << "\n";
    foo1(x);
    cout << x << "\n";
    foo2(&x);
    cout << x << "\n";
    foo3(x);
    cout << x << "\n";
}
Writing func03.cpp
[68]:
%%bash

g++ -o func03.exe func03.cpp  -std=c++14
./func03.exe
0
0
1
2

Generic programming with templates

In C, you need to write a different function for each input type - hence resulting in duplicated code like

int iadd(int a, int b)
float fadd(float a, float b)

In C++, you can make functions generic by using templates.

Note: When you have a template function, the entire function is normally written in the header file rather than the source file, because the compiler needs the full definition wherever the template is used. Hence, heavily templated libraries are often “header-only”.

[69]:
%%file template.cpp

#include <iostream>

template<typename T>
T add(T a, T b) {
    return a + b;
}

int main() {
    int m =2, n =3;
    double u = 2.5, v = 4.5;

    std::cout << add(m, n) << std::endl;
    std::cout << add(u, v) << std::endl;
}
Writing template.cpp
[70]:
%%bash

g++ -o template.exe template.cpp
[71]:
%%bash

./template.exe
5
7

Anonymous functions

[72]:
%%file lambda.cpp

#include <iostream>
using std::cout;
using std::endl;

int main() {

    int a = 3, b = 4;
    int c = 0;

    // Lambda function with no capture
    auto add1 = [] (int a, int b) { return a + b; };
    // Lambda function with value capture
    auto add2 = [c] (int a, int b) { return c * (a + b); };
    // Lambda function with reference capture
    auto add3 = [&c] (int a, int b) { return c * (a + b); };

    // Change value of c after function definition
    c += 5;

    cout << "Lambda function\n";
    cout << add1(a, b) <<  endl;
    cout << "Lambda function with value capture\n";
    cout << add2(a, b) <<  endl;
    cout << "Lambda function with reference capture\n";
    cout << add3(a, b) <<  endl;

}
Writing lambda.cpp
[73]:
%%bash

c++ -o lambda.exe lambda.cpp --std=c++14
[74]:
%%bash

./lambda.exe
Lambda function
7
Lambda function with value capture
0
Lambda function with reference capture
35

Function pointers

[75]:
%%file func_pointer.cpp

#include <iostream>
#include <vector>
#include <functional>

using std::cout;
using std::endl;
using std::function;
using std::vector;

int main()
{
    cout << "\nUsing generalized function pointers\n";
    using func = function<double(double, double)>;

    auto f1 = [](double x, double y) { return x + y; };
    auto f2 = [](double x, double y) { return x * y; };
    auto f3 = [](double x, double y) { return x + y*y; };

    double x = 3, y = 4;

    vector<func> funcs = {f1, f2, f3,};

    for (auto& f : funcs) {
        cout << f(x, y) << "\n";
    }
}
Writing func_pointer.cpp
[76]:
%%bash

g++ -o func_pointer.exe func_pointer.cpp -std=c++14
[77]:
%%bash

./func_pointer.exe

Using generalized function pointers
7
12
19

Standard template library (STL)

The STL provides templated containers and generic algorithms acting on these containers with a consistent API.

[78]:
%%file stl.cpp

#include <iostream>
#include <vector>
#include <map>
#include <unordered_map>
#include <string>

using std::vector;
using std::map;
using std::unordered_map;
using std::string;
using std::cout;
using std::endl;

struct Point{
    int x;
    int y;

    Point(int x_, int y_) :
      x(x_), y(y_) {};
};

int main() {
    vector<int> v1 = {1,2,3};
    v1.push_back(4);
    v1.push_back(5);

    cout << "Vecotr<int>" << endl;
    for (auto n: v1) {
        cout << n << endl;
    }
    cout << endl;

    vector<Point> v2;
    v2.push_back(Point(1, 2));
    v2.emplace_back(3,4);

    cout <<  "Vector<Point>" << endl;
    for (auto p: v2) {
        cout << "(" << p.x << ", " << p.y << ")" << endl;
    }
    cout << endl;

    map<string, int> v3 = {{"foo", 1}, {"bar", 2}};
    v3["hello"] = 3;
    v3.insert({"goodbye", 4});

    // Note that a C++ map is ordered by key
    // Note using (traditional) iterators instead of ranged for loop
    cout << "Map<string, int>" << endl;
    for (auto iter=v3.begin(); iter != v3.end(); iter++) {
        cout << iter->first << ": " << iter->second << endl;
    }
    cout << endl;

    unordered_map<string, int> v4 = {{"foo", 1}, {"bar", 2}};
    v4["hello"] = 3;
    v4.insert({"goodbye", 4});

    // Note that unordered_map is similar to Python's dict
    // Note using ranged for loop with const ref to avoid copying or mutation
    cout << "Unordered_map<string, int>" << endl;
    for (const auto& i: v4) {
        cout << i.first << ": " << i.second << endl;
    }
    cout << endl;
}
Writing stl.cpp
[79]:
%%bash

g++ -o stl.exe stl.cpp -std=c++14
[80]:
%%bash

./stl.exe
Vecotr<int>
1
2
3
4
5

Vector<Point>
(1, 2)
(3, 4)

Map<string, int>
bar: 2
foo: 1
goodbye: 4
hello: 3

Unordered_map<string, int>
goodbye: 4
hello: 3
bar: 2
foo: 1

STL algorithms

[81]:
%%file stl_algorithm.cpp

#include <vector>
#include <iostream>
#include <numeric>

using std::cout;
using std::endl;
using std::vector;
using std::begin;
using std::end;

int main() {
    vector<int> v(10);

    // iota is somewhat like range
    std::iota(v.begin(), v.end(), 1);

    for (auto i: v) {
        cout << i << " ";
    }
    cout << endl;

    // C++ version of reduce
    cout << std::accumulate(begin(v), end(v), 0) << endl;

    // Accumulate with lambda
    cout << std::accumulate(begin(v), end(v), 1, [](int a, int b){return a * b; }) << endl;
}
Writing stl_algorithm.cpp
[82]:
%%bash

g++ -o stl_algorithm.exe stl_algorithm.cpp -std=c++14
[83]:
%%bash

./stl_algorithm.exe
1 2 3 4 5 6 7 8 9 10
55
3628800

Random numbers

[84]:
%%file random.cpp

#include <iostream>
#include <random>
#include <functional>

using std::cout;
using std::random_device;
using std::mt19937;
using std::default_random_engine;
using std::uniform_int_distribution;
using std::poisson_distribution;
using std::student_t_distribution;
using std::bind;

// start random number engine with fixed seed
// Note default_random_engine may give different values on different platforms
// default_random_engine re(1234);

// or
// Using a named engine will work the same on different platforms
// mt19937 re(1234);

// start random number generator with random seed
random_device rd;
mt19937 re(rd());

uniform_int_distribution<int> uniform(1,6); // lower and upper bounds
poisson_distribution<int> poisson(30); // rate
student_t_distribution<double> t(10); // degrees of freedom

int main()
{
    cout << "\nGenerating random numbers\n";

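    // std::ref makes all three generators draw from the same engine;
    // a plain bind(uniform, re) would give each generator its own copy of re
    auto runif = bind(uniform, std::ref(re));
    auto rpois = bind(poisson, std::ref(re));
    auto rt = bind(t, std::ref(re));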

    for (int i=0; i<10; i++) {
        cout << runif() << ", " << rpois() <<  ", " << rt() << "\n";

    }
}
Writing random.cpp
[85]:
%%bash

g++ -o random.exe random.cpp -std=c++14
[86]:
%%bash

./random.exe

Generating random numbers
3, 31, -0.770332
3, 27, -0.242753
6, 42, 0.635808
1, 22, -1.28388
6, 27, 1.84322
4, 22, 0.90192
5, 27, 0.2381
5, 23, -0.755602
6, 33, 0.389624
6, 21, 1.29621

Numerics

Using Armadillo

[87]:
%%file test_arma.cpp

#include <iostream>
#include <armadillo>

using std::cout;
using std::endl;

int main()
{
    using namespace arma;

    vec u = linspace<vec>(0,1,5);
    vec v = ones<vec>(5);
    mat A = randu<mat>(4,5); // uniform random deviates
    mat B = randn<mat>(4,5); // normal random deviates

    cout << "\nVecotrs in Armadillo\n";
    cout << u << endl;
    cout << v << endl;
    cout << u.t() * v << endl;

    cout << "\nRandom matrices in Armadillo\n";
    cout << A << endl;
    cout << B << endl;
    cout << A * B.t() << endl;
    cout << A * v << endl;

    cout << "\nQR in Armadillo\n";
    mat Q, R;
    qr(Q, R, A.t() * A);
    cout << Q << endl;
    cout << R << endl;
}
Writing test_arma.cpp
[88]:
%%bash

g++ -o test_arma.exe test_arma.cpp -std=c++14 -larmadillo
[89]:
%%bash

./test_arma.exe

Vecotrs in Armadillo
        0
   0.2500
   0.5000
   0.7500
   1.0000

   1.0000
   1.0000
   1.0000
   1.0000
   1.0000

   2.5000


Random matrices in Armadillo
   0.7868   0.0193   0.5206   0.1400   0.4998
   0.2505   0.4049   0.3447   0.5439   0.4194
   0.7107   0.2513   0.2742   0.5219   0.7443
   0.9467   0.0227   0.5610   0.8571   0.2492

  -0.7674  -0.0953   0.4285   0.3620   0.0415
  -1.1120   0.0613   1.4152  -1.9853   0.1661
  -0.7436  -1.6618  -0.0841   1.9332  -0.9602
  -0.6272  -0.2744   0.6118   0.0490  -1.3933

  -0.3111  -0.3320  -0.8700  -0.8698
   0.1312  -0.7760  -0.2394  -0.6150
  -0.2320  -1.2993  -0.6748  -1.3584
  -0.1677  -1.9175   0.6289  -0.5619

   1.9665
   1.9633
   2.5024
   2.6367


QR in Armadillo
  -0.6734   0.5087   0.2592   0.0105  -0.4696
  -0.1024  -0.6686  -0.2105   0.1467  -0.6904
  -0.3950   0.1098  -0.7627   0.4187   0.2738
  -0.4618  -0.2996  -0.1503  -0.7862   0.2373
  -0.4083  -0.4386   0.5332   0.4302   0.4142

  -3.0934e+00  -6.5241e-01  -1.8687e+00  -2.3279e+00  -2.0254e+00
        0e+00  -2.4110e-01  -4.0694e-02  -2.1686e-01  -2.5067e-01
        0e+00        0e+00  -6.0467e-02  -1.0163e-01   9.8203e-02
        0e+00        0e+00        0e+00  -2.1243e-01   1.2173e-01
        0e+00        0e+00        0e+00        0e+00   3.6082e-16

Using Eigen

[1]:
%%file test_eigen.cpp
#include <iostream>
#include <fstream>
#include <random>
#include <Eigen/Dense>
#include <functional>

using std::cout;
using std::endl;
using std::ofstream;

using std::default_random_engine;
using std::normal_distribution;
using std::bind;

// start random number engine with fixed seed
default_random_engine re{12345};

normal_distribution<double> norm(5,2); // mean and standard deviation
auto rnorm = bind(norm, re);

int main()
{
    using namespace Eigen;

    VectorXd x1(6);
    x1 << 1, 2, 3, 4, 5, 6;
    VectorXd x2 = VectorXd::LinSpaced(6, 1, 2);
    VectorXd x3 = VectorXd::Zero(6);
    VectorXd x4 = VectorXd::Ones(6);
    VectorXd x5 = VectorXd::Constant(6, 3);
    VectorXd x6 = VectorXd::Random(6);

    double data[] = {6,5,4,3,2,1};
    Map<VectorXd> x7(data, 6);

    VectorXd x8 = x6 + x7;

    MatrixXd A1(3,3);
    A1 << 1 ,2, 3,
          4, 5, 6,
          7, 8, 9;
    MatrixXd A2 = MatrixXd::Constant(3, 4, 1);
    MatrixXd A3 = MatrixXd::Identity(3, 3);

    Map<MatrixXd> A4(data, 3, 2);

    MatrixXd A5 = A4.transpose() * A4;
    MatrixXd A6 = x7 * x7.transpose();
    MatrixXd A7 = A4.array() * A4.array();
    MatrixXd A8 = A7.array().log();
    MatrixXd A9 = A8.unaryExpr([](double x) { return exp(x); });
    MatrixXd A10 = MatrixXd::Zero(3,4).unaryExpr([](double x) { return rnorm(); });

    VectorXd x9 = A1.colwise().norm();
    VectorXd x10 = A1.rowwise().sum();

    MatrixXd A11(x1.size(), 3);
    A11 << x1, x2, x3;

    MatrixXd A12(3, x1.size());
    A12 << x1.transpose(),
          x2.transpose(),
          x3.transpose();

    JacobiSVD<MatrixXd> svd(A10, ComputeThinU | ComputeThinV);


    cout << "x1: comma initializer\n" << x1.transpose() << "\n\n";
    cout << "x2: linspace\n" << x2.transpose() << "\n\n";
    cout << "x3: zeros\n" << x3.transpose() << "\n\n";
    cout << "x4: ones\n" << x4.transpose() << "\n\n";
    cout << "x5: constant\n" << x5.transpose() << "\n\n";
    cout << "x6: rand\n" << x6.transpose() << "\n\n";
    cout << "x7: mapping\n" << x7.transpose() << "\n\n";
    cout << "x8: element-wise addition\n" << x8.transpose() << "\n\n";

    cout << "max of A1\n";
    cout << A1.maxCoeff() << "\n\n";
    cout << "x9: norm of columns of A1\n" << x9.transpose() << "\n\n";
    cout << "x10: sum of rows of A1\n" << x10.transpose() << "\n\n";

    cout << "head\n";
    cout << x1.head(3).transpose() << "\n\n";
    cout << "tail\n";
    cout << x1.tail(3).transpose() << "\n\n";
    cout << "slice\n";
    cout << x1.segment(2, 3).transpose() << "\n\n";

    cout << "Reverse\n";
    cout << x1.reverse().transpose() << "\n\n";

    cout << "Indexing vector\n";
    cout << x1(0);
    cout << "\n\n";

    cout << "A1: comma initializer\n";
    cout << A1 << "\n\n";
    cout << "A2: constant\n";
    cout << A2 << "\n\n";
    cout << "A3: eye\n";
    cout << A3 << "\n\n";
    cout << "A4: mapping\n";
    cout << A4 << "\n\n";
    cout << "A5: matrix multiplication\n";
    cout << A5 << "\n\n";
    cout << "A6: outer product\n";
    cout << A6 << "\n\n";
    cout << "A7: element-wise multiplication\n";
    cout << A7 << "\n\n";
    cout << "A8: ufunc log\n";
    cout << A8 << "\n\n";
    cout << "A9: custom ufunc\n";
    cout << A9 << "\n\n";
    cout << "A10: custom ufunc for normal deviates\n";
    cout << A10 << "\n\n";
    cout << "A11: np.c_\n";
    cout << A11 << "\n\n";
    cout << "A12: np.r_\n";
    cout << A12 << "\n\n";

    cout << "2x2 block starting at (0,1)\n";
    cout << A1.block(0,1,2,2) << "\n\n";
    cout << "top 2 rows of A1\n";
    cout << A1.topRows(2) << "\n\n";
    cout << "bottom 2 rows of A1\n";
    cout << A1.bottomRows(2) << "\n\n";
    cout << "leftmost 2 cols of A1\n";
    cout << A1.leftCols(2) << "\n\n";
    cout << "rightmost 2 cols of A1\n";
    cout << A1.rightCols(2) << "\n\n";

    cout << "Diagonal elements of A1\n";
    cout << A1.diagonal() << "\n\n";
    A1.diagonal() = A1.diagonal().array().square();
    cout << "Transforming diagonal elements of A1\n";
    cout << A1 << "\n\n";

    cout << "Indexing matrix\n";
    cout << A1(0,0) << "\n\n";

    cout << "singular values\n";
    cout << svd.singularValues() << "\n\n";

    cout << "U\n";
    cout << svd.matrixU() << "\n\n";

    cout << "V\n";
    cout << svd.matrixV() << "\n\n";
}
Overwriting test_eigen.cpp
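
One detail of the listing above is easy to miss: `std::bind(norm, re)` stores copies of both the distribution and the engine, so the global `re` is never advanced by calls to `rnorm()`. The sketch below illustrates the difference using the standard library only. The helper names `make_copying_sampler` and `make_sharing_sampler` are hypothetical and not part of the original code; call them from a `main` of your own.

```cpp
#include <functional>
#include <random>

using Engine = std::default_random_engine;
using Normal = std::normal_distribution<double>;

// std::bind copies its arguments: the returned sampler owns a private
// copy of the engine, so the caller's engine is left untouched.
std::function<double()> make_copying_sampler(Engine& re) {
    return std::bind(Normal(5, 2), re);
}

// Wrapping the engine in std::ref makes the sampler draw from
// (and advance) the caller's engine instead.
std::function<double()> make_sharing_sampler(Engine& re) {
    return std::bind(Normal(5, 2), std::ref(re));
}
```

Either form produces the same stream of numbers for a given seed; they differ only in whether other code that uses the same engine object sees its state change.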
[9]:
import os
if not os.path.exists('./eigen'):
    ! git clone https://gitlab.com/libeigen/eigen.git
[10]:
%%bash

g++ -o test_eigen.exe test_eigen.cpp -std=c++11 -I./eigen
[11]:
%%bash

./test_eigen.exe
x1: comma initializer
1 2 3 4 5 6

x2: linspace
  1 1.2 1.4 1.6 1.8   2

x3: zeros
0 0 0 0 0 0

x4: ones
1 1 1 1 1 1

x5: constant
3 3 3 3 3 3

x6: rand
 0.680375 -0.211234  0.566198   0.59688  0.823295 -0.604897

x7: mapping
6 5 4 3 2 1

x8: element-wise addition
 6.68038  4.78877   4.5662  3.59688  2.82329 0.395103

max of A1
9

x9: norm of columns of A1
8.12404 9.64365  11.225

x10: sum of rows of A1
 6 15 24

head
1 2 3

tail
4 5 6

slice
3 4 5

Reverse
6 5 4 3 2 1

Indexing vector
1

A1: comma initializer
1 2 3
4 5 6
7 8 9

A2: constant
1 1 1 1
1 1 1 1
1 1 1 1

A3: eye
1 0 0
0 1 0
0 0 1

A4: mapping
6 3
5 2
4 1

A5: matrix multiplication
77 32
32 14

A6: outer product
36 30 24 18 12  6
30 25 20 15 10  5
24 20 16 12  8  4
18 15 12  9  6  3
12 10  8  6  4  2
 6  5  4  3  2  1

A7: element-wise multiplication
36  9
25  4
16  1

A8: ufunc log
3.58352 2.19722
3.21888 1.38629
2.77259       0

A9: custom ufunc
36  9
25  4
16  1

A10: custom ufunc for normal deviates
5.22353 6.16474 3.67474 6.18264
3.81868 4.07998 3.52576 3.82792
3.74872 5.76697  5.3517 2.85655

A11: np.c_
  1   1   0
  2 1.2   0
  3 1.4   0
  4 1.6   0
  5 1.8   0
  6   2   0

A12: np.r_
  1   2   3   4   5   6
  1 1.2 1.4 1.6 1.8   2
  0   0   0   0   0   0

2x2 block starting at (0,1)
2 3
5 6

top 2 rows of A1
1 2 3
4 5 6

bottom 2 rows of A1
4 5 6
7 8 9

leftmost 2 cols of A1
1 2
4 5
7 8

rightmost 2 cols of A1
2 3
5 6
8 9

Diagonal elements of A1
1
5
9

Transforming diagonal elements of A1
 1  2  3
 4 25  6
 7  8 81

Indexing matrix
1

singular values
15.8886
2.58818
0.54229

U
 -0.67345  0.607383  0.421368
-0.479529 0.0748696 -0.874326
-0.562598 -0.790873  0.240837

V
 -0.46939  0.190799 -0.433197
-0.588634 -0.197482  0.773183
-0.451663 -0.670964 -0.452453
-0.478731   0.68877 -0.099069
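
The `A4: mapping` output above follows from Eigen's default column-major storage: `Map<MatrixXd>(data, 3, 2)` reads the first three values of `data` as column 0 and the next three as column 1. A minimal plain C++ sketch of that index arithmetic, using a hypothetical helper `col_major_at` (not an Eigen function):

```cpp
#include <cstddef>

// Element (i, j) of a matrix with `rows` rows that is stored
// column by column in a flat buffer, the layout Eigen uses by default.
double col_major_at(const double* data, std::size_t rows,
                    std::size_t i, std::size_t j) {
    return data[i + j * rows];
}
```

With `double data[] = {6,5,4,3,2,1}` and `rows = 3`, `col_major_at(data, 3, 0, 1)` is `3` and `col_major_at(data, 3, 2, 0)` is `4`, matching the printed `A4`.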

Check SVD

Note that the A10 entered below is not the matrix printed by the Eigen run above (the values differ), so the singular values and vectors below are not expected to match that output.

[12]:
import numpy as np

A10 = np.array([
    [5.17237, 3.73572, 6.29422, 6.55268],
    [5.33713, 3.88883, 1.93637, 4.39812],
    [8.22086, 6.94502, 6.36617,  6.5961]
])

U, s, Vt = np.linalg.svd(A10, full_matrices=False)
[13]:
s
[13]:
array([19.50007376,  2.80674189,  1.29869186])
[14]:
U
[14]:
array([[-0.55849978, -0.75124103, -0.3517313 ],
       [-0.40681745,  0.61759344, -0.67311062],
       [-0.72289526,  0.2328417 ,  0.65054376]])
[15]:
Vt.T
[15]:
array([[-0.56424535,  0.47194895, -0.04907563],
       [-0.44558625,  0.43195279,  0.45157518],
       [-0.45667231, -0.73048295,  0.48064272],
       [-0.52395657, -0.23890509, -0.75010267]])
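
A direct way to check any thin SVD is to rebuild the matrix as U·diag(s)·Vᵀ and compare it with the input. The sketch below does this in plain C++ for the 3×4 case, so it needs no linear algebra library. The names `reconstruct` and `max_abs_diff` are hypothetical helpers introduced for illustration; call them from a `main` of your own.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Mat3x4 = std::array<std::array<double, 4>, 3>;
using Mat3x3 = std::array<std::array<double, 3>, 3>;
using Mat4x3 = std::array<std::array<double, 3>, 4>;
using Vec3 = std::array<double, 3>;

// Rebuild A = U * diag(s) * V^T from the thin SVD of a 3x4 matrix.
// V is passed untransposed (4x3), as returned by svd.matrixV() or Vt.T.
Mat3x4 reconstruct(const Mat3x3& U, const Vec3& s, const Mat4x3& V) {
    Mat3x4 A{};  // value-initialized to all zeros
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            for (std::size_t k = 0; k < 3; ++k)
                A[i][j] += U[i][k] * s[k] * V[j][k];
    return A;
}

// Largest absolute entry-wise difference between two 3x4 matrices.
double max_abs_diff(const Mat3x4& X, const Mat3x4& Y) {
    double d = 0.0;
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            d = std::fmax(d, std::fabs(X[i][j] - Y[i][j]));
    return d;
}
```

Feeding in the NumPy results printed above (`U`, `s` and `Vt.T`) should reproduce the entered `A10` to within the rounding of the printed digits. In Eigen the same product can be written as `svd.matrixU() * svd.singularValues().asDiagonal() * svd.matrixV().transpose()`.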

Probability distributions and statistics

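StatsLib (kthohr/stats) is a nicer header-only library for working with probability distributions; it depends on the companion GCE-Math library (kthohr/gcem). The example below shows integration with Armadillo; integration with Eigen is also possible.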

[16]:
import os

if not os.path.exists('./stats'):
    ! git clone https://github.com/kthohr/stats.git
if not os.path.exists('./gcem'):
    ! git clone https://github.com/kthohr/gcem.git
Cloning into 'stats'...
remote: Enumerating objects: 6248, done.
remote: Total 6248 (delta 0), reused 0 (delta 0), pack-reused 6248
Receiving objects: 100% (6248/6248), 1.29 MiB | 0 bytes/s, done.
Resolving deltas: 100% (5468/5468), done.
Checking connectivity... done.
Cloning into 'gcem'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 2153 (delta 0), reused 1 (delta 0), pack-reused 2148
Receiving objects: 100% (2153/2153), 435.40 KiB | 0 bytes/s, done.
Resolving deltas: 100% (1668/1668), done.
Checking connectivity... done.
[17]:
%%file stats.cpp
#define STATS_ENABLE_STDVEC_WRAPPERS
#define STATS_ENABLE_ARMA_WRAPPERS
// #define STATS_ENABLE_EIGEN_WRAPPERS
#include <iostream>
#include <vector>
#include "stats.hpp"

using std::cout;
using std::endl;
using std::vector;

// set seed for random engine to 1776
std::mt19937_64 engine(1776);

int main() {
    // evaluate the normal PDF at x = 1, mu = 0, sigma = 1
    double dval_1 = stats::dnorm(1.0,0.0,1.0);

    // evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value
    double dval_2 = stats::dnorm(1.0,0.0,1.0,true);

    // evaluate the normal CDF at x = 1, mu = 0, sigma = 1
    double pval = stats::pnorm(1.0,0.0,1.0);

    // evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1
    double qval = stats::qlaplace(0.1,0.0,1.0);

    // draw from a normal distribution with mean 100 and sd 15
    double rval = stats::rnorm(100, 15);

    // Use with std::vectors
    vector<int> pois_rvs = stats::rpois<vector<int> >(1, 10, 3);
    cout << "Poisson draws with rate=3 into std::vector" << endl;
    for (auto &x : pois_rvs) {
        cout << x << ", ";
    }
    cout << endl;


    // Example of Armadillo usage: only one matrix library can be used at a time
    arma::mat beta_rvs = stats::rbeta<arma::mat>(5,5,3.0,2.0);
    // matrix input
    arma::mat beta_cdf_vals = stats::pbeta(beta_rvs,3.0,2.0);

    /* Example of Eigen usage: only one matrix library can be used at a time
    Eigen::MatrixXd gamma_rvs = stats::rgamma<Eigen::MatrixXd>(10, 5,3.0,2.0);
    */

    cout << "evaluate the normal PDF at x = 1, mu = 0, sigma = 1" << endl;
    cout << dval_1 << endl;

    cout << "evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value" << endl;
    cout << dval_2 << endl;
    cout << "evaluate the normal CDF at x = 1, mu = 0, sigma = 1" << endl;
    cout << pval << endl;

    cout << "evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1" << endl;
    cout << qval << endl;

    cout << "draw from a normal distribution with mean 100 and sd 15" << endl;
    cout << rval << endl;



    cout << "draws from a beta distribution to populate Armadillo matrix" << endl;
    cout << beta_rvs << endl;

    cout << "evaluate CDF for beta draws from Armadillo inputs" << endl;
    cout << beta_cdf_vals << endl;

    /*  If using Eigen
    cout << "draws from a Gamma distribution to populate Eigen matrix" << endl;
    cout << gamma_rvs << endl;
    */
}
Writing stats.cpp
[18]:
%%bash

g++ -std=c++11 -I./stats/include -I./gcem/include -I./eigen stats.cpp -o stats.exe
[19]:
%%bash

./stats.exe
Poisson draws with rate=3 into std::vector
1, 3, 5, 6, 1, 3, 1, 1, 2, 3,
evaluate the normal PDF at x = 1, mu = 0, sigma = 1
0.241971
evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value
-1.41894
evaluate the normal CDF at x = 1, mu = 0, sigma = 1
0.841345
evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1
-1.60944
draw from a normal distribution with mean 100 and sd 15
120.059
draws from a beta distribution to populate Armadillo matrix
   0.6504   0.3613   0.5206   0.5956   0.6657
   0.4931   0.0844   0.5793   0.3354   0.1347
   0.8770   0.2274   0.7872   0.6645   0.5689
   0.6129   0.9087   0.8134   0.6172   0.6005
   0.6285   0.8030   0.6545   0.5177   0.3090

evaluate CDF for beta draws from Armadillo inputs
   0.5637   0.1375   0.3440   0.4677   0.5909
   0.3022   0.0023   0.4398   0.1130   0.0088
   0.9234   0.0390   0.7993   0.5887   0.4222
   0.4975   0.9558   0.8394   0.5051   0.4760
   0.5249   0.8237   0.5710   0.3395   0.0906
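
The deterministic values printed above (normal PDF, log PDF, normal CDF and Laplace quantile) have closed forms, so they can be cross-checked with the standard library alone. The sketch below is independent of StatsLib. The function names are hypothetical, and the Laplace parameters are assumed to be a location `mu` and a scale `b`, which is consistent with the printed value of -1.60944.

```cpp
#include <cassert>
#include <cmath>

// Normal density at x with mean mu and standard deviation sigma.
double normal_pdf(double x, double mu, double sigma) {
    assert(sigma > 0);
    const double pi = std::acos(-1.0);
    const double z = (x - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

// Normal CDF written in terms of the complementary error function.
double normal_cdf(double x, double mu, double sigma) {
    assert(sigma > 0);
    return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}

// Laplace quantile for location mu and scale b, with 0 < p < 1.
double laplace_quantile(double p, double mu, double b) {
    assert(p > 0 && p < 1 && b > 0);
    return p < 0.5 ? mu + b * std::log(2.0 * p)
                   : mu - b * std::log(2.0 * (1.0 - p));
}
```

For example, `normal_pdf(1, 0, 1)`, `std::log(normal_pdf(1, 0, 1))`, `normal_cdf(1, 0, 1)` and `laplace_quantile(0.1, 0, 1)` agree with the four values printed by `stats.exe` to the digits shown.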

Solution to exercise

[20]:
%%file greet.cpp

#include <iostream>
#include <string>
using std::string;
using std::cout;

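int main(int argc, char* argv[]) {
    // expect exactly two arguments: a name and a repetition count
    if (argc != 3) {
        std::cerr << "Usage: " << argv[0] << " NAME COUNT\n";
        return 1;
    }
    string name = argv[1];
    int n = std::stoi(argv[2]);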

    for (int i=0; i<n; i++) {
        cout << "Hello " << name << "!" << "\n";
    }
}
Writing greet.cpp
[21]:
%%bash

g++ -std=c++11 greet.cpp -o greet
[22]:
%%bash

./greet Santa 3
Hello Santa!
Hello Santa!
Hello Santa!
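
`std::stoi` throws `std::invalid_argument` or `std::out_of_range` when the count is not a usable integer, so a more defensive variant of the solution might separate parsing from printing. The sketch below is one possible arrangement. `parse_count` and `make_greeting` are hypothetical helper names, and returning -1 for bad input is an arbitrary convention chosen here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Parse a repetition count. Returns -1 if text is not a
// non-negative integer that fits in an int.
int parse_count(const std::string& text) {
    try {
        std::size_t used = 0;
        int n = std::stoi(text, &used);
        // reject trailing characters such as "3x" and negative counts
        if (used != text.size() || n < 0) return -1;
        return n;
    } catch (const std::invalid_argument&) {
        return -1;
    } catch (const std::out_of_range&) {
        return -1;
    }
}

// Build the greeting text: n lines of "Hello <name>!".
std::string make_greeting(const std::string& name, int n) {
    std::string out;
    for (int i = 0; i < n; ++i) out += "Hello " + name + "!\n";
    return out;
}
```

A `main` could then print a usage message when `parse_count(argv[2])` is negative and otherwise write `make_greeting(argv[1], n)` to `std::cout`.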