Introduction to C++¶
Hello world¶
There are many lessons in writing a simple “Hello world” program
C++ programs are normally written using a text editor or integrated development environment (IDE) - we use the %%file magic to simulate this
The #include statement literally pulls in and prepends the source code from the iostream header file
Types must be declared - note the function return type is int
There is a single function called main - every program has main as the entry point, although you can write libraries without a main function
Notice the use of braces to delimit blocks
Notice the use of semi-colons to terminate statements
Unlike Python, white space is not used to delimit blocks or statements, only tokens
Note the use of the std namespace - this is similar to Python except C++ uses :: rather than . (like R)
The I/O shown here uses streaming via the << operator to send output to cout, which is the name for the standard output
std::endl provides a line break and flushes the output buffer
[1]:
%%file hello.cpp
#include <iostream>
int main() {
std::cout << "Hello, world!" << std::endl;
}
Writing hello.cpp
Compilation¶
The source file must be compiled to machine code before it can be executed
Compilation is done with a C++ compiler - here we use one called g++
By default, the output of compilation is called a.out - we use -o to change the output executable filename to hello.exe
Note the use of .exe is a Windows convention; Unix executables typically have no extension - for example, the executable could simply be named hello
[2]:
%%bash
g++ hello.cpp -o hello.exe
C equivalent¶
Before we move on, we briefly show the similar Hello world program in C. C is a precursor to C++ that is still widely used. While C++ is derived from C, it is a much richer and more complex language. We focus on C++ because the intent is to show how to wrap C++ code using pybind11 and take advantage of C++ numerical libraries that do not exist in C.
[4]:
%%file hello01.c
#include <stdio.h>
int main() {
printf("Hello, world from C!\n");
}
Writing hello01.c
[5]:
%%bash
gcc hello01.c
[6]:
%%bash
./a.out
Hello, world from C!
Namespaces¶
Just like Python, C++ has namespaces that allow us to build large libraries without worrying about name collisions. In the Hello world program, we used the explicit name std::cout, indicating that cout is a member of the standard namespace. We can also use the using keyword to import selected functions or classes from a namespace.
#include <iostream>
using std::cout;
int main()
{
    cout << "Hello, world!\n";
}
For small programs, we sometimes import the entire namespace for convenience, but this may cause namespace collisions in larger programs.
#include <iostream>
using namespace std;
int main()
{
    cout << "Hello, world!\n";
}
You can easily create your own namespace.
#include <iostream>
#include <string>
using namespace std;
namespace sta_663 {
    const double pi = 3.14159;
    void greet(string name) {
        cout << "\nTraditional first program\n";
        cout << "Hello, " << name << "\n";
    }
}
int main()
{
    cout << "\nUsing namespaces\n";
    string name = "Tom";
    cout << sta_663::pi << "\n";
    sta_663::greet(name);
}
[7]:
%%file hello02.cpp
#include <iostream>
using std::cout;
using std::endl;
int main() {
cout << "Hello, world!" << endl;
}
Writing hello02.cpp
[8]:
%%bash
g++ hello02.cpp -o hello02
[9]:
%%bash
./hello02
Hello, world!
Wholesale imports of a namespace are generally frowned upon, similar to how from X import * is frowned upon in Python.
[10]:
%%file hello03.cpp
#include <iostream>
using namespace std;
int main() {
cout << "Hello, world!" << endl;
}
Writing hello03.cpp
[11]:
%%bash
g++ hello03.cpp -o hello03
[12]:
%%bash
./hello03
Hello, world!
Types¶
[13]:
%%file dtypes.cpp
#include <iostream>
#include <complex>
#include <string>
using std::cout;
int main() {
// Boolean
bool a = true, b = false;
cout << "and " << (a and b) << "\n";
cout << "&& " << (a && b) << "\n";
cout << "or " << (a or b) << "\n";
cout << "|| " << (a || b) << "\n";
cout << "not " << not (a or b) << "\n";
cout << "! " << !(a or b) << "\n";
// Integral numbers
cout << "char " << sizeof(char) << "\n";
cout << "short int " << sizeof(short int) << "\n";
cout << "int " << sizeof(int) << "\n";
cout << "long " << sizeof(long) << "\n";
// Floating point numbers
cout << "float " << sizeof(float) << "\n";
cout << "double " << sizeof(double) << "\n";
cout << "long double " << sizeof(long double) << "\n";
cout << "complex double " << sizeof(std::complex<double>) << "\n";
// Characters and strings
char c = 'a'; // Note single quotes
char word[] = "hello"; // C char arrays
std::string s = "hello"; // C++ string
cout << c << "\n";
cout << word << "\n";
cout << s << "\n";
}
Writing dtypes.cpp
[14]:
%%bash
g++ dtypes.cpp -o dtypes.exe
./dtypes.exe
and 0
&& 0
or 1
|| 1
not 0
! 0
char 1
short int 2
int 4
long 8
float 4
double 8
long double 16
complex double 16
a
hello
hello
Type conversions¶
Converting between types can get pretty complicated in C++. We will show some simple versions.
[15]:
%%file type.cpp
#include <iostream>
#include <string>
using std::cout;
using std::string;
using std::stoi;
int main() {
char c = '3'; // A char is an integer type
string s = "3"; // A string is not an integer type
int i = 3;
float f = 3.1;
double d = 3.2;
cout << c << "\n";
cout << i << "\n";
cout << f << "\n";
cout << d << "\n";
cout << "c + i is " << c + i << "\n";
cout << "c + i is " << c - '0' + i << "\n";
// Casting string to number
cout << "s + i is " << stoi(s) + i << "\n"; // Use std::stod to convert to double
// Two ways to cast float to int
cout << "f + i is " << f + i << "\n";
cout << "f + i is " << int(f) + i << "\n";
cout << "f + i is " << static_cast<int>(f) + i << "\n";
}
Writing type.cpp
[16]:
%%bash
g++ -o type.exe type.cpp -std=c++14
[17]:
%%bash
./type.exe
3
3
3.1
3.2
c + i is 54
c + i is 6
s + i is 6
f + i is 6.1
f + i is 6
f + i is 6
Header, source, and driver files¶
C++ allows separate compilation of functions and the programs that use those functions. Functions are written in source files that can be compiled on their own. To use these compiled functions, the calling program includes header files that contain the function signatures - this gives the compiler enough information to check each call, and the linker then connects the calls to the compiled machine code when the executable is built.
Here we show a toy example of typical C++ program organization
We build a library of math functions in my_math.cpp
We add a header file for the math functions in my_math.hpp
We build a library of stats functions in my_stats.cpp
We add a header file for the stats functions in my_stats.hpp
We write a program that uses math and stats functions called my_driver.cpp
We pull in the function signatures with #include for the header files
Once you understand the code, move on to see how compilation is done
Note that it is customary to include the header file in the source file itself to let the compiler catch any mistakes in the function signatures
[18]:
%%file my_math.hpp
#pragma once
int add(int a, int b);
int multiply(int a, int b);
Writing my_math.hpp
[19]:
%%file my_math.cpp
#include "my_math.hpp"
int add(int a, int b) {
return a + b;
}
int multiply(int a, int b) {
return a * b;
}
Writing my_math.cpp
[20]:
%%file my_stats.hpp
#pragma once
int mean(int xs[], int n);
Writing my_stats.hpp
[21]:
%%file my_stats.cpp
#include "my_stats.hpp"
int mean(int xs[], int n) {
double s = 0;
for (int i=0; i<n; i++) {
s += xs[i];
}
return s/n;
}
Writing my_stats.cpp
[22]:
%%file my_driver.cpp
#include <iostream>
#include "my_math.hpp"
#include "my_stats.hpp"
int main() {
int xs[] = {1,2,3,4,5};
int n = 5;
int a = 3, b= 4;
std::cout << "sum = " << add(a, b) << "\n";
std::cout << "prod = " << multiply(a, b) << "\n";
std::cout << "mean = " << mean(xs, n) << "\n";
}
Writing my_driver.cpp
Compilation
Notice in the first 2 compile statements that the source files are compiled to object files with default extension .o by using the flag -c
The 3rd compile statement builds an executable by linking the main file with the recently created object files
The function signatures in the included header files tell the compiler how to match the function calls add, multiply and mean with the matching compiled functions
[23]:
%%bash
g++ -c my_math.cpp
g++ -c my_stats.cpp
g++ my_driver.cpp my_math.o my_stats.o
[24]:
%%bash
./a.out
sum = 7
prod = 12
mean = 3
Using make¶
As building C++ programs can quickly become quite complicated, there are builder programs that help simplify this task. One of the most widely used is make, which uses a file normally called Makefile to coordinate the instructions for building a program
Note that make can be used for more than compiling programs; for example, you can use it to automatically rebuild tables and figures for a manuscript whenever the data is changed
Another advantage of make is that it keeps track of dependencies, and only re-compiles files that have changed or depend on another changed file since the last compilation
We will build a simple Makefile to build the my_driver executable:
Each section consists of a make target denoted by <target>: followed by files the target depends on
The next line is the command given to build the target. This must begin with a TAB character (it MUST be a TAB and not spaces)
If a target has dependencies that are not met, make will see if each dependency itself is a target and build that first
It uses timestamps to decide whether to rebuild a target (not actual changes to file contents)
By default, make builds the first target, but can also build named targets
How to get the TAB character: copy and paste the blank space between a and b.
[25]:
! echo "a\tb"
a b
[26]:
%%file Makefile
driver: my_math.o my_stats.o
	g++ my_driver.cpp my_math.o my_stats.o -o my_driver
my_math.o: my_math.cpp my_math.hpp
	g++ -c my_math.cpp
my_stats.o: my_stats.cpp my_stats.hpp
	g++ -c my_stats.cpp
Writing Makefile
We first start with a clean slate
[27]:
%%capture logfile
%%bash
rm *\.o
rm my_driver
[28]:
%%bash
make
g++ -c my_math.cpp
g++ -c my_stats.cpp
g++ my_driver.cpp my_math.o my_stats.o -o my_driver
[29]:
%%bash
./my_driver
sum = 7
prod = 12
mean = 3
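Re-building does not trigger re-compilation of the source files since their timestamps have not changed. The link step still runs because the recipe for the driver target writes my_driver, so a file named driver never exists and make always considers that target out of date.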
[30]:
%%bash
make
g++ my_driver.cpp my_math.o my_stats.o -o my_driver
[31]:
%%bash
touch my_stats.hpp
As my_stats.hpp was listed as a dependency of the target my_stats.o, touch, which updates the timestamp, forces a recompilation of my_stats.o
[32]:
%%bash
make
g++ -c my_stats.cpp
g++ my_driver.cpp my_math.o my_stats.o -o my_driver
Use of variables in Makefile¶
[33]:
%%file Makefile2
CC=g++
CFLAGS=-Wall -std=c++14
driver: my_math.o my_stats.o
	$(CC) $(CFLAGS) my_driver.cpp my_math.o my_stats.o -o my_driver2
my_math.o: my_math.cpp my_math.hpp
	$(CC) $(CFLAGS) -c my_math.cpp
my_stats.o: my_stats.cpp my_stats.hpp
	$(CC) $(CFLAGS) -c my_stats.cpp
Writing Makefile2
Compilation¶
Note that no source files are re-compiled, since the object files are already up to date - only the link step runs!
[34]:
%%bash
make -f Makefile2
g++ -Wall -std=c++14 my_driver.cpp my_math.o my_stats.o -o my_driver2
Input and output¶
[36]:
%%file main_args.cpp
#include <iostream>
using std::cout;
int main(int argc, char* argv[]) {
for (int i=0; i<argc; i++) {
cout << i << ": " << argv[i] << "\n";
}
}
Writing main_args.cpp
[37]:
%%bash
g++ main_args.cpp -o main_args
[38]:
%%bash
./main_args hello 1 2 3
0: ./main_args
1: hello
2: 1
3: 2
4: 3
Exercise
Write, compile and execute a program called greet
that when called on the command line with
greet Santa 3
gives the output
Hello Santa!
Hello Santa!
Hello Santa!
[39]:
%%file data.txt
9 6
Writing data.txt
[40]:
%%file io.cpp
#include <fstream>
#include "my_math.hpp"
int main() {
std::ifstream fin("data.txt");
std::ofstream fout("result.txt");
double a, b;
fin >> a >> b;
fin.close();
fout << add(a, b) << std::endl;
fout << multiply(a, b) << std::endl;
fout.close();
}
Writing io.cpp
[41]:
%%bash
g++ io.cpp -o io.exe my_math.cpp
[42]:
%%bash
./io.exe
[43]:
! cat result.txt
15
54
Arrays¶
[44]:
%%file array.cpp
#include <iostream>
using std::cout;
using std::endl;
int main() {
const int N = 3; // array size must be a compile-time constant in standard C++
double counts[N];
counts[0] = 1;
counts[1] = 3;
counts[2] = 3;
double avg = (counts[0] + counts[1] + counts[2])/3;
cout << avg << endl;
}
Writing array.cpp
[45]:
%%bash
g++ -o array.exe array.cpp
[46]:
%%bash
./array.exe
2.33333
Loops¶
[47]:
%%file loop.cpp
#include <iostream>
using std::cout;
using std::endl;
using std::begin;
using std::end;
int main()
{
int x[] = {1, 2, 3, 4, 5};
cout << "\nTraditional for loop\n";
for (int i=0; i < sizeof(x)/sizeof(x[0]); i++) {
cout << i << endl;
}
cout << "\nUsing iterators\n";
for (auto it=begin(x); it != end(x); it++) {
cout << *it << endl;
}
cout << "\nRanged for loop\n\n";
for (auto const &i : x) {
cout << i << endl;
}
}
Writing loop.cpp
[48]:
%%bash
g++ -o loop.exe loop.cpp -std=c++14
[49]:
%%bash
./loop.exe
Traditional for loop
0
1
2
3
4
Using iterators
1
2
3
4
5
Ranged for loop
1
2
3
4
5
Function arguments¶
A value argument means that the argument is copied into the function, so changes made inside the function do not affect the caller's variable
A reference argument means that the function works directly on the caller's variable (via its address) rather than on a copy. Reference or pointer arguments are used to avoid copying large objects.
[50]:
%%file func_arg.cpp
#include <iostream>
using std::cout;
using std::endl;
// Value parameter
void f1(int x) {
x *= 2;
cout << "In f1 : x=" << x << endl;
}
// Reference parameter
void f2(int &x) {
x *= 2;
cout << "In f2 : x=" << x << endl;
}
/* Note
If you want to avoid side effects
but still use references to avoid a copy operation
use a const reference like this to indicate that x cannot be changed
void f2(const int &x)
*/
/* Note
Raw pointers are prone to error and
generally avoided in modern C++
See unique_ptr and shared_ptr
*/
// Raw pointer parameter
void f3(int *x) {
*x *= 2;
cout << "In f3 : x=" << *x << endl;
}
int main() {
int x = 1;
cout << "Before f1: x=" << x << "\n";
f1(x);
cout << "After f1 : x=" << x << "\n";
cout << "Before f2: x=" << x << "\n";
f2(x);
cout << "After f2 : x=" << x << "\n";
cout << "Before f3: x=" << x << "\n";
f3(&x);
cout << "After f3 : x=" << x << "\n";
}
Writing func_arg.cpp
[51]:
%%bash
c++ -o func_arg.exe func_arg.cpp --std=c++14
[52]:
%%bash
./func_arg.exe
Before f1: x=1
In f1 : x=2
After f1 : x=1
Before f2: x=1
In f2 : x=2
After f2 : x=2
Before f3: x=2
In f3 : x=4
After f3 : x=4
Arrays, pointers and dynamic memory¶
A pointer is a number that represents an address in computer memory. What is stored at the address is a bunch of binary numbers. How those binary numbers are interpreted depends on the type of the pointer. To get the value at the pointer address, we dereference the pointer using *ptr. Pointers are often used to indicate the start of a block of values - the name of a plain C-style array is converted to a pointer to the start of the array in most expressions.
For example, the argument char** argv means that argv has type pointer to pointer to char. The pointer to char can be thought of as an array of char, hence the argument is also sometimes written as char* argv[] to indicate an array of pointers to char. So conceptually, it refers to an array of char arrays - or a collection of strings.
We generally avoid using raw pointers in C++, but this is standard in C and you should at least understand what is going on.
In C++, we typically use smart pointers, STL containers or convenient array constructs provided by libraries such as Eigen and Armadillo.
Pointers and addresses¶
[53]:
%%file p01.cpp
#include <iostream>
using std::cout;
int main() {
int x = 23;
int *xp;
xp = &x;
cout << "x " << x << "\n";
cout << "Address of x " << &x << "\n";
cout << "Pointer to x " << xp << "\n";
cout << "Value at pointer to x " << *xp << "\n";
}
Writing p01.cpp
[54]:
%%bash
g++ -o p01.exe p01.cpp -std=c++14
./p01.exe
x 23
Address of x 0x7ffef073c6d4
Pointer to x 0x7ffef073c6d4
Value at pointer to x 23
Arrays¶
[55]:
%%file p02.cpp
#include <iostream>
using std::cout;
using std::begin;
using std::end;
int main() {
int xs[] = {1,2,3,4,5};
int ys[3];
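// Caution: ys has only 3 elements, so the writes to ys[3] and ys[4] are out of
// bounds (undefined behavior). In the run shown below they happen to overwrite xs[0].
for (int i=0; i<5; i++) {
    ys[i] = i*i;
}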
for (auto x=begin(xs); x!=end(xs); x++) {
cout << *x << " ";
}
cout << "\n";
for (auto x=begin(ys); x!=end(ys); x++) {
cout << *x << " ";
}
cout << "\n";
}
Writing p02.cpp
[56]:
%%bash
g++ -o p02.exe p02.cpp -std=c++14
./p02.exe
16 2 3 4 5
0 1 4
Dynamic memory¶
Use new and delete for dynamic memory allocation in C++.
Do not use the C style malloc, calloc and free
Absolutely never mix the C++ and C style dynamic memory allocation
[57]:
%%file p03.cpp
#include <iostream>
using std::cout;
using std::begin;
using std::end;
int main() {
// declare memory
int *z = new int; // single integer
*z = 23;
// Allocate on heap
int *zs = new int[3]; // array of 3 integers
for (int i=0; i<3; i++) {
zs[i] = 10*i;
}
cout << *z << "\n";
for (int i=0; i < 3; i++) {
cout << zs[i] << " ";
}
cout << "\n";
// need for manual management of dynamically assigned memory
delete z;
delete[] zs;
}
Writing p03.cpp
[58]:
%%bash
g++ -o p03.exe p03.cpp -std=c++14
./p03.exe
23
0 10 20
Pointer arithmetic¶
When you increment or decrement a pointer, it moves to the next or preceding location in memory as appropriate for the pointer type. You can also add or subtract a number, since that is equivalent to multiple increments/decrements. This is known as pointer arithmetic.
[59]:
%%file p04.cpp
#include <iostream>
using std::cout;
using std::begin;
using std::end;
int main() {
int xs[] = {100,200,300,400,500,600,700,800,900,1000};
cout << xs << ": " << *xs << "\n";
cout << &xs << ": " << *xs << "\n";
cout << &xs[3] << ": " << xs[3] << "\n";
cout << xs+3 << ": " << *(xs+3) << "\n";
}
Writing p04.cpp
[60]:
%%bash
g++ -std=c++11 -o p04.exe p04.cpp
./p04.exe
0x7ffe7687db60: 100
0x7ffe7687db60: 100
0x7ffe7687db6c: 400
0x7ffe7687db6c: 400
C style dynamic memory for jagged array (“matrix”)¶
[61]:
%%file p05.cpp
#include <iostream>
using std::cout;
using std::begin;
using std::end;
int main() {
int m = 3;
int n = 4;
int **xss = new int*[m]; // assign memory for m pointers to int
for (int i=0; i<m; i++) {
xss[i] = new int[n]; // assign memory for array of n ints
for (int j=0; j<n; j++) {
xss[i][j] = i*10 + j;
}
}
for (int i=0; i<m; i++) {
for (int j=0; j<n; j++) {
cout << xss[i][j] << "\t";
}
cout << "\n";
}
// Free memory
for (int i=0; i<m; i++) {
delete[] xss[i];
}
delete[] xss;
}
Writing p05.cpp
[62]:
%%bash
g++ -std=c++11 -o p05.exe p05.cpp
./p05.exe
0 1 2 3
10 11 12 13
20 21 22 23
Functions¶
[63]:
%%file func01.cpp
#include <iostream>
double add(double x, double y) {
return x + y;
}
double mult(double x, double y) {
return x * y;
}
int main() {
double a = 3;
double b = 4;
std::cout << add(a, b) << std::endl;
std::cout << mult(a, b) << std::endl;
}
Writing func01.cpp
[64]:
%%bash
g++ -o func01.exe func01.cpp -std=c++14
./func01.exe
7
12
Function parameters¶
In the example below, the space allocated inside a function is deleted outside the function. Such code in practice will almost certainly lead to memory leakage. This is why C++ functions often put the output as an argument to the function, so that all memory allocation can be controlled outside the function.
void add(double *x, double *y, double *res, int n)
[65]:
%%file func02.cpp
#include <iostream>
double* add(double *x, double *y, int n) {
double *res = new double[n];
for (int i=0; i<n; i++) {
res[i] = x[i] + y[i];
}
return res;
}
int main() {
double a[] = {1,2,3};
double b[] = {4,5,6};
int n = 3;
double *c = add(a, b, n);
for (int i=0; i<n; i++) {
std::cout << c[i] << " ";
}
std::cout << "\n";
delete[] c; // Note difficulty of book-keeping when using raw pointers!
}
Writing func02.cpp
[66]:
%%bash
g++ -o func02.exe func02.cpp -std=c++14
./func02.exe
5 7 9
[67]:
%%file func03.cpp
#include <iostream>
using std::cout;
// Using value
void foo1(int x) {
x = x + 1;
}
// Using pointer
void foo2(int *x) {
*x = *x + 1;
}
// Using ref
void foo3(int &x) {
x = x + 1;
}
int main() {
int x = 0;
cout << x << "\n";
foo1(x);
cout << x << "\n";
foo2(&x);
cout << x << "\n";
foo3(x);
cout << x << "\n";
}
Writing func03.cpp
[68]:
%%bash
g++ -o func03.exe func03.cpp -std=c++14
./func03.exe
0
0
1
2
Generic programming with templates¶
In C, you need to write a different function for each input type - hence resulting in duplicated code like
int iadd(int a, int b)
float fadd(float a, float b)
In C++, you can make functions generic by using templates.
Note: When you have a template function, the entire function is normally written in the header file, and not the source file, because the compiler needs to see the full definition wherever the template is instantiated. Hence, heavily templated libraries are often “header-only”.
[69]:
%%file template.cpp
#include <iostream>
template<typename T>
T add(T a, T b) {
return a + b;
}
int main() {
int m =2, n =3;
double u = 2.5, v = 4.5;
std::cout << add(m, n) << std::endl;
std::cout << add(u, v) << std::endl;
}
Writing template.cpp
[70]:
%%bash
g++ -o template.exe template.cpp
[71]:
%%bash
./template.exe
5
7
Anonymous functions¶
[72]:
%%file lambda.cpp
#include <iostream>
using std::cout;
using std::endl;
int main() {
int a = 3, b = 4;
int c = 0;
// Lambda function with no capture
auto add1 = [] (int a, int b) { return a + b; };
// Lambda function with value capture
auto add2 = [c] (int a, int b) { return c * (a + b); };
// Lambda function with reference capture
auto add3 = [&c] (int a, int b) { return c * (a + b); };
// Change value of c after function definition
c += 5;
cout << "Lambda function\n";
cout << add1(a, b) << endl;
cout << "Lambda function with value capture\n";
cout << add2(a, b) << endl;
cout << "Lambda function with reference capture\n";
cout << add3(a, b) << endl;
}
Writing lambda.cpp
[73]:
%%bash
c++ -o lambda.exe lambda.cpp --std=c++14
[74]:
%%bash
./lambda.exe
Lambda function
7
Lambda function with value capture
0
Lambda function with reference capture
35
Function pointers¶
[75]:
%%file func_pointer.cpp
#include <iostream>
#include <vector>
#include <functional>
using std::cout;
using std::endl;
using std::function;
using std::vector;
int main()
{
cout << "\nUsing generalized function pointers\n";
using func = function<double(double, double)>;
auto f1 = [](double x, double y) { return x + y; };
auto f2 = [](double x, double y) { return x * y; };
auto f3 = [](double x, double y) { return x + y*y; };
double x = 3, y = 4;
vector<func> funcs = {f1, f2, f3,};
for (auto& f : funcs) {
cout << f(x, y) << "\n";
}
}
Writing func_pointer.cpp
[76]:
%%bash
g++ -o func_pointer.exe func_pointer.cpp -std=c++14
[77]:
%%bash
./func_pointer.exe
Using generalized function pointers
7
12
19
Standard template library (STL)¶
The STL provides templated containers and generic algorithms acting on these containers with a consistent API.
[78]:
%%file stl.cpp
#include <iostream>
#include <vector>
#include <map>
#include <unordered_map>
using std::vector;
using std::map;
using std::unordered_map;
using std::string;
using std::cout;
using std::endl;
struct Point{
int x;
int y;
Point(int x_, int y_) :
x(x_), y(y_) {};
};
int main() {
vector<int> v1 = {1,2,3};
v1.push_back(4);
v1.push_back(5);
cout << "Vecotr<int>" << endl;
for (auto n: v1) {
cout << n << endl;
}
cout << endl;
vector<Point> v2;
v2.push_back(Point(1, 2));
v2.emplace_back(3,4);
cout << "Vector<Point>" << endl;
for (auto p: v2) {
cout << "(" << p.x << ", " << p.y << ")" << endl;
}
cout << endl;
map<string, int> v3 = {{"foo", 1}, {"bar", 2}};
v3["hello"] = 3;
v3.insert({"goodbye", 4});
// Note that a C++ map is ordered by key
// Note using (traditional) iterators instead of ranged for loop
cout << "Map<string, int>" << endl;
for (auto iter=v3.begin(); iter != v3.end(); iter++) {
cout << iter->first << ": " << iter->second << endl;
}
cout << endl;
unordered_map<string, int> v4 = {{"foo", 1}, {"bar", 2}};
v4["hello"] = 3;
v4.insert({"goodbye", 4});
// Note the unordered_map is similar to Python's dict.
// Note using ranged for loop with const ref to avoid copying or mutation
cout << "Unordered_map<string, int>" << endl;
for (const auto& i: v4) {
cout << i.first << ": " << i.second << endl;
}
cout << endl;
}
Writing stl.cpp
[79]:
%%bash
g++ -o stl.exe stl.cpp -std=c++14
[80]:
%%bash
./stl.exe
Vecotr<int>
1
2
3
4
5
Vector<Point>
(1, 2)
(3, 4)
Map<string, int>
bar: 2
foo: 1
goodbye: 4
hello: 3
Unordered_map<string, int>
goodbye: 4
hello: 3
bar: 2
foo: 1
STL algorithms¶
[81]:
%%file stl_algorithm.cpp
#include <vector>
#include <iostream>
#include <numeric>
using std::cout;
using std::endl;
using std::vector;
using std::begin;
using std::end;
int main() {
vector<int> v(10);
// iota is somewhat like range
std::iota(v.begin(), v.end(), 1);
for (auto i: v) {
cout << i << " ";
}
cout << endl;
// C++ version of reduce
cout << std::accumulate(begin(v), end(v), 0) << endl;
// Accumulate with lambda
cout << std::accumulate(begin(v), end(v), 1, [](int a, int b){return a * b; }) << endl;
}
Writing stl_algorithm.cpp
[82]:
%%bash
g++ -o stl_algorithm.exe stl_algorithm.cpp -std=c++14
[83]:
%%bash
./stl_algorithm.exe
1 2 3 4 5 6 7 8 9 10
55
3628800
Random numbers¶
[84]:
%%file random.cpp
#include <iostream>
#include <random>
#include <functional>
using std::cout;
using std::random_device;
using std::mt19937;
using std::default_random_engine;
using std::uniform_int_distribution;
using std::poisson_distribution;
using std::student_t_distribution;
using std::bind;
// start random number engine with fixed seed
// Note default_random_engine may give different values on different platforms
// default_random_engine re(1234);
// or
// Using a named engine will work the same on different platforms
// mt19937 re(1234);
// start random number generator with random seed
random_device rd;
mt19937 re(rd());
uniform_int_distribution<int> uniform(1,6); // lower and upper bounds
poisson_distribution<int> poisson(30); // rate
student_t_distribution<double> t(10); // degrees of freedom
int main()
{
cout << "\nGenerating random numbers\n";
auto runif = bind (uniform, re);
auto rpois = bind(poisson, re);
auto rt = bind(t, re);
for (int i=0; i<10; i++) {
cout << runif() << ", " << rpois() << ", " << rt() << "\n";
}
}
Writing random.cpp
[85]:
%%bash
g++ -o random.exe random.cpp -std=c++14
[86]:
%%bash
./random.exe
Generating random numbers
3, 31, -0.770332
3, 27, -0.242753
6, 42, 0.635808
1, 22, -1.28388
6, 27, 1.84322
4, 22, 0.90192
5, 27, 0.2381
5, 23, -0.755602
6, 33, 0.389624
6, 21, 1.29621
Numerics¶
Using Armadillo¶
[87]:
%%file test_arma.cpp
#include <iostream>
#include <armadillo>
using std::cout;
using std::endl;
int main()
{
using namespace arma;
vec u = linspace<vec>(0,1,5);
vec v = ones<vec>(5);
mat A = randu<mat>(4,5); // uniform random deviates
mat B = randn<mat>(4,5); // normal random deviates
cout << "\nVecotrs in Armadillo\n";
cout << u << endl;
cout << v << endl;
cout << u.t() * v << endl;
cout << "\nRandom matrices in Armadillo\n";
cout << A << endl;
cout << B << endl;
cout << A * B.t() << endl;
cout << A * v << endl;
cout << "\nQR in Armadillo\n";
mat Q, R;
qr(Q, R, A.t() * A);
cout << Q << endl;
cout << R << endl;
}
Writing test_arma.cpp
[88]:
%%bash
g++ -o test_arma.exe test_arma.cpp -std=c++14 -larmadillo
[89]:
%%bash
./test_arma.exe
Vecotrs in Armadillo
0
0.2500
0.5000
0.7500
1.0000
1.0000
1.0000
1.0000
1.0000
1.0000
2.5000
Random matrices in Armadillo
0.7868 0.0193 0.5206 0.1400 0.4998
0.2505 0.4049 0.3447 0.5439 0.4194
0.7107 0.2513 0.2742 0.5219 0.7443
0.9467 0.0227 0.5610 0.8571 0.2492
-0.7674 -0.0953 0.4285 0.3620 0.0415
-1.1120 0.0613 1.4152 -1.9853 0.1661
-0.7436 -1.6618 -0.0841 1.9332 -0.9602
-0.6272 -0.2744 0.6118 0.0490 -1.3933
-0.3111 -0.3320 -0.8700 -0.8698
0.1312 -0.7760 -0.2394 -0.6150
-0.2320 -1.2993 -0.6748 -1.3584
-0.1677 -1.9175 0.6289 -0.5619
1.9665
1.9633
2.5024
2.6367
QR in Armadillo
-0.6734 0.5087 0.2592 0.0105 -0.4696
-0.1024 -0.6686 -0.2105 0.1467 -0.6904
-0.3950 0.1098 -0.7627 0.4187 0.2738
-0.4618 -0.2996 -0.1503 -0.7862 0.2373
-0.4083 -0.4386 0.5332 0.4302 0.4142
-3.0934e+00 -6.5241e-01 -1.8687e+00 -2.3279e+00 -2.0254e+00
0e+00 -2.4110e-01 -4.0694e-02 -2.1686e-01 -2.5067e-01
0e+00 0e+00 -6.0467e-02 -1.0163e-01 9.8203e-02
0e+00 0e+00 0e+00 -2.1243e-01 1.2173e-01
0e+00 0e+00 0e+00 0e+00 3.6082e-16
Using Eigen¶
[1]:
%%file test_eigen.cpp
#include <iostream>
#include <fstream>
#include <random>
#include <Eigen/Dense>
#include <functional>
using std::cout;
using std::endl;
using std::ofstream;
using std::default_random_engine;
using std::normal_distribution;
using std::bind;
// start random number engine with fixed seed
default_random_engine re{12345};
normal_distribution<double> norm(5,2); // mean and standard deviation
auto rnorm = bind(norm, re);
int main()
{
using namespace Eigen;
VectorXd x1(6);
x1 << 1, 2, 3, 4, 5, 6;
VectorXd x2 = VectorXd::LinSpaced(6, 1, 2);
VectorXd x3 = VectorXd::Zero(6);
VectorXd x4 = VectorXd::Ones(6);
VectorXd x5 = VectorXd::Constant(6, 3);
VectorXd x6 = VectorXd::Random(6);
double data[] = {6,5,4,3,2,1};
Map<VectorXd> x7(data, 6);
VectorXd x8 = x6 + x7;
MatrixXd A1(3,3);
A1 << 1 ,2, 3,
4, 5, 6,
7, 8, 9;
MatrixXd A2 = MatrixXd::Constant(3, 4, 1);
MatrixXd A3 = MatrixXd::Identity(3, 3);
Map<MatrixXd> A4(data, 3, 2);
MatrixXd A5 = A4.transpose() * A4;
MatrixXd A6 = x7 * x7.transpose();
MatrixXd A7 = A4.array() * A4.array();
MatrixXd A8 = A7.array().log();
MatrixXd A9 = A8.unaryExpr([](double x) { return exp(x); });
MatrixXd A10 = MatrixXd::Zero(3,4).unaryExpr([](double x) { return rnorm(); });
VectorXd x9 = A1.colwise().norm();
VectorXd x10 = A1.rowwise().sum();
MatrixXd A11(x1.size(), 3);
A11 << x1, x2, x3;
MatrixXd A12(3, x1.size());
A12 << x1.transpose(),
x2.transpose(),
x3.transpose();
JacobiSVD<MatrixXd> svd(A10, ComputeThinU | ComputeThinV);
cout << "x1: comman initializer\n" << x1.transpose() << "\n\n";
cout << "x2: linspace\n" << x2.transpose() << "\n\n";
cout << "x3: zeors\n" << x3.transpose() << "\n\n";
cout << "x4: ones\n" << x4.transpose() << "\n\n";
cout << "x5: constant\n" << x5.transpose() << "\n\n";
cout << "x6: rand\n" << x6.transpose() << "\n\n";
cout << "x7: mapping\n" << x7.transpose() << "\n\n";
cout << "x8: element-wise addition\n" << x8.transpose() << "\n\n";
cout << "max of A1\n";
cout << A1.maxCoeff() << "\n\n";
cout << "x9: norm of columns of A1\n" << x9.transpose() << "\n\n";
cout << "x10: sum of rows of A1\n" << x10.transpose() << "\n\n";
cout << "head\n";
cout << x1.head(3).transpose() << "\n\n";
cout << "tail\n";
cout << x1.tail(3).transpose() << "\n\n";
cout << "slice\n";
cout << x1.segment(2, 3).transpose() << "\n\n";
cout << "Reverse\n";
cout << x1.reverse().transpose() << "\n\n";
cout << "Indexing vector\n";
cout << x1(0);
cout << "\n\n";
cout << "A1: comma initilizer\n";
cout << A1 << "\n\n";
cout << "A2: constant\n";
cout << A2 << "\n\n";
cout << "A3: eye\n";
cout << A3 << "\n\n";
cout << "A4: mapping\n";
cout << A4 << "\n\n";
cout << "A5: matrix multiplication\n";
cout << A5 << "\n\n";
cout << "A6: outer product\n";
cout << A6 << "\n\n";
cout << "A7: element-wise multiplication\n";
cout << A7 << "\n\n";
cout << "A8: ufunc log\n";
cout << A8 << "\n\n";
cout << "A9: custom ufucn\n";
cout << A9 << "\n\n";
cout << "A10: custom ufunc for normal deviates\n";
cout << A10 << "\n\n";
cout << "A11: np.c_\n";
cout << A11 << "\n\n";
cout << "A12: np.r_\n";
cout << A12 << "\n\n";
cout << "2x2 block startign at (0,1)\n";
cout << A1.block(0,1,2,2) << "\n\n";
cout << "top 2 rows of A1\n";
cout << A1.topRows(2) << "\n\n";
cout << "bottom 2 rows of A1";
cout << A1.bottomRows(2) << "\n\n";
cout << "leftmost 2 cols of A1";
cout << A1.leftCols(2) << "\n\n";
cout << "rightmost 2 cols of A1";
cout << A1.rightCols(2) << "\n\n";
cout << "Diagonal elements of A1\n";
cout << A1.diagonal() << "\n\n";
A1.diagonal() = A1.diagonal().array().square();
cout << "Transforming diagonal eelemtns of A1\n";
cout << A1 << "\n\n";
cout << "Indexing matrix\n";
cout << A1(0,0) << "\n\n";
cout << "singular values\n";
cout << svd.singularValues() << "\n\n";
cout << "U\n";
cout << svd.matrixU() << "\n\n";
cout << "V\n";
cout << svd.matrixV() << "\n\n";
}
Overwriting test_eigen.cpp
[9]:
import os
if not os.path.exists('./eigen'):
! git clone https://gitlab.com/libeigen/eigen.git
[10]:
%%bash
g++ -o test_eigen.exe test_eigen.cpp -std=c++11 -I./eigen
[11]:
%%bash
./test_eigen.exe
x1: comman initializer
1 2 3 4 5 6
x2: linspace
1 1.2 1.4 1.6 1.8 2
x3: zeros
0 0 0 0 0 0
x4: ones
1 1 1 1 1 1
x5: constant
3 3 3 3 3 3
x6: rand
0.680375 -0.211234 0.566198 0.59688 0.823295 -0.604897
x7: mapping
6 5 4 3 2 1
x8: element-wise addition
6.68038 4.78877 4.5662 3.59688 2.82329 0.395103
max of A1
9
x9: norm of columns of A1
8.12404 9.64365 11.225
x10: sum of rows of A1
6 15 24
head
1 2 3
tail
4 5 6
slice
3 4 5
Reverse
6 5 4 3 2 1
Indexing vector
1
A1: comma initializer
1 2 3
4 5 6
7 8 9
A2: constant
1 1 1 1
1 1 1 1
1 1 1 1
A3: eye
1 0 0
0 1 0
0 0 1
A4: mapping
6 3
5 2
4 1
A5: matrix multiplication
77 32
32 14
A6: outer product
36 30 24 18 12 6
30 25 20 15 10 5
24 20 16 12 8 4
18 15 12 9 6 3
12 10 8 6 4 2
6 5 4 3 2 1
A7: element-wise multiplication
36 9
25 4
16 1
A8: ufunc log
3.58352 2.19722
3.21888 1.38629
2.77259 0
A9: custom ufunc
36 9
25 4
16 1
A10: custom ufunc for normal deviates
5.22353 6.16474 3.67474 6.18264
3.81868 4.07998 3.52576 3.82792
3.74872 5.76697 5.3517 2.85655
A11: np.c_
1 1 0
2 1.2 0
3 1.4 0
4 1.6 0
5 1.8 0
6 2 0
A12: np.r_
1 2 3 4 5 6
1 1.2 1.4 1.6 1.8 2
0 0 0 0 0 0
2x2 block starting at (0,1)
2 3
5 6
top 2 rows of A1
1 2 3
4 5 6
bottom 2 rows of A1
4 5 6
7 8 9
leftmost 2 cols of A1
1 2
4 5
7 8
rightmost 2 cols of A1
2 3
5 6
8 9
Diagonal elements of A1
1
5
9
Transforming diagonal elements of A1
1 2 3
4 25 6
7 8 81
Indexing matrix
1
singular values
15.8886
2.58818
0.54229
U
-0.67345 0.607383 0.421368
-0.479529 0.0748696 -0.874326
-0.562598 -0.790873 0.240837
V
-0.46939 0.190799 -0.433197
-0.588634 -0.197482 0.773183
-0.451663 -0.670964 -0.452453
-0.478731 0.68877 -0.099069
Check SVD¶
[12]:
import numpy as np
A10 = np.array([
[5.17237, 3.73572, 6.29422, 6.55268],
[5.33713, 3.88883, 1.93637, 4.39812],
[8.22086, 6.94502, 6.36617, 6.5961]
])
U, s, Vt = np.linalg.svd(A10, full_matrices=False)
[13]:
s
[13]:
array([19.50007376, 2.80674189, 1.29869186])
[14]:
U
[14]:
array([[-0.55849978, -0.75124103, -0.3517313 ],
[-0.40681745, 0.61759344, -0.67311062],
[-0.72289526, 0.2328417 , 0.65054376]])
[15]:
Vt.T
[15]:
array([[-0.56424535, 0.47194895, -0.04907563],
[-0.44558625, 0.43195279, 0.45157518],
[-0.45667231, -0.73048295, 0.48064272],
[-0.52395657, -0.23890509, -0.75010267]])
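The NumPy cell above uses an A10 whose entries differ from the matrix printed by test_eigen.exe (presumably they come from a different run of the random draws), so its singular values cannot be compared directly with the Eigen output. As an alternative check, the minimal sketch below multiplies the printed factors back together using only the standard library and measures how far U diag(s) V^T is from the printed A10. The helper max_reconstruction_error and the Matrix alias are illustrative names, not part of Eigen; all numbers are copied from the output above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Rebuild U * diag(s) * V^T from a thin SVD and return the largest
// absolute difference from the matrix a.
// Shapes: a is m x n, u is m x k, s has k entries, v is n x k.
// (Hypothetical helper for illustration; not an Eigen function.)
double max_reconstruction_error(const Matrix& a, const Matrix& u,
                                const std::vector<double>& s, const Matrix& v) {
    assert(a.size() == u.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        assert(a[i].size() == v.size());
        for (std::size_t j = 0; j < a[i].size(); ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < s.size(); ++k) {
                sum += u[i][k] * s[k] * v[j][k];
            }
            worst = std::max(worst, std::fabs(sum - a[i][j]));
        }
    }
    return worst;
}

// Values copied from the test_eigen.exe output above
const Matrix A10 = {{5.22353, 6.16474, 3.67474, 6.18264},
                    {3.81868, 4.07998, 3.52576, 3.82792},
                    {3.74872, 5.76697, 5.3517, 2.85655}};
const Matrix U = {{-0.67345, 0.607383, 0.421368},
                  {-0.479529, 0.0748696, -0.874326},
                  {-0.562598, -0.790873, 0.240837}};
const std::vector<double> S = {15.8886, 2.58818, 0.54229};
const Matrix V = {{-0.46939, 0.190799, -0.433197},
                  {-0.588634, -0.197482, 0.773183},
                  {-0.451663, -0.670964, -0.452453},
                  {-0.478731, 0.68877, -0.099069}};
```

Calling max_reconstruction_error(A10, U, S, V) from main should give an error no larger than the rounding of the printed digits, which suggests the printed factors do reproduce the printed A10.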
Probability distributions and statistics¶
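The stats library provides a nicer interface for working with probability distributions. It depends on the gcem library, so both are cloned below. The example shows integration with Armadillo; integration with Eigen is also possible.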
[16]:
import os
if not os.path.exists('./stats'):
    ! git clone https://github.com/kthohr/stats.git
if not os.path.exists('./gcem'):
    ! git clone https://github.com/kthohr/gcem.git
Cloning into 'stats'...
remote: Enumerating objects: 6248, done.
remote: Total 6248 (delta 0), reused 0 (delta 0), pack-reused 6248
Receiving objects: 100% (6248/6248), 1.29 MiB | 0 bytes/s, done.
Resolving deltas: 100% (5468/5468), done.
Checking connectivity... done.
Cloning into 'gcem'...
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 2153 (delta 0), reused 1 (delta 0), pack-reused 2148
Receiving objects: 100% (2153/2153), 435.40 KiB | 0 bytes/s, done.
Resolving deltas: 100% (1668/1668), done.
Checking connectivity... done.
[17]:
%%file stats.cpp
#define STATS_ENABLE_STDVEC_WRAPPERS
#define STATS_ENABLE_ARMA_WRAPPERS
// #define STATS_ENABLE_EIGEN_WRAPPERS
#include <iostream>
#include <vector>
#include "stats.hpp"
using std::cout;
using std::endl;
using std::vector;
// random engine seeded with 1776 (declared here but not passed to any of the calls below)
std::mt19937_64 engine(1776);
int main() {
// evaluate the normal PDF at x = 1, mu = 0, sigma = 1
double dval_1 = stats::dnorm(1.0,0.0,1.0);
// evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value
double dval_2 = stats::dnorm(1.0,0.0,1.0,true);
// evaluate the normal CDF at x = 1, mu = 0, sigma = 1
double pval = stats::pnorm(1.0,0.0,1.0);
// evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1
double qval = stats::qlaplace(0.1,0.0,1.0);
// draw from a normal distribution with mean 100 and sd 15
double rval = stats::rnorm(100, 15);
// Use with std::vector: 1 x 10 Poisson draws with rate 3
vector<int> pois_rvs = stats::rpois<vector<int> >(1, 10, 3);
cout << "Poisson draws with rate=3 into std::vector" << endl;
for (auto &x : pois_rvs) {
cout << x << ", ";
}
cout << endl;
// Example of Armadillo usage: only one matrix library can be used at a time
arma::mat beta_rvs = stats::rbeta<arma::mat>(5,5,3.0,2.0);
// matrix input
arma::mat beta_cdf_vals = stats::pbeta(beta_rvs,3.0,2.0);
/* Example of Eigen usage: only one matrix library can be used at a time
Eigen::MatrixXd gamma_rvs = stats::rgamma<Eigen::MatrixXd>(10, 5,3.0,2.0);
*/
cout << "evaluate the normal PDF at x = 1, mu = 0, sigma = 1" << endl;
cout << dval_1 << endl;
cout << "evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value" << endl;
cout << dval_2 << endl;
cout << "evaluate the normal CDF at x = 1, mu = 0, sigma = 1" << endl;
cout << pval << endl;
cout << "evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1" << endl;
cout << qval << endl;
cout << "draw from a normal distribution with mean 100 and sd 15" << endl;
cout << rval << endl;
cout << "draws from a beta distribution to populate Armadillo matrix" << endl;
cout << beta_rvs << endl;
cout << "evaluate CDF for beta draws from Armadillo inputs" << endl;
cout << beta_cdf_vals << endl;
/* If using Eigen
cout << "draws from a Gamma distribution to populate Eigen matrix" << endl;
cout << gamma_rvs << endl;
*/
}
Writing stats.cpp
[18]:
%%bash
g++ -std=c++11 -I./stats/include -I./gcem/include -I./eigen stats.cpp -o stats.exe
[19]:
%%bash
./stats.exe
Poisson draws with rate=3 into std::vector
1, 3, 5, 6, 1, 3, 1, 1, 2, 3,
evaluate the normal PDF at x = 1, mu = 0, sigma = 1
0.241971
evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value
-1.41894
evaluate the normal CDF at x = 1, mu = 0, sigma = 1
0.841345
evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1
-1.60944
draw from a normal distribution with mean 100 and sd 15
120.059
draws from a beta distribution to populate Armadillo matrix
0.6504 0.3613 0.5206 0.5956 0.6657
0.4931 0.0844 0.5793 0.3354 0.1347
0.8770 0.2274 0.7872 0.6645 0.5689
0.6129 0.9087 0.8134 0.6172 0.6005
0.6285 0.8030 0.6545 0.5177 0.3090
evaluate CDF for beta draws from Armadillo inputs
0.5637 0.1375 0.3440 0.4677 0.5909
0.3022 0.0023 0.4398 0.1130 0.0088
0.9234 0.0390 0.7993 0.5887 0.4222
0.4975 0.9558 0.8394 0.5051 0.4760
0.5249 0.8237 0.5710 0.3395 0.0906
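The deterministic values printed above (the normal PDF, log PDF and CDF at x = 1, and the Laplace quantile at p = 0.1) can be cross-checked against the textbook closed forms using only <cmath>. The sketch below is such a check, not the stats library's implementation; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Normal density at x for mean mu and standard deviation sigma
double normal_pdf(double x, double mu, double sigma) {
    assert(sigma > 0.0);
    const double two_pi = 2.0 * std::acos(-1.0);
    const double z = (x - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(two_pi));
}

// Natural log of the normal density, computed without calling exp
double normal_log_pdf(double x, double mu, double sigma) {
    assert(sigma > 0.0);
    const double two_pi = 2.0 * std::acos(-1.0);
    const double z = (x - mu) / sigma;
    return -0.5 * z * z - std::log(sigma) - 0.5 * std::log(two_pi);
}

// Normal CDF expressed through the complementary error function
double normal_cdf(double x, double mu, double sigma) {
    assert(sigma > 0.0);
    return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}

// Laplace quantile at probability p for location mu and scale sigma
double laplace_quantile(double p, double mu, double sigma) {
    assert(p > 0.0 && p < 1.0 && sigma > 0.0);
    return p < 0.5 ? mu + sigma * std::log(2.0 * p)
                   : mu - sigma * std::log(2.0 * (1.0 - p));
}
```

With x = 1, mu = 0, sigma = 1 these agree with the printed 0.241971, -1.41894 and 0.841345 to the digits shown, and laplace_quantile(0.1, 0.0, 1.0) equals ln(0.2), matching the printed -1.60944.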
Solution to exercise
[20]:
%%file greet.cpp
#include <iostream>
#include <string>
using std::string;
using std::cout;
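int main(int argc, char* argv[]) {
    // expect two arguments: a name and a repeat count
    if (argc < 3) {
        std::cerr << "Usage: " << argv[0] << " name count\n";
        return 1;
    }
    string name = argv[1];
    int n = std::stoi(argv[2]);
    for (int i=0; i<n; i++) {
        cout << "Hello " << name << "!" << "\n";
    }
}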
Writing greet.cpp
[21]:
%%bash
g++ -std=c++11 greet.cpp -o greet
[22]:
%%bash
./greet Santa 3
Hello Santa!
Hello Santa!
Hello Santa!
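std::stoi throws an exception when the count argument is not a number, and it silently accepts trailing characters such as "3x". One possible way to make the solution more forgiving is to wrap the conversion in a small helper. The sketch below assumes that a valid count is a non-negative integer with nothing after it; parse_count is a hypothetical name and is not part of the original solution.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Parse a non-negative repeat count from text.
// Returns true and stores the value in out only when the whole string
// is a valid int >= 0; otherwise returns false and leaves out untouched.
bool parse_count(const std::string& text, int& out) {
    try {
        std::size_t pos = 0;
        const int value = std::stoi(text, &pos);
        if (pos != text.size() || value < 0) {
            return false;  // trailing characters or a negative count
        }
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // no digits at all, e.g. "abc"
    } catch (const std::out_of_range&) {
        return false;  // value does not fit in an int
    }
}
```

In greet.cpp this could replace the bare std::stoi call, with main printing a usage message and returning a nonzero status when parse_count returns false.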