NPRG041 C++ (cs/en) 2020/21 @ e-x-a.org

Infopage about the C++ lab of NPRG041.

Taught by: Miroslav Kratochvil
contact: kratochvil@ksi.mff.cuni.cz (please include the course name/code in the subject, e.g. [C++])
Rule of thumb: When in any doubt, mail me (or send me a message on Slack)

Groups:

Czech group: every Tuesday at 12:20, Zoom meeting ID 999 0947 4901
English group: every Monday at 17:20, Zoom meeting ID 982 1865 2618

Zoom meeting passcodes are the same as for the lecture.

Everyone is invited to join the dedicated Slack workspace, which shall support all basic communication needs during the semester, and is the preferred way for communication. Invitation link for Slack is available in SIS noticeboard module (mail me if you can’t find it, or don’t have SIS access). If you cannot use Slack for whatever reason, please make sure to check this website for weekly updates.

Reading material:

Very probably any good book about C++ will do. There are some in the faculty library. Because MFF students are generally more apprehending than the common programming crowd, I highly suggest you go directly to the Standard . (click on the 2014 Programming Language C++ link in Published standards)
I maintain a local copy of the C++ standard PDF
Cplusplus.com — especially following sections:
- Containers (list, vector, set, map, string, ...)
- Commandline I/O in iostream
- File I/O and in-memory I/O
- complex
- ...

Lab requirements

To get the credit, you have to attend the course reasonably (at least accordingly to your knowledge of the topic), do the homework (see below) and finish an individual project (also below).

You will be assigned several points for finishing the homework. These will be added to the points from the final test, therefore improving your final grade for the whole course. Details are available in the main course slides.

Depending on many factors, students from Erasmus programs may need a completely different set of rules to fit into different time limits. If you are an ERASMUS student, contact us!

Homework

There are two homework assignments. Please refer to ReCodex. Make sure you are registered in the correct student groups.

Assignment 1 is online since October 19th, the deadlines are on November 11th (czech group) and November 17th (english group).

Project

Each student has to create an individual project and submit it to get the credit. Topics of the projects may vary wildly, but you should discuss your topic before the end of November so that it is agreed upon. Size of the project is not an issue and largely depends on the topic, around 500 lines of (neat) C++ code is a good guideline (on the other hand, packing the same functionality into a smaller program using e.g. advanced language features is even better).

Bad topics:

Boring libraries
Boring games
Anything that does not behave C++-ish (e.g. code without a single class construction)
Anything that requires complicated or very specific user interface (e.g. bank account simulator, unless the interface is solved neatly in some novel C++ way)
Database workalikes, e.g. “evidence of hotel guests”, “evidence of soccer results”, etc.; unless the underlying database storage is somehow interesting.
Anything that has 100000 implementations already hanging around the internet
Also, there’s already too many of checkers, worms, tetris, pacman, snake, tictactoe, etc.

Good topics:

Small fast games (fun!)
Convenience programming libraries (convenient!)
Efficient data structures with demos (useful!)
Physics-like simulations of something (collisions, gravity, particles, etc. look cool)
Networking (internet!)
Virtual machines or interpreters for small programming languages.

Deadlines:

Topic agreed upon, written down in SIS: November 30th, 2020. Send me an e-mail to make sure the topic is confirmed and added to SIS.
Recommended time for submitting final version: March 2021.
Final version incl. documentation or example demonstration: April 10th, 2021. If you miss this deadline, we will assume that you are not aiming to obtain the credit. Projects that are first submitted after this deadline will not be evaluated at all. Corrections of previously submitted projects will be still possible.

Submission:

You must develop and submit the project using git in the corresponding MFF Gitlab group. The submission process is simplified — you just notify me that the project is ready there. Registration to Gitlab is open for all students with valid CAS UK credentials. After you register, let me know to assign you to the group.

Submission guidelines:

Make the code portable — it should not depend on the platform unless it is, by design, tied to that platform. UNIX projects should work on Linuxes in the computer lab. Please void Windows-specific projects
Avoid code bloat, library bundling and excessive media. Maximum size of your final program is limited to 1MB.
Do not over-engineer, avoid feature creep. The simplest project that satisfies the following conditions will do:
- There’s a reasonable amount of C++ that shows you know what you’re doing
- It does not crash, in particular it does not dereference invalid pointers, cause leaks, or torture the memory in any other way.
- It does not contain any inefficiencies that could be fixed by better C++. (Repeat: const references! avoid manual allocation!)
- It provably does what the topic says, and it can be demonstrated on an example use-case of reasonable complexity.
- The code has sufficient comments. If you are unsure if you should add comment somewhere, try to tear the surrounding program block or function out of context, and ask yourself if anyone can still fully understand what it does (and why). If not, it needs comments. Note that comments such as int getVal() {return value;} //returns the value are completely pointless and should be avoided.
  - Note that primitive accessor functions that give access to unnecessarily-private member variables, such as the one shown above, are artifacts of misinterpretation of OOP principles that vastly reduce the extensibility of the code (although common even in advanced Java and C# courses!) and should be avoided! In C++, avoid all such waste code that does nothing.
- If the topic is a data structure, include a comparison with a naive approach or some C++ counterpart (std::chrono provides timers). Note that your data structure does not need to “win” the comparison (that’s the topic of the HPC course). You should only provide a reasonable testing framework to assess the performance.
Having received your program, I should be able to convince myself that it works in less than around 15 minutes. You can help it a lot:
- Include a file INSTALL (.md) that describes how to make it work on a fitting computer configuration. (Better: How to make it work on computers in MFF’s computer lab. Best: Add a Makefile that works in the lab.)
- Include a file DEMO or TUTORIAL that describes what should I do with the compiled program to see all the important functionality as quickly as possible. (Better: add a shell script that does it. Best: Support make demo)
- If the DEMO requires some data for the demonstration, include it! (Advice: If I’m forced to create testing data by hand, it will take more than 15 minutes. Also, result will contain lots of hideous corner cases.)
- If the source code is big, include INTERNALS file that describes which parts of the functionality can be found in which part/file of the code. This is sometimes also called “programmer documentation” or “hacking guide”. Imagine it as a signpost, compare with artist’s impression thereof.
- If all documentation parts are favorably small, pack them together into one README.

Useful libraries for the projects:

For games and simulations with graphics, try OpenGL. There are lots of ways to get an OpenGL viewport for your program, the easiest of them is probably the GLUT library. You might also want to see SDL that is commonly used for portable games, or the newer alternative SFML. The following sites provide a good introduction to modern OpenGL: open.gl and learnopengl.com.
For a GUI in games, use ImGUI
For user interaction in console, use readline and possibly ncurses (tutorial here)
For parsing commandline arguments, use getopt
Using the standard Berkeley sockets for network communication is certainly adventurous (if writing a server application, remember to poll correctly ). Alternatively, use some reasonable wrapper for basic communication, like 0MQ.
If you need to parse/produce JSON and serialize any data, use nlohmann’s json
If you need to save data reliably to some kind of a database, use sqlite3

Lab timeline

Source code from the labs will be available here.

CZ 1 (Sep 29th), EN 1 (Oct 5th)

Slight introduction into C, basic data types and a bit of pointers. What does the multi-module compilation process look like (C really just spews out assembly, multiple assembly files are joined together in a linking process to create the final executable file).

CZ 2 (Oct 6th), EN 2 (Oct 12th)

Manual memory management and pointers. (Linked lists all over again.)

How to point to a piece of actual code (aka. function pointers). Note: We did not make it thus far in English class, but you can read about function pointers e.g. here , or see the usecase in the source code from the czech lab. Anyway, this is probably the last time we used raw function pointers, so you don not need to worry about them at all.

CZ 3 (Oct 13th), EN 3 (Oct 18th)

C++ built atop C: overloading, references, objects, custom operators. A slight hint of why format strings are not needed for C++ input/output (they are “derived” from overloading).

CZ 4 (Oct 19th), EN 4 (Oct 26th)

Wrapping fragile resources with a programmer-friendly class interface. (Constructors, destructors, RAII, rule of three&five).

SIDE NOTE: The zoo of manual memory allocation, deallocation, object construction, and destruction:

malloc(n) just allocates n bytes of memory
free(p) frees an allocated memory block that starts at address p, marking the memory as usable for “others”, i.e. reusable by other malloc’s, or eventually returning it to the operating system.
operator new(n) does the same thing as malloc(n) (except it may throw std::bad_alloc instead of returning NULL if the allocation fails)
operator delete(p) behaves mostly as free(p)

The things above do NOT initialize the memory anyhow. If you want actual usable objects to be “created” in the memory, you need to somehow indicate that the constructors and destructors need to get called:

new T allocates sizeof(T) bytes of memory, runs a constructor T() on that memory, and returns the pointer to the memory with the newly initialized object of T. (Similarly for new T(params).)
delete p runs a destructor on the memory pointed by p, the type of destructor is chosen based on the type of p, ie. if p is of type T*, the destructor ~T() is called. After destruction, the memory is deallocated (as with operator delete(p).
new T[n] allocates memory for n objects of type T and runs n constructors, ie. creates n initialized objects in an array.
delete p[] runs multiple (!!!) destructors on a memory block allocated by new T[n]

If you have acquired the memory in your own ways (e.g. if your operating system and libc do not have allocators) and just want to run the constructors and destructors on that memory in order to initialize/deinitialize the objects there so that they can be used safely, you can use placement new and manual destructor call:

new (p) T(params) does not allocate anything, it just runs the constructor T(params) at memory address p, thus effectively running the T(params) with this==p. Technically, the memory at p gets thus initialized, and *(T*)p can be safely used as object of the class T. Like, ((T*)p)->do_something().
p->~T() calls a destructor ~T() on memory p. The memory is not deallocated, only the object is deinitialized. (i.e., with reasonable objects, the resources held by the object stored at memory p are released.)

One such case when you want to use placement new and manual destructor call is if you want to allocate a big array but only initialize (“construct”) some of its elements. That may be the case with homework assignment 1.

Also see here: cplusplus.com/operator new

CZ 5 (Oct 26th), EN 6 (Nov 3rd, moved to Tuesday)

Using STL and containers (strings, sets, maps). We started a mergesort on lists, going to make it a bit more efficient next time.

CZ 6 (Nov 3rd), EN 6 (Nov 9th)

A bit more about (some) STL containers, brief overview of move semantics and general “copy avoidance” hints. We read and wrote from/to files.

CZ 7 (Nov 10th), EN 7 (Nov 16th)

OOP-style programming: Classes, inheritance hierarchies, virtual methods. Smart pointers help a lot —- run-time polymorphic data need to be “held” using a reference (ie. a pointer); smart pointers allow you to automatically destroy and deallocate these objects without much extra effort, thus preventing ugly memory issues. You should be able to deduce the reason why unique_ptr is move-only.

EN 8 (Nov 23rd), CZ 8 (Nov 24th)

Deriving from standard classes, overloading a whole spectrum of functions at once using variadic templates. Tiny useful helper proxy classes.

EN/CZ 9 (Nov 30th, 31st)

Postfix calculator, tricks with macros

EN/CZ 10 (Dec 7th, 8th)

Searching through a state space (solving the reduced 15 puzzle). BFS and path backtracking tricks.

EN/CZ 11 (Dec 14th, 15th)

Graph reduction by DFS (SMILES-like graph notation generator).

EN/CZ 12 (Dec 21st, 22nd)

Template metaprogramming (static factorial, static debug outputs). Dynamic programming (Levenshtein distance).

EN/CZ 13 (Jan 4th, 5th 2021)

Postfix calculator with polynomials, some operations on polynomials.

Bonus material

If you miss the “Gang of Four” and related design patterns in C++, you are correct. These were designed for large monolithic software pieces developed by unicultures (usually corporate, where the object design is more understandable for the non-programming folks and management). Within a systems-programming and HPC environment, you often want to fit into conditions that completely deny such style of software design and development process. The reasons include the performance (OOP is slow), efficiency (the designs kinda ignore the fact that your data often don’t fit in the main memory), data representation requirements (how do you make your object ontology dynamic and serializable?), program expressivity and polymorphism (the sole representation of objects makes their mutual composition extremely unwieldy), or plain language agility (not all software is best written in your favorite language, right?). I was recently pointed to the Game Programming Patterns book (it is available online in HTML form; thanks JB for the link), which is well suited even for non-game kinds of high-performance and efficient software, and generally captures the C++ approach to the problem very well.
The Art Of UNIX Programming (online version) by E.S.Raymond is a brilliant material not only about the origins of current computing practice and the C programming language (that emerged together with UNIX and gave rise to C++), but also contains almost all the most important insights about software development gained in last 50 years. Surprisingly, all of them still apply. Sometimes a bit philosophical. If you are developing a large system (or a part of one) for the first time, this book will tell you what design mistakes you are almost certainly going to make, what are they called, and how to avoid them. Much better than all other “software design principles” books combined.
Software speed rule: more efficient programs are faster for free. Often because some accepted truths from theoretical computer science simply do not hold on real hardware. The most striking (and interesting) fact is sometimes described as The myth of RAM.
Doug Lea’s Malloc — a webpage about the original default allocator of most C/C++ libraries, which makes an extremely good read about allocators. (anyway — you probably use that guy’s code several times per second just by silently watching this website)
If you like template programming but the syntax seems unwieldy, use this

...