Hexbyte  News  Computers The One On Dynamic Programming! | Blogarithms

Hexbyte News Computers The One On Dynamic Programming! | Blogarithms

Hexbyte News Computers

Dynamic programming (DP, as I’ll refer to it here on) is a toughie.

I’ve heard a lot of friends and juniors complain about dynamic programming and about how non-intuitive it is.

Plus, problems on DP are pretty standard in most product-company-based hiring challenges, so it seems like a good topic to address on a blog based on algorithms.

Here’s a heads-up to you though, since this article is the first in (maybe, depending on its reception), a multi-part series on DP, this article will focus on setting the groundwork of DP and I’ll dedicate most of this article to explaining how to arrive at the solutions to classical DP problems, and the general pattern of thought that I use when it comes to DP.

This article, I’ll be tackling the following –

  • Coin Change Problem
  • 0-1 Knapsack Problem
  • DP on grids

And I’ll be leaving some more classical variants of DP as exercises.

If you’re already fully confident of why/how the above DP algorithms work correctly, then reading this article may yield little value, and you may be better off waiting for the next article on DP. 🙂

Should I add future articles, I’ll tackle some non-classical variants of DP, leveraging intuition and logic.

Onto the article!

DP: An Intro

I’ve found myself solving the nicest of DP questions when I leverage recursion.

This is something I’ve often noticed in general, that for me, recursion is very intuitive.

While DP at times does seem confusing, I’ve found from personal experience that thinking of solutions and problems recursively makes reaching a DP solution that much easier.

Let’s take an example –

The Coin Change Problem

Let’s say you have an infinite amount of K different denominations of coins.

You need to find the minimum number of coins to possess a value of N dollars.

For example, if K = 2 and your denominations are {1, 2}, and you want to find the min. number of coins needed to generate N=9 dollars.

Here, it’s pretty obvious that’d be 5 coins (4 two dollar coins + 1 one dollar coin).

Note: A greedy algorithm of selecting the largest available denomination under N at every instant won’t give optimal answers.

Try out the test case: K=3, denominations = {1, 3, 4} and N = 10.

The greedy answer yields 4 coins (4 + 4 + 1 + 1) while the optimal answer is 3 coins (4 + 3 + 3).

Thus, clearly we need to check all possible coin selections we can make.

Now let’s model the problem recursively.

Let’s define a mathematical function f where f(N) tells me the minimum number of coins I need to generate a value of N.

Let’s say I have K denominations.

Now, if I were to take away a coin from my value of N, this coin could be of any denomination, correct?

Basically, N could be re-written as ((N-denom[i]) + denom[i]) for any valid 0<=i.

I also know, that denom[i] can be represented as 1 coin (a coin whose value equals denom[i]).

Thus, f(N) = f(N-denom[i]) + 1

This is our general case.

What’s our base case?

If N=0, we need 0 coins.

Similarly, if N<0, its an impossible case, which needs to be handled.

Compiling all this, we have:

>>> def f(n, denom):
	ans = 10**10
	if n<=0:
		return 0
	for i in denom:
		if n-i>=0:
			ans = min(ans, 1+f(n-i, denom))
	return ans

>>> f(10, [1,3,4])


Now there are a couple of things we need to ensure we’re handling, if we want to port this to DP.

  • No global variables should be modifed in the function
  • Simple, memorizable states (state-space reduction)

And… I’d say that’s it?

Now, we can make denom a global array as we’re not changing it.

This makes our function just have one parameter. The current value, N.

Thus, we have an O(N*K) DP solution to this, as follows -

>>> dp = [-1 for i in range(n)]
>>> def f(n):
	if dp[n]!=-1:
		return dp[n]
	ans = 10**10
	if n<=0:
		return 0
	for i in denom:
		if n-i>=0:
			ans = min(ans, 1+f(n-i))
	dp[n] = ans
	return ans

And that’s it!

The 0-1 Knapsack Problem

Okay, so let’s say you’re a burglar who has a bag (knapsack) that can carry a total weight of W.

There are N valuable objects in the house you broke into.

Each object i, has a weight of wt[i] and a value of vals[i].

Your task as a burglar is to devise an efficient algorithm that can maximize the net value of objects you can steal from the house provided you place them all in your knapsack.

I hope you don’t have to apply this in seriousness. 😛

So, let’s think recursively here.

In general, whenever problems involve multiple solutions where there’s an “option” to choose or ignore an element, I find it suitable to use the current index of the element as a parameter.

So this means that my f() method already has one parameter - int idx

Now let’s think about what our other parameter could be by simulating our burglary.

At any point, when we’re looking at the object at index idx, we have a choice - to take it, or to not take it.

The point of DP is to optimize our solution of “checking all possibilities”, without repeating checking already checked possibilities, right?

So, we have a choice – take the current object, or ignore it.

But let’s say we do take it – we still need to verify if our knapsack can hold it, right?

Lets look at this recursively.

To make sure our current possibility is valid, and to provide insight into our decision as to whether or not can we afford to take the next object, we NEED to know our current knapsack capacity that’s left.

So, our states are (current_index, weight_left).

Let’s see how we can model this as a recursive equation now -

f(current_index, weight_left) = max( f(current_index + 1, weight_left), vals[current_index] + f(current_index+1, weight_left - wt[current_index]) )

Look closely to the right-hand-side of that equation.

We’re stating:

f(idx, weight) = max( possibility where we don’t take current element, possibility where we do take current element )

It’s important to note that we’ve defined f(idx, weight) to be the maximum value we can obtain assuming we’re at index idx with weight weight left over.

Also, pay attention to the states in the latter possibility, we even update the weight_lef

Read More

Hexbyte  Hacker News  Computers Full-system dynamic tracing on Linux using eBPF and bpftrace

Hexbyte Hacker News Computers Full-system dynamic tracing on Linux using eBPF and bpftrace

Hexbyte Hacker News Computers

Linux has two well-known tracing tools:

  • strace allows you to see what system calls are being made.
  • ltrace allows you to see what dynamic library calls are being made.

Though useful, these tools are limited. What if you want to trace what happens inside a system call or library call? What if you want to do more than just logging calls, e.g. you want to compile statistics on certain behavior? What if you want to trace multiple processes and correlate data from multiple sources?

In 2019, there’s finally a decent answer to that on Linux: bpftrace, based on eBPF technology. Bpftrace allows you to write small programs that execute whenever an event occurs.

This article shows you how to setup bpftrace and teaches you its basic usage. I’ll also give an overview of how the tracing ecosystem looks like (e.g. “what’s eBPF?”) and how it came to be what it is today.

Table of contents

Hexbyte  Hacker News  Computers
How tracing works

What is tracing?

As stated before, bpftrace allows you to write small programs that execute whenever an event occurs.

  • What’s an event? It could be a system call, a function call, or even something that happens inside such a call. It could also be a timer or hardware event, e.g. “50ms has elapsed since last time”, “a page fault occurred”, “a context switch occurred” or “a CPU cache miss occurred”.
  • What can you do in response to an event? You can log something, compile statistics and execute arbitrary shell commands. You will have access to a variety of context information such as the current PID, stack trace, time, call arguments, return value, etc.
  • What are the use cases? Lots. You can figure out why an app is slow by compiling a list of slowest calls. You can figure out whether (and why) there is a memory leak. I use it to figure out why Ruby is using so much memory.

The great thing about bpftrace is that you don’t need to recompile apps. No need to manually add print calls or other debugging code in the target app’s source code. There isn’t even a need to restart apps. All of this, with low overhead to boot. This makes bpftrace especially useful for debugging production systems, or in other scenarios where compiling is a problem.

DTrace: the father of tracing

For a long time, the best answer to tracing is DTrace, a full-system dynamic tracing framework originally developed by Sun Microsystems (the inventors of Java). Just like bpftrace, DTrace allows one to write small programs that execute in response to events. In fact, bpftrace and many key pieces of the ecosystem are heavily contributed to by Brendan Gregg, a prominent expert on DTrace who now works at Netflix. Which explains the similarities between DTrace and bpftrace.

Hexbyte  Hacker News  Computers DTrace presentation
Solaris DTrace introduction (2009) by S. Tripathi, Sun Microsystems

At some point, Sun open sourced DTrace. Today, DTrace is available in Solaris, FreeBSD, and macOS (though the macOS version is as good as defunct because System Integrity Protection broke many aspects of DTrace).

You saw that right… Linux is absent from that list. It’s not an engineering issue but a licensing issue. DTrace was licensed under the CDDL instead of GPL. A port of DTrace to Linux has been available since 2011, but it has never been adopted by the upstream Linux developers. Early 2018, Oracle relicensed DTrace under the GPL, but by then it was already far too late.

The Linux tracing ecosystem

There’s no doubt that tracing is very useful, and so the Linux community sought to develop their own solutions. But unlike Solaris, Linux is not governed by a single vendor, and so there was no concentrated effort to develop a full-fledged DTrace alternative. Instead, the Linux tracing ecosystem grew slowly and organically, solving problems piecemeal. Only recently has the ecosystem grown to be a serious competitor to DTrace.

Because of the organic growth, the ecosystem can feel a bit chaotic, consisting of many different components. Luckily, Julia Evans wrote an overview of the ecosystem (note: 2017, bpftrace did not exist).

Hexbyte  Hacker News  Computers Linux tracing ecosystem
The Linux tracing ecosystem as described by Julia Evans

Not everything is equally important. Let me summarize what I think are the most important pieces.

Event sources

Event data can come from the kernel or from userspace (apps and libraries). Some of them are automatically available without further upstream developer effort, others require manual annotations.

Hexbyte  Hacker News  Computers Overview of the most important Linux tracing event sources
Overview of the most important Linux tracing event sources
  • On the kernel side, kprobes is the mechanism that allows tracing any function call inside the kernel. This way you can trace system calls, but also what happens inside system calls (because system call entrypoints call other internal functions). You can also use kprobes to trace non-syscall kernel events, e.g. “buffered data is being flushed to disk”, “a TCP packet is being sent over the network” or “a context switch is now occurring”.

    Kernel tracepoints allows tracing custom events that the kernel developers have defined. These events are not on the level of function calls. Kernel developers manually place TRACE_EVENT macros in the kernel code to define such tracepoints.

    There are pros and cons to both. Kprobes work “automatically” because they do not require kernel developers to manually annotate their code, but kprobe events can change arbitrarily from kernel version to version because functions can be added, removed or renamed at any time.

    Kernel tracepoints tend to stay more consistent over time, and can provide useful context information that may not be available in kprobes. With kprobes, one can access function call arguments. But with kernel tracepoints one can access any information the kernel developer decided to manually annotate.

  • On the userspace side, uprobes is like kprobes, but works on the userspace level. It’s for tracing userspace function calls.

    USDT probes (“Userland Statically Defined Tracing”) are like kernel tracepoints for userspace. App developers have to manually annotate their code with USDT probes.

    An interesting trivia: DTrace has long provided a C API for defining the DTrace-equivalent of USDT probes (through the DTRACE_PROBE macro). The Linux tracing ecosystem developers decided to stay source-compatible with that API, so any DTRACE_PROBE macros are automatically converted to USDT probes!

So in theory, strace can be implemented by using kprobes, and ltrace can be implemented using uprobes. I am unsure whether they already do that or not.


Frontends are apps that allow users to easily make use of the event sources.

Let’s consider how the event sources work. They roughly operate like this:

  1. The kernel exposes a mechanism – typically some /proc or /sys file that you can write to – to register an intent to trace an event and what should happen when an event occurs.
  2. Once registered, the kernel looks up the location in memory of the kernel/userspace function/tracepoint/USDT-probe, and modifies its code so that something else happens.
  3. The result of that “something else” can be collected later through some mechanism.

You don’t want to do all this by hand! So that’s where frontends come in: they do all that stuff for you.

Frontends come in many flavors. In the area of eBPF-based frontends, some are very low-level and require you to intiminately understand how to interface with the event sources and how eBPF bytecode works. Others are high-level and easy-to-use, but historically such frontends have had limited flexibility.

And that’s why bpftrace – the newest frontend – is my favorite. It is easy and flexible, like DTrace. But it’s quite new and so rough around the edges.


eBPF is the new hotness in Linux tracing land and is what powers bpftrace. So when you trace an event, you want “something” to happen in the kernel. How do you define what “something” is in a flexible way? Why, with a programming language (or, computer-science-equivalently, with some sort of machine code) of course.

But what language, or what machine code? Native machine code is very dangerous, because when run by the kernel you can easily corrupt the kernel, bypass security, etc. Wouldn’t it be nice if the language or machine code in question is limited in power (can only access safe portions of memory), and non-turing-complete (guaranteed to halt)?

Hexbyte  Hacker News  Computers An attacker tells kernel to run some x86 code whenever a TCP packet is sent. The code then emails kernel secrets to the attacker.
We don’t want this to happen

Enter eBPF, short for Extended Berkeley Packet Filter. It’s a high-performance virtual machine that runs in the kernel with the following properties/limitations:

  • All interactions with userspace happen through eBPF “maps” which are key-value stores.
  • There are no loops, so every eBPF program will finish within some bounded execution time.

But wait, “packet filter”? Yep: this was originally developed for filtering network packets. It’s a surprisingly similar problem: when a packet is received (an event occurred), you want to perform some sort of administrator-defined action (reject, drop, forward, log, etc). This has to be fast, and so a virtual machine (with optional JIT compilation) was invented. The “extended” part refers to the fact eBPF is an improvement on the original Berkeley Packet Filter, in the sense that it’s usable outside a networking context.

So there you go. With bpftrace you define what events to trace, and what should happen in response. Bpftrace compiles your high-level-bpftrace-language program to eBPF bytecode, listens on events and uploads the bytecode to the kernel.

The dark days before eBPF

Before eBPF entered the scene, the solutions were awkward, to say the least. SystemTap – which is sort-of the “most serious” Linux-land predecessor to bpftrace – compiles SystemTap scripts into C which in turn is compiled into a kernel module. That kernel module is then loaded.

This approach was very fragile and was not well-supported outside of RHEL. I never got it to work well on Ubuntu, which had a tendency to break SystemTap on every kernel upgrade due to changing kernel data structures. It was also slow because on every invocation of SystemTap it had to compile a kernel module. It is also said that SystemTap in the early days easily caused kernel panics.

Installing bpftrace

Let’s get our hands dirty! This tutorial teaches you to use bpftrace on Ubuntu 18.04. Newer Ubuntu versions are not recommended because we need to install some software for which packages have not been built for newer Ubuntu versions.

Installing prerequisites

First, install Clang 5.0, libclang 5.0 and LLVM 5.0, including all development headers. We’ll use packages provided by llvm.org because the ones in Ubuntu have an issue.

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
cat <<EOF | sudo tee -a /etc/apt/sources.list
deb http://apt.llvm.org/xenial/ l

Read More