Switching To Archlinux

Arch Linux LogoI’ve been running Debian/testing on my desktop PC since 2004 without reinstalling anything and it has served me very well. Besides some minor problems, mostly with frustrating software like wine and Exim4, I’ve never ran into any serious problems. Debian is just great, but there are some reasons why Archlinux is better for me:

  • Arch uses a BSD-style initialization structure, I’ve never really understood Sys-V, I probably never will and in fact I don’t want to.
  • Arch packages are bleeding-edge. I hate compiling my own packages “outside the framework” which tends to leave my system like a dirty kitchen if I have to uninstall them.
  • “The Arch Way” – Keep it simple, stupid.
  • Because of KISS it’s lightweight per default and I certainly don’t need a one-size-fits-all-system. Purifying a Debian-system is possible, but most certainly a lot of hassle.

Another reason that has nothing to do with Debian is that I have neglected my way of organizing my files over the years. I have loads of unused configuration files, weird directories called “stuff/”, “stuff2/”, “random_stuff/” etc. I just want to start fresh with some changes:

  • using LVM
  • keeping my file system clean and efficient, having a sensible home hierarchy etc
  • keeping and improving my backup solution (it already is pretty good, but not perfect)
  • keeping my dotfiles and other configuration files under svn control
  • Replace all my passwords, it’s always good to do that once ina while. It will be annoying though because Ihave a LOT of them ;(

I’ve already spent two hours to check if all of my data is safe, I’ve created extra backups for my mail, GPG keys and diploma stuff on an old-fashioned optical medium, I think it’s time… see you in a new (digital) life… goodbye Debian

Invalid…

Invalid!

…and thus exempted from contributions :-)

My Dripping Faucet

my faucetHooray, it is world water week! Wait a minute… who cares? Certainly not my faucet in the kitchen which is trying to perform an acoustic variation of Chinese water torture on me every night. Maybe I’m too harsh and its intentions are to soothe me into sleep all the time with its rhythmic dripping-patterns which raises one question: Does my faucet stay in time? And if so: how rhythmically talented is it? Shouldn’t be too hard to determine.

I’m equipped with an Edirol R-09 which is capable of recording audio up to a sample rate of 96000Hz and there’s an awesome package for R called seewave that offers plenty of tools for sound analysis.

The first step is to record the dripping and keep the background noise low enough so that each drips is prominently represented by an amplitude peak. Then we have to extract distinct points of those peaks to get a sequence of points in time. Then we can calculate the intervals between those points and check the standard deviation of the resulting sample distribution.

Since my machine is pretty old i recorded the dripping at a sample rate of 8000Hz. Compared to how precise I’m able to record, I think that a resolution of 0,125ms is precise enough for my purposes. In fact I recorded one hour of dripping at 96kHz and just extracted 20min of the recording and downsampled it to 8kHz. R proves to be quite a resource hog anyway.

Here’s the R script to get my results. I tried to thoroughly document it so that every step is comprehendable.

# File-Name:       drips.R
# Date:            2010-09-09
# Author:          Christian Gießen
# Email:           cgie at informatik dot uni-kiel dot de
# Packages Used:   seewave, tuneR
 
# Copyright (c) 2010, under the GPL
# See http://www.gnu.org/licenses/ for further information
 
library(tuneR)
library(seewave)
 
# load our data
dripsoundfile<-"drips.wav"
drips<-readWave(file=dripsoundfile)
sr<-drips@samp.rate
 
# basic visualizations
dir.create("images", showWarnings=F)
 
# Oscillogram of the unprocessed soundfile
png("images/drips.png"
  ,width=600
  ,height=400
  ,res=100
  ,bg="transparent"
  ,type="cairo"
  ,antialias="subpixel"
)
oscillo(drips,f=sr)
title(paste("Oscillogram of",dripsoundfile))
dev.off()
 
# The variable amplitude_threshold is the percentage of the maximum amplitude of the soundfile,
# everything with a smaller amplitude will be filtered.
# The drip_duration variable is the number of samples that we expect the sound of a single drip to be.
amplitude_threshold<-65
drip_duration<-250
 
# Oscillogram of the filtered soundfile
png(paste("images/drips",amplitude_threshold,".png",sep="")
  ,width=600
  ,height=400
  ,res=100
  ,bg="transparent"
  ,type="cairo"
  ,antialias="subpixel"
)
drips_filtered<-afilter(drips,f=sr,threshold=amplitude_threshold,plot=T)
dev.off()
 
# Since the sound of a single drip extends over a certain number of samples
# we just consider the index of the last sample of a drip. # First we take the indices of samples with amplitude > 0
i_peaks_0<-which(drips_filtered!=0)
n<-length(i_peaks_0)



# Consider only indices of those samples which are followed by at least
# the number of zero-valued samples defined by drip_duration,
# that is effectively the index of the last sample of a drip as described avove
i_peaks<-i_peaks_0[(i_peaks_0[2:n]-i_peaks_0[1:(n-1)])>drip_duration]
 
# Convert samples to seconds, so the vector s_peaks now consists of the
# more or less exact point in time that my faucet dripped.
s_peaks<-i_peaks/sr
n<-length(s_peaks)
 
# Compute the intervals between two drips
s_peaks_1<-s_peaks[2:n]
s_peaks_diff<-s_peaks_1-s_peaks[1:(n-1)]
 
# Save the important data...
dripdata<-data.frame(seconds=s_peaks[1:(n-1)], diff=s_peaks_diff)
write.table(dripdata,file="dripdata.txt")
 
# ...so we can start from here once it's been computed
dripdata<-read.table("dripdata.txt")
 
# Plot the dataframe
png(paste("images/dripsdiff",amplitude_threshold,".png",sep="")
  ,width=600
  ,height=400
  ,res=100
  ,bg="transparent"
  ,type="cairo"
  ,antialias="subpixel"
)
plot(dripdata, pch=20)
dev.off()



# Plot the kernel density estimation
png(paste("images/dripsdensity",amplitude_threshold,".png",sep="")
  ,width=600
  ,height=400
  ,res=100
  ,bg="transparent"
  ,type="cairo"
  ,antialias="subpixel"
)
plot(density(dripdata$diff)
  ,ylab="Density"
  ,xlab="Interval bewteen two drips\n(in seconds)"
  ,main="Kernel Density Estimation"
)
dev.off()



# There are some obvious outliers in the data, so let's get rid of those.
diff_clean<-(dripdata$diff[dripdata$diff<=3.95])
diff_clean<-diff_clean[diff_clean>=3.5]
 
# Plot the estimation again and add the actual density of a N(m,s) distribution
# with the mean m and standard deviation s from our data
png(paste("images/dripsdensitycleaned",amplitude_threshold,".png",sep="")
  ,width=600
  ,height=400
  ,res=100
  ,bg="transparent"
  ,type="cairo"
  ,antialias="subpixel"
)
plot(density(diff_clean)
  ,col="red"
  ,ylim=c(0,17)
  ,ylab="Density"
  ,xlab="Interval bewteen two drips\n(in seconds)"
  ,main="Kernel Density Estimation (cleaned)"
)
curve(dnorm(x,mean=mean(diff_clean), sd=sd(diff_clean)),add=T,col="green")
legend("topleft",inset=0.0,c("estimation","normal density"),lty=1,col=c("red","green"))
abline(v=mean(diff_clean),lty=2)
text(mean(diff_clean),0,as.graphicsAnnot(formatC(mean(diff_clean),format="f",digits=2)), pos=3)
dev.off()


Quite nice, as i might add. So my faucet drips every 3.81… now what’s the standard deviation?

> sd(diff_clean)
[1] 0.02341201

So my faucet’s dripping follows a \mathcal{N}(3.81,0.023) distribution, interesting. That means that in 95 of 100 cases the next drip will be in 3.81s +/- 0.047s. Not too bad, but i wouldn’t want my faucet to be a drummer in a band, but then again: I’ve seen very bad bands who’d be better off with my faucet.

The Optimal Strategy

I’ve found an article called Optimal Play of the Dice Game Pig by Neller and Presser that discusses a variant of the dice game proposed by the Linux Magazin, the only difference is the winning score and that rolling a 1 leads to passing the die to your opponent instead of 6, so none of these ideas are mine, but the paper is really interesting and Pig or its variant turn out tobe quite complex. In the following I will explain finding the optimal strategy for the LM variant with the ideas from Neller and Presser.

The article starts with a review of previous work regarding Pig and that Knizia [1999] points out that a SLimS strategy is near optimal for every of your turns. But a SlimS strategy ain’t optimal since a turn-based strategy doesn’t need to be a strategy for winning the whole game. For example: if your opponent has a score of 49 it wouldn’t be a good idea to follow the SlimS(16) strategy described in my earlier posts since your opponent is very likely to win in the next round.

So the idea is to maximize the probability of winning which depends on the player’s score i, the opponent’s score j and the player’s score k achieved in his turn so far, so let p_{i,j,k} be the probability of winning at this point. If i+k\geq 100, p_{i,j,k} is obviously 1. More generally we simply define

p_{i,j,k} := \max\{p_{i,j,k}^{\text{roll}}, p_{i,j,k}^\text{hold}\},

in order to maximize the winning probability.

Well, what is p_{i,j,k}^{\text{roll}}? It’s the expected value of the random variable that yields your winning probability after rolling. Let’s define it: X:\haken{6}\to [0,1] with X(a):=p_{i,j,k+a} if a\in\haken{5} and X(6):=1-p_{j,i,0} which is the probability that your opponent does not win in his or her turn which is next, since you’ve rolled a 6. The expected value is simply

\E(X)=\frac{1}{6}\left((1-p_{j,i,0})+\sum_{a\in\haken{5}} p_{i,j,k+a}\right).

The probability p_{i,j,k}^{\text{hold}} is, as before, just the probability that your opponent does not win, but since you’ve passed the die, you can keep your achieved score and thus p_{i,j,k}^{\text{hold}} = 1- p_{j,i+k,0}.

Alright, but how do we get actual numbers? It looks kinda recursive and in fact that would be great, since we have a termination condition we could use: p_{i,j,k}=1 for i+k\geq 100. The big problem is that these equations are cyclic dependent. Calculating p_{i,j,0} needs p_{j,i,0} needs p_{i,j,0} … you get the point, the reason for this is that the game states can repeat because we can roll sixes and thus don’t receive any points that might change our state. We’d get infinite loops if we would implement it that way, or put differently: dynamic programming fails in this case.

A nice solution to this problem is to approximate the probabilities numerically with value iteration as described by Sutton and Barto. The dependencies of the equations are represented by transitions in our state space that consists of all possible (i,j,k) tuples. For each transition there is an expected reward which is to be maximized until a certain point by this method. Since I’m not too much into numerics and all we need to know is that it works, here’s the algorithm in pseudocode:

For all (i,j,k)\in\mathcal{S} initialize p_{i,j,k}:=0
  while \Delta \geq \varepsilon
    for all (i,j,k)\in\mathcal{S}
    p':=p_{i,j,k}
    p^\text{roll}:=p_{i,j,k}^\text{roll} (as described above)
    p^\text{hold}:=p_{i,j,k}^\text{hold} (as described above)
    p_{i,j,k}:=\max\{p^\text{roll},p^\text{hold}\}
    \Delta:=\max\{\Delta, |p_{i,j,k}-p'|\}

\mathcal{S} denotes the state space, that is, all possible or feasible combinations of (i,j,k). It is in fact a proper subset of

This is my implementation in R:

win <- 50
eps <- 10**(-5)
p <- array(0, dim = c(win, win, win))
roll <- array(0, dim = c(win, win, win))
 
get_p <- function(i, j, k) {   if ((i+k) >= win) 1 else if (j >= win) 0 else p[i+1, j+1, k+1]
}   
 
d <- 1 while (d >= eps) {
  d <- 0
  for (i in c(0:(win-1))) {
    for (j in c(0:(win-1))) {
      for (k in c(0:(win-i-1))) {
        pRoll <- (1/6) * (1 - get_p(j, i, 0)
          + get_p(i, j, k + 1)
          + get_p(i, j, k + 2)
          + get_p(i, j, k + 3)
          + get_p(i, j, k + 4)
          + get_p(i, j, k + 5)
        )
        pHold <- 1 - get_p(j, i+k, 0)
        pp <- p[i+1, j+1, k+1]
        p[i+1, j+1, k+1] <- max(pRoll, pHold)
        if (pRoll > pHold) roll[i+1, j+1, k+1] <- 1 else roll[i+1, j+1, k+1] <- 0
        d <- max(d, abs(p[i+1, j+1, k+1] - pp))
      }
    }
  }
}

It was a real bitch to do this in R. Why isn’t there a switch to let arrays start at base 0 in R, gah! And it’s pretty slow, too, it took quite some time:

real    101m58.255s
user    89m36.330s
sys    0m44.486s

The state space doesn’t equal \haken{50}^3, only feasible states are traversed. As you can see I’m keeping yet another array called “roll”, it’s a three-dimensional array, representing the search space (or rather a superset of it). It is initialized with zero-entries, but after the algorithm has finished its entries represent the decision whether to roll (1) or not (0), so this array actually is the decision table of the optimal strategy.

Now, if you take all states (i,j,k) at which you should roll and interpret them as coordinates to put a neat little box in $\R^3$ there you get the very interesting looking object from my last post. I didn’t know how to plot it nicelyin R or with gnuplot, so i just piped these filtered coordinates into a file, did some regular expression magic and put them in a form that good old povray can understand. Povray then renders the strategy landscape. I improved the picture by adding three axes through the origin. I still like this picture a lot.

optimal strategy

And from another perspective:

another representation of the optimal decision table

To be honest: I have mixed feelings participating in that contest now. First, the problem is already solved, so there’s little use to it if everyone would implement that strategy. Second, I’d feel bad because the ideas are not mine. I’ll participate nonetheless, because I’m certainly not the only one who’s implementing an optimal strategy and since it is still a game of pure chance my chances aren’t very high anyway, so I don’t really see it as cheating. It certainly was a lot of fun and I’ve refreshed my skills in bash, R, povray, sed and learned a lot from that paper, so I already have won, in a way.

For the sake of completeness: I let my bot compete against a SLimS(16) bot and the results were 5319 wins out of 10000 games which is significantly better than SLimS(16) at a p-level of less than Formula does not parse: 10^{-11}.

Hybrid Strategies Revisited

Just a quick post: In a very long rerun of the tests regarding HLimS(17,7) I came to the conclusion that HLimS(17,7) is definitely better than SlimS(16). Here are the results:

Bot          Matches  Wins   Defeats  WinRatio[%]  StartRatio[%]
HLimS(7,17)  100000   50659  49341    50.66        50.07
SlimS(16)    100000   49341  50659    49.34        49.93

Thanks to user Solarix from the official LM Wiki who implemented the wonderful multithreaded Qt-DiceOrDie-Server i used to get those results.

Testing for the alternative hypothesis that HLimS(7,17) is better yields the following results in R:

> binom.test(50659,100000,p=0.5,alternative="g")

 Exact binomial test

data:  50659 and 1e+05
number of successes = 50659, number of trials = 1e+05, p-value =
1.558e-05
alternative hypothesis: true probability of success is greater than 0.5
95 percent confidence interval:
 0.5039844 1.0000000
sample estimates:
probability of success
 0.50659

The p-value almost vanishes, so the alternative hypothesis is accepted. It’s significantly better, but only marginal as you can see.

Constructing a better strategy requires rigorous stochastic analysis and as you might guess… I did it. I’m very sleepy right now, but I’ll let you have a look at this:

Strategy Plot

Neat, isn’t it? This picture explains the strategy of the bot I’ll be competing with in the contest, but I’ll explain it in my next post, I’m too sleepy right now. Good night!

Analyzing a Hybrid Strategy

In my last blog I mentioned how baffled I was about the effectiveness of a very basic strategy for the mentioned dice game. I admit I’m not a coding nerd, quite the contrary: I made a grave mistake in the implementation and the results presented in my test runs represent a slightly different strategy. The original strategy was to keep on tossing the die until you sum up at least a determined score for your round (unless a 6 occurs), 15 for example. My implementation was evenmore simple: it kept on rolling a constant number of times, say 5 times.

It’s actually very similar to the original strategy and it reminds me of the difference between (1+1) EA and RLS, the only difference are the mutation steps: While (1+1) EA flips each bit indpendently with a constant probability (mostly 1/n), RLS picks one bit uniformly at random and flips it. Thus the expected value of bit flips is both 1, but (1+1) EA gets at least a chance of exploring larger parts of the search space, although big jumps aren’t probable. The similarity to the above mentioned strategies is the fact that if you follow the “roll limit” strategy (RLimS) with a limit of 5, you get a score of 15 in expectation for that round, provided that no 6 occurs (the expectation for a single throw is 3, not 3.5 since 6 doesn’t count, just in case you were wondering). Intuitiveley the “score limit” strategy (SLimS) seems more flexible since the available set of limits is larger, you can easily set your limit to 13, 14, 17 etc, while the expected scores obtained by RLimS are limited to multiples of 3. Anyway, i just wanted to mention this analogy which is related to the expectation, I’d really like to know if that’s of any scientific use.

Over night I ran several test-series for both strategies with reasonable parameters on the official test server by Linux Magazin. Each strategy was tested in at least 500 matches. These are the results:

Results of the test seriesOf course the comparison isn’t significant at all: Seemingly RLimS yields worse results, but the result is slightly skewed because the results for RLimS include a rotten apple: parameter 3. Anyway: As you can see, SLimS(16) has a win ratio of more than 50% which is, hands down, impressive, no matter what bots it had to compete against. What strikes me most is the fact that the test results are extremely similar. Parameter 3 for RLimS aside, it seems that no matter what limit you choose you’ll be well off.

The good performance of both strategies immediately motivates the analysis of hybridizing the techniques, that is: you choose to pass on the die if you’ve either reached a certain score or a certain number of rolls. To check this i started with running the best strategy, namely SLimS(16) locally against other bots first to check if it truly was the fittest. Each pair of different opponents competed in 5000 matches against eachother, so the base data isn’t too bad and it took quite some time. I present the data as a box plot and a simplified plot that only takes the mean ratio of each pair of bots,  these are the results:

As you can see, SLimS(16) still beats every other instance of any RLimS or SLimS bot and even my guess that both strategies correspond is clearly visible. There’s a second peak for SLimS in the upper twenties (and thus for RLimS around ten) which is probably due to the fact that if you achieve to roll for 10 times or up to a score of almost 30 you only need two or three large (and thus rather improbable) streaks to win because you’ll gather a lot of points anyway. It’s like a weak form of the brutal strategy to hope for a streak of no sixes up to a score of 50 which is, of course, very improbable, it’s less than5%. The global peak for SLimS is of course at 16 as it was to be expected. Don’t be fooled by the plot, SLimS(16) is only marginal above 0.5.

The next step is to consider above mentioned hybrid strategies. At first it might seem useless because instead of being conservative obeying one rule, you’re still conservative but you’re following two rules. That’s not the point, each strategy can be played out more or less risky. The idea is to find the sweet spot by combining both rules. It’s like trying to reach 26 with a multiple of 7 or with multiples of 4: you won’t reach 26 with limiting yourself to one of these numbers, but if you combine them… well you get the point.

This part was intended to be the most interesting one, but unfortunately i just deleted all my data for the hybrid bots that combine both limits, but never mind: SLimS(16) beat them all. The most promising bot so far was HLimS(7,17) (a hybrid of RLimS(7) and SLimS(17), but the results after 10000 runs still weren’t significant at a p-value of around 0.2 for an exact binomial test with alternative hypothesis thatHLimS(7,17) is better:

> binom.test(5044,10000,p=0.5,alternative="g")

 Exact binomial test

data:  5044 and 10000
number of successes = 5044, number of trials = 10000, p-value = 0.1922
alternative hypothesis: true probability of success is greater than 0.5
95 percent confidence interval:
 0.4961259 1.0000000
sample estimates:
probability of success
 0.5044

All other strategies didn’t even come remotely close.

Hybridizing doesn’t seem to be very effecticve unless I did any grave mistakes which i doubt, maybe I’ll come up with another idea, but in fact I don’t intend to do any further research on this since SLimS(16) is so powerful and even if there is a better strategy than that I doubt that it can be enormously grave. Type I errors are very high, too which diminishes the chances of winning in the programming contest anyway since the bots only have  100 matches each.

My conclusions: utterly fucked up pre-requisites for a problem in a programming contest.

Simple Strategies

The German Linux-Magazin is sponsoring a programming contest for the following problem: Two players compete in a dice game, every player can roll a six sided die as often as he wishes. As long as no 6 is rolled the numbers are added up and the player can pass the die to the other one whereupon his score is incremented by the cumulated sum he was able to obtain in his round. However, if a 6 is thrown he falls back to the score he started his round with and has to pass the die, a very simple game. The rolling of the dice is done by a central server and people can let their implemented bots compete against each other for testing issues until September 12, you’re not even bound to some obscure programming language as long as your code is executable with packages from an official Ubuntu 10.04 repository which is pretty much any language I can think of.

My first idea was to leave the decision of rolling/saving to be completely random with a fix probability p of saving, I played around with the values for p a little, letting my bot run 100 times against the other bots each time, but I couldn’t get a higher win/loss ratio than 20%.

My second idea was to make the (still random) decision depend on the difference of the scores. If my score is a lot higher than my opponent’s I can allow myself to be more risky. On the other hand: If my opponent is far ahead, it wouldn’t make much sense to adhere to some conservative strategy, which makes the whole problem symmetrical and motivated a normal distribution for the p-values with mean 0, a proper value for the variance and some mad scaling to map the interval of possible differences [-50,50] to something like [0.3, 0.7] accordingly. I ended up with

20\cdot\frac{1}{\sqrt{2\pi \cdot 100}}\cdot\exp\left(-\frac{x^2}{2\cdot 100}\right),

which did pretty much exactly what I intended. Well, besides some obvious stupid flaws like saving if you have 0 points in the beginning and other minor things that could be corrected with one-liners , the strategy didn’t prove to be much better than my first one, the best result I ended up with was 36%. I might add that I chose the good old Bourne Shell to implement my bot, and using bc for floating point arithmetics works great, but it is a little awkward.

So I paused a little while and scrounged a ciggie from my mother and asked her what strategy she’d pursue. I didn’t even think of it, but her strategy was simply not to save until a certain score in that round is achieved. Since that’s no more than a one-liner, I implemented it and boom! Now I end up with a win/loss ratio of around 50%, thanks mom!

These are my test runs with different score limits before my bot saves. The runs (1000 games each) aren’t really representative since I don’t know the quality of the bots I’m competing against but I won’t dig any deeper.

  • Limit: 6, 474 wins
  • Limit: 7, 514 wins
  • Limit: 8, 473 wins
  • Limit: 9, 501 wins

An application of the Fitness-Level method

In the last post I’ve presented the fitness-level method, a very basic approach to obtain upper bounds for the expected optimization time of evolutionary algorithms. In order to apply this method to a problem (you should know your fitness function f) you just have to abide by the following recipe:

  1. Define a <_f-partition (L_i)_{i\in \haken{k}_0} \in (2^{\{0,1\}^n})^{\haken{k}_0}
  2. Find lower bounds for the probabilities s_i of mutating from L_i into a higher level.
  3. Compute the upper bound.

The artificial problem of finding an element in \{0,1\}^n with as many ones as possible can be easily described as the problem of maximizing the function \textsc{OneMax}:\{0,1\}^n\to\haken{n}_0, x\mapsto\sum_{i=1}^n x_i.

Step 1: Let’s follow our recipe and define a partition. We’ll simply use the canonical decomposition of the search space regarding \textsc{OneMax}:

\forall i\in \haken{n}_0: L_i:=\{x\in\{0,1\}^n\;|\;\textsc{OneMax}(x)=i\}

This is also called the canonical f-based, or in this case \textsc{OneMax}-based partition. Obviously these sets form a partition, are ordered by <_\textsc{OneMax} and L_n contains all global optima, which in this case is the single point 1^n.

Step 2: Now suppose our search point is in L_i, that means that it consists of i ones and n-i zeroes. To mutate into a higher level only one of those n-i zeroes has to be flipped which leads us to

\P(\text{mutating up}) = \sum_{j=1}^{n-i}\binom{n-i}{j}\left(\frac{1}{n}\right)^j\left(1-\frac{1}{n}\right)^{n-j}\geq\binom{n-i}{1}\frac{1}{n}\left(1-\frac{1}{n}\right)^{n-1}=:s_i,

and since e^{-1} is sandwiched by \left(1-\frac{1}{n}\right)^n and \left(1-\frac{1}{n}\right)^{n-1} we have s_i \geq \frac{n-i}{en}. This sandwiching argument is very frequently used by the way, so make sure you will never forget it, few things are more useful than that.

Step 3: We can immediately apply the bound from the previous post:

\E(T_{\text{(1+1) EA},\textsc{OneMax}}) \leq \sum_{i=0}^{n-1} s_i = \sum_{i=0}^{n-1} \frac{n-i}{en} = en \sum_{i=1}^n \frac{1}{i} = en H_n = O(n \log n),

due to the complexity of the n-th harmonic number.

At first you might think that this ain’t too bad for an expected optimization time, then you think of the simplicity of the problem and you find yourself smiling at such a bad performance, but in the end you realize that the (1+1) EA is a black box algorithm! Think about it… nature just works.

The Fitness-Level Method

This is the first in a series of posts regarding techniques for analyzing the computational complexity of evolutionary algorithms. I mainly consider it part of my self-studies to clear things up and I don’t claim correctness or completeness.

For the sake of simplicity I’ll discuss the fitness-level method or method of fitness-based partitions for the (1+1) EA first.

(1+1) EA

  1. Choose x\in\{0,1\}^n uniformly at random.
  2. Create x' by flipping each bit of x independently with probability 1/n.
  3. If f(x') > f(x) then x:=x'.
  4. Repeat steps 2 and 3 forever.

First, the search space \{0,1\}^n is partitioned according to the fitness value: let k\in\N, (L_i)_{i\in \haken{k}_0} \in (2^{\{0,1\}^n})^k be a partition of the search space such that \forall i,j\in\haken{k}_0: i<j \Rightarrow L_i <_f L_j, and L_k only contains global optima. This is called an f-based partition and each set represents a fitness level, hence the naming.

Take x\in \{0,1\}^n, we find i\in\haken{k}_0 such that x\in L_i. The probability m_i(x) that a Mutation x' from x is in \bigcup_{j=i+1}^k L_j, provided that x is in L_i, is

m_i(x) = \sum_{y\in \bigcup_{j=i+1}^k L_j} \left(\frac{1}{n}\right)^{H(x,y)}\cdot\left(1-\frac{1}{n}\right)^{n-H(x,y)},

where H(x,y) denotes the Hamming distance.

Let T(x,i) be the random variable that describes the number of mutations until x advances from L_i to a higher level. Obviously T(x,i)\sim Geom(m_i(x)) since \P(T(x,i)=t)=m_i(x)\cdot\left(1-m_i(x)\right)^{t-1} for all t\in\N due to the fact that we count each mutation as one step and because the fitness cannot decrease. Thus we get \E(T(x,i))=m_i(x)^{-1}.

Defining T(i) := \max_{x\in L_i} T(x,i) and T := \sum_{i\in\haken{k-1}_0} T(i) we have \E(T) = \sum_{i\in\haken{k-1}_0} \max_{x\in L_i} m_i(x)^{-1}, hence \E(T) = \sum_{i=0}^{k-1} (\min_{x\in L_i} m_i(x))^{-1}.

Note that the expected optimization time \E(T_{(1+1) EA,f}) is bounded by \E(T) since we actually just considered lower bounds s_i := \min_{x\in L_i} m_i(x) on the probability of levelling up by mutation which leads us to

\E(T_{(1+1) EA,f}) \leq \sum_{i=0}^{k-1} \frac{1}{s_i}

The result is actually pretty obvious: every fitness level can only be left at most once. Adding up the upper bounds for the expected numbers of steps for each level yields exactly our result. In the next post I’ll show an application of this.

cMVO2 Test

This is the first post on my new blog and I’ll just start right off. In the past couple of months I grew fairly fond of kettlebell-training, kettlebells are just awesome tools to challenge your strength, endurance and flexibility while still having lots of fun, I really dig Turkish Getups and snatches with my 16kg kettlebell. One of the coolest routines I’m following is the 15:15 MVO2 Protocol described by master RKC Kenneth Jay in his book Viking Warrior Conditioning. Buying the book would actually be pretty useless because it’s summarized in a couple of words, so this is not meant as an endorsement at all.

Anyway, the mentioned protocol can be described as a long HIIT-snatch session except for the fact that your work/rest intervals are 15s each, switching arms after every set. The goal is to start with a comfortable weight (16kg is enough for me for now) and work up until you finish 80 sets, which means doing a 40min workout consisting of 20min work, 20min rest. I’ve been doing this protocol at cadence of 7 snatches/15s for a while now, always finishing after 40 rounds, which means performing 280 snatches and I feel very comfortable so I decided that I’ll try working up to 80 sets.

Kenneth Jay recommends finding one’s cadence with his cMVO2 test: You’re snatching your kettlebell for 5 minutes, performing 10, 14, 18, 22, MAX snatches every minute while switching arms every minute. MAX is the number of snatches you’re able to perform in the fifth round. Your cadence that elicits a VO2max response for the 15:15 MVO2 protocol is MAX/4 then.

I’ve always found that 7 is very comfortable for me and i just did the cMVO2 test for the first time and yeah, i got 29 reps in the fifth minute. In fact i thought i’d be a lot better, but the test itself is pretty demanding. Snatching for a whole minute at full throttle put a lot of stress on my forearm. In my next 15:15 MVO2 workout I’ll try to perform 50 rounds.

To keep track of my progress I’ll create a static wordpress page showing a diagram of my work volume/date.

To perform the test I’ve created a simple “pacemaker”-mp3 for my snatches, you can download it here. You perform a snatch at every high-pitch sound, switch hands it it’s a double, count your reps when you hear nothing. The low-pitch sounds help you to find your rhythm. Oh, and keep your arm locked out over your head when you’re resting between your snatches. Have fun…