Simple way to plot multi-color line in matplotlib

There are multiple ways of plotting a multi-color line in matplotlib. A special property of the plot command allowing plotting multiple datasets in columns of 2D arrays makes it possible to do this with a single plot command using the cycler for the colors. Here are a sample code segment for this.

Generate sample data

import numpy as np
import matplotlib.pyplot as plt
from cycler import cycler

x = np.linspace(-np.pi,np.pi,9)
y = np.sin(x)
rng = np.random.default_rng(123)
c = rng.uniform(size=(len(x),3))

plt.scatter(x,y,c=c)
plt.show()

Quick plotting with line colors

plt.gca().set_prop_cycle(cycler('color',c[:-1]))
plt.plot(np.c_[x[:-1],x[1:]].T,np.c_[y[:-1],y[1:]].T)
plt.show()

More careful plotting with point colors

y_ = (y[:-1]+y[1:])/2
x_ = (x[:-1]+x[1:])/2

plt.gca().set_prop_cycle(cycler('color',c[:-1]))
plt.plot(np.c_[x[:-1],x_].T,np.c_[y[:-1],y_].T)
plt.gca().set_prop_cycle(cycler('color',c[1:]))
plt.plot(np.c_[x_,x[1:]].T,np.c_[y_,y[1:]].T)
plt.show()

Day of the week calculation

For given date in year-month-day, the day of the week can be deduced for the rule of leap years [1] and the fact that January 1, 1 is a Monday [2].

Since a year is a leap year if the year number is a multiple of 4 that is not a multiple of 100 that is not a multiple of 400. The number of leap year up to year y is given by:

def num_leap(y):
    return y//4-y//100+y//400

For a common year, the numbers of days for the twelve months are:

[31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

The number of days before each month in a common year is calculated with:

p = 0
days_before_month = [
    [p, p := p + m][0] for m in
    [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
]
print('[',', '.join(map(str,days_before_month)),']',sep='')

The result is:

[0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]

Since the extra day of a leap year is added to February, the leap status of current year is only relevant when month number m<3. We can calculate the number of days from January 1, 1 as:

def day_number(y,m,d):
    return (y-1)*365+num_leap(y-(m<3))+days_before_month[m-1]+d

Or, pack everything in the function:

def day_number(y,m,d):
    return (y-1)*365+(lambda y:y//4-y//100+y//400)(y-(m<3))+[0,31,59,90,120,151,181,212,243,273,304,334][m-1]+d

[1] https://en.wikipedia.org/wiki/Leap_year
[2] https://en.wikipedia.org/wiki/AD_1

Walrus operator can inline a block of code into one-liner

Walrus operator, or assignment expression, was introduced into Python at version 3.8. This saves some additional line of assignment in storing-and-checking scenario following evaluations.

However, it also brings the containers data types, such as a list, a step closer to the Turing completeness. Following is a piece of real code:

[d:=get_scans(fn),b:=d[0],c:=d[1],c[np.isin(b,bs)][np.argsort(b)]][-1]

Or, in a more structured format:

[
    d:=get_scans(fn),
    b:=d[0],
    c:=d[1],
    c[np.isin(b,bs)][np.argsort(b)]
][-1]

This performs some additional processing from the result returned by get_scans(fn) and turn the whole thing into a value by extracting the last element of the list. One application of this kind of composite expressions is the possibility of replacing traditional loops with list comprehension. For example, without assignment expressions, we need a loop to collect the data:

r = []
for fn in fns:
    [b,c] = get_scans(fn)[:2]
    r.append(c[np.isin(b,bs)][np.argsort(b)])
r = np.array(r)

Using the walrus operator, we can do it in “one-line” (which is broken down to several lines for clarity):

r = np.array([[
    d:=get_scans(fn),b:=d[0],c:=d[1],
    c[np.isin(b,bs)][np.argsort(b)]
][-1] for fn in fns])

Some may consider this is an abuse of the container types of Python. But, as long as it is used with care, it can help to make elegant code without hindering readability.

Certificate for courier esmtpd

To enable SSL/TLS support for the ESMTP, you need to have a server certificate. Usually, the installation process of the package in a Linux distribution will create a default, self-signed certificate for you. However, if you want to create a proper certificate for your site, following is some simple steps to do so.

First, you need to generate a key for your server if you don’t already have one:

openssl genrsa -out server.key 2048

With that key, you can then generate a certificate request:

openssl req -new -key server.key -out server.csr

If you did not customize your openssl.cnf configuration file, the above command will prompt you for the details of identify for the server in the certificate. Answer all questions as you please except for the common name “CN”, which should be the host name to connect to your server.

Now, you need to get a Certificate Authority to sign your request. For example, if you have a demoCA setup for your openssl installation, you can do:

openssl ca -config openssl.cnf -policy policy_anything -out server.crt -infiles server.csr

This results in the certificate file server.crt. You then can combine the server key and certificate files to create the certificate file for the courier mail server.

cat server.key server.crt > esmtpd.pem

This used to be sufficient. However, the newer version (0.73) of courier requires a “DH parameters” block in the certificate file. This can be generated and appended with:

openssl dhparam 1024 >> esmtpd.pem

Now, you can point the “TLS_CERTFILE” in all the configuration files to the certificate esmtpd.pem and restart your server.

Example: data preprocessing with BASH

Case situation

I have run some batch jobs on a cluster to process data files for different systems (msc, ms, sh, rd) and parameters (i and w). The files are in different subdirectories:

[cjj@gust pattern]$ ls d-*/*.spd
d-msc/i275w042526.spd d-ms/i285w025017.spd d-rd/i295w042812.spd
d-msc/i280w040241.spd d-ms/i290w023034.spd d-sh/i275w051138.spd
d-msc/i285w036791.spd d-ms/i295w020787.spd d-sh/i280w047315.spd
d-msc/i290w031925.spd d-rd/i270w065151.spd d-sh/i285w043415.spd
d-msc/i295w026791.spd d-rd/i275w060475.spd d-sh/i290w039589.spd
d-ms/i270w034433.spd d-rd/i280w055777.spd d-sh/i295w035791.spd
d-ms/i275w030644.spd d-rd/i285w051257.spd
d-ms/i280w027133.spd d-rd/i290w046948.spd
[cjj@gust pattern]$

While, the output files are in the current directory:

[cjj@gust pattern]$ ls *.o*
i270w034433.spd.o172489  i275w060475.spd.o172496  i285w036791.spd.o172486
i270w065151.spd.o172495  i280w027133.spd.o172491  i290w023034.spd.o172493
i275w030644.spd.o172490  i280w040241.spd.o172485  i290w031925.spd.o172487
i275w042526.spd.o172484  i285w025017.spd.o172492  i295w026791.spd.o172488
[cjj@gust pattern]$

The format of the output log files are as follows:

[cjj@gust pattern]$ cat i270w034433.spd.o172489
MinTemplateNumber =  3
JT =  5
JN =  1
spikeResolution =  2
Number of initial spike patterns have been found : 562
ans = Creating surrogate data
ans = Creating time jittering surrogate data
ans = Creating neuron jittering surrogate data
Number of spike patterns have been valid by checking with sorrogate : 542
Number of spike patterns have been ruled out because of having less complex : 205
Number of valid spike patterns have been found : 337
[cjj@gust pattern]$

Problem task

Gather the stats in the log files as those marked in red.

Solution 1

This is done with a one-liner:

[cjj@gust pattern]$ for i in d-*/*.spd;do n=${i%/*};n=${n#d-};s=${i#*/};if [ -f ${s}.o* ];then w=${s%.spd};w=${w#*w}; echo ${n} ${s:1:2}.${s:3:1} $((1${w:0:2}-100)).${w:2} `grep ':' ${s}.o* | awk '{print $NF}'`;fi;done > matching_stat.txt

which can be broken down to:

for i in d-*/*.spd;do
n=${i%/*}
n=${n#d-}
s=${i#*/}
if [ -f ${s}.o* ];then
w=${s%.spd}
w=${w#*w}
echo ${n} ${s:1:2}.${s:3:1} $((1${w:0:2}-100)).${w:2} `grep ':' ${s}.o* | awk '{print $NF}'`
fi
done > matching_stat.txt

The data file generated is:

[cjj@gust pattern]$ cat matching_stat.txt 
msc 27.5 4.2526 81 75 22 53
msc 28.0 4.0241 237 217 103 114
msc 28.5 3.6791 393 371 156 215
msc 29.0 3.1925 335 322 132 190
msc 29.5 2.6791 445 437 144 293
ms 27.0 3.4433 562 542 205 337
ms 27.5 3.0644 1037 1006 331 675
ms 28.0 2.7133 1141 1093 341 752
ms 28.5 2.5017 1325 1274 462 812
ms 29.0 2.3034 1652 1609 747 862
rd 27.0 6.5151 1031 953 313 640
rd 27.5 6.0475 1042 963 345 618
[cjj@gust pattern]$

Portable library for C++ GUI programming

It has been for a few occasions that I find my self wanting to port my GUI programs (most of which were written with gtkmm) to other platforms for the enjoyment of my friends. However, while most GUI toolkits claim to be portable to all platforms (Linux, Windows, and Mac), they generally require installation of multiple shared libraries by the users.

This is a major show stopper for most of my friends and is sufficient to kill any interest in them on the first mention. Therefore, the only viable mean is for me to make and send them statically-linked, monolithic executables that they can happily click on to start the shows. (They generally don’t mind waiting a few minutes to download a bloated binary, as long as it remains a single step.)

My recent survey of the GUI library landscape brought my attention to GLUT. While it still requires installation on Windows, I can easily find static versions of FreeGLUT library for MingW that can be used to cross-compile, on Linux, statically-linked executables that can run independently on Windows. Furthermore, the GLUT framework, which, according to this, “is installed with every install of Mac OS X”.

However, GLUT library only provides facilities for managing windows and handling user inputs. It is by no means a GUI toolkit and you will have to draw all user-interactive elements by yourself (in OpenGL). I do find a GUI library, GLUI, that is built on and should be as portable as GLUT. However, after porting a couple programs to GLUI, I failed to find it enjoyable for me to break up the C++-elegant logic of gtkmm and redo my work in a less polished API.

What follows is the birth of gltk, it is an implementation of the gtkmm API on GLUT. I actually started with adding the libsigc++‘s signal-slot API to GLUI since the original callback mechanism only supported single static callback function and I needed more flexibility to port my programs. But, the hack soon proliferated into the entire source tree, and I decided it would be much more enjoyable for me to start something entirely from scratch.

After a somewhat persisting part-time effort that lasted more than a month, I have just made the first release of the library. It’s usable for a simple application that only needs some buttons, checkboxes, sliders, single line text labels or entries to control calculations. For myself, this represent over 80% of the applications that I would have considered porting. I am feeling pretty happy about it and I hope some others will also find it useful.

Project Homepage: http://gltk.ccdw.org/

Permutation

The following spaghetti prints all permutations of a string (“abcd”). I wrote it as an example to show the benefit of structured programming in my class.

#include <iostream>
using namespace std;
int main()
{
        char c[] = "abcd\n";
        int size = sizeof(c) - 2;
        int n[size + 1];
        n[size] = 1;
        int idx = size;
        char t;
new_round:
        for (int i = 0; i < idx; i ++) n[i] = i + 1;
        cout << c;
start_shift:
        idx = 0;
        t = c[0];
shift_next:
        if (n[idx]) goto shift_done;
        idx ++;
        c[0] = c[idx];
        c[idx] = t;
        t = c[0];
        goto shift_next;
shift_done:
        n[idx] --;
        if (idx == size) return 0;
        if (n[idx] == 0) goto start_shift;
        goto new_round;
}

The structured version is as follows:

#include <iostream>
using namespace std;
int main()
{
        char c[] = "abcd\n";
        int size = sizeof(c) - 2;
        int n[size + 1];
        n[size] = 1;
        int idx = size;
        char t;
        do {
                if (n[idx]) {
                        for (int i = 0; i < idx; i ++) n[i] = i + 1;
                        cout << c;
                }
                idx = 0;
                t = c[0];
                while (n[idx] == 0) {
                        idx ++;
                        c[0] = c[idx];
                        c[idx] = t;
                        t = c[0];
                }
                n[idx] --;
        } while (idx < size);
        return 0;
}

The idea is to rotate a string by n times, where n is the length of the string. While, before each rotation, rotate the n-1 substring at the front n-1 times. And while, before each rotation, rotate the n-2 substring at the front n-2 times. And so on… We can see that this is more elegantly done recursively:

#include <iostream>
using namespace std;
char c[] = "abcd\n";
int size = sizeof(c) - 2;
void rotate(int l)
{
        char ch = c[l - 1];
        for (int i = l - 1; i; i --) c[i] = c[i - 1];
        c[0] = ch;
}
void perm(int l)
{
        if (l == 1) cout << c;
        else for (int i = 0; i < l; i ++) {
                perm(l - 1);
                rotate(l);
        }
}
int main()
{
        perm(size);
        return 0;
}

The above thinking counts on the string to have all distinct chars. When this isn’t the case, a different approach is to consider the lexical order of the permutations. From a given permutation, we simply need to figure out the next in the lexical order until the order is “maximized”. It turns out that this is in the C++ STL. My reimplementation is as follows:

#include <iostream>
using namespace std;
void swap(char & a, char & b)
{
        char c = a;
        a = b;
        b = c;
}
// increase the order of string
bool incr(char * str, size_t len)
{
        size_t i = 1;
        while (i < len && str[i - 1] >= str[i]) i ++;
        if (i == len) return false; // no kink
        // found a kink
        size_t j = i - 1;
        while (j > 0 && str[j - 1] < str[i]) j --; // size kink
        swap(str[i], str[j]); // shave kink
        // reverse rest
        for (i --, j = 0; j < i; i --, j ++) swap(str[i], str[j]);
        return true;
}
int main()
{
        char c[] = "aabbc\n";
        int size = sizeof(c) - 2;
        do cout << c; while (incr(c, size));
}

Setup isatap router on debian

The ifupdown package on Debian does not support isatap as a mode for v4tunnel. Therefore, one can not simply create a single entry in /etc/network/interfaces to make it work. Anyhow, following are the steps I took to set it up.

  1. Install iproute and radvd
  2. Add “net.ipv6.conf.all.forwarding=1” to /etc/sysctl.conf
  3. Add /etc/radvd.conf containing:
    interface is0
    {
        AdvSendAdvert on;
        UnicastOnly on;
        AdvHomeAgentFlag off;
        prefix 2002:aaaa:bbbb:1::/64
        {
            AdvOnLink on;
            AdvAutonomous on;
            AdvRouterAddr off;
        };
    };

    (replace “2002:aaaa:bbbb:1” with the prefix of your ipv6 subnet)

  4. Since I have my default address connected to a 6to4 tunnel on my eth0 already, I need to add an additional ip4 address to eth0. In /etc/network/interfaces I add the following post-up, and pre-down scripts to eth0:
    post-up ip tunnel add is0 mode isatap local 192.168.1.12 ttl 64
    post-up ip link set is0 up
    post-up ip addr add 2002:aaaa:bbbb:1::5efe:192.168.1.12/64 dev is0
    post-up ip addr add 192.168.1.12/32 dev eth0
    pre-down ip addr del 192.168.1.12/32 dev eth0
    pre-down ip link set is0 down
    pre-down ip tunnel del is0

    (again replace “2002:aaaa:bbbb:1“)

  5. Restart the computer or do the following:
    $ sysctl -p
    $ ifdown eth0
    $ ifup eth0
    $ invoke-rc.d radvd start

On the client side, I just installed isatapd, added

ISATAP_ROUTERS=”192.168.1.12″

to /etc/default/isatapd, and restarted with “invoke-rc.d isatapd restart“. Then, everything works!

Processing command-line arguments in C++

I just released arg as a standalone library under LGPL. It’s a command-line parser for C++ programs. The aim is to keep the programming effort in adding command-line processing to C++ programs at a minimum. There are several considerations in this direction:

  1. simple and intuitive syntax
  2. requiring no redundancy
  3. localized codes
  4. extensibility
  5. completeness

The simple example as given on the arg homepage,

#include <arg.hh>
#include <iostream>
int main(int argc, char ** argv)
{
        arg::Parser p;
        int n;
        p.add_opt(‘n’).stow(n);
        p.parse(argc, argv);
        std::cout << n << ‘n’;
        return 0;
}

should be very close to the minimum as far as 1. goes.

Programming is for a programmer to describe what he wants the computer to do. Per point 2., he should not be asked to provide the same information multiple times. (Well, maybe except in situations where multiple confirmations are required: “Launch the missile. Please have the missile launched. Yes, I really want you to launch the missile! Launch the *&^%$ missile!!!”; Computer: “Aborted on Error: missile != *&^%$ missile”.)

When working on an item, e.g., adding a new command-line option, the programmer won’t be asked to go to multiple places in the codes if 3. is observed. While common and frequent usages should be supported and simplified in the library, new and novel applications will ask for 4. Finally, some rare, special, and/or tricky applications will demand 5. in the arsenal.

synchronize work spaces on different machines

While working on a set of files on different machines, it’s a common problem to keep things in sync. A real solution is to use a revision control system (with git being my current favorite). However, a quick fix is to use rsync. Following script tries to figure out which copy of the work space is newer and invokes rsync accordingly.

#!/bin/bash
if (( $# != 3 )); then
echo "Usage: `basename $0` <local file> <remote host> <remote file>"
exit 1;
fi
LOCAL_FILE="$1"
REMOTE_HOST="$2"
REMOTE_FILE="$3"

# make sure directories end with '/'
if [ -d "${LOCAL_FILE}" ]; then
LOCAL_FILE=${LOCAL_FILE%/}/
REMOTE_FILE=${REMOTE_FILE%/}/
echo "sync directory: '`basename ${LOCAL_FILE}`' with ${REMOTE_HOST}"
else
echo "sync file: '`basename ${LOCAL_FILE}`' with ${REMOTE_HOST}"
fi

# find out the last modification time in the entire directory
TM_LOCAL=`if [ -e "${LOCAL_FILE}" ]; then find $LOCAL_FILE -printf "%Ts %Pn"|sort|tail -n1; else echo 0; fi`
TM_REMOTE=`ssh ${REMOTE_HOST} "if [ -e "${REMOTE_FILE}" ]; then find $REMOTE_FILE -printf "%Ts %Pn"|sort|tail -n1; else echo 0; fi" < /dev/null`

echo Local Newest: $TM_LOCAL
echo Remote Newest: $TM_REMOTE
if [[ $TM_LOCAL < $TM_REMOTE ]]; then
echo -n "remote => local"
rsync -auz -e ssh --delete ${REMOTE_HOST}:""${REMOTE_FILE}"" "${LOCAL_FILE}"
echo ", Done!"
elif [[ $TM_LOCAL > $TM_REMOTE ]]; then
echo -n "local => remote"
rsync -auz -e ssh --delete "${LOCAL_FILE}" ${REMOTE_HOST}:""${REMOTE_FILE}""
echo ", Done!"
else
echo "Nothing to do!"
fi