Monday, September 17, 2018

New publications

Peer reviewed


"PC proxy: A method for dynamical tracer reconstruction," Environmental Fluid Mechanics, doi:10.1007/s10652-018-9615-7.

"Accelerating kernel classifiers through borders mapping," Journal of Real-Time Image Processing, doi:10.1007/s11554-018-0769-9.

Pre-prints/under review


"Solving for multi-class using orthogonal coding matrices," arXiv: 1801.09055.

"Solving for multi-class: a survey and synthesis," arXiv: 1809.05929.


A full list of my publications can be found on Google Scholar. Note that the peer-reviewed versions of the first two papers differ from their pre-print submissions; the pre-print of the second paper in particular is significantly different from the final, published version. A one-year embargo prevents me from making the published versions public, however.

Tuesday, July 4, 2017

How to write a scientific paper

First you need to come up with a stack of references. It always looks good to reviewers if you've got a reference list as long as your arm. You need a small army of other scientists on whose broad shoulders your authority may rest. It looks bad if each and every idea, no matter how trivial, is not at least hinted at in some other work. Heaven forbid you should come up with something original.

Always pepper your writing with plenty of faux Latin phrases. Terms like "in situ," "et al.," "i.e.," "vis-à-vis," "a priori," "post hoc," etc. It makes you sound educated.

Whenever you come up with a new method, algorithm or computer program, christen it with a cute, catchy or flashy acronym. ARTS (Atmospheric Radiative Transfer Simulator) and ARTIST (I forget what this one stands for, but it's for sea ice) are two nice ones from my own field. It's also good to abbreviate every technical phrase you use more than twice. It makes you look like you've got a firm, insider grasp of topics that no one else understands.

Try to squeeze in as many equations as possible. Concepts that are easy to explain in words can always be made more obtuse by converting them into formulas, er, sorry, formulae. Besides, like a long reference list, it looks impressive. If your reviewers don't have to take at least 3 Tylenol IIIs (six regular) after puzzling unsuccessfully over your article, it just isn't ready for prime time.

Sunday, April 3, 2016

The best derailleur ever made


The Huret Allvit? Really? Now hear me out. I know it has a bad reputation, but I'm not convinced that reputation is deserved. While the execution is not perfect, the design is positively inspired, with a number of elements that, to me at least, scream "QUALITY". The problem with these elements, as I will get into, is that they are not idiot-proof. Consider:

1. The parallelogram moves the top pulley further out as you shift to larger sprockets, a design that predates the patented SunTour slant parallelogram by several years (1958 vs. 1964). Moreover, the motion is nonlinear: the pulley moves faster as the cage swings inwards, mirroring the larger jumps in tooth count between the bigger sprockets.


2. The jockey pulleys have no teeth and fully adjustable ball bearings. Since the pulleys transmit no torque, there is no need for teeth. Both features will reduce drivetrain friction.

3. Every single pivot except for the cable hanger (which is riveted on) has a fully adjustable bearing with a locknut. Compare this with modern derailleurs which are almost always riveted together. Once the pivots wear out, the unit is done: you have to throw it away. With the Allvit, if the shifting becomes less precise because the pivots are loose, you just pull the thing off, disassemble it, clean up each part, and put it back together with the pivots nice and tight. This is almost certainly the source of the bad reputation these derailleurs had: if just one of these pivots is out of adjustment then it won't work well, if at all.

It was not all good, of course. The cable clamp was particularly poor and, in conjunction with the strong parallelogram return spring and the thin-gauge shifter cables of the time, produced a lot of broken cables.

Towards the end of its life, the Allvit was being used on a lot of cheap, department store "ten-speed" and utility bikes. In order for it to work well, it needs to be maintained, something that was never going to happen on those cheaper bikes. The Japanese derailleurs, by contrast, are more idiot-proof. They always work, no matter who attached them to the bike or how badly. They also tend to be more aesthetically pleasing. Note how on this vintage Shimano 600 derailleur the adjustment screws are placed neatly beside each other alongside the top parallelogram plate.
Contrast the more industrial look of the Allvit with one adjustment screw on the fixed top plate and the other down below jutting out of the bottom parallelogram arm.

Sunday, July 26, 2015

Move to github

I guess I've succumbed to pressure and moved over to GitHub. SourceForge was down for over a week and there has been a lot of bad press about it lately. I've rolled all of my software projects of any importance into a single scientific mega-library which you can find here:

libmsci

Not sure if this is a good approach. Probably not. I was just getting sick of dependency issues, especially around release time. None of the libraries is all that big, so I thought, why not just roll them into one?

Also, be sure to check out the new homepage:

Peteysoft.github.io

Monday, June 1, 2015

Test driven development

Although I don't consider myself a computer programmer, no matter how much code I churn out, I feel I ought to keep up with some of the latest trends in software development.  Don't get me wrong: I suspect that many of these trends are a lot of hot air and that they go in and out of fashion.  Certainly object-oriented programming is over-rated.

One idea that's caught my attention is so-called test-driven development.  I know that I don't write enough tests for my computer programs.

Unlike some other current fashions, such as functional programming, this is something I can put into practice right away.  Again, I have my doubts about it, so I will describe my initial attempts here.  I guess this is a somewhat boring topic, delving once again into the minutiae of scientific programming; hopefully I will have the wherewithal to write about more far-reaching and interesting topics at some point.

To start with, I've picked an object class that's easy to test.  I have a family of classes that select out the k-least elements in a list.  Because some methods, such as a tree or a heap, don't require storing the whole list, the object works by adding items one at a time and then extracting the k-least.  The base class looks like this:

template <class type>
class kiselect_base {
  protected:
    long ncur;
    long k;
  public:
    kiselect_base();
    kiselect_base(long k1);
    virtual ~kiselect_base();
    //add an element, set associated index to element count; 
    //return current size of data structure:
    virtual long add(type val)=0;
    //add an element and set the index, returns current size of data structure:
    virtual long add(type val, long ind)=0;
    //marshal out the k-least and associated indices:
    virtual void get(type * kleast, long *ind)=0;
    //test the selection algo:
    int test(long n); //n: number of elements to test with
};

I wrote a family of these things because I wanted to test which version was fastest.  It turns out that it makes very little difference: even a naive method based on insertion or selection, with supposed O(nk) performance, is almost as good as a more sophisticated method based on quick-sort, with supposed O(n) performance.  In addition to the k-least elements of the list, this version returns a set of numbers that index into the list, in case there is auxiliary data that also needs to be selected.  This also makes it slightly easier to test.
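
To make the interface concrete, here is a minimal sketch of what the naive insertion-based version might look like.  The class name kiselect_insert and its internals are illustrative only, not the actual code from libmsci, and it assumes the base constructor initializes ncur to zero:

template <class type>
class kiselect_insert:public kiselect_base<type> {
  protected:
    type *least;          //the k smallest values seen so far, sorted ascending
    long *index;          //index into the original list for each stored value
  public:
    kiselect_insert(long k1):kiselect_base<type>(k1) {
      least=new type[k1];
      index=new long[k1];
    }
    virtual ~kiselect_insert() {
      delete [] least;
      delete [] index;
    }
    virtual long add(type val) {
      return add(val, this->ncur);
    }
    virtual long add(type val, long ind) {
      long n=this->ncur<this->k ? this->ncur : this->k;   //current fill level
      if (n==this->k && val>=least[n-1]) {
        this->ncur++;
        return n;         //too large to make the cut; discard
      }
      long i=(n==this->k) ? n-1 : n;
      //shift larger elements up to make room (the largest falls off the end):
      for (; i>0 && least[i-1]>val; i--) {
        least[i]=least[i-1];
        index[i]=index[i-1];
      }
      least[i]=val;
      index[i]=ind;
      this->ncur++;
      return this->ncur<this->k ? this->ncur : this->k;
    }
    virtual void get(type *kleast, long *ind) {
      long n=this->ncur<this->k ? this->ncur : this->k;
      for (long i=0; i<n; i++) {
        kleast[i]=least[i];
        ind[i]=index[i];
      }
    }
};

Each call to add costs O(k) in the worst case, which is where the O(nk) figure comes from.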

As you can see from the base class above, I've already added the test method.  Sticking it in the base class means that all of the children can be tested without any new code.  This is precisely in keeping with my thoughts about test routines: they should be as general as possible.  Forget having a small selection of specific test cases: that is a recipe for disaster.  It's trivial to write code that passes a handful of fixed cases while still being nowhere close to correct.

Rather, we should be able to generate as many random test cases as we need.  Five test cases not enough?  How about 100?  How about one million?  Here is my first crack at the problem:

//trying to move towards more of a test-driven development:
template <class type>
int kiselect_base<type>::test(long n) {
  int err=0;
  type list[n];
  type kleast[k];
  long ind[k];
  long lind; //index of largest of k-least
  int flag;
   
  //generate a list of random numbers and apply k-least algo to it: 
  for (long i=0; i<n; i++) {
    list[i]=ranu();       //ranu() returns a uniform random deviate
    add(list[i]);
  }
  get(kleast, ind);

  //find the largest of the k-least:
  lind=0;
  for (long i=1; i<k; i++) {
    if (kleast[i]>kleast[lind]) lind=i;
  }
  //largest of k-least must be smaller than all others in the list:
  for (long i=0; i<n; i++) {
    //scan the index list to exclude the selected elements
    //(this inner scan is what makes the test O(kn)):
    flag=1;
    for (long j=0; j<k; j++) {
      if (ind[j]==i) {
        flag=0;
        break;
      }
    }
    if (flag && list[i]<kleast[lind]) {
      err=-1;
      break;
    }
  }
  return err;
}

Notice how we first generate a random list of any desired size, so that we have effectively an infinite number of test cases.  But wait: this is kind of inefficient, since it runs in O(kn) time.  Here I'm torn: on the one hand, the test algorithm should be as simple as possible so that it is easy to verify; on the other, there seems to be no excuse for the verification to take longer than the algorithm being tested!  In this case, it should take at most O(n) time, as this version demonstrates:

//trying to move towards more of a test-driven development:
template <class type>
int kiselect_base<type>::test(long n) {
  int err=0;
  type list[n];
  type kleast[k];
  long ind[k];
  long lind; //index of largest of k-least
  int flag[n];
   
  //generate a list of random numbers and apply k-least algo to it: 
  for (long i=0; i<n; i++) {
    list[i]=ranu();       //ranu() returns a uniform random deviate
    add(list[i]);
  }
  get(kleast, ind);

  //find the largest of the k-least:
  lind=0;
  for (long i=1; i<k; i++) {
    if (kleast[i]>kleast[lind]) lind=i;
  }

  //set flags to exclude all k-least from the comparison:
  for (long i=0; i<n; i++) flag[i]=1;
  for (long i=0; i<k; i++) flag[ind[i]]=0;
    
  //largest of k-least must be smaller than all others in the list:
  for (long i=0; i<n; i++) {
    if (flag[i] && list[i]<kleast[lind]) {
      err=-1;
      break;
    }
  }
  return err;
}
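
Either version can then be driven with as many random trials as we like.  Here is a minimal sketch of such a driver, again assuming a concrete subclass like the hypothetical kiselect_insert above:

#include <stdio.h>

int main(int argc, char **argv) {
  long ntrial=1000;       //number of random test cases
  long n=10000;           //size of each random list
  long k=25;              //number of least elements to select
  for (long i=0; i<ntrial; i++) {
    kiselect_insert<float> sel(k);
    if (sel.test(n)!=0) {
      fprintf(stderr, "trial %ld failed\n", i);
      return 1;
    }
  }
  printf("all %ld trials passed\n", ntrial);
  return 0;
}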

The problem we run into with this second version is that the test code is becoming quite complicated: almost as complicated as the algorithm under test!  In some cases, such as parsing, it may need to be just as complex.  How do we test the test code?

Well, obviously there's a bootstrapping problem here!  At some point we need human discretion and judgement.  My preferred kind of test engine, and I have a small number of these lying around, is one that lets you manually input any desired test case and then displays the result.

Probably the best example of this approach is the date calculator I wrote to test a time class.  The time class (as well as the calculator that wraps it) allows you to make arithmetic calculations with dates and times and print them out in a pretty format.  Here is an example session that calculates the number of days between today and Christmas:

$ date_calc.exe
%d%c>(2015/12/25_2015/06/02)|1-0:0:0
206
%d%c>

Note that the minus sign (-) and the forward slash (/) are already used in the date format, so an underscore (_) and a vertical bar (|) stand in for subtraction and division respectively.
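
As a quick sanity check on the calculator itself, the same figure can be reproduced with nothing but the standard C library.  This is independent of my time class; noon is used and the result rounded to sidestep daylight-saving offsets:

#include <stdio.h>
#include <time.h>

int main() {
  struct tm t1={0}, t2={0};
  t1.tm_year=2015-1900; t1.tm_mon=5;  t1.tm_mday=2;  t1.tm_hour=12;  //2015/06/02
  t2.tm_year=2015-1900; t2.tm_mon=11; t2.tm_mday=25; t2.tm_hour=12;  //2015/12/25
  t1.tm_isdst=-1; t2.tm_isdst=-1;     //let mktime figure out daylight saving
  long days=(long)(difftime(mktime(&t2), mktime(&t1))/86400+0.5);
  printf("%ld\n", days);              //prints 206
  return 0;
}

Another example is a test wrapper for the following option parsing routine: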

//returns number of options found
//if there is a non-fatal error, returns -(# of found options)

int parse_command_opts(int argc,   // number of command line args
              char **argv,         // arguments passed to command line
              const char *code,    // code for each option
              const char *format,  // format code for each option
              void **parm,         // returned parameters
              int *flag,           // found flags
              int opts=0);         // option flags (optional)

This subroutine is a lot more code-efficient than getopt.  There is a brief set-up phase in which you set each element of the void parameter list to point to the variable in which you want to store the option parameter.  Options without arguments can be left null and use a null parameter code, "%". As a simple example, suppose you want to return the parameter of the -d option to the integer variable d:

int main(int argc, char **argv) {
  int d;
  void *parm[1];
  int flag[1];
  int err;

  parm[0]=&d;
  err=parse_command_opts(argc, argv, "d", "%d", parm, flag, 1);
  ...

The test program scans for options covering all possible format codes, using option letters that are (usually) the same as the codes, and prints the parameters to standard out.  We can do it with whitespace (-b option):

$ ./test_parse_opts.exe -b -g 0.2 -i 20 -c t -s teststring
./test_parse_opts -g 0.2 -i 20 -c t -s teststring -b
number=0.2
integer=20
string=teststring
char=t
Arguments: ./test_parse_opts

or without:

$ ./test_parse_opts.exe -g0.2 -i20 -ct -steststring
./test_parse_opts -g0.2 -i20 -ct -steststring
number=0.2
integer=20
string=teststring
char=t
Arguments: ./test_parse_opts
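
For reference, the guts of a wrapper along these lines might look roughly as follows.  This is a hypothetical sketch rather than the actual test_parse_opts source; in particular it assumes that the "%s" code copies the argument into a caller-supplied buffer:

#include <stdio.h>

int main(int argc, char **argv) {
  float g;                              //-g takes a float
  int i;                                //-i takes an integer
  char c;                               //-c takes a single character
  char s[100];                          //-s takes a string
  void *parm[5]={&g, &i, &c, s, NULL};  //-b takes no argument ("%" code)
  int flag[5];

  parse_command_opts(argc, argv, "gicsb", "%g%d%c%s%", parm, flag, 1);

  if (flag[0]) printf("number=%g\n", g);
  if (flag[1]) printf("integer=%d\n", i);
  if (flag[2]) printf("char=%c\n", c);
  if (flag[3]) printf("string=%s\n", s);
  return 0;
}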

Of course, this is one reason why interactive interpreters are so great for rapid development.  You don't have to write all this (sometimes very complex) wrapper code to test functions and classes.  Just type out your test cases on the command line.


*UPDATE: I realize that the test_parse_opts wrapper is a bad example since it's quite limited in the number of test cases you can generate.  Therefore I've expanded it to accept an arbitrary list of option letters with corresponding format codes to pass to the function:

$ ./test_parse_opts -b -p adc -f %d%g%s -a 25 -d 0.2 -c hello
./test_parse_opts -a 25 -d 0.2 -c hello -b -p adc -f %d%g%s
-a (integer)=25
-d (float)=0.2
-c (string)=hello
Arguments: ./test_parse_opts

Sunday, May 3, 2015

Is agnosticism a reasonable belief?

There is an argument against God that I've lately been seeing repeatedly.  It goes something like this: "just because you can't prove there's a tiny teapot orbiting Venus doesn't mean it's not there," or "just because you can't prove that the Easter Bunny is real doesn't mean that he isn't."  The implication is that both scenarios are so unlikely that we might as well assume they are false, and that such arguments are therefore logical fallacies.

A God or gods, unfortunately, are not the equivalent of the Easter Bunny or a tiny teapot orbiting Venus or a dragon flying around Saturn's rings or whatever other absurd object you might be able to dream up.

Consider the meteoric advance in computer simulation technology: there's little doubt that virtual reality is just around the corner.  A related practice is the design of artificial life simulations.  It is possible that we are all a simulation inside of a giant computer.  Would it not be reasonable to call the creator(s) and/or master(s) of this simulation a god or gods?

The principle of sufficient reason seems to dominate our perception of reality.  It is reasonable to suggest that there might be a first cause or prime mover.  Might this not be God?

The very nature of God is that He is a "higher power" and therefore quite beyond our ken.  It is arrogant and naive to assume that we are the highest level of understanding that this universe has attained.

Saturday, March 28, 2015

For the love of solitude


Last week I spent some time in a small cabin ("chalet") in the woods. At the time it was completely deserted: the silence was delicious. Finally I could breathe. Finally my thoughts were my own. I am always struck in these moments by, first, how essential they are to someone of my temperament, and second, how they clear the mind so that the real thinking may finally begin.

Such solitude is increasingly hard to find. I point out in another article how the world is becoming a panopticon. Soon there will be satellites with sufficient resolution and coverage that they can observe us in real time. Not even walls will be enough to conceal us: the combination of penetrating microwaves, tomography and synthetic apertures will soon allow us (or rather our nosy leaders) to map the insides of buildings in 3-D from space.

I think I finally understand what's going on here: why I am driven to find more and more extreme isolation, so much so that loneliness and anxiety frequently overcome any peace of mind that might be gained from the endeavour.

When you observe something, it changes.  Many interpretations of quantum mechanics have been attempted, many of them rather flaky: something is not there when you are not looking at it; there is a mystic union between the observed and the observer such that they cannot be distinguished.  Let's not even get into all the different "many-worlds" hypotheses.

No, it is much simpler than that, and from the mathematics, undeniable.  Chances are you can walk into any toy store today and buy a simple device consisting of an array of needles free to slide within a series of holes.  Using this device, you can take a temporary cast of your face (or any other object for that matter) by simply pressing into it.  Now, at the same time as your face is pushing these needles outwards, the needles are creating pock-mark depressions on your skin.  Granted, because skin is elastic, it will almost certainly spring back to its previous form, although you might feel it for some time afterwards.

Rays of light are just like those needles.  When you take leave to walk in the woods or cycle in the mountains, you are reclaiming your thoughts.  You are becoming fully yourself again, because as long as others watch you, your thoughts are not your own.


It might well be my epitaph.  Such solitude is no longer supportable.  I guess I consider myself a dying breed.