Tuesday, April 15, 2014

How to build a super-efficient motor-assisted bicycle

Moral philosophy is a weighty topic.  Sexual politics even more so.  So here is some lighter fare about how to build a better motor-assisted bicycle.  I have already written an article about how current products are rubbish and how they are hobbled by brain-dead regulations.

An under-powered vehicle needs a lot of gears to help it adapt to different terrain and driving conditions.  Modern bicycles with derailleur gears typically have a minimum of 16 gears, while hub-gear bicycles will have seven or more.  Transport trucks have 12 or more gears, and vehicles designed for off-road use will usually have a special gear reducer that effectively doubles the number of gears.  Contrast this with personal automobiles, which, in the past, could make do with as few as three and even now rarely have more than six.

A lot of motor conversion kits for bicycles have a very crude transmission set-up.  One older system involved simply dropping a rubber wheel onto the front tire, but newer chain drives often aren't much better: there is only one gear and the pedals are only useful to kick-start the engine.  On the one hand, transmission requirements for human versus gasoline power are very different because the former is high-torque, low RPM (revolutions per minute) and can produce generous torque at rest, while the latter is low-torque, high RPM and cannot produce any torque at rest.  On the other hand, to save weight, the transmission ought to be shared between the two.

This is the solution I'm currently working on.  First, the engine.  I found this 2 hp gasoline engine, complete with magneto ignition:
The engine can drive the rear wheel through a standard bicycle derailleur gear by connecting to this Shimano front-freewheeling chainwheel from the 1980s:

Through a ratchet mechanism, the chainwheel is allowed to spin in a clockwise direction even when your feet aren't moving.  What's missing is how to connect the engine to the chainwheel, but it seems obvious from the photos: just chain it directly from the small sprocket mounted on the end of the engine crankshaft to the larger of the two chainrings.

There are at least three problems with this solution.  The first is that the engine achieves its horsepower peak at around 10000 rpm.  In top gear a human will have to pedal at roughly 100 rpm to travel at a speed of 50 km/h.  This implies a gear reduction of roughly 100:1!  It's not quite that bad, since you don't have to be in top gear when the engine is pushing the bike at top speed.  There are multiple chainrings: potentially you could use just one very small one when the engine is engaged, as long as there is still a reasonable spread.

With this in mind, notice that the chainwheel has been drilled out twice: first with a large spacing to accept the monster 60-tooth chainring shown at left, and again with a smaller spacing to accept a standard 24-30 tooth "granny" ring shown at right.
The cog on the engine has 14 teeth, so even using the largest rear sprocket (roughly a 1:1 ratio), you get a final drive ratio of 60:14 = 4.29.  At maximum rpm the rear wheel will still be turning at over 2300 rpm, which translates to a maximum speed of 297 km/h!
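
To make the arithmetic concrete, here is a minimal worked calculation.  The wheel circumference of roughly 2.1 m (a typical 700c road wheel) and the 1:1 chainring-to-rear-sprocket assumption are mine, for illustration only:

#include <cstdio>

int main() {
  double engine_rpm = 10000;   // engine speed at its horsepower peak
  double engine_cog = 14;      // teeth on the crankshaft sprocket
  double chainring  = 60;      // teeth on the big drilled-out ring
  double wheel_circ = 2.1;     // metres; assumed 700c wheel circumference

  double reduction = chainring/engine_cog;            // 60:14 = 4.29
  double wheel_rpm = engine_rpm/reduction;            // ~2330 rpm at a 1:1 rear ratio
  double speed_kmh = wheel_rpm*wheel_circ*60/1000;    // ~294 km/h

  printf("reduction %.2f:1, wheel %.0f rpm, speed %.0f km/h\n",
         reduction, wheel_rpm, speed_kmh);
  return 0;
}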

The second problem is that the engine actually spins counter-clockwise, which hints at a solution to the first problem: build the speed reducer and the reverse gear into one unit.  Two gears with a tooth ratio of about 20:1 should do the trick (a compound gear might be better for such a large ratio): 17:1 in combination with the giant 60-tooth ring, or 14:1 with the little granny.  If both are employed, the ratio could be as low as 10:1.

Such large ratios suggest the use of worm gears, but a worm drive typically cannot be back-driven, which makes it impossible to kick- or jump-start the engine with foot power or momentum, and a starter motor adds unnecessary weight.  And third, we still haven't addressed the issue of engaging and disengaging the engine.  In the setup described above, the engine is always engaged, making it impossible to use the pedals except for starting the engine or possibly assisting it.  This would only be acceptable for the earliest prototypes.

As a more mature, practical solution, I envision the motor being mounted beside the seat-tube with the crankshaft running perpendicular to the bottom-bracket (pedal-crank) axle rather than parallel.  Two bevel gears connect the two: a small one at the end of the crankshaft and a very large one on the outside of the chainwheel.  The engine is engaged and disengaged with a lever either by making the engine crankshaft extensible or through a pair of movable idler gears.

If this project interests you, you can make a donation to the Peteysoft Foundation here.

New software release

I have just released a new version of the libagf machine-learning library.  This is probably the biggest revision since I started the project some seven years ago.  The main addition is a generalization (called "multi-borders") of the "AGF borders" algorithm from binary to multi-class classification.  Rather than pick a single method (such as one-against-one or one-against-the-rest) and implement that, which would've been quite trivial, I chose instead to generalize the whole business.  For more details, please refer to the paper I submitted to the Journal of Machine Learning Research.

I think this is one of the first times I've created software without having some specific application in mind.  I wrote it mainly for the challenge, and because the library doesn't feel quite complete without it.  That's not to say it won't be useful for many prediction tasks.

Here are a few thoughts on the creation of the new software.

On preparation

In preparing the "multi-borders" software, one thing that struck me was just how quickly I got it up and running.  Perhaps this is just experience, but I think planning and preparation has a lot to do with it.  I had a very good idea what I wanted to achieve and while I don't usually write much down, I have everything planned out in my head.

There are two components to the software: one for training and one for classification.  At the training stage, all the software does is read the control file and output commands for the already-existing binary classification software and then output another control file for the classification stage.  I got the first stage working long before the second, largely because testing is so simple: most of the input can be made up on the spot (i.e. the arguments don't have to refer to real files) and you just check the output for correctness.  In both cases I spent maybe two to three weeks coding and got things up and running after three or four tries.

I object


I'm still not sure how I feel about the object-oriented programming paradigm.  Usually I think it's just over-rated: anything that can be done using classes and objects can be done at least as well, and with more flexibility, using the features that made C so distinct from other languages of its time: pointers (especially function pointers and void pointers), unions and structures.  The problem is that I'm still stuck in a very rigid mindset that says there's some kind of right way and wrong way of doing things, so I haven't yet learned how to properly program in straight C.

Take the dreaded "goto"--everyone says it's bad, but it's less in simply using the goto as using it badly.  In the Elements of Programming Style, Kernighan and Plauger spend many pages showing how to clean up code with gotos.  Gotos definitely have their place, such as breaking out of a series of nested loops or cleaning up after different types errors without a lot of repeated code.  In a couple of programs I've written for the new software release (and elsewhere), I've demonstrated to my satisfaction how to write code that's just as hard to follow, using only structured programming constructs.  Void pointers provide an even better mechanism for shooting yourself in the foot, but also unprecedented power in an otherwise strongly typed, compiled language.

As I mentioned above, there were two components to the software.  The first part was written "the old-fashioned way" using standard, procedural programming.  The second part was written with objects and polymorphism.  Of the two, the latter was definitely the easier to get up and running.  Perhaps this is the main benefit of objects: they don't make the language any more powerful, they just make the programs easier to think about and debug.

It can also produce code that looks very elegant, at least when viewed from a certain angle.  The multi-class classifications are based on a hierarchical model: you divide the classes in two, divide them again, and so on until there is only one class left.  The class structure of course follows this logic as well.  The elegant part is that the same method is called all the way up the line with no break in the symmetry.  But in order to achieve this, it includes a very brain-dead construct: a one-class classifier (that is, a "one-class-classifier-class"), a fairly large piece of code that doesn't do anything except return the class label of the top-level partition.
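
A rough sketch of the shape of it (the class and method names here are invented for illustration and are not the actual libagf interface): the same virtual method is called at every level, and the leaf class exists only to return a label.

// Illustrative only; not the real libagf class hierarchy.
class classifier {
  public:
    virtual ~classifier() {}
    virtual int classify(double *x)=0;
};

// The "one-class classifier": a leaf that does nothing but return its label.
class one_class: public classifier {
    int label;
  public:
    one_class(int l): label(l) {}
    int classify(double *) { return label; }
};

// An internal node: a binary decision selects which branch to descend.
class partition: public classifier {
    classifier *left, *right;
  public:
    partition(classifier *l, classifier *r): left(l), right(r) {}
    int classify(double *x) {
      double d=decision(x);                  // difference in conditional prob.
      return d<0 ? left->classify(x) : right->classify(x);
    }
  private:
    double decision(double *x) {
      return x[0];                           // stand-in for a trained binary classifier
    }
};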

Finally, here's how we might be able to do better than the O-O framework using the features of C.  Here is the prototype for the re-factored subroutine for sampling the class borders from a set of training data in a binary classification problem:

template <class real>
nel_ta sample_class_borders(
            real (*rfunc) (real *, void *, real *),
                      //returns difference in conditional prob. plus derivs
            int (*sample) (void *, real *, real *),
                      //returns a random sample from each class
            void *param,                    //these are just along for the ride
            nel_ta n,                       //number of times to sample
            dim_ta D,                       //number of dimensions
            real tol,                       //desired tolerance
            iter_ta maxit,                  //maximum number of iterations
            real **border,                  //returned border samples
            real **gradient,                //returned border gradients
            real rthresh=0);                //location of Bayesian border


Yes, I do like to freely combine elements from different programming paradigms: it's what makes programming in C++ so much fun!  I could've passed an object with two methods to this function, but chose instead to pass two function pointers and a void pointer.  The normal method of calculating conditional probabilities requires an array of vectors (coordinate data) plus an array of class labels.  You could encapsulate both in an object, but this seems a bit limiting: there are many other things you could potentially do with them.  For instance, you can pair the coordinate data with a different set of class labels, generated, say, from a set of continuous ordinates.  By binding them to the parameter list only temporarily, the program becomes more flexible.
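
For example, a caller might bundle the coordinate data and class labels into an ad-hoc structure and pass it through the void pointer; the structure and function below are invented for illustration and are not part of the library:

// Illustrative only: an ad-hoc bundle of training data passed via void *.
struct binary_data {
  double **x;     // coordinate data
  int *cls;       // class labels (0 or 1)
  int n;          // number of samples
  int D;          // number of dimensions
};

// Matches the rfunc signature above: returns the difference in conditional
// probabilities at point v and writes the gradient into grad.
double cond_prob_diff(double *v, void *param, double *grad) {
  binary_data *d=(binary_data *) param;
  double result=0;
  for (int i=0; i<d->D; i++) grad[i]=0;   // placeholder: real code would estimate
                                          // P(2|v)-P(1|v) and its gradient
                                          // from d->x and d->cls
  return result;
}

The same coordinate data can then be paired with a different set of labels simply by pointing a second such structure at it.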

This is one thing that has always annoyed me about O-O programming: the idea that all the data has to be hidden, so that you end up with a whole bunch of pointless methods that do nothing but assign to or return a field.  Sure, in some languages (such as C++) you can scrap the data hiding, but then is it really object-oriented?  Instead you end up with the older, C-style programming paradigm of structures, pointers and unions.

Finally, suggesting that the object being passed to this method is simply a binary classifier obscures the true generality of the method.  It is, in fact, a multi-dimensional root-finder.  rfunc could be any continuous, differentiable function that evaluates to both positive and negative values, not just a difference in conditional probabilities (i.e., a binary classifier).  It makes more sense (at least to me) to represent it as a function to which you pass a set of parameters, rather than as an object class.  Again, this to me is the beauty of the C++ language: you can choose, amongst a multitude of programming paradigms, the one most appropriate to the problem at hand.

Meta-programming

Another major addition to the program was a hierarchical clustering algorithm.  I wrote this to help me understand the travelling salesman problem, as I've mentioned elsewhere.  The program builds a dendrogram and then, using single-letter commands, allows you to interactively browse through it, assigning classes along the way.  It didn't take me long to figure out that you can output commands as they're being input and then feed them into the program again.  And from there, it's not a very big leap to see how you could write a program that, instead of using the C++ objects directly to manipulate the dendrogram, uses the one-letter commands from the interactive utility.  This, although the language for the dendrogram browser is very simple, is nonetheless a type of meta-programming.
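
A sketch of the mechanism (the command letters here are made up; they are not the browser's actual command set): each command is executed and simultaneously echoed to a log, so the log can later be fed straight back in as a script.

#include <cstdio>

// Toy interactive loop: single-letter commands are executed and echoed
// to a log file, so that the session itself becomes a replayable script.
void browse(FILE *in, FILE *log) {
  int c;
  while ((c=fgetc(in)) != EOF) {
    switch (c) {
      case 'u': /* move up to the parent node */    break;
      case 'l': /* descend to the left child */     break;
      case 'r': /* descend to the right child */    break;
      case 'a': /* assign a class to this branch */ break;
      case 'q': fputc(c, log); return;
      default: continue;      // ignore anything unrecognized (e.g. whitespace)
    }
    fputc(c, log);            // echo the command for later replay
  }
}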

It's also a technique I've used very little in the past.  Perhaps in part because it's very easy to abuse.  I remember once being in charge of maintaining a large, motley and rather finicky piece of software.  In one part, the operations to be performed were first written out to a file in a crude sort of language.  Then, this file was read in again and the operations were performed -- achieving absolutely nothing in the process.  I removed both pieces of code and rewrote them so that the operations were simply performed--with nothing in between!

Since learning lex/yacc and starting work on some of my own domain-specific languages (or DSLs, which are themselves a very powerful concept), I'm starting to use this technique a bit more often.  The training part of the multi-borders program, for instance, does very little except parse a control file.  In the process, it prints out commands for training the binary classifiers that comprise the multi-class model.  I could've run these directly using a system call; however, because parsing the control file takes so little time, I chose to compile them into a script instead.
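
Schematically, it looks something like this; "train_binary" is a placeholder name, not the actual command used by the library:

#include <cstdio>

// For each binary partition parsed out of the control file, emit one
// training command into a shell script instead of running it with system().
void emit_training_script(FILE *script, const char **models, int n) {
  fprintf(script, "#!/bin/sh\n");
  for (int i=0; i<n; i++) {
    fprintf(script, "train_binary train.dat %s.mod\n", models[i]);
  }
}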

Tuesday, April 1, 2014

The other day I came across a chest

in the woods, buried in snow.  I opened it up and sure enough there was some treasure inside: some coins from the Dominican Republic.  But I didn't take any.  I know, we've all read Treasure Island, we all dream of finding that massive windfall that will set us up for life.  But what if, when we stumble across a hidden chest, instead of taking what's inside, we add a little bit, because, after all, what's a chest without treasure?

Friday, March 21, 2014

Re-post

I notice two posts on this blog, "Women in science," and "All men are rapists," have been getting a lot of traffic.  Certainly they make nice "click-bait," but this blog was not meant to be political.  Of course it is the arrogance of scientists and philosophers to claim that they deal in "objective" or "eternal" or "universal" (or whichever other superlative you care to use) truths, rather than pieces of the moment.

As other posts make clear, I am interested in moral philosophy, and the morality of equality is one of the most important developments in Western thought, one whose ramifications have yet to be fully worked out.  The idea, however, is taking considerable abuse from modern feminists.

In any case, here is an older post, that I think much more clearly reflects the "spirit" of this blog.

On the negation of modal verbs


I recently learned (it goes to show how diligently I've been practising my German) that Germans use the phrase "must not" in the opposite sense from English speakers, to mean "need not."  Thinking about this, I realized that there is an implicit "or" in any statement involving modal verbs, and the sense of the negative depends upon which part of the logical proposition is negated.

For example, I would translate the phrase:

"You must do A"

into a logical proposition as follows:

^A -> P

where P is some form of punishment. Or:

A or P

We could negate the phrase either by negating the whole thing:

^A and ^P

Or we could negate only one part:

A or ^P = P -> A
(e.g., "You (must not) play in the street.")

^A or P = A -> P
(e.g. "You must (not play in the street.)")

The first example would seem to be how the Germans use the phrase since whether we do A or not, we will not get punished for it, while the third form is more in line with how we use the phrase, that is, A implies punishment. The second example seems rather more ambiguous and in fact inverts the construct: now, getting punished implies that we have done A.
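
To check the three readings mechanically, here is a small truth-table enumeration (purely illustrative):

#include <cstdio>

// Enumerate all truth assignments of A ("do A") and P ("get punished")
// for the three negations discussed above.
int main() {
  printf(" A  P | !A&&!P  A||!P  !A||P\n");
  for (int A=0; A<=1; A++) {
    for (int P=0; P<=1; P++) {
      printf(" %d  %d |   %d      %d      %d\n",
             A, P, !A && !P, A || !P, !A || P);
    }
  }
  return 0;
}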

This has rather deep implications, since all of ethics, law and morality are related to the use of modal verbs.  Can we use this idea to justify breaking the Ten Commandments?

I translate,

"I shall go to the store,"

as:

F (uture) -> S (going to the store) = ^F or S

The sixth commandment becomes:

You shall (not kill.) = F -> ^K = ^F or ^K = ^(F and K)

You (shall not) kill. = ^F -> K = F or K

or perhaps,

^(F -> K) = F and ^K

The second says that if there's a future, there may or may not be killing, while the first and third say what we want them to say: if F is true, K must be false.

Thursday, March 13, 2014

My dream

Recently I asked a friend where she and her family go to walk in the woods, not realizing that not everyone considers it as necessary as breathing (the jury is not out--it is).  During Bible study one night, we discussed the questions "How did you use to find meaning in life?" and "How do you find it now?", with the implication that before finding religion, it would be normal to find meaning in drugs, sex, partying, etc.  Though I said nothing (the meeting was in Germany and my German is rather weak), I thought to myself that I was always closest to myself, that I found the greatest joy and meaning in life while experiencing the outdoors, whether walking, cycling, skiing or snow-shoeing, and I still do to this day.

The green-spaces in my vicinity are drying up.  Luckily there is one small patch of wood nearby and the owner has even generously cut trails through it.  I keep meaning to go meet him and thank him as I suspect this tiny wood may have saved my life.

Now I don't mean to be ungrateful, but I also believe the configuration of this space, which is right beside my house, is somewhat inauspicious.  It's not an easy concept to explain, but lately I've become interested in Feng Shui, not as some mystical art, but as basic common sense.

For instance: where do you camp?  If you pitch your tent along a sometime game-trail in a corridor between the trees on a windy night, as my sister and I once did, you will not sleep well.  In Feng Shui, we think of "energy" or "chi" moving out from the camp-site along the trail, but I think a better word for it would be "spirit."  Energy already has a precise definition in physics.  Spirit, by its very nature, is ethereal--difficult to measure and without physical form.  Surely the low-level stress of the wind threatening to blow away the tent and the possibility of feral animals wandering into the camp-site would serve to reduce our spirit?

Quite the opposite of a camp-site, parks ought to "breathe," the spirit flowing uninterrupted through trails and other connections, but with sheltered pockets where it can "pool."

When I lived in Washington, D.C., there were at least two routes out of the city that weren't open to motor vehicles.  There was the W & OD trail, built on an old rail line, which ran west along the Orange Line through Falls Church out to Leesburg, ending up just past the first hump of the Blue Ridge mountains and just shy of the Appalachian Trail (AT).  Then there was the C & O Canal towpath, which ran along the Potomac, joining the AT for a short distance.

I used to take the commuter train or cycle along the W & OD to escape the tension of the city and hike along the AT.  These long-distance trails were linked to a network of smaller trails: through Rock Creek Park, along the rivers, yes, even through my own ghettoized neighbourhood in Temple Hill near Anacostia.

I couldn't help but think of these trails as a lymphatic system in the body of the capital region, transporting the bad spirit from inner-city crime and drug gangs (not to mention the dealings of the White House) out into the surrounding countryside, where it could be purified.

Back to the wood behind my house: it is bordered on three sides by houses.  On the fourth side you used to be able to ski to a golf course, but now the owner of the next property over (not the immediately adjacent one) has erected a fence which blocks you.

The last time I skied in Gatineau Park I took lunch in one of the "chalets": wood-heated cedar cabins with picnic benches inside.  I'm certain that these were copied from the Scandinavian countries, where skiing is a way of life.  I'm also certain that they got something wrong in the translation.  I imagined these being not lunch stops for spandex-clad racers bombing along meticulously groomed snow-highways that form useless circuits, but rather way-points for travellers skiing a single-track trail through the middle of the woods.  A trail, that is, that actually goes from one place to another.

I'm trying to imagine a world where I can put on a pair of skis starting at my house and ski for the rest of the day without crossing or retreading the same stretch of trail.  Where I can sling a backpack and hike through the woods until I actually get somewhere, somewhere that I need to go, rather than back again to the noisy, inefficient motor-vehicle that transported me to the trail-head, ten times the distance I ended up hiking that day.

P ~ NP

Here is a second venture into theory, this time computational complexity.  In this case I don't even have an equation, just a rather vague conjecture.  It is inspired by information theory and based on close to 20 years' experience working with numerical algorithms.  At the risk that someone else picks it up and manages to prove it, it goes something like this:
As the number of elements, n, approaches infinity, any NP problem (O(n!) by brute-force search) can be solved approximately, to within accuracy e, in P time (O(P(n)) for some polynomial P), where e is an arbitrarily small number.

The discovery of successively more accurate approximate solutions to the travelling salesman problem lends credence to the conjecture.

I haven't made a lot of progress proving it; I've just written some clustering software based on a "dendrogram," since building a dendrogram is one method of solving the travelling salesman problem.  The idea is that in an evenly distributed group (little or no clustering), apparently random changes to the path may not make a lot of difference to the solution: any "greedy" solution will get quite close.  By contrast, if there are multiple clusters, one of the most important things in the solution is how you move between the clusters.  This will have implications for the distribution of solutions.
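
To make "greedy" concrete, here is a minimal nearest-neighbour tour construction (a sketch only, not the dendrogram-based software): on evenly distributed points it tends to land reasonably close to the optimum, which is the behaviour the conjecture leans on.

#include <cmath>
#include <vector>

// Greedy nearest-neighbour heuristic for the travelling salesman problem:
// from the current city, always move to the closest unvisited one.
// O(n^2) time, no optimality guarantee.
std::vector<int> greedy_tour(const std::vector<double> &x,
                             const std::vector<double> &y) {
  int n=x.size();
  std::vector<char> visited(n, 0);
  std::vector<int> tour;
  int cur=0;
  visited[cur]=1;
  tour.push_back(cur);
  for (int step=1; step<n; step++) {
    int best=-1;
    double bestd=0;
    for (int j=0; j<n; j++) {
      if (visited[j]) continue;
      double d=std::hypot(x[j]-x[cur], y[j]-y[cur]);
      if (best<0 || d<bestd) { best=j; bestd=d; }
    }
    visited[best]=1;
    tour.push_back(best);
    cur=best;
  }
  return tour;
}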

Even if we accept the conjecture, there are still open questions.  Can you get guaranteed accuracy, or only expected accuracy?  (I believe there are approximate solutions to the travelling salesman problem with guaranteed accuracy.)  Does the order or coefficient of the polynomial solution relate to the accuracy (it stands to reason that it does) and if so how? 

At least the software works well--it might even be useful.




Wednesday, March 5, 2014

Rewriting Maxwell's equations

When I was in my fourth year studying physics, I felt a bit overwhelmed by the E&M courses.  Everything we learned in these courses was, ultimately, a corollary of Maxwell's equations.  Yet if I had gone into the final having only memorized them, or written them on my cheat-sheet, I would have been sure to fail.

Fast forward several (many) years and I'm working on atmospheric science stuff.  To do atmospheric science successfully requires an intersection of knowledge from many different branches of physics: classical and fluid dynamics, thermodynamics, electricity and magnetism.  We even use some quantum mechanics.  Here again, it's mostly fairly narrow theorems, rarely delving into the fundamentals.  In the area of quantum mechanics, for instance, I've never touched a wave-function, just used spectral lines in radiative transfer calculations.

Some of the most interesting work I've done (although I didn't think so at the time) has been simulating surface emissivities, specifically sea water and sea ice.  It's a very fundamental problem: how electromagnetic radiation interacts with matter, but you wouldn't think so from the way it's actually done in the field.  Here again I was using corollaries of more fundamental theories, specifically the Fresnel equations which can be derived from Maxwell's equations.

Here's where it gets interesting: I started off with a radiative transfer equation that was derived fairly directly from Maxwell's equations.  But this equation had at least two errors, so I re-derived it much more simply, using the Fresnel equations as a basis.  This and a couple of other things got me thinking about Maxwell's equations.  The model required a series of dielectric constants for the media being simulated.  These constants are quite difficult to calculate, even for homogeneous media such as salt water, and most of the models for them are at least semi-empirical.  For sea ice it's almost impossible, and most of the models handed down to me have only a passing connection to reality.

So it really got me thinking about how to simplify the laws to make them more tractable, and also wondering: are they really complete (or perhaps I should say complete enough, since I'm not sure any physics law can ever be complete, efforts to find a "final theory" notwithstanding) if they need these finicky and difficult-to-calculate constants?  Physics laws like Maxwell's equations have deep "roots," and people would sooner question their own measurements or their application of the laws than the laws themselves.  I've experienced this myself: you keep beating away at the solution until you get the "right" answer, that is, the one that agrees with your measurements.

The first thing I came up with was the plane-wave solution: why do we always throw away the imaginary part?  Why not keep it: this is the other field.  If your solution is for the E-field, then the imaginary part is the B-field and vice-versa.  The vacuum equations become:

div(R) = p

curl(R) = (i/c^2) dR/dt

where R = E + Bi and p is the charge density.  In light of the equivalence of the electric and magnetic forces--an electric field turns into a magnetic field under a Lorentz transformation--merging the two fields would seem to make sense.  It works for the plane-wave solution and, as I understand it, every other solution can be decomposed into a series of plane-wave solutions.
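
For what it's worth, here is the same combination written out in Heaviside-Lorentz units (a convention chosen here purely for convenience, in which E and B carry the same dimensions; the factor in front of the time derivative then comes out as i/c rather than i/c^2).  Starting from the source-free curl equations \nabla\times\mathbf{E} = -\tfrac{1}{c}\partial_t\mathbf{B} and \nabla\times\mathbf{B} = \tfrac{1}{c}\partial_t\mathbf{E}, and writing \mathbf{R} = \mathbf{E} + i\mathbf{B}:

\nabla\cdot\mathbf{R} = \nabla\cdot\mathbf{E} + i\,\nabla\cdot\mathbf{B} = \rho

\nabla\times\mathbf{R} = \nabla\times\mathbf{E} + i\,\nabla\times\mathbf{B}
  = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}
    + \frac{i}{c}\frac{\partial\mathbf{E}}{\partial t}
  = \frac{i}{c}\frac{\partial\mathbf{R}}{\partial t}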

This is about as far as I got with it.  As you can tell from my ham-fisted attempts to analyse my "Vernier" clock, I'm not that good at theory, although I'd like to do more of it.  To be fair, my earlier mistake was more because I couldn't be bothered to take the time to sit down and work things out properly.  I'd like to write more about that later.

There are still problems with this: you can't really merge the electric permittivity and magnetic permeability into one constant because there will be cross terms.  So obviously I haven't made any progress towards understanding the electromagnetic properties of sea ice--unless that's the answer: there really are cross terms and we haven't accounted for them!


I'm thinking you might be able to take it further by merging the reduced set of equations into only one, by generalizing the curl and incorporating time into a four-space.  We did something similar in fourth year--I dug it out recently but all the details escape me.

Two other things bother me about the Maxwell equations.  You can solve the equations using a complex permittivity.  In this case, the imaginary part is the conductivity (or some part of it--I'm pretty sure the two are not precisely equivalent).  But conductivity isn't part of the Maxwell equations.  Why is the current directly proportional to the electric field, instead of being related through the first derivative, as in Newtonian physics?  This is obviously an approximation--an empirical hack.
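
For reference, here is the usual hack written out (with the e^{-i\omega t} sign convention; the sign of the imaginary part flips with the opposite convention): Ohm's law J = \sigma E is folded into Ampere's law, which is what makes the permittivity complex.

\nabla\times\mathbf{B} = \mu_0\sigma\mathbf{E} + \mu_0\epsilon\frac{\partial\mathbf{E}}{\partial t}
\quad\longrightarrow\quad
\nabla\times\mathbf{B} = -i\omega\mu_0\left(\epsilon + \frac{i\sigma}{\omega}\right)\mathbf{E},
\qquad
\epsilon_{\mathrm{eff}} = \epsilon + \frac{i\sigma}{\omega}

for fields varying as e^{-i\omega t}.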

Of course, real physicists don't bother with the classical E & M equations since we've long since moved beyond them, although I'm not sure how much use quantum electrodynamics is for real-world problems such as the one above.  I still have a bunch of really dumb questions related to E & M and light, such as: What is the quantum formulation for a single photon?  (Can you write out the Schroedinger equation for a photon?  How come we never did that in undergrad?)  How does it relate to the classical formulation for light?  How much information does a single photon transmit?