Tuesday, March 19, 2019

Random Thought

I have to say that, in general, I do not believe in randomness. I'm sure there are some Quantum Mechanics (Maniacs?) out there who will beg to differ and provide supporting arguments, but until then....

Let's say I flip a coin. This particular flip comes up HEADS. Can you provide me with a proof that it could have been TAILS? Sure, sure, you can show that the next few flips might have different outcomes, and further that the next 1 billion flips will average dangnabbitedly close to 50% each. But that's not what I asked. I want proof that the original action might have taken a turn to the T-side. Since that has already (not) happened it is in the -- still apparently -- inviolable past and cannot be changed. So maybe it wasn't random at all?

Don't get me wrong, I'm not trying to argue that we can predict the future. Both complicatedness (many moving parts) and complexity (intersecting feedback loops) make that practically and theoretically impossible.

I'm just saying that we can predict the past.

Monday, February 4, 2019

Fixed Point Failure

A Fixed Point math library and Neural Net demo
for the Arduino...

Or: Multiple cascading failures all in one place!

Last year I found a simple self-contained Artificial Neural Net demo written for the Arduino at: robotics.hobbizine.com/arduinoann.html and spent a goodly amount of time futzing around with it. I now, almost, understand HOW they work, but have only a glimmering of insight into WHY. The demo does something really silly: The inputs are an array of bit patterns used to drive a 7-segment numeric display and the outputs are the binary bit pattern for that digit (basically the reverse of a binary to 7-segment display driver). Someone not totally under the influence of ANNs could do this with a simple 10 byte lookup table. But that is not us. On the plus side it _learns_ how to do the decoding by torturous example, so we don't have to bother our tiny brains with the task of designing the lookup table.

HOW ANNs work on the Arduino is:
  • a) Extremely slowly, because they use a metric shit-ton of floating point arithmetic; and,
  • b) Not very interestingly, because each weight takes up 4 bytes of RAM and there is only about 1KB kicking around after the locals and stack and whatever else is accounted for -- the simple demo program illustrated here uses about half of that 1KB just for the forwardProp() node-weights and then the backProp() demo uses the other half for temporary storage. Leaving just about nothing to implement an actually interesting network.
But. I thought I could make a small contribution by replacing the floating point -- all emulated in software -- with an integer based Fixed Point implementation -- whose basic arithmetic is directly supported by the ATMEGA hardware. This would also halve the number of bytes used by each weight value. Brilliant yes?

And in fact. My FPVAL class works (see below for zip file).  Except, err, well, it doesn't save any execution time. But more on that later....

Anyway. The FPVAL implementation uses a 2-byte int16_t as the basic storage element (half the size of the float) and pays for this with a very limited range and resolution. The top byte of the int16 is used as the "integer" portion of the value -- so the range is roughly +/- 128 (strictly, -128 to just under +128).  The bottom byte is used as the fraction portion -- so the resolution is 1/256 or about .0039 per step. At first blush, and seemingly also in fact, this is just about all you need for ANN weights.
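To make the byte-split concrete, here is a minimal sketch of the 8.8 layout described above. It is not the FPVAL class itself; the names toFixed(), toFloat(), FRAC_BITS and FP_ONE are made up for illustration, and the float conversions are only there so the values can be checked on a desktop compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; the real FPVAL class is in fpval.cpp.
constexpr int FRAC_BITS = 8;                  // low byte holds the fraction
constexpr int16_t FP_ONE = 1 << FRAC_BITS;    // 1.0 is stored as 256

// Convert a float to 8.8 fixed point, rounding to the nearest 1/256 step.
int16_t toFixed(float x) {
    assert(x > -128.0f && x < 127.99f);       // representable range of 8.8
    float scaled = x * FP_ONE + (x >= 0 ? 0.5f : -0.5f);
    return static_cast<int16_t>(scaled);      // cast truncates toward zero
}

// Convert 8.8 fixed point back to a float (for printing or checking only).
float toFloat(int16_t v) {
    return static_cast<float>(v) / FP_ONE;
}
```

So 1.0 is stored as 256, -0.5 as -128, and the smallest non-zero step is a stored 1, which is 1/256.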

As it turns out, simple 16 bit integer arithmetic Just Works(TM) to manipulate values, with the proviso that some judicious up and down shifting is used to maintain Engineering Tolerances. This is wrapped in a C++ class which overrides all the common arithmetic and logic operators such that FPVALs can be dropped into slots where floats were used without changing (much of) the program syntax. This is illustrated in the neuralNetFP.cpp file, where you can switch between using real floats and FPVALs with the "USEFLOATS" define in netConfig.h.

Unfortunately it appears that a lot of buggering around is also needed to do the shifting, checking for overflow, and handling rounding errors. This can all be seen in the fpval.cpp implementation file. An interesting(?) aside: I found that I had to do value rounding in the multiply and divide methods -- otherwise the backProp() functions just hit the negative rail without converging.
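The rounding business is easier to see in code. This is a sketch of one common way to do a rounded, saturating 8.8 multiply and divide. It is an assumption about the general technique, not a copy of what fpval.cpp does, and fpMul(), fpDiv() and saturate() are illustrative names. It also assumes that right-shifting a negative value is an arithmetic shift, which is what GCC does.

```cpp
#include <cassert>
#include <cstdint>

constexpr int FRAC_BITS = 8;

// Clamp a 32-bit intermediate into the int16_t range (saturate, don't wrap).
int16_t saturate(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// 8.8 * 8.8 gives a 16.16 product; add half an LSB before shifting back down.
int16_t fpMul(int16_t a, int16_t b) {
    int32_t product = static_cast<int32_t>(a) * b;
    product += 1 << (FRAC_BITS - 1);           // without this the shift always rounds down
    return saturate(product >> FRAC_BITS);
}

// Scale the dividend up first so the quotient keeps its 8 fraction bits.
int16_t fpDiv(int16_t a, int16_t b) {
    assert(b != 0);
    int32_t num = static_cast<int32_t>(a) * (1 << FRAC_BITS);
    int32_t half = b / 2;                      // half the divisor, same sign as b
    // Bias by half a divisor in the direction of the quotient's sign, so the
    // truncating integer divide ends up rounding to nearest.
    num = ((num < 0) != (b < 0)) ? num - half : num + half;
    return saturate(num / b);
}
```

Without the half-LSB bias every multiply rounds toward the negative side, so a long chain of small weight updates picks up a steady downward drift. That would be consistent with the negative-rail behavior described above, though I have not verified that it is the actual cause.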

I also replaced the exponential in the ANN sigmoid activation function with a piecewise linear approximation, which rids the code of float dependencies.
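For the curious, a table-driven sigmoid in 8.8 might look something like the sketch below. The breakpoints (one per whole unit of x, from 0 to 6) and the names fpSigmoid(), lerp() and SIGMOID_TABLE are my own choices for illustration, not the ones used in the zip file.

```cpp
#include <cassert>
#include <cstdint>

constexpr int FRAC_BITS = 8;
constexpr int16_t FP_ONE = 1 << FRAC_BITS;

// sigmoid(x) at x = 0, 1, ... 6, pre-scaled to 8.8 (round(256 * value)).
constexpr int16_t SIGMOID_TABLE[7] = {128, 187, 225, 244, 251, 254, 255};

// Rounded linear interpolation between y0 and y1; frac is 0..255 (8.8 fraction).
int32_t lerp(int32_t y0, int32_t y1, int32_t frac) {
    assert(frac >= 0 && frac < FP_ONE && y1 >= y0);
    return y0 + (((y1 - y0) * frac + FP_ONE / 2) >> FRAC_BITS);
}

int16_t fpSigmoid(int16_t x) {
    // Use symmetry: sigmoid(-x) = 1 - sigmoid(x). Widen first so -(-32768) is safe.
    int32_t mag = x < 0 ? -static_cast<int32_t>(x) : x;
    int32_t idx = mag >> FRAC_BITS;            // integer part picks the segment
    int32_t frac = mag & (FP_ONE - 1);         // fraction is the position within it
    int32_t y = (idx >= 6) ? SIGMOID_TABLE[6]  // flat beyond the last breakpoint
                           : lerp(SIGMOID_TABLE[idx], SIGMOID_TABLE[idx + 1], frac);
    return static_cast<int16_t>(x < 0 ? FP_ONE - y : y);
}
```

The output never quite reaches 0 or 1 (it stops at 1/256 and 255/256), which is deliberate in this sketch so the s*(1-s) derivative term used by backprop never goes to exactly zero.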

I forged ahead and got the danged ANN demo to work with either floats or FPVALs. And that's when I found that I wasn't saving any execution time.  (Except, for some as yet unexplained reason, the number of FPVAL backprop learning cycles seems to be about 1/4 of that needed when using floats[??]).

After a lot of quite painful analysis I determined that calling the functions which implement the FPVAL arithmetic entails enough overhead that they are almost equal in execution time to the optimized GCC float library used on the ATMEGA. Most of the painful part of the analysis was in fighting the optimizer, tooth-and-nail, but I will not belabor that process.

On the other hand, if you are careful to NOT use any floating point values or functions, you can save two bytes per value and around 1KB of program space. Which might be useful, to someone, sometime.

So. What's in this bolus then is the result of all this peregrination. It is not entirely coherent because I just threw in the towel as described above. But. Here it is:


Thursday, January 31, 2019

Some Driveline Enhancements

Variations Too, again

So. I've been dragging my feet -- once again -- because everything just seemed too hard over the holidays, but I have made progress none-the-less....

While doing limited in-camera demos I found that the Variations second arm linkage just tore itself apart pretty consistently. This was due to there being nothing but a bit of stickiness holding the axle into the arm. I originally used two pins through the whole sandwich to keep the gear from spinning, but I didn't have anything really holding the layers together, and there was too much torque for the sticky to manage.

This has, perhaps, been remedied:
Improved(?) axle mounting

After the arm was all re-assembled, I drilled a hole longitudinally through the circular backing plate and the axle, and glued a 1" long by ~1/32" diameter nail into the hole. This of course requires a solid drill press, or mill, mounting and careful attention to not breaking the (@$!@#) miniature drill. Here you can also see the two pins (little brass brads, also about 1/32" dia) that pierce the entire sandwich to prevent the gear from spinning on its own.

Compare to the previous layout, where the above photo is looking straight on from the bottom:
So. After assembling and gluing all the little bits into their sandwich, one needs to fire up the machine shop and drill two transverse holes almost through all of the plate-arm-gear layers -- the almost part being that we don't want to completely pierce the gear itself, thus the pins need to be shorter than the full thickness (which may vary according to the arm material). Then rotate the arm and drill a longitudinal hole through the backing plate and axle -- basically straight down, centered where the "Plexiglas backing plate" arrow points in the above -- Gear Linkage -- photo. THEN glue the relevant pins into the holes. I've tried both: Goop, which is a bit hard to get schmushed into the holes but sticks to the pins; and: filled acrylic-solvent glue, which can be squirted into the holes but only sticks to the pins in an advisory way. Fortunately the sticky provides little in the way of mechanical advantage, it only needs to keep the pins in place.

I did this for the two lower arm linkages and made the executive decision that the torque on the smallest, upper, arm did not merit the extra effort. YMMV...



I think this may be the end of the mechanical portion of our time together, save perhaps for cable routing which is still rather ad-hoc.

Tuesday, November 13, 2018

Sonar! The HCSR04 Library

For Variations Too I need some kind of distance sensor to see if there is anyone watching and how interested they might be. For the 'real' Variations I am planning to use a video camera and image processing, but this is the kiddie version.


I thought I had it all worked out because I've used this cheapo, err, inexpensive, HC-SR04 (aka 19605-UT) ultrasonic sensor, from mpja.com amongst others, in other projects using my MSCapture library which turns Arduino Pin 8 into a Timer 1 counting input -- remind me to rant about this sometime, especially since my library is perfect for grabbing IR remote control signals -- but not for now.

See the data sheet here.

But, if you've been reading along, you know that the Servo motor control library is also a big fan of Timer 1 and thus that acre of digital real estate is no longer available. So I had to reinvent the wheel using a different mechanism with Timer 2 which has only an 8 bit resolution and no external count input.

Therein lies the HCSR04 library in my code bolus.

It uses three interrupts (you are surprised?), two from Timer 2 and one from an external pin change signal. The counter is run with the maximum pre-scale of /1024, giving a 64 micro-second resolution which is not quite as fine as one would like, but it turns out that the sensor itself is not quite as fine as one would like either, so it sorta works out. It counts the timer overflow interrupts to add 4 more bits to the timing range, which is somewhat more than enough to detect the SR04's no-signal failure signal.
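The timing numbers above fall out of a bit of arithmetic, sketched here so they can be checked. This assumes the stock 16 MHz Arduino clock (the post doesn't say, but 64 micro-seconds at /1024 implies it), and totalTicks() is a made-up helper showing how an overflow tally and the 8 bit timer count could be glued together, not the library's own code.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t CPU_HZ = 16000000UL;      // assumed stock Arduino clock
constexpr uint32_t PRESCALE = 1024;          // Timer 2 maximum pre-scale
constexpr uint32_t TICK_US = PRESCALE * 1000000UL / CPU_HZ;   // 64 us per count
constexpr uint32_t OVERFLOW_US = 256 * TICK_US;               // 8 bit wrap: 16384 us
constexpr uint32_t MAX_RANGE_US = 16 * OVERFLOW_US;           // 4 extra bits: 262144 us

// Combine the overflow tally (high 4 bits) with the 8 bit timer count.
uint16_t totalTicks(uint8_t overflows, uint8_t timerCount) {
    assert(overflows < 16);                  // only 4 extra bits are kept
    return static_cast<uint16_t>((overflows << 8) | timerCount);
}
```

That gives a full range of about 262mS, which comfortably covers the sensor's no-echo timeout.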

Two digital pins, and power, are connected to the sensor. One pin is the Trigger output which can be any available digital output pin. It sends a short positive pulse, where the falling edge starts a sonar sample cycle. The other is the Ping input pin which goes high from the end of the sonar ping until it gets an echo response -- or for a loooong time if it misses the echo (more below). The Ping input currently has to be Arduino Pin 2 or 3, because I couldn't make sense of the doc for attachInterrupt(). I think one can use other pins but the code will need a light re-wanking.

HC-SR04 signals

The library class has these methods:

    /** default constructor **/

    /** initialize the HC-SR04 sensor pins and timer interrupts
     **  leaves all the interrupts disabled
     **   use SR04.startPing() to start a sample cycle **/
    void init( uint8_t trigpin, uint8_t pingpin );

    /** Start a sensor ping cycle
     **  turns on TRIGPIN and enables interrupts
     ** Presumes that SR04init() and initTimer2() have already been called.  **/
    void startPing(void);

    /** return true if there is a new distance value available
     **  will clear itself, so a second call will return false...  **/
    bool available(void);

    /** return the last distance value from the sensor
     **  if it's 0x0000 we didn't get anything...  **/
    int16_t getDistance(void);

After the class is initialized, a call to startPing() will send a trigger pulse and wait patiently for the results. Under the covers, the Trigger output pin is set high and Timer 2 is started, a count interrupt is fired after two counts and used to turn off the Trigger pin, thus starting the Ping cycle. When, and if, the timer wraps around on 256 64uS counts (~16.4mS), the overflow interrupt fires and a global status variable is incremented -- this allows for an extended count range, in this case up to 4 bits or x16. When the Echo input pin goes low, the input pin interrupt fires, all the counts are counted up, and the available() interface will signal by returning true -- just once. When that happens getDistance() can return something useful.


The spec'd range of good distance data is from about 10 to 360 (in 64uS increments). I did not subtract the two to four initial trigger counts so you can do that if you want a bit more accurate close range measurement.  If you multiply the count value by 1.1 you can get fairly close to the actual distance in CM.
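Where does 1.1 come from? Sound travels roughly 343 m/s, so in one 64uS count it covers about 2.2cm, and the echo has to go out and back, so each count is worth about 1.1cm of distance. Since floats are not our friends here, a conversion can be done in integers. This is only a sketch: countToCm() is a made-up helper, not part of the library, and it treats both failure values (0 and negative) as "no reading".

```cpp
#include <cassert>
#include <cstdint>

// Convert a raw 64 us count to centimetres using count * 1.1, in integers.
// Returns -1 when the reading is one of the failure values.
int16_t countToCm(int16_t count) {
    if (count <= 0) return -1;       // 0 = missed echo, negative = counter ran out
    assert(count <= 4095);           // 8 timer bits + 4 overflow bits
    // Multiply by 11, add 5 to round, divide by 10. Widen so nothing overflows.
    return static_cast<int16_t>((static_cast<int32_t>(count) * 11 + 5) / 10);
}
```

With this, a count of 180 comes out as 198cm and the top of the spec'd range, 360, comes out as 396cm.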

However in a spot check, I did not get reliable distance counts beyond about 180, i.e. 200cm, so YMMV. Also it jumps around, failing for a number of cycles before coughing up an occasional good value.

A note on the bad values.... If there is no ping return received the sensor just keeps going until it gets tired. The spec says this should be 38mS after the trigger, but the reality seems to be closer to 150mS. So when there is nothing to sense, the time between cycles is quite long. When the ping goes missing, available() will eventually return true and the getDistance() method will return 0 -- this just makes it easier to see on a data plot. Should everything fail -- the counter will just keep counting up to its 16x maximum and getDistance() will return 0x8000 (a negative number).

For a usage example look to the test.ino file. Connect the sensor to Gnd and +5v power, Trigger to Pin 2, and the Ping to Pin 3. Create a global HCSR04 object and call init(pinT, pinP) in setup(). Then call startPing(), available() in a loop, and getDistance() when available returns true.

So simple. Even I could do it!

Sunday, November 11, 2018

SCHervo Library and ServoTask

I made some of my usual "improvements" to the standard Arduino Servo motor control library.  Building on my Task Scheduler I've added a speed control, so the time it takes a Hobby Servo motor to move from its current position to a new one can be controlled over a fairly large range.

It did take some reverse engineering...

The regular Servo library uses Timer 1 (on the 'standard' Arduino ATMega 328's -- I haven't worked this all out for other chips) to operate up to 8 (they say 12 but I think it gets a bit slippery after 8) servo motors. It does this by sequencing the ON pulses for each motor, one-after-the-other, which is pretty slick. Or would be IF the authors had mentioned what they were doing someplace. There's very little internal documentation -- Comments, Please! -- in the code. But I persisted.

My SCHervo library comprises a cleaned up and (hopefully accurately) commented original version, with the usual Schip Secret Sauce additions.  A simple addition is a turnOff() method which just shuts the motor off so it isn't using power trying to hold a position (or speed) that you don't care about. The more complicated addition is a Task to update the position -- and thus speed -- of each motor over a fixed interval. This allows the motor to transit in pseudo-continuous increments over a fairly extended period (up to about 1 minute). The motion is initiated using the startMove() method, can be monitored with ready() which returns TRUE when the motor has (just about) reached its new position, and stopMove() to make it stop at any time in-between.

But first let's review just this much: How do Servos work?

The basic idea is that you send the motor a positive pulse every 20mS (big T in the picture below), where the width of that pulse (little t) is proportional to the position one wants the motor shaft to take -- usually defined as from 0 to 180 degrees. Each degree position translates to a pulse width, which generally varies from 500 to 2500 micro-seconds, where 1500uS is a nominal 90 deg center position. Each motor is a bit different in its range and widths, but that's the general scope.
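As a quick illustration of that mapping, here is a sketch that turns a shaft angle into a pulse width using the nominal 500 to 2500 micro-second range mentioned above. The function name and constants are illustrative, not taken from the Servo or SCHervo code, and a real motor will want its own end-point values.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint16_t MIN_PULSE_US = 500;    // nominal pulse width at 0 degrees
constexpr uint16_t MAX_PULSE_US = 2500;   // nominal pulse width at 180 degrees

// Linear map from 0..180 degrees to a pulse width in micro-seconds.
uint16_t angleToPulseUs(uint8_t degrees) {
    assert(degrees <= 180);
    uint32_t span = MAX_PULSE_US - MIN_PULSE_US;            // 2000 us over 180 deg
    // Add 90 (half the divisor) before dividing to round to the nearest us.
    return static_cast<uint16_t>(MIN_PULSE_US + (span * degrees + 90) / 180);
}
```

That works out to about 11uS per degree, with 90 degrees landing on the 1500uS center.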

picture of a servo drive pulse train
And here is a better description.

Another thing to remember is that the motor doesn't just suddenly go to the new position, but has a finite slew rate, generally something like 60 degrees in 200mS. This works out to 5 or 6 degrees in each 20mS refresh period, which is also the fastest the motor can get any new position input.

So... If we change the delivered pulse width in every 20ms period we can control the motor's angle change speed. And that's what the ServoTask() does. Recall that all of the 8 controllable motor's pulses are sequential, stacked such that the next starts after the previous finishes, and when they are all done there is a slack period until the 20mS refresh times-out. The ServoTask() is posted from the timer interrupt service routine at the beginning of the slack period, and -- in theory -- it will execute and update the desired pulse widths for the next period.

When a motor is started using startMove( endPosition, time, offFlag ) the code calculates how many micro-seconds should be added to the current pulse width in order to transition from the starting to end positions in the given amount of time. I got a little tricky and used a Fixed Point calculation with 3 bits of "sub-precision", to handle longer moves, but that's just between you, me, and the code.

However ... Fixed Point

If you know me, you know, I can't resist a few, more, comments. The Arduino using an ATMega 328 has no hardware support for Floating Point math. Should you make the mistake of including a float value in your program the linker will pull in about 1kB of code to support it. And further, should you blunder into actually using the float in a math-like-way, the result will take (relatively) forever. It's actually even worse, as the ATMega has only a small set of 8bit integer Multiply instructions giving 16bit results (along with ADD and SUBTRACT) and NO Divide at all.

So what do you do when you would like to maintain some fractional components in your arithmetic? Why Fixed Point of course!  I'll leave it to wikiP to explain: https://en.wikipedia.org/wiki/Fixed-point_arithmetic

The tl;dr of it is that you shift int values up by a consistent number of bits, do your arithmetic, and then shift them back down when you want to get a nice truncated integer again. This needs to be done judiciously because you only gain the precision of the up-bit-shift, and this number of bits is removed from the range of your values. In the case of my 3bit FixP values using a 16bit int, you get a precision of 1/8th or .125 in decimal and a range of +/- 0 to 4096 (12 bits plus sign). Which turns out to be just fine for calculating the micro-second values needed by the servos.
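Here is roughly how that plays out for a timed servo move. This is a sketch under my own assumptions, not the SCHervo code: the Move struct, planMove() and nextPulseUs() are hypothetical names, and the choice to snap to the exact end value on the last step is mine.

```cpp
#include <cassert>
#include <cstdint>

constexpr int SUB_BITS = 3;             // 3 fraction bits: steps of 1/8 us
constexpr uint16_t REFRESH_MS = 20;     // servo refresh period

struct Move {
    int16_t posFx;       // current pulse width, in 1/8 us units
    int16_t stepFx;      // signed change per refresh, in 1/8 us units
    int16_t endUs;       // target pulse width, in whole us
    uint16_t stepsLeft;  // refresh periods remaining
};

// Work out the per-refresh increment for a move taking timeMs.
Move planMove(int16_t startUs, int16_t endUs, uint16_t timeMs) {
    assert(startUs >= 0 && startUs < 4096 && endUs >= 0 && endUs < 4096);
    uint16_t steps = timeMs / REFRESH_MS;
    if (steps == 0) steps = 1;                       // always take at least one step
    int32_t deltaFx = static_cast<int32_t>(endUs - startUs) * (1 << SUB_BITS);
    Move m;
    m.posFx = static_cast<int16_t>(startUs * (1 << SUB_BITS));   // shift up
    m.stepFx = static_cast<int16_t>(deltaFx / steps);
    m.endUs = endUs;
    m.stepsLeft = steps;
    return m;
}

// Advance one refresh period and return the pulse width to send, in whole us.
int16_t nextPulseUs(Move &m) {
    if (m.stepsLeft > 0) m.stepsLeft--;
    if (m.stepsLeft == 0) {                          // last step: land exactly on target
        m.posFx = static_cast<int16_t>(m.endUs * (1 << SUB_BITS));
        return m.endUs;
    }
    m.posFx += m.stepFx;
    return static_cast<int16_t>(m.posFx >> SUB_BITS);   // shift down (truncates)
}
```

For example, moving from 1000uS to 2000uS over 40 seconds needs 0.5uS per refresh. With plain integers that increment would truncate to zero; with 3 extra bits it is stored as 4 and the output creeps up by one micro-second every second refresh.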


Note that the old-fashioned Servo.write() interface does not provide a way to determine when the motor is (almost) done moving. My startMove() method tries to account for the slew rate during fast motions by adding some guess at the amount of time needed. But this runs into a bit of trouble from the internal motor controller, which usually treats smaller moves as slower changes (this is how you are sometimes able to get speed control from motors that have been modified for continuous rotation). This behavior also slows the motor down when using the ServoTask() incremental stepping, so the internal how-long? guess is not always right.

I could do a little more hacking to better integrate faster slews -- it shouldn't be THAT hard, heh -- and this would also eliminate the need for the 'regular' write() interface (or draw it into the fold) such that, in all cases, the ready() method will really tell you when the motor is done moving.

But. Otherwise. You should be able to just go ahead and use it now...

Saturday, November 10, 2018

Possible employment opportunity!

FB is trying to help by getting me a job selling popcorn! If I could only find 5 of these I could forgo my Socialist Social security....

Thursday, November 8, 2018

Program Structure

OK then. Now here's some Architecture...

I've made a template file for the Arduino main program that uses the libraries and functions that we have so recently been discussing:


As you know, the Arduino system shields you from some of the nitty gritty by requiring only two methods:
  • setup() -- runs once at the beginning of time (after a reset);
  • loop() -- is run repeatedly thereafter;
In my template setup() calls methods to initialize messaging, any output devices that will be used, and the ADC and other inputs:

// the setup routine runs once when you press reset:
void setup()
{
    MessageTask_init();    // init the message system
    RunTask_init();        // init the output system
    ADCTask_init();        // init the input system
}

Where those methods are declared like this:

/** Do whatever necessary to initialize the message system */
void MessageTask_init()
{
    // everyone uses comms
    Serial.begin( 9600 );
    // set message terminator to newline for text input
}

// a global system state, just for Sudhu...
//  actually... to keep track of what the system is doing
word runState;

/** Do whatever necessary to initialize the system output devices. */
void RunTask_init()
{
    // set output modes on pins
    pinMode( BPIN, OUTPUT );
    // initialize running state
    runState = 0;
}

/** Do whatever necessary to initialize the system input devices. */
void ADCTask_init()
{
    // set input modes on pins
    pinMode( APIN, INPUT_PULLUP );
    // start the ADC interrupt cycle
    analogRead( 0 );
}

A word about runState ... I insist on keeping track of the internal system state in order to execute sequences of behaviors and respond appropriately to inputs. So each of my programs has a global state variable which is manipulated by all the Task functions. The use of this will be more apparent if/when we get to the actual Variations Too code, but Sudhu was always teasing me about it so he gets credit here....

The loop() function does two things: look for messages and post the MessageTask, and then do a pass through the scheduler's list of things to do. If any Tasks are ready to run, they get executed here, and then the scheduler() returns, allowing loop() to return, which then repeats itself. Note that interrupts will execute (except for brief elisions) throughout this, so new functions may be entered on the Task list at any time.

// the loop() routine runs over and over and again forever:
void loop()
{
    // see if we have a new message and post the receive task
    postTask( 0, MessageTask, nMsg );
    // execute the schip task scheduler
    scheduler();    // schedule the world
}
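If you haven't seen the scheduler post, the post-then-run pattern can be sketched in a few lines. To be clear, this is NOT the schip scheduler, just a guess at the general shape built around the postTask( delay, function, argument ) call used above. The list size, the TaskSlot struct, the nowMs stand-in for millis(), and ExampleTask are all made up so the thing can run on a desktop compiler.

```cpp
#include <cassert>
#include <cstdint>

typedef uint16_t word;                  // Arduino's 16 bit 'word' type
typedef void (*TaskFn)(word);

constexpr uint8_t MAX_TASKS = 8;        // hypothetical list size

struct TaskSlot {
    TaskFn fn;          // nullptr marks a free slot
    word arg;           // value handed to the task when it runs
    uint32_t dueMs;     // time at which the task may run
};

static TaskSlot taskList[MAX_TASKS] = {};
uint32_t nowMs = 0;                     // stand-in for millis() when off-target

// Queue fn(arg) to run delayMs from now; returns false if the list is full.
// On real hardware this would need interrupts masked while the list is touched.
bool postTask(uint16_t delayMs, TaskFn fn, word arg) {
    assert(fn != nullptr);
    for (uint8_t i = 0; i < MAX_TASKS; i++) {
        if (taskList[i].fn == nullptr) {
            taskList[i] = TaskSlot{fn, arg, nowMs + delayMs};
            return true;
        }
    }
    return false;
}

// One pass over the list: run and free every task whose time has come.
void scheduler() {
    for (uint8_t i = 0; i < MAX_TASKS; i++) {
        TaskSlot &slot = taskList[i];
        // Signed difference keeps the comparison correct across clock wrap-around.
        if (slot.fn != nullptr && static_cast<int32_t>(nowMs - slot.dueMs) >= 0) {
            TaskFn fn = slot.fn;
            word arg = slot.arg;
            slot.fn = nullptr;          // free the slot first so the task can re-post
            fn(arg);
        }
    }
}

// Example task that just records the argument it was given.
word lastArg = 0;
void ExampleTask(word arg) { lastArg = arg; }
```

A task posted with a delay of 0 runs on the next scheduler() pass; one posted with a delay of 100 sits in the list until the clock has advanced that far.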

The actual tasks that will be executed are declared like this:
/** Task posted when there is a serial input message
 * @param nmsg -- ignored... */
void MessageTask( word nmsg )
{
    // read the message string including '\n' terminator
    // and execute user functions
}

/** Posted from ADCTask to do something with new sensor values
 *  sState -- condensed bitmap of stuff that happened */
void RunTask( word sState )
{
    // look at run and sensor States and do appropriate stuff
}

/** Task posted when ADC averaging gets a new set of values. */
void ADCTask( word numADC )
{
    // rummage through input stuff and set flag bits in sState
    // execute RunTask(sState) after 0 millis (like real soon now)
}

And the rest is, as they say, just implementing stuff....

We'll talk about some of that stuff anon.