My First Raspberry Pi Project

Ever since the Raspberry Pi came out I have been looking for a small project where it makes sense to use one, and I finally found it! This year the British weather went abroad and left us with two complete weeks of 28+ degrees Celsius, which is (more or less) unheard of. This caused many problems, such as melting motorways and cracked rails, but above all it exposed the inadequacy of the ventilation in my personal server room (aka the garden shed). Most of the services provided by those computers are not critical, but there is one that is indispensable, especially in this weather. As it happens, I am a member of a local sailing club that has a weather station. The weather station uploads the wind direction and speed to the website every 15 minutes, which is great but gives you no clue about the variation, an important factor in inland waters (the club is in this beautiful lake). Without historical information it is hard to decide whether there is enough wind to justify loading the car with all the boat gear and driving there.

So, a couple of years ago, we solved this by creating a small Python script that downloads the pictures with the speed and direction of the wind, OCRs them to extract the speed, does some poking around to work out the direction and then creates a historical chart using matplotlib. Maybe I could have done something simpler using the weather station's API (if there is one) but this was so much fun and I did not have to ask for access to the weather station. The other club members liked it so much that the charts are now part of the club's website. So many people use them that it became a point of honour to keep the charts updated, and one of the servers in the garden shed was partly dedicated to this. With the high temperatures the service became a bit unstable, and that is where the Raspberry Pi comes in.
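
For the curious, the pipeline looks roughly like this. This is a minimal sketch rather than the real script: the image URL, the crop geometry and the file names are all made up, and the direction handling is left out.

import subprocess
import urllib2

import matplotlib
matplotlib.use('Agg')  # headless rendering, no X server required
import matplotlib.pyplot as plt

# Download the latest wind speed image (the URL is hypothetical)
data = urllib2.urlopen('http://example.org/weatherstation/speed.png').read()
with open('speed.png', 'wb') as f:
    f.write(data)

# Crop the number out of the image with ImageMagick, then OCR it with Tesseract
subprocess.check_call(['convert', 'speed.png', '-crop', '60x20+10+10', 'speed_crop.tif'])
subprocess.check_call(['tesseract', 'speed_crop.tif', 'speed'])  # writes speed.txt
speed = float(open('speed.txt').read().strip())

# Append the reading to the history and redraw the chart
with open('history.csv', 'a') as f:
    f.write('%f\n' % speed)
speeds = [float(line) for line in open('history.csv')]
plt.plot(speeds)
plt.savefig('windspeed.png')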

Start here if you are only interested in the Raspberry Pi

When the Raspberry Pi was launched I tried to buy one but soon gave up in the face of the queues and the crazy eBay prices. After all, I did not have anything planned for it and it would have ended up in a box like (I would guess) so many have. But this time I had a good reason. I needed to replace this overheating 4-CPU AMD server with 8 GB of RAM and a couple of SSDs… just kidding, it does not have any SSDs. So I bought a bundle from some company on Amazon, connected all the wires and turned it on. The bundle came with NOOBS installed on the SD card and I installed the recommended choice of OS: Raspbian (not a very good name, I should add). If you are new to the Raspberry Pi, NOOBS is a kind of application that makes installing a Raspberry Pi simpler than installing any Windows application. I was expecting the back and forth that most Linux installations require, but no. It just installed all by itself, gave a few tips during the installation and there I was on a desktop. I am sorry to say this but LXDE, the desktop, could look a bit nicer but, hey, it is free and fully functional and I am not complaining.

OCR and number crunching on a Raspberry Pi

The program that downloads the images, OCRs them, creates the new charts and uploads them to the website is written in Python 2.7 and has a few dependencies. So the first task was installing them as well as updating the Raspberry Pi software.

First update the Raspberry Pi (hopefully this is the correct order, I never remember which comes first):

sudo apt-get update
sudo apt-get upgrade

Then install Tesseract OCR, ImageMagick and the Python imaging bindings. The last two are required for the image manipulations:

sudo apt-get install tesseract-ocr
sudo apt-get install imagemagick
sudo apt-get install python-imaging-tk

ImageMagick is a very cool tool for image manipulation from the command line, but that is a post for another day.

I also need some fonts for the charts:

sudo apt-get install ttf-mscorefonts-installer

The reason I am using these fonts is that the script was developed on Windows and requires some common Windows fonts.

Then came the heavy-duty dependencies. Matplotlib requires numpy:

sudo apt-get install python-numpy
sudo apt-get install python-matplotlib

These took forever to install, maybe half an hour or so, but they installed without a glitch. At this point I was in awe; I mean, this is a generic computer for less than £50 all included and I can install numpy!

And finally I got to my script. As you would expect there were a few rough edges caused by crossing from Windows to Linux, such as paths, but in a few minutes the script was up and running. I am always surprised when a Python script just runs on a completely different platform without any code changes. But then I had to automate the script to run every 15 minutes, and that is where cron comes in. Three hours later (!) I got it working. This is an area where the Windows Task Scheduler wins hands down. It is much easier to muddle through these tasks on Windows than on Linux. On Linux you have to learn stuff and read documentation before you can produce results. I am still not sure which one is more beneficial to my life.

But hey, I got it working and I even have a couple of tips. Install exim4 (sudo apt-get install exim4) to receive the errors from cron by mail; they just disappear otherwise, especially if they are security errors. To read mail on the Raspberry Pi without installing a mail reader, type mail at the command line. This is Outlook's great-grandfather and is a joy to use.

The other tip is to cd to the folder where the script runs before running it in cron. This is my cron job:

*/15 * * * * cd /home/pi/cvlsc && python windspeed.py

That way all the temporary files can be created without access denied errors, which were the cause of 80% of my problems.

Performance

My script is not the best example as far as optimization is concerned. Typically I load the same data over and over and save and read files at will, which is normal since I wrote the script in a couple of sessions over a weekend after sailing all day (sailing a Laser is surprisingly tiring) and I only use Python sporadically. But still, the Raspberry Pi just goes through it without any problems; even the OCR-ing is quite fast. The only place where it choked was the creation of two wind roses that use all the historical data (some 80k rows) for each chart. The Raspberry Pi took around 9 minutes to create them whereas the AMD server took less than a minute. So now I need to create a new script that updates those charts on a different schedule, maybe once a week, or the Raspberry Pi would be at 100% CPU most of the time. All in all I am very happy with the performance of the Raspberry Pi, but you do need to adjust your performance expectations.
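
Something like this crontab entry should do for the weekly schedule (the script name is hypothetical; this one runs at 6am every Monday):

0 6 * * 1 cd /home/pi/cvlsc && python windroses.py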

In closing

So this is how my first Raspberry Pi adventure ends. The service has been up for 2 days without any glitches and although the box is warm the CPU is at 55 degrees Celsius (cat /sys/class/thermal/thermal_zone0/temp shows the CPU temperature in millidegrees, so divide the reading by 1,000), which seems OK given the 28+ degrees in this office, and this weather is not going to last…

Starting a support forum

If you want to get started with a company or community forum I believe the best offering out there is Vanilla Forums. I say best not as in the one with the most features or the largest community, but as in the one that is simple and can, hopefully, be grasped in a few hours, including software changes.
The installation is rather simple. Much like any other LAMP application, you need to create the database, copy the files into a folder (I put mine at http://www.gepsoft.com/forum/) and do some configuration. This may seem complex if you are not acquainted with this type of web application, but they provide very good step-by-step instructions.

And then it begins…

Installation is the simple part. It is the forum management that will eat into your time. As soon as your forum hits Google you will be flooded by dozens of bots. These are scripts that create new user accounts and try to post spam in the comments and forum entries, or that post links on their Activity page (more on that below). For me it took a couple of weeks to go from the odd annoying signup to a flood of dozens of new users that needed to be deleted every morning.

It all comes down to the registration facilities. One option, called Basic, is to have the user solve a CAPTCHA and verify their email. That was my first choice because users can start posting immediately, which is very convenient for my real users. Unfortunately it proved way too open for the nasty ones. As it happens, the bots are able to somehow solve Google's CAPTCHA, verify the email and post a bunch of spam very quickly. I don't know how many times they fail, but I got several successes a day. And even before that there was another well hidden attack: they posted spam links on their Activity page, which is a page where you can post comments to your profile (I am not sure what that is for, to be frank).

So I changed to Approval registration, where your users register and are put in an Applicants queue where you manually approve them one by one. Unless you are expecting a flood of users this seems like a reasonable solution. The problem is that Vanilla does not use the CAPTCHA for this type of registration, so you are flooded with dozens of fake applicants a day and it is very easy to miss a real user in the middle of all the garbage. On top of that there is no way to bulk delete users, so it gets old very fast.

The Solution So Far

The good news is that Vanilla supports plugins and there are a number of those dedicated to this problem. But before we install anything there is a small change that stops the Activity spam I mentioned before. When a user signs up in Basic mode (the CAPTCHA one) he is put in the Confirm Email role. While in this role he cannot post to the forum but, for some reason, he can add comments to his Activity page. To avoid this, go to the Admin Dashboard, select the Roles & Permissions tab, edit the Confirm Email role and uncheck all the boxes except Allow Signin. This way users will not be able to post anything until they confirm their email.
Finally, and I think this should be in the base installation, add the BotStop plugin. This is a very simple plugin that adds a custom question to the signup form, such as "how much is 2 plus 1", and it is a life saver. Surprisingly, and I say this in fear, even the default questions stop all the bots from signing up. The only problem I found is that this plugin is not compatible with OpenID, so I had to drop OpenID for the moment, but the bots are eerily silent. I am even contemplating dropping the email confirmation step if this state of affairs does not change.

Concurrency vs. Parallelism

I have watched countless demos of how parallelism can be added to your code by changing a small portion of it or by adding an attribute that magically divides the processing time by the number of cores in your CPU. Usually the demos are prime number calculators or, even worse, sleep statements. These do scale linearly with the number of cores in your machine, but they are not doing anything useful or realistic. The reality of most compute intensive systems is the abundance of shared state. Most CPU bound code operates on one or more big arrays that are allocated at the start of the application and, hopefully, deleted at the end. This assumes we are not talking about functional languages, I should add. After all, scientific computation is traditionally done in Fortran, which is a very procedural language, followed by C and C++, which lend themselves very well to for loops.

When you parallelize a piece of code that uses shared state you get variations of a problem called false sharing, where your cores fight over the same cache lines. I remember, a few years ago, my first efforts to parallelize the algorithms we use in GeneXproTools with OpenMP. How happy was I to see all four cores light up! That did not last very long though. The resulting code ran slower than the original serial, single core code built with the venerable Visual Studio 97 C++ compiler! And the difference was even bigger when compared with serial code compiled with the newer compiler with all the possible optimizations on. Of course this was due to how the algorithm was implemented. After all, the GEP algorithm is one of those embarrassingly parallel problems; as long as you write it from the ground up with that in mind, it should be faster than a serial implementation. Let me say at this point that I don't doubt this: we did implement a parallel-minded version of GEP that is faster, but not convincingly so once other factors such as code complexity were considered.

Use Concurrency Instead

Instead we are looking elsewhere. After all, we have a very fast implementation that has been improved and kept bug free for many years, so we might as well use it. And that is what is happening with the next version of GeneXproServer. GeneXproServer is a batch processor that creates mathematical models in an automated way. It uses a GeneXproTools run as a template and an XML job file with processing instructions (you can learn more about this here if you are interested). Instead of parallelizing the algorithm we are running a number of instances concurrently. We do this by launching different processes which, by definition, share nothing with each other, thus achieving near perfect linearity with the number of cores. That is, for twice the number of cores we are able to perform twice the amount of work. In a world where Random Forests, Model Ensembles and Mini-Batching are dominant this is a big win. We can produce a number of mathematical models that is directly proportional to the number of cores of the machine in use.
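
The same idea in miniature, as a Python sketch (the modelbuilder executable and the job file names are made up; GeneXproServer itself is not written this way): start one independent process per core and let the operating system do the scheduling.

import subprocess
from multiprocessing import cpu_count

# One job file per core; separate processes share nothing,
# so there is no false sharing and no locking to worry about.
jobs = ['job%d.xml' % i for i in range(cpu_count())]
processes = [subprocess.Popen(['modelbuilder', job]) for job in jobs]

# Wait for every worker to finish.
for p in processes:
    p.wait()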

Unfortunately that is not all. There is another form of shared state lurking underneath your code: input/output, either in the form of a database or disk files. As far as I know, mechanical hard drives are quite sequential. Nowadays they have cache memory that helps, but given enough data they have to commit it to disk and that is indeed serial. SSDs, on the other hand, are much more forgiving and go a long way to solve, or at least hide, this problem, and are recommended.

One solution is to avoid writing to disk or database until the end of the computation. There is some risk involved, since a power failure or a bug will interrupt the computation with total loss of the results, but when you can afford it, it is hard to beat the advantages in terms of reduced development time, reduced code complexity and, most of all, scalability and future proofing of your application.
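
In Python terms the pattern is as simple as this sketch (run_job is a stand-in for the real computation):

def run_job(n):
    # Stand-in for a long computation
    return n * n

# Keep all the results in memory while computing
results = [run_job(n) for n in range(1000)]

# One sequential write at the very end instead of a thousand small ones
with open('results.txt', 'w') as f:
    for r in results:
        f.write('%d\n' % r)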


MSTest tests fail in the command line but work fine in IDE

Now that GeneXproTools 5.0 is done and dusted I am pushing forward with the next version of GeneXproServer. One of the outstanding issues was to centralize all the unit and integration tests in a single NAnt file. This is a reasonably complex process because we have tests written in Python, C++ and C#. The C# ones are MSTest unit tests and, according to MSDN, can be run from the command line like this:

mstest /testmetadata:gxps5.vsmdi
or
mstest /testcontainer:gxps5_tests.dll

The first one runs the tests using Visual Studio’s configuration whereas the second one runs all the tests in the final binary. Let’s try the first one:

Loading C:\_svn\gxps5\gxps5.vsmdi...
Starting execution...
No tests to execute.

What do you mean no tests to execute??? Let’s try the second one:

...
Passed gxps5_test.Dataset_Tests.Load_Database_DataSet
Passed gxps5_test.Dataset_Tests.Load_Excel_DataSet
Passed gxps5_test.Dataset_Tests.Load_File_DataSet
Failed gxps5_test.Functions_Tests.Funtion_Add
Failed gxps5_test.Functions_Tests.Funtion_Remove
Failed gxps5_test.Functions_Tests.Funtion_Set
Failed gxps5_test.Job_Tests.Consolidated_Run
Failed gxps5_test.Job_Tests.Fixed_UseSubFolder_Requires_SubFolder_N
...

Wow, I didn't know I had that many failures… in fact I wasn't expecting any. I went back to the IDE, ran the tests and they all passed (!). A quick look around and it seems random: some tests are very simple and fail in the command line, others are complex and pass, and vice versa. So off to Google I went. And indeed the first hit is promising: "Tests pass in IDE but fail with MsTest.exe". Although it gives good clues it is not very detailed.

The solution is to stick with the first example, but there are a couple of configuration steps to perform first. First, go to Visual Studio (2010 in this case) and select the menu Test -> Edit Test Settings -> Local (local.testsettings). Once there, select the Deployment tab and check Enable Deployment. This enables the Add Directory button on the right, which you must use to select the build destination folder of your project. I believe this is required because these tests use a number of unmanaged DLLs and the build folder is not standard. Second, you have to create a list of tests. Select the menu Test -> Create New Test List…, give it a name (All, for example) and press OK. This creates an empty list. Now select All Loaded Tests on the left, select all the loaded tests and drag them to your new list (why is this so fiddly? there must be a better way).

So now the tests run properly using:

mstest /testmetadata:gxps5.vsmdi

I was surprised that it took so long to do this.

The Easiest Way to Run ASP.NET on Linux

Edit: As pointed out in the comments (thanks Brian), turnkey now has a mod_mono stack:  http://www.turnkeylinux.org/asp-net-apache

I am in the process of moving a number of sites from Windows shared hosting to a Linux virtual machine, and one requirement is to be able to run ASP.NET. I need the VM to be quite stable, so I went looking for an easy way to create local virtual machines for testing and experimenting. I really did not want to spend time installing and configuring Linux VMs, and luckily I found the TurnKey Linux project. This is a fantastic collection of ready to use virtual machines and ISO files with pre-installed software stacks such as LAMP servers, WordPress servers, file servers, etc. The project is quite active and has a commercial arm that helps you host the virtual machines with Amazon. They don't have a pre-packaged ASP.NET server, but it is quite easy to create one as you will see.

I started by downloading the LAMP Stack. I chose the VMDK image since my servers have VMware Server installed. This is a small-ish 221 MB zip file. I copied the zip file to the server, expanded it to a folder and added it to VMware Server in the configuration site. As you turn it on and open the console (I am assuming here that you are familiar with VMware Server) the installation will prompt you for passwords for the root account and a few other services. It will also ask you if you want to use TurnKey's backup service, which you can skip, and finally it will ask you if you want to update the machine with the latest patches. You can also skip this step because it will auto update itself later on. The installation then finishes with a confirmation screen.

Setting Up ASP.NET

Open a web browser and navigate to https://192.168.0.113:12320/ (the IP address of my VM), which is a web shell for the virtual machine. Login with the root account and the password you set before and you should land at a shell prompt.

You can install the software from this prompt or you can use an SSH client such as PuTTY. At this point we have a server with Apache 2, MySQL and PHP installed and running. You can navigate to the machine with a web browser and see TurnKey's default page. The next step is installing mod_mono, which is Mono's Apache handler. All TurnKey builds derive from Ubuntu, which is handy because Ubuntu is one of the best documented Linux distributions out there. I followed this page of mod_mono installation instructions and although it is a bit out of date it has all the required instructions. In fact, since it was written for older builds of mod_mono and Ubuntu, it has more steps than the minimum required to serve ASP.NET pages, so I am listing below what is needed as of March 2012:

1. In the prompt above, start by updating the package lists. This will run a few processes and download a few files; answer Yes to any prompts:

apt-get update

2. Next, install mod_mono proper. This will take a while since it downloads Mono, installs it and configures Apache. Again, answer Yes to any prompts:

apt-get install libapache2-mod-mono mono-apache-server2

3. Then enable mod_mono:

a2enmod mod_mono_auto

4. Finally, restart Apache:

/etc/init.d/apache2 restart

And that is it. The last step is to copy the ASPX pages and code files to the virtual machine for testing. Create a simple hello world ASP.NET page and code file, copy them to the root of the webserver, which is at /var/www, and navigate to the page. It should just work.
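
If you don't have one at hand, a single-file page like this is enough to confirm the setup (a hedged example; the file name and label name are arbitrary):

<%@ Page Language="C#" %>
<script runat="server">
    protected void Page_Load(object sender, EventArgs e)
    {
        // Print a greeting and the server time to prove ASP.NET is executing
        lblMessage.Text = "Hello from Mono! It is " + DateTime.Now;
    }
</script>
<html>
  <body>
    <asp:Label id="lblMessage" runat="server" />
  </body>
</html>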

EDIT: It is also possible to install ASP.NET on the bare TurnKey Core distribution. I used the TurnKey Core 12.0 RC ISO file. With it I created a virtual machine accepting all the defaults. After the machine was ready I went through the steps above, but step 4 failed. I rebooted the machine and then ran apt-get install apache2, which solved the problem, and Apache is now happily serving ASP.NET pages. If your VM is short on memory or you just don't need MySQL and PHP then this is a better choice, although slightly more complicated to set up.

P.S. To copy the files to the virtual machine I suggest that you use Filezilla. You can find a very detailed tutorial on using Filezilla to connect to a Linux virtual machine here.

Building a grammar for the R Language

One of the less well known features of GeneXproTools is the ability for users to add support for new programming languages. Out of the box, GeneXproTools can export the models it creates to fifteen or sixteen programming languages, depending on the type of problem. The "secret" to this flexibility is the fact that, internally, GeneXproTools uses Karva notation, a very compact language that is easily converted into a tree. This tree can then be cross-compiled into almost any programming language. As in a traditional compiler, this component of GeneXproTools is divided into a frontend and a backend. The frontend processes the Karva notation as I described above, whereas the backend, assisted by a resource file or grammar, recreates the model in the programming language of choice. Adding a new programming language is as simple as creating a new grammar.

GeneXproTools’ Grammar Concepts

A grammar is simply an XML file that tells GeneXproTools how each function is represented in the language of that grammar. Choosing the javascript grammar as an example, the function power is described as:

<function uniontype="" terminals="2" symbol="Pow" idx="5">Math.pow(x0,x1)</function>

Let's ignore the first attribute for the moment. The second one, terminals, defines the arity of the function (the number of inputs); the third and fourth, symbol and idx, are fixed values used internally by GeneXproTools that cannot be changed. Finally, the inner text of the node is the description of the power function as it will appear in the code. Note that the inputs must always have the format x0, x1… xn-1, where n is the arity of the function. As of this writing each grammar defines 279 functions, so it is a good idea to start the new grammar by copying one that resembles the new language to save time.

Some functions are special and require that the initial parameter, uniontype, be defined. These are the functions used to link the various genes in the model. At the moment there are four such functions per grammar and here is the javascript addition function as an example:

<function uniontype="{tempvarname} {symbol}= {member}" terminals="2" symbol="+" idx="0">(x0+x1)</function>

It is quite simple. The uniontype contents match the beginning of each line of code in a model. A typical model in javascript would be:

function gepModel(d)
{
    var vTemp = 0.0;

    vTemp = (d[0]*((Math.sqrt(d[3])+d[2])+(Math.sqrt(d[1])/d[2])));
    vTemp += ((d[2]*d[3])*Math.sqrt((Math.pow(d[3],3)*(d[1]/d[2]))));
    vTemp += (d[2]*(((d[3]*d[3])*(d[1]-d[3]))+((d[3]/d[1])/d[1])));

    return vTemp;
}

The uniontype generates the vTemp += prefix at the start of the second and third assignment lines above. When we translate the same model to MATLAB the result will be:

function result = gepModel(d)

varTemp = 0.0;

varTemp = (d(1)*((sqrt(d(4))+d(3))+(sqrt(d(2))/d(3))));
varTemp = varTemp + ((d(3)*d(4))*sqrt(((d(4)^3)*(d(2)/d(3)))));
varTemp = varTemp + (d(3)*(((d(4)*d(4))*(d(2)-d(4)))+((d(4)/d(2))/d(2))));

result = varTemp;

In this case the corresponding uniontype is defined as such:

<function uniontype="{tempvarname} = {tempvarname} {symbol} {member}" terminals="2" symbol="+" idx="0">(x0+x1)</function>

So tempvarname is the variable that accumulates the model's results, symbol is the operator used to link the genes, and member represents the body of the gene. Again, if you choose a grammar that resembles the new language as a starting point, you can pretty much leave these attributes unchanged.

Another important aspect of the grammars is the "helper functions". These are functions that require a special implementation for that language. For example, Visual Basic does not have a native Mod function, so we have to define it as a callable or helper function. In this case the function is defined as:

<function uniontype="" terminals="2" symbol="Mod" idx="4">gepMod(x0,x1)</function>

And the function gepMod is defined in the helpers section of the grammar as such:

<helper replaces="Mod">Function gepMod(ByVal x As Double, ByVal y As Double) As Double{CRLF}{TAB}gepMod = ((x / y) - Fix(x / y)) * y{CRLF}End Function{CRLF}</helper>

As you can see there are special characters to help with formatting the body of the function. The x character is also reserved and should be replaced with {CHARX}.

There are several other aspects to the grammars that I am not going to cover in this blog entry but if you get stuck contact me either here at this blog or through Gepsoft’s support.

Grammar Functions

The first thing to do is choose an existing grammar as a starting point, and R is similar to both javascript and Visual Basic (at least from a grammar building point of view). I ended up selecting the latter because R's power operator matches Visual Basic's. The GeneXproTools grammars live in the folder C:\Program Files (x86)\GeneXproTools 43\grammars\ and there are two types: the Math grammars and the Boolean grammars. The Boolean grammars are used to generate code for Logic Synthesis models, whereas the grammars named Math are used for all other model types. In this post I am not covering the Boolean grammars (although they are basically the same), so I started by duplicating the file vb.Math.00.default.grm.xml and renaming it to r.Math.00.default.grm.xml. The second step is to open the file in a text editor such as Notepad (I suggest Notepad2 because it colorizes the contents nicely and validates the XML while you write) and change the grammar's first node to:

<grammar name="R Language" version="4" ext="r" type="">

If you now start GeneXproTools, open a run and go to the Model Panel you will find that the R Language entry was added to the Languages list:

R Language in GeneXproTools

The next step is quite labour intensive and entails translating all the functions from Visual Basic to R. First we start by translating the uniontypes. The only difference here is the equals sign, so R's uniontypes will have this format:

uniontype="{tempvarname} &lt;- {tempvarname} {symbol} {member}"

Note that the less-than symbol must be encoded, otherwise the grammar would not be a valid XML file. With this done we jump into the list of functions. Many functions can be left untouched but others must be translated to their R equivalents. Most of the time it is a matter of small differences, for example "Log" translates to "log", but others are more complex, such as the 3Rt function, which requires a helper function in Visual Basic but can be expressed with an inline if/else in R. This process is error prone and it is a good idea to open the grammar every now and then in a browser that validates the XML, such as Internet Explorer. Also, some judicious search and replace can greatly reduce the burden of hand editing each function.
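
As an example of a straightforward translation, the power function shown earlier in the javascript grammar would end up like this in R (only the inner text changes; the attributes are the fixed internal values):

<function uniontype="" terminals="2" symbol="Pow" idx="5">(x0^x1)</function>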

Another major part of the work involves translating the helper functions from Visual Basic to the R Language. These functions are under the helpers node and are mostly one or two liners. The only odd bit is the layout rules: whenever you need to insert a tab you must add {TAB}, and to add a new line you have to use {CRLF}. In most cases you do not need to worry about the layout of the helpers unless you want to prettify the code. Interestingly, the R grammar is probably the grammar with the fewest helpers of all.

Finalizing the Grammar

After all the functions and helpers have been translated we are left with a few loose ends to fix. Firstly let’s look at the “headers” node. The header corresponds to the model’s function declaration, for example:

<header type="default" replace="no">Function gepModel(ByRef d() As Double) As Double{CRLF}</header>

Which must be changed to:

<header type="default" replace="no">gepModel &lt;- function(d){CRLF}{</header>

There are two entries in the headers: the first one is the generic case and the second one is specific to Classification runs that require the declaration of a variable called ROUNDING_THRESHOLD. Again, it is a simple case of changing the bits that are different in R while maintaining the same semantics of the code.

The next node that needs a bit of tweaking is the randomconst node. It encodes the declaration of the random constants used in the model; they are declared as constants (Const) in Visual Basic but in R they are simple variables.

<randomconst type="default" replace="no">{TAB}Const {labelname} As Double = {labelindex}{CRLF}</randomconst>

Changes to:

<randomconst type="default" replace="no">{TAB}{labelname} &lt;- {labelindex}{CRLF}</randomconst>

The constants node is very similar to the previous one and only requires the same kind of adjustment: the Const {labelname} As Double = {labelindex} declaration becomes a plain {labelname} &lt;- {labelindex} assignment.

Similar changes must be applied to the footers node.

Final Details

The last nodes needed to complete the grammar are parenstype, where a value of 1 means square brackets and 0 means normal brackets; commentmark, which must be set to # for the R Language; and startindex, which is the lower bound of a list or an array (1 in this case, since R is 1-based).

These adjustments bring the grammar very close to complete and are enough for it to work correctly. The next step is testing the grammar, which is a rather more involved process that entails creating models with all the functions and testing them with different sets of data to ensure that the results of the grammar generated code are as close as possible to the native processing of GeneXproTools.

A GeneXproTools model converted to the R Language

Finally, you can download the grammar described in this post from here and copy it into the grammars folder under your installation of GeneXproTools. I hope you found this post useful and if you have any questions just post them in the comments below.

Raspberry Pi Is The Software Appliance

The Raspberry Pi is not the first cheap ARM based computer in the market, but a combination of laudable objectives with some very good technology and marketing is making it an instant hit. What I find intriguing is the possibility of it becoming a viable software platform. If it does, it will be the first time we will be able to deliver desktop-like software pre-installed on a computer with an overhead of maybe $50. No more deployment woes, and the customer actually receives something with the purchase. Now, this computer appears to be roughly as powerful as what we had ten years ago, and even though this may sound bad, many applications can survive just fine within these constraints. For example, a very large number of Point Of Sale (POS) computers could be replaced with the Raspberry Pi. Other more extreme applications may call for disposable computers; weather balloons and the surging amateur space projects come to mind.

But there are other possibilities. What about combining a Raspberry Pi with a 3D printer? Instant software appliance manufacture for a few hundred dollars. Use the printer to create modular parts, each containing a Raspberry Pi. Each part plays a specific role and slots into the others, forming a composable system. How is that for geek overload?

A bit more down to earth are the many thousands of horizontal software packages created for specific machines such as industrial robots, presses, etc. The norm now is to ship a complete computer with some software pre-installed at the cost of a few hundred dollars. As soon as a monitor with an embedded Raspberry Pi is available, these dust gathering boxes will go away.

This poses the interesting question of what will be the best framework to build all this software on. The Raspberry Pi runs a number of Linux distributions, but with 256 MB of memory it will be interesting to see how anything above C and C++ behaves performance wise. In my opinion, we need the likes of Java and Mono to deliver good enough performance. And good enough performance for me means matching the experience we had back in 1997 with Visual Basic applications. At this level it is possible to create the type of software I mentioned above within short timeframes and small budgets. If we need to resort to C or C++ for the software then we will lose the rapid application development side of things. That opens another can of worms: the tooling. We need a simple to use IDE and build tools. Ideally we would develop the applications on a Mac or PC using an IDE such as MonoDevelop or Eclipse and test them in an emulator or on the board itself, very much like what we do for mobile development. Another interesting possibility is Android being ported to the Raspberry Pi. This would solve the software stack problem as long as the performance was there. It is still early days and there is a lot of work to be done on the software side, but if the Raspberry Pi succeeds we may look back at 2012 as the year when a new paradigm was born.

Edit: It just started!

MS Chart workaround (1 of n)

As I wrote before, I am using the MS Chart control for the more conventional charts in the development of the next version of GeneXproTools. One particularly weak feature of this chart is the zoom, which is not very useful, has very cheesy scrollbars and also has a few bugs here and there. One I came across recently: when zooming in, the vertical line of the zoom cursor is not deleted when you stop dragging and release the mouse. It just stays there, even if you zoom out completely. I don't know if this is a side effect of hosting a Win32 component in WPF or if it is a bug, but nevertheless I managed to come up with a very simple workaround. Maybe a bit too hacky for the more retentive, but effective.

The solution was to handle the mouse down and mouse up events of the chart control and toggle the cursor colour between transparent and red (its default colour). Here is the XAML snippet:

<Charting:Chart
    x:Name="chart1"
    MouseDown="Chart1MouseDown"
    MouseUp="Chart1MouseUp" />

And here are the handlers:

private void Chart1MouseDown(object sender, MouseEventArgs e)
{
    // Make the zoom cursor lines visible only while the left button is dragging
    if(e.Button == MouseButtons.Left)
    {
        var cursorX = chart1.ChartAreas[0].CursorX;
        cursorX.LineColor = Color.Red;
        var cursorY = chart1.ChartAreas[0].CursorY;
        cursorY.LineColor = Color.Red;
    }
}

private void Chart1MouseUp(object sender, MouseEventArgs e)
{
    // Hide the cursor lines again so the leftover zoom line is invisible
    var cursorX = chart1.ChartAreas[0].CursorX;
    cursorX.LineColor = Color.Transparent;
    var cursorY = chart1.ChartAreas[0].CursorY;
    cursorY.LineColor = Color.Transparent;
}

I could not find a more elegant solution but this seems to work fine. The chart is a Scatter Chart and so far this is the only workaround. Let’s hope that the n in the title remains low.

And the best WPF chart is…

One of the main features of the upcoming version of GeneXproTools is data visualization. Some of our charts, usually the more innovative ones, are home-brewed, but there is no point in rolling out yet another scatter chart when there are so many free and paid high quality alternatives. Or so I thought.

We decided a while ago to migrate the front end of GeneXproTools to WPF, a decision we are so far happy with, so logically we should pick a WPF chart component. After some searching I found this thread at stackoverflow detailing most free and paid alternatives, but as soon as you dig into them you will find they fall into three groups:

  • Very expensive charts
  • Very slow charts
  • Non WPF charts

Of course there are non WPF charts that are very expensive (and possibly slow), but the point here is that I could not find a reasonably priced WPF chart able to show a few tens of thousands of points in scatter mode without stuttering or loading delays. The WPF chart that came closest was Microsoft's Dynamic Data Display, which has an interesting API, but in the end it was dropped due to UI inconsistencies and because it looks like a dead project. The reasonably priced paid alternatives were a disappointment. Some have good performance characteristics but their focus is different from ours: they are very quick when redrawing a limited number of points, but they take noticeable wall-clock time to redraw large numbers of points (in the tens of thousands ballpark).

Well, then what is the alternative? The alternative is to go back in time and use the Microsoft Chart for WinForms. It draws tens of thousands of points, if not hundreds of thousands, without breaking a sweat, includes many different chart types, including a FastPoint type and even less mainstream ones such as a simple histogram, and it is free. The chart itself was licensed to Microsoft by Dundas, which appears to have dropped it. That is a pity because it included a very nice configuration tool that is not available with Microsoft's version. The downside is that you cannot use window transparency in the application and there is some flickering when the chart is loaded for the very first time, but it seems to be a fair compromise.

Minifying WPF. The MiniGrid.

We have all been composing Html by hand since Mosaic or Netscape hit the shelves late in the last century. Since then we have had a love and hate relationship with it. We love it because it is responsive and easy to compose by hand. We hate it because we have to do it by hand. But let's drop Html for the moment since we have a newer and worse offender: WPF's Xaml. Xaml is verbose and WPF's Xaml is up there with the RSI gods. That is because, I guess, it was designed to be machine generated by Blend or Visual Studio, but these two designers fall short of the mark in so many ways that it feels like 1997 all over again, with Blend for FrontPage and Visual Studio for Visual InterDev. No disrespect to the Visual Studio team implied; it is a hard problem to solve.

Meanwhile I got tired of writing RowDefinitions, ColumnDefinitions and VisibilityConverters by hand and decided to do something about it. That is how the MiniGrid was born. It is not a "minification" in the strict Wikipedian sense of the word but an extension to WPF's Grid. For this release I added three properties: Visible, RowsPattern and ColumnsPattern. Visible is a Boolean that flips Visibility between Visible and Collapsed, removing the need for converters when binding to properties such as IsChecked or IsEnabled. The other two are a bit more complex.
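
For example, a hypothetical binding that collapses the grid whenever a CheckBox named chk is unchecked, with no converter in sight:

<Controls:MiniGrid Visible="{Binding IsChecked, ElementName=chk}" />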

RowsPattern and ColumnsPattern

These two properties replace the need to define Grid.RowDefinitions and Grid.ColumnDefinitions. Instead, the definitions are expressed inline as a pattern. The pattern can either be a series of "a" or "*" characters, for Auto and Star, or a series of numbers for pixel sizes. The definition of a grid with three rows and three columns could be:

<Controls:MiniGrid RowsPattern="aaa" ColumnsPattern="a*a"/>

This grid has three Auto rows, the first column is Auto, the second is Star and the third is Auto again. A definition with pixels would be:

<Controls:MiniGrid RowsPattern="20 25 20" ColumnsPattern="100 100"/>

In this example the Grid will have three rows with 20, 25 and 20 pixels from top to bottom and two columns both 100 pixels wide.

The interesting bit is that the patterns work with the designer in both Visual Studio 2008 and 2010 Beta 2 and also in Blend 3, where the patterns can be changed in the property tab. The main limitation is that you cannot mix Auto/Star with numbers; I will remove this restriction in a future version if I ever come across the need. The code is very simple: MiniGrid derives from Grid and adds three dependency properties whose value changes are intercepted.
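
Here is a minimal sketch of how one of them, RowsPattern, can be implemented. This illustrates the approach rather than reproducing the downloadable code; the parsing details are assumptions.

using System.Windows;
using System.Windows.Controls;

public class MiniGrid : Grid
{
    public static readonly DependencyProperty RowsPatternProperty =
        DependencyProperty.Register("RowsPattern", typeof(string), typeof(MiniGrid),
            new PropertyMetadata(null, OnRowsPatternChanged));

    public string RowsPattern
    {
        get { return (string)GetValue(RowsPatternProperty); }
        set { SetValue(RowsPatternProperty, value); }
    }

    private static void OnRowsPatternChanged(DependencyObject d, DependencyPropertyChangedEventArgs e)
    {
        var grid = (MiniGrid)d;
        grid.RowDefinitions.Clear();
        var pattern = (string)e.NewValue;
        if (string.IsNullOrEmpty(pattern)) return;

        if (char.IsDigit(pattern[0]))
        {
            // Pixel pattern such as "20 25 20": one fixed-height row per number
            foreach (var size in pattern.Split(' '))
                grid.RowDefinitions.Add(new RowDefinition { Height = new GridLength(double.Parse(size)) });
        }
        else
        {
            // Auto/Star pattern such as "aaa" or "a*a": one row per character
            foreach (var c in pattern)
                grid.RowDefinitions.Add(new RowDefinition
                {
                    Height = c == 'a' ? GridLength.Auto : new GridLength(1, GridUnitType.Star)
                });
        }
    }
}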

You can download the code from here.