A collection of computer systems and programming tips that you may find useful.
 
Brought to you by Craic Computing LLC, a bioinformatics consulting company.

Wednesday, May 30, 2012

rbenv - quick start

rbenv is a way to handle multiple versions of Ruby on a single machine and is an alternative to rvm.

I have been using rvm for some time and it works fine, but rbenv seems to have a slight edge in terms of simplicity and, perhaps, transparency in the way it handles version-specific gems.

I installed it on a new machine with Mac OS X Lion using Homebrew:
$ brew update
$ brew install rbenv
$ brew install ruby-build
The only wrinkle was that building Ruby requires a regular installation of the gcc compiler, whereas Xcode installs llvm-gcc. You can find a suitable version at https://github.com/kennethreitz/osx-gcc-installer

You can see which versions of Ruby are available to install using
$ rbenv install --list
and which ones you have already installed using
$ rbenv versions
You can install them like this
$ rbenv install 1.9.3-p0
$ rbenv install 1.8.7-p302
$ rbenv install jruby-1.6.7
Not surprisingly, JRuby required that Java be installed but it was smart enough to trigger the download automatically.

You need to add two lines to your .bashrc, .bash_profile or whatever config file is appropriate for your preferred shell. Then create a new shell to pick up the definitions.

export PATH="$HOME/.rbenv/bin:$PATH"
eval "$(rbenv init -)"
The documentation is pretty straightforward in terms of setting different versions of Ruby in different projects.
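As a quick check that a project directory picks up the Ruby you expect, a short script can report which interpreter is running it. This is only a minimal sketch and the variable names are just for illustration.

```ruby
require "rbconfig"

# Engine name (ruby, jruby, ...); RUBY_ENGINE may be missing on very old Rubies.
engine = defined?(RUBY_ENGINE) ? RUBY_ENGINE : "ruby"

# Full path to the interpreter binary, taken from its build configuration.
interpreter = File.join(RbConfig::CONFIG["bindir"],
                        RbConfig::CONFIG["ruby_install_name"])

description = "#{engine} #{RUBY_VERSION} (#{interpreter})"
puts description
```

Under rbenv the path should point somewhere inside rbenv's own directory for the selected version rather than at the system Ruby.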


Things get a bit confusing when it comes time to set up gems. rvm can keep separate named sets of gems (gemsets) for each Ruby version. By and large, rbenv doesn't get involved in gem management.

With rbenv you install gems with the regular gem command, just as you would with a single Ruby installation, and they go into the directory of whichever Ruby version is active. For a given project you specify the required gems in a Gemfile in the project directory and you run

$ bundle install

to make sure that you have correct versions of all the gems for that given project. You can then run a command against exactly those versions by prefixing it with bundle exec.
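To see where gems for the active Ruby actually end up, you can ask RubyGems directly. A minimal sketch, using only Gem.dir and Gem.path from RubyGems; the variable name is illustrative.

```ruby
require "rubygems"

# Directory where `gem install` puts gems for the Ruby running this script.
gem_dir = Gem.dir
puts gem_dir

# Every directory this Ruby searches when it loads a gem.
Gem.path.each { |dir| puts dir }
```

Running it under two different Ruby versions is an easy way to check whether they share a gem directory.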
If you have installed a gem that provides a command line executable but find that the command does not work or does not appear to exist, then run this to rebuild rbenv's shims and, hopefully, fix the issue.


$ rbenv rehash



Tuesday, February 21, 2012

Running Ruby executable in a world-writable directory

Just now I needed to run a Ruby executable that was located in a world-writable directory.

Normally this is something that you never want to do - anyone could put an executable into your directory  and/or replace your script with their own. BUT sometimes you just need to do this - in my case I am in a secure environment, working with collaborators but we have not been able to get IT to set us up as a 'group' as yet.

The problem is that Ruby will issue a warning about "Insecure world writable directory" every time you run your script.
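To see which directories are likely to be behind the warning, something like the following can list the world-writable ones. This is a sketch that assumes the directories of interest are those on your PATH plus the current directory; world_writable_dirs is a made-up helper name.

```ruby
# Return the directories from `dirs` that exist and are writable by everyone.
# `world_writable_dirs` is a hypothetical helper name used for this sketch.
def world_writable_dirs(dirs)
  dirs.select { |dir| File.directory?(dir) && File.world_writable?(dir) }
end

# Assumption: PATH entries and the current directory are the places to check.
path_dirs = ENV.fetch("PATH", "").split(File::PATH_SEPARATOR)
puts world_writable_dirs(path_dirs + [Dir.pwd])
```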

The real solution is to get past the need for those permissions. But in the short term I just want to get rid of the warnings.

There are two options:

You can pass the -W0 (that's a zero) flag to ruby
$ ruby -W0

But I use this 'hash bang' line at the start of all my scripts:
#!/usr/bin/env ruby
and I can't figure out how to pass the flag correctly in this case

The alternative is to add this line right after the hash bang line
$VERBOSE = nil

That does work. It clearly silences any other warnings that I might want to see but at least it gets me past my current issue.
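One way to limit the damage is to silence warnings only around the code that triggers them and then put the old setting back. A minimal sketch, assuming the warning is raised inside the wrapped block; quietly is a made-up helper name, not part of Ruby.

```ruby
# Run a block with Ruby warnings switched off, then restore the old setting.
# `quietly` is a hypothetical helper name, not part of Ruby itself.
def quietly
  saved = $VERBOSE
  $VERBOSE = nil
  yield
ensure
  $VERBOSE = saved
end

answer = quietly do
  warn "this warning is suppressed"
  6 * 7
end
puts answer
```

The ensure clause restores $VERBOSE even if the block raises, so warnings elsewhere in the script still show up.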


Wednesday, February 15, 2012

Compiling Ruby 1.9.3 from source

I had to install Ruby 1.9.3 from source today.

Ruby comes with a load of documentation but I never use it. It's just easier to look it up on the web. But creating and installing that documentation takes a huge amount of time.

To disable it pass this option to 'configure'
$ ./configure --disable-install-doc

The other issue I had to deal with was that libyaml was installed in a non-standard location. With some libraries you can pass a specific option to 'configure' but not here... Instead you have to add the include and lib directories to environment variables before running 'configure', like this:
$ export CPPFLAGS="-I/my/location/include"
$ export LDFLAGS="-L/my/location/lib"

'configure' will pick those up without any additional arguments.
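Once the build is installed, a quick way to confirm that YAML support made it in is to round-trip a small structure with the new Ruby. This is just a sketch and the sample data is made up.

```ruby
require "yaml"

# Round-trip a small structure through the YAML library.
data = { "name" => "test", "values" => [1, 2, 3] }
round_trip = YAML.load(YAML.dump(data))
puts round_trip == data

# Psych is the libyaml-backed parser; report the libyaml version if it loaded.
puts "libyaml #{Psych::LIBYAML_VERSION}" if defined?(Psych::LIBYAML_VERSION)
```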



Friday, January 27, 2012

Getting absolute paths with the Unix ls command

The standard Unix command ls lists filenames and directories in the specified directory. The default behaviour is to list just the filenames as including the full pathname would clutter the screen.

But sometimes you want the absolute paths. I need this all the time if I want to create a file containing a list of filenames. The obvious command to get all the YAML files, for example, is:
$ ls -1 *yml
A.yml
B.yml

In order to get the full pathnames you need to use this:
$ ls -1 -d "$PWD"/*yml
/home/jones/A.yml
/home/jones/B.yml
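If the list is going to be consumed by a Ruby script anyway, the same result can be had without the shell. A minimal sketch; absolute_paths is a made-up helper name and the pattern is the same one used above.

```ruby
# Return sorted absolute paths of entries in `dir` matching a glob pattern.
# `absolute_paths` is a hypothetical helper name used for this sketch.
def absolute_paths(pattern, dir = Dir.pwd)
  Dir.glob(File.join(File.expand_path(dir), pattern)).sort
end

puts absolute_paths("*yml")
```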







Wednesday, January 25, 2012

Wolfram CDF Player and Chrome Browser

Wolfram Computable Document Format (CDF) is a way to embed interactive documents into web pages, in particular those that perform calculations in response to a user changing the parameters. For example you can create graphs of functions that will change as the function is modified. It is an extension of the Wolfram Mathematica software.

It looks really promising for some applications and you should check out their demonstrations - some impressive, some not so much.

You 'play' CDF files using a browser plugin - just like Flash - and these are available for all the current browsers.

I'm running it on Google Chrome on Mac OS X on a fairly recent laptop. It performs OK depending on the specific application and the amount of data it is asked to push around. But when you close that window or move to another page the CDF player process continues to run. In my case that was taking 5% of my cpu and 66MB of memory and it continued to do so for perhaps 10 minutes after the page was closed.

This sort of drain on your cpu, due to the plugin, is fairly common - just look at everything going on in Activity Monitor when you are browsing an 'active' web page with ads, etc.

In Chrome you can go to Window -> Task Manager, select a process and End it - but that didn't appear to do anything in my case.

CDF looks very interesting but if it requires too many resources, and then fails to release them properly, then it is not likely to be broadly adopted. It is something to keep an eye on, for sure.


Wednesday, January 4, 2012

Disabling Spotlight (mds) on Mac OS X (Snow Leopard)

I run a lot of command line scripts on my laptop - some of which can run for hours. I want to continue using the machine for reading mail, etc., but I don't want any other intensive task sucking up the cpu cycles. So I shut down iTunes, don't watch any videos, etc.

But sometimes I see some other process taking all my cycles. The odds are that it is either something to do with Flash or it is a process called mds.

mds is the indexing software that powers Spotlight - the built in Mac search facility.

I suspect that when I'm generating gigabytes of data and hundreds of files in one of my compute jobs, mds is responding by trying to index them at the same time.

I don't use Spotlight at all, so let's turn it off and see if that helps.

This turns it off:
$ sudo mdutil -a -i off 

This turns it back on:
$ sudo mdutil -a -i on

Turning it back on will presumably trigger a big mds run as it plays catch up, so run this command only when you can afford the cycles.


Thursday, November 17, 2011

strsplit in R

The strsplit function in the R statistics package splits a string into a list of substrings based on the separator, just like split in Perl or Ruby. The object returned is a List, one of the core R object types. For example:
> a <- strsplit("x y z", ' ')
> a
[[1]]
[1] "x" "y" "z"
> class(a)
[1] "list"

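If you are not that familiar with R, like me, the obvious way to access an element in the list will not work. Single brackets return a sub-list rather than the element itself, and the list has only one entry (one per input string):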
> a[1]
[[1]]
[1] "x" "y" "z"
> a[2]
[[1]]
NULL

So what do you do? There seem to be two options:

1: You can 'dereference' the element (for want of a better word) by using double brackets to pull out the vector and then single brackets to index into it
> a[[1]][1]
[1] "x"
> a[[1]][2]
[1] "y"
... but I'm not going to write code that looks like that !!

2: You can unlist the List to create a Vector and then access elements directly
> b <- unlist(a)
> b
[1] "x" "y" "z"
> class(b)
[1] "character"
> b[1]
[1] "x"
> b[2]
[1] "y"
Much nicer !
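For anyone coming from Ruby, a rough analogy may help explain the extra layer: strsplit hands back one result per input string, much like mapping split over an array of strings. This is only an illustration of the shape of the data, and note that Ruby indexes from 0 where R indexes from 1.

```ruby
# One result per input string, much like R's strsplit over a character vector.
inputs = ["x y z"]
nested = inputs.map { |s| s.split(" ") }

first_pieces = nested.first   # roughly a[[1]] in R
flat = nested.flatten         # roughly unlist(a) in R

puts first_pieces[1]
puts flat[1]
```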




