Craic Computing Tech Tips: February 2011

Thursday, February 24, 2011

Rails, UTF-8 and Heroku

I've had problems with Ruby character encodings over the years, especially when pulling text with non-ASCII characters in from remote sites. I thought I had it mostly sorted but the past few days showed me that was not the case.

I have a MySQL database on my local machine and a Rails 3 app that pulls in text from remote sources, stores it in the database and does stuff with it. I am deploying the application on Heroku prior to public release. Heroku uses PostgreSQL exclusively.

I was under the belief that all the components in my system were set up to use UTF-8 encoding and therefore moving text with non-ASCII characters around should be fine. But in practice that was not the case - characters like 'α' that looked fine on my local machine showed up as 'Î±' on Heroku, etc. So clearly I was doing something wrong. Rather than go into all the gory details, this is the way to do it right...
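For anyone wondering where 'Î±' comes from, here is a minimal sketch in Ruby 1.9 or later. It assumes the usual cause: the UTF-8 bytes of the character are read as if they were Windows-1252, and the variable names are just for illustration.

```ruby
# encoding: UTF-8

original = 'α'   # stored as the two UTF-8 bytes 0xCE 0xB1

# Relabel the same bytes as Windows-1252 (no bytes change), then convert
# that misreading back to UTF-8, which is what a confused component does
misread = original.dup.force_encoding('Windows-1252').encode('UTF-8')

puts misread
```

Each of the two bytes becomes its own character, which is why one Greek letter turns into a pair of accented Latin ones.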

Bottom-line: Make everything use UTF-8 explicitly ... EVERYTHING

0: Back up your current database!

1: MySQL
Here are the contents of my /etc/my.cnf file:

[mysqld]
datadir=/usr/local/mysql/data
bind-address = 127.0.0.1
character-set-server = utf8
max_allowed_packet = 32M
[client]
default-character-set = utf8
[mysql]
max_allowed_packet = 32M

Even if your tables are held in utf8, you should add these lines. You want the following mysql command to look as shown:

mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   |
| character_set_connection | utf8   |
| character_set_database   | utf8   |
| character_set_filesystem | binary |
| character_set_results    | utf8   |
| character_set_server     | utf8   |
| character_set_system     | utf8   |
+--------------------------+--------+

2: Rails
I'm using Rails 3 - can't tell you how this works in Rails 2.x
a: In config/application.rb make sure this line is uncommented:

    # Configure the default encoding used in templates for Ruby 1.9.
    config.encoding = "utf-8"

MySQL and Rails use different variants of utf8/utf-8 - make sure you are using the right one. And note the comment above this line - this sets up utf-8 encoding for templates ONLY.
b: In your database.yml, specify the encoding for the databases - for example:

development:
  adapter:  mysql2
  host:     localhost
  encoding: utf8
  [...]

Here you are telling the database adapter that the database uses utf-8.
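If you have several environments in database.yml it is easy to miss one. A rough sketch of a check, using only Ruby's standard YAML library - the helper name is my own invention and the inline sample stands in for your real file:

```ruby
require 'yaml'

# Hypothetical helper: names of the environments whose settings
# do not declare a utf8 encoding
def environments_missing_utf8(config)
  config.reject { |_env, settings| settings['encoding'].to_s =~ /\Autf-?8/i }.keys
end

# Inline sample standing in for config/database.yml
sample = YAML.load(<<-YML)
development:
  adapter:  mysql2
  encoding: utf8
test:
  adapter:  mysql2
YML

p environments_missing_utf8(sample)
```

With the sample above, only the 'test' environment is reported.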
c: mysql2
Notice that I am using the mysql2 adapter instead of mysql. At this point (Feb 2011) the mysql gem is NOT encoding aware. Replace mysql with mysql2 in your Gemfile and run bundle.
d: In each Model that uses text add this line at the very top of the file:

# encoding: UTF-8

This tells Ruby that we're using utf-8 in this model. I don't see a way to set this at the application level so you have to add it to every relevant model .rb file. I also don't like defining something with a comment line, and I can't see how to define this in, say, an irb interactive session.
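When in doubt about what Ruby thinks a string is, you can ask it. A small sketch for Ruby 1.9 or later, using only core String and Encoding methods:

```ruby
# encoding: UTF-8

s = 'α'

puts s.encoding                  # the encoding tag attached to the literal
puts s.bytes.to_a.inspect        # the raw bytes behind the character
puts s.valid_encoding?           # are those bytes well-formed for that tag?
puts Encoding.default_external   # what Ruby assumes for data read from outside
```

If the tag is not UTF-8, or valid_encoding? is false, then something upstream is not set up the way you think it is.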

With all those components in place, you should be all set. Try entering non-ASCII characters into a form - such as accented characters or greek/math symbols. These should be displayed correctly in the browser and in the mysql command line client.

With regards to Heroku, assuming you have your app already set up, you should be able to do a 'heroku db:push' to copy the database into PostgreSQL on Heroku and the characters should display correctly on the remote pages. You will see reference to using 'heroku db:push' with explicit database URLs that include an encoding option, such as '?encoding=utf8'. If your MySQL is set up correctly then this should be unnecessary.

A critical part of running apps on Heroku is the ability to pull the database back to your local database using 'heroku db:pull'. Before getting all my components set up with utf-8, this step failed for me. With everything using utf-8, and after adding the 'max_allowed_packet' lines to my my.cnf file, this process works fine.

But because I was working with data before everything was truly utf-8, I had some instances of text in the database that had been incorrectly encoded - and thereby effectively corrupted. I could see what the 'corrupt' characters looked like and I knew what the correct versions should be. Because everything is now using utf-8 I could simply do a substitution on the text. For example:

str.sub!(/ÃŸ/, 'β')

I gathered up the character mappings that I needed (which were not many in my case) and wrote up a class method that I cloned in each model with the issue. I then ran those in the Rails console to correct the bad characters. The method is:

  def self.make_utf8_clean
    mappings = [  ['Î±', 'α'],
                  ['ÃŸ', 'β'],
                  ['Î²', 'β'],
                  ['â€™', '’'],
                  ['â€œ', '“'],
                  ["â€\u009D", '”'],
                  ['â€', '”'],            
                  ['Ã¶', 'ö'],
                  ['Â®', '®']
              ]
    # Get the list of String columns
    columns = Array.new
    self.columns.each do |column|
      if column.type.to_s == 'string'
        columns << column
      end
    end
    
    # Go through each object -> column against all mappings
    self.all.each do |obj|
      columns.each do |column|
        mappings.each do |mapping|
          value = obj.attributes[column.name]
          if value =~ /#{mapping[0]}/
            s = value.gsub(/#{mapping[0]}/, "#{mapping[1]}")
            obj.update_attribute(column.name.to_sym, s)
          end      
        end
      end
    end
  end

This looks at your model and figures out which columns are of type String. It goes through all records and all character mappings, replacing text and updating the database as needed. Your mappings array could be much larger. There may be a better source of these, but this is a start.
You run this in a rails console like this:

Loading development environment (Rails 3.0.4)
ruby-1.9.2-p0 > YourModel.make_utf8_clean

It's a hack but it helped me 'fix' quite a few records that would have been a pain to recreate.
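A different approach from the mapping table, sketched here under the assumption that the damage came from UTF-8 bytes being read as Windows-1252, is to reverse that misreading directly. This is not the method I used above and the method name is made up:

```ruby
# encoding: UTF-8

# Hypothetical helper: undo a UTF-8-read-as-Windows-1252 mix-up.
# Returns the original text whenever the reversal is not clean.
def repair_mojibake(text)
  # Turn each character back into the single byte it came from,
  # then relabel those bytes as the UTF-8 they originally were
  repaired = text.encode('Windows-1252').force_encoding('UTF-8')
  repaired.valid_encoding? ? repaired : text
rescue EncodingError
  # Raised when the text has characters with no Windows-1252 byte,
  # for example a string that is already correct
  text
end

puts repair_mojibake('Î±')
puts repair_mojibake('Ã¶')
puts repair_mojibake('α')
```

It only helps when the whole string was corrupted the same way. Strings that mix good and bad characters are returned unchanged, so the mapping table is still useful for those.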

Character encodings are HARD - Yehuda Katz wrote a nice article on the issues. For most purposes (unless you work with Japanese text) UTF-8 is your best choice for encoding and so I'm using it exclusively. Java and Python both made the same choice and things are probably easier to set up in those worlds. Ruby has its roots in Japan and so it is not surprising that it could not go down that path.

From now on, I'm going to make sure everything I touch is configured for UTF-8. There are few reasons not to at this stage and it allows you to handle most languages.

Wednesday, February 23, 2011

Tables named with reserved words in MySQL

Trying to fix a string encoding issue in a MySQL database I realized that one of my tables was called the same as a MySQL reserved word - specifically I have a table called 'references'.

This was created from a Rails app and I have been using this with no problems for a few months, so you can use reserved words, or at least that one.

Problem is that direct SQL statements in the MySQL client like this don't work:

mysql> describe references;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that 
corresponds to your MySQL server version for the right syntax to use near 'references' at line 1

The solution is to prefix the table name with the database name like this:

mysql> describe mydb.references;
+------------------+--------------+------+-----+---------+----------------+
| Field            | Type         | Null | Key | Default | Extra          |
+------------------+--------------+------+-----+---------+----------------+
| id               | int(11)      | NO   | PRI | NULL    | auto_increment |
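The other standard way to use a reserved word as a name in MySQL is to wrap it in backticks. If you build SQL strings by hand in Ruby, a tiny helper along these lines does the quoting - the method name is hypothetical and this is only a sketch:

```ruby
# Hypothetical helper: wrap a MySQL identifier in backticks,
# doubling any backtick that appears inside the name
def quote_mysql_identifier(name)
  "`#{name.to_s.gsub('`', '``')}`"
end

puts "DESCRIBE #{quote_mysql_identifier('references')};"
```

The printed statement can be pasted straight into the mysql client.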

Tuesday, February 22, 2011

Setting up PostgreSQL on Mac OS X 10.6

I needed to set up PostgreSQL on my Mac in order to troubleshoot a problem with a Rails application.

Here are the steps that I followed:

1: Using Homebrew, install and build PostgreSQL
Homebrew will give you commands to create a Launch Agent that starts the server on a reboot

$ brew install postgresql
$ cp /usr/local/Cellar/postgresql/9.0.3/org.postgresql.postgres.plist ~/Library/LaunchAgents
$ launchctl load -w ~/Library/LaunchAgents/org.postgresql.postgres.plist

2: Setup PostgreSQL
You should look into setting up an administration user - I skipped that for my purposes

$ initdb /usr/local/var/postgres
The files belonging to this database system will be owned by user "jones".
This user must also own the server process.
[...]

3: Create a specific database and verify that it is running

$ createdb mydb
$ psql mydb
mydb=# select version();
mydb=# \h

4: Install the PostgreSQL ruby gem
You'll see different names for this gem - just use 'pg'. Using the ARCHFLAGS env variable is important. I did not need to specify the explicit path to the Homebrew installation of the PostgreSQL software.

$ env ARCHFLAGS="-arch x86_64" gem install pg

Monday, February 21, 2011

Problem with sqlite when installing the taps ruby gem on Mac OS X

Just worked my way through one of those installation problems where trying to install A fails because B is missing and trying to install B fails because... and so on.

1: Trying to push a database to the Heroku hosting service failed because I did not have the 'taps' gem installed
2: 'gem install taps' failed because it couldn't compile a component that interacts with sqlite databases. I'm not using sqlite but there is no way to skip this part of the code - hmm...
3: Mac OS X has sqlite installed by default (or at least with Xcode installed)
4: Explicitly telling taps where to find the lib and include files didn't fix it

So the problem lay in my sqlite installation

Lots of mentions of the issue on Google - some of which said to install a new version with MacPorts

In the past I have found MacPorts to be a very frustrating experience - just wanting to install a single library has led to literally hours of watching it fetching and installing all sorts of apparent dependencies. So I tend to just get the source code for whatever I want and compile it manually, but that can be a pain in and of itself.

I've heard very good things about Homebrew as a MacPorts replacement so I figured I should take a look.

1. Set up a 'staff' group so that you can install code without having to sudo everything
2. Get Homebrew
3. Install your library

$ sudo dscl /Local/Default -append /Groups/staff GroupMembership $USER
$ ruby -e "$(curl -fsSL https://gist.github.com/raw/323731/install_homebrew.rb)"
$ brew install sqlite

Simple... Brilliant... I'm sold...

Homebrew installs code in /usr/local/Cellar by default so I need to tell the ruby gems where to find that - and somewhere along the line I saw a sqlite ruby gem. So I figured I should install that first as a check that things are working before trying 'taps'.

$ gem install sqlite3 -- --with-sqlite3-include=/usr/local/Cellar/sqlite/3.7.5/include \
                                            --with-sqlite3-lib=/usr/local/Cellar/sqlite/3.7.5/lib
$ gem install taps

It worked...
You may see mention of a sqlite3-ruby gem. That is now called sqlite3 - it's the same thing.

And finally I can run '$ heroku db:push' and send my database to my Heroku app.

Phew...

Friday, February 18, 2011

Rails, Devise and custom User models

Devise is an excellent solution for user authentication in a Rails application.

Ryan Bates has done two great Railscasts episodes on Devise - #209 Introducing Devise and #210 Customizing Devise.

The default Devise configuration uses a simple sign up process - you give it an email address, a password and password confirmation. Follow the installation instructions and it should just all work.

But in my current application I need a bit more. I want the user to enter their first and last names, I want to assign a role and I want to link that individual user to a company account. Devise is capable of handling all this but the README on github doesn't really explain how and I, for one, get a bit nervous messing with the code of my authentication solution.

It turns out to be incredibly easy as long as you don't try and be too clever.

Before going into the steps given below, I recommend trying out a basic off the shelf Devise installation in a test application first just so you know that it works on your machine and you can see what files it creates, etc.

In these steps I'm going to use Devise with a User model that contains some custom fields.

1. Before creating your own User model, do a basic Devise installation into your app

$ gem install devise  # or in Rails 3 add it to your Gemfile, and 'bundle'
$ rails generate devise:install  # follow the instructions given
$ rails generate devise User
$ rails generate devise:views # this generates sign_in, etc views under 'app/views/devise' - not user!

2. Modify your User model by adding custom fields to attr_accessible
Here I'm adding :first_name, :last_name, :active, :role

attr_accessible :email, :password, :password_confirmation, :remember_me,
        :first_name, :last_name, :active, :role

Add your own validations, etc to the model. For testing, at least, require the presence of at least one of your custom fields.

3. Modify your Migration for the User table and run the migration
In my case I added these lines:

      t.string :first_name
      t.string :last_name
      t.boolean :active, :default => true
      t.string :role, :default => 'user'

Run the migration with 'rake db:migrate'

4. Modify your Sign_up form
This lives in app/views/devise/registrations/new.html.erb
NOTE: You can have Devise install its views under app/views/user but I prefer to keep Devise specific views in their own directory
Add fields to the form for your custom fields e.g.

<%= f.text_field :first_name %>
etc.

5. Try it out - sign up a new user
Go to the URL /users/sign_up
Your input to the custom fields should go into the database and any validations against custom fields which fail should give you the proper error messages and highlighting in the sign_up form
In my experience this 'just worked'

6. But the whole reason you want custom fields in the User model is to work directly with them...

For this you need a User controller and views. Devise does not give you either of these.
Either run a scaffold generator and skip the model or copy over another controller and set of views.
Now you have the regular set of actions for your model.
Go to /users and you should see the user(s) that you added, /users/1 will show you that user with whatever columns you choose to display.

In my case I display my custom fields and the email in my show and index actions and just ignore the rest of the Devise specific fields.

You want to be careful with the User new/create/edit/update actions. If you create a new user via that path then they will have no password, etc so you might want to remove new/create. The edit/update actions are useful if a user wants to change their name and other 'profile' information, but don't mess with the Devise-specific fields via this route.
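One way to keep the edit/update path away from the Devise fields is to filter the submitted parameters down to the profile columns before updating. A plain Ruby sketch, not part of Devise - the constant and method names are hypothetical and the field list is an assumption based on my User model:

```ruby
# Hypothetical whitelist of the custom profile columns that users may edit
PROFILE_FIELDS = %w[first_name last_name]

# Keep only the whitelisted keys from a params-style hash
def profile_attributes(params)
  params.select { |key, _value| PROFILE_FIELDS.include?(key.to_s) }
end

submitted = { 'first_name' => 'Ada', 'encrypted_password' => 'x', 'role' => 'admin' }
p profile_attributes(submitted)
```

In an update action you would pass the filtered hash to the model instead of the raw params.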

Basically, the Devise side of things and your custom User model can coexist quite happily. Make sure you don't mess with the fields that Devise requires and don't use the same column names.

I would also avoid using virtual attributes in the custom fields. I tried this and couldn't get it to work. Not a big deal for my case.

When I started integrating Devise into my app I had a sinking feeling that the custom fields would be a real problem. Quite the opposite - this turned out to be really easy.

Great kudos to the folks at Plataformatec - José Valim and colleagues - for a really nice piece of work.

Basic image rollover effect in jQuery

There are so many fancy image effects that you can write in jQuery that it is easy to overlook the basics. Here is a basic image rollover script.

I have two images 'logo.png' and 'logo_highlight.png'. I want to display 'logo' by default and then replace it with 'logo_highlight' when I roll the mouse over it.

Here is my image tag in the html:

<img id="logo" src="images/logo.png" alt="Logo" />

And here is the script (assuming that you have jQuery loaded):


<script>
$(document).ready(function() {
    $('#logo').hover(
        function(e) {
            this.src = this.src.replace('logo', 'logo_highlight');
        },
        function(e) {
            this.src = this.src.replace('logo_highlight', 'logo');
        }
    );
});
</script>

The script attaches a 'hover' event handler to the DOM element with ID 'logo'. This has two functions that are applied when the mouse enters and leaves the element respectively.

On entry, the image 'src' attribute is updated. The new one is derived by replacing the string 'logo' in the filename of the original image with 'logo_highlight' in the new version. In other words, the image tag now sources the 'logo_highlight' image.

When the mouse leaves the element, the second function is executed and that replaces the highlighted image with the original.

Short and sweet...

Wednesday, February 16, 2011

Authentication in Mongo and Mongoid

Mongo has primitive authentication - just basic user/password authentication per database.

Its preferred mode of operation is no authentication in a trusted environment. That's fine, but it's not always possible. I want to run mongo on an Amazon EC2 node and access it from remote clients so I need to use authentication. On top of that, I already have the database running without authentication on a node.

Here are the steps you need to make the migration to a server with authentication...

1. Create an admin user on the database
Open up a mongo shell on the machine running the server

$ mongo
> use admin
> db.addUser("your_admin_user", "your_password")
> exit

2. Restart your Mongo server with --auth
It is CRITICAL that you restart with the --auth option. Users and passwords are simply ignored without this option.

$ mongod --auth

3. Set up database specific users

$ mongo
> use admin
> db.auth("your_admin_user", "your_password")
> show dbs
> use your_db
> db.addUser("your_db_user", "your_password")
> db.system.users.find()
> exit

4. Set up authenticated access from your application
I work in Ruby and use Mongoid as the Object Document Mapper to access Mongo. Mongoid, in turn, uses the Ruby Mongo Driver. If you are using Mongoid outside of Rails then you will need a configuration block along the lines of this:


Mongoid.configure do |config|
  name = "your_db"
  config.database = Mongo::Connection.new.db(name)
  config.database.authenticate("your_db_user", "your_password")
end

Note that you are authenticating with the Ruby Mongo driver - not with Mongoid.

If you are working with Rails then you'll need to add username and password into your config/database.yml file. I see that the Devise authentication gem can work with Mongo to handle authentication of individual users but I've not explored that yet.

5. Clearly there is an issue having your password in plain text in your code
The bottom line is that you probably don't want to trust Mongo authentication for critical data. In that case, you really need to set up Mongo access in a secure environment and perhaps handle interfacing this with the outside world through a separate gateway application, say a Sinatra app that handles all authentication itself.

For my needs I have non-critical data - I just want to prevent access to arbitrary users (i.e. port scanning scripts) and only access from a few defined scripts on specific machines. So for now this will work for me.
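One small step that at least keeps the password out of the source file is to read the credentials from environment variables. A sketch in plain Ruby - the variable names and the method name are my own assumptions:

```ruby
# Hypothetical helper: read the database credentials from the environment.
# fetch raises KeyError if a variable is missing, so a misconfigured
# machine fails loudly instead of connecting with a blank password.
def mongo_credentials(env = ENV)
  {
    :user     => env.fetch('MONGO_DB_USER'),
    :password => env.fetch('MONGO_DB_PASSWORD')
  }
end

# Intended use inside the Mongoid.configure block shown earlier:
#   creds = mongo_credentials
#   config.database.authenticate(creds[:user], creds[:password])
```

The password still lives in plain text somewhere on the machine, so this helps with accidental leaks through source control rather than with a compromised server.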

With mongo authentication in place, how do you handle backing up and restoring the database?

On the machine hosting the server you can use these two variants of the dump and restore commands:

$ mongodump -d your_db -o . -u your_db_user -p your_password
[...]
$ mongorestore -u your_db_user -p your_password your_db

To work with all databases you would use the admin user

In order for someone to break into your database someone has to
1: Guess/crack your admin username and password
or
2: Guess/crack your specific database, your db username and password.

You have to evaluate the chances of this along with the value of the data in the database before going down this path.
You can also configure the database to use a non-standard port. There is no harm in this but it offers minimal to no additional security as many malicious scripts will scan across all ports on a machine looking for one that responds.

Caveat emptor...

Tuesday, February 15, 2011

Always index columns that you want to sort on in Mongo

I'm using Mongo as a non-relational database for a few projects. In general it's working out great. MySQL would work too but I like not having to explicitly create a database or run migrations. Plus I figure you can't really understand the strengths and weaknesses of a technology unless you build a real application with it.

I work in Ruby and use the MongoMapper and Mongoid Object Document Mappers to talk to Mongo.

One issue that I do not like is the requirement that you explicitly create an index for every column that you think you will want to sort on. If you don't then all the data gets loaded into memory for the sort and you get an error like this:

[...]/gems/mongo-1.2.1/lib/mongo/cursor.rb:86:in `next_document': 
too much data for sort() with no index (Mongo::OperationFailure)

And if you want to sort on two columns then you need an index on the combination of the two.

You can add indexes at any point - it takes some action but it's not that big a deal. But it doesn't 'just work'... in MySQL it does - an index might give you better performance but it doesn't blow up without one.

You'll hear people claim that the NoSQL databases are schema-free, giving you a lot of flexibility. I don't really buy that argument - in most applications you want a clear schema.

Where I do see the benefit is that, with NoSQL databases, your schema resides in your Model - not in the DB itself - and that is where it belongs. When you want to change the schema you just change the Model - no database migrations - very flexible.

But, with Mongo at least, if you have to define indexes ahead of time in order to sort even relatively small numbers of objects then that nullifies some of that benefit.

Using Mongoid in Ruby applications outside of Rails

Mongoid and MongoMapper are two Ruby ODM (Object Document Mapper) gems for the Mongo database.

I've used both to a limited extent and they seem comparable for my needs. Mongoid seems to be getting a bit more traction than MongoMapper and it certainly has better docs.

My current project uses Mongo in a standalone Ruby application - no Rails in sight - but the docs are almost totally focused on Rails. Here is how you use Mongoid outside of Rails.

I'm storing relevant RSS entries in the database. My model looks something like this (heavily truncated):


class RssEntry
  include Mongoid::Document

  field :entry_id
  field :title
  field :authors, :type => Array
  field :timestamp, :type => Time

  index :timestamp
  index :title
end

Be sure and define your indexes carefully for fields that you want to search on, otherwise Mongo will run out of memory when searching even modest datasets. I see this as a weakness of the database. See important note on creating indexes below!

and the application looks a bit like this (edited):


#!/usr/bin/env ruby
require 'mongoid'
$:.unshift File.dirname(__FILE__)
require 'mongoid_test_model'

Mongoid.configure do |config|
  name = "mongoid_test_db"
  host = "localhost"
  port = 27017
  config.database = Mongo::Connection.new(host, port).db(name)
end
[...]
entry = RssEntry.create({
    :title => title,
    :entry_id => id,
    :authors => authors,
    :timestamp => Time.new
})

And if you are using the defaults of localhost and 27017 then you can leave those definitions out.

NOTE: Simply defining an index in your model is NOT enough. You have to explicitly create the index. When you use Mongoid with Rails it sets up a rake task so you can run 'rake db:create_indexes' but outside of that environment you need to do this yourself.

You'll want to write a simple script/rake task to set this up, in which you call create_indexes on EACH class in your model that uses Mongoid. For example:


#!/usr/bin/env ruby
require 'mongoid'
$:.unshift File.dirname(__FILE__)
require 'mongoid_test_model'

Mongoid.configure do |config|
  name = "mongoid_test_db"
  host = "localhost"
  port = 27017
  config.database = Mongo::Connection.new(host, port).db(name)
end

# Call on each of the relevant models
RssEntry.create_indexes()

Previously, you could specify auto-indexing within your models but this has now been deprecated or removed, so ignore any references to that.
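If you have more than a couple of models, listing each one by hand gets tedious. A sketch of one way to find them, using only core Ruby - the helper name is hypothetical and the stand-in module is there just so the example runs without Mongoid:

```ruby
# Hypothetical helper: every loaded class that includes the given module
def classes_including(mod)
  ObjectSpace.each_object(Class).select { |klass| klass.include?(mod) }
end

# Stand-ins so the sketch runs on its own
module Indexed; end
class Entry; include Indexed; end
class Other; end

p classes_including(Indexed)

# With Mongoid loaded and the model files required, the intended use is:
#   classes_including(Mongoid::Document).each { |model| model.create_indexes }
```

This only sees classes that have already been loaded, so require all of your model files first.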

Monday, February 14, 2011

Using Twitter for System Notifications

I finally figured out something that many, many people have been doing for quite a while - using Twitter as a way to deliver notification of system events.

Twitter is a great way to deliver short messages to many people via many forms of media and devices. The default is that any message is available to anyone in the world. But you can also configure a Twitter account to be private, requiring the owner to explicitly allow access to other users. In the extreme case the owner can deny access to everyone but himself or herself.

Twitter handles all the messaging, all you need to do is have your server, web application or whatever, send a message to your private account whenever some event takes place. For example, I run long calculations on servers at work and I want to be notified when a job completes.

You can find a load of UNIX command line twitter clients and libraries in all the main languages. So finding or building a suitable client is straightforward.

I'll show you how to build a simple client in Ruby.

If you want to send tweets to a private account then you will need proper authentication credentials.

For this you need to use OAuth - username/password authentication has been deprecated.

1: Sign in to Twitter as the owner of the private account
2: Go to http://dev.twitter.com - you'll still be signed in
3: Click on 'Register an App' - now you're not really creating a new twitter application but pretend that you are - give it a name - and you want to select that it is a 'client' application and that it should have 'read write' access to the account.
4: Now go to 'your apps' and click on the new dummy app.
5: Scroll down and get the 'Consumer key' and 'Consumer secret' - you'll need these in your code.
6: Those are required for your application, but in addition you need a key and secret for the actual twitter account that you will want to write to.
7: On your app settings page, on the right sidebar, click on 'My Access Token' and get 'Access Token (oauth_token)' and 'Access Token Secret (oauth_token_secret)'.

Now we can write some code.

8: Get the 'twitter' Ruby gem

$ gem install twitter

9: Write a small ruby app. This simple example takes a message on the command line, configures the client with the FOUR OAuth tokens/strings and then updates the private twitter account with the message:

#!/usr/bin/env ruby
require 'twitter'
abort "Usage: #{$0} message" if ARGV.length == 0
# Hard-wired to my private twitter account
Twitter.configure do |config|
  config.consumer_key = 'your_app_key'
  config.consumer_secret = 'your_app_secret'
  config.oauth_token = 'your_account_token'
  config.oauth_token_secret = 'your_account_secret'
end
# Create a client that picks up the configuration above
client = Twitter::Client.new
client.update(ARGV[0])

10: It's that simple...
11: chmod a+x your script and run it with a message - check your private twitter account and you should see it.

It's easy to think up (and code) custom notification scripts for this. As long as you have a network and as long as Twitter is up (OK, it has had some issues) then you don't need to worry about anything to do with distributing your messages. You can get them on your phone or your desktop, and you can leverage the work of others to display popup windows on your desktop, play tunes, flash lights, etc, etc.

Just remember that when you create your private Twitter account that you go into the settings and make sure that it is indeed set to private.

One extension that I've thought about is having my script take an optional URL, say pointing to the results from a computational run, and using a URL shortening service like http://bit.ly or http://goo.gl to let me include that in the tweet. Unfortunately none of the 'big name' services allow you to have private URLs, so that might be a problem in some applications, but it's worth considering for others.
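A sketch of how the message and optional URL might be combined, in plain Ruby. It assumes the 140 character limit and that the URL counts at its full length - both are assumptions you should check - and the method name is made up:

```ruby
# Assumed limit on the length of a tweet; adjust as needed
TWEET_LIMIT = 140

# Hypothetical helper: trim the message so that message plus URL fits
def build_notification(message, url = nil, limit = TWEET_LIMIT)
  return message[0, limit] if url.nil?

  room = limit - url.length - 1   # one character for the separating space
  raise ArgumentError, 'URL leaves no room for a message' if room < 1

  "#{message[0, room]} #{url}"
end

puts build_notification('Alignment job finished', 'http://bit.ly/example')
```

The result is what you would pass to the update call in the script above. The URL is kept whole and only the message is trimmed.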

PATSY - a web service that makes patents easier to read

I've just launched PATSY - a new web service that reformats US patents to make them much easier to read than their original format.

The text of patents is typically very dense and difficult to read.

They are written as legal documents and inevitably this results in verbose and sometimes arcane text. Every component of an invention will have all possible variants enumerated and this can result in sentences of ridiculous length with these variants delimited by commas. On top of that, the US patent office still prints patents as two narrow columns of text on each page - a format that might work in newspapers but which in technical patents is nonsensical.

The underlying problem is that the patent offices should define and enforce a modern way of text formatting that is both easy to read and easy to parse in software.

But as this is not likely to happen any time soon, I decided to write an application that reformats the text of patents into something more palatable.

You enter a patent number into PATSY and it fetches the web page from the US patent office web site. It scans the text and splits up paragraphs into component sentences. Furthermore it splits sub-sentences by punctuation such as semi-colons. Simply adding this spacing makes a big difference.
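To give a flavor of the splitting step, here is a simplified sketch in Ruby. It is not the actual PATSY code - the method name and the regular expressions are illustrative assumptions:

```ruby
# Hypothetical sketch: split a paragraph into sentences, then split each
# sentence into sub-sentences after every semi-colon
def split_patent_text(paragraph)
  # A sentence ends at . ? or ! followed by whitespace and a capital letter
  sentences = paragraph.split(/(?<=[.?!])\s+(?=[A-Z])/)
  sentences.map { |sentence| sentence.split(/(?<=;)\s+/) }
end

text = "A method comprising: heating the sample; cooling the sample. The sample is a protein."
split_patent_text(text).each do |parts|
  parts.each { |part| puts part }
  puts
end
```

A rule this simple will break on abbreviations such as 'U.S. Patent', which is the kind of thing that makes the real parsing imperfect.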

But PATSY goes much further. It highlights a series of phrases that are typically of interest - such as 'preferred embodiment' and 'SEQ ID NO'. It recognizes references to other patents and hyperlinks these to either their patent office site or to PATSY directly. In some cases, references to scientific publications can be identified and links are added that will take the user to the NIH PubMed site of abstracts, and from there the original publication can be accessed in most cases.
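Highlighting phrases can be sketched in a few lines of Ruby. Again this is an illustration and not the real PATSY code - the phrase list holds just the two examples mentioned above, and the span markup and names are assumptions:

```ruby
# The two example phrases; a real list would be much longer
PHRASES = ['preferred embodiment', 'SEQ ID NO']

# Hypothetical sketch: wrap each phrase of interest in a span for styling
def highlight(text, phrases = PHRASES)
  # One case-insensitive pattern that matches any of the phrases
  pattern = Regexp.union(phrases.map { |phrase| /#{Regexp.escape(phrase)}/i })
  text.gsub(pattern) { |match| "<span class=\"highlight\">#{match}</span>" }
end

puts highlight('In a preferred embodiment the peptide of SEQ ID NO: 3 is used.')
```

The original capitalization of each match is preserved because the block re-inserts the matched text rather than the phrase from the list.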

PATSY only works with US patents right now and some of its features are geared towards biotechnology patents. The text parsing is not perfect but even at this early stage in its development, it can really make dense blocks of text much easier to read. In cases where the result is unclear, you can click the head of each text block to see the original text before any processing.

While it is in this early stage, PATSY is completely free. If it turns out to be useful to a lot of people then I may offer it via subscription to heavy users, while retaining free access to occasional users.

Please try PATSY out and send me feedback at info @ craic.com.

Technical aspects:

PATSY is written in Ruby using Sinatra as a lightweight web application framework. It runs on Heroku which is a hosting service for Ruby web applications that sits atop Amazon web services. My steps involved in setting it up are described here.

My experience with Heroku for this application thus far has been great. They allow you to set up applications with limited resources at no charge. If and when PATSY starts to get some traction then I can scale it up by adding more of what they call 'dynos'. That will incur some cost but there is no commitment or up front payment, plus the process of scaling is incredibly easy.

Friday, February 11, 2011

JavaScript Bookmarklet that can create a New Window

The popup-blocking features of current browsers can be a problem if you are writing a JavaScript Bookmarklet that wants to open a new window. For example, I want to select text in an arbitrary window and then have a remote server operate on the text and return its results to a separate window, or tab. A bookmarklet is a great way to do this.

One approach to writing these is to make the bookmarklet simply call a JavaScript script on a remote server, which does the real work. This results in a simple bookmarklet and lets you perform arbitrary operations in the remote script.

But when the end result is the creation of a new browser window this approach will fail...

Modern browsers view this as a potential exploit and will only allow the creation of new windows as the result of direct user interaction - i.e. the user clicks something.

All is not lost - it just means that you need to put all your code in the bookmarklet itself. This is messy but for many scripts this should not be a problem.

Here is my example. It gets the currently selected text and adds that to the URL of a remote service. It opens that URL in a new browser tab or window, depending on the specific parameters. If no text has been selected then it prompts the user to enter some. This first version will open the new page as a new tab in most browsers.

<a href="javascript:(function(){
// Get the current selection as a string
var s = ''; 
if (window.getSelection) { 
s = String(window.getSelection()); 
} else if (document.getSelection) { 
s = String(document.getSelection()); 
} else if (document.selection) { 
s = document.selection.createRange().text; 
} 
// Prompt for input if no text selected
if (s == '') {
s = prompt('Enter your text:');
}
// Open the target URL in a new tab, escaping the text for use in a URL
if ((s != '') && (s != null)) {
window.open('http://example.com/yourapp?id=' + encodeURIComponent(s));
}
})();">BOOKMARKLET</a>

You would want to remove the comments from the bookmarklet, but you don't need to strip the newlines or minify the code.
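If you would rather keep a commented copy of the script and strip the comments automatically, something like the following Ruby sketch could generate the link. The method name is hypothetical, and it only handles whole-line '//' comments, which is all that the example above uses. The JavaScript must stick to single quotes because the href attribute is wrapped in double quotes.

```ruby
# Hypothetical helper: turn commented JavaScript source into a bookmarklet link.
# It drops blank lines and whole-line // comments, and keeps the newlines.
def bookmarklet_link(js_source, label)
  code = js_source.lines
                  .map(&:strip)
                  .reject { |line| line.empty? || line.start_with?('//') }
                  .join("\n")
  %(<a href="javascript:#{code}">#{label}</a>)
end

source = <<~JS
  (function(){
  // Say hello
  alert('hello');
  })();
JS

puts bookmarklet_link(source, 'BOOKMARKLET')
```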

The default of most current browsers is to create new tabs instead of new windows. Users can set their preferences to override this but most will not.

You may want to force the creation of a separate window. Think this through carefully - it may annoy some users if you start generating loads of windows. In some cases it is appropriate. It used to be that you could force this by passing '_blank' as the name of the new window but this does not appear to work in all browsers. Instead you need to explicitly specify one or more window properties, like width and height.

This is a messy solution but it works. In my application I just replaced the window.open call with this form:

window.open('http://patsy.craic.com/patsy?id=' + encodeURIComponent(s), '_blank', 
'height=600,width=1024,status=1,toolbar=1,directories=1,menubar=1,location=1');

The options string in the third argument specifies what the new window should look like. You may need to experiment with these. With Google Chrome on the Mac these do not give me the expected result - the address is not editable and there is no bookmarks bar. I also found that simply using 'status,toolbar,etc' without the '=1' did not work, although you will see this listed as a valid syntax.

Friday, February 4, 2011

Running a Ruby 1.9 Sinatra app on Heroku

I just got my first significant application running on Heroku. It uses Sinatra instead of Rails and uses Ruby 1.9 - as a result the steps to get the application up and running were slightly different from the Heroku Quickstart Guide, which is tailored towards Rails apps.

1. Setup
Set up a Heroku account and your SSH keys.
Create your app and make sure it runs correctly on your local machine.
Make sure that all paths are relative to the application root.
Run the Sinatra app from a config.ru file.
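
As an illustration of the point about paths, here is one way to resolve files against the application's own directory rather than a hard-coded location on your local machine. This is a sketch only: APP_ROOT, app_path and the file name are hypothetical and not part of my app.

```ruby
# APP_ROOT is a hypothetical constant: the directory containing this file.
APP_ROOT = File.expand_path(File.dirname(__FILE__))

# Build a path from components given relative to the application root.
def app_path(*parts)
  File.join(APP_ROOT, *parts)
end

# 'data/patents.txt' is a made-up example file.
puts app_path('data', 'patents.txt')
```

Because everything hangs off APP_ROOT, the same code works whether the app is running on your machine or in whatever directory Heroku deploys it to.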

2. Set up Bundler
I had an issue with this originally but this is what worked. I only need the sinatra gem so my Gemfile is:

source :gemcutter
gem 'sinatra'

and my config.ru file is:

require 'bundler'
Bundler.require
require 'sinatra'
require './my_app.rb'
run MyApp.new

Note that although you need to have the heroku gem installed on your system in order to upload to Heroku, you do not 'require' it in your app.

3. Set up git and commit the project.

4. Create the Heroku app
Specify an application name, otherwise Heroku will assign an arbitrary one. You also need to specify the run environment that Heroku will use, which they call the 'stack'. For Ruby 1.9 you must specify this explicitly; currently the correct option is 'bamboo-mri-1.9.2'. The create command is:

$ heroku create my_app_name --stack bamboo-mri-1.9.2

5. Push the git repository to Heroku with:

$ git push heroku master

The messages that follow should look something like this (some lines removed):

-----> Heroku receiving push
-----> Sinatra app detected
-----> Gemfile detected, running Bundler version 1.0.7
       Unresolved dependencies detected; Installing...
       Fetching source index for http://rubygems.org/
       Installing rack (1.2.1) 
       Installing tilt (1.2.2) 
       Installing sinatra (1.1.2) 
       Using bundler (1.0.7) 
       Your bundle is complete! It was installed into ./.bundle/gems/
       Compiled slug size is 500K
-----> Launching... done
       http://my_app_name.heroku.com deployed to Heroku

Now go to that URL and your app should be running.

If there was a problem and the app failed to start then look at the logs:

$ heroku logs -n 100

You can set up apps with minimal (or no) database storage for free on Heroku. This is a great service as it lets you experiment to your heart's content.

The idea behind Heroku is to remove from you the burden of server configuration. For my simple application this seems to work remarkably well.

Tuesday, February 1, 2011

Ruby 1.9 and incompatible character encodings

I ran into issues pulling remote text data into a Ruby 1.9 / Rails 3 app, which uses UTF-8 encoding by default. The problem apparently comes from non-ASCII characters in binary, or so-called ASCII-8BIT encoded, text. I don't have a proper way to translate the offending characters as yet, but my workaround is to strip them out and/or replace them with an ASCII character.

This regex implements the workaround. Be sure to use the 'n' modifier on the regex. This specifies that the encoding of the text should be ignored and thus multibyte characters are treated as separate bytes.

    str.gsub!(/[^\x00-\x7F]/n,'?')

Far from perfect, but it gets the job done for me right now.
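
A possibly gentler variant, sketched here on the assumption that much of the incoming text is really UTF-8 that has merely been labelled ASCII-8BIT: relabel the string as UTF-8 and keep it if the bytes are valid, and only fall back to the '?' substitution when they are not. The method name to_utf8 is made up for this example.

```ruby
# Hypothetical helper: relabel binary text as UTF-8 when its bytes allow it,
# otherwise replace every non-ASCII byte with '?'.
def to_utf8(str)
  candidate = str.dup.force_encoding(Encoding::UTF_8)
  return candidate if candidate.valid_encoding?

  # The bytes are not valid UTF-8: keep the ASCII and substitute the rest.
  str.dup.force_encoding(Encoding::BINARY)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: '?')
end

alpha  = "\xCE\xB1".b   # the UTF-8 bytes for a Greek alpha, labelled ASCII-8BIT
latin1 = "caf\xE9".b    # 'cafe' with an ISO-8859-1 e-acute, not valid UTF-8

puts to_utf8(alpha)     # prints the alpha character
puts to_utf8(latin1)    # prints "caf?"
```

This does not guess at other encodings such as ISO-8859-1, so text in those encodings still loses its non-ASCII characters, just as with the regex.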
