Home > Uncategorized > GSoC weekly report #11

GSoC weekly report #11

BAM output in viewing tool

For a long time, sambamba view had no option to output BAM files. That was related to awful quality of its code, which is now no longer the case. (Looking at the code in its current state and to the ‘Head First Design Patterns’ book, seems I’ve applied Template Method pattern.)

Other refactoring

A lot more things should be refactored, of course. E.g. all the tools now have hard-coded functionality in them (merging, sorting, indexing) while in principle, what one would like to see is methods ‘sortBy!Coordinate’, ‘index’ for BamFile, and moving merge functionality to MultiBamFile struct providing the same methods as BamFile. But that’s something I’ll work on after GSoC, not now.

Ruby DSL for selecting reads

I’ve described it in a Cucumber feature, a simple example of what it’s capable of (in fact, the following code just calls the command-line tool with –count and –filter=… parameters):


puts bam.alignments
        .referencing('20')
        .overlapping(100.Kbp .. 150.Kbp)
        .select {
            flag.first_of_pair.is_set
            flag.proper_pair.is_set
            sequence_length >= 80
            tag(:NM) == 0
        }.count

Man pages

I went down the same road as Clayton and created man pages with Ronn. For those who’re not yet aware, it’s a tool written in Ruby which allows to make both HTML and man output from files with Markdown syntax. Very convenient :)

Script for creating Debian packages

Also I wrote a Ruby script today to easily create debian packages for both i386 and amd64, and put them to Github downloads. Hopefully, now that I’ve done it I’ll make releases more often.

Future plans

We’ve agreed on importancy of having tools used by more people, and the decision is to put them on Galaxy Tool Shed. That should help in popularizing sambamba, my tool has some cool functionality to offer :)

I had a look at what’s available on Tool Shed, with special interest of what stuff for dealing with SAM is presented there. I’m planning to add an example of filtering because this is one of areas where sambamba really shines, and maybe also indexing.

After that I’ll start optimizing my D and Ruby code, and also reducing memory consumption. Perhaps, I’ll spend on that all remaining time. If not, I have a lot of other stuff to do, like stdin/stdout support.

Advertisements
Categories: Uncategorized Tags:
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: