GSoC weekly report #6

Ruby bindings

The bindings now work with the JSON output of the sambamba command-line tool, which therefore has to be installed. All the old Cucumber scenarios pass except one that is not very important. This week I’ll write documentation and package everything as a gem.

SAM input

I got it working and introduced a new type, SamFile, which is very similar to BamFile except that it doesn’t provide random access. Unit tests ensure that parseAlignmentLine(toSam(read)) == read for all valid reads. For invalid input, the invalid fields are default-initialized.
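The round-trip property checked by those unit tests can be sketched in Python. This is only an illustration: Read, to_sam and parse_alignment_line are hypothetical stand-ins for the D code, and only the eleven mandatory SAM fields are modelled (no tags).

```python
from dataclasses import dataclass, astuple


@dataclass
class Read:
    # The eleven mandatory SAM columns, in file order.
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str


# Columns that hold integers in the SAM text format.
INT_FIELDS = {"flag", "pos", "mapq", "pnext", "tlen"}


def to_sam(read):
    """Serialize a read as one tab-separated SAM line (no tags)."""
    return "\t".join(str(value) for value in astuple(read))


def parse_alignment_line(line):
    """Parse a tab-separated line back into a Read (assumes valid input)."""
    names = list(Read.__dataclass_fields__)
    values = line.rstrip("\n").split("\t")
    kwargs = {
        name: int(value) if name in INT_FIELDS else value
        for name, value in zip(names, values)
    }
    return Read(**kwargs)


read = Read("r001", 99, "chr1", 7, 30, "8M", "=", 37, 39, "TTAGATAA", "IIIIIIII")
# The property under test: serializing and parsing again gives the same read.
assert parse_alignment_line(to_sam(read)) == read
```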

To allow invalid data, I added a simple Ragel rule, invalid_field, which just reads until the next tab character:

mandatoryfields = (qname | invalid_field) '\t'
                  (flag  | invalid_field) '\t'
                  (rname | invalid_field) '\t'
                  (pos   | invalid_field) '\t'
                  (mapq  | invalid_field) '\t'
                  ... // and so on
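The same fallback idea, sketched in Python rather than Ragel: each column is tried against its own pattern, and anything that doesn't match is skipped up to the next tab and replaced by a default. The patterns are simplified, and the defaults (empty string, 0) are my assumption about what default initialization gives; parse_mandatory_prefix is a hypothetical name.

```python
import re

# (field name, pattern for a valid value, converter, default for invalid input)
# Patterns are simplified approximations of the SAM rules, not the real grammar.
FIELDS = [
    ("qname", re.compile(r"[!-?A-~]{1,254}"), str, ""),
    ("flag", re.compile(r"\d{1,5}"), int, 0),
    ("rname", re.compile(r"\*|[!-~]+"), str, ""),
    ("pos", re.compile(r"\d{1,10}"), int, 0),
    ("mapq", re.compile(r"\d{1,3}"), int, 0),
]


def parse_mandatory_prefix(line):
    """Parse the first five columns, defaulting any column that is invalid."""
    tokens = line.rstrip("\n").split("\t")
    result = {}
    for (name, pattern, convert, default), token in zip(FIELDS, tokens):
        # Equivalent of (field | invalid_field): accept a valid value,
        # otherwise ignore everything up to the next tab.
        result[name] = convert(token) if pattern.fullmatch(token) else default
    return result


parsed = parse_mandatory_prefix("r001\tNOT_A_FLAG\tchr1\t7\t30")
assert parsed["flag"] == 0   # invalid column fell back to its default
assert parsed["pos"] == 7    # later columns are still parsed normally
```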

Parsing is currently about 3x slower than in samtools, but that has nothing to do with Ragel; the main reason is too many memory allocations. I did some profiling, and doubling the speed won’t take much effort. As for tripling, I’m not so sure, but I’ll try :)

Sambamba CLI tool

It accepts both SAM and BAM files as input and can output either SAM or JSON (the speed is the same in both cases). It also allows filtering by quality and/or read group, and accepts samtools syntax for regions.
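For readers unfamiliar with the samtools region syntax (ref, ref:start, or ref:start-end, with optional thousands separators), here is a rough Python sketch of how such a string can be split up. It is not how sambamba does it; Region and parse_region are hypothetical names, and reference names containing colons are not handled.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    reference: str
    start: Optional[int]  # None means "from the beginning"
    end: Optional[int]    # None means "to the end of the reference"


# ref, optionally followed by :start, optionally followed by -end.
_REGION = re.compile(
    r"^(?P<ref>[^:]+)(?::(?P<start>\d[\d,]*)(?:-(?P<end>\d[\d,]*))?)?$"
)


def _to_int(text):
    # Drop thousands separators such as "1,000" before converting.
    return int(text.replace(",", "")) if text else None


def parse_region(text):
    """Split a samtools-style region string into reference, start and end."""
    match = _REGION.match(text)
    if match is None:
        raise ValueError("not a valid region: %r" % text)
    return Region(match["ref"], _to_int(match["start"]), _to_int(match["end"]))


print(parse_region("chr1:1,000-2,000"))
```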

More on the wiki: https://github.com/lomereiter/sambamba/wiki/Command-line-tools

  1. June 27, 2012 at 8:40 pm | #1

    I think JSON support is a killer feature.
