Home > Uncategorized > GSoC weekly report #3

GSoC weekly report #3

Caching module

A week ago, I wrote a post about memoization. I implemented a generic memoize function which uses chosen cache strategy. The strategies can be easily added, and the cache can be accessed from multiple threads, each value being calculated only once.

I have implemented least used and FIFO cache strategies. Later I will study other cache algorithms and measure how they perform. Now simple FIFO cache is used for storing decompressed BGZF blocks.

Random access

is now covered with unit tests which show that accessing with index file produces the same result as naive filtering. Also, it now uses caching so that a block doesn’t get decompressed twice.

It also works from Ruby, in general I try to keep bindings up to date.

Testing

Recently we discussed with Pjotr this topic, and decided that I write a simple CLI application for fetching alignments. It’s ready now: https://github.com/lomereiter/sambamba/blob/master/CLItools/bam-fetch/bam_fetch.d

The output is still slow, I’ll investigate how to make it faster. I already rejected D standard library output methods (see my previous post) and now I’m using only functions from stdio.h for output.

My current ideas are:

  • get rid of functions from printf family and use only fwrite; *printf are slow because of parsing format string
  • get rid of fputc calls, collect bytes in a string buffer instead

Although *printf are convenient, they are not fast enough for big amounts of data. There’re not so many types which I need to convert to strings — mostly integers, and very rarely some float values. I’ll just copy-paste some existing implementations. For instance, this one: http://code.google.com/p/stringencoders/wiki/PerformanceNumToA (they claim that number is converted to string 20x faster than with the aid of snprintf).

Advertisements
Categories: Uncategorized Tags:
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: