SciDB: An Open Source Data Base Project
by Paul Brown (medical emergency trumped presence)

Outline
- Science data
- Why science folks are unhappy with RDBMS
- Our project: what we are doing about it

- O(100) petabytes
- Nearest neighbor queries
- Time series queries
- Snow Cover in the Sierras

Why SciDB?
- Big science very unhappy with RDBMS
  - Astronomy
  - HEP
  - Fusion
  - Bio
  - Remote sensing
  - Oceanography

Why?
- Experience of Sequoia 2000 (mid 1990s)
  - Tried to use Postgres for science databases
  - Failed badly
- Main science data type is an array; horribly inefficient to simulate arrays on top of tables
- Required features absent (provenance, uncertainty, version control)
- SQL operations wrong (regrid, not join)

Why SciDB? Net Result
- Mentality of "roll your own" from the ground up for every new science project
- Realization by the science community that this is long-term suicide
- Community seemingly wants to get behind something better
- Great commonality of needs among domains

A Little Context: XLDB-1 (Oct 2007)
- Message from previous slides came across loud and clear
- DeWitt and Stonebraker agreed to move the ball down the field

A Little Context: Asilomar (March 2008)
- 2-day workshop, 18 people
- Flesh out requirements: biggie is open source, commercial quality, petabyte scale DBMS
- Considerable commonality across science disciplines
- Core team of science and DBMS types identified to push things forward
- Research issues identified

A Little Context: The Next Year
- Initial design completed
- Along with an initial implementation
  - Shown at VLDB (Lyon)
- Recruiting of initial team
- Detailed use cases specified

Funding Situation
- Tried and failed to get from NSF, DOE, NASA
- Tried and failed to get from foundations
- Tried and failed to get from industry
- Last resort was VCs
  - Company (Zetics) funded in March '10

Present Day Structure
- About 25 employees, consultants, volunteers, co-ordinated by Suchi Raman
- Design co-ordinated by Mike Stonebraker and Paul Brown
- Support, marketing, business, website co-ordinated by Marilyn Matz

Data Model
- Nested multi-dimensional arrays natural

