15th Meeting (September): Modelling data in a schema-free world? / Cascalog

Jan: Modelling data in a schema-free world?
Even though most NoSQL databases follow the “schema-free” data paradigm, it is still important to choose the right data model to make the best of the underlying database technology. This talk provides an overview of the different data storage models available in popular NoSQL databases.

In the era dominated by relational databases, it was common sense to use the relational data model for all tasks. Not following the commandments of normalisation was often considered an anti-pattern, justified only in extreme cases. However, rigid schemas and high normalisation are often a barrier to rapid development and deployment.

Now, with NoSQL databases going mainstream, developers are confronted with the other extreme: most NoSQL databases are schema-free, meaning they don’t require (and don’t even allow) the developer to define the data schema up front. Still, each NoSQL database is particularly suited only to specific types of data and queries. While all these databases excel in some areas, they often suck in others. What a database is good at is determined by the underlying storage model it uses, e.g. key/value, columnar, document, or graph storage. Developers need to be aware of the fundamental differences between these data storage models so they can pick the right solution for the job.
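To make the four storage models a little more concrete, here is a minimal Python sketch that shapes the same tiny data set in each style. The users, field names and helper functions are all made up for illustration, and plain dicts and lists stand in for real databases; the columnar layout in particular is simplified to one list per attribute, which is not how any specific product stores data.

```python
import json

# Key/value: the store only understands the key; the value is an opaque blob.
key_value = {
    "user:1": json.dumps({"name": "Ada", "city": "Berlin"}),
    "user:2": json.dumps({"name": "Bob", "city": "Hamburg"}),
}

# Document: the store understands the nested structure of each record.
documents = {
    1: {"name": "Ada", "city": "Berlin", "follows": [2]},
    2: {"name": "Bob", "city": "Hamburg", "follows": []},
}

# Columnar (simplified): values of one attribute are kept together.
columnar = {
    "id": [1, 2],
    "name": ["Ada", "Bob"],
    "city": ["Berlin", "Hamburg"],
}

# Graph: nodes plus explicit, typed relationships between them.
nodes = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
edges = [(1, "follows", 2)]


def get_by_key(store, key):
    """Key/value strength: fetch one whole record by its key."""
    return json.loads(store[key])


def distinct_cities(columns):
    """Columnar strength: scan a single attribute across all records."""
    return sorted(set(columns["city"]))


def followed_by(edge_list, user_id):
    """Graph strength: walk relationships starting from one node."""
    return [dst for src, rel, dst in edge_list
            if src == user_id and rel == "follows"]
```

Each helper is cheap in "its" model and awkward in the others: finding all distinct cities in the key/value layout, for example, means decoding every single value.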
 
 
 
TJ: Cascalog
Cascalog (a very pleasant layer over Hadoop).

http://clojure.com/blog/2012/02/03/functional-relational-programming-with-cascalog.html

Why is it pleasant? Because, unlike mainstream computing, it draws on ideas of computation that were developed before computers existed and were therefore naturally optimized for fitting in people’s heads and for being written down briefly on paper. (We call these ideas “logic”.) It turns out that they offer stunningly easy ways to describe queries.

I will query Hadoop live, in front of you. Each example will be only a few lines long, even though each implements powerful ideas like implicit joins. To leverage your intuitions, even if you’ve never used Hadoop, I will contrast with another powerful declarative language: SQL.
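For readers who have not seen an implicit join, here is a small Python sketch of the idea, contrasted with SQL. It is not Cascalog and does not touch Hadoop: the `query` function is a toy evaluator written for this illustration, and the `age` and `gender` relations are made-up sample data. The point it tries to show is that in a logic-style query, reusing the same variable name (here `?person`) in two predicates is what joins them, with no join condition written out.

```python
import sqlite3

# Hypothetical sample relations.
age = [("alice", 28), ("bob", 33), ("chris", 40)]
gender = [("alice", "f"), ("bob", "m"), ("dana", "f")]


def sql_join(age_rows, gender_rows):
    """SQL version: the join condition is spelled out in the ON clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE age (person TEXT, age INTEGER)")
    conn.execute("CREATE TABLE gender (person TEXT, gender TEXT)")
    conn.executemany("INSERT INTO age VALUES (?, ?)", age_rows)
    conn.executemany("INSERT INTO gender VALUES (?, ?)", gender_rows)
    rows = conn.execute(
        "SELECT age.person, age.age, gender.gender "
        "FROM age JOIN gender ON age.person = gender.person "
        "ORDER BY age.person"
    ).fetchall()
    conn.close()
    return rows


def query(out_vars, *predicates):
    """Toy logic-style evaluator (illustration only).

    Each predicate is (rows, variable_names). A variable that appears in
    more than one predicate must take the same value everywhere, which is
    what produces the join.
    """
    bindings = [{}]
    for rows, names in predicates:
        extended = []
        for binding in bindings:
            for row in rows:
                candidate = dict(binding)
                # setdefault binds a fresh variable, or returns the value
                # already bound so it can be compared with this row.
                if all(candidate.setdefault(name, value) == value
                       for name, value in zip(names, row)):
                    extended.append(candidate)
        bindings = extended
    return sorted(tuple(b[v] for v in out_vars) for b in bindings)


logic_result = query(
    ["?person", "?age", "?gender"],
    (age, ["?person", "?age"]),
    (gender, ["?person", "?gender"]),
)
```

Both `sql_join(age, gender)` and `logic_result` contain only the people present in both relations; the logic-style version just never mentions how the two relations are connected.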
 
Watch the video:
Tayssir John Gabbour about Cascalog
