Airbnb/Apache Superset – the open source dashboards and visualization tool – first impressions and link to a demo

Today I spent some time looking into Superset, the open source analytics and BI tool from Airbnb that is now being incubated at Apache. Superset competes in the same arena as Tableau and Power BI, and it is already quite mature, mature enough for business users too (though not as customizable as other solutions).

You can try Superset yourself by visiting the demo I set up at superset.assemblinganalytics.com (user and password “demo”).

Superset dashboards look like this:

[Image: a Superset dashboard]

It does not have the same polish as Tableau from a design perspective yet, but dashboards do look beautiful, and definitely much better than what business users are used to today from MS Office software. One thing to notice right from the start is that Superset aims to support as many visualizations as possible by leveraging existing open source projects (like plot.ly). This is a big differentiator versus other dashboard solutions, where visualizations are less numerous.

But the key differentiator of Superset is that it is open source and that it is built on an extensible framework (Flask) in a programming language that is becoming the de facto go-to language for data science (Python). I have run into issues with traditional enterprise software vendors many times over missing feature xyz or even visualization xyz. In these cases, if you are lucky, you get that feature in the next release 6-8 months from now; if not (which is usually the case), you just have to live with the gap. One example is the lack of an Excel export in Tableau, which in my view comes from Tableau not listening to its users and wanting to push online usage rather than data export. Bottom line, I had to build something myself, but it’s a hack and right now not supported.

Being open source, Superset is also more flexible: adding features or visualizations is something you can do yourself if you can program, or sponsor if you have a budget. Speaking of budget, Superset is completely free, with no limitations other than the computing power of your server. This means that adding, say, another 1,000 users is going to be very cheap. Cost is important because building a data-driven organization means giving all your employees access not only to advanced analytical dashboards but also to powerful tools for doing additional analysis on their own. This is one of the weak points of Tableau.

The database to dashboard architecture is also very powerful. With some Python and a database, you can quickly grab data from multiple places and, for instance, build dashboards from Google Sheets data.
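As a rough illustration of the database-to-dashboard idea, here is a minimal sketch that loads CSV rows into a SQLite table, which a dashboard tool could then be pointed at as a data source. The sample data and the table name `sales` are made up for illustration, and a Google Sheet is assumed to have been exported as CSV first (that step is not shown).

```python
import csv
import io
import sqlite3

# Stand-in for a sheet exported as CSV (hypothetical sample data).
SHEET_CSV = """region,month,revenue
EMEA,2017-01,1200.50
EMEA,2017-02,1350.00
APAC,2017-01,980.25
"""


def load_csv_into_sales(conn, csv_text):
    """Create the `sales` table if needed, insert every CSV row, return the row count."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (region TEXT, month TEXT, revenue REAL)"
    )
    conn.executemany(
        "INSERT INTO sales (region, month, revenue) VALUES (?, ?, ?)",
        [(r["region"], r["month"], float(r["revenue"])) for r in rows],
    )
    conn.commit()
    return len(rows)


# In-memory database for the sketch; a real setup would use a database
# that the dashboard server can connect to.
conn = sqlite3.connect(":memory:")
inserted = load_csv_into_sales(conn, SHEET_CSV)
total = conn.execute("SELECT SUM(revenue) FROM sales").fetchone()[0]
print(inserted, total)
```

Once the rows are in a table, everything else (charts, filters, dashboards) can be built on plain SQL queries against it.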

OK, so Superset is cool and might have a bright future, but is it ready for enterprise usage now? The answer is: it depends, and analytics culture plays an important role here. To highlight the pros and cons in more detail, I’ll go through a point-by-point comparison with tools I have used in my work (Tableau, Cognos and BusinessObjects).

Please remember that these are first impressions of Superset.

Learning curve

Superset is much easier to get started with than Tableau, and let’s not even compare it to older tools like Cognos or Business Objects. Superset wins hands down in this department.

Online vs desktop

Superset works in all the most widely used browsers and has been built as a web app, so it does not require any additional desktop installation for power features. As such, Superset is the only tool I have come across that can be used completely in the web browser, which is an advantage. You might think Business Objects is also used online, but the way I have seen most users work with it is to extract Excel files and then continue on the desktop, which is not a fully online data exploration workflow.

Data exploration

Superset’s mission is to make it easy to explore data and find insights. This has a lot of potential because it can scale (a big part of that is the lack of per-user license costs). Below you can see how the data exploration view works; it changes depending on the chart type that you want to visualize.

[Image: the Superset data exploration view]

It’s snappy and quite responsive, but there is no refresh button, so sometimes you have to save in order to “refresh”, which is a bit unintuitive but works. The experience is not as interactive as Tableau and there are a lot of small details missing. Overall it works, but it needs to develop further. From what I have read, the team at Airbnb seems to be very focused on making this experience better, so I expect improvements in the future. Even so, it’s already quite an impressive tool, and once you get into the flow you can use it to get business insights.

I plan to test this further with my own data, so stay tuned for more commentary on this one!

Adding data

As of now, uploading data through Superset is not possible; you have to load data into a database separately. While I understand why this is the case (data management does not belong in an analytics tool), I do see some challenges with that, especially for beginners. You can, however, build a relatively small Flask app to handle this use case (uploading Excel files, .csv files, etc.). One thing I am wondering is whether the Superset navigation bar at the top can be extended with additional drop-down menus. Ideally, users would not even realize they have left Superset when doing data operations and uploads.
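To give an idea of how small such an upload app could be, here is a minimal sketch of a standalone Flask app that accepts a CSV file and writes its rows to a SQLite table. It is not part of Superset; the `/upload` route, the `file` form field, the `uploads` table, the `name,value` column layout and the `uploads.db` file name are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

from flask import Flask, jsonify, request

app = Flask(__name__)
# Hypothetical SQLite file that the dashboard tool would also connect to.
app.config["DB_PATH"] = "uploads.db"


def store_rows(rows, db_path):
    """Insert (name, value) pairs into the `uploads` table and return the count."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS uploads (name TEXT, value REAL)")
        conn.executemany("INSERT INTO uploads (name, value) VALUES (?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
    return len(rows)


@app.route("/upload", methods=["POST"])
def upload():
    """Accept a CSV file with `name,value` columns in the form field `file`."""
    uploaded = request.files.get("file")
    if uploaded is None:
        return jsonify(error="no file provided"), 400
    text = uploaded.read().decode("utf-8")
    reader = csv.DictReader(io.StringIO(text))
    try:
        rows = [(r["name"], float(r["value"])) for r in reader]
    except (KeyError, ValueError, TypeError):
        # Missing columns, non-numeric values, or short rows.
        return jsonify(error="expected columns: name,value"), 400
    return jsonify(inserted=store_rows(rows, app.config["DB_PATH"]))
```

A real version would need authentication, Excel parsing and per-dataset table schemas; the sketch only shows the shape of the idea.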

Visual filters

This is a big one for me and right now it’s a miss in Superset. In Tableau, users can click on a chart and trigger additional actions (usually a filter on other charts). Filtering features in general are a bit limited in Superset, and I hope that will improve over time. I did some research on the roadmap, but I could not find any discussion about action triggers on charts.

Tooltips

The tooltips in Superset cannot be customized. Tableau has very powerful tooltips that can display a lot of information from various fields; this is not the case in Superset.

[Image: a chart tooltip in Superset]

As you can see above, the highlighted tooltip is good, but no additional information can be added to it.

Calculations

Superset queries databases, and there is a nice SQL editor for writing custom SQL, but that’s where it stops. This means that right now, if a calculation cannot be expressed in SQL, you can’t build it in Superset.

[Image: the SQL editor in Superset]

However, Superset runs on top of Pandas, which is THE Python library for data crunching and manipulation (in many cases it can easily replace R and SAS). So I think SQL Lab might one day expand into something more than just SQL, maybe by adding custom functions that run on top of Pandas data frames? I think this direction would make sense, so I wouldn’t be surprised to see it coming at some point.
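To make the idea of custom functions on top of Pandas data frames a bit more concrete, here is a minimal sketch of the two-step pattern: run plain SQL first, then apply a Python function to the resulting data frame. This is not an existing Superset feature; the `sales` table, its sample rows and the `month_over_month_growth` function are hypothetical.

```python
import sqlite3

import pandas as pd

# Hypothetical sample table standing in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("EMEA", "2017-01", 100.0),
        ("EMEA", "2017-02", 150.0),
        ("APAC", "2017-01", 80.0),
        ("APAC", "2017-02", 60.0),
    ],
)


def month_over_month_growth(df):
    """Add a `growth` column: fractional revenue change per region (0.5 means +50%)."""
    df = df.sort_values(["region", "month"]).copy()
    df["growth"] = df.groupby("region")["revenue"].pct_change()
    return df


# Step 1: plain SQL, the part a SQL editor already covers.
frame = pd.read_sql_query("SELECT region, month, revenue FROM sales", conn)
# Step 2: a custom calculation applied to the query result as a data frame.
result = month_over_month_growth(frame)
print(result)
```

The first month of each region has no previous value to compare with, so its `growth` is left empty (NaN).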

Some ideas of future developments

One strong advantage of Superset, which right now does not have as many features as Tableau or even Power BI, is that it is open source. You know what else is open source? WordPress. What if Superset goes in the direction of WordPress? What if Superset becomes the WordPress of analytics CMSs? Superset could easily go that way, because Flask’s blueprints mechanism lets it be extended with additional web apps and plugins. There doesn’t seem to be an integration for this use case yet, but if it happens, Superset might grow beyond being a dashboard web application into a full-fledged BI and data science solution. This architecture makes sense because plugins let you fit analytics to the exact needs of your organization. Compare that to running a full-fledged Business Objects solution, which is an “all in but not what you need” solution, and you’ll see the benefits.
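For readers who have not used Flask blueprints, here is a minimal sketch of the mechanism in a generic Flask app. It is not Superset’s plugin API (as noted above, no such integration seems to exist yet); the `wiki` blueprint, its in-memory `PAGES` store and the `create_app` factory are hypothetical names used only to show how a self-contained feature can be attached to a host application.

```python
from flask import Blueprint, Flask, jsonify

# A self-contained "plugin": its routes live in a blueprint, not in the host app.
wiki = Blueprint("wiki", __name__, url_prefix="/wiki")

# Hypothetical in-memory page store, for illustration only.
PAGES = {"home": "Welcome to the analytics wiki."}


@wiki.route("/<name>")
def show_page(name):
    """Return a wiki page as JSON, or a 404 if it does not exist."""
    if name not in PAGES:
        return jsonify(error="page not found"), 404
    return jsonify(name=name, body=PAGES[name])


def create_app():
    """Build the host application and attach the plugin blueprint."""
    app = Flask(__name__)
    app.register_blueprint(wiki)
    return app
```

The host app only needs the single `register_blueprint` call; everything under `/wiki` is owned by the plugin, which is what makes a WordPress-style ecosystem conceivable.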

Some ideas that come to mind of additional functionality that could be built as a plugin:

  • A notification system. Think of a system where admins can schedule email reports or set alert rules that send a notification when some event happens (e.g. you just converted a big customer today).
  • A data management system. Think of running simple ETL tasks using a simple online data pipeline tool. I am guessing a web interface for Airbnb’s Airflow would suffice, but I have not checked this one out yet.
  • A wiki
  • A forum
  • A certification system to rank the quality of dashboards (maybe some dashboards for executives need to be checked by finance first before they get an “updated as of…” flag, etc.)
  • An integration with one of the data science notebook or workbench applications
  • And much much more
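As an example of how little logic the first idea in this list needs at its core, here is a minimal sketch of alert rules evaluated against current metric values. Everything in it is hypothetical (the `AlertRule` class, the metric names and the thresholds), and actually sending the email or notification is left out.

```python
import operator
from dataclasses import dataclass

# Comparison operators a rule may use.
OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class AlertRule:
    """Fire when `metric` compared with `threshold` via `op` is true."""

    metric: str
    op: str
    threshold: float
    message: str


def evaluate(rules, metrics):
    """Return the messages of all rules triggered by the current metric values."""
    triggered = []
    for rule in rules:
        value = metrics.get(rule.metric)
        # Skip rules whose metric is not available in this run.
        if value is not None and OPERATORS[rule.op](value, rule.threshold):
            triggered.append(rule.message)
    return triggered


rules = [
    AlertRule("largest_deal_today", ">=", 100_000, "A big customer converted today"),
    AlertRule("daily_signups", "<", 50, "Signups are unusually low"),
]
alerts = evaluate(rules, {"largest_deal_today": 250_000, "daily_signups": 75})
print(alerts)
```

In a plugin, the metric values would come from scheduled queries and the returned messages would be handed to whatever delivery channel the organization uses.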

I think there are so many ways this could go, which is why I am studying Superset and plan to keep an eye on it. To be completely fair, some of the above could also be done with Tableau and Power BI by leveraging their JavaScript APIs, but so far that ecosystem has not produced any quality plug-ins I can think of. Also, Superset is built from the ground up to be used in the web browser and to be extended modularly, which is a much better premise for flexibility and extensibility than closed source software.

Conclusions

Is Superset a replacement for all other BI tools out there? No. Can it grow to be a strong competitor? Yes. I do suspect Superset will always do things slightly differently, though, so the organizations leveraging it need to have a certain culture of analytics, but there is potential. Superset can live alongside other analytics tools such as Tableau or Power BI. It is important to keep in mind that Superset scales better, as you don’t incur enormous license costs and it’s built with data analysts in mind. Superset wants everybody in the organization to be a data scientist, at least a bit. Other tools are geared more towards power users who distribute reports and dashboards. Ultimately, it’s this difference in vision that might set Superset apart from other similar tools in the enterprise world.

4 thoughts on “Airbnb/Apache Superset – the open source dashboards and visualization tool – first impressions and link to a demo”

  1. Hi there. Glad to see reviews of open source bi. I think you may be unfair to existing open source equivalent solutions by saying this is the first… you’ve got several others worth checking, like redash and metabase. I have been tracking superset for a while and am excited it’s under apache now, but still waiting for it to mature and get more functionality, like embedding and parameters. Things, by the way, both redash and metabase do…

    1. Fair comment. I did have a look at Redash and was not fully convinced, but I agree they have their potential too. I am changing the sentence in this post and taking out the word “first”.

      Superset seems more “enterprise” due to the governance model. You can define governance at a very granular level and that is important when working in an enterprise setting.

      If you ask me, they all lack a way to easily manipulate, connect and, in general, play with data. Even Tableau, while good at this, is not great at it. I think Superset, being deeply integrated with Pandas, might be on the right path (maybe building a Pandas data layer to be able to do more advanced calculations on dashboards?).

  2. Great article – really useful to learn about tangible, open source alternatives to the likes of Power BI etc.

    Is the demo still available? The link seems to be down at present.
