Today I spent some time looking into Superset, the open source analytics and BI tool from Airbnb that is now being incubated at Apache. Superset competes in the same arena as Tableau and Power BI, and it is already quite mature, mature enough for business users too (though not as customizable as other solutions).
You can try Superset yourself by visiting the demo I set up at superset.assemblinganalytics.com (user and password “demo”).
Superset dashboards look like this:
It does not yet have the same polish as Tableau from a design perspective, but the dashboards do look beautiful, and definitely much better than what business users are used to today from MS Office software. One thing to notice right from the start is that Superset aims to support as many visualizations as possible by leveraging existing open source projects (such as D3 and NVD3). This is a big differentiator versus other dashboard solutions, where visualizations are less numerous.
But the key differentiator of Superset is that it is open source and that it is built on an extensible framework (Flask) in a programming language that is becoming the de facto go-to language for data science (Python). I have run into issues with traditional enterprise software vendors many times over a missing feature or even a missing visualization. In these cases, if you are lucky, you get the feature in the next release 6-8 months from now; if not (which is usually the case), you just have to live without it. One example is the lack of an Excel export in Tableau, which in my view comes from Tableau not listening to its users and wanting to push online usage rather than data export. Bottom line: I had to build something myself, but it’s a hack and right now not supported.
Being open source, Superset is also more flexible: adding features or visualizations is something you can do yourself if you can program, or sponsor if you have a budget. Speaking of budget, Superset is completely free, with no limitations other than the computing power of your server. This means that adding, say, 1,000 additional users is going to be very cheap. Cost is important because developing a data-driven organization means giving all your employees access not only to advanced analytical dashboards but also to powerful tools for doing additional analysis on their own. This is one of the weak points of Tableau.
The database-to-dashboard architecture is also very powerful. With some Python and a database, you can quickly pull data from multiple places and, for instance, build dashboards on top of Google Sheets data.
OK, so Superset is cool and might have a bright future, but is it ready for enterprise usage now? The answer is: it depends, and analytics culture plays an important role here. To highlight the pros and cons in more detail, I’ll go point by point, comparing Superset with tools I have used in my work (Tableau, Cognos and BusinessObjects).
Please remember that these are first impressions of Superset.
Superset is much easier to get started with than Tableau, and let’s not even compare it to older tools like Cognos or BusinessObjects. Superset wins hands down in this department.
Online vs desktop
Superset works in all the most widely used browsers and has been built as a web app, so it does not require any additional desktop installation for power features. Among the tools I have used, Superset is the only one that can be used completely in the web browser, which is an advantage. You might think BusinessObjects is also used online, but the way I have seen most users work with it is to extract Excel files and then continue on the desktop. That is not a fully online data exploration workflow.
Superset’s mission is to make it easy to explore data and find insights. This has a lot of potential because it can scale (a big part of that is the lack of per-user license costs). Below you can see how data exploration works; the options change depending on the chart type you want to visualize.
It’s snappy and quite responsive, but there is no refresh button, so sometimes you have to save in order to “refresh”. That is a bit unintuitive, but it works. The experience is not as interactive as Tableau, and there are a lot of small details missing. Overall it works, but it needs to develop further. From what I have read, the team at Airbnb seems very focused on making this experience better, so I expect improvements in the future. Even so, it is already quite an impressive tool, and once you get into the flow you can use it to get business insights.
I plan to test this further with my own data, so stay tuned for more commentary on this one!
As of now, uploading data in Superset is not possible; you have to load the data into a database separately. While I understand why this is the case (data management does not belong in an analytics tool), I do see some challenges with that, especially for beginners. You can, however, build a relatively small Flask app to handle this use case (uploading Excel files, .csv, etc.). One thing I am wondering is whether the Superset navigation bar at the top can take additional drop-down menus. Ideally, users would not even realize they have left Superset when doing data operations and uploads.
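To make the upload idea a bit more concrete, here is a minimal sketch of the database-loading step such a small app could call once it has received a .csv file. It uses only the Python standard library, with an in-memory SQLite database standing in for whatever database Superset is pointed at. The function name `load_csv_into_table`, the table name `uploaded_sales` and the sample data are all made up for illustration.

```python
import csv
import io
import re
import sqlite3

# Table and column names cannot be passed as SQL parameters,
# so only plain identifiers are accepted.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def load_csv_into_table(conn, table, csv_text):
    """Create `table` from the CSV header row and insert every data row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    for name in [table, *header]:
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')

    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


# Hypothetical upload: two rows of sales data.
conn = sqlite3.connect(":memory:")
sample = "region,revenue\nEMEA,120\nAPAC,95\n"
inserted = load_csv_into_table(conn, "uploaded_sales", sample)
stored = conn.execute(
    "SELECT region, revenue FROM uploaded_sales ORDER BY region"
).fetchall()
```

Every value is stored as text here, so a real loader would probably want to infer or ask for column types. A Flask route would only need to read the uploaded file and hand its contents to a function like this one.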
Chart actions are a big one for me, and right now they are a miss in Superset. In Tableau, users can click on graphs and trigger additional actions (usually a filter on other graphs). Filtering features in general are a bit limited in Superset, and I hope that will improve over time. I did some research on the roadmap, but I could not find any discussion about action triggers on charts.
The tooltips in Superset cannot be customized. Tableau has very powerful tooltips that can display a lot of information from various fields; this is not the case in Superset.
As you can see above, the highlighted tooltip is good, but no additional information can be added to it.
Superset queries databases, and there is a nice SQL editor (SQL Lab) for writing custom SQL, but that’s where it stops. This means that right now, if you need a calculation that goes beyond SQL, you can’t build it.
However, Superset runs on top of Pandas, which is THE Python library for data crunching and manipulation (in many cases it can easily replace R and SAS). So I think SQL Lab might one day expand into something more than just SQL, maybe adding custom functions that run on top of Pandas data frames? I think this direction would make sense, so I wouldn’t be surprised to see it coming at some point.
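As a rough illustration of what such a custom function might look like, the sketch below takes a data frame (standing in for the result of a SQL query) and adds a calculated column in Python. This is not an existing Superset feature; the function name `add_share_of_group` and the sample data are invented for the example.

```python
import pandas as pd


def add_share_of_group(frame, value_col, group_col):
    """Return a copy of `frame` with each row's share of its group total."""
    result = frame.copy()
    # transform("sum") gives every row the total of the group it belongs to.
    group_totals = result.groupby(group_col)[value_col].transform("sum")
    result["share_of_group"] = result[value_col] / group_totals
    return result


# Stand-in for the rows a SQL query would return.
query_result = pd.DataFrame(
    {
        "region": ["EMEA", "EMEA", "APAC"],
        "product": ["A", "B", "A"],
        "revenue": [75.0, 25.0, 50.0],
    }
)
enriched = add_share_of_group(query_result, "revenue", "region")
```

The appeal of this direction is that any calculation you can write against a data frame becomes available, without waiting for the tool to support it.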
Some ideas for future developments
One strong advantage of Superset, which right now does not have as many features as Tableau or even Power BI, is that it is open source. You know what else is open source? WordPress. What if Superset goes in the direction of WordPress? What if Superset becomes the WordPress of analytics CMSs? Superset could easily go that way, because Flask’s blueprints make it possible to extend an application with additional web apps and plugins. There doesn’t seem to be an integration for this use case yet, but if it happens, Superset might scale beyond being a dashboard web application into a full-fledged BI and data science solution. This architecture makes sense because, by leveraging plugins, you can fit analytics to the exact needs of your organization. Compare that to running a full-fledged BusinessObjects solution, which is an “all in but not what you need” solution, and you’ll see the benefits.
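For readers who have not used Flask blueprints, here is a minimal sketch of the mechanism on a plain Flask app. It is not how Superset itself loads extensions; the blueprint name `wiki_plugin`, the URL prefix and the page list are hypothetical.

```python
from flask import Blueprint, Flask, jsonify

# A blueprint bundles routes that can be attached to any Flask app.
wiki_plugin = Blueprint("wiki_plugin", __name__, url_prefix="/plugins/wiki")


@wiki_plugin.route("/pages")
def list_pages():
    # Placeholder data; a real plugin would read from its own storage.
    return jsonify(pages=["getting-started", "kpi-definitions"])


def create_app():
    app = Flask(__name__)
    # Registering the blueprint mounts all of its routes under the prefix.
    app.register_blueprint(wiki_plugin)
    return app


app = create_app()
response = app.test_client().get("/plugins/wiki/pages")
```

The host application only needs the single `register_blueprint` call, which is what makes a plugin-style ecosystem plausible.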
Some ideas that come to mind of additional functionality that could be built as a plugin:
- A notification system. Think of a system where admins can schedule email reports or set alert rules that send a notification when some event happens (e.g. you just converted a big customer today).
- A data management system. Think of running simple ETL tasks using a simple online data pipeline tool. I am guessing a web interface for Airbnb Airflow would suffice, but I have not checked this one out yet.
- A wiki
- A forum
- A certification system to rank the quality of dashboards (maybe some dashboards for executives need to be checked by finance first before they get an “updated as of…” flag, etc.)
- An integration with one of the data science notebook or workbench applications
- And much much more
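To give a flavour of the first idea in the list, here is a small sketch of how alert rules could be represented and evaluated. It only decides which notifications should fire; sending emails and scheduling are left out. The class `AlertRule`, the metric names and the thresholds are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AlertRule:
    name: str
    condition: Callable[[dict], bool]  # receives the latest metrics
    message: str


def evaluate_rules(rules, metrics):
    """Return one notification string per rule whose condition holds."""
    return [
        f"{rule.name}: {rule.message}"
        for rule in rules
        if rule.condition(metrics)
    ]


rules = [
    AlertRule(
        "big_deal",
        lambda m: m["largest_deal_today"] >= 100_000,
        "A large customer converted today.",
    ),
    AlertRule(
        "low_signups",
        lambda m: m["signups_today"] < 10,
        "Signups are unusually low.",
    ),
]

# Hypothetical metrics, e.g. the result of a scheduled query.
latest_metrics = {"largest_deal_today": 250_000, "signups_today": 42}
notifications = evaluate_rules(rules, latest_metrics)
```

With these sample metrics only the `big_deal` rule fires, so `notifications` holds a single message.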
Is Superset a replacement for all the other BI tools out there? No. Can it grow to be a strong competitor? Yes. I do suspect Superset will always do things slightly differently, so the organizations using it need to have a certain culture of analytics, but there is potential. Superset can live alongside other analytics tools such as Tableau or Power BI. It is important to keep in mind that Superset scales better, as you don’t incur enormous license costs, and that it is built with data analysts in mind. Superset wants everybody in the organization to be a data scientist, at least a bit. Other tools are geared more towards power users who distribute reports and dashboards. Ultimately, it’s this difference in vision that might set Superset apart from similar tools in the enterprise world.