Managing the Data Pipeline with Git + Luigi

One of the common pains of managing data, especially at larger companies, is that a lot of it gets dirty (which you may or may not even notice!) and ends up scattered everywhere. Ad hoc scripts run in many different places, and they silently generate dirty data.
February 26, 2015 at 01:14PM


from Tumblr http://shiyamaz.tumblr.com/post/112106407828