How to Evaluate an ETL Tool?
Out of the box vs. custom
As companies seek for software to fit their data integration needs, the first question they need to decide upon is whether to build a custom or to obtain an out-of-the-box ETL tool. The best answer to this question is simple: it depends on your needs. In any case, both approaches have to be carefully examined, so here are some advantages and disadvantages of each method.
The primary reason to obtain an out-of-the-box ETL tool is that it helps companies to minimize the time and cost required for developing custom code. Many companies are confident that ETL tools from vendors are more effective at delivering projects on time and within budget. There are also other advantages—such as a wide range of rich built-in features, the ability to be reused for future projects, the simplicity of ETL maintenance, etc.
On the other hand, the time spent on evaluating, selecting, adopting, and learning the product can be tremendous. Training costs and support fees, as well as not fitting 100% to your business needs are also among the drawbacks. On top of that, a proprietary data integration tool may cost up to $500,000 on annually (including licensing and maintenance), according to Gartner and Bloor Research.
Some organizations believe that custom ETL tools save time and money on support and maintenance of a data warehouse, because hand-coded software is 100% geared to specific business needs. More flexibility for complex data sources and transformation mappings are also among the advantages of in-house projects. Meanwhile, the time needed for development from scratch may take months or even years—depending on the size and complexity of your information infrastructure. Besides, these products don’t provide all the rich functionality of proprietary ETL software packages and can hardly be reused for other projects without additional customization.
Each of the approaches may have its own particular implementations across industries and vary a lot depending on your requirements. Just keep in mind that the goal of an efficient ETL tool is to integrate data quickly, practically, and cost-effectively.
At the same time, an open-source ETL tool can be something in between, combining the pros (and cons, of course) of the two models.
Criteria for vendor selection
Still, before making a decision about purchasing commercial software or downloading an open-source ETL tool, you need to research the market. What are the features you should look for in the ideal ETL software?
Here is a list of parameters to help you decide if a certain ETL tool suits your technology requirements:
- Supported operating systems and runtimes, desktop/SaaS, server/client
- Available connectors for data sources needed (ERP, CRM, databases, etc.)
- Data formats (CSV, XML, unstructured data, etc.)
- Task scheduling
- Logging, error handling, alerts
These are some of the core integration features that one should pay attention to:
- Available data transformation functions/operations
- Support for handling data quality
- Data cleansing or enhancement
- Metadata support, business rules implementation
- Batch mode / real-time
- Synchronization (if available): one-way or bidirectional
Many ETL tools have either strong transformation or strong cleansing capabilities. If your data can potentially be corrupted, make sure your product has strong data quality capabilities.
On the dawn of data integration, ETL tools were supposed to be used by IT people. Today, these products are managed by business users, too. At the same time, the increasing complexity of data sets requires ETL vendors to implement sophisticated features. So, what is the right combination of functionality and ease of use?
The following functionality is required for ETL tools to meet business user expectations:
- A visual interface that enables to drag and drop data elements (such as source/target connectors)
- An intuitive GUI for building data maps and setting transformation rules withour coding
- Reusable artifacts (such as prebuilt mappings)
- Comprehensive user manuals and demos
It is also helpful to obtain customer references from each ETL vendor or search community forums to get the reviews from end users who needed the same functionality as your organization requires. With this approach, you may also find out whether these tools are difficult to troubleshoot and maintain—or not.
Finally, remember that you should define your master data requirements and set data integration goals before selecting an ETL tool.
P.S. For an open-source ETL tool that can be either customized or used out of the box, feel free to explore Apatar.
Further reading
- Data Integration: ETL vs. Hand-Coding
- Gartner Suggests Rationalizing Data Integration Tools to Cut Costs
- Reducing ETL and Data Integration Costs by 80% with Open Source