How to Scope and Estimate an Integration Project
Like many software projects, integration projects can be complex and difficult to comprehend at the outset. For commerce companies, integrating various systems—such as inventory management, e-commerce platforms, CRM, and ERP systems—is often a necessary step to streamline operations and enhance efficiency.
As the head of technology or IT, it’s important to manage costs when undertaking these projects. Scope management and estimation are two critical aspects of cost management. If integration projects are in your future, it’s essential to become proficient at scoping and estimating them.
In this post, we’ll share a methodology (that we've built into our product at Doohickey AI) for scoping and estimating software integration projects tailored for commerce companies.
Endpoint Systems & Data Flows
There are two key elements you need to know to properly scope an integration project: 1) which systems will be integrated—called endpoint systems—and 2) what data flows must be implemented to meet the business requirements.
Your goal should be to collect a reasonable amount of information to define a practical scope and estimate for an integration project. You’ll have most of what you need by describing key attributes of the endpoint systems and data flows that make up the integration.
Defining Endpoint Systems
Endpoint systems are straightforward to identify. They are the systems being integrated—pieces of business software that, when combined via integration, will create a seamless experience for users, making operations more efficient and accurate.
For a given integration project, there must be at least two endpoint systems—otherwise, there’s nothing to integrate. In the context of a commerce company, these could include your e-commerce platform, inventory management system, CRM, ERP, or other specialized software. It’s best to focus on the attributes that will have the biggest impact on the project’s complexity and timeline.
You’ll need to dive into and document the following attributes for each endpoint system:
- Data File/Format: What will the data look like coming out of or going into the system? With modern APIs, this could be JSON or XML. It could be tabular data in a CSV file. Understanding the data format is crucial for compatibility.
- Data Protocol: How will the integration communicate to retrieve or save that data? HTTP is common with modern APIs, but protocols like SFTP, FTPS, SSH, and others are still widely used.
- Authentication: How will the integration authenticate with the endpoint system? Does it use an open, standard authentication approach like OAuth, or something proprietary?
- Known Capacity Limits: Are there any capacity considerations? For example, many APIs specify how many calls you’re allowed to make within a given timeframe before errors occur.
- Known Issues: Are there any other known issues about hosting, permissions, technology, or other factors that may impact the integration? Uncovering complexity or uncertainty early is beneficial.
When defining the scope for an integration project, it’s also helpful to give each endpoint a simple, referenceable name. This aids in communication and reduces confusion during the integration design process. You’ll use these names as part of expressing the data flows for the project.
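To make these attributes concrete, below is a minimal sketch of how they might be recorded for one endpoint, using a plain Python dataclass. The field names and example values (the “shop” platform, its rate limit, its sandbox quirk) are hypothetical placeholders, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class EndpointSystem:
    """One endpoint system, described by the attributes that drive integration complexity."""
    name: str                    # short, referenceable name, e.g. "shop"
    data_format: str             # "JSON", "XML", "CSV", ...
    protocol: str                # "HTTPS (REST)", "SFTP", ...
    authentication: str          # "OAuth 2.0", "API key", "proprietary token", ...
    capacity_limits: str = ""    # e.g. "2 requests/second per account"
    known_issues: list[str] = field(default_factory=list)

# A hypothetical e-commerce platform endpoint:
shop = EndpointSystem(
    name="shop",
    data_format="JSON",
    protocol="HTTPS (REST API)",
    authentication="OAuth 2.0",
    capacity_limits="2 requests/second, bursts rejected with HTTP 429",
    known_issues=["sandbox schema lags production by one release"],
)
```

A spreadsheet row or a short document per endpoint works just as well; the point is to capture the same attributes, consistently, for every system in the project.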
Defining Data Flows
Data flows represent, at an abstract requirements level, which data needs to flow in which direction and when. Typically, a data flow’s intention is to move some primary record type, but it may also include related types. The record type may be transformed along the way (e.g., converting order data into shipping instructions).
Data flows can be confusing for a few reasons:
- They are an abstraction. The actual technology used for the integration may not literally implement data flows as described. It’s a concept that bridges the gap between business needs and technical specifications.
- Integration technologies vary. Custom code, an integration framework, and a commercial integration platform like Doohickey AI may each handle data flows differently.
- Terminology differs. Terms like integration flow, data stream, automation flow, workflow, and pipeline are often used interchangeably. Choice of terminology is often based on product marketing or habit.
A practical way to think of data flows is to imagine building pipelines from one system to another to move different types of data. Which data needs to flow in which direction? Are there data transformations required along the way?
Use Cases and Data Flows
Defining use cases first can be helpful. They are usually expressed in a format that’s comfortable for business stakeholders. The goal is to break down “make System A talk to System B” into tangible specifications that enable someone to build the integration. While there’s often a one-to-one relationship between use cases and data flows, this isn’t always the case. Relating data flows to their use cases helps stakeholders understand the technical output.
Data Flow Attributes
For each data flow, you’ll want to define:
- Source & Target System: Which direction is the data flow going? Avoid defining bi-directional data flows; if data needs to move both ways, define two separate flows, one in each direction.
- Primary Record Type: What is the main record type being moved? The trigger and filter are typically defined in terms of this record.
- Additional Record Type(s): Are there any secondary or related records that the data flow will move? (e.g. order line items to go with order headers)
- Trigger: What initiates the data flow? Is it scheduled? Event-driven? User-initiated?
- Filter: Should any records be ignored after the trigger fires? Filters might be based on statuses, timestamps, or other criteria.
- Failure Scenarios: What could cause data transfer failures? Are there mismatches between systems? Data quality issues? System reliability concerns?
- Volume Expectations: What are the expected data volumes? Integrating 10 records per day is different from handling millions.
While some of this is technical, aim to define each attribute in business terms while keeping it closely aligned with technical reality. The objective is to provide enough information for an analyst and engineer to complete the project.
Note: It’s not recommended to get into the weeds by including individual field-to-field mappings at this stage. That detailed work is part of the project itself.
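As with endpoint systems, it helps to capture each data flow in a small, uniform structure. The sketch below shows one illustrative way to do that in Python, reusing the endpoint names from the earlier example; every field name and value here is an assumption made for the example, not a required format:

```python
from dataclasses import dataclass, field

@dataclass
class DataFlow:
    """One directional data flow, captured at the requirements level."""
    name: str
    source: str                      # endpoint name, e.g. "shop"
    target: str                      # endpoint name, e.g. "erp"
    primary_record: str              # e.g. "sales order"
    additional_records: list[str] = field(default_factory=list)
    trigger: str = ""                # "event: order created", "schedule: every 15 minutes", ...
    record_filter: str = ""          # e.g. "only orders with status 'paid'"
    failure_scenarios: list[str] = field(default_factory=list)
    expected_volume: str = ""        # e.g. "~500 orders/day"

# A hypothetical flow pushing paid orders from the e-commerce platform to the ERP:
orders_to_erp = DataFlow(
    name="orders-to-erp",
    source="shop",
    target="erp",
    primary_record="sales order",
    additional_records=["order line items"],
    trigger="event: order created in shop",
    record_filter="only orders with status 'paid'",
    failure_scenarios=["SKU missing in ERP", "ERP nightly maintenance window"],
    expected_volume="~500 orders/day, 10x peaks during promotions",
)
```

Written this way, each flow remains a one-directional statement of what moves, when, and what could go wrong, which is the level of detail the scoping and estimating steps below rely on.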
Scoping Integration Projects
Now that you understand endpoint systems and data flows, you can define the bounds of an integration project. The scope is essentially the sum of the data flows needed to achieve the desired business use cases.
Data flows are the primary lever for scope. Adding more data flows increases scope. However, to understand the data flows completely, you must also define the endpoint systems’ attributes, as these contribute to each data flow’s complexity and estimate.
For example, suppose you’re integrating your e-commerce platform with your ERP system. Knowing that the e-commerce platform has a REST API and supports token-based authentication, while the ERP system uses a custom SOAP API with proprietary authentication, is crucial. These details impact how you design each data flow between the systems.
This detailed understanding helps the integration team decide on the parameters for each data flow, and collectively, the overall integration project. From there, you can estimate time and budget.
Estimating Integration Projects
Once everything is documented, stakeholders can review and agree on what’s to be implemented. At some point, you’ll need to answer, “How long will this take?” or “How much will this cost?” Now, you must provide an estimate based on the information gathered.
Integration projects aren’t always linear, so estimating the required effort can be challenging. The key is to focus on the attributes that significantly impact risk and, consequently, the time involved.
The attributes shared earlier for defining data flows and endpoint systems should feed directly into your estimates. Consider them for every endpoint system involved in the data flows required to meet the business objectives.
We recommend assigning a “t-shirt size” (small, medium, large) to each data flow’s complexity, which requires some experienced intuition. You don’t want to design the whole thing before estimating; instead, use a quick assessment of key attributes to estimate from an informed place.
The unit of measurement for your estimate is up to you. You could consider labor hours or something more abstract like story points. You might make each attribute a multiple-choice question with a specific estimate adjustment tied to each option. This approach helps codify the impact of various factors (e.g., dealing with a GraphQL API).
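To illustrate the mechanics, here is a minimal sketch of such a model in Python: each t-shirt size carries a baseline number of hours, and each multiple-choice answer applies a multiplier. The sizes, attributes, and multipliers below are made-up placeholders, not calibrated values:

```python
# Hypothetical baseline hours per t-shirt size for a single data flow.
BASE_HOURS = {"small": 16, "medium": 40, "large": 80}

# Hypothetical multipliers tied to each multiple-choice answer.
ADJUSTMENTS = {
    "authentication": {"oauth": 1.0, "api_key": 1.0, "proprietary": 1.3},
    "protocol": {"rest": 1.0, "graphql": 1.15, "soap": 1.25, "sftp": 1.1},
    "volume": {"low": 1.0, "medium": 1.1, "high": 1.3},
}

def estimate_flow_hours(size: str, answers: dict[str, str]) -> float:
    """Turn a t-shirt size plus attribute answers into an hour estimate."""
    hours = BASE_HOURS[size]
    for attribute, choice in answers.items():
        hours *= ADJUSTMENTS[attribute][choice]
    return round(hours, 1)

# Example: a medium flow against a SOAP ERP with proprietary auth and modest volume.
print(estimate_flow_hours("medium", {
    "authentication": "proprietary",
    "protocol": "soap",
    "volume": "medium",
}))  # 40 * 1.3 * 1.25 * 1.1 = 71.5 hours
```

The specific numbers matter far less than applying the same model to every data flow, so estimates stay comparable and can be tuned as actuals come in.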
Estimating any project involves both art and science. Start with a reasonable framework, but adjust as you learn from past estimates. Try to incorporate that experience back into your estimation approach. And don’t blindly trust any estimate; if something feels off, revisit your assumptions.
Conclusion
Integration projects are complicated enough on their own; producing a reasonable estimate shouldn’t add to the struggle. Capturing the attributes above in a consistent estimation model will help you navigate this step with informed, defensible estimates.
Save for a few details, these are the principles we’ve built into the Doohickey AI process, and they’ve served us well. Try it out and let us know if we can help.