The Data challenge in Climate Tech

5 min read · May 24, 2023

As we move closer to 2030 “net zero” goals, the private sector is increasingly being held accountable for its environmental impact. It is expected to be instrumental in achieving emissions reduction targets, as well as in meeting other environmental goals, including water and biodiversity preservation, waste reduction and recycling. Yet before a company can improve its environmental footprint, it must first measure it accurately.

For this, one of the most interesting tools we have at our disposal is the Life Cycle Analysis (LCA) framework. LCA is a methodology for evaluating the environmental impact of a product or process throughout its entire life cycle, from the extraction of raw materials to its end-of-life disposal or recycling. The goal of LCA is to provide a comprehensive and objective assessment, and to identify opportunities for improvement. Once the LCA scope is defined, data must be collected on the product’s inputs and outputs. Yet this is where it gets tricky:

The data is scattered across many siloed environments: most supply chains are complex and fragmented, involving many layers (i.e. retail, wholesale, different tiers of processing & manufacturing, cooperatives & buying groups, all the way down to the farm). This is an intricate network of nodes, each of them potentially using a different Enterprise Resource Planning (ERP) or Product Lifecycle Management (PLM) system.
The data is time-consuming or expensive to collect: LCAs are typically done by consultants and can cost up to €20k per SKU. Identifying the relevant emissions sources, gathering data on those sources, calculating the emissions associated with each one, and reporting the results can be very challenging: it requires heavy collaboration and coordination with suppliers and other stakeholders, as well as the use of standardized emissions calculation methodologies and reporting frameworks.
The data is often wrong, incomplete or non-existent: companies lack robust systems for tracking & reporting emissions, and it can be challenging to obtain accurate data from their suppliers. Variations in accounting practices and definitions can also result in discrepancies between data sources; and time lags in reporting can result in outdated or incomplete information.
The data is often misinterpreted: taking data “shortcuts” can be dangerous. As an example, food items from the same category can hide very large disparities in environmental impact depending on their specific recipe and the processing technique used, e.g. a yogurt made with milk powder can emit 10 to 15 more CO2eq. than whole milk-based yogurt (according to Agribalyse’s data).
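To make the aggregation problem concrete, here is a minimal Python sketch of how a product footprint might be rolled up across supply chain tiers, and how gaps in supplier data end up being filled with a generic fallback value. The `Node` class, the `footprint` function and every figure below are hypothetical illustrations, not data from Agribalyse or from any real LCA tool.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One supply chain node (farm, processor, retailer, ...)."""
    name: str
    # kg CO2e per unit of finished product attributed to this node's own
    # activity; None means the node has not reported primary data.
    own_kgco2e: Optional[float] = None
    suppliers: list["Node"] = field(default_factory=list)


def footprint(node, fallback_kgco2e):
    """Return (total kg CO2e, names of nodes where the fallback was used)."""
    missing = []
    own = node.own_kgco2e
    if own is None:
        # The "shortcut": substitute a generic value for missing primary data.
        own = fallback_kgco2e
        missing.append(node.name)
    total = own
    for supplier in node.suppliers:
        sub_total, sub_missing = footprint(supplier, fallback_kgco2e)
        total += sub_total
        missing.extend(sub_missing)
    return total, missing


# Hypothetical three-tier chain; the dairy has not reported any data.
farm = Node("farm", own_kgco2e=1.0)
dairy = Node("dairy", own_kgco2e=None, suppliers=[farm])
retailer = Node("retailer", own_kgco2e=0.25, suppliers=[dairy])

total, missing = footprint(retailer, fallback_kgco2e=0.5)
print(f"{total} kg CO2e per unit; fallback used for: {missing}")
```

In this toy chain, one node out of three relies on the fallback, which is the kind of shortcut that can distort the result when the real process differs from the category average.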

As such, we are facing a true data challenge, which explains why LCAs are hard to scale and why many existing tracking solutions that rely on those “shortcuts” are accused of greenwashing.

Everything must start with compiling a high-quality, granular dataset leveraging both open and proprietary data. At Samaipata, we believe “verticalized” environmental management is the answer in this space. Retraced (a traceability platform for Fashion & Textile) and Carbon Maps (an environmental accounting platform for Food & Beverages), in which we have invested, as well as Carbonfact (a carbon management software for Fashion, in which we have not invested), illustrate our thesis well.

Our take is that verticalizing software to a specific industry allows:

To develop a more powerful modelling approach down to “Scope 3”: there is strong value in the engine itself. Mapping a complex series of nodes is challenging, and genuine industry expertise is required to digest the existing academic research and translate it into a live, actionable data model. This is especially true when the mapping goes all the way down “to the farm”, enabling a real Scope 3 assessment, i.e. measuring the indirect GHG emissions that fall outside both a company’s direct operational activities (Scope 1) and its purchased energy (Scope 2).
To access a more relevant training dataset: the algorithms are calibrated (and potentially trained in the medium term, when true reinforcement learning could be leveraged) with industry-specific, relevant data points. On top of mere open data, the dataset can be enriched with high-quality primary data coming from end users across one’s tiers of suppliers. SaaS platforms can also potentially provide real-time data, allowing for continuous tracking and frequent LCA updates.
To optimise data collection thanks to network effects and viral loops: data collection can be embedded into industry-specific workflows, including automated questionnaires or direct integrations with the ERPs most commonly used within the industry. Over time, collection becomes even more efficient, as all stakeholders (i.e. the suppliers of one’s suppliers) are already integrated into the tool and have learned to provide high-quality data quickly (i.e. the learning curve). At some point, these suppliers might even convert from free users who merely provide data into paying clients who collect and manage data themselves. In short, the network itself becomes the best sales channel, as it promotes itself. One example is Retraced’s partnership with Artistic Milliners, one of the leading denim producers, headquartered in Pakistan. Artistic Milliners first connected with Retraced as a supplier for an American brand and joined the network as it embraced supply chain transparency. Two years later, it became one of Retraced’s largest partners. Both parties now work together to enhance Retraced’s solution, with the denim manufacturer actively providing supply chain data to various brands on the platform.
To boost platform stickiness in the short term: over time, one can develop vertical-specific features that increase the tool’s utility for end users beyond a mere annual carbon audit, leading to less churn, higher Average Order Value (€) and, above all, higher potential for engagement. For example, Carbonfact is used on a weekly basis by Creative and Production teams within fashion brands to inform product ecodesign decisions.
To go from “measuring” to “reducing” in the medium term: eventually the “tracking” solution can become a decision-making tool for operational units. The end goal of the Carbon Maps platform is to allow management to reduce its emissions baseline by adjusting intra-category (i.e. recipe/product mix) and inter-category (i.e. SKU) allocations.
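As a rough illustration of this last point, the Python sketch below compares an emissions baseline before and after pulling two levers: changing a recipe, and shifting volumes between SKUs. The function name, SKU names and figures are invented for the example and do not reflect how Carbon Maps (or any other platform) actually models this.

```python
def baseline(volumes, intensity):
    """Total emissions in kg CO2e: units sold x per-unit footprint, summed over SKUs."""
    return sum(units * intensity[sku] for sku, units in volumes.items())


# Hypothetical per-unit footprints (kg CO2e per unit) and annual volumes.
intensity = {"yogurt_classic": 2.0, "yogurt_light": 1.0}
volumes = {"yogurt_classic": 1000, "yogurt_light": 1000}
current = baseline(volumes, intensity)

# Lever 1 (assumed): reformulate one recipe, lowering its per-unit footprint.
new_intensity = {**intensity, "yogurt_classic": 1.5}
after_recipe = baseline(volumes, new_intensity)

# Lever 2 (assumed): shift volume from the heavier SKU to the lighter one.
new_volumes = {"yogurt_classic": 500, "yogurt_light": 1500}
after_mix = baseline(new_volumes, intensity)

for label, value in [("recipe change", after_recipe), ("mix shift", after_mix)]:
    saving = (current - value) / current
    print(f"{label}: {value:.0f} kg CO2e ({saving:.0%} below baseline)")
```

Such a what-if comparison is only as trustworthy as the per-unit footprints fed into it, which is why the measurement layer described above comes first.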

As such, in this already competitive industry, we believe “verticalizing” is the best approach to automate and scale LCAs, generate real value for end-users and help companies to better understand their environmental impacts and implement effective strategies for reducing their emissions.

That being said, the relevance of verticalisation largely depends on i) the size of the company, ii) the complexity of the underlying industry in which it operates and iii) how deeply its operations are intertwined within that industry.

It is worth mentioning that although the LCA method works well for evaluating carbon intensity, it is not perfect: because it always boils down to the quantity of products produced (i.e. it is a “production” function), one needs to go beyond it to capture all indirect environmental impacts, such as the 360-degree impact on soils and biodiversity, which is inseparable from GHG emissions in a complete climate strategy. Carbon Maps goes into further detail on this point here.

Anyway, we can dream of a world where all supply chains have auditable and certifiable environmental data, but we are still far from it. In the meantime, vertical SaaS solutions, tailored to specific industries or sectors, incorporate a deep understanding of the unique challenges, regulations and best practices related to emissions tracking and climate tech.

If that sounds like a company you’re building, we want to hear from you here!


Written by Aurore Falque-Pierrotin