Increasing competitiveness with product-centric ESG reporting

ESG compliance is no longer a nice-to-have. It has become essential for competing in an increasingly sustainability-conscious market, saving resources and costs, and meeting ever-stricter regulatory requirements. Regulations such as the European Commission’s Corporate Sustainability Reporting Directive (CSRD) or the Supply Chain Act require companies to report ESG data transparently. While many organizations still rely on document-centric approaches and struggle with isolated solutions, a strategic competitive advantage is emerging elsewhere: product-centric ESG reporting.

What is ESG reporting?

Sustainable business practices have many facets. The ESG approach breaks them down into three core dimensions:

• E = Environmental
• S = Social
• G = Governance

In an ESG report, companies provide information on all three areas. This includes data such as CO2 footprints, energy consumption in production and operations, and information on promoting biodiversity and reducing waste. It also covers aspects such as compliance with fair labor conditions and human rights, ensuring diversity, and implementing effective risk management and compliance practices.

Data management is key in ESG reporting

This data – especially environmental KPIs – is often scattered across multiple sources: internal IT tools, external environmental databases, or supplier and partner systems. For many companies, preparing an ESG report therefore comes down to one central question: How can reliable ESG data from diverse sources across the entire value chain be collected and analyzed?

Companies consider the multitude of data sources and the varying data quality to be among the biggest challenges in ESG reporting. (BARC GmbH 2024)

One key solution lies in anchoring ESG reporting directly within product development – specifically, in the PLM system. This is where crucial data across the entire product lifecycle is stored: information about the product portfolio, the materials used and their sourcing, emissions from production and the supply chain, as well as data from later lifecycle phases such as use, disposal, and recycling. With this structured and traceable data foundation, a PLM system provides the ideal basis for a precise, transparent, and strategically valuable sustainability assessment.

Product-centric single source of truth as an enabler

An open integration platform like CONTACT Elements offers another crucial advantage for ESG reporting: it seamlessly incorporates information from a variety of internal and external sources. Through APIs, it exchanges data with third-party systems such as ERP tools. Supply chain information can be integrated via standardized exchange formats like the Asset Administration Shell (AAS) or data ecosystems (such as Pontus-X or Catena-X). This makes the platform a single source of truth for company-wide ESG reporting.

ESG reporting powered by CONTACT Elements.

Ideally, such a solution comes with built-in capabilities to assess and analyze the data. For example, CONTACT Elements uses AI methods to evaluate data quality. In the next step, powerful modules – such as for calculating the Product Carbon Footprint – then generate a compliant ESG report. This results in comprehensive, audit-ready reporting that meets all market-specific requirements.

From ESG reporting to a sustainability strategy

Companies that rely on product-centric, integrated solutions like CONTACT Elements don’t just tackle the mandatory task of ESG reporting – they have the chance to strategically embed sustainability across the organization. For example, ESG data in CONTACT Elements can be directly linked to product structures and development processes. This allows developers to make early assessments of potential CO2 emissions across the product portfolio or in specific manufacturing processes, and to optimize them in a targeted way.

The result: sustainable innovations, more attractive products, streamlined processes, and lower costs. The foundation for this is always a software platform like CONTACT Elements: open, scalable, and equipped with powerful business applications.

Learn in this article by consulting firm CIMdata how companies can systematically embed sustainability in PLM to reduce their environmental impact across the entire product lifecycle.

Embeddings explained: basic building blocks behind AI-powered systems

With the rise of modern AI systems, you often hear phrases like, “The text is converted into an embedding…” – especially when working with large language models (LLMs). However, embeddings are not limited to text; they are vector representations for all types of data.

Deep learning has evolved significantly in recent years, particularly through the training of large models on massive datasets. These models generate versatile embeddings that prove useful across many domains. Since most developers lack the resources to train their own models, they use pre-trained ones.

Many AI systems follow this basic workflow:

Input → API (to large deep model) → Embeddings → Embeddings are processed → Output
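The workflow above can be sketched in a few lines of Python. The `embed` function below is a deliberately simple stand-in for the call to a large pre-trained model (in a real system this step would be an API request to a hosted model); the query and documents are invented for illustration.

```python
import math

# Toy stand-in for the "API (to large deep model)" step (hypothetical);
# a real system would request embeddings from a hosted model instead.
def embed(text: str) -> list[float]:
    # Character-frequency features, L2-normalized -- not a real embedding,
    # just enough structure to walk through the pipeline.
    counts = [text.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]
    norm = math.sqrt(sum(v * v for v in counts)) or 1.0
    return [v / norm for v in counts]

# Processing step: compare embeddings via cosine similarity
# (a plain dot product here, since the vectors are already normalized).
def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Input -> embeddings -> embeddings are processed -> output
query = "solar panel"
docs = ["solar panels and energy", "quarterly tax report"]
scores = [cosine(embed(query), embed(d)) for d in docs]
best = docs[scores.index(max(scores))]
print(best)  # the document most similar to the query
```

Swapping the toy `embed` for a real model call leaves the rest of the pipeline unchanged, which is exactly why this workflow is so common.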

In this blog post, we take a closer look at this fundamental component of AI systems.

What are embeddings?

Simply put, an embedding is a kind of digital summary: a sequence of numbers that captures the characteristics of an object, whether it is text, an image, or audio. Similar objects have embeddings that are close to each other in the vector space.

Technically speaking, embeddings are vector representations of data. They are based on a mapping (embedder, encoder) that functions like a translator. Modern embeddings are often created using deep neural networks, which reduce complex data to a lower dimension. However, some information is lost through compression, meaning that the original input cannot always be exactly reconstructed from an embedding.
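The lossy nature of this compression is easy to demonstrate. The mapping below is a made-up, extreme example (reducing 2-d points to a single number), but it shows the general point: once different inputs map to the same embedding, the original can no longer be reconstructed.

```python
# Extreme compression sketch: map 2-d points to 1-d by keeping only
# their sum. Different inputs can collide, so the mapping is lossy.
def compress(point: tuple[float, float]) -> float:
    x, y = point
    return x + y

a, b = (1.0, 2.0), (2.0, 1.0)
print(compress(a), compress(b))  # same embedding, different inputs
```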

How do embeddings work?

Embeddings are not a new invention, but deep learning has significantly improved them. They can be constructed manually or learned automatically through machine learning. Early methods like Bag-of-Words or One-Hot Encoding are simple approaches that represent words by counting their occurrences or using binary vectors.

Today, neural networks handle this process. Models like Word2Vec or GloVe automatically learn the meaning of and relationships between words. In image processing, deep learning models identify key points and extract features.
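The early methods mentioned above fit in a few lines of plain Python. This is a minimal sketch with a made-up two-sentence corpus: Bag-of-Words counts word occurrences over a fixed vocabulary, while One-Hot Encoding gives each word a binary vector with a single 1.

```python
# Tiny corpus for illustration; the vocabulary is derived from it.
sentences = ["the cat sat on the mat", "the dog sat"]
vocab = sorted({w for s in sentences for w in s.split()})

# Bag-of-Words: one count per vocabulary term.
def bag_of_words(sentence: str) -> list[int]:
    words = sentence.split()
    return [words.count(term) for term in vocab]

# One-Hot Encoding: a binary vector with a single 1 per word.
def one_hot(word: str) -> list[int]:
    return [1 if term == word else 0 for term in vocab]

print(vocab)                       # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(bag_of_words(sentences[0]))  # [1, 0, 1, 1, 1, 2]
print(one_hot("dog"))              # [0, 1, 0, 0, 0, 0]
```

Note that neither representation captures meaning: "cat" and "dog" are no closer to each other than to "mat", which is precisely the limitation learned embeddings overcome.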

Why are embeddings useful?

Because almost any type of data can be represented with embeddings – text, images, audio, videos, graphs, and more. In a lower-dimensional vector space, tasks such as similarity search or classification are easier to solve.

For example, if you want to determine which word in a sentence does not fit with the others, embeddings allow you to represent the words as vectors, compare them, and identify the “outlier”. Embeddings also enable connections between different formats: a text query can, for example, also retrieve matching images and videos.
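The odd-one-out idea can be sketched with hand-made toy vectors. These 2-d embeddings are invented for illustration (in practice a model like Word2Vec would supply high-dimensional ones); the outlier is simply the word with the lowest average similarity to the rest.

```python
import math

# Hand-made 2-d "word embeddings" for illustration only;
# a trained model such as Word2Vec would supply these in practice.
emb = {
    "apple":  (0.90, 0.10),
    "banana": (0.85, 0.20),
    "cherry": (0.80, 0.15),
    "laptop": (0.10, 0.95),
}

def cosine(a: tuple, b: tuple) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# The outlier is the word least similar, on average, to the others.
def odd_one_out(words: list[str]) -> str:
    def avg_sim(w: str) -> float:
        others = [o for o in words if o != w]
        return sum(cosine(emb[w], emb[o]) for o in others) / len(others)
    return min(words, key=avg_sim)

print(odd_one_out(["apple", "banana", "cherry", "laptop"]))  # laptop
```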

In many cases, you do not need to create embeddings from scratch. There are numerous pre-trained models available, from the language models behind ChatGPT to image models like ResNet. These can be adapted for specialized domains or tasks.

Small numbers, big impact

Embeddings have become one of the buzzwords in AI development. The idea is simple: transforming complex data into compact vectors that make it easier to solve tasks like detecting differences and similarities. Developers can choose between pre-trained embeddings or training their own models. Embeddings also enable different modalities (text, images, videos, audio, etc.) to be represented within the same vector space, making them an essential tool in AI.

For a more detailed look at this topic, check out the CONTACT Research Blog.

Building a Semantic Search: Insights from the start of our journey

Research in the field of Artificial Intelligence (AI) is challenging but full of potential – especially for a new team. When CONTACT Research was formed in 2022, AI was designated as one of four central research areas right from the start. Initially, we concentrated on smaller projects, including traditional data analysis. However, with the growing popularity of ChatGPT, we shifted our attention to Large Language Models (LLMs) and took the opportunity to work with cutting-edge tools and technologies in this promising field. But as a research team, one critical question emerged: Where do we get started?

Here, we share some of our experiences which can serve as guidance to others embarking on their AI journey.

The beginning: Why similarity search became our starting point

From the outset, our goal was clear: we wanted more than just a research project – we aimed for a real use case that could ideally be integrated directly into our software. To get started quickly, we opted for small experiments and looked for a specific problem that we could solve step by step.

Our software stores vast amounts of data, from product information to project details. Powerful search capabilities make a decisive difference here. Our existing search function did not recognize synonyms or natural language, sometimes missing what users were really looking for. Together with valuable feedback, this quickly led to the conclusion that similarity search is an ideal starting point and should therefore be our first research topic. An LLM has the power to elevate our search functionality to a new level.

The right data makes the difference

Our vision was to make knowledge from various sources such as manuals, tutorials, and specifications easily accessible by asking a simple question. The first and most crucial step was to identify an appropriate data source: one large enough to provide meaningful results but not so extensive that resource constraints would impede progress. In addition, the dataset needed to be of high quality and easily available.

For the experiment, we chose the web-based documentation of our software. It contains no confidential information and is accessible to customers and partners. Initial experiments with this dataset quickly delivered promising results, so we intensified the development of a semantic search application.

What is semantic search?

In short, unlike classic keyword search, semantic search also recognizes related terms and returns contextually related results – even if these are phrased differently. How does this work? In the first step, semantic indexing, the LLM converts the content of the source texts into vectors and stores them in a database. Search queries are transformed into vectors in the same way and then compared to the stored vectors using a “nearest neighbor” search. The results are returned as a sorted list with links to the documentation.
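The indexing and query steps can be sketched as follows. The character-trigram `embed` function is a deliberately crude stand-in for the LLM so that the example runs without a model, and the document texts are invented; the two-phase structure (index once, then embed and compare each query) is the part that carries over to a real system.

```python
import math

# Stand-in embedder (hypothetical): character trigram counts instead of
# an LLM, so the sketch runs without any model.
def embed(text: str) -> dict[str, int]:
    t = text.lower()
    vec: dict[str, int] = {}
    for i in range(len(t) - 2):
        g = t[i:i + 3]
        vec[g] = vec.get(g, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[g] * b.get(g, 0) for g in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Step 1 -- semantic indexing: embed every document once, store vectors.
docs = {
    "doc1": "How to configure the search index",
    "doc2": "Installing the software on Linux servers",
    "doc3": "Managing user permissions and roles",
}
index = {doc_id: embed(text) for doc_id, text in docs.items()}

# Step 2 -- query time: embed the query, rank documents by similarity
# (a brute-force nearest-neighbor search over the stored vectors).
def search(query: str) -> list[str]:
    q = embed(query)
    return sorted(index, key=lambda d: cosine(q, index[d]), reverse=True)

print(search("configuring search"))  # most relevant document first
```

In production, the brute-force loop would be replaced by a vector database with an approximate nearest-neighbor index, but the flow stays the same.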

Plan your infrastructure carefully!

Implementing our project required numerous technical and strategic decisions. For the pipeline that processes the data, LangChain best met our requirements. The hardware also posed challenges: for text volumes of this scale, laptops are insufficient, so servers or cloud infrastructure are required. A well-structured database is another critical factor for successful implementation.

Success through teamwork: Focusing on data, scope, and vision

Success in AI projects depends on more than just technology; it is also about the team. Essential roles include Data Engineers who bridge technical expertise and strategic goals, Data Scientists who analyze large amounts of data, and AI Architects who define the vision for AI usage and coordinate the team. While AI tools supported us with “simple” routine tasks and creative impulses, they could not replace the constructive exchange and close collaboration within the team.

Gather feedback and improve

At the end of this first phase, we shared an internal beta version of the Semantic Search with our colleagues. This allowed us to gather valuable feedback in order to plan our next steps. The enthusiasm for further development is high, fueling our motivation to continue.

What’s next?

Our journey in AI research has only just begun, but we have already identified important milestones. Many exciting questions lie ahead: Which model will best suit our long-term needs? How do we make the results accessible to users?

Our team continues to grow – in expertise, members, and visions. Each milestone brings us closer to our goal: integrating the full potential of AI into our work.

For detailed insights into the founding of our AI team and on the Semantic Search, visit the CONTACT Research Blog.