In an interaction with Industry Outlook, Jacob Peter, Executive Board, Senior Vice President, Mobility R&D, Bosch Global Software Technologies (BGSW), shares his insights about overcoming the key challenges in building end-to-end IoT Solutions for seamless data collection, ensuring security and privacy, effective data management strategies, and more. With a successful track record in leading the digital, IoT, and IT delivery unit, Jacob is passionate about exploring the potential of connectivity, machine learning, and analytics in automobiles.
Connecting the three pillars of IoT (sensors, software, and services) enables seamless data collection, analysis, and actionable insights, driving efficiency, automation, and innovation across various industries. What are the key challenges in building end-to-end solutions to achieve this objective?
We have been involved in IoT for a long time and across most aspects of the value chain. Whether it is incorporating MEMS technology at the sensor level or integrating it into products aimed at end users, such as automotive or consumer goods, the market is filled with a multitude of interconnected products. More than efficiency or automation, the result must be a satisfied customer: whatever technology you apply has to provide value to the end user. That is the goal we focus on whenever we establish connections or integrate sensors in any given field, and it has been an overriding principle in our engineering. Our strategy is oriented around the product, software, and services. Beyond that, we have realized that AI has a great deal of impact on how the various features and functions are consumed by the consumer. In fact, the new term we use is AIoT, i.e., AI-enabled IoT.
An example is our collaboration with MTU, a Rolls-Royce subsidiary that specializes in manufacturing large engines. Our contribution was to connect their massive engines, commonly used in mining operations, where the prevailing challenge was accessibility. Our solution encompassed an entire framework, with sensors affixed to the engines themselves and their data transmitted seamlessly to the cloud. On top of this technological foundation, the system added a range of enhancements, such as predictive maintenance capabilities, the capacity to oversee thousands of engines in real time, and a strategic business model.
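To make the sensor-to-cloud link concrete, here is a minimal sketch of a device-side telemetry loop that samples engine readings and pushes them to a cloud ingestion endpoint. This is not BGSW's or MTU's actual stack; the endpoint URL, the field names, and the read_sensors() helper are all hypothetical.

```python
# Minimal device-side telemetry loop (illustrative only, not a production design).
import time
import requests

INGEST_URL = "https://iot.example.com/ingest"  # hypothetical cloud endpoint


def read_sensors() -> dict:
    """Placeholder for reading the engine's on-board sensors."""
    return {"engine_id": "mtu-001", "oil_temp_c": 92.4, "rpm": 1480, "vibration_g": 0.12}


def publish_telemetry(interval_s: float = 10.0) -> None:
    while True:
        sample = read_sensors()
        sample["ts"] = time.time()
        try:
            # Fire-and-forget upload; a production system would queue and retry.
            requests.post(INGEST_URL, json=sample, timeout=5)
        except requests.RequestException:
            pass  # tolerate transient connectivity loss
        time.sleep(interval_s)
```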
The primary emphasis when delving into such technologies should be on selecting the right technology for the specific business case. It is important to assess whether it brings value to the end consumer and can be integrated into a viable business model. Without this alignment, maintaining these technological frameworks becomes challenging. Thus, this remains the central area of concentration when constructing such stacks.
How can we efficiently collect, process, analyze, and derive meaningful insights from the massive volume of sensor-generated data while ensuring optimal storage, real-time capabilities, data quality, and security?
Consider two distinct scenarios for the utilization of sensor-generated data: firstly, in the realm of real-time decision-making, and secondly, in the training of AI models, particularly within the domain of autonomous vehicles. When examining the functioning of an autonomous car, during the training phase, data gathered from various driving situations serves the purpose of constructing AI models. These models potentially become active in real-time operational situations. Notably, the learning process does not involve the original data but rather relies on information obtained from the diverse array of sensors. The resultant model then undertakes the task of making decisions. While an AI model can attain a considerable degree of accuracy in a relatively short time span, enhancing its capabilities to handle intricate and infrequent scenarios poses a significant challenge. This challenge is commonly termed the "Long Tail" of model development. Achieving a satisfactory level of proficiency in the model's performance can be realized quite expeditiously, potentially within a few months.
However, addressing the final 2 percent of the challenge involves tackling the intricacies of corner cases. This entails a dual approach with the data loop and the software loop. The software loop pertains to the development of the actual software, while the data loop revolves around feeding the data back in a precise manner. Imagine that during the training and testing of your model for road signs, certain issues arise. In such instances, it becomes imperative to acquire additional data related to road signs. This necessitates handling substantial volumes of data, possibly in the petabyte range, which forms the raw database.
To manage this effectively, specialized tools are essential. These tools refine the data, extracting relevant specifics that can then be fed back into the model to enhance its training. This is where synthetic data enters the equation: since obtaining real-world data every time might not be feasible, data can be synthesized and integrated into the system. That constitutes one facet of the process, the model's development. Once the model is ready for deployment, the focus shifts to real-world data and the capability to provide ongoing updates. As maturity increases and more data are gathered from the environment, a learning process occurs in the backend rather than on the device: data is sent to the backend and a new model is brought back. This complete cycle necessitates the careful design of data pipelines. There are two distinct pipelines: one for learning and another for real-time use cases. Navigating this process requires making decisions at various stages, including selecting appropriate tools and infrastructure. These decisions are crucial for establishing a strong foundation in the architectural approach.
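As a simplified illustration of the data loop described above, the sketch below selects samples for a scenario the model struggles with (road signs), tops them up with synthetic data when real recordings are scarce, and hands the batch back to retraining. The Sample structure and all function names are illustrative, not a real pipeline API.

```python
# Illustrative "data loop": refine raw data to a weak scenario, augment with synthetic samples.
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Sample:
    scenario: str      # e.g. "road_sign"
    payload: bytes     # raw sensor recording
    synthetic: bool = False


def select_scenario(raw: Iterable[Sample], scenario: str) -> list[Sample]:
    """Refine the raw data lake down to the scenario that needs more coverage."""
    return [s for s in raw if s.scenario == scenario]


def top_up_with_synthetic(real: list[Sample], target: int) -> list[Sample]:
    """Generate synthetic samples when real-world data alone cannot reach the target size."""
    missing = max(0, target - len(real))
    synthetic = [Sample("road_sign", b"<rendered-scene>", synthetic=True) for _ in range(missing)]
    return real + synthetic


def data_loop(raw: Iterable[Sample], target: int = 10_000) -> list[Sample]:
    batch = top_up_with_synthetic(select_scenario(raw, "road_sign"), target)
    return batch  # fed back into retraining, i.e. the software loop
```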
As the number of IoT devices and applications increases, managing the scalability of sensor networks, software systems, and service infrastructures becomes critical to handle large volumes of data and users. What solution do you propose for this?
Certainly, the complexity varies significantly depending on the number of devices involved. The challenges shift between smaller quantities like 100 and larger scales like 10,000, but the real game-changer is dealing with millions or even billions of devices. The initial decisions you make play a crucial role, as overlooking them can lead to costly consequences. We have witnessed costly mistakes in the industry from not addressing these issues correctly, and it is evident that systems can collapse when pushed beyond a certain threshold. What is often overlooked is the impact of the data payload itself: even a difference of a few bytes can have a profound effect. Equally important is the decision about which processing tasks to carry out at the edge versus how much data to send to the cloud, because as the number of devices increases, the computational power and storage needed to support the system increase correspondingly.
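The back-of-envelope arithmetic below illustrates the payload point: a few extra bytes per message turns into a very different ingest and storage bill at fleet scale. The figures are illustrative only, not from any particular deployment.

```python
# Illustrative payload arithmetic: daily ingest volume as a function of fleet size and payload.
def daily_ingest_gb(devices: int, payload_bytes: int, messages_per_day: int) -> float:
    return devices * payload_bytes * messages_per_day / 1e9


fleet = 10_000_000          # ten million devices
msgs = 24 * 60              # one message per minute
for payload in (48, 64):    # just 16 extra bytes per message
    print(f"{payload} B payload -> {daily_ingest_gb(fleet, payload, msgs):,.0f} GB/day")
# 48 B -> ~691 GB/day, 64 B -> ~922 GB/day: roughly 230 GB/day of extra transfer and storage.
```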
These payload and edge-versus-cloud considerations are essential not only for scalability but also for devices that play a critical role in the subsequent stages of your supply chain. When continuously monitoring a mobile device involved in sensitive logistics 24/7, it is important to account for network coverage gaps. To address potential data loss in areas with weak network signals, it is crucial to implement methods for storing data locally on the device; this becomes vital when data loss could significantly impact business operations. Load testing, which requires appropriate infrastructure, is also essential to simulate and prepare for extreme usage scenarios, especially when conducted in a virtual environment.
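A minimal store-and-forward sketch of the local buffering mentioned above: samples are persisted in an on-device SQLite queue and drained whenever the uplink is available again. The send callable is a stand-in for whatever transport the device uses, not a real API.

```python
# Store-and-forward buffer for coverage gaps (illustrative sketch).
import json
import sqlite3


class TelemetryBuffer:
    def __init__(self, path: str = "telemetry.db") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, body TEXT)")

    def enqueue(self, sample: dict) -> None:
        # Persist locally first so a dropped uplink never loses the sample.
        self.db.execute("INSERT INTO queue (body) VALUES (?)", (json.dumps(sample),))
        self.db.commit()

    def drain(self, send) -> None:
        """Replay queued samples through `send`; stop at the first failure."""
        rows = self.db.execute("SELECT id, body FROM queue ORDER BY id").fetchall()
        for row_id, body in rows:
            if not send(json.loads(body)):
                break  # uplink still down; keep the rest queued
            self.db.execute("DELETE FROM queue WHERE id = ?", (row_id,))
        self.db.commit()
```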
These design considerations hold great significance for me, as making incorrect assumptions in any of these areas can lead to serious consequences; correcting them later could result in substantial expense and potential embarrassment. The allocation of funds for cloud computing and storage would differ, potentially affecting the viability of the entire business. Therefore, examining scalability from various perspectives is essential: evaluating edge versus cloud solutions, determining the data volume to be processed at the edge, the amount to be transferred to the cloud, and the overall impact on factors like bandwidth and processing resources. Rigorous testing becomes crucial to understand these implications thoroughly.
Protecting IoT devices, software platforms, and services from cyber threats and unauthorized access is a significant challenge. Ensuring end-to-end security, data encryption, and authentication mechanisms are crucial to safeguard the entire IoT ecosystem. What approach do you advocate for this?
The evolution of IoT has transformed everything into a computer, from household items like bulbs to complex machines. As we move into foolproof industrial solutions, the entire system has to be considered. Instances of hacked cars highlight the dual nature of cybersecurity. One side is the physical realm, where robustness is crucial, especially in life-critical areas like automotive. This demands an architecture that segregates networks and manages security effectively, employing gateways for centralized control; robust security mechanisms should be consolidated and placed strategically wherever potential external connections to a device exist. The other side is the traditional realm, where security has evolved gradually under persistent attempts by those with malicious intent to breach systems and disrupt businesses. The concept of embedded security poses a challenge.
While it might not be at the forefront of our thoughts, it holds paramount importance. As you establish a continuum from a sensor to a cloud environment, a comprehensive array of security measures becomes essential. This includes safeguarding perimeters, implementing device-level security, and adopting appropriate encryption and decryption models. Consider vehicles, which endure for around 15 years; robust security strategies, possibly grounded in specific key management philosophies, become crucial. Quantum computing, with its extraordinary computational power, may render conventional encryption keys obsolete. Consequently, your ability to address this challenge within the lifespan of your product, whether through initial design or subsequent modification, becomes crucial for ensuring security.
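As one concrete example of device-level encryption in such a sensor-to-cloud continuum, the sketch below encrypts telemetry payloads with AES-GCM using the cryptography package. Key distribution, rotation, and post-quantum migration, which are exactly the long-lifespan concerns raised above, are deliberately left out; the algo tag is included so the scheme could be swapped later without breaking older payloads.

```python
# Illustrative payload encryption with AES-256-GCM; not a complete key-management design.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def encrypt_payload(key: bytes, plaintext: bytes, device_id: str) -> dict:
    nonce = os.urandom(12)  # must be unique per message for GCM
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, device_id.encode())
    return {"algo": "AES-256-GCM", "nonce": nonce, "ct": ciphertext, "device": device_id}


def decrypt_payload(key: bytes, msg: dict) -> bytes:
    # The device id is bound as associated data, so tampering with it fails authentication.
    return AESGCM(key).decrypt(msg["nonce"], msg["ct"], msg["device"].encode())


# Usage: in practice the key would come from a hardware security module or a
# provisioning service, not be generated ad hoc on the device.
key = AESGCM.generate_key(bit_length=256)
msg = encrypt_payload(key, b'{"rpm": 1480}', "engine-001")
assert decrypt_payload(key, msg) == b'{"rpm": 1480}'
```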
The landscape of security threats is dynamic, with quantum computing representing a significant potential threat. While offering substantial advantages to humanity, it introduces new challenges. It is vital to anticipate these upcoming technological shifts and incorporate them into your product design process. Hence, integrating security from the very inception of your project, along with thorough testing and strategic countermeasures, is essential to ensure a resilient solution.