John Glendenning, general manager and vice president EMEA at DataStax, considers the rise of connected cars.

The sheer amount of data being generated today is astonishing. Whether it comes from human activities (the emails we send, the videos we post to YouTube or the websites we read) or is generated by the machines that surround us every day, data is everywhere.

Cars are no different. At the extreme end of the automotive industry, each Formula 1 car creates around 36 terabytes of data during a race. This data comes from more than 120 sensors distributed throughout the car, covering everything from braking systems and tyre pressure to engine calibration and temperature. All this information is used to determine optimal performance and spot where improvements can be made.

Just like all the other technology used by F1, this same approach to gathering data is making its way into vehicles on the street that you or I might buy. The “connected car” also focuses on performance; however, rather than eking out the last few miles per hour or extra downforce, these initiatives aim to improve areas like fuel efficiency and reliability. In turn, this leads to a better driving experience and happier customers.

Making connected car programmes work involves combining car systems, mobile networks and applications with a way to store the data. With this in place, car manufacturers can gather information from every vehicle fitted with the technology and continually monitor how it performs in different conditions. For companies like Hyundai and Honda, this data is then used to create opportunities to improve the performance of their cars over time.

Take temperature, for example. By looking at how cars perform in more extreme temperatures, manufacturers can predict where they should invest their time and resources in support. Tracking data from across thousands of vehicles also makes it easier to spot how these cars and their components perform over time. This makes it possible to manage the whole lifecycle of support and service more efficiently, providing greater returns for the carmaker and a better customer experience.

In addition to the car manufacturers, there are other opportunities to make data around vehicle use more valuable. For example, insurance companies have launched applications that use data to rate drivers. This data can be analysed for risk management and to help drivers improve their performance. The aim for the insurance companies is to improve their approach to managing risk and premium costs, while also giving drivers the information they need to drive more safely.

Alongside the consumer car, this ability to gather data can be used in commercial fleets as well. While fleet operators have often tracked data around location, routing and fuel, there is an opportunity to expand how data is used within these organisations too. Looking in more detail at telemetry data can provide opportunities to be more proactive in maintenance and scheduling. Aeroplane manufacturers like Boeing have taken advantage of data in this way to improve turn-around times and prevent issues from affecting customers; vehicle fleets can make use of data in the same way.

The vision of the connected car relies on data to make it work in practice. Alongside taking information from sensors embedded in the vehicles and transmitting it over cellular or mobile networks, that data has to be stored somewhere. To cope with the volume of data being created by vehicles, new database platforms are required.

NoSQL is the term for a family of databases that are designed to cope with the high volume of data being created by today’s applications and devices. Vehicle manufacturers and service companies have to handle huge volumes of data that are being created in real time. Each action within a car, such as pressing the accelerator or making a turn, creates data that has to be gathered. All of this information has to be kept in order, so that it represents an accurate picture of what was occurring. This is referred to as time-series data.
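
To illustrate what keeping events "in order" means, here is a minimal Python sketch, not taken from any particular product. The `TelemetryEvent` fields and the `VehicleTimeline` class are hypothetical names chosen for this example. It assumes each event carries a millisecond timestamp, and it keeps one vehicle's events sorted even when they arrive out of sequence over the network.

```python
import bisect
from dataclasses import dataclass, field


@dataclass(order=True)
class TelemetryEvent:
    # Hypothetical event shape; only the timestamp takes part in ordering.
    timestamp_ms: int
    vehicle_id: str = field(compare=False)
    action: str = field(compare=False)
    value: float = field(compare=False)


class VehicleTimeline:
    """Keeps one vehicle's events sorted by timestamp, even if they arrive late."""

    def __init__(self):
        self._events = []

    def record(self, event):
        # insort places the event at its sorted position by timestamp.
        bisect.insort(self._events, event)

    def between(self, start_ms, end_ms):
        """Return events with start_ms <= timestamp_ms < end_ms, oldest first."""
        keys = [e.timestamp_ms for e in self._events]
        lo = bisect.bisect_left(keys, start_ms)
        hi = bisect.bisect_left(keys, end_ms)
        return self._events[lo:hi]


timeline = VehicleTimeline()
timeline.record(TelemetryEvent(3000, "car-1", "turn", 12.5))
timeline.record(TelemetryEvent(1000, "car-1", "accelerate", 0.4))
timeline.record(TelemetryEvent(2000, "car-1", "brake", 0.9))
print([e.action for e in timeline.between(0, 10_000)])
```

However the events arrive, reading them back gives the sequence in which they happened, which is the property that makes the data usable as a time series.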

Apache Cassandra is an open source database originally created by Facebook and now used by companies as diverse as Netflix, eBay and Aeris. This NoSQL platform has found a place as the database platform of choice for companies that have to deal with huge volumes of time-series data, like that created by vehicles. The reason for this is its ability to handle the number of data events that are created across a network of hundreds of thousands of devices.
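
One widely used way to organise time-series data in a store of this kind is to group readings by source and time bucket, then keep them ordered by timestamp inside each group. The sketch below illustrates that idea in plain Python, with a dictionary standing in for the database. It is an assumption made for illustration and does not use a Cassandra client or reproduce any company's schema. The one-day bucket and the names `partition_key` and `InMemoryTimeSeriesStore` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(vehicle_id, timestamp_ms):
    """Group readings by vehicle and UTC calendar day (a hypothetical bucketing choice)."""
    moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return (vehicle_id, moment.strftime("%Y-%m-%d"))


class InMemoryTimeSeriesStore:
    """Stand-in for a database table: a plain dict, not a Cassandra client."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def write(self, vehicle_id, timestamp_ms, reading):
        # Each write touches only the partition for that vehicle and day.
        key = partition_key(vehicle_id, timestamp_ms)
        self._partitions[key].append((timestamp_ms, reading))

    def read_day(self, vehicle_id, day):
        """Return one vehicle's readings for one day, ordered by timestamp."""
        rows = self._partitions.get((vehicle_id, day), [])
        return sorted(rows, key=lambda row: row[0])
```

Because each vehicle and day forms its own group, writes from many vehicles land in different groups, and a query for one vehicle's day only has to read one of them.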

By scaling to hold this volume of data, Cassandra provides a base for the analytics side of vehicle telemetry. This enables companies to save all of the data, which will consist of millions of “actions” over time. Kept intact, this data can be interrogated in the future to spot new opportunities for improvement.

The big challenge for projects around the connected car is knowing the right questions to ask of the data. Previously, companies might not have been able to save all the information that was available to them from their vehicle telematics. However, the advent of open source and NoSQL has coincided with the growth of cloud computing platforms like Amazon Web Services and Google Compute Engine. Together, these have made storage capacity available that can cope with Internet of Things data and store it indefinitely at very economical rates.

Further developments in the kinds of applications that can be built around the connected car are also coming. By using NoSQL platforms like Cassandra, companies can look for trends in their data over time. In turn, this can create opportunities for the vehicle manufacturer or fleet operator to improve their performance as a business.
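
As a rough sketch of what looking for a trend over time might involve, the Python below averages a metric per month and fits a straight line through the monthly averages. The metric (fuel consumption in litres per 100 km), the sample values and the function names are all made up for illustration; a real analysis would run against the stored telemetry and would probably use a dedicated analytics tool.

```python
from collections import defaultdict
from statistics import mean


def monthly_averages(readings):
    """readings: iterable of (month, value) pairs, month as 'YYYY-MM'.
    Returns a list of (month, average) sorted by month."""
    by_month = defaultdict(list)
    for month, value in readings:
        by_month[month].append(value)
    return [(month, mean(values)) for month, values in sorted(by_month.items())]


def trend_per_period(averages):
    """Least-squares slope of the averaged values against the period index."""
    ys = [avg for _, avg in averages]
    n = len(ys)
    if n < 2:
        return 0.0  # Not enough periods to fit a line.
    x_mean = (n - 1) / 2
    y_mean = mean(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Made-up fuel consumption readings in litres per 100 km.
sample = [("2015-01", 6.0), ("2015-01", 7.0), ("2015-02", 7.0),
          ("2015-03", 7.0), ("2015-03", 8.0)]
averages = monthly_averages(sample)
print(averages, trend_per_period(averages))
```

A positive slope here would mean consumption is creeping up month by month, the kind of signal a manufacturer or fleet operator might follow up with maintenance.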

John leads DataStax’s operations around the use of NoSQL in private and public sector organisations. He has spent over 19 years within the enterprise infrastructure software and hardware industry for companies such as Citrix, XenSource, Platform Computing and Compaq.