Open data: tech companies seek better licences

Managing IP is part of Legal Benchmarking Limited, 1-2 Paris Gardens, London, SE1 8ND

Copyright © Legal Benchmarking Limited and its affiliated companies 2025



With the growing popularity of data development in the open source space, counsel at Red Hat, Microsoft and Benevolent AI say a mechanism is needed to enable better collaboration

Tech companies have embraced open data to spur product and services development – particularly for those driven by hybrid cloud and artificial intelligence solutions – but they now need better licences to enable them to protect IP when they start and collaborate on different data projects.


Companies such as IBM are driving an AI and cloud-computing ‘revolution’ but need copious amounts of data to train and develop those technologies at speed. Open data, where firms access and develop data sets made available online, has become an attractive means to achieve that goal.

The concept is becoming particularly popular among driverless car developers. Waymo, for example, recently released its first data set to researchers.

But in-house counsel at firms such as Red Hat, Zoox, Benevolent AI and Microsoft – which are also investing heavily in machine learning and cloud computing – say that because open data is so new, they are still trying to figure out how their businesses can best work on various projects.

Open data initiatives are often covered by licences that, like open source software licensing, are not truly ‘open’ and can set out either fairly permissive or restrictive terms of use.

Businesses must therefore conduct due diligence to ensure they are only held to acceptable terms for a particular project, and cover their own data sets with licences that protect company interests and IP while encouraging further contributions from other parties.

“We focus a lot on open source, and open data sharing has come up a lot recently in the driverless car industry,” says Chris Nalevanko, general counsel and head of IP strategy at Zoox in California. “We want to make sure we are participating in the right things but also being guarded as to our core IP.”

Gareth Jones, vice president of IP at Benevolent AI in London, adds that ‘open data’ does not carry a single definition. “We use data from a huge variety of different sources, some of which is proprietary commercial data we pay for [to have] a licence to, and some of which is freely available – and all kinds in between along that spectrum,” he says.

Patrick McBride, senior director of IP at Red Hat in North Carolina, says that his business is still trying to figure out the open data landscape, much as it had to figure out the open source world years ago.

He says that open data not only helps enable hybrid cloud solutions – technology that Red Hat and its parent company IBM are pursuing with great energy – but also allows businesses to take advantage of ongoing advances in AI.

He adds that he is now trying to understand the role open data will play in Red Hat’s overall hybrid cloud strategy and how it will enable the business to continue to provide value to its customers and subscribers.

Data’s next top model

The problem is that many open data licences are difficult to understand, vary considerably between different organisations or projects and are often not fit for purpose in modern projects.

Several organisations, including Microsoft and the Linux Foundation, are now making efforts to standardise data licensing and make it easier to understand. In-house sources say they are keeping an eye on these efforts and hope to apply the resulting licences to their own work.

Erich Andersen, chief IP counsel at Microsoft in the US, says that when the company looked at the landscape it found too many different data use licences, and gaps in those that existed. It responded by launching three potential licences on GitHub for community feedback.

“They are designed with an eye of standardising terms for the most common data uses. That is an initiative we will continue,” says Andersen.

Jones says the efforts by Microsoft and the Linux Foundation are a step in the right direction. The Open Knowledge Foundation set up some open data licences a few years ago, but those are less fit for purpose given the way people now use data in AI and machine learning models.

“Data licensing is incredibly complicated and time-consuming, and a lot of the issues surround whether different organisations – particularly SMEs and academic institutions – have the resources available to help them understand it.

“Sometimes it is just a matter of ambiguity in these licences. If you have lots of people writing their own licences, suddenly everyone is using a different language and they are not always clear, particularly when the people providing the data do not understand the new technologies.”

McBride at Red Hat adds: “It is fairly obvious that one important thing to figure out is some mechanism for collaborating on the data itself and the AI products of that data – that is, an AI that knows how to react to certain situations and address what to look for in a particular data set when presented with new data because it has been trained on open data.

“There are efforts out there to come up with template licences for different environments. We are looking at those with interest and trying to work out which would work best with ours.”

Some say that data is the oil of the 21st century – everything runs on it in one form or another, and no one can get enough of it. Open data is the way forward, but the licensing infrastructure isn’t quite good enough to encourage wide-scale use just yet. Once it is, open data will offer a deep well of opportunity for tech companies.
