Open data: tech companies seek better licences
Managing IP is part of the Delinian Group, Delinian Limited, 4 Bouverie Street, London, EC4Y 8AX, Registered in England & Wales, Company number 00954730
Copyright © Delinian Limited and its affiliated companies 2024

With the growing popularity of data development in the open source space, counsel at Red Hat, Microsoft and Benevolent AI say a mechanism is needed to enable better collaboration

Tech companies have embraced open data to spur product and services development – particularly for those driven by hybrid cloud and artificial intelligence solutions – but they now need better licences to enable them to protect IP when they start and collaborate on different data projects.


Companies such as IBM are driving an AI and cloud-computing ‘revolution’ but need copious amounts of data to train and develop those technologies at speed. Open data, where firms access and develop data sets made available online, has become an attractive means to achieve that goal.

The concept is becoming particularly popular among driverless car developers. Waymo, for example, recently released its first data set to researchers.

But in-house counsel at firms such as Red Hat, Zoox, Benevolent AI and Microsoft – which are also investing heavily in machine learning and cloud computing – say that because open data is so new, they are still trying to figure out how their businesses can best work on various projects.

Open data initiatives are often covered by licences that, like open source software licensing, are not truly ‘open’ and can set out either fairly permissive or restrictive terms of use.

Businesses must therefore conduct due diligence to ensure they are only held to acceptable terms for a particular project, and cover their own data sets with licences that protect company interests and IP while encouraging further contributions from other parties.

“We focus a lot on open source, and open data sharing has come up a lot recently in the driverless car industry,” says Chris Nalevanko, general counsel and head of IP strategy at Zoox in California. “We want to make sure we are participating in the right things but also being guarded as to our core IP.”

Gareth Jones, vice president of IP at Benevolent AI in London, adds that ‘open data’ does not carry a single definition. “We use data from a huge variety of different sources, some of which is proprietary commercial data we pay for [to have] a licence to, and some of which is freely available – and all kinds in between along that spectrum,” he says.

Patrick McBride, senior director of IP at Red Hat in North Carolina, says that his business is still trying to figure out the open data landscape, much as it once had to figure out the open source world.

He says that open data not only helps enable hybrid cloud solutions – technology that Red Hat and its parent company IBM are pursuing with great energy – but also allows businesses to take advantage of recent advances in AI.

He adds that he is now trying to understand the role open data will play in Red Hat’s overall hybrid cloud strategy and how it will enable the business to continue to provide value to its customers and subscribers.

Data’s next top model

The problem is that many open data licences are difficult to understand, vary considerably between different organisations or projects and are often not fit for purpose in modern projects.

Several organisations, including Microsoft and the Linux Foundation, are now making efforts to standardise data licensing to make it easier to understand. In-house sources say they are keeping an eye on these efforts and hope to apply the resulting licences to their own work.

Erich Andersen, chief IP counsel at Microsoft in the US, says the company found too many examples of data use licences when it looked at the landscape and that there were gaps in those existing licences. It responded by launching three potential licences on GitHub for community feedback.

“They are designed with an eye of standardising terms for the most common data uses. That is an initiative we will continue,” says Andersen.

Jones says the efforts by Microsoft and the Linux Foundation are a step in the right direction. The Open Knowledge Foundation set up some open data licences a few years ago, but those are less fit for purpose today because of the way people now use data in AI and machine learning models.

“Data licensing is incredibly complicated and time consuming, and a lot of the issues surround whether different organisations – particularly SMEs and academic institutions – have the resources available to help them understand it.

“Sometimes it is just a matter of ambiguity in these licences. If you have lots of people writing their own licences, suddenly everyone is using a different language and they are not always clear, particularly when the people providing the data do not understand the new technologies.”

McBride at Red Hat adds: “It is fairly obvious that one important thing to figure out is some mechanism for collaborating on the data itself and the AI products of that data – that is, an AI that knows how to react to certain situations and address what to look for in a particular data set when presented with new data because it has been trained on open data.

“There are efforts out there to come up with template licences for different environments. We are looking at those with interest and trying to work out which would work best with ours.”

Some say that data is the oil of the 21st century – everything runs on it in one form or another, and no one can get enough of it. Open data is the way forward, but the licensing infrastructure isn’t quite good enough to encourage wide-scale use just yet. Once it is, open data will offer a deep well of opportunity for tech companies.
