open data Sharing Economy Transparency



In the absence of open data standards, companies like Airbnb can define their own terms for behaving in an “open and transparent” way.

  • In early December, Airbnb made headlines by releasing some data on how people are using the company’s platform in New York City.

    In doing so, the company has provided an object lesson in the critical role that data plays (and will continue to play) in government regulation of private companies in the 21st century, and highlighted how ill-equipped governments are to obtain and use this data.


    Over the past year, the use of Airbnb to rent properties in New York has received intense scrutiny from government regulators because of suspicion that a non-trivial number of rentals listed on the site were in violation of state or city rules. In late 2014, New York State Attorney General Eric Schneiderman released a report examining Airbnb rentals in New York that concluded that “most short-term rentals booked [through Airbnb’s service] in New York violate the law.”

    The recent data release by Airbnb was meant to make good on a promise by the company in response to the Attorney General’s report to be “open and transparent,” and to underscore its contention that the vast majority of its users use the service in compliance with state and local laws.

    From the start, the release of this data was viewed with skepticism by some journalists, activists, and others who closely watch the “sharing economy.” In order to view it, interested parties needed to make an appointment to review the data in person at Civic Hall, where Airbnb is an organizational member. The data was highly redacted and was not published to the company’s data website, and those viewing it were not allowed to copy it to review it more closely after their scheduled appointment.

    In the weeks that followed the release, outside reviews of the data seem to contradict Airbnb’s primary claim that most of its hosts are using the service lawfully. Others have cited the need for more detailed information to draw definitive conclusions. Though Airbnb has indicated that it would like to share similar data on usage for other cities, there is no indication that the company will release more detailed information for New York City.

    It seems clear that the ultimate question of whether Airbnb’s service is being used lawfully will be determined by examining data on how people use the service. State regulators and others know this and have used legal and other means to try to get this data. Airbnb also knows this, and it has carefully constructed its limited release of data (and its public outreach about this data release) to assert that they are in compliance with the law.

    But if data is at the heart of government’s ability to regulate Airbnb and other “sharing economy” companies, why are governments so ill-equipped to obtain and use it?


    At its core, the sharing economy (whether defined by the example of Uber, Airbnb, or one of the other standard-bearers of this emerging class of business) represents a change in the way that people consume goods and services that is enabled by advances in technology. This change in consumption results in a challenge to an existing, entrenched market actor (for Uber, the taxi industry; for Airbnb, the hotel industry) that is subject to existing government taxes and regulations.

    The challenge for public officials and regulators is to avoid stifling new innovation that can result in better services but at the same time to ensure the fair and equitable application of rules that have been set up to regulate business operations and protect consumers. This is an issue that can have significant political implications. Striking the proper balance between competing needs can indeed be tricky.

    But this challenge is not a new one for governments—tension between regulators and private interests spans the history of collective governance.

    Supporters of the sharing economy commonly criticize existing government regulations as outdated and ill-suited to support innovation by 21st century technology companies. Reputation systems are often pointed to as a 21st century alternative to traditional government regulation, to ensure that sharing economy firms and similar types of companies act in the best interest of consumers.

    But there are a number of common examples where consumers engage in transactions for which they have an abundance of reputational information where the company providing the services is also highly regulated by government. Consider the example of dining out at a restaurant—never before have consumers had as much information as we do now about the quality and cleanliness of food service establishments. And yet, restaurants and food service establishments are highly regulated by multiple levels of government.

    In addition, recent research specifically examining reputation systems used by technology companies suggests that reputation systems and more traditional central regulation can work beneficially in tandem. A recent study from the Ohio State University examined the reputation system used by eBay in conjunction with a buyer protection program—a centrally managed program to provide protections to buyers and recourse if they are dissatisfied with purchases (an approach strikingly similar to the notion of centralized regulation). The study concluded that:

    “[W]e estimate that the total welfare rises by 2.9% after the introduction of the buyer protection program. This increased welfare demonstrates an efficiency gain by having the two mechanisms, the eBay Buyer Protection and eBay Top-Rated Sellers, in place.” [Emphasis added]

    Sharing economy supporters also claim that the growth of companies like Airbnb, Uber, Taskrabbit and others has led to a fundamental change in the economy, with more people opting to become freelancers who, to paraphrase the words of sharing economy companies, become masters of their own destiny. But another recent study from George Mason University suggests that the freelancer phenomenon began long before the advent of sharing economy companies like Airbnb and Uber:

    “Our data support the claim that there has been an increase in nontraditional employment, but the data refute the idea that this increase is caused by the sharing-economy firms that have arisen since 2008. Instead, we view the rise of sharing-economy firms as a response to a stagnant traditional labor sector and a product of the growing independent workforce.”

    The George Mason study helps to clarify the role that sharing economy companies can play in a changing economy—as opportunities for traditional employment become scarcer, the sharing economy may play an important role in providing employment for a changing workforce.

    But we should not allow these potential benefits to be offset by other negative consequences that may arise if sensible regulations are not applied to sharing economy companies.


    The exercise of regulating private interests by government almost always involves an information asymmetry. Governments seek to discover the occurrence of specific activities that are subject to regulation and to apply relevant rules and taxes to those activities.

    Prior to the digital age, governments sought to (and still seek to) address this asymmetry by hiring groups of trained employees like auditors, inspectors and agents. The job of these individuals is to investigate certain kinds of activities and transactions to determine if the activity in question falls under the authority of a specific regulation or regulating entity, and then to enforce any applicable rules or taxes.

    With the rise of the internet and the dawn of the digital age, governments have employed a wide range of new tools to help ensure that the activities of private interests comply with rules that have been adopted by elected and appointed bodies. The steady march of technical advancement has also made compliance with government-mandated taxes and rules easier and more efficient for businesses than ever before.

    The tension between how technology alters production and consumption patterns, and how these new patterns square with existing government rules, is not new.

    In the 1960s, the rise of mail order retailers began a protracted debate on the application of state and local sales taxes to remote sales, a debate that has raged through the time when internet retailers like Amazon developed and flourished, and that still rages today.

    In the early 2000s, a new class of business, bolstered by the increasing availability of broadband internet access, began to offer consumers new options for telecommunication services that bypassed the Public Switched Telephone Network. In a scenario that is remarkably similar to the sharing economy, these new VoIP companies competed against large entrenched industry incumbents that were heavily regulated by government by offering customers improved service, a better customer experience, and lower prices.

    The tension between the new service being offered to consumers by VoIP companies and existing government rules was ultimately resolved by an order from the FCC.

    So while the tension with existing tax and regulatory requirements created by technical advances is not new, and existing government institutions have shown they are capable of resolving these tensions to the benefit of consumers, things are a little different when we consider the companies that make up the sharing economy.

    Sharing economy companies self-identify as “technology companies,” not as dispatch companies (in the case of Uber) or hoteliers (in the case of Airbnb) that happen to make heavy use of internet technologies. They position themselves not as providers of a service, but as enablers or connectors that bring together individuals who want to transact with each other.

    The issue with respect to government regulations as they relate to sharing economy companies is not so much that existing regulations are outdated as some have claimed. Instead, it is that the infrastructure for ensuring compliance with these regulations was not constructed for the 21st century—or, at best, the infrastructure has been minimally and very unevenly built.

    Government regulations need to speak the native language of these companies—in the 21st century, this language is almost always data in JSON or CSV format delivered over HTTP.


    Consider Airbnb’s recent data release—what if the State of New York, as part of its request to the company to share data, had offered Airbnb the ability to publish data to its open data portal?

    The state could have provided the company with a user account and requested that they publish their data using that account at regular intervals to allow for scrutiny by regulators and other interested parties. They could have provided Airbnb with existing guidelines for ensuring privacy as well as metadata guidelines to help ensure data quality.

    Even if Airbnb didn’t opt to publish data to the state’s open data portal, the request to do so would have helped to better qualify the deficiencies in the data the company actually did release. The state’s portal already contains scores of data sets with detailed information from dozens of agencies. Are technology companies like Airbnb less equipped to publish data on their core business operations than the State Liquor Authority? Really?

    With very few exceptions, government open data portals are one-way vehicles, transporting data unidirectionally from government agencies to external consumers. Governments largely don’t view their open data portals as platforms for commingling data from different data producers, much less as vital instruments for successful 21st century regulation.

    But what if they did?

    With complete enough data, the question of whether an Airbnb host is in compliance with the law is fairly easy to answer. The issue is that in the absence of a standard mechanism for sharing this data openly, companies like Airbnb can define their own terms for behaving in an “open and transparent” way.
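To make the point concrete, here is a minimal sketch of what such a check might look like if booking data were published as CSV. Everything in it is an assumption made for illustration: the column names, the sample rows, and the rule itself (flagging entire-home bookings shorter than 30 nights) are hypothetical stand-ins, not Airbnb’s actual data format or a statement of what New York law requires.

```python
import csv
import io

# Hypothetical rule, for illustration only: an entire-home booking shorter
# than MIN_NIGHTS is flagged for review. A real legal test is more involved.
MIN_NIGHTS = 30

# Invented sample data; the column names are assumptions, not a real schema.
SAMPLE_CSV = """listing_id,host_id,room_type,nights
101,h1,entire_home,3
102,h1,private_room,2
103,h2,entire_home,45
104,h3,entire_home,7
"""


def flag_bookings(csv_text, min_nights=MIN_NIGHTS):
    """Return the listing IDs of bookings that match the hypothetical rule."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        is_entire_home = row["room_type"] == "entire_home"
        if is_entire_home and int(row["nights"]) < min_nights:
            flagged.append(row["listing_id"])
    return flagged
```

Calling `flag_bookings(SAMPLE_CSV)` on the invented rows above picks out listings 101 and 104. The logic is trivial; the hard part, as the Airbnb release shows, is getting data complete enough to run it on.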

    We need to expand the way we think about open data so that it’s not just about agencies publishing data to an open data portal, but instead is an integral way that we collectively help ensure the health and stability of our communities. We need to expand our notion of “government as a platform” to go beyond just building new civic apps to helping ensure efficient compliance with rules that are adopted collectively through democratic processes.

    We’ve started to assemble some of the building blocks for this new infrastructure, but we now need to put the pieces in place.

    For example, some governments publish data on zoning rules as open data. But we need to go beyond simply publishing this data and expose these rules through an interface that allows them to be encapsulated in a transaction. This work has only just begun and is, for now, largely driven by actors outside of government.

    Imagine an infrastructure that would allow companies like Airbnb to instantly determine if a potential short-term rental was authorized under local zoning and rental regulations, and to determine if a rental tax was due on the transaction and the amount.
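As a rough sketch of what “encapsulating rules in a transaction” could mean, the snippet below turns a zoning table into a function that answers both questions at once: is the rental authorized, and how much tax is due? The zone codes, the rules, and the tax rate are all invented for this example and do not correspond to any real jurisdiction.

```python
from dataclasses import dataclass
from decimal import Decimal

# All zone codes, rules, and rates below are invented for illustration.
ZONE_RULES = {
    "R1": {"short_term_allowed": False, "tax_rate": Decimal("0")},
    "MX": {"short_term_allowed": True, "tax_rate": Decimal("0.06")},
}


@dataclass
class RentalDecision:
    authorized: bool
    tax_due: Decimal
    reason: str


def evaluate_rental(zone, nights, nightly_price):
    """Decide whether a rental is allowed and compute the tax owed.

    nightly_price is passed as a string (e.g. "120.00") so that Decimal
    can represent it exactly.
    """
    rule = ZONE_RULES.get(zone)
    if rule is None:
        return RentalDecision(False, Decimal("0"), "unknown zone")
    if not rule["short_term_allowed"]:
        return RentalDecision(False, Decimal("0"), "short-term rentals not permitted in zone")
    total = Decimal(nightly_price) * nights
    tax = (total * rule["tax_rate"]).quantize(Decimal("0.01"))
    return RentalDecision(True, tax, "ok")
```

Under these made-up rules, `evaluate_rental("MX", 3, "120.00")` reports an authorized rental with 21.60 in tax due, while the same request in zone `"R1"` is refused. A platform could run a check like this before a booking is ever confirmed.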

    We are woefully short of this goal.

    In fact, in jurisdictions that specifically require short-term renters to register with the local government, there is no interface that supports an automated check to determine if a specific property has a permit and is authorized to conduct such a rental. The absence of this essential infrastructure to enforce local regulations may go a long way toward explaining the dismal rate of compliance.

    It seems clear that governments will not be able to successfully regulate these new companies without the infrastructure necessary to do so.

    Building the infrastructure for 21st century regulation will require us to expand our ideas of open data and government as a platform. Checking for compliance with tax and regulatory requirements—by either party in a sharing economy transaction—should be as simple as making an API call.
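What might that API call look like? The sketch below uses only Python’s standard library to expose a permit lookup as JSON over HTTP. The `/permits` path, the `id` parameter, the response fields, and the registry entries are all hypothetical; no government currently offers this interface, which is exactly the gap described above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical permit registry; a real one would sit on a government database.
PERMITS = {
    "P-1001": {"address": "12 Example St", "active": True},
    "P-1002": {"address": "34 Sample Ave", "active": False},
}


def lookup(path):
    """Map a request path such as /permits?id=P-1001 to (status, JSON body)."""
    parsed = urlparse(path)
    permit_id = parse_qs(parsed.query).get("id", [None])[0]
    if parsed.path != "/permits" or permit_id is None:
        return 400, json.dumps({"error": "expected /permits?id=<permit id>"})
    record = PERMITS.get(permit_id)
    if record is None:
        return 404, json.dumps({"permit_id": permit_id, "authorized": False})
    return 200, json.dumps({"permit_id": permit_id, "authorized": record["active"]})


class PermitHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = lookup(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body.encode("utf-8"))


def serve(port=8000):
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PermitHandler).serve_forever()
```

With `serve()` running, a request to `http://127.0.0.1:8000/permits?id=P-1001` returns a small JSON document saying whether the permit is active. Either party to a rental, or the platform itself, could make that request before money changes hands.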

    Constructing this infrastructure won’t be done overnight, and it probably won’t be inexpensive. But the stakes for state and local governments have never been higher.

Civic Hacking Data Science open data



The founder of a Hacker News-style site for data for social good projects says that there is not enough replication in the civic hacking community, and he means to change that.

A year after launching DataLook, a Hacker News-style site highlighting data projects for social good, Tobias Pfaff and his colleagues are spearheading a 10-week replication marathon of some of the site’s top reusable projects in advance of a TEDx competition they qualified for this spring. Participants are finding each other and collaborating on Slack, although if it makes more sense to take problem solving to outside sites (GitHub’s issue tracker, for example), they are encouraged to do that as well.

“I think there is not enough focus on replicating projects [in the civic tech community],” founder Tobias Pfaff tells Civicist in a Skype interview. “I think it might be less sexy to do things that other people have done before.”

However, Pfaff also points out that replicating projects can be faster and easier than starting an open data project from scratch. Replication, he says, “can be super sexy” because you can get things done—and start having an impact—quickly. He points Civicist to Jason Hibbets’ framework for civic hackers, which outlines three kinds of projects: green fields (new and untested); cloned (tested, approved, and repeated); and augmented (tested and improved upon).

One successful and much-discussed replication is the late U.S. Politwoops, a transparency project documenting politicians’ deleted tweets, which was based on a project first launched in the Netherlands in 2010. The service recently made headlines after Twitter pulled its API access for violating terms of service. However, other iterations of Politwoops continue to run smoothly in 30 other countries.

The first project replicated as part of DataLook’s marathon was a Twitter bot that automatically posts information about animals up for adoption at local shelters. The person behind it, Slack user justnisdead, says that future replications would only take 15-30 minutes per bot.

DataLook’s goal for the marathon is to demonstrate the impact that replication can have in just 10 weeks, and then to challenge the TEDx judges to imagine what they could accomplish if the marathon were extended to a year or more.

DataLook (originally Data for Good, until they found that name was already a registered trademark in the U.S.) was built during a startup weekend in Germany last year. It was always meant to be a home for replicable data for good projects; however, in the year since, Pfaff has found that the user base is really too small for a robust upvote/downvote-style site. There just isn’t enough traffic.

(He speculated this might be because many of the major players in the civic tech scene—Code for America, for example—are hosting many of these conversations in private or semi-private/branded spaces, and that others are spread out on various platforms like Reddit and DataTau.)

And yet Pfaff and his DataLook colleagues know many of the projects on the site are worth replicating. “A month ago,” Pfaff says, “we went through our complete database and discussed which [projects] are really cool and which are reusable…[which solve] generic problems that appear in every city around the world and at the same time the code is open source.” These are the projects they pulled out for a shortlist, and are actively encouraging data scientists to replicate during the marathon. The shortlist of projects includes Councilmatic; FixMyStreet; a food inspection forecasting app; Link-SF, a resource for homeless and low-income city residents; and more.

DataLook is encouraging interested parties to join an open Slack channel, find the projects that most interest them, and connect with like-minded people. There are currently twenty or so members of the general DataLook channel.

Pfaff makes clear that the end of the marathon is not meant to be the end of replicating projects, but that the purpose of the marathon is “to see what is possible within a given timeframe.”

“And then we can see what happens next,” he adds.

Crowdsourcing Mapping open data



MafiaMaps is an app to crowdmap the mafia phenomenon all over Italy.

  • “We want to make mafia visible to everyone, city by city, region by region.”

    I am not at a press conference held by the Italian Ministry of Internal Affairs or the Head of the police; I’m with a group of young political science students. And this is not just wishful thinking: MafiaMaps is an app they are building to map the mafia phenomenon all over Italy.

    Pierpaolo Farina, Hermes Mariani, Claudio Ripamonti, and Samuele Motta are part of the core group of MafiaMaps volunteers, about 15 people in their early- to mid-20s. We meet in the courtyard of the Political Science faculty of the University of Milan, where they’re studying or recently graduated.

    The group met and bonded during a political science class on the sociology of organized crime.

    While in Italy there is—predictably—a lot of research on the mafia, the young students felt that there wasn’t a structured organization of all that knowledge.

    So, two and a half years ago, they started WikiMafia, an online encyclopedia (Creative Commons-licensed) that now counts more than 200 full articles, and another 1,000 partially completed or draft articles.

    On WikiMafia, you can find anything from mafia organizations and their historical development to power structures and infiltration in the public administration and private sector.

    But as the WikiMafia volunteers’ work and their academic careers progressed, they felt that something was missing: How do you give people immediate access to all this knowledge? How do you make people understand that mafia is everywhere in Italy and closer than people think?  

    They eventually figured out that an app would make all of their research immediately available to anyone.

    That idea became MafiaMaps, a constantly updated map pinpointing the last-known location of convicted criminals and of mafia killings, as well as where anti-mafia organizations are at work, creating projects and organizing events.


“We start from the judiciary inquiries,” explains Pierpaolo Farina, the project manager. “It’s the only way to have data that is compelling and precise: you have names, dates. Then we broaden the scope and keep researching, interviewing people and fact-checking everything.”

Analysis includes books and other research materials, sometimes even “oral history”: in many cases, I’m told, the “everyday” mafia victims do not make it to the news, so nobody writes about them.

Farina mentions the case of Pasquale Campanello, a prison guard in Poggioreale, Naples. A father of two, Campanello was only 32 when he was shot by four killers in 1993 for refusing to help convicted criminals receive messages and gifts from their outside accomplices. There are no books celebrating him, killed just for doing his job.

The team went to talk to his widow and created a lasting trace of his sacrifice: now WikiMafia has an article on Campanello and he will soon be on the map.

WikiMafia cost about 150 euros and many hours of volunteer work. Since its inception, MafiaMaps has aimed to have a life of its own: “We decided to launch a crowdfunding campaign to create a community that uses the app: it would be useless to develop it, otherwise,” says Farina. At 26, he is one of the few graduates of the group, as well as a published writer and prolific blogger.


Colors correspond to various mafia groups: The Sicilian Cosa Nostra is purple; the Calabrian 'Ndrangheta is blue.

For months, the team studied other successful crowdfunding campaigns and then set up their own, with a number of intermediate goals: the first 10,000 euros will provide mapping for the Lombardy region; reaching the 20,000 threshold will allow them to map two other regions, and so on. The final goal is 100,000 euros for all 20 regions.

“We wish we had Kickstarter in Italy: it would make things so much easier!” Farina jokes. But I interview them on a productive Monday morning: Farina has just returned from a popular morning radio show. “We raised 400 euros in 10 minutes!” he says.

Next, MafiaMaps will go “the start-up way,” they say, looking for foundation grants, sponsors, and other forms of financial support.

“We do not want public money, in order to avoid the controversy that often arises for those who work on the matter: that you do it because you expect to live with government funding,” Farina declared in an interview with the prominent newspaper La Repubblica earlier this month.

The Faculty of Political Science will soon give them a room to use as their headquarters and they are already in touch with a number of possible sponsors.

The campaign was launched on March 21 and has raised about 14,000 euros so far. As the final day is May 23, MafiaMaps will likely have only enough money to start mapping Lombardy.

The guys do not seem worried; they have already started developing the app. “We’re gonna start with that and do it in the best way possible,” says 23-year-old Hermes Mariani, a MafiaMaps co-founder from Lecco. “We will release the app as scheduled, next March. When people will see the results, they will want to help and support MafiaMaps in their region.”


MafiaMaps will allow users to contribute by pointing out news and events, but the crowdsourcing will be selective.

“We’ll verify everything before putting it on a map, as we check carefully the contributions to WikiMafia,” 26-year-old Samuele Motta clarifies.

Many of them have studied the mafia for years and are critical of the sloppy work they often see on the issue. “If you say everything is mafia-related, then it’s easier to argue that nothing is really mafia-related,” says Claudio Ripamonti, a 20-year-old volunteer. They later explain that, in white-collar crimes, mafia affiliates are often a loose connection, but their role is often amplified. Also, despite many connections between mafia, politics, and enterprise, “we’ve never been sued in two years of work,” says Farina, laughing.

While big and small anti-mafia organizations are already supporting and contributing to the project, MafiaMaps also has an ally in the Italian open data community and, fittingly enough, one from Sicily. Confiscati Bene (“well confiscated”) is a participatory project to stimulate an effective re-use of buildings and other assets seized from the mafia.

The project investigates their current condition and potential through the analysis of relevant data coming both from official sources and from bottom-up, citizen-monitoring initiatives, as previously reported on techPresident.

It is not an easy task, Confiscati Bene project manager Andrea Borruso tells Civicist in a Skype interview: “Public datasets date back to 2013, many information are still missing and the quality hasn’t improved at all in the past two years.”

Confiscati Bene was born from a hackathon a little more than a year ago and has been nurtured by a small and active volunteer community. The founders recently created an association and are looking for a business model, Borruso says: “This year has been great but this project, this topic, deserves more, it deserves actual everyday work…instead of nights!”

He’s only half-joking: Borruso says that their work got them prizes and acknowledgments (they were mentioned as a best practice by former World Bank open data specialist Samuel Lee during the last PDF Italy) but it hasn’t got any easier in the 14 months since Confiscati Bene’s inception.

And the Italian institutions are not helping: “We’ve recently been told that the agency [ANBSC, the Italian National Agency for the Management and Disposal of Assets Seized and Confiscated from Organised Crime] is updating the data but at the moment nothing is available, not even the old datasets,” he explains.

As I write, an independent and volunteer initiative is the only place showing the national datasets of the confiscated assets, while the government page displays a “coming soon” sign.


For many years, the mafia was perceived as only affecting the south of Italy and as a rough, unrefined form of organized crime. Inquiries and trials in the early ’90s showed a rather sophisticated system with ramifications all over the country and abroad, including the United States.

One of the most prominent judges at the time, Giovanni Falcone, worked closely with the FBI, the NYPD, and federal prosecutors in a case known as the Pizza Connection, busting an international heroin smuggling ring that laundered drug money through pizza parlors. (His statue can be found at the FBI Academy in Quantico, Virginia.)

The fact that this kind of project started in northern Italy is quite significant: “We’re proving that civic antibodies work,” says Farina, a reference to the common metaphor of the mafia as Italy’s disease.

Still, it’s not easy to talk about it: “We recently presented MafiaMaps in my hometown,” adds Mariani. “People came to us saying that they know the mafia is there, but it’s better not to know all these things. We wanna show them they’re wrong.”

The MafiaMaps team follows the words of Judge Falcone, displayed on their website: “The mafia is not invincible, it is a human fact, therefore, it has a beginning and an end.”

One of the heroes of the anti-mafia movement, Falcone was killed with his wife and three police officers in 1992: his car exploded as he was reaching his native Palermo.

“Italy has been called ‘the mafia country’ for many years. We’re very proud when we’re interviewed by foreign media and hear that we’re now the country with a strong anti-mafia movement,” says Pierpaolo Farina at the end of our interview.

Last Monday, Judge Falcone would have turned 76. Saturday will mark the 23rd anniversary of his death.

It will also be the last day of the MafiaMaps fundraising campaign.